regex replace first element - javascript

I need to replace each <br> in an HTML string with two. What I can't get right is the case where one tag directly follows another. This pattern:
(<br\s*\/?>)
matches every tag in this text:
var text = 'text<BR><BR>text text<BR>text;'
and with the replace I get
text = text.replace(/(<br\s*\/?>)/gi, "<BR\/><BR\/>")
console.log(text); // text<BR/><BR/><BR/><BR/>text text<BR/><BR/>text;
Is there a way to add only one tag per run with the regex, and get this instead?
console.log(text); // text<BR/><BR/><BR/>text text<BR/><BR/>text;
Or can I only achieve this with a loop?

You may use either
var text = 'text<BR><BR>text text<BR>text;'
text = text.replace(/(<br\s*\/?>)+/gi, "$&$1");
console.log(text); // => text<BR><BR><BR>text text<BR><BR>text;
Here, /(<br\s*\/?>)+/gi matches one or more consecutive <br> tags case-insensitively, capturing each tag along the way (the group buffer keeps only the value from the last iteration), and "$&$1" replaces the run with the whole match ($&) followed by the last <br> ($1).
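To see the "last iteration" behaviour concretely, here is a small sketch. The helper name appendOneBreak is made up for illustration and is not part of the answer.

```javascript
// Hypothetical helper (not from the original answer): appends one extra
// tag to every run of consecutive <br> tags.
function appendOneBreak(html) {
  // $& is the whole run, $1 is the tag captured on the last iteration.
  return html.replace(/(<br\s*\/?>)+/gi, "$&$1");
}

// A repeated capturing group only remembers its final iteration.
const m = /(<br\s*\/?>)+/i.exec("a<br><BR/>b");
console.log(m[0]); // whole run: <br><BR/>
console.log(m[1]); // last tag only: <BR/>
console.log(appendOneBreak("a<br><BR/>b")); // a<br><BR/><BR/>b
```

Because $1 is whatever tag ended the run, the appended tag copies that tag's spelling rather than a normalized form.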
Or
var text = 'text<BR><BR>text text<BR>text;'
text = text.replace(/(?:<br\s*\/?>)+/gi, function ($0) {
return $0.replace(/<br\s*\/?>/gi, "<BR/>") + "<BR/>";
})
console.log(text); // => text<BR/><BR/><BR/>text text<BR/><BR/>text;
Here, (?:<br\s*\/?>)+ also matches one or more <br> tags, but without capturing each occurrence. Inside the callback, every <br> is normalized to <BR/> and one more <BR/> is appended to the result.

You can use a negative lookahead, /(<br\s*\/?>)(?!<br\s*\/?>)/, to double only the last tag of each run of consecutive tags:
var text = 'text<BR><BR>text text<BR>text;'
text = text.replace(/(<br\s*\/?>)(?!<br\s*\/?>)/gi, "<BR\/><BR\/>")
console.log(text); // => text<BR><BR/><BR/>text text<BR/><BR/>text;
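A variation on the lookahead approach, as a sketch: replacing with "$1$1" instead of a fixed string keeps the original spelling of the tag. The function name doubleLastBreak is hypothetical.

```javascript
// Hypothetical helper: doubles only the <br> that is NOT directly
// followed by another <br>, i.e. the last tag of each run.
function doubleLastBreak(html) {
  // "$1$1" repeats the matched tag as written instead of normalizing it.
  return html.replace(/(<br\s*\/?>)(?!<br\s*\/?>)/gi, "$1$1");
}

console.log(doubleLastBreak("text<BR><BR>text text<BR>text;"));
// text<BR><BR><BR>text text<BR><BR>text;
```

Note that the lookahead only sees a tag that follows immediately; tags separated by whitespace, such as "<br> <br>", are treated as two separate runs.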

Related

Replace first occurrence with one value and second with another

Given this original string...
Test text _with bold_ and perhaps one another text _with bold in the same string_.
... how can I efficiently replace the first occurrence of "_" with "<b>" and the second occurrence of "_" with "</b>" to achieve the following result:
Test text <b>with bold</b> and perhaps one another text <b>with bold in the same string</b>.
Note: I have an array of hundreds of those strings that will need to go through this process in order to render in the page.
You can use a regex for this.
The pattern is the following:
_(.*?)_ with the g flag at the end, so it replaces every occurrence.
The ? makes the match non-greedy: it stops at the first _ after the opening _.
<b>$1</b> is the replacement for the matched string, where $1 refers to the content captured by the parentheses ().
var text = "This is _bold text_ and here _some more_";
var text_replaced = text.replace(/_(.*?)_/g, '<b>$1</b>');
document.getElementById('result').innerHTML = text_replaced;
<span id="result" />
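Since the question mentions an array of hundreds of strings, here is a minimal sketch of applying the same pattern to each element. The name boldAll and the sample array are assumptions for illustration.

```javascript
// Hypothetical helper: converts every _text_ pair to <b>text</b>
// in each string of the array.
function boldAll(strings) {
  const pattern = /_(.*?)_/g; // non-greedy, so each pair is matched on its own
  return strings.map((s) => s.replace(pattern, "<b>$1</b>"));
}

const rendered = boldAll([
  "Test text _with bold_ and perhaps one another text _with bold in the same string_.",
  "no markers here",
]);
console.log(rendered[0]);
// Test text <b>with bold</b> and perhaps one another text <b>with bold in the same string</b>.
```

A string with a single unpaired underscore is left unchanged, because the pattern needs both delimiters to match.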
You can run a while loop that checks whether there are any more underscores in the text and replaces them, assuming the text contains an even number of "_":
var test = "text _with bold_ and perhaps one another text _with bold in the same string_.";
var b_index = test.indexOf("_");
while (b_index != -1) {
  test = test.replace("_", "<b>");
  test = test.replace("_", "</b>");
  b_index = test.indexOf("_");
}
After the while loop, you can assign the innerHTML of whichever element you wish to the variable test.
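The same alternation can also be done in one pass with a replacer callback that toggles between the opening and closing tag. This is only a sketch of that idea; toggleBold is a hypothetical name.

```javascript
// Hypothetical helper: replaces underscores alternately with <b> and </b>.
function toggleBold(text) {
  let open = false; // tracks whether a <b> is currently open
  return text.replace(/_/g, () => {
    open = !open;
    return open ? "<b>" : "</b>";
  });
}

console.log(toggleBold("a _b_ c _d_")); // a <b>b</b> c <b>d</b>
```

As with the loop, an odd number of underscores leaves the last <b> unclosed.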

Modify tag position with regex

Suppose I have following string:
var text = "<p>Some text <ins>Text1</p><p>Text2 </ins><ins>Some other text </ins>and another text<ins>Text3</p><p>Text4 </ins></p>"
I need to clean up the above string into
var text = "<p>Some text Text1</p><p><ins>Text2 </ins><ins>Some other text </ins>and another text Text3</p><p><ins>Text4 </ins></p>"
Assume Text1, Text2, Text3 and Text4 are random strings.
I tried the following, but it just messes things up:
text.replace(/<ins>(.*?)<\/p><p>/g, '</p><p><ins>');
Thanks
ADDITIONAL EXPLANATION
Take a look at this:
<ins>Text1</p><p>Text2 </ins>
Above is wrong. It should be:
Text1</p><p><ins>Text2 </ins>
Please try the following regex:
function posChange() {
var text = "<p>Some text <ins>Text1</p><p>Text2 </ins><ins>Some other text </ins>and another text<ins>Text3</p><p>Text4 </ins></p>";
var textnew = text.replace(/(<ins>)([^<]+)(<\/p><p>)([^<]+)/g, '$2$3$1$4');
alert(textnew);
}
posChange()
REGEX EXPLANATION:
/(<ins>) 1st capturing group (i.e: <ins>)....$1
([^<]+) 2nd capturing group (i.e: Text1)....$2
(<\/p><p>) 3rd capturing group (i.e: </p><p>)..$3
([^<]+) 4th capturing group (i.e: Text2 )...$4
/g match all occurrences
Based on the requirements, for each match:
Original String: $1 $2 $3 $4
should be replaced with
New String: $2 $3 $1 $4
In this way, the position of each capturing group gets shifted with the help of regex.
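The same reordering can be written with named capture groups, which some readers find easier to follow than $1 to $4. This is a sketch assuming an engine with ES2018 named groups; the group names and the function name moveInsAfterBoundary are invented here.

```javascript
// Hypothetical helper: moves an <ins> that opens before a </p><p>
// boundary so that it opens after the boundary instead.
function moveInsAfterBoundary(html) {
  const pattern =
    /(?<open><ins>)(?<before>[^<]+)(?<boundary><\/p><p>)(?<after>[^<]+)/g;
  // Same order as '$2$3$1$4', expressed with group names.
  return html.replace(pattern, "$<before>$<boundary>$<open>$<after>");
}

const input =
  "<p>Some text <ins>Text1</p><p>Text2 </ins><ins>Some other text </ins>" +
  "and another text<ins>Text3</p><p>Text4 </ins></p>";
console.log(moveInsAfterBoundary(input));
```

An <ins> that is closed before any paragraph boundary, such as the one around "Some other text", does not match and is left as is.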
You can remove all <ins>:
text = text.replace(/<ins>/g, '');
and then prefix every tag-free string that ends with </ins> with <ins>:
var matches = text.match(/[^<>]+<\/ins>/g);
for (var i = 0; i < matches.length; i++) {
  text = text.replace(matches[i], '<ins>' + matches[i]);
}
result:
<p>Some text Text1</p><p><ins>Text2 </ins><ins>Some other text </ins>and another textText3</p><p><ins>Text4 </ins></p>
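The two steps can also be chained as two replace calls, which avoids the loop and handles the case where nothing matches. This is a sketch of the same idea, not the answerer's code; reopenIns is a hypothetical name.

```javascript
// Hypothetical helper: drops every <ins>, then re-opens one directly in
// front of each tag-free text run that ends with </ins>.
function reopenIns(html) {
  return html
    .replace(/<ins>/g, "")
    .replace(/[^<>]+<\/ins>/g, "<ins>$&"); // $& is the matched run plus </ins>
}

const source =
  "<p>Some text <ins>Text1</p><p>Text2 </ins><ins>Some other text </ins>" +
  "and another text<ins>Text3</p><p>Text4 </ins></p>";
console.log(reopenIns(source));
```

Replacing by pattern rather than by matched string also means that two identical text runs are each handled at their own position.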

Replace last digit occurrence in square brackets

I have a variable like:
var text = 'researchOrganisationTrait.keywords[0].freeKeyword[1].texts[en_GB]';
in which I want to update the index of the last numeric occurrence (the content is added dynamically).
I have tried code like:
var text = 'researchOrganisationTrait.keywords[0].freeKeyword[1].texts[en_GB]';
text = text.replace(/\[\d*](?!.*\[)/, '[newIndex]');
alert(text);
But this does not replace freeKeyword[1] with freeKeyword[newIndex].
How do I match the last occurrence of a digit in square brackets?
JSFiddle: http://jsfiddle.net/4eALF/
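Append \d to the lookahead, so that only a later bracket that starts with a digit blocks the match (and use \d+ so that at least one digit is required):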
text = text.replace(/\[\d+](?!.*\[\d)/, '[newIndex]')
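Wrapped in a function, the idea might look like the sketch below. The name replaceLastIndex and its parameters are assumptions for illustration.

```javascript
// Hypothetical helper: replaces the last [digits] index in a property path.
function replaceLastIndex(path, newIndex) {
  // The lookahead rejects a match if another "[digit" appears later on.
  // A replacer function is used so newIndex is inserted literally.
  return path.replace(/\[\d+](?!.*\[\d)/, () => "[" + newIndex + "]");
}

const path = "researchOrganisationTrait.keywords[0].freeKeyword[1].texts[en_GB]";
console.log(replaceLastIndex(path, 7));
// researchOrganisationTrait.keywords[0].freeKeyword[7].texts[en_GB]
```

A path with no numeric index is returned unchanged.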

Using Regex to remove html elements and leave the content

Let's say I have the following HTML
<b>Item 1</b> Text <br>
<b>Item 2</b> Text <br>
<b>Item 3</b> Text <br>
<p><font color="#000000" face="Arial, Helvetica, sans-serif"><b>Item 4:</b></font></p>
<p><font color="#000000" face="Arial, Helvetica, sans-serif">Detailed Description</font></p>
and am using the following regex to capture data: /(Item 1:.*?<br>)/gi, which returns <b>Item 1</b> Text <br>
How do I drop or remove the <b>, </b> and <br>
to be left with
Item 1 Text
I've been trying to make sense of this code <(\w+)[^>]*>.*<\/\1>, but so far no luck. All the examples I have seen on here seem to require an id or class, which my HTML does not have, so I'm a bit stuck in getting those examples to fit my problem.
Try this regex: <[^>]*>
Replacing its matches with an empty string removes all HTML tags, with or without attributes, including closing tags.
This should do the trick:
var matches = stringToTest.match(/(Item \d+.*?<br\/?>)/gi);
for (var i = 0; i < matches.length; i++) {
matches[i] = matches[i].replace(/<[^>]+>/g, '');
}
alert(matches);
If you have jQuery:
alert(
$.map(stringToTest.match(/(Item \d+.*?<br\/?>)/gi), function(v) { return v.replace(/<[^>]+>/g, '') })
);
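A variant sketch of the same two steps that also copes with input containing no items, since match returns null in that case. The name extractItems, the trim call and the sample string are additions for illustration.

```javascript
// Hypothetical helper: finds each "Item N ... <br>" chunk and strips its tags.
function extractItems(html) {
  // match() returns null when nothing matches, so fall back to [].
  const chunks = html.match(/Item \d+.*?<br\s*\/?>/gi) || [];
  // Remove every tag, then trim the leftover whitespace (an extra step).
  return chunks.map((chunk) => chunk.replace(/<[^>]+>/g, "").trim());
}

const sample = "<b>Item 1</b> Text <br>\n<b>Item 2</b> Text <br>";
console.log(extractItems(sample)); // [ 'Item 1 Text', 'Item 2 Text' ]
```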
This regex will match b and br tags:
</?br?\s*/?>
To use it in Javascript you write something like this:
result = subject.replace(/<\/?br?\s*\/?>/img, "");
All the matched tags will be replaced with an empty string.
In my experience it is better to replace br tags with a space and replace normal inline tags with empty string. If that is what you want to do, this next regex matches only b tags:
</?b\s*/?>
and this one matches only br tags:
</?br\s*/?>
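Put together, the suggestion of turning br tags into spaces and dropping b tags could be sketched like this. The function name stripBoldAndBreaks is hypothetical.

```javascript
// Hypothetical helper: <br> becomes a space, <b> and </b> are removed.
function stripBoldAndBreaks(html) {
  return html
    .replace(/<\/?br\s*\/?>/gi, " ") // br tags first, replaced by a space
    .replace(/<\/?b\s*\/?>/gi, ""); // then b tags, replaced by nothing
}

console.log(stripBoldAndBreaks("one<br>two <b>three</b>")); // one two three
```

The two patterns do not overlap: the br pattern requires the letter r, and the b pattern does not allow it.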
In a regex, whatever is between () is a capture group that can be accessed later as a variable (\1, \2, \3, etc., or sometimes $1, $2, $3). So simply use them to capture the text you want.
I think this regex would work for you:
<b>(Item \d+)</b>(.*?)<br>
In detail, the expression means:
(Item \d+): any string formatted as "Item [at least 1 digit]"
(.*?): any sequence of characters; the ? makes it match as few characters as possible.
So now in <b>Item 5434</b>hel34lo 0345 345<br>, with regex above your captured groups are:
\1 = Item 5434
\2 = hel34lo 0345 345
I've never programmed in javascript, but more precisely, this piece of code might work:
var myString = "<b>Item 5434</b>hel34lo 0345 345<br>";
var myRegexp = /<b>(Item \d+)<\/b>(.*?)<br>/g; // the / in </b> must be escaped inside a regex literal
var match = myRegexp.exec(myString);
alert(match[1]); // Item 5434
alert(match[2]); // hel34lo 0345 345
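To collect the groups for every item rather than just the first, one option is matchAll with a global pattern. This is a sketch assuming an engine that supports String.prototype.matchAll; parseItems and the label/text property names are made up for illustration.

```javascript
// Hypothetical helper: returns one { label, text } object per item.
function parseItems(html) {
  const pattern = /<b>(Item \d+)<\/b>(.*?)<br>/g; // matchAll requires the g flag
  return Array.from(html.matchAll(pattern), (m) => ({
    label: m[1], // first capture group
    text: m[2].trim(), // second capture group, whitespace trimmed
  }));
}

const parsed = parseItems("<b>Item 1</b> Text <br>\n<b>Item 2</b> Other <br>");
console.log(parsed);
```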

JavaScript Replace Text with HTML Between it

I want to replace some text in a web page, only the text, but when I replace via document.body.innerHTML I get stuck, like so:
HTML:
<p>test test </p>
<p>test2 test2</p>
<p>test3 test3</p>
Js:
var param = "test test test2 test2 test3";
var text = document.body.innerHTML;
document.body.innerHTML = text.replace(param, '*' + param + '*');
I would like to get:
*test test
test2 test2
test3* test3
HTML of 'desired' outcome:
<p>*test test </p>
<p>test2 test2</p>
<p>test3* test3</p>
So if I do that with the parameter above ("test test test2 test2 test3"), the <p></p> tags between the words are not taken into account, so nothing is replaced and the code ends up in the else branch.
How can I replace the text without regard to the HTML markup that may sit between the words?
Thanks in advance.
Edit (for @Sonesh Dabhi):
Basically I need to replace text in a web page, but when I scan the page with the HTML in it the replace won't work; I need to scan and replace based on the text only.
Edit 2:
'Raw' JavaScript Please (no jQuery)
This will do what you want: it builds a regular expression that finds the text even when tags sit between the words, and replaces it there. Give it a shot.
http://jsfiddle.net/WZYG9/5/
The magic is
(\s*(?:<\/?\w+>)*\s*)*
In the code below the backslashes are doubled to escape them inside the string literal.
The regex looks for any number of whitespace characters (\s). The inner group (?:<\/?\w+>)* matches any number of start or end tags. ?: tells JavaScript not to capture the group, so it is not numbered for the replacement string. < is a literal less-than character. The forward slash (which begins an HTML end tag) needs to be escaped, and the question mark means 0 or 1 occurrence. This is followed by any number of whitespace characters. Taken together, the expression matches any number of tags and the whitespace between them.
Every space within the "text to search" gets replaced with this regular expression, allowing it to match any amount of whitespace and tags between the words in the text and remember them in the numbered variables $1, $2, etc. The replacement string is built to put those remembered values back in.
function wrapTextIn(text, character) {
  if (!character) character = "*"; // default to asterisk
  // trim the text
  text = text.replace(/(^\s+)|(\s+$)/g, "");
  // split into words
  var words = text.split(" ");
  // return if there are no words
  if (words.length == 0)
    return;
  // build the regex
  var regex = new RegExp(text.replace(/\s+/g, "(\\s*(?:<\\/?\\w+>)*\\s*)*"), "g");
  // start with wrapping character
  var replace = character;
  // for each word, put it and the matching "tags" in the replacement string
  for (var i = 0; i < words.length; i++) {
    replace += words[i];
    if (i != words.length - 1 && words.length > 1)
      replace += "$" + (i + 1);
  }
  // end with the wrapping character
  replace += character;
  // replace the html
  document.body.innerHTML = document.body.innerHTML.replace(regex, replace);
}
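For comparison, here is a variant sketch of the same idea that works on an HTML string instead of document.body, so it can be tried outside a browser. It is not the answerer's code: the names escapeRegExp and wrapAcrossTags are hypothetical, the search words are escaped so they are matched literally, and tags with attributes are allowed via [^>]*, which is an added assumption. A repeated capturing group only remembers its last iteration, so this version does not rebuild the match from numbered groups; it puts the marker around the whole match instead.

```javascript
// Escape regex metacharacters so each word is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: wraps every occurrence of `phrase` in `html` with
// `marker`, allowing whitespace and tags between the words of the phrase.
function wrapAcrossTags(html, phrase, marker = "*") {
  const words = phrase.trim().split(/\s+/).map(escapeRegExp);
  if (words[0] === "") return html; // empty phrase: nothing to do
  // Between two words: one or more whitespace characters or tags.
  const gap = "(?:\\s|<\\/?\\w+[^>]*>)+";
  const regex = new RegExp(words.join(gap), "g");
  // The whole match (words plus everything between them) is kept intact.
  return html.replace(regex, (match) => marker + match + marker);
}

const page = "<p>test test </p>\n<p>test2 test2</p>\n<p>test3 test3</p>";
console.log(wrapAcrossTags(page, "test test test2 test2 test3"));
```

In a browser the result could be assigned back to document.body.innerHTML, as the function above does.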
Use the text content of the page; no jQuery is required.
First remove the tags, i.e. you can read document.body.textContent / document.body.innerText, or strip them as in this example:
var StrippedString = OriginalString.replace(/(<([^>]+)>)/ig,"");
Then find and replace (to replace every occurrence, add the g flag to the search pattern):
String.prototype.trim=function(){return this.replace(/^\s\s*/, '').replace(/\s\s*$/, '');};
var param = "test test test2 test2 test3";
var text = (document.body.textContent || document.body.innerText).trim();
var replaced = text.search(param) >= 0;
if(replaced) {
var re = new RegExp(param, 'g');
document.body.innerHTML = text.replace(re , '*' + param + '*');
} else {
//param was not replaced
//What to do here?
}
Note: when you strip the markup this way, you lose the tags.
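A small sketch of what the strip-then-replace approach does to a string makes the trade-off visible. The names stripTags and markInText are hypothetical, and split/join is used here instead of a RegExp so the phrase needs no escaping.

```javascript
// Hypothetical helper: removes every tag, keeping only the text.
function stripTags(html) {
  return html.replace(/<[^>]+>/g, "");
}

// Hypothetical helper: strips the tags, then wraps each occurrence of
// `phrase` in the remaining plain text with `marker`.
function markInText(html, phrase, marker = "*") {
  const text = stripTags(html);
  return text.split(phrase).join(marker + phrase + marker);
}

console.log(markInText("<p>test test </p><p>test2 test2</p>", "test test test2"));
// *test test test2* test2
```

The phrase is found because the markup is gone, but the paragraph structure is gone from the output as well.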
