Regex with multiple start and end characters that must be the same - javascript

I would like to search for strings inside a special tag in a string in JavaScript. Strings in JavaScript can start with either a " or a ' character.
Here is an example to illustrate what I want to do. My custom tag is called <my-tag>. My regex is /('|")*?<my-tag>((.|\n)[^"']*?)<\/my-tag>*?('|")/g. I use this regex pattern on the following strings:
var a = '<my-tag>Hello World</my-tag>'; //is found as expected
var b = "<my-tag>Hello World" + '</my-tag>'; //is NOT found, this is good!
var c = "<my-tag>Hello World</my-tag>"; //is found as expected
var d = '<my-tag>something "special"</my-tag>'; //here the " char causes a problem
var e = "<my-tag>something 'special'</my-tag>"; //here the ' char causes a problem
It works well with a and c, where it finds the tag with the containing text. It also does not find the text in b, which is what I want. But in cases d and e the tag with its content is not found, due to the occurrence of the " and ' characters. What I want is a regex where " is allowed inside the tag if the string starts with ', and vice versa.
Is it possible to achieve this with one regex, or is my only option to work with two separate regexes, like
/(")*?<my-tag>((.|\n)[^']*?)<\/my-tag>*?(")/g and /(')*?<my-tag>((.|\n)[^"]*?)<\/my-tag>*?(')/g ?

It's not pretty, but I think this would work:
/("<my-tag>((.|\n)[^"]*?)<\/my-tag>"|'<my-tag>((.|\n)[^']*?)<\/my-tag>')/g

You should be able to capture the opening quote with the first group ('|") and reuse it at the end with a backreference. Something like the following:
/('|")<my-tag>.*?<\/my-tag>\1/g
This should make sure to match the same character at the beginning and the end.
But you really shouldn't use regex for parsing HTML.
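As a rough sketch of the backreference idea, the pattern below also forbids the opening quote character inside the tag content by checking (?!\1) before each character, so the other quote type is still allowed. The function name findTagContents is made up for illustration, and the pattern assumes the string literals contain no escaped quotes.

```javascript
// Hypothetical helper: returns the text inside <my-tag>...</my-tag> for every
// string literal that opens and closes with the same quote character.
function findTagContents(source) {
  // (['"])               opening quote, remembered as group 1
  // (?:(?!\1)[\s\S])*?   any characters except the opening quote, lazily
  // \1                   the same quote character again
  var pattern = /(['"])<my-tag>((?:(?!\1)[\s\S])*?)<\/my-tag>\1/g;
  var contents = [];
  var match;
  while ((match = pattern.exec(source)) !== null) {
    contents.push(match[2]);
  }
  return contents;
}

console.log(findTagContents("var d = '<my-tag>something \"special\"</my-tag>';"));
console.log(findTagContents("var b = \"<my-tag>Hello World\" + '</my-tag>';"));
```

Because the content may not contain the opening quote, case b is still rejected while d and e are accepted.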

Check first character of a word in a string begins with #

A few days ago I posted a similar question, but I do not quite understand the principle. Are there good resources where the replace function combined with regular expressions is explained?
Anyways, right now I have the following problem: a string which starts with # should be placed in a link. So #test should be replaced with <a href='test'>#test</a>.
Also, these rules should apply:
The string can only contain one #, which is at the beginning.
If there are more strings, also replace them. I thought you can do this by putting /g behind the regex?
This is what I have so far:
value = "is #test";
var text = value.replace(/^.*(#)(\w+).*$/, "<a href='$2'>$1$2</a>");
My output
#test
EDIT:
The link is now working. However, the word "is" is missing.
You need to capture the ambient text:
value = "is #test or what";
var text = value.replace(/^(.*)#(\w+)(.*)$/, "$1<a href='$2'>#$2</a>$3");
Or just capture less:
var text = value.replace(/#(\w+)/, "<a href='$1'>#$1</a>");
When performing a .replace(), the RegExp has to match every character you want to replace; wrap the parts you want to keep in parentheses and refer back to them in the replacement string.
var test = 'is #test';
function makeAt(string){
  // $1 and $4 put the surrounding text back; $2 is the # and $3 is the word
  return string.replace(/^(.*)(#)(\w+)(.*)$/, "$1<a href='$3'>$2$3</a>$4");
}
console.log(makeAt(test));
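The question also asks about strings with more than one #word. A minimal sketch, assuming the shorter pattern /#(\w+)/ from the earlier answer is what you want: adding the g flag makes .replace() process every match instead of only the first. The function name linkHashtags is just illustrative.

```javascript
// Hypothetical helper: wraps every #word in a link, leaving the rest untouched.
function linkHashtags(text) {
  // The g flag makes replace() visit all matches, not only the first one.
  return text.replace(/#(\w+)/g, "<a href='$1'>#$1</a>");
}

console.log(linkHashtags("is #test or #what"));
```

Since only the #word itself is matched, there is no ambient text to capture and put back.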

Javascript Regex match everything after last occurrence of string

I am trying to match everything after (but not including!) the last occurrence of a string in JavaScript.
The search, for example, is:
[quote="user1"]this is the first quote[/quote]\n[quote="user2"]this is the 2nd quote and some url https://www.google.com/[/quote]\nThis is all the text I\'m wirting about myself.\n\nLook at me ma. Javascript.
Edit: I'm looking to match everything after the last quote block, so I was trying to match everything after the last occurrence of "quote]". I don't know if this is the best solution, but it's what I've been trying.
I'll be honest, I'm not good at regex. Here is what I've been trying, with the results:
regex = /(quote\].+)(.*)/ig; // Returns null
regex = /.+((quote\]).+)$/ig // Returns null
regex = /( .* (quote\]) .*)$/ig // Returns null
I have made a JSfiddle for anyone to have a play with here:
https://jsfiddle.net/au4bpk0e/
One option would be to match everything up until the last [/quote], and then capture anything following it.
/.*\[\/quote\](.*)$/i
This works since .* is inherently greedy, so it will match everything up until the last \[\/quote\].
Based on the string you provided, this would be the first capturing group match:
\nThis is all the text I\'m wirting about myself.\n\nLook at me ma. Javascript.
But since your string contains new lines, and . doesn't match newlines, you could use [\s\S] in place of . in order to match anything.
/[\s\S]*\[\/quote\]([\s\S]*)$/i
You could also avoid regex and use the .lastIndexOf() method along with .slice():
var match = '[\/quote]';
var textAfterLastQuote = str.slice(str.lastIndexOf(match) + match.length);
document.getElementById('res').innerHTML = "Results: " + textAfterLastQuote;
Alternatively, you could also use .split() and then get the last value in the array:
var textAfterLastQuote = str.split('[\/quote]').pop();
document.getElementById('res').innerHTML = "Results: " + textAfterLastQuote;
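One caveat with the .lastIndexOf() approach: if the marker is missing, lastIndexOf returns -1 and the slice starts at the wrong position. Here is a small self-contained sketch that guards against that. The name textAfterLast and the choice to return the whole string when the marker is absent are my assumptions, not something from the question.

```javascript
// Hypothetical helper: returns everything after the last occurrence of marker.
function textAfterLast(str, marker) {
  var index = str.lastIndexOf(marker);
  if (index === -1) {
    // Assumption: with no marker present, the whole string counts as "after".
    return str;
  }
  return str.slice(index + marker.length);
}

var sample = '[quote="user1"]first[/quote]\n[quote="user2"]second[/quote]\nMy own text.\n\nMore text.';
console.log(textAfterLast(sample, '[/quote]'));
```

No regex is involved, so newlines in the input need no special handling.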

Javascript Regex only replacing first match occurence

I am using regular expressions to do some basic converting of wiki markup code into copy-pastable plain text, and I'm using javascript to do the work.
However, javascript's regex engine behaves much differently to the ones I've used previously as well as the regex in Notepad++ that I use on a daily basis.
For example- given a test string:
==Section Header==
===Subsection 1===
# Content begins here.
## Content continues here.
I want to end up with:
Section Header
Subsection 1
# Content begins here.
## Content continues here.
Simply remove all equals signs.
I began with the regex setup of:
var reg_titles = /(^)(=+)(.+)(=+)/
This regex searches for lines that begin with one or more equals signs and contain another run of one or more equals signs. Rubular shows that it matches my lines accurately and does not catch equals signs in the middle of content. http://www.rubular.com/r/46PrkPx8OB
The code to replace the string based on regex
var lines = $('.tb_in').val().split('\n'); //use jquery to grab text in a textarea, and split into an array of lines based on the \n
for(var i = 0;i < lines.length;i++){
line_temp = lines[i].replace(reg_titles, "");
lines[i] = line_temp; //replace line with temp
}
$('.tb_out').val(lines.join("\n")); //rejoin and print result
My result is unfortunately:
Section Header==
Subsection 1===
# Content begins here.
## Content continues here.
I cannot figure out why the regex replace function, when it finds multiple matches, seems to only replace the first instance it finds, not all instances.
Even when my regex is updated to:
var reg_titles = /(={2,})/
"Find any two or more equals", the output is still identical. It makes a single replacement and ignores all other matches.
No other regex engine behaves this way for me. Running the same replace multiple times has no effect.
Any advice on how to get my string replace function to replace ALL instances of the matched regex instead of just the first one?
^=+|=+$
You can use this. Do not forget to add the g and m flags, and replace with an empty string. See the demo:
http://regex101.com/r/nA6hN9/28
Add the g modifier to do a global search:
var reg_titles = /^(=+)(.+?)(=+)/g
Your regex is needlessly complex, and yet doesn't actually accomplish what you set out to do. :) You might try something like this instead:
var reg_titles = /^=+(.+?)=+$/;
lines = $('.tb_in').val().split('\n');
lines.forEach(function(v, i, a) {
a[i] = v.replace(reg_titles, '$1');
})
$('.tb_out').val(lines.join("\n"));
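For comparison, here is a sketch of the ^=+|=+$ suggestion applied to the whole text at once, without splitting into lines or using jQuery. The function name stripHeadingMarkers is made up; the g flag replaces every match and the m flag lets ^ and $ match at each line boundary.

```javascript
// Hypothetical helper: removes leading and trailing runs of "=" on every line.
function stripHeadingMarkers(text) {
  // g: replace all matches; m: ^ and $ apply to each line, not the whole string.
  return text.replace(/^=+|=+$/gm, '');
}

var wiki = [
  '==Section Header==',
  '===Subsection 1===',
  '# Content begins here.',
  '## Content continues here.'
].join('\n');

console.log(stripHeadingMarkers(wiki));
```

Equals signs in the middle of a line are left alone, because each alternative is anchored to a line edge.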

Regex replace text outside html tag

I'm working on an autocomplete component that highlights all occurrences of the searched text. What I do is split the input text into words, and wrap every occurrence of those words in a <strong> tag.
My code looks like this
inputText = 'marriott st';
text = "Marriott east side";
textSearch = inputText.split(' ');
for (var i in textSearch) {
var regexSearch = new RegExp('(?!<\/?strong>)' + textSearch[i], "i");
var textReplaced = regexSearch.exec(text);
text = text.replace(regexSearch, '<strong>' + textReplaced + '</strong>');
}
For example, given the result: "marriott east side"
And the input text: "marriott st"
I should get
<strong>Marriott</strong> ea<strong>st</strong> side
And I'm getting
<<strong>st</strong>rong>marriot</<strong>st </strong>rong>ea<<strong>st</strong> rong>s</strong> side
Any ideas on how I can improve my regex to avoid matching occurrences inside the HTML tags? Thanks
/(?!<\/?strong>)st/
I would process the string in one pass. You can create one regular expression out of the search string:
var search_pattern = '(' + inputText.replace(/\s+/g, '|') + ')';
// `search_pattern` is now `(marriott|st)`
text = text.replace(RegExp(search_pattern, 'gi'), '<strong>$1</strong>');
You could even split the search string first, sort the words by length and combine them, to give a higher precedence to longer matches.
You should definitely escape special regex characters in the search string (see "How to escape regular expression special characters using javascript?").
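Putting those suggestions together, a single-pass version might look like the sketch below. The names escapeRegExp and highlightTerms are illustrative, and the escaping character class is the commonly used one rather than anything specific to this question.

```javascript
// Escapes characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: wraps every occurrence of any search word in <strong>.
function highlightTerms(text, inputText) {
  var words = inputText.split(/\s+/).filter(function (w) {
    return w.length > 0;
  });
  if (words.length === 0) {
    return text; // nothing to highlight
  }
  // Longer words first, so they win over shorter words at the same position.
  words.sort(function (a, b) {
    return b.length - a.length;
  });
  var pattern = new RegExp('(' + words.map(escapeRegExp).join('|') + ')', 'gi');
  return text.replace(pattern, '<strong>$1</strong>');
}

console.log(highlightTerms('Marriott east side', 'marriott st'));
```

Because all words are matched in one pass over the original text, the inserted <strong> tags are never searched again.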
Before each search, I suggest getting (or saving) the original search string to work on each time. For example, in your current case that means you could replace all '<strong>' and '</strong>' tags with ''. This will help keep your regEx simple, especially if you decide to add other html tags and formatting in the future.

JavaScript regular expression patterns problem

My problem starts like this:
var str='0|31|2|03|.....|4|2007'
str=str.replace(/[^|]\d*[^|]/,'5');
So the output becomes "0|5|2|03|.....|4|2007"; it replaces 31 with 5.
But this doesn't work for replacing other segments when I change the code like this:
str=str.replace(/[^|]{2}\d*[^|]/,'6');
It doesn't change 2 to 6.
What am I missing here? Any help?
I think a regular expression is a bad solution for that problem. I'd rather do something like this:
var str = '0|31|2|03|4|2007';
var segments = str.split("|");
segments[1] = "35";
segments[2] = "123";
str = segments.join("|"); // put the segments back together
Can't think of a good way to solve this with a regexp.
Here is a specific regex solution which replaces the number following the first | pipe symbol with the number 5:
var re = /^((?:\d+\|){1})\d+/;
// '$15' is read as $1 followed by a literal 5 here, because the pattern has only one group
str = str.replace(re, '$15');
If you want to replace the digits following the third |, simply change the {1} portion of the regex to {3}
Here is a generalized function that will replace any given number slot (zero-based index), with a specified new number:
function replaceNthNumber(text, n, newnum) {
var re = new RegExp("^((?:\\d+\\|){"+ n +'})\\d+');
return text.replace(re, '$1'+ newnum);
}
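One thing to watch with '$1' + newnum: the replacement string ends up looking like $15, which relies on the engine reading it as group 1 followed by a digit. A sketch that sidesteps this by passing a replacer function instead of a replacement string is below; the name replaceNthNumberSafe is hypothetical.

```javascript
// Hypothetical variant: replaces the number in slot n (zero-based) of a
// pipe-separated string, using a replacer function instead of '$1' + newnum.
function replaceNthNumberSafe(text, n, newnum) {
  // Group 1 captures the first n "digits|" segments; \d+ is the slot to replace.
  var re = new RegExp('^((?:\\d+\\|){' + n + '})\\d+');
  return text.replace(re, function (match, prefix) {
    // The returned string is inserted literally, so "$" needs no special care.
    return prefix + newnum;
  });
}

console.log(replaceNthNumberSafe('0|31|2|03|4|2007', 1, '5'));
console.log(replaceNthNumberSafe('0|31|2|03|4|2007', 2, '6'));
```

If the string has fewer than n + 1 leading numeric segments, the pattern does not match and the text comes back unchanged.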
Firstly, you don't have to escape | in the character set, because it doesn't have any special meaning in character sets.
Secondly, you don't put quantifiers in character sets.
And finally, to create a global matching expression, you have to use the g flag.
[^\|] means anything but a '|', so in your case it only matches a digit. So it will only match anything with 2 or more digits.
Second you should put the {2} outside of the []-brackets
I'm not sure what you want to achieve here.
