How to use JavaScript replace() without triggering the special replacement patterns?

I'm trying to inject some code into a file by using replace. The problem is if the target code contains any of the replace() function's special character sequences, namely $$, it will do something different than what I expect, which is to just copy the characters exactly as they are.
'replaceme'.replace('replaceme', '$$');
You would think this would result in $$, but it actually returns $.
Is there a way to disable this functionality so that it maintains the $$ like I want?

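The $$ doesn't come through because, in a replacement string, $$ is itself a special pattern that inserts a single $.
But you can get the result you want by using a callback function, whose return value is inserted as-is: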
'replaceme'.replace('replaceme', () => '$$');

If you do not want the replacement string patterns to be applied, you can use a replacement function:
'replaceme'.replace('replaceme', (_match) => '$$');

This might be a bit hacky, but you can write a helper function that replaces every $ in the replacement string with $$:
function replaceRaw(original, match, text) {
  // '$$$$' is read as two '$$' patterns, so each $ in text becomes $$
  text = text.replaceAll('$', '$$$$');
  return original.replace(match, text);
}
console.log(replaceRaw('replaceme', 'replaceme', '$$'));
Note that $$ in the replacement text becomes a single $, so this helper doubles every $ in the replacement text before calling replace() (see the MDN documentation for String.prototype.replace()).
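To make the difference between the approaches above concrete, here is a minimal sketch that runs all three side by side. The helper name escapeReplacement is made up for illustration; it is not a built-in.

```javascript
// Hypothetical helper: doubles every "$" so replace() treats it literally.
function escapeReplacement(text) {
  // A function replacement is used here so no "$" patterns apply to the doubling itself.
  return text.replace(/\$/g, () => '$$');
}

// String replacement: "$$" is the pattern for a single "$".
const viaString = 'replaceme'.replace('replaceme', '$$');
// Function replacement: the return value is inserted verbatim.
const viaFunction = 'replaceme'.replace('replaceme', () => '$$');
// Escaped string replacement: "$$" is first turned into "$$$$".
const viaEscape = 'replaceme'.replace('replaceme', escapeReplacement('$$'));

console.log(viaString);   // $
console.log(viaFunction); // $$
console.log(viaEscape);   // $$
```

The function form is probably the simplest choice when the replacement text comes from somewhere you don't control, since nothing needs escaping.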

Related

What is the function of .source in context of this new RegExp

I ran into the below monster of a regex in the wild today. The regex is meant to validate a url.
function superUrlValidation(url) {
return new RegExp(/^/.source + "((.+):\/\/)?" + /(((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:)*#)?(((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?)(:\d*)?)(\/((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)*)*)?)?(\?((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)|\/|\?)*)?$/.source, "i")
.test(url);
}
I've never seen .source used in a regex like this so I looked it up.
The MDN docs for RegExp.prototype.source state:
The source property returns a String containing the source text of the regexp object, and it doesn't contain the two forward slashes on both sides and any flags.
... and gives this example:
var regex = /fooBar/ig;
console.log(regex.source); // "fooBar", doesn't contain /.../ and "ig".
I understand the MDN example (you're getting the source text of the regex object after it is created, which makes sense), but I don't understand how this is being used in the superUrlValidation regex above.
How is the source being used before the regex object is completed, and what does this accomplish? I can't find any documentation showing .source being used in this way.
Note that .source is used twice in the regex, at the beginning and at the end.
Use of .source everywhere in your regex seems totally unnecessary; it may just be a trick to avoid double escaping. In fact, even new RegExp is not needed, and you can get away with just a regex literal like this:
var re = /^((.+):\/\/)?(((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:)*#)?(((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?)(:\d*)?)(\/((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)*)*)?)?(\?((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)|\/|\?)*)?$/i;
/^/ is a regex literal, meaning it's a valid regex object in its own right. This means that /^/.source === "^".
This seems like an arbitrary use of the source property, since the author could have just placed a "^" in its place, or even just put a ^ at the beginning of the next string, and it would have the same effect.
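As a rough illustration of what the .source concatenation does, here is a small sketch that builds a regex from literal pieces. The pattern is a deliberately tiny, hypothetical one (it is not a real URL validator), and the variable names are invented for the example.

```javascript
// .source yields the pattern text without the delimiting slashes or flags.
const start = /^/.source;                  // "^"
const scheme = /(?:[a-z]+:\/\/)?/.source;  // optional "scheme://"
const host = /[a-z0-9.-]+/.source;         // very loose host part
const end = /$/.source;                    // "$"

// Concatenating the pieces gives an ordinary string for the RegExp constructor.
const simpleUrl = new RegExp(start + scheme + host + end, 'i');

console.log(start === '^');                         // true
console.log(simpleUrl.test('HTTP://example.com'));  // true
console.log(simpleUrl.test('not a url'));           // false
```

Writing the pieces as literals means backslashes only need to be written once, whereas the same pieces written as plain strings would need "\\/" and so on. That is the only practical benefit I can see.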
The .source property returns the content of the regex between the forward slashes, as you say, so the result of the above is equivalent to this regex literal:
/^((.+):\/\/)?(((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:)*#)?(((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?)(:\d*)?)(\/((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)*)*)?)?(\?((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)|\/|\?)*)?$/i
In JavaScript you can write regexes like this: /matchsomething/, or by using the RegExp constructor as above. It looks like the code you found is the result of someone not knowing what they were doing. They seem to have taken a few regexes written in the literal syntax (i.e. /match_here/), plugged them into the constructor version and stuck them all together.
I can't see any benefit in using the source property this way. I would just use the string version or the constructor version. Or better, find out what the original author intended and write it again or find a respected regex library with the criteria you need.
And, yeah, wow. It's massive.

Can't replace() using regex

I want to replace all "w" with "e":
var d = document.getElementById('hash').innerHTML.replace("/w/g","e");
But it doesn't replace anything! Earlier I tried using replace("w","e"), but it replaced only the first "w". So, what to do?
If you actually want to modify the content (innerHTML) of an element (the original version of your question made this seem very likely), you should use this:
var el = document.getElementById('hash');
el.innerHTML = el.innerHTML.replace(/w/g, 'e');
... as replace (like any other string method) doesn't change the string it operates on in place; instead it creates a new string (the result of the replacement) and returns it.
If you only want to make a new, transformed version of the element's contents, the only change needed is to use a proper regex object, written either as a regex literal (delimited by / symbols) or with the new RegExp(...) construct. In this particular case the latter would definitely be overkill. So just drop the quotes around /w/g (already done in the code snippet above), and you'll be fine.
As for the '...replace('w', 'e') replaces only once...' part, that's actually quite a common gotcha: when given a string as the first argument, .replace() replaces only the first occurrence.
You're giving replace a string instead of a regex. Use a real regex:
var d = document.getElementById('hash').innerHTML.replace(/w/g,"e");
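For reference, here is a small sketch of the variants discussed above, run on a plain string instead of innerHTML so it works outside a browser. The sample text is made up.

```javascript
const text = 'wow, www';

// A quoted "/w/g" is just a string; it would only match the literal text "/w/g".
const asString = text.replace('/w/g', 'e');
// A plain string pattern replaces the first occurrence only.
const firstOnly = text.replace('w', 'e');
// A regex literal with the g flag replaces every occurrence.
const withRegex = text.replace(/w/g, 'e');
// replaceAll() with a string pattern also replaces every occurrence.
const withReplaceAll = text.replaceAll('w', 'e');

console.log(asString);       // wow, www
console.log(firstOnly);      // eow, www
console.log(withRegex);      // eoe, eee
console.log(withReplaceAll); // eoe, eee
```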

removing phpbb tag using regex javascript

I'm trying to remove square-bracket (BBCode-style) tags using JavaScript, in order to strip unwanted BBCode.
I tried this:
theString.replace(/\[quote[^\/]+\]*\[\/quote\]/, "")
It works with this sample string:
theString = "[quote=MyName;225]Test 123[/quote]";
but it fails with this one:
theString = "[quote=MyName;225]Test [quote]inside quotes[/quote]123[/quote]";
If there is a solution that doesn't use regex, that's no problem.
The other 2 solutions simply do not work (see my comments). To solve this problem you first need to craft a regex which matches the innermost matching quote elements (which contain neither [QUOTE..] nor [/QUOTE]). Next, you need to iterate, applying this regex over and over until there are no more QUOTE elements left. This tested function does what you want:
function filterQuotes(text)
{   // Regex matches inner [QUOTE]non-quote-stuff[/quote] tag.
    var re = /\[quote[^\[]+(?:(?!\[\/?quote\b)\[[^\[]*)*\[\/quote\]/ig;
    while (text.search(re) !== -1)
    {   // Need to iterate removing QUOTEs from inside out.
        text = text.replace(re, "");
    }
    return text;
}
Note that this regex employs Jeffrey Friedl's "Unrolling the loop" efficiency technique and is not only accurate, but is quite fast to boot.
See: Mastering Regular Expressions (3rd Edition) (highly recommended).
Try this one:
/\[quote[^\/]+\].*\[\/quote\]$/
The $ sign means that only the closing quote element at the very end of the string is used to determine where the quote you're trying to remove ends.
I also added a "." before the asterisk so that it matches any character in between. I tested this with your two strings and it worked.
Edit: I don't know exactly how you are using this, but as an addition: if you also want the pattern to match a string where no attributes are present, for example:
[quote]Hello[/quote]
You should change the "+" sign into an asterisk as well like this:
/\[quote[^\/]*\].*\[\/quote\]$/
This answer has flaws, see Ridgerunner's answer for a more correct one.
Here's my crack at it.
function filterQuotes(text)
{
return text.replace(/\[(\/)?quote([^\/]*)?\]/g,"");
}
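Since the question says a non-regex-only solution is fine, here is a sketch of a different technique from the answers above: scan the tags in order and keep a nesting-depth counter, dropping everything between an outermost [quote...] and its matching [/quote]. The name stripQuotes is hypothetical, and the choice to leave unclosed or stray tags untouched is an assumption about what is wanted.

```javascript
// Depth-counting alternative (stripQuotes is a hypothetical name).
function stripQuotes(text) {
  // Matches either an opening [quote...] tag or a closing [/quote] tag.
  const tag = /\[(\/?)quote\b[^\]]*\]/gi;
  let depth = 0;     // how many quote blocks are currently open
  let openAt = -1;   // index where the outermost open quote starts
  let keepFrom = 0;  // start of the text not yet copied to the output
  let out = '';
  let m;
  while ((m = tag.exec(text)) !== null) {
    if (m[1] === '') {
      // Opening tag: remember where the outermost block begins.
      if (depth === 0) openAt = m.index;
      depth++;
    } else if (depth > 0) {
      // Closing tag: when the outermost block closes, skip over it.
      depth--;
      if (depth === 0) {
        out += text.slice(keepFrom, openAt);
        keepFrom = tag.lastIndex;
      }
    }
  }
  // Text after the last complete block (including any unclosed quote) is kept.
  return out + text.slice(keepFrom);
}

console.log(stripQuotes('[quote=MyName;225]Test [quote]inside quotes[/quote]123[/quote]')); // prints an empty line
console.log(stripQuotes('before [quote]x[/quote] after')); // before  after
```

This makes a single pass over the string, at the cost of being a bit longer than the iterated-regex version.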

Why does a $ in the replacement string passed to the replace() function break some of the time? [duplicate]

I have a variable that I'm using to build a JavaScript function call, and I use JavaScript's .replace() to wrap the line of text in a span with an onclick event. The .replace() portion looks like this:
code.replace(/(\d{4}\s+)?(LOCAL|PARAMETER|GLOBAL)\s+USING\s+([\S]+)/g,
"<span class=\"natprint_popup\" onclick=\"getNaturalCode('"
+ lib
+ "','$3','##test_prod_qual|',0,'Y'); return false;\">$&</span>");
The only problem is that the variable lib sometimes contains a $ at the end; for example, lib == "DPDRI$". This causes the JavaScript on my page to break, and I get output that breaks at the end of lib and displays the rest of the JavaScript function parameters as plain text:
,'DPDPDRNO','TEST',0,'Y'); return false;">
I've been looking fruitlessly for answers for a few days now. I've tried doing lib.replace(/\$/g, "\\$"); and the \$ is successfully making its way into the variable but it still breaks my code. It seems like the JavaScript engine is trying to interpret the $ at the end of lib as a captured match and it's making it blow up. Anyone have any ideas how to make this work?
See the "Specifying a string as a parameter" section of the replace() documentation on MDN:
The replacement string can include the following special replacement patterns:
Pattern   Inserts
$$        Inserts a "$".
...
$'        Inserts the portion of the string that follows the matched substring.
...
Note that since the contents of your variable are being inserted as a single-quote-wrapped parameter to the function you're calling from onclick, they will always be followed by a ' - so when the value ends with a $, you'll have inadvertently created a replacement pattern:
... onclick=\"getNaturalCode('DPDRI$' ...
Now, you could just change how you quote that particular parameter. But to be safe, you should really escape the $ symbol:
+ lib.replace(/\$/g, "$$$$")
The above modification will convert "DPDRI$" into "DPDRI$$" prior to its insertion into the replacement string, allowing the final replacement to contain a literal $.
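Here is a cut-down sketch of the problem and the fix. The input string, the shortened pattern (two capture groups instead of three) and the call(...) wrapper are simplified stand-ins, not the code from the question.

```javascript
// Hypothetical stand-ins for the values in the question.
const lib = 'DPDRI$';
const code = 'LOCAL USING DPDPDRNO END';
const pattern = /(LOCAL|PARAMETER|GLOBAL)\s+USING\s+(\S+)/g;

// Unescaped: the "$" ending lib plus the quote after it forms "$'",
// which inserts the text that follows the match (" END" here).
const broken = code.replace(pattern, "call('" + lib + "','$2')");

// Escaped: every "$" in lib is doubled first, so it survives as a literal "$".
const safeLib = lib.replace(/\$/g, '$$$$');
const fixed = code.replace(pattern, "call('" + safeLib + "','$2')");

// Alternative: a replacement function, whose return value is never scanned for patterns.
const viaFunction = code.replace(pattern, (match, kind, name) =>
  "call('" + lib + "','" + name + "')");

console.log(broken);      // call('DPDRI END,'DPDPDRNO') END
console.log(fixed);       // call('DPDRI$','DPDPDRNO') END
console.log(viaFunction); // call('DPDRI$','DPDPDRNO') END
```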

How to extract a string from a larger string?

Using jQuery, how would I find/extract the "Spartan" string if the output HTML page contained the following?
<a href="/mytlnet/logout.php?t=e22df53bf4b5fc0a087ce48897e65ec0">
<b>Logout</b>
</a> : Spartan<br>
Regular Expressions. Or by splitting the string in a more tedious fashion.
Since I'm not a big regex-junkie, I would likely get the text-equivalent using .text(), then split the result on ":", and grab the second index (which would be the 'Spartan' text).
If the pattern is going to be consistent, you can use a RegEx,
or, if the markup is going to be the same, you can try the jQuery HTML parser.
As well as using regular expressions, I've also abusively used these functions to do the things it seems you want to do (strip html and such from a text string):
// removes all HTML tags
function striptags(stringToStrip) {
    return stringToStrip.replace(/(<([^>]+)>)/ig,"");
}
// standard trim function for JavaScript: removes leading and trailing white space
function trim(stringToTrim) {
    return stringToTrim.replace(/^\s+|\s+$/g,"");
}
Incredible regular expressions that have saved me a lot of time.
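Putting the suggestions above together, a minimal sketch of the "strip the tags, split on the colon, trim" approach might look like this. It works on a plain HTML string rather than through jQuery, the function name extractAfterColon is invented for the example, and the href in the sample is shortened.

```javascript
// Hypothetical helper: returns the text after the first ":" once tags are removed.
function extractAfterColon(html) {
  // Drop anything that looks like an HTML tag (attributes go with it).
  const text = html.replace(/<[^>]+>/g, '');
  const parts = text.split(':');
  // parts[1] is the text between the first and second colon, if any.
  return parts.length > 1 ? parts[1].trim() : '';
}

const sample = '<a href="/mytlnet/logout.php">\n<b>Logout</b>\n</a> : Spartan<br>';
console.log(extractAfterColon(sample)); // Spartan
```

With jQuery you would typically get the text from the containing element via .text() instead of stripping tags by hand; the split-and-trim step stays the same.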
