Javascript Replace via RegEx - javascript

I am trying to replace 3 chars with 3 other chars to build/mask an email address for a form.
This works only once or on the first instance of finding it:
email = "email1#domain!com|email2#domain!com|email3#domain!com";
email.replace("#","@").replace("!",".").replace("|",",");
The above code resulted in: email1@domain.com,email2#domain!com|email3#domain!com
After some reading I learned about using RegEx, which is the part of coding I can never wrap my head around:
email.replace("/#/g","@").replace("/!/g",".").replace("/|/g",",");
That didn't work either and left it the same as the original var.
What am I doing wrong?

Do not put quotes around the regex. Regex literals use / as the delimiter.
Additionally, you will need to escape the | because it has a special meaning (alternation) in a regex.
Finally, .replace does not modify the string in place. It returns the result, so assign it back.
email = email.replace(/#/g,'@').replace(/!/g,'.').replace(/\|/g,',');

Using regex literals, you omit the quotes (and you'll need to escape the pipe):
email.replace(/#/g,"@").replace(/!/g,".").replace(/\|/g,",");

email = "email1#domain!com|email2#domain!com|email3#domain!com";
email = email.replace(/#/g,"@").replace(/!/g,".").replace(/\|/g,",");
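As a rough sketch of the same idea, the three replacements can also be folded into a single pass with a character class and a lookup table. The helper name unmaskEmails and the UNMASK table are illustrative only, and the sketch assumes the intended target for # is @, as the question implies.

```javascript
// Hypothetical helper (not from the answers above): unmask in a single pass.
// Assumption: "#" stands for "@", "!" for "." and "|" for ",".
var UNMASK = { "#": "@", "!": ".", "|": "," };

function unmaskEmails(masked) {
  // Inside a character class the pipe is a literal, so it needs no escape.
  return masked.replace(/[#!|]/g, function (ch) {
    return UNMASK[ch];
  });
}

var masked = "email1#domain!com|email2#domain!com|email3#domain!com";

// replace() returns a new string; `masked` itself is left untouched.
var unmasked = unmaskEmails(masked);

// For comparison: a string pattern only replaces the first occurrence.
var firstOnly = masked.replace("!", ".");

console.log(unmasked);
console.log(firstOnly);
```

The callback form keeps the mapping in one place, which may be easier to extend than a chain of replace calls.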

Related

How to split a string in javascript by special char \

I believe that this is simple and I'm missing something. I want to split a physical Windows path with JavaScript. I tried the String#split function, but my result was unexpected.
For this string
"C:\CLC\VIDA\Web\_REPOSITORIO\Colectivos\ReembolsosWeb\TMP_011906169_01_01.pdf"
I'm getting this result
var test = "C:\CLC\VIDA\Web\_REPOSITORIO\Colectivos\ReembolsosWeb\TMP_011906169_01_01.pdf";
test.split("\"); //throws error
test.split("\\"); //result in -> ["C:CLCVIDAWeb_REPOSITORIOColectivosReembolsosWebTMP_011906169_01_01.pdf"]
test.split(/\\/); // -> the regex is the same as above
One last thing, in my test, I found that to get the result that I want I could do it like this
var test2 = "C:\\CLC\\VIDA\\Web\\_REPOSITORIO\\Colectivos\\ReembolsosWeb\\TMP_011906169_01_01.pdf"
test2.split("\\"); // -> ["C:", "CLC", "VIDA", "Web", "_REPOSITORIO", "Colectivos", "ReembolsosWeb", "TMP_011906169_01_01.pdf"]
So my question is, how can I split the string from test var to get the array from the last case?
Strings in JavaScript support escape sequences via the backslash (\). For example, if you need a tab in your string you can add a \t anywhere in it and it will be replaced with a tab; a \n will be replaced with a new line.
The backslashes in test are either converted to their respective characters or dropped because they are invalid escape sequences.
To get around this you can escape one backslash with another to get a single normal backslash. The downside is that this cannot be done from within JavaScript, because the backslashes are already gone by the time your code runs. Generally I paste my string into notepad/N++/Code/Sublime and replace all \ with \\
Since you are hard coding the string, you need to escape all backslashes. After that you can use test.split("\\"), which itself contains an escaped backslash.
So, as far as Javascript is concerned, your code looks like this.
var test = "C:CLCVIDAWeb_REPOSITORIOColectivosReembolsosWebTMP_011906169_01_01.pdf";
To make javascript see the string correctly you need to make it look like this...
var test = "C:\\CLC\\VIDA\\Web\\_REPOSITORIO\\Colectivos\\ReembolsosWeb\\TMP_011906169_01_01.pdf";
Firstly, note that when you have a single backslash in a string, it is used for escaping the next character. It is just ignored if there is no special character next to it to escape.
Now, just have a look at your string :
var test = "C:\CLC\VIDA\Web\_REPOSITORIO\Colectivos\ReembolsosWeb\TMP_011906169_01_01.pdf"
Don't you think all of your single backslashes will be ignored here?
So, the solution is simple and is what you have already tried successfully: escape each of your backslashes with another backslash.
var test2 = "C:\\CLC\\VIDA\\Web\\_REPOSITORIO\\Colectivos\\ReembolsosWeb\\TMP_011906169_01_01.pdf"
test2.split("\\"); // -> ["C:", "CLC", "VIDA", "Web", "_REPOSITORIO", "Colectivos", "ReembolsosWeb", "TMP_011906169_01_01.pdf"]
But, are you worried about any dynamic data which has such backslash? (For example, coming from a text input or a file input.) Don't think about escaping the backslash inside it. Because you don't need to do that! It's already a well formatted string for you, which you can use as it is. You need to escape only when you are hard coding the string yourself.
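A minimal sketch of the difference, using a shortened, made-up path rather than the one from the question. It assumes an ES2015-capable engine for String.raw, which is one way to keep a hard-coded path readable without doubling every backslash.

```javascript
// In an ordinary literal, "\C" and "\V" are not escape sequences,
// so the backslashes are silently dropped when the source is parsed.
var lost = "C:\CLC\VIDA";

// Escaped literal: each "\\" in the source is one backslash at run time.
var escaped = "C:\\CLC\\VIDA\\Web\\TMP_01.pdf";

// String.raw (ES2015) keeps the backslashes exactly as typed.
// Assumption: the path is illustrative, not the asker's real one.
var raw = String.raw`C:\CLC\VIDA\Web\TMP_01.pdf`;

// The separator passed to split is itself an escaped single backslash.
var parts = raw.split("\\");

console.log(lost);
console.log(raw === escaped);
console.log(parts);
```

A string that arrives at run time (from an input field or a file) already contains real backslashes, so it behaves like raw here and can be split directly.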

Regex: get string between last character occurrence before a comma

I need some help with Regex.
I have this string: \\lorem\ipsum\dolor,\\sit\amet\conseteteur,\\sadipscing\elitr\sed\diam
and want to get the result: ["dolor", "conseteteur", "diam"]
So in words: the word between the last backslash and a comma or the end.
I've already figured out a working test, but for some reason it won't work in either the Chrome (v44.0.2403.130) or the IE (v11.0.9600.17905) console.
There I'm getting the result: ["\loremipsumdolor,", "\sitametconseteteur,", "\sadipscingelitrseddiam"]
Can you please tell me, why the online testers aren't working and how i can achieve the right result?
Thanks in advance.
PS: I've tested a few online regex testers with all the same result. (regex101.com, regexpal.com, debuggex.com, scriptular.com)
The string
'\\lorem\ipsum\dolor,\\sit\amet\conseteteur,\\sadipscing\elitr\sed\diam'
is getting escaped, if you try the following in the browser's console you'll see what happens:
var s = '\\lorem\ipsum\dolor,\\sit\amet\conseteteur,\\sadipscing\elitr\sed\diam'
console.log(s);
// prints '\loremipsumdolor,\sitametconseteteur,\sadipscingelitrseddiam'
To use your original string you have to add additional backslashes; otherwise it becomes a different string, because the interpreter treats any character preceded by a single backslash as an escape sequence.
The reason why it works in regexp testers is because they probably sanitize the input string to make sure it gets evaluated as-is.
Try this (added an extra \ for each of them):
str = '\\\\lorem\\ipsum\\dolor,\\\\sit\\amet\\conseteteur,\\\\sadipscing\\elitr\\sed\\diam'
re = /\\([^\\]*)(?:,|$)/g
str.match(re)
// should output ["\dolor,", "\conseteteur,", "\diam"]
UPDATE
You can't prevent the interpreter from processing escape sequences in ordinary string literals, but ECMAScript 2015 (ES6) provides String.raw for this:
s = String.raw`\\lorem\ipsum\dolor,\\sit\amet\conseteteur,\\sadipscing\elitr\sed\diam`
Remember to use backticks instead of single quotes with String.raw.
It works in current browsers and Node.js, but older engines such as Internet Explorer do not implement it.
Also, if you want to avoid matching the last backslash you need to:
remove the \\ at the start of your regexp
use + instead of * to avoid matching the empty string at the line end (it would create an extra match)
use a positive lookahead (?=...) so that the comma is checked but not included in the match
like this:
s = String.raw`\\lorem\ipsum\dolor,\\sit\amet\conseteteur,\\sadipscing\elitr\sed\diam`;
re = /([^\\]+)(?=,|$)/g;
s.match(re);
// ["dolor", "conseteteur", "diam"]
You may try this,
string.match(/[^\\,]+(?=,|$)/gm);
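When match() is called with the g flag it returns whole matches, not capture groups, which is why the first pattern above yields "\dolor," rather than "dolor". A hedged sketch of one alternative is to keep that pattern and read group 1 in an exec() loop; the variable names here are illustrative.

```javascript
// Escaped literal: every "\\" below is one backslash in the actual string.
var str = "\\\\lorem\\ipsum\\dolor,\\\\sit\\amet\\conseteteur,\\\\sadipscing\\elitr\\sed\\diam";

// Same pattern as the first answer's snippet; group 1 holds the last segment.
var re = /\\([^\\]*)(?:,|$)/g;

var names = [];
var m;
// With the g flag, exec() resumes from re.lastIndex on every call
// and returns null once there are no more matches.
while ((m = re.exec(str)) !== null) {
  names.push(m[1]);
}

console.log(names);
```

Every match consumes at least one backslash, so the loop always advances and terminates.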

How do I replace a double-quote with an escape-char double-quote in a string using JavaScript?

Say I have a string variable (var str) as follows-
Dude, he totally said that "You Rock!"
Now I want to make it look as follows-
Dude, he totally said that \"You Rock!\"
How do I accomplish this using the JavaScript replace() function?
str.replace("\"","\\""); is not working so well. It gives an unterminated string literal error.
Now, if the above sentence were to be stored in a SQL database, say in MySQL as a LONGTEXT (or any other VARCHAR-ish) datatype, what other string handling do I need to perform?
Quotes and commas are not very friendly with query strings. I'd appreciate a few suggestions on that matter as well.
You need to use a global regular expression for this. Try it this way:
str.replace(/"/g, '\\"');
Check out regex syntax and options for the replace function in Using Regular Expressions with JavaScript.
Try this:
str.replace("\"", "\\\""); // (Escape backslashes and embedded double-quotes)
Or, use single-quotes to quote your search and replace strings:
str.replace('"', '\\"'); // (Still need to escape the backslash)
As pointed out by helmus, if the first parameter passed to .replace() is a string it will only replace the first occurrence. To replace globally, you have to pass a regex with the g (global) flag:
str.replace(/"/g, "\\\"");
// or
str.replace(/"/g, '\\"');
But why are you even doing this in JavaScript? It's OK to use these escape characters if you have a string literal like:
var str = "Dude, he totally said that \"You Rock!\"";
But this is necessary only in a string literal. That is, if your JavaScript variable is set to a value that a user typed in a form field, you don't need to do this escaping.
Regarding your question about storing such a string in an SQL database, again you only need to escape the characters if you're embedding a string literal in your SQL statement - and remember that the escape characters that apply in SQL aren't (usually) the same as for JavaScript. You'd do any SQL-related escaping server-side.
The other answers will work for most strings, but you can end up unescaping an already escaped double quote, which is probably not what you want.
To work correctly, you are going to need to escape all backslashes and then escape all double quotes, like this:
var test_str = '"first \\" middle \\" last "';
var result = test_str.replace(/\\/g, '\\\\').replace(/\"/g, '\\"');
Depending on how you need to use the string, and the other escaped characters involved, this may still have some issues, but I think it will probably work in most cases.
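A small sketch of that ordering, wrapped in a hypothetical helper named escapeForDoubleQuotes. The JSON.parse round trip is only a convenient check that the escaped text decodes back to the input; it assumes the input contains no control characters such as raw newlines.

```javascript
// Hypothetical helper: escape backslashes first, then double quotes.
// Doing it in the other order would double the backslashes just added.
function escapeForDoubleQuotes(s) {
  return s.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
}

var original = 'Dude, he totally said that "You Rock!"';
var escaped = escapeForDoubleQuotes(original);

// Input that already contains backslash-escaped quotes.
var tricky = '"first \\" middle \\" last "';
var trickyEscaped = escapeForDoubleQuotes(tricky);

// Round trip: wrapping the escaped text in quotes and parsing it
// should give back exactly the string we started from.
var roundTrip = JSON.parse('"' + escaped + '"');
var trickyRoundTrip = JSON.parse('"' + trickyEscaped + '"');

console.log(escaped);
console.log(roundTrip === original, trickyRoundTrip === tricky);
```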
var str = 'Dude, he totally said that "You Rock!"';
var var1 = str.replace(/\"/g,"\\\"");
alert(var1);

JS - RegExp for detecting ".-" , "-."

I am a bit confused about the RegExp I should be using to detect ".-" and "-.". It does accept these combinations as valid, but at the same time "-_" and "_-" get validated as well. Am I missing something or not escaping something properly?
var reg=new RegExp("(\.\-)|(\-\.)");
Actually, it seems any combination containing '-' gets passed.
Got it thank you everyone.
You need to use
"(\\.-)|(-\\.)"
Since you're using a string with the RegExp constructor rather than /, you need to escape twice.
>>> "asd_-ads".search("(\.\-)|(\-\.)")
3
>>> "asd_-ads".search(/(\.\-)|(\-\.)/)
-1
>>> "asd_-ads".search(new RegExp('(\\.\-)|(\-\\.)'))
-1
In notation /(\.\-)|(\-\.)/, the expression would be right.
In the notation you chose, you must double all backslashes, because the backslash has a special meaning of its own inside a string literal, as in \\, \n and so on.
Note there is no need to escape the dash here: var reg = new RegExp("(\\.-)|(-\\.)");
If you don't need to differentiate the matches, you can use a single enclosing capture, or none at all if you only want to check the match: "\\.-|-\\." is still valid.
You are using a double-quoted string, so a single backslash does not reach the regex and the . is left unescaped. Use this notation instead:
var reg = /(\.\-)|(\-\.)/;
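To make the difference visible, the following sketch compares what the regex engine actually receives in each case by inspecting the source property. The variable names are illustrative.

```javascript
// The string literal drops the single backslashes before RegExp sees them,
// so the pattern becomes (.-)|(-.), where . matches any character.
var broken = new RegExp("(\.\-)|(\-\.)");

// Doubling the backslashes delivers a real \. to the regex engine.
var fixed = new RegExp("\\.-|-\\.");

// The literal notation needs only one backslash.
var literal = /\.-|-\./;

console.log(broken.source);
console.log(fixed.source === literal.source);
console.log(broken.test("asd_-ads"), fixed.test("asd_-ads"));
console.log(fixed.test("asd.-ads"), fixed.test("asd-.ads"));
```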

Regex to match all instances not inside quotes

From this q/a, I deduced that matching all instances of a given regex not inside quotes, is impossible. That is, it can't match escaped quotes (ex: "this whole \"match\" should be taken"). If there is a way to do it that I don't know about, that would solve my problem.
If not, however, I'd like to know if there is any efficient alternative that could be used in JavaScript. I've thought about it a bit, but can't come up with any elegant solutions that would work in most, if not all, cases.
Specifically, I just need the alternative to work with .split() and .replace() methods, but if it could be more generalized, that would be the best.
For Example:
An input string of: +bar+baz"not+or\"+or+\"this+"foo+bar+
replacing + with #, not inside quotes, would return: #bar#baz"not+or\"+or+\"this+"foo#bar#
Actually, you can match all instances of a regex not inside quotes for any string where each opening quote is closed again. Say, as in your example above, you want to match \+.
The key observation here is that a word is outside quotes if there is an even number of quotes following it. This can be modeled as a look-ahead assertion:
\+(?=([^"]*"[^"]*")*[^"]*$)
Now, you'd like to not count escaped quotes. This gets a little more complicated. Instead of [^"]*, which advances to the next quote, you need to consider backslashes as well and use [^"\\]*. After you arrive at either a backslash or a quote, you need to ignore the next character if you encountered a backslash, or else advance to the next unescaped quote. That looks like (\\.|"([^"\\]*\\.)*[^"\\]*"). Combined, you arrive at
\+(?=([^"\\]*(\\.|"([^"\\]*\\.)*[^"\\]*"))*[^"]*$)
I admit it is a little cryptic. =)
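As a quick check, here is a sketch that applies the combined look-ahead to the example string from the question. The variable names are illustrative, and the backslashes in the input are doubled only because it is written as a string literal.

```javascript
// A plus counts as "outside quotes" when the rest of the string consists of
// complete quoted strings and escaped characters, with no dangling quote.
var outsideQuotes = /\+(?=([^"\\]*(\\.|"([^"\\]*\\.)*[^"\\]*"))*[^"]*$)/g;

// Actual text: +bar+baz"not+or\"+or+\"this+"foo+bar+
var input = '+bar+baz"not+or\\"+or+\\"this+"foo+bar+';

var output = input.replace(outsideQuotes, "#");

console.log(output);
```

The look-ahead rescans the remainder of the string for every candidate, so on long inputs this may be noticeably slower than a single left-to-right pass.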
Azmisov, resurrecting this question because you said you were looking for any efficient alternative that could be used in JavaScript and any elegant solutions that would work in most, if not all, cases.
There happens to be a simple, general solution that wasn't mentioned.
Compared with alternatives, the regex for this solution is amazingly simple:
"[^"]+"|(\+)
The idea is that we match but ignore anything within quotes to neutralize that content (on the left side of the alternation). On the right side, we capture all the + that were not neutralized into Group 1, and the replace function examines Group 1. Here is full working code:
<script>
var subject = '+bar+baz"not+these+"foo+bar+';
var regex = /"[^"]+"|(\+)/g;
// Quoted strings are matched but returned unchanged; only a captured + is replaced.
var replaced = subject.replace(regex, function(m, group1) {
    if (!group1) return m;
    else return "#";
});
document.write(replaced);
</script>
You can use the same principle to match or split. See the question and article in the reference, which will also point you to code samples.
Hope this gives you a different idea of a very general way to do this. :)
What about Empty Strings?
The above is a general answer to showcase the technique. It can be tweaked depending on your exact needs. If you worry that your text might contain empty strings, just change the quantifier inside the string-capture expression from + to *:
"[^"]*"|(\+)
What about Escaped Quotes?
Again, the above is a general answer to showcase the technique. Not only can the "ignore this match" regex be refined to your needs, you can also add multiple expressions to ignore. For instance, if you want to make sure escaped quotes are adequately ignored, you can start by adding an alternation \\"| in front of the other two in order to match (and ignore) straggling escaped double quotes.
Next, within the section "[^"]*" that captures the content of double-quoted strings, you can add an alternation to ensure escaped double quotes are matched before their " has a chance to turn into a closing sentinel, turning it into "(?:\\"|[^"])*"
The resulting expression has three branches:
\\" to match and ignore
"(?:\\"|[^"])*" to match and ignore
(\+) to match, capture and handle
Note that in other regex flavors, we could do this job more easily with lookbehind. JavaScript did not support lookbehind when this was written; it was added in ES2018.
The full regex becomes:
\\"|"(?:\\"|[^"])*"|(\+)
Reference
How to match pattern except in situations s1, s2, s3
How to match a pattern unless...
You can do it in three steps.
Use a regex global replace to extract all string body contents into a side-table.
Do your comma translation
Use a regex global replace to swap the string bodies back
Code below
// Step 1
var sideTable = [];
myString = myString.replace(
    /"(?:[^"\\]|\\.)*"/g,
    function (_) {
      var index = sideTable.length;
      sideTable[index] = _;
      return '"' + index + '"';
    });
// Step 2, replace commas with newlines
myString = myString.replace(/,/g, "\n");
// Step 3, swap the string bodies back
myString = myString.replace(/"(\d+)"/g,
    function (_, index) {
      return sideTable[index];
    });
If you run that after setting
myString = '{:a "ab,cd, efg", :b "ab,def, egf,", :c "Conjecture"}';
you should get
{:a "ab,cd, efg"
:b "ab,def, egf,"
:c "Conjecture"}
It works because, after step 1,
myString = '{:a "0", :b "1", :c "2"}'
sideTable = ['"ab,cd, efg"', '"ab,def, egf,"', '"Conjecture"'];
so the only commas in myString are outside strings. Step 2, then turns commas into newlines:
myString = '{:a "0"\n :b "1"\n :c "2"}'
Finally we replace the strings that only contain numbers with their original content.
Although the answer by zx81 seems to be the cleanest and best-performing one, it needs these fixes to correctly catch the escaped quotes:
var subject = '+bar+baz"not+or\\"+or+\\"this+"foo+bar+';
and
var regex = /"(?:[^"\\]|\\.)*"|(\+)/g;
Also the already mentioned "group1 === undefined" or "!group1".
The second fix (the regex) seems especially important to actually take everything asked in the original question into account.
It should be mentioned though that this method implicitly requires the string to not have escaped quotes outside of unescaped quote pairs.
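Put together, the fixes listed in this answer might look like the sketch below. It is an assembled example rather than any answerer's exact script, and it carries the same caveat about escaped quotes that appear outside a quoted string.

```javascript
// Actual text: +bar+baz"not+or\"+or+\"this+"foo+bar+
var subject = '+bar+baz"not+or\\"+or+\\"this+"foo+bar+';

// Left branch: a whole quoted string, escape sequences included (kept as is).
// Right branch: a plus outside quotes, captured in group 1.
var regex = /"(?:[^"\\]|\\.)*"|(\+)/g;

var replaced = subject.replace(regex, function (m, group1) {
  // group1 is undefined whenever the quoted-string branch matched.
  return group1 === undefined ? m : "#";
});

console.log(replaced);
```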
