Escaping backslash in a string containing backslash - javascript

I have a string containing I\u2019m (with backslashes not escaped)
var myString = 'I\\u2019m'; // I\u2019m
But then I need a function that 'escape backslashes' that string, so the function I'm looking for would return I'm
backslashString(myString); // I'm
I've tried using eval:
function backslashString(input){
input = input.replace(/'/g, "\\'"); // Replace ' with \' that's going to mess up eval
return eval(`'${input}'`);
}
But is there a proper way of doing it? I'm looking for a function that escape backslashes a string containing I\u2019m to I'm and also handles if there's an extra backslash (A lost \ backslash)
EDIT:
I did not ask what I meant from the start. This not only applies to unicode characters, but applies to all backslash characters including \n

The backslashes aren’t the real problem here - the real problem is the difference between code and data.
\uXXXX is JavaScript syntax to write the Unicode codepoint of a character in a text literal. It gets replaced with the actual character, when the JavaScript parser interprets this code.
Now you have a variable that contains the value I\u2019m already - that is data. This does not get parsed as JavaScript, so it does mean the literal characters I\u2019m, and not I’m. eval can “fix” that, because the missing step of interpreting this as code is simply what eval does.
If you do not want to use eval (and thereby invite all the potential risks that entails, if the input data is not completely under your control), then you can parse those numeric values from the string using regular expressions, and then use String.formCharCode to create the actual Unicode character from the given code point:
var myString = 'I\\u2019m and I\\u2018m';
var myNewString = myString.replace(/\\u([0-9]+)/g, function(m, n) {
return String.fromCharCode(parseInt(n, 16)) }
);
console.log(myNewString)
/\\u([0-9]+)/g - regular expression to match this \uXXXX format (X=digits), g modifier to replace all matches instead of stopping after the first.
parseInt(n, 16) - to convert the hexadecimal value to a decimal first, because String.fromCharCode wants the latter.

decodeURIComponent(JSON.parse('"I\\u2019m"'));
OR for multiple
'I\\\u2019m'.split('\\').join().replace(/,/g,'');
'I\u2019m'.split('\\').join().replace(/,/g,'');

Looks like there's no other way other than eval (JSON.parse doesn't like new lines in strings)
NOTE: The function would return false if it has a trailing backslash
function backslashString(input){
input = input.replace(/`/g, '\\`'); // Escape quotes for input to eval
try{
return eval('`'+input+'`');
}catch(e){ // Will return false if input has errors in backslashing
return false;
}
}

Related

\E can't replace from= '\E\\E\10.1.2.154\E\bcs\E\30877_P9999_Adult{2}_02_05_2019_0329p.pdf'

var str='\E\\E\10.1.2.154\E\bcs\E\30877_P9999_Adult{2}_02_05_2019_0329p.pdf';
var res=str.replace('\E', '');
I am getting return like this:
\E.1.2.154EcsE877_P9999_Adult{2}_02_05_2019_0329p.pdf
I need to replace all '\E' from string and expecting output like this (\\10.1.2.154\bcs\30877_P9999_Adult{2}_02_05_2019_0329p.pdf). Some body please advise on this . I tried to do several way to fix this. No luck. When I tried with C# it's working fine.
static void Main(string[] args)
{
string str=#"\E\\E\10.1.2.154\E\bcs\E\30877_P9999_Adult{2}_02_05_2019_0329p.pdf";
str=str.Replace(#"\E","");
Console.WriteLine(str);
Console.Read();
}
But, I need it in JavaScript.
var str='\E\\E\10.1.2.154\E\bcs\E\30877_P9999_Adult{2}_02_05_2019_0329p.pdf';
var res=str.replace(/\\E/g, '');
console.log(str, res)
You must use RegExg to eliminate the escape inside a string in JavaScript.
In JavaScript the equivalent of the C# # prefix is String.raw followed by a template literal (notice the backtics).
And to replace all occurrences, not just one, you need to pass a regex to replace with g modifier.
var str=String.raw`\E\\E\10.1.2.154\E\bcs\E\30877_P9999_Adult{2}_02_05_2019_0329p.pdf`;
var res=str.replace(/\\E/g, '');
console.log(res);
NB: The backslash in a regular expression is an escape character, so you need \\ for one literal backslash.
If for some reason you really want to avoid the use of a regex, then there is the split/join trick, but it is a bit slower:
var str=String.raw`\E\\E\10.1.2.154\E\bcs\E\30877_P9999_Adult{2}_02_05_2019_0329p.pdf`;
var res=str.split(String.raw`\E`).join('');
console.log(res);
Older JS engines
For older JS engines that do not support String.raw you need to use standard string literals, which use the backslash as escape character. So then you need to double all of them. But this is only needed when you write the string as a literal. When you get the string via some API, then there is no need to alter the string before doing the replacement:
var str='\\E\\\\E\\10.1.2.154\\E\\bcs\\E\\30877_P9999_Adult{2}_02_05_2019_0329p.pdf';
var res=str.replace(/\\E/g, '');
console.log(res);

Line endings (also known as Newlines) in JS strings

It is well known, that Unix-like system uses LF characters for newlines, whereas Windows uses CR+LF.
However, when I test this code from local HTML file on my Windows PC, it seems that JS treat all newlines as separated with LF. Is it correct assumption?
var string = `
foo
bar
`;
// There should be only one blank line between foo and bar.
// \n - Works
// string = string.replace(/^(\s*\n){2,}/gm, '\n');
// \r\n - Doesn't work
string = string.replace(/^(\s*\r\n){2,}/gm, '\r\n');
alert(string);
// That is, it seems that JS treat all newlines as separated with
// `LF` instead of `CR+LF`?
I think I found an explanation.
You are using an ES6 Template Literal to construct your multi-line string.
According to the ECMAScript specs a
.. template literal component is interpreted as a sequence of Unicode
code points. The Template Value (TV) of a literal component is
described in terms of code unit values (SV, 11.8.4) contributed by the
various parts of the template literal component. As part of this
process, some Unicode code points within the template component are
interpreted as having a mathematical value (MV, 11.8.3). In
determining a TV, escape sequences are replaced by the UTF-16 code
unit(s) of the Unicode code point represented by the escape sequence.
The Template Raw Value (TRV) is similar to a Template Value with the
difference that in TRVs escape sequences are interpreted literally.
And below that, it is defined that:
The TRV of LineTerminatorSequence::<LF> is the code unit 0x000A (LINE
FEED).
The TRV of LineTerminatorSequence::<CR> is the code unit 0x000A (LINE FEED).
My interpretation here is, you always just get a line feed - regardless of the OS-specific new-line definitions when you use a template literal.
Finally, in JavaScript's regular expressions a
\n matches a line feed (U+000A).
which describes the observed behavior.
However, if you define a string literal '\r\n' or read text from a file stream, etc that contains OS-specific new-lines you have to deal with it.
Here are some tests that demonstrate the behavior of template literals:
`a
b`.split('')
.map(function (char) {
console.log(char.charCodeAt(0));
});
(String.raw`a
b`).split('')
.map(function (char) {
console.log(char.charCodeAt(0));
});
'a\r\nb'.split('')
.map(function (char) {
console.log(char.charCodeAt(0));
});
"a\
b".split('')
.map(function (char) {
console.log(char.charCodeAt(0));
});
Interpreting the results:
char(97) = a, char(98) = b
char(10) = \n, char(13) = \r
You could use the regular expression: /^\s*[\r\n]/gm
Code example:
let string = `
foo
bar
`;
string = string.replace(/^\s*[\r\n]/gm, '\r\n');
console.log(string);

What are the actual uses of ES6 Raw String Access?

What are the actual uses of String.raw Raw String Access introduced in ECMAScript 6?
// String.raw(callSite, ...substitutions)
function quux (strings, ...values) {
strings[0] === "foo\n"
strings[1] === "bar"
strings.raw[0] === "foo\\n"
strings.raw[1] === "bar"
values[0] === 42
}
quux `foo\n${ 42 }bar`
String.raw `foo\n${ 42 }bar` === "foo\\n42bar"
I went through the below docs.
http://es6-features.org/#RawStringAccess
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/template_strings
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/raw
http://www.2ality.com/2015/01/es6-strings.html
https://msdn.microsoft.com/en-us/library/dn889830(v=vs.94).aspx
The only the thing that I understand, is that it is used to get the raw string form of template strings and used for debugging the template string.
When this can be used in real time development? They were calling this a tag function. What does that mean?
What concrete use cases am I missing?
The best, and very nearly only, use case for String.raw I can think of is if you're trying to use something like Steven Levithan's XRegExp library that accepts text with significant backslashes. Using String.raw lets you write something semantically clear rather than having to think in terms of doubling your backslashes, just like you can in a regular expression literal in JavaScript itself.
For instance, suppose I'm doing maintenance on a site and I find this:
var isSingleUnicodeWord = /^\w+$/;
...which is meant to check if a string contains only "letters." Two problems: A) There are thousands of "word" characters across the realm of human language that \w doesn't recognize, because its definition is English-centric; and B) It includes _, which many (including the Unicode consortium) would argue is not a "letter."
So if we're using XRegExp on the site, since I know it supports \pL (\p for Unicode categories, and L for "letter"), I might quickly swap this in:
var isSingleUnicodeWord = XRegExp("^\pL+$"); // WRONG
Then I wonder why it didn't work, facepalm, and go back and escape that backslash, since it's being consumed by the string literal.
Easy enough in that simple regex, but in something complicated, remembering to double all those backslashes is a maintenance pain. (Just ask Java programmers trying to use Pattern.)
Enter String.raw:
let isSingleUnicodeWord = XRegExp(String.raw`^\pL+$`);
Example:
let isSingleUnicodeWord = XRegExp(String.raw`^\pL+$`); // L: Letter
console.log(isSingleUnicodeWord.test("Русский")); // true
console.log(isSingleUnicodeWord.test("日本語")); // true
console.log(isSingleUnicodeWord.test("العربية")); // true
console.log(isSingleUnicodeWord.test("foo bar")); // false
<script src="https://cdnjs.cloudflare.com/ajax/libs/xregexp/3.1.1/xregexp-all.min.js"></script>
Now I just kick back and write what I mean. I don't even really have to worry about ${...} constructs used in template literals to do substitution, because the odds of my wanting to apply a quantifier {...} to the end-of-line assertion ($) are...low. So I can happily use substitutions and still not worry about backslashes. Lovely.
Having said that, though, if I were doing it a lot, I'd probably want to write a function and use a tagged template instead of String.raw itself. But it's surprisingly awkward to do correctly:
// My one-time tag function
function xrex(strings, ...values) {
let raw = strings.raw;
let max = Math.max(raw.length, values.length);
let result = "";
for (let i = 0; i < max; ++i) {
if (i < raw.length) {
result += raw[i];
}
if (i < values.length) {
result += values[i];
}
}
console.log("Creating with:", result);
return XRegExp(result);
}
// Using it, with a couple of substitutions to prove to myself they work
let category = "L"; // L: Letter
let maybeEol = "$";
let isSingleUnicodeWord = xrex`^\p${category}+${maybeEol}`;
console.log(isSingleUnicodeWord.test("Русский")); // true
console.log(isSingleUnicodeWord.test("日本語")); // true
console.log(isSingleUnicodeWord.test("العربية")); // true
console.log(isSingleUnicodeWord.test("foo bar")); // false
<script src="https://cdnjs.cloudflare.com/ajax/libs/xregexp/3.1.1/xregexp-all.min.js"></script>
Maybe the hassle is worth it if you're using it in lots of places, but for a couple of quick ones, String.raw is the simpler option.
First, a few things:
Template strings is old name for template literals.
A tag is a function.
String.raw is a method.
String.raw `foo\n${ 42 }bar\` is a tagged template literal.
Template literals are basically fancy strings.
Template literals can interpolate.
Template literals can be multi-line without using \.
String.raw is required to escape the escape character \.
Try putting a string that contains a new-line character \n through a function that consumes newline character.
console.log("This\nis\nawesome"); // "This\nis\nawesome"
console.log(String.raw`This\nis\nawesome`); // "This\\nis\\nawesome"
If you are wondering, console.log is not one of them. But alert is. Try running these through http://learnharmony.org/ .
alert("This\nis\nawesome");
alert(String.raw`This\nis\nawesome`);
But wait, that's not the use of String.raw.
Possible uses of String.raw method:
To show string without interpretation of backslashed characters (\n, \t) etc.
To show code for the output. (As in example below)
To be used in regex without escaping \.
To print windows director/sub-directory locations without using \\ to much. (They use \ remember. Also, lol)
Here we can show output and code for it in single alert window:
alert("I printed This\nis\nawesome with " + Sring.raw`This\nis\nawesome`);
Though, it would have been great if It's main use could have been to get back the original string. Like:
var original = String.raw`This is awesome.`;
where original would have become: This\tis \tawesome.. This isn't the case sadly.
References:
http://exploringjs.com/es6/ch_template-literals.html
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/raw
Template strings can be useful in many situations which I will explain below. Considering this, the String.raw prevents escapes from being interpreted. This can be useful in any template string in which you want to contain the escape character but do not want to escape it. A simple example could be the following:
var templateWithBackslash = String.raw `someRegExp displayed in template /^\//`
There are a few things inside that are nice to note with template strings.
They can contain unescaped line breaks without problems.
They can contain "${}". Inside these curly braces the javascript is interpreted instead.
(Note: running these will output the result to your console [in browser dev tools])
Example using line breaks:
var myTemplate = `
<div class="myClass">
<pre>
My formatted text
with multiple lines
{
asdf: "and some pretty printed json"
}
</pre>
</div>
`
console.log(myTemplate)
If you wanted to do the above with a normal string in Javascript it would look like the following:
var myTemplate = "\
<div class="myClass">\
<pre>\
My formatted text\
with multiple lines\
{\
asdf: "and some pretty printed json"\
}\
</pre>\
</div>"
console.log(myTemplate)
You will notice the first probably looks much nicer (no need to escape line breaks).
For the second I will use the same template string but also insert the some pretty printed JSON.
var jsonObj = {asdf: "and some pretty printed json", deeper: {someDeep: "Some Deep Var"}}
var myTemplate = `
<div class="myClass">
<pre>
My formatted text
with multiple lines
${JSON.stringify(jsonObj, null, 2)}
</pre>
</div>
`
console.log(myTemplate)
In NodeJS it is extremely handy when it comes to filepath handling:
var fs=require('fs');
var s = String.raw`C:\Users\<username>\AppData\Roaming\SomeApp\someObject.json`;
var username = "bob"
s=s.replace("<username>",username)
fs.readFile(s,function(err,result){
if (err) throw error;
console.log(JSON.parse(result))
})
It improves readability of filepaths on Windows. \ is also a fairly common separator, so I can definitely see why it would be useful in general. However it is pretty stupid how \ still escapes `... So ultimately:
String.raw`C:\Users\` //#==> C:\Users\`
console.log(String.raw`C:\Users\`) //#==> SyntaxError: Unexpected end of input.
In addition to its use as a tag, String.raw is also useful in implementing new tag functions as a tool to do the interleaving that most people do with a weird loop. For example, compare:
function foo(strs, ...xs) {
let result = strs[0];
for (let i = 0; i < xs.length; ++i) {
result += useFoo(xs[i]) + strs[i + 1];
}
return result;
}
with
function foo(strs, ...xs) {
return String.raw({raw: strs}, ...xs.map(useFoo));
}
The Use
(Requisite knowledge: tstring §.)
Instead of:
console.log(`\\a\\b\\c\\n\\z\\x12\\xa9\\u1234\\u00A9\\u{1234}\\u{00A9}`);
.you can:
console.log(String.raw`\a\b\c\n\z\x12\xa9\u1234\u00A9\u{1234}\u{00A9}`);
"Escaping"
<\\u> is fine, yet <\u> needs "escaping", eg:
console.log(String.raw`abc${'\\u'}abc`);
.Dit <\\x>, <\x>,
<console.log(String.raw`abc${`\\x`}abc`)>;
.<\`>, <`>, <console.log(String.raw`abc${`\``}abc`)>;
.<\${>, <${&>, <console.log(String.raw`abc${`$\{`}abc`)>;
.<\\1> (till <\\7>), <\1>, <console.log(String.raw`abc${`\\1`}abc`)>;
.<\\>, endunit <\>, <console.log(String.raw`abc${`\\`}`)>.
Nb
There's also a new "latex" string. Cf §.
I've found it to be useful for testing
my RegExps. Say I have a RegExp which
should match end-of-line comments because
I want to remove them. BUT, it must not
match source-code for a regexp like /// .
If your code contains /// it is not the
start of an EOL comment but a RegExp, as
per the rules of JavaScript syntax.
I can test whether my RegExp in variable patEOLC
matches or doesn't /// with:
String.raw`/\//` .match (patEOLC)
In other words it is a way to let my
code "see" code the way it exists in
source-code, not the way it exists
in memory after it has been read
into memory from source-code, with
all backslashes removed.
It is a way to "escape escaping" but
without having to do it separately
for every backslash in a string, but
for all of them at the same time.
It is a way to say that in a given
(back-quoted) string backslash
shall behave just like any other
character, it has no special
meaning or interpretation.

JSON.stringify - Strange behaviour with backslash in array

I was playing with JS while I noticed a strange behaviour with backslash \ inserted into a string within an array printed using JSON.stringify(). Of course the backslash is used to escaping special chars, but what happens if we need to put backslash in a string? Just use backslash to escape itself you're thinking, but it doesn't work with JSON.stringify
This should print one backslash
array = ['\\'];
document.write(JSON.stringify(array));
This should print two backslashes
array = ['\\\\'];
document.write(JSON.stringify(array));
Am I missing something? Could be that considered as a bug of JSON.stringify?
It is correct. JSON.stringify will return the required string to recreate that object - as your string requires that you escape a backslash, it will also return the required escape backslash to generate the string properly.
Try this:
array = ['\\'];
var x = JSON.stringify(array)
var y = JSON.parse(x)
if (array[0] == y[0]) alert("it works")
or
array = ['\\'];
if (JSON.parse(JSON.stringify(array))[0] == array[0]) alert("it really works")

replaceAll in javascript and dollar sign

I have a string with 3 dollar signs e.g. $$$Test123. I would like to display this string in a div.
The problem is that when I use replace I get $$Test123 - 2 dollar signs instead of 3.
example:
var sHtml="<_content_>";
var s="$$$Test";
sHtml= sHtml.replace("<_content_>", s);
Now the result of sHtml is $$Test;
Any idea how can it be solved?
javascript does not have a default replace all function. You can write your own like this
function replaceAll(txt, replace, with_this) {
return txt.replace(new RegExp(replace, 'g'),with_this);
}
$ has a special meaning when included in a string for the second argument of a call to replace(). Normally, you would use it to refer to matched expressions within the original string. For example:
"foo foooo".replace(/fo+/g, "$&bar");
//-> "foobar foooobar"
In the example above, $& refers back to the entire match, which is foo in the first word and foooo in the second.
Your problem stems from the special meaning of $. In order to use a literal $ in the match, you must chain two together so that the first escapes the second. To have 3 literal $ symbols, you must chain 6 together, like so:
var sHtml="<_content_>";
var s="$$$$$$Test";
sHtml= sHtml.replace("<_content_>", s);
//-> "$$$Test"
Quotes are your friend
var sHtml="<_content_>"
var s="$$$Test";
sHtml= sHtml.replace("<_content_>", s);
Try this replaceAll function:
http://www.dumpsite.com/replaceAll.php
It performs a replace all using the javascript replace function via a regular expression for speed, and at the same time eliminates the side effects that occur when regular expression special characters are inadvertently present in either the search or replace string.
Using this function you do not have to worry about escaping special characters. All special characters are pre escaped before the replaceAll is preformed.
This function will produce the output you are expecting.
Try it out and provide your feeback.

Categories

Resources