Convert unicode characters to their character - javascript

I have a file of localized properties coming in.
The file is like this:
str1=Rawr
str2=This is a dot \u00B7
In str2, \u00B7 is meant as the Unicode escape for the character, not the literal string \\u00B7. Is there any way to parse the strings so that the Unicode escapes are converted to their characters?

Add double quotes around the value – then JSON.parse can do the job for you.
If you want to read and parse
str1=Rawr
str2=This is a dot \u00B7
as one value, then you will need to replace the line breaks with \n before doing so, otherwise it’ll break the syntax of the “string” you are passing to JSON.parse.
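A minimal sketch of the quote-and-parse idea, applied line by line instead of to the whole file. The helper names decodeValue and parseProperties are hypothetical, and the sketch assumes that every backslash in a value starts a valid JSON escape (such as \u00B7 or \n) and that values contain no backslash-escaped double quotes; anything else makes JSON.parse throw.

```javascript
// Hypothetical helper: decode \uXXXX-style escapes in one property value.
function decodeValue(raw) {
  // Escape bare double quotes so they cannot end the JSON string early.
  const quoted = '"' + raw.replace(/"/g, '\\"') + '"';
  return JSON.parse(quoted);
}

// Hypothetical helper: turn "key=value" lines into an object.
function parseProperties(text) {
  const result = {};
  for (const line of text.split(/\r?\n/)) {
    const idx = line.indexOf("=");
    if (idx === -1) continue; // skip blank or malformed lines
    result[line.slice(0, idx)] = decodeValue(line.slice(idx + 1));
  }
  return result;
}

// String.raw keeps the backslash, as it would appear in the file.
const sample = String.raw`str1=Rawr
str2=This is a dot \u00B7`;

const props = parseProperties(sample);
console.log(props.str1);
console.log(props.str2);
```

Parsing each line separately avoids having to replace the line breaks first, since no raw line break ever reaches JSON.parse.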

Related

regex for capturing the values

I need some help with a regex for nested double quotes.
I have the following string:
"abcd-1234\":" : value\":1234\":
and I want to capture the entire string and separate it out into a key and value pair, but I am not able to come up with a proper regex.
Basically, I have the following string format -->
"key" : "value"
and I want to find a proper regex for the string format.
I am able to capture the key and value individually with the following regex -->
((^[\"]).*\2(?![^:]))
But not able to get a proper regex for the entire string.
Please, can someone help me with the regex.
Imagine the following string: "\\". It contains \" but is still a complete, valid string. You can't just ignore every \"; you have to count backslashes.
(?:[^"\\]|\\.) will cover any 'in-the-string' character: Either a backslash followed by anything (. is anything), or any character at all, as long as it isn't either a backslash, or a quote. A string is a quote, followed by any amount of those, followed by a quote, thus, a regexp appears.
However, regexps probably aren't the right tool for the job. This looks like a part of a JSON formatted input; there are JSON parsers that do a much better job on this, covering far more cases.
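As a rough sketch, the pattern above can be assembled into a regex for a whole "key" : "value" pair. The names STRING, PAIR and matchPair are made up for this illustration, and the sample input is an assumption rather than the asker's exact data. JSON.parse is used only to decode the escapes inside each captured string, so it will throw if a capture contains an escape that JSON does not allow.

```javascript
// A quoted string: a quote, any number of "in-the-string" characters, a quote.
const STRING = String.raw`"((?:[^"\\]|\\.)*)"`;

// Two quoted strings separated by a colon, with optional whitespace.
const PAIR = new RegExp(`^\\s*${STRING}\\s*:\\s*${STRING}\\s*$`);

// Hypothetical helper: returns { key, value } or null if the input does not match.
function matchPair(input) {
  const m = PAIR.exec(input);
  if (!m) return null;
  return {
    key: JSON.parse(`"${m[1]}"`),   // decode escapes such as \" and \\
    value: JSON.parse(`"${m[2]}"`),
  };
}

// Assumed sample input with escaped quotes inside both strings.
const pair = matchPair(String.raw`"abcd-1234\":" : "value\":1234\":"`);
console.log(pair.key);
console.log(pair.value);
```

If the input really is JSON, wrapping it in braces and calling JSON.parse once is likely simpler than maintaining the regex.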

String.replace() in case of different encodings

When I use JSON.stringify().replace(/[\t\r\n]/g,"").trim() on response messages (lambda function callbacks) from different systems, I face an issue where \t is replaced with \\t and \ with \\\
Is there a way to avoid this?
I tried to search for answers but only found articles for base cases.
JSON.stringify's specific purpose is to convert what you give it to JSON. If what you give it is a string with backslashes in it, then what you'll get back is the JSON representation of that string, which is the string encased in double quotes (") with any special characters, such as backslashes, escaped with a backslash, newlines converted to \n, carriage returns converted to \r, etc.
Example:
const str = document.querySelector("input").value;
console.log("The string:", str);
console.log("JSON.stringify's output:", JSON.stringify(str));
<input type="text" value="This string has a backslash in it: \ For instance, here's a backslash followed by a t: \t">
That's what JSON.stringify does. If you don't want that, don't use JSON.stringify.
...in case of different encodings
That part is irrelevant. By the time you're dealing with a JavaScript string, it doesn't matter what encoding was used to represent that string (in an HTML file, a .js file, etc.). Once it's in memory, it's in the one format for JavaScript strings defined by the language (which is essentially UTF-16, except that lone, unpaired surrogates are allowed).
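A small sketch of the difference, using a made-up sample string. It shows that once JSON.stringify has run, the tab, CR and LF are no longer control characters but backslash sequences, so the regex from the question finds nothing to remove. Running the replace on the string itself, before any stringify call, strips them as intended.

```javascript
// Assumed sample: contains a real tab, a real CR and a real LF.
const raw = "a\tb\r\nc";

// JSON text for that string: quotes added, control characters escaped.
const viaJson = JSON.stringify(raw);

// The character class matches real control characters only, so nothing changes here.
const afterStringify = viaJson.replace(/[\t\r\n]/g, "").trim();

// Cleaning the original string works because the control characters are still present.
const cleaned = raw.replace(/[\t\r\n]/g, "").trim();

console.log(viaJson);
console.log(afterStringify === viaJson);
console.log(cleaned);
```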

Is it wrong to use Pipe symbol (|) in a json key?

I have an object something like:
{
"category|subCategory" : value
}
Is it wrong to use "|" (which I intend to use as a delimiter) in key of an object?
It is valid. Property names may be any string.
Wrongness seems like a moral judgement that is a matter of opinion.
According to the standard, any string can be used as the key.
A string is a sequence of Unicode code points wrapped with quotation marks (U+0022). All code points may
be placed within the quotation marks except for the code points that must be escaped: quotation mark
(U+0022), reverse solidus (U+005C), and the control characters U+0000 to U+001F.
Even {"⛄|⛱|☠": "is valid"} is valid.
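For illustration, a short sketch of reading such a key back. The value 42 is an assumption standing in for the question's unspecified value. A key containing | cannot be used with dot notation, so bracket notation is needed, and split recovers the two parts.

```javascript
// Assumed sample data; 42 stands in for the question's "value".
const data = JSON.parse('{"category|subCategory": 42}');

// Bracket notation is required because | is not valid in an identifier.
const value = data["category|subCategory"];

// Split the key on the delimiter to recover both parts.
const [key] = Object.keys(data);
const [category, subCategory] = key.split("|");

console.log(value, category, subCategory);
```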

Why is this string unparseable?

JSON.parse('["foo", "bar\\"]'); //Uncaught SyntaxError: Unexpected end of JSON input
When I look at the above code everything seems grammatically correct. It's a JSON string that I assumed would be able to be converted back to an array that contains the string "foo", and the string "bar\" since the first backslash escapes the second backslash.
So why is there an unexpected end of input? I'm assuming it has something to do with the backslashes, but I can't figure it out.
It seems like your code should be:
JSON.parse('["foo", "bar\\\\"]');
Your JSON text is indeed ["foo", "bar\\"], but to represent it in a JavaScript string literal you need to escape each \ character again, giving four \ characters.
Regards
You'd need to double escape. With template literals and String.raw you could do:
JSON.parse(String.raw`["foo", "bar\\"]`);
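The two levels of escaping can be checked side by side. This sketch uses a hypothetical tryParse helper that reports a SyntaxError instead of throwing, and it shows that the four-backslash literal and the String.raw version produce the same JSON text.

```javascript
// Two backslashes in source become ONE in the string, which then escapes the quote in JSON.
const broken = '["foo", "bar\\"]';

// Four backslashes in source become TWO in the string: a valid JSON escape for one backslash.
const fixed = '["foo", "bar\\\\"]';

// String.raw leaves the source backslashes untouched, so two are enough.
const rawVersion = String.raw`["foo", "bar\\"]`;

// Hypothetical helper: parse, or report that a SyntaxError occurred.
function tryParse(text) {
  try {
    return JSON.parse(text);
  } catch (err) {
    return err instanceof SyntaxError ? "SyntaxError" : "other error";
  }
}

console.log(fixed === rawVersion);
console.log(tryParse(broken));
console.log(tryParse(fixed)[1]);
```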

Javascript CR+LF will break string?

When storing '\n\r' inside a string constant, the JavaScript engine throws an error like "unterminated string".
How to solve this?
More info: basically I want to use Javascript to select text into a TEXTAREA HTML field and insert newlines. When trying to stuff those constants, I get an error.
String literals must not contain plain line break characters like CR and LF:
A 'LineTerminator' character cannot appear in a string literal, even if preceded by a backslash \. The correct way to cause a line terminator character to be part of the string value of a string literal is to use an escape sequence such as \n or \u000A.
So having a line break like this is invalid:
"foo
bar"
Instead you need to use an escape sequence like:
"foo\nbar"
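A brief sketch of the options, assuming an engine with ES2015 template literals (which, unlike quoted string literals, may span lines). The insertAt helper is hypothetical and stands in for inserting a newline at a cursor position in a textarea's value; the DOM part is left as a comment so the snippet runs anywhere.

```javascript
// Escape sequence inside an ordinary string literal.
const escaped = "foo\nbar";

// Template literal: the line break in the source becomes an LF in the value.
const multiline = `foo
bar`;

// Hypothetical helper: insert text at a given index of a string.
function insertAt(text, index, insertion) {
  return text.slice(0, index) + insertion + text.slice(index);
}

const withBreak = insertAt("foobar", 3, "\n");

// In a browser, the same helper could update a textarea (names assumed):
// textarea.value = insertAt(textarea.value, textarea.selectionStart, "\n");

console.log(escaped === multiline);
console.log(withBreak === escaped);
```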
