Unicode -- What's going on here? - javascript

This code:
console.log('😀');
console.log('\uD83D\uDE00');
From HTML script tag:
ðŸ˜€
😀
Pasted into the browser console and run (same browser):
😀
😀
What's going on here that causes the first console.log('😀'); to fail when it's included with a script tag but work fine when run in the browser console? The obvious problem seems to be that it isn't being converted to a surrogate pair, since the second line works as expected.

Your HTML file is not saved in the same encoding that the HTTP headers or HTML meta tags advertise. The file is interpreted in the wrong encoding, resulting in the wrong characters. That doesn't matter for the Unicode escape sequence, which is pure ASCII, but it does matter for the non-ASCII literal.
Concrete guess: the file is saved as UTF-8 but advertised as ISO-8859-1.
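To illustrate that guess, here is a minimal sketch that assumes a runtime with the standard TextEncoder/TextDecoder APIs and support for the windows-1252 label (which is how browsers treat a page declared as ISO-8859-1). The variable names are made up for the example. It encodes the emoji as UTF-8 and then deliberately decodes those bytes with the wrong single-byte encoding:

```javascript
// '😀' is U+1F600; in UTF-8 it is the four bytes F0 9F 98 80.
const smiley = String.fromCodePoint(0x1F600);
const utf8Bytes = new TextEncoder().encode(smiley);

// Decode the same bytes with the wrong (single-byte) encoding.
// Assumption: the runtime supports the 'windows-1252' label.
const misread = new TextDecoder('windows-1252').decode(utf8Bytes);

// The escape sequence is pure ASCII in the file, so it is unaffected
// by how the file's bytes are decoded.
const fromEscape = '\uD83D\uDE00';

console.log(misread);               // four unrelated characters instead of one emoji
console.log(fromEscape === smiley); // true
```

Each of the four bytes becomes its own character, which is why the literal breaks while the escaped version does not.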

Related

Retain subscript string in javascript code

I have JavaScript code containing subscript strings in an array, e.g. ["C₋₂", "D₋₂", etc.]. In my local environment, the script works fine. However, on the live web server, the strings become ["C?2", "D?2"], which prevents my whole script from executing properly. The subscript characters are not retained. Is there any way I could 'escape' these characters?
Finally, I found the answer. Thanks Jacob for giving me the hint.
As @Domorodec suggests, Notepad is the answer: copy and paste your code into Notepad and save it with UTF-8 encoding.
Here's the link for the answer.
Please note that the characters may not look the way you expect (weird characters will appear in the JavaScript file); however, the code works fine and displays the expected HTML text normally.
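As for the original question of whether the characters can be escaped: one option is to write them as \u escapes so the source file stays pure ASCII and no longer depends on how it is saved or served. The sketch below is not from the thread. It assumes the subscripts are U+208B (subscript minus) and U+2082 (subscript two), and toAsciiEscapes is a hypothetical helper name:

```javascript
// Hypothetical helper: replace every non-ASCII UTF-16 code unit with a
// \uXXXX escape that can be pasted into a string literal.
function toAsciiEscapes(text) {
  return text.replace(/[^\x00-\x7F]/g, (ch) =>
    '\\u' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0'));
}

// Assumption: the labels use U+208B and U+2082.
const label = 'C\u208B\u2082';
const escaped = toAsciiEscapes(label);

console.log(escaped);      // C\u208B\u2082
console.log(label.length); // 3
```

The escaped form can be used in the array in place of the literal characters.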

Character in the url is changed when using window.location.search

I have a URL like: file:///C:/Users/index.html?Scale:%20Service-Qualität
When I use window.location.search to get the parameter from the URL, the parameter should be Scale: Service-Qualität, but what I actually received was Scale:%20Service-Qualit%C3%A4t. I don't know why my character ä changed to %C3%A4, and when I tested in the console it displayed as Scale: Service-Qualität
Can anyone help me to fix this problem?
I found the solution to my problem. I need to decode the URL again using decodeURIComponent(url); then I get the exact URL string back.
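A minimal sketch of that fix, using a hard-coded string in place of window.location.search so it also runs outside a browser (the variable names are illustrative):

```javascript
// Stand-in for window.location.search on the URL from the question.
const search = '?Scale:%20Service-Qualit%C3%A4t';

// Drop the leading '?' and undo the percent-encoding.
const parameter = decodeURIComponent(search.slice(1));

console.log(parameter); // Scale: Service-Qualität
```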
You are seeing two issues here.
The ä being converted into %C3%A4 is called URL or percent encoding.
It's because URLs can't, technically, contain non-ASCII characters.
Browsers and servers work around this by converting non-ASCII characters in URLs to their percent encoded equivalents.
It's generally nothing to worry about.
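As a small illustration of where %C3%A4 comes from, the sketch below (assuming the standard TextEncoder API is available; names are illustrative) shows that the percent-encoded form is just the UTF-8 bytes of ä written in hex:

```javascript
// 'ä' is U+00E4; written as an escape so the example does not depend
// on how this file itself is encoded.
const aUmlaut = '\u00E4';

// What encodeURIComponent produces.
const percentEncoded = encodeURIComponent(aUmlaut);

// The same thing built by hand from the UTF-8 bytes.
const fromBytes = Array.from(
  new TextEncoder().encode(aUmlaut),
  (byte) => '%' + byte.toString(16).toUpperCase().padStart(2, '0')
).join('');

console.log(percentEncoded);               // %C3%A4
console.log(percentEncoded === fromBytes); // true
```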
In your case however, there seems to be an actual problem as well. The weird output in the console could be because your web page uses a single-byte encoding (like ISO-8859-1) instead of UTF-8.
Switching the web page to UTF-8 might solve the problem, using this Meta tag:
<meta charset="utf-8"/>
and, of course, saving the HTML file as UTF-8 in your editor.

Squared Question Mark Sign on CSV file read from JS

I'm reading a CSV file in my JS, but characters with accent (á, ó...) are being replaced with a black square question mark (�).
I always have this sort of problem in PHP, but I'm using JS and I don't know how to fix it.
The problem is in the UTF-8 encoding of the file or of the HTML. Is there a way to fix this in code?
Thanks
This character is U+FFFD, REPLACEMENT CHARACTER, commonly used to replace invalid data in streams thought to be some Unicode encoding.
For example, if you had the text "Résumé" encoded as ISO 8859-1 and wanted to convert it to UTF-16, but told the conversion routine that the text was UTF-8, then the library would probably produce the UTF-16 text "R�sum�" (the other alternative would be to throw an error and not give any results).
Another way these may appear is if a web page declares that it is UTF-8 but it is not actually UTF-8. The browser is likely to do the re-encoding described above and the replacement characters will show up in the rendered web-page, but viewing the source with an editor that ignores or disregards the HTML encoding info will show the characters correctly.
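The "Résumé" case can be reproduced with a short sketch, assuming a runtime that provides the standard TextDecoder API (variable names are illustrative). It shows both behaviours mentioned above: substituting U+FFFD, or throwing when the decoder is created with fatal: true.

```javascript
// "Résumé" as ISO 8859-1 bytes: each é is the single byte 0xE9.
const latin1Bytes = Uint8Array.from([0x52, 0xE9, 0x73, 0x75, 0x6D, 0xE9]);

// Wrongly telling the decoder the bytes are UTF-8: each lone 0xE9 is
// invalid UTF-8 and is replaced with U+FFFD.
const wrong = new TextDecoder('utf-8').decode(latin1Bytes);
const replacedCount = [...wrong].filter((ch) => ch === '\uFFFD').length;

// The alternative behaviour: refuse to decode invalid input.
let threw = false;
try {
  new TextDecoder('utf-8', { fatal: true }).decode(latin1Bytes);
} catch (err) {
  threw = err instanceof TypeError;
}

console.log(wrong);         // R�sum�
console.log(replacedCount); // 2
console.log(threw);         // true
```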
From your comments it looks like your process is something like:
Excel -> export to csv -> process csv in js -> produce html
Windows software typically uses the platform's 'encoding for non-Unicode programs' for encoding eight-bit text, not UTF-8. So the CSV file is probably Windows CP1252 (if you're using a version of Windows set up for most of the western world), and if your JavaScript program is reading that data and copying it directly into HTML source that's supposed to be UTF-8, that would cause a problem that fits your description.
What you need to do is convert from whatever encoding the CSV is using to UTF-8. JavaScript doesn't really have the facilities to do this, so your best bet is probably to convert the file after exporting it from Excel but before accessing it in JS.
Other alternatives are to change the encoding the HTML page is using to whatever the csv uses, or to not specify an encoding and leave it up to the browser to guess.
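In runtimes that provide the standard TextDecoder API with support for the windows-1252 label, the conversion step can be sketched in JS itself. This is only a sketch under that assumption, and it also assumes the answer's guess that the CSV is CP1252 is correct. cp1252ToUtf8 is a hypothetical helper name, and the sample bytes stand in for the raw file contents:

```javascript
// Hypothetical helper: reinterpret CP1252 bytes as text, then
// re-encode that text as UTF-8.
function cp1252ToUtf8(bytes) {
  // Assumption: the runtime supports the 'windows-1252' label.
  const text = new TextDecoder('windows-1252').decode(bytes);
  return new TextEncoder().encode(text);
}

// Stand-in for raw CSV bytes: "Résumé" in CP1252.
const csvBytes = Uint8Array.from([0x52, 0xE9, 0x73, 0x75, 0x6D, 0xE9]);
const utf8Csv = cp1252ToUtf8(csvBytes);

// Reading the converted bytes as UTF-8 now gives the intended text.
const roundTrip = new TextDecoder('utf-8').decode(utf8Csv);

console.log(utf8Csv.length); // 8 (each é now takes two bytes)
console.log(roundTrip);      // Résumé
```

If the file is already available only as a decoded string, the bytes are gone and this approach no longer applies; it has to run on the raw bytes.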

Writing Javascript using UTF-16 character encoding

Here is what I am trying but am not sure how to get this working or if it is even possible -
I have an HTML page MyHTMLPage.htm and I want to src a Javascript from this HTML file. This is pretty straightforward. I plan to include a <script src = "MyJavascript.js"></script> tag in my HTML file and that should take care of it.
However, I want to create my Javascript file using UTF-16 encoding. So, I plan to use the following tag <script charset="UTF-16" src="MyJavascript.js"></script> in my HTML file to take care of that
Now the problem I am really stuck at is how do I create the Javascript using UTF-16 encoding - E.g. let's say my Javascript code is alert(1); I created my Javascript file with the contents as \u0061\u006c\u0065\u0072\u0074\u0028\u0031\u0029\u003b but that does not seem to execute as valid Javascript at runtime.
To summarize, here is what I have -
MyHTMLPage.html
...
...
...
<script charset="UTF-16" src="MyJavascript.js"></script>
...
...
...
MyJavascript.js
\u0061\u006c\u0065\u0072\u0074\u0028\u0031\u0029\u003b
When I open the HTML page in Firefox, I get the error "Syntax error - Illegal character" right at the beginning of the MyJavascript.js file. I have also tried adding the BOM character "\ufeff" at the beginning of the above Javascript, but I still get the same error.
I know I could create my Javascript file as "alert(1);" and then save it with UTF-16 encoding in the text editor, and the browser runs it fine. However, is there a way I could use "\u" notation (or an alternate escape character) and still get the Javascript to execute?
Thanks,
You are misunderstanding character encoding. Character encoding is a scheme of how characters are represented as bits behind the scenes.
You would not write \u004a in your file to "make it utf-16" as that is literally a sequence of 6 characters:
\, u, 0, 0, 4, a
And if you saved the above as utf-16, it would be represented as the following bytes (shown in hex):
005C0075
00300030
00340061
Had you saved it as utf-8 it would be:
5C753030
3461
Which takes 50% of the space and bandwidth. It takes even less to write that character literally ("J"): just a byte (4A) in utf-8.
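The byte counts above can be checked with a small sketch. It assumes the standard TextEncoder API is available; toHex and utf16beBytes are hypothetical helpers written for this example.

```javascript
// The six-character text \u004a (note the escaped backslash).
const escapeText = '\\u004a';

// Hypothetical helper: render bytes as upper-case hex pairs.
function toHex(bytes) {
  return Array.from(bytes, (b) => b.toString(16).toUpperCase().padStart(2, '0')).join(' ');
}

// Hypothetical helper: big-endian UTF-16 bytes of a string, two bytes
// per UTF-16 code unit.
function utf16beBytes(text) {
  const bytes = [];
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    bytes.push(unit >> 8, unit & 0xFF);
  }
  return bytes;
}

const utf16Hex = toHex(utf16beBytes(escapeText));
const utf8Hex = toHex(new TextEncoder().encode(escapeText));
const literalHex = toHex(new TextEncoder().encode('J'));

console.log(utf16Hex);   // 00 5C 00 75 00 30 00 30 00 34 00 61
console.log(utf8Hex);    // 5C 75 30 30 34 61
console.log(literalHex); // 4A
```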
The "\u"-notation is a way to reference any BMP character by just using a small set of ascii characters. If you were working with a text editor with no unicode support, you could write "\u2665", instead of literally writing "♥" and the browser would show it properly.
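A short sketch of where the notation works and where it does not. It uses new Function only as a convenient way to parse source text at runtime; the variable names are illustrative. Escapes are accepted inside string literals and identifier names, but not as a replacement for punctuation such as ( or ;, which is one reason a file consisting entirely of escapes does not parse as alert(1);.

```javascript
// Inside a string literal, the escape stands for the character itself.
const heart = '\u2665';
console.log(heart === String.fromCodePoint(0x2665)); // true

// Inside an identifier name, an escape for a letter is also accepted:
// \u0061 is 'a', so this declares and returns the variable a.
const viaIdentifier = new Function('var \\u0061 = 1; return a;')();
console.log(viaIdentifier); // 1

// The file contents from the question, as source text. \u0028 is '(',
// which is not allowed as part of an identifier, so parsing fails.
const allEscapes = '\\u0061\\u006c\\u0065\\u0072\\u0074\\u0028\\u0031\\u0029\\u003b';
let parseError = null;
try {
  new Function(allEscapes);
} catch (err) {
  parseError = err;
}
console.log(parseError instanceof SyntaxError); // true
```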
If you for some weird reason still want to use utf-16, simply write the code normally, save the file as utf-16 and serve it with the proper charset header.

\u00C2 is not defined error in javascript?

I got this error on following line.
$j(id).dateplustimepicker( "setTime" ,timeVal);
Can you please help me to solve this error?
The error is probably not in this line because no string constants are evaluated there. You wouldn't get this error if, for example, id contained the value.
When you get the error again, open the JavaScript console of your browser and look at the complete stack trace. The innermost frame is where you need to be looking.
[EDIT] Since you found the character in jquery-dateplustimepicker.js, this points to the real cause of the problem.
Every text file on your computer has an encoding, but nothing in the file itself says which one. The problem you have means: your text/JS file is in UTF-8 encoding but your web server sends it to the browser with a different encoding. The browser then tries to read it but finds odd characters -> error.
Another reason for the error is that someone edited the file using the wrong encoding. That can happen on Windows, for example, when you load the file as Cp-125x and save it as UTF-8.
To check, download the file from the web server and do a binary compare with the original.
I got the answer but forgot to post it here. I had this problem because a Â character is present in a library file, jquery-dateplustimepicker.js. The character either needs to be encoded or it causes problems. Replacing the Â with a white space fixed it.
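To locate stray characters like this one, the source text of a script can be scanned for anything outside ASCII. A minimal sketch follows; findNonAscii and the sample string are made up for illustration and are not part of the library in question:

```javascript
// Hypothetical helper: report every non-ASCII UTF-16 code unit in a
// piece of source text, with its position and \uXXXX form.
function findNonAscii(source) {
  const hits = [];
  for (let i = 0; i < source.length; i++) {
    const code = source.charCodeAt(i);
    if (code > 0x7F) {
      hits.push({
        index: i,
        escape: '\\u' + code.toString(16).toUpperCase().padStart(4, '0'),
      });
    }
  }
  return hits;
}

// Illustrative input with a stray U+00C2 between two statements.
const sample = 'var a = 1;\u00C2 var b = 2;';
const found = findNonAscii(sample);

console.log(found); // [ { index: 10, escape: '\\u00C2' } ]
```

Each hit can then be replaced with a space or with the intended character.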
