String.replace strange behaviour in IE11 - javascript

I have encountered a strange issue with IE11. Consider the following (part of the riot.js framework):
var s = "{JSON.stringify(\\{ amount: Math.floor(reward.amount) \\})}";
var s1 = s.replace(/\\{/g, '\uFFF0');
When running this code on localhost, it runs fine. But when running from our staging environment, the \{ fragment is replaced not by \uFFF0 (codepoint 65520) but by \uFFFD (codepoint 65533). That means it fails later when trying to replace the special character back to {.
The replace method is the browser's native one. The file that contains both the HTML (the string is a DOM attribute) and the JavaScript is returned by the server with a charset=utf-8 header and is encoded as such. In the staging environment, it is bundled with other files (no compression or mangling, though) and is still encoded in UTF-8.
I have no idea why it does that, or why it doesn't happen consistently.

\uFFFD, or as visualized: �, is the browser's way of showing you that a character in the string is invalid.
The string itself will still contain \uFFF0, but since that character is not defined, the browser renders � instead.
For me, in the console,
Google Chrome shows: ￰ (White box, black border, with a question mark).
Safari shows: ￰ (White box, black border, with a question mark).
Internet explorer shows: ￰ (White box, black border).
Edge shows: ￰ (White box, black border).
Firefox shows: nothing.
They're all the same string. Just a different visual representation, depending on the browser.
var s = "{JSON.stringify(\\{ amount: Math.floor(reward.amount) \\})}",
s1 = s.replace(/\\{/g, '\uFFF0'),
charCode = s1.charCodeAt(16);
document.write(charCode + ' ' + String.fromCharCode(charCode));
document.write('|');
document.write('￰'.charCodeAt(0));
document.write('|');
document.write('x'.replace('x', '\uFFF0').charCodeAt(0));
(For me, in Chrome, this snippet shows me: 65520 ￰|65520|65520)

I tracked the issue down to UglifyJs. By default, it replaces escaped unicode chars with the actual character. So:
var s1 = s.replace(/\\{/g, '\uFFF0');
Becomes this in the bundled file:
var s1 = s.replace(/\\{/g, '￰');
This is not visible when debugging with a source map, and it does not behave well in IE11.
The solution is to add the ASCIIOnly: true option to Uglify.
NB: the Uglify docs refer to this option as either ascii-only or ascii_only, but the only variant actually taken into account is ASCIIOnly.
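To illustrate what an ASCII-only output mode does, here is a minimal sketch of the transformation: every non-ASCII UTF-16 code unit is rewritten as a \uXXXX escape, so the bundled file no longer depends on how the browser decodes it. The function name escapeNonAscii is made up for this example; it is not part of UglifyJS.

```javascript
// Hypothetical helper (not part of UglifyJS): rewrites every non-ASCII
// UTF-16 code unit as a \uXXXX escape sequence.
function escapeNonAscii(text) {
  return text.replace(/[^\x00-\x7F]/g, function (ch) {
    var hex = ch.charCodeAt(0).toString(16).toUpperCase();
    return '\\u' + ('0000' + hex).slice(-4); // pad to four hex digits
  });
}

// Simulate the bundled line, which contains the literal character U+FFF0.
var bundled = "s.replace(/\\\\{/g, '" + String.fromCharCode(0xFFF0) + "');";

// prints: s.replace(/\\{/g, '\uFFF0');
console.log(escapeNonAscii(bundled));
```

After this rewrite the file contains only ASCII bytes, so a wrong charset guess can no longer change the character.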

Related

pre html - M-BM- bash scripts syntax highlighting on the web - remove ASCII representation of byte sequence 0xc2 0xa0

If you copy the bash script (contained in a <pre> element and highlighted using the SyntaxHighlighter plugin http://alexgorbatchev.com/SyntaxHighlighter/) from https://hoodlogic.tk/pre_bash_nbsp.html and save it in a file called test.sh, when you try to run the test.sh bash script (in Linux tested on Ubuntu 16.04), you get the following syntax errors:
bash -n test.sh
test.sh: line 2: syntax error near unexpected token `then'
test.sh: line 2: `    if [ ! -z "$1" ]; then'
If you use the cat -A command on the script, you'll see lots of this in the output:
cat -A test.sh
M-BM-M-BM-M-BM-M-BM-M-BM-M-BM-M-BM-M-BM-M-BM-M-BM-
You can remove these using sed (referenced here: https://askubuntu.com/questions/357248/how-to-remove-special-m-bm-character-with-sed), but my question is how do I copy the script client side from the browser and have it so that the characters are converted to the Linux space character?
Anyone know? Copying the text out of the <code> blocks using this jQuery snippet doesn't remove the M-BM characters either when pasted into a <textarea> in the browser.
var textToCopy = "";
var i = 0;
$(".code .container .line", elem.parent()).each(function(e){
if(i == 0){
textToCopy += $(this).text();
}else{
textToCopy += "\n" + $(this).text();
}
i++;
});
$("textarea").val(textToCopy);
This appears to have nothing to do with line endings. If you download the raw bash file manually, the script works as it should, as there are no non-breaking spaces in the source, but if you load it into the <pre> for SyntaxHighlighter using PHP's echo file_get_contents("file"); statement, non-breaking spaces (&nbsp;) are added to your script.
Anyone know the solution?
Looks like it is a combination of things (including different browser behavior):
The syntax highlighter script should not be using &nbsp; entities in the code sections since it already uses the CSS style white-space: pre;. Thus, it should look like this:
<style>
.spaces { white-space: pre; }
</style>
<code class="bash spaces"> </code>
But the syntax highlighter script produces this:
<code class="bash spaces">&nbsp;</code>
That causes Chrome to copy the text with non-breaking spaces (U+00A0). Firefox won't, though, because of this: https://bugzilla.mozilla.org/show_bug.cgi?id=359303
So, because the syntax highlighter script does it wrong, you have to convert the non-breaking spaces to normal spaces before the text gets copied:
textToCopy = textToCopy.replace(new RegExp(String.fromCharCode(160), "g"), " ");
Unfortunately, it's complicated, and I wish that Chrome behaved like Firefox in this case and automatically converted non-breaking spaces to actual spaces on Linux and Windows. But since it doesn't, I have to handle it myself.
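As a sketch of the same idea without jQuery: collect the text of each line, join the lines with newlines, and replace every U+00A0 with an ordinary space before the text goes into the textarea. The function name joinLinesWithPlainSpaces and its array-of-strings input are assumptions made for this example.

```javascript
// Hypothetical helper: takes the text of each highlighted line and returns
// one string in which every non-breaking space (U+00A0) is a normal space.
function joinLinesWithPlainSpaces(lineTexts) {
  return lineTexts.join('\n').replace(/\u00A0/g, ' ');
}

// Example input: the second line is indented with four non-breaking spaces.
var lineTexts = ['check() {', '\u00A0\u00A0\u00A0\u00A0if [ ! -z "$1" ]; then'];
var cleaned = joinLinesWithPlainSpaces(lineTexts);

console.log(cleaned.indexOf('\u00A0')); // -1: no non-breaking spaces remain
```

Note that replace returns a new string, so the result has to be assigned; calling it without using the return value leaves the original text unchanged.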

Why are browsers inserting a `\9` into this style attribute

var div = document.createElement('div');
div.style.cssText = "background-image: url('http://imgs.com/mouse.png ')";
div.style.cssText
//=> "background-image: url("http://imgs.com/mouse.png\9 ");"
The above code, where the character after the .png suffix is a tab, outputs the line shown, with \9 appended to the URL, when run in either the Chrome or the Firefox console. Why is that?
I've read that \9 is some sort of a hack for targeting only specific versions of IE, but why would Chrome and Firefox be automatically inserting that?
PS: For additional browser fun, can you guess what happens if that URL is relative and you want to resolve it with the classic <a> href trick? Here's what would happen on the fictional http://imgs.com domain:
var a = document.createElement('a');
// Actually parsed from the above response, not assigned manually
a.href = "/mouse.png\\9 ";
a.href
//=> "http://imgs.com/mouse.png/9" (in Chrome)
//=> "http://imgs.com/mouse.png%5C9" (in Firefox)
So while the URL with the \9 suffix would still actually work, the resolved URLs will now point to incorrect locations.
When the console has to print out source code, it tries to do so in a way that's unambiguous. A tab character is not obviously a tab, and there are many other special characters with that same problem. Therefore, non-printable or non-obvious characters are rendered using an appropriate escape sequence, where "appropriate" means "appropriate to the language context".
CSS escapes look like \nnnnnn where the ns are hex digits. There can be from one to six such digits, and a single whitespace character after the digits ends the escape, which is why the output shows a space after \9. The escape \9 is the escape for the tab character (code point 9).
Note that in actual CSS source text you can use a literal tab character and the notation \9 interchangeably and get precisely the same results, because the CSS parser reads \9 as a single tab character.
The described behavior of the console is not in any standard (because the developer tools themselves are not standardized), but it's the sort of thing that any designer of such tools is very likely to do.
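The escaping can be sketched in a few lines. This is a simplified model of how a string could be serialized for CSS (loosely following the CSSOM "serialize a string" rules); the function name serializeCssString is invented here and this is not any browser's actual code.

```javascript
// Sketch only: wraps a value in double quotes and escapes the characters
// that would be ambiguous or invalid inside a CSS string.
function serializeCssString(value) {
  var out = '"';
  for (var i = 0; i < value.length; i++) {
    var code = value.charCodeAt(i);
    if (code === 0) {
      out += '\uFFFD';                       // NULL becomes U+FFFD
    } else if (code <= 0x1F || code === 0x7F) {
      out += '\\' + code.toString(16) + ' '; // control char: hex escape + space
    } else if (value[i] === '"' || value[i] === '\\') {
      out += '\\' + value[i];                // escape quote and backslash
    } else {
      out += value[i];
    }
  }
  return out + '"';
}

// The URL below ends with a tab character.
// prints: "http://imgs.com/mouse.png\9 "
console.log(serializeCssString('http://imgs.com/mouse.png\t'));
```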

New line characters Desktop vs Server - Updated

I am trying to catch newline characters with JavaScript. But here is the catch.
For example, with a string as the argument to my function:
s = "some_text_with_a_\n_new_line";
document.write(s);
It gets detected by my script.
Now, if I get my text from a textarea and press Enter to produce a line break, it is detected and interpreted as \n. The textarea and the working script are on my desktop, which runs Windows 10.
When I upload it to the server (IIS 7 on Windows, hosted by GoDaddy), it no longer works, so I have tried some variants such as \r\n and \r; none of them work. Actually, it doesn't matter much to me which one it is, because all I'm looking for is the escaping.
Here is a piece of script I use to find them:
nex = string.indexOf(tokout[i]); // where tokout = '\n'
spacer = 1; //spacer to escape found token
spc = tokin.length; ///last token
if(tokin[spc-1] === '\n'){ //// spacer to set cursor after token
spacer = 2;
}
BBCode = input.substr(curseur,nex+spacer); // we have bbcode
The script is working on my desktop but on the server I can't get the line break!
How it works:
I get the position of the \n.
Set the position of my cursor right after it.
and what is in BBcode is the line break.
For some reason it only works on my PC, not on the server. So I'm thinking it has to be the way line breaks are interpreted by the server!
They are not interpreted as spaces. They are displayed as spaces. To display the text as is, you could surround it with a pre tag, e.g.
document.write('<pre>' + s + '</pre>');
Also, you should be looking for \n not //n
i.e. s.indexOf('\n')
If you are trying to output to a document, you could replace the \n with <br/>
e.g. s.replace(/\n/g, '<br/>')
If you want to see a line break in HTML, you have to use the <br /> HTML tag:
document.write("some_text_with_a_<br />_new_line");
However in javascript \n is a valid line break character. For example when you use console.log or alert you will see 2 lines:
console.log("some_text_with_a_\n_new_line");
alert("some_text_with_a_\n_new_line");
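Since text that has passed through a server or a Windows-edited file may contain \r\n or \r instead of a bare \n, a replacement that accepts all three is safer. A minimal sketch, with the function name newlinesToBr chosen only for this example:

```javascript
// Hypothetical helper: converts any line-break style (\r\n, \r or \n)
// into a <br/> tag. \r\n is listed first so it yields one tag, not two.
function newlinesToBr(text) {
  return text.replace(/\r\n|\r|\n/g, '<br/>');
}

console.log(newlinesToBr('unix\nwindows\r\nold mac\rend'));
// prints: unix<br/>windows<br/>old mac<br/>end
```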

Reading From text file Javascript

Okay, this may be hard to explain.
The passwords don't work, but the usernames do.
I am reading from a text file.
The structure is "username,password". Below is the text file:
John,BOL12345
Mary2,BOL77777
Anna,BOL54321
test,BOL12345
I only need the top three entries, but on their own they do not work.
Once I add the "test,BOL12345" line,
the password BOL12345 does work,
but without the "test,BOL12345" line, neither the password "BOL12345" nor any of the others works.
I am doing all of this in JavaScript; the code snippet is below. Please ask any questions, as I do not understand why this happens.
The JavaScript is below.
Here, "lines" is the content of the text file above.
lines = x.responseText.split("\n");
for (i=0; i < lines.length; i++)
{
test1 = lines[i].split(",")
username.push(test1[0]);
password.push(test1[1]);
}
var tempUsername = document.getElementById('username').value;
var tempPassword = document.getElementById('password').value;
var arraycontainsusername = (username.indexOf(tempUsername) > -1);
var arraycontainspassword = (password.indexOf(tempPassword) > -1);
alert(password);
if (arraycontainsusername && arraycontainspassword) {
window.location.href = "listing.htm";
};
Educated guess: your file is using \r\n. Since you're splitting by \n, the \r is left in and corrupts each string. Try splitting by \r\n and see what happens. That would explain why adding the last line works: since there's no newline at the end, there is no trailing character to mess up the indexOf search.
Different operating systems handle text files differently. Windows uses CRLF (Carriage Return, Line Feed) to jump to the next line, while *NIX variants use LF. Old Mac OS versions use CR. Your code was assuming the file came from a *NIX environment, where LF (or \n) is the norm, when it came from a Windows environment, where CRLF (or \r\n) is the norm (this is not entirely accurate, since you can make text files with LF on Windows and with CRLF on *NIX, but you get the picture).
To handle all cases correctly, I'd recommend normalizing the string before working on it:
x.responseText.replace(/\r\n|\r(?!\n)/g, '\n').split('\n');
That cryptic-looking string in the middle is actually a regular expression that matches either \r\n or \r (but only when \r isn't followed by \n). This way you can replace all your CRLFs and CRs with LF and handle text coming from any environment.
You can simplify that regex, because of the order of the alternatives, to /\r\n|\r/, but I'm leaving it in because it illustrates a neat concept (lookaheads: the (?!\n) part says "if and only if not immediately followed by a \n"). With that said, /\r\n|\r/ will perform better, especially when handling large files.
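Putting the normalization together with the parsing, a minimal sketch could look like this. The function name parseCredentials and the object layout are assumptions for the example, not the asker's code:

```javascript
// Hypothetical helper: normalizes line endings, then splits each line
// into a username/password pair.
function parseCredentials(text) {
  var users = [];
  var lines = text.replace(/\r\n|\r/g, '\n').split('\n');
  for (var i = 0; i < lines.length; i++) {
    if (lines[i] === '') continue; // skip blank lines such as a trailing newline
    var parts = lines[i].split(',');
    users.push({ username: parts[0], password: parts[1] });
  }
  return users;
}

// A Windows-style file: every line ends with \r\n.
var users = parseCredentials('John,BOL12345\r\nMary2,BOL77777\r\nAnna,BOL54321\r\n');
console.log(users[0].password === 'BOL12345'); // true: no stray \r left behind
```

Keeping each username next to its password also makes it possible to check them as a pair, rather than searching two independent arrays.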

Javascript string variable unquoted?

I am using the QuickBlox JavaScript API. Looking through their code, I found this line:
var URL_REGEXP = /\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/gi;
It appears that it has declared a string variable that is a regular expression pattern. Then it goes ahead to use that variable thus:
return str.replace(URL_REGEXP, function(match) {
url = (/^[a-z]+:/i).test(match) ? match : 'http://' + match;
url_text = match;
return '' + escapeHTML(url_text) + '';
});
I am wondering how is this possible? The var declared in the first line should be a string, but it is unquoted. Shouldn't this be a syntax error?
I went ahead and tested this code in my browser, and it works! This means I've got some learning to do here... Can anyone explain how this variable is declared?
Additionally, when I tried to run the same code on my friend's computer, the Chrome debugger threw a syntax error on the variable declaration line (unexpected token '/'). I am using Chrome Version 36.0.1985.143 m, and my friend is using the same version. On my computer it all works fine, but on my friend's computer the code stops at the first variable declaration because of the "syntax error".
Is there some setting that is different?
Any help would be appreciated.
UPDATE
Thanks for the quick answers. I've come from a PHP background, so I thought that all regular expressions had to be initialized as strings :P.
Can anyone reproduce the syntax error I'm getting on my friend's computer? (It still happens after disabling all extensions.) I can't reproduce it on mine either, and that's what is frustrating me.
UPDATE 2
I have tested on my friend's computer and looked through the source. It appears to be due to some encoding problem (I'm not sure what). The regular expression is shown like this:
var URL_REGEXP = /\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?芦禄鈥溾€濃€樷€橾))/gi;
(The characters at the end of the code seem to be random Chinese characters.)
How can I change the encoding to match his browser/system? (He is running a Simplified Chinese Windows 7 system.)
It is not a String variable. It is a regular expression.
Calling var varname = /pattern/flags;
is effectively the same as calling var varname = new RegExp("pattern", "flags");.
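A short sketch of that equivalence, with an arbitrary pattern chosen for illustration. Note that in the string form every backslash has to be doubled, because the string literal consumes one level of escaping:

```javascript
// The same pattern written as a literal and through the constructor.
var literal = /\d+\.\d+/g;
var constructed = new RegExp('\\d+\\.\\d+', 'g');

console.log(typeof literal);                        // "object", not "string"
console.log(literal instanceof RegExp);             // true
console.log(literal.source === constructed.source); // true
console.log('v1.25'.match(constructed)[0]);         // "1.25"
```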
You can execute the following in any browser that supports a JavaScript console:
>>> var regex = /(?:[\w-]+\.)+[\w-]+/i
>>> regex.exec("google.com")
... ["google.com"]
>>> regex.exec("www.google.com")
... ["www.google.com"]
>>> regex.exec("ftp://ftp.google.com")
... ["ftp.google.com"]
>>> regex.exec("http://www.google.com")
Anyone can reproduce the syntax error I'm getting on my friends computer? (It still happens after disabling all extensions). I can't reproduce it either, and that's what is frustrating me.
According to RegExp - JavaScript documentation:
Regex literals were present in ECMAScript 1st Edition, implemented in JavaScript 1.1. Use an updated browser.
No, it shouldn't be a syntax error. In JavaScript, RegExp objects are not strings; they are a distinct class of objects. /.../modifiers is the syntax for a RegExp literal.
I can't explain the syntax error you got on your friend's computer; it looks fine to me. I pasted it into the JavaScript console and it was fine.
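If the cause is the file being decoded with the wrong character set (an assumption, based on the garbled characters shown in Update 2), one way to sidestep it is to write the non-ASCII characters of the pattern as \u escapes, so that the source file contains only ASCII. A minimal sketch covering just the final character class, with the variable name TRAILING_PUNCT invented here:

```javascript
// The two guillemets and the four curly quotes from the end of the
// original pattern, written as escapes instead of literal characters.
var TRAILING_PUNCT = /[\u00AB\u00BB\u201C\u201D\u2018\u2019]/;

console.log(TRAILING_PUNCT.test('\u201Cquoted\u201D')); // true
console.log(TRAILING_PUNCT.test('plain text'));         // false
```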
