Javascript \x escapes within a mixed string - javascript

I have a string that I got by using Ajax to load a preview of a web page. The title comes out like this:
You Can\xe2\x80\x99t Handle the Truth About Facebook Ads, New Harvard Study Shows
I need to replace those escape codes with human readable text. I have tried String.fromCharCode(), but that doesn't return anything in the case of a mixed string, only if you send it character codes only.
Is there a function I can use to fix this string?

Here's one way to do it:
const str_orig = 'You Can\\xe2\\x80\\x99t Handle the Truth About Facebook Ads, New Harvard Study Shows';
console.log("Before: " + str_orig);
const str_new = str_orig.replace(
/(?:\\x[\da-fA-F]{2})+/g,
m => decodeURIComponent(m.replace(/\\x/g, '%'))
);
console.log("After: " + str_new);
The idea is to replace \x by % in the string (which produces a URL encoded string), then apply decodeURIComponent, which handles UTF-8 decoding for us, turning %e2%80%99 into a single character: ’ (U+2019, RIGHT SINGLE QUOTATION MARK).
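One caveat with this approach: decodeURIComponent throws a URIError when the escaped bytes are not valid UTF-8 (a lone \xe2, for example). The following is a minimal sketch, assuming you would rather leave such runs untouched than fail; decodeHexEscapes is a hypothetical name, not part of the answer above.

```javascript
// Hypothetical helper: decode runs of literal \xHH escapes as UTF-8.
function decodeHexEscapes(input) {
  return input.replace(/(?:\\x[\da-fA-F]{2})+/g, (run) => {
    try {
      return decodeURIComponent(run.replace(/\\x/g, '%'));
    } catch (err) {
      // decodeURIComponent throws URIError on bytes that are not valid UTF-8;
      // in that case the original text is kept as it was.
      return run;
    }
  });
}

console.log(decodeHexEscapes('You Can\\xe2\\x80\\x99t Handle the Truth'));
console.log(decodeHexEscapes('broken \\xe2 run'));
```

The second call returns its input unchanged instead of throwing.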

Melpomene's answer above is correct; I just want to add one more snippet. That solution occasionally left a stray \ or \n in the result, so I modified it like this:
titleSuggest
  .replace(/(?:\\x[\da-fA-F]{2})+/g, m => decodeURIComponent(m.replace(/\\x/g, '%')))
  .replace(/\\n/g, '<br>')
  .replace(/\\/g, '')

Related

Why do I need to replace \n with \n?

I have a line of data like this:
1•#00DDDD•deeppink•1•100•true•25•100•Random\nTopics•1,2,3,0•false
in a text file.
Specifically, for my "problem", I am using Random\nTopics as a piece of text data, and I then search for '\n', and split the message up into two lines based on the placement of '\n'.
It is stored in blockObj.msg, and I search for it using blockObj.msg.split('\n'), but I kept getting an array of 1 (no splits). I thought I was doing something fundamentally wrong and spent over an hour troubleshooting, until on a whim, I tried
blockObj.msg = blockObj.msg.replace(/\\n/g, "\n")
and that seemed to solve the problem. Any ideas as to why this is needed? My solution works, but I am clueless as to why, and would like to understand better so I don't need to spend so long searching for an answer as bizarre as this.
I have a similar error when reading "text" from an input text field. If I type a '\n' in the box, the split will not find it, but using a replace works (the replace seems pointless, but apparently isn't...)
obj.msg = document.getElementById('textTextField').value.replace(/\\n/g, "\n")
Sorry if this is jumbled, long time user of reading for solutions, first time posting a question. Thank you for your time and patience!
P.S. If possible... is there a way to do the opposite? Replace a real "\n" with a fake "\n"? (I would like to have my dynamically generated data file to have a "\n" instead of a new line)
It is stored in blockObj.msg, and I search for it using blockObj.msg.split('\n'),
In a JavaScript string literal, \n is an escape sequence representing a new line, so you are splitting the data on new lines.
The data you have doesn't have new lines in it though. It has backslash characters followed by n characters. They are data, not escape sequences.
Your call to replace (blockObj.msg = blockObj.msg.replace(/\\n/g, "\n")) works around this by replacing each backslash and n pair with a new line.
That's an overcomplicated approach though. You can match the characters you have directly. blockObj.msg.split('\\n')
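As a small sketch of the difference (the variable names are made up for illustration), the following also covers the P.S. in the question about going the other way, from a real new line back to a backslash followed by n:

```javascript
// '\\n' in source code is two characters: a backslash and the letter n.
// '\n' in source code is one character: a line feed.
const fromFile = 'Random\\nTopics';   // what the text file actually contains

console.log(fromFile.split('\n'));    // one element: there is no line feed to split on
console.log(fromFile.split('\\n'));   // two elements: 'Random' and 'Topics'

// The reverse direction: turn real line feeds into backslash + n before saving.
const withNewline = 'Random\nTopics';
const forFile = withNewline.replace(/\n/g, '\\n');
console.log(forFile === fromFile);    // true
```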
in your text file
1•#00DDDD•deeppink•1•100•true•25•100•Random\nTopics•1,2,3,0•false
means that the file contains the two characters \ and n, exactly as they are stored. To get a real new line character by replacement, you therefore have to search for that \ and n character pair.
obj.msg = document.getElementById('textTextField').value.replace(/\\n/g, "\n")
When you call replace(/\\n/g, "\n"), the pattern \\n is the escaped form: the replace must find every literal \n in the string, and to search for a backslash in a regex you have to escape it as \\.
EDIT
/\\n/g is the regex literal and "\n" is the replacement value. The general form is /PATTERN/FLAGS: whatever follows the last / is the regex flags, so the g in /g means a global search.

Escaping quotes in Javascript variable from Classic ASP

How can I escape quotes using a Classic ASP variable in javascript/jQuery? The ASP variable is taken from a DB. I'm using:
var goala = "<%=(goal_a)%>";
But obviously that appears as
var goala = "<p>testing "quotation" marks</p>";
when the page loads, which breaks the function with unexpected identifier.
edit: I meant that I'm using jQuery, not "how can I achieve this using jQuery". Sorry, I wasn't clear.
Any ideas? Thanks
Adapting the answer for another language gives a more robust solution:
Function JavascriptStringEncode(text)
    If IsEmpty(text) Or text = "" Or IsNull(text) Then
        JavascriptStringEncode = text
        Exit Function
    End If
    Dim i, c, encoded, charcode
    ' Adapted from https://stackoverflow.com/q/2920752/1178314
    encoded = ""
    For i = 1 To Len(text)
        c = Mid(text, i, 1)
        Select Case c
            Case "'"
                encoded = encoded & "\'"
            Case """"
                encoded = encoded & "\"""
            Case "\"
                encoded = encoded & "\\"
            Case vbFormFeed
                encoded = encoded & "\f"
            Case vbLf
                encoded = encoded & "\n"
            Case vbCr
                encoded = encoded & "\r"
            Case vbTab
                encoded = encoded & "\t"
            Case "<" ' This avoids breaking a <script> content, in case the string contains "<!--" or "<script" or "</script"
                encoded = encoded & "\x3C"
            Case Else
                charcode = AscW(c)
                If charcode < 32 Or charcode > 127 Then
                    encoded = encoded & GetJavascriptUnicodeEscapedChar(charcode)
                Else
                    encoded = encoded & c
                End If
        End Select
    Next
    JavascriptStringEncode = encoded
End Function

' Taken from https://stackoverflow.com/a/2243164/1178314
Function GetJavascriptUnicodeEscapedChar(charcode)
    charcode = Hex(charcode)
    GetJavascriptUnicodeEscapedChar = "\u" & String(4 - Len(charcode), "0") & charcode
End Function
It is done also with the help of this answer on how to get the javascript unicode escaping, and it has the benefits explained in this other answer.
Note that I have not specially escaped " and ' as suggested in that other answer, because I consider that in a "regular" HTML context (attribute values, for example), an HTMLEncode must be applied in addition anyway, and it will take care of quotes (and of ampersands, ...).
< is still specially handled due to the <script> context case, where HTMLEncode cannot be used (it won't be html decoded from the Javascript code standpoint, when used inside a <script> tag). See here for more on the <script> case.
Of course, a way better solution is to avoid putting any Javascript directly in the HTML, but have it all in separated Javascript files. Data should be given through data- attributes on html tags.
You've asked how to do this "Using jQuery." You can't. By the time jQuery would be involved, the code would already be invalid. You have to fix this server-side.
Classic ASP is unlikely to have anything built-in that will help you solve this in the general case.
Note that you have to handle more than just " characters. To successfully output text to a JavaScript string literal, you'll have to handle at least the quotes you use (" or '), line breaks, any other control characters, etc.
If you're using VBScript as your server-side language, you can use Replace to replace the characters you need to replace:
var goala = "<%=Replace(goal_a, """", "\""")%>";
Again, though, you'll need to build a list of the things you need to handle and work through it; e.g.
var goala = "<%=Replace(Replace(Replace(goal_a, """", "\"""), Chr(13), "\r"), Chr(10), "\n")%>";
...and so on.
If your server-side language is JScript, you can use replace in much the same way:
var goala = "<%=goal_a.replace(/"/g, "\\\"").replace(/\r/g, "\\r").replace(/\n/g, "\\n")%>";
...and so on. Note the use of regular expressions with the g flag so that you replace all occurrences (if you use a string for the first argument, it just replaces the first match).
It has been a while since I dealt with this stuff.
You have to encode your data to use it inside an attribute.
Try this.
<%=server.HTMLEncode(goal_a)%>

Converting double backslash into backslash used for escaping?

I have a javascript string that contains \\n. When this is displayed out to a webpage, it shows literally as \n (as expected). I used text.replace(/\\n/g, '\n') to get it to act as a newline (which is the desired format).
I'm trying to determine the best way to catch all such instances (including similar instances like tabs \\t -> \t).
Is there a way to use regex (can't determine how to copy the matched wildcard letter to use in the replacement string) or anything else?
As mentioned by dandavis in the comments in original post, JSON.parse() ended up working for me.
i.e. text = JSON.parse(text);
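For the regex route the question asks about, a capture group plus a replacement function lets you reuse the matched letter. This is a rough sketch, not the accepted approach: unescapeBackslashes and the ESCAPES table are hypothetical names, and the table only lists a few sequences. It also shows that JSON.parse decodes the same sequences only when the text is a valid JSON string, which means it must be wrapped in double quotes.

```javascript
// Map from the character after the backslash to the character it stands for.
const ESCAPES = { n: '\n', t: '\t', r: '\r', '\\': '\\' };

// Hypothetical helper: turn literal backslash sequences into real characters.
function unescapeBackslashes(text) {
  // The capture group holds the character after the backslash;
  // sequences that are not in the table are kept as written.
  return text.replace(/\\(.)/g, (match, ch) =>
    Object.prototype.hasOwnProperty.call(ESCAPES, ch) ? ESCAPES[ch] : match);
}

console.log(unescapeBackslashes('a\\nb\\tc') === 'a\nb\tc'); // true

// JSON.parse decodes the same sequences, but only for a valid JSON string,
// so raw text has to be wrapped in double quotes first.
console.log(JSON.parse('"a\\nb\\tc"') === 'a\nb\tc'); // true
```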
Second answer, the first was wrong.
JavaScript handles this case in a special way. Read this for more details.
In your case it should be one of these (note that replace returns a new string, so the result has to be assigned):
var JSCodeNewLine = "\u000A";
text = text.replace(/\\n/g, JSCodeNewLine);
var JSCodeCarriageReturnNewLine = "\u000D\u000A";
text = text.replace(/\\n/g, JSCodeCarriageReturnNewLine);

Error due to single quote ' in javascript function call

I am passing a value dynamically to a JavaScript function.
I retrieve the data from a database and fill it into the function call; it is not a static binding.
share_it(data_from_mysql_database);
like
share_it('value from mysql database');
Sometimes the value contains a single quote (').
like:
share_it(' Essentially you'll have to have a good academic history ');
So the function call gives this error:
Uncaught SyntaxError: Unexpected identifier
You can use the \ character to escape such characters:
share_it(' Essentially you\'ll have to have a good past academic ');
Or, you can switch to using double quotes if you know you will need to embed a single quote character:
share_it(" Essentially you'll have to have a good past academic ");
You can freely switch between double " and single ' quotes where you need the other in a literal string:
share_it(" Essentially you'll have to have a good past academic ");
Only in cases where you need both, you need to escape the repeating character:
share_it(" Essentially you'll have to have a good \"past\" academic ");
You can also replace the ' in the string with &#39;.
You ought to convert special characters upstream rather than downstream. Converting upstream saves time later, when an inexperienced developer neglects to escape the data downstream before it is sent to the client. Since the data was not converted upstream, you have no choice but to escape it here.
share_it(escape(data_from_mysql_database));
Example:
> escape("You're awesome");
'You%27re%20awesome'
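Note that with escape, share_it receives the percent-encoded text rather than the original. A different sketch, assuming the page is generated by server-side JavaScript (the question does not say which server language is in use) and the call is written inside a script block, is to emit the value as a string literal with JSON.stringify. buildShareCall is a hypothetical helper name.

```javascript
// Hypothetical server-side helper: build the call as source text.
function buildShareCall(value) {
  // JSON.stringify yields a double-quoted literal with quotes, backslashes
  // and control characters escaped. Escaping "<" as well keeps a "</script>"
  // inside the data from ending the script block early.
  const literal = JSON.stringify(value).replace(/</g, '\\u003c');
  return 'share_it(' + literal + ');';
}

console.log(buildShareCall(" Essentially you'll have to have a good \"past\" academic history "));
```

The generated call passes the original text to share_it unchanged, whether it contains single quotes, double quotes, or both.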

JS/XSS: When assigning user-provided strings to variables; is it enough to replace <,>, and string delimiter?

If a server-side script generates the following output:
<script>
var a = 'text1';
var b = 'text2';
var c = 'text3';
</script>
, and the values (in this example "text1", "text2" and "text3") are user supplied (via HTTP GET/POST), is it enough to remove < and > from the input and to replace
'
with
' + "'" + '
in order to be safe from XSS? (This is my main question)
I'm particularly worried about the backslash not being escaped because an attacker could unescape the trailing '. Could that be a potential problem in this context? If the variable assignments were not separated by line breaks, an attacker could supply the values
text1
text2\
;alert(1);//
and end up with working JS code like
<script>
var a = 'text1'; var b = 'text2\'; var c = ';alert(1);//text3';
</script>
But since there are line breaks that shouldn't be a problem either. Am I missing something else?
It would be more secure to JSON encode your data, instead of rolling your own Javascript encoding function. When dealing with web application security, rolling your own is almost always not the answer. A JSON representation would handle the quotes and backslashes and any other special characters.
Most server side languages have a JSON module. Some also have a function specifically for what you're doing such as HttpUtility.JavaScriptStringEncode for the .NET framework.
If you were to roll your own, it would be better to replace characters with escape sequences (for example, " with \x22) instead of altering or removing single quotes. Also consider that there is a multitude of creative XSS attacks you would need to defend against.
Whatever method you use, the end result should be that your data remains intact when presented to the user. For example, it's no good having O"Neil if someone's name is O'Neil.
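As an illustration only of the \x22-style escaping mentioned above (a library encoder is still the safer choice), here is a sketch that writes every character other than ASCII letters and digits as a hex escape. hexEscapeForJs is a hypothetical name, and the inputs are the values from the question, placed on a single line as in the attack scenario.

```javascript
// Hypothetical helper: escape everything except ASCII letters and digits.
function hexEscapeForJs(value) {
  return value.replace(/[^A-Za-z0-9]/g, (ch) => {
    const code = ch.charCodeAt(0);
    return code < 256
      ? '\\x' + code.toString(16).padStart(2, '0')
      : '\\u' + code.toString(16).padStart(4, '0');
  });
}

// The values from the question, including the trailing backslash.
const inputs = ['text1', 'text2\\', ';alert(1);//'];
const line = inputs
  .map((v, i) => 'var ' + 'abc'[i] + " = '" + hexEscapeForJs(v) + "';")
  .join(' ');
console.log(line);
```

Because the backslash and the quote are both written as escapes, the trailing backslash can no longer swallow the closing quote, and the values come back intact when the script runs.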
