How to safely prepare a string for JavaScript using PHP? [duplicate] - javascript

This question already has answers here:
How do I pass variables and data from PHP to JavaScript?
(19 answers)
Closed 2 years ago.
Considering the following piece of code rendered via PHP:
<script type="application/javascript">
const str = '<?= $str ?>';
</script>
How to prepare the $str so it can be rendered safely?
P.S. Safely is defined as "JavaScript gets the same Unicode contents as $str has in PHP".

You could use PHP's json_encode:
<script type="application/javascript">
const str = <?=json_encode($str)?>;
</script>
This will return a JSON representation of your String, meaning that:
it will be wrapped with double quotes
escape double quotes which might be in the String itself
escape Unicode characters
With this code alone (const str = ...), there is absolutely zero risk of XSS. The String is safe by itself, and can be manipulated by JS.
However, it can become an XSS hazard if you use that String like this:
eval(str); for obvious reasons
elem.innerHTML = str; for example if str === '<button onclick="sendCookiesToEvilServer()">Click me</button>', or '<style>body { display: none; }</style>'
... that list is not exhaustive
If you want to display that String to the user, prefer .innerText to .innerHTML, or maybe look at strip_tags.
If you do need to use innerHTML because the string is allowed to contain some HTML, you need to properly sanitize it against XSS. And it's a little harder, because then, you need a parser to allow/remove only some HTML tags and/or attributes. strip_tags will allow you to do so to a certain extent (i.e. you could only allow <b> for example, but that won't prevent that <b> from having an attribute onclick="...").

Related

JavaScript functions breaking with comma in string from PHP [duplicate]

This question already has answers here:
How do I pass variables and data from PHP to JavaScript?
(19 answers)
Closed 8 years ago.
The community reviewed whether to reopen this question 1 year ago and left it closed:
Original close reason(s) were not resolved
What is the easiest way to encode a PHP string for output to a JavaScript variable?
I have a PHP string which includes quotes and newlines. I need the contents of this string to be put into a JavaScript variable.
Normally, I would just construct my JavaScript in a PHP file, à la:
<script>
var myvar = "<?php echo $myVarValue;?>";
</script>
However, this doesn't work when $myVarValue contains quotes or newlines.
Expanding on someone else's answer:
<script>
var myvar = <?php echo json_encode($myVarValue); ?>;
</script>
Using json_encode() requires:
PHP 5.2.0 or greater
$myVarValue encoded as UTF-8 (or US-ASCII, of course)
Since UTF-8 supports full Unicode, it should be safe to convert on the fly.
Note that because json_encode escapes forward slashes, even a string that contains </script> will be escaped safely for printing with a script block.
encode it with JSON
function escapeJavaScriptText($string)
{
return str_replace("\n", '\n', str_replace('"', '\"', addcslashes(str_replace("\r", '', (string)$string), "\0..\37'\\")));
}
I have had a similar issue and understand that the following is the best solution:
<script>
var myvar = decodeURIComponent("<?php echo rawurlencode($myVarValue); ?>");
</script>
However, the link that micahwittman posted suggests that there are some minor encoding differences. PHP's rawurlencode() function is supposed to comply with RFC 1738, while there appear to have been no such effort with Javascript's decodeURIComponent().
The paranoid version: Escaping every single character.
function javascript_escape($str) {
$new_str = '';
$str_len = strlen($str);
for($i = 0; $i < $str_len; $i++) {
$new_str .= '\\x' . sprintf('%02x', ord(substr($str, $i, 1)));
}
return $new_str;
}
EDIT: The reason why json_encode() may not be appropriate is that sometimes, you need to prevent " to be generated, e.g.
<div onclick="alert(???)" />
<script>
var myVar = <?php echo json_encode($myVarValue); ?>;
</script>
or
<script>
var myVar = <?= json_encode($myVarValue) ?>;
</script>
Micah's solution below worked for me as the site I had to customise was not in UTF-8, so I could not use json; I'd vote it up but my rep isn't high enough.
function escapeJavaScriptText($string)
{
return str_replace("\n", '\n', str_replace('"', '\"', addcslashes(str_replace("\r", '', (string)$string), "\0..\37'\\")));
}
Don't run it though addslashes(); if you're in the context of the HTML page, the HTML parser can still see the </script> tag, even mid-string, and assume it's the end of the JavaScript:
<?php
$value = 'XXX</script><script>alert(document.cookie);</script>';
?>
<script type="text/javascript">
var foo = <?= json_encode($value) ?>; // Use this
var foo = '<?= addslashes($value) ?>'; // Avoid, allows XSS!
</script>
You can insert it into a hidden DIV, then assign the innerHTML of the DIV to your JavaScript variable. You don't have to worry about escaping anything. Just be sure not to put broken HTML in there.
You could try
<script type="text/javascript">
myvar = unescape('<?=rawurlencode($myvar)?>');
</script>
Don’t. Use Ajax, put it in data-* attributes in your HTML, or something else meaningful. Using inline scripts makes your pages bigger, and could be insecure or still allow users to ruin layout, unless…
… you make a safer function:
function inline_json_encode($obj) {
return str_replace('<!--', '<\!--', json_encode($obj));
}
htmlspecialchars
Description
string htmlspecialchars ( string $string [, int $quote_style [, string $charset [, bool $double_encode ]]] )
Certain characters have special significance in HTML, and should be represented by HTML entities if they are to preserve their meanings. This function returns a string with some of these conversions made; the translations made are those most useful for everyday web programming. If you require all HTML character entities to be translated, use htmlentities() instead.
This function is useful in preventing user-supplied text from containing HTML markup, such as in a message board or guest book application.
The translations performed are:
* '&' (ampersand) becomes '&'
* '"' (double quote) becomes '"' when ENT_NOQUOTES is not set.
* ''' (single quote) becomes ''' only when ENT_QUOTES is set.
* '<' (less than) becomes '<'
* '>' (greater than) becomes '>'
http://ca.php.net/htmlspecialchars
I'm not sure if this is bad practice or no, but my team and I have been using a mixed html, JS, and php solution. We start with the PHP string we want to pull into a JS variable, lets call it:
$someString
Next we use in-page hidden form elements, and have their value set as the string:
<form id="pagePhpVars" method="post">
<input type="hidden" name="phpString1" id="phpString1" value="'.$someString.'" />
</form>
Then its a simple matter of defining a JS var through document.getElementById:
<script type="text/javascript" charset="UTF-8">
var moonUnitAlpha = document.getElementById('phpString1').value;
</script>
Now you can use the JS variable "moonUnitAlpha" anywhere you want to grab that PHP string value.
This seems to work really well for us. We'll see if it holds up to heavy use.
If you use a templating engine to construct your HTML then you can fill it with what ever you want!
Check out XTemplates.
It's a nice, open source, lightweight, template engine.
Your HTML/JS there would look like this:
<script>
var myvar = {$MyVarValue};
</script>

TinyMCE - Set Editor Content

I am building an auto-save feature for my TinyMCE editor.
With jQuery, I have saved the editor's content to my database and now want to load the content back in to the editor. I am having troubles with the quotes (") in the HTML coming from the DB.
My code:
var content = "<%=content%>" // Classic ASP variable containing HTML from DB
tinyMCE.activeEditor.setContent(content);
Example output:
var content = "<p>Oh yes, from Churchill, the <em><span style="text-decoration: underline;"><strong>dog</strong></span></em>.</p>"
tinyMCE.activeEditor.setContent(content);
In the variable, "content", there are double quotes in the style tag which is causing a JS error. How do I get around this? Do I replace quotes with single quotes or do I use an escape or encode function? Please help.
VBScript has escaping functionality like Javascript (also ready if you use JScript as default language in classic asp)
Check out Escape and UnEscape.
Following is an efficient way of doing to append server-side variables to client-side js block.
// escape on server-side, unescape with js
var content = unescape("<%= escape(content) %>");
tinyMCE.activeEditor.setContent(content);
Example output:
var content = unescape("%3Cp%3EOh%20yes%2C%20from%20Churchill%2C%20the%20%3Cem%3E%3Cspan%20style%3D%22text-decoration%3A%20underline%3B%22%3E%3Cstrong%3Edog%3C/strong%3E%3C/span%3E%3C/em%3E.%3C/p%3E");
tinyMCE.activeEditor.setContent(content);
If it is just HTML (no javascript) it is probably quickest to replace the double quotes with singles.
str = "<span style="color:black">Text "string"</span>";
tag_pattern = /\<.*\>/m;
matches = str.match(tag_pattern);
//copy matches array and replace quotes, then use matches array to replace into original string
Additionally to BeaverusIVs first answer i have to add that the line
str = "<span style="color:black">Text "string"</span>";
should look like this
str = '<span style="color:black">Text "string"</span>';
otherwise the string is not valid and an error is thrown.

how to extract body contents using regexp [duplicate]

This question already has answers here:
Regular Expression to Extract HTML Body Content
(6 answers)
Closed 8 years ago.
I have this code in a var.
<html>
<head>
.
.
anything
.
.
</head>
<body anything="">
content
</body>
</html>
or
<html>
<head>
.
.
anything
.
.
</head>
<body>
content
</body>
</html>
result should be
content
Note that the string-based answers supplied above should work in most cases. The one major advantage offered by a regex solution is that you can more easily provide for a case-insensitive match on the open/close body tags. If that is not a concern to you, then there's no major reason to use regex here.
And for the people who see HTML and regex together and throw a fit...Since you are not actually trying to parse HTML with this, it is something you can do with regular expressions. If, for some reason, content contained </body> then it would fail, but aside from that, you have a sufficiently specific scenario that regular expressions are capable of doing what you want:
const strVal = yourStringValue; //obviously, this line can be omitted - just assign your string to the name strVal or put your string var in the pattern.exec call below
const pattern = /<body[^>]*>((.|[\n\r])*)<\/body>/im;
const array_matches = pattern.exec(strVal);
After the above executes, array_matches[1] will hold whatever came between the <body and </body> tags.
var matched = XMLHttpRequest.responseText.match(/<body[^>]*>([\w|\W]*)<\/body>/im);
alert(matched[1]);
I believe you can load your html document into the .net HTMLDocument object and then simply call the HTMLDocument.body.innerHTML?
I am sure there is even and easier way with the newer XDocumnet as well.
And just to echo some of the comments above regex is not the best tool to use as html is not a regular language and there are some edge cases that are difficult to solve for.
https://en.wikipedia.org/wiki/Regular_language
Enjoy!

javascript string difference [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicates:
single quotes versus double quotes in js
When to Use Double or Single Quotes in JavaScript
What is the difference (if any) between the javascript strings defined below?
var str1 = "Somestring";
var str2 = 'Somestring';
"" and '' mean two very different things to me predominantly writing code in C++ :-)
EDIT: If there is no difference why are there two ways of achieving the same thing and which is considered better practice to use and why. Thanks!
Javascript treats single and double quotes as string delimiters.
If you use single quotes, you can use double quotes inside the string without escaping them.
If you use double quotes, you can use single quotes inside the string without escaping them.
Both examples evaluate to the same thing.
alert(str1 == str2); // true
alert(str1 === str2); // true
Why two ways? Due to the way javascript allows you to mix the two, you can write html attributes out without messy escapes:
var htmlString1 = "<a href='#'>link</a>";
var htmlString2 = 'link';
As for best practice, there is no convention. Use what feels best.
Personally, I like making sure the Javascript I emit matches the HTML (if I double quote attributes, I will delimit JS string with a ', so emitted attributes will use ").
In Javascript a string is a sequence of zero or more Unicode characters enclosed within single or double quotes (' or "). Double-quote characters may be contained within strings delimited by single-quote characters, and single-quote characters may be contained within strings delimited by double quotes.
In client-side JavaScript programming, JavaScript code often contains strings of HTML code, and HTML code often contains strings of JavaScript code. Like JavaScript, HTML uses either single or double quotes to delimit its strings. Thus, when combining JavaScript and HTML, it is a good idea to use one style of quotes for JavaScript and the other style for HTML.
No difference at all.
I believe the answer is there is no difference. They are both strings.
Here would be the usage of '
var mynewhtml = '<body class="myclass" ></body>';
or using "
var mynewhtml = "<body class='myclass' ></body>";
this also works but IMO is harder to read
var mynewhtml = "<body class=\"myclass\" ></body>";

jQuery.parseJSON throws “Invalid JSON” error due to escaped single quote in JSON

I’m making requests to my server using jQuery.post() and my server is returning JSON objects (like { "var": "value", ... }). However, if any of the values contains a single quote (properly escaped like \'), jQuery fails to parse an otherwise valid JSON string. Here’s an example of what I mean (done in Chrome’s console):
data = "{ \"status\": \"success\", \"newHtml\": \"Hello \\\'x\" }";
eval("x = " + data); // { newHtml: "Hello 'x", status: "success" }
$.parseJSON(data); // Invalid JSON: { "status": "success", "newHtml": "Hello \'x" }
Is this normal? Is there no way to properly pass a single quote via JSON?
According to the state machine diagram on the JSON website, only escaped double-quote characters are allowed, not single-quotes. Single quote characters do not need to be escaped:
Update - More information for those that are interested:
Douglas Crockford does not specifically say why the JSON specification does not allow escaped single quotes within strings. However, during his discussion of JSON in Appendix E of JavaScript: The Good Parts, he writes:
JSON's design goals were to be minimal, portable, textual, and a subset of JavaScript. The less we need to agree on in order to interoperate, the more easily we can interoperate.
So perhaps he decided to only allow strings to be defined using double-quotes since this is one less rule that all JSON implementations must agree on. As a result, it is impossible for a single quote character within a string to accidentally terminate the string, because by definition a string can only be terminated by a double-quote character. Hence there is no need to allow escaping of a single quote character in the formal specification.
Digging a little bit deeper, Crockford's org.json implementation of JSON for Java is more permissible and does allow single quote characters:
The texts produced by the toString methods strictly conform to the JSON syntax rules. The constructors are more forgiving in the texts they will accept:
...
Strings may be quoted with ' (single quote).
This is confirmed by the JSONTokener source code. The nextString method accepts escaped single quote characters and treats them just like double-quote characters:
public String nextString(char quote) throws JSONException {
char c;
StringBuffer sb = new StringBuffer();
for (;;) {
c = next();
switch (c) {
...
case '\\':
c = this.next();
switch (c) {
...
case '"':
case '\'':
case '\\':
case '/':
sb.append(c);
break;
...
At the top of the method is an informative comment:
The formal JSON format does not allow strings in single quotes, but an implementation is allowed to accept them.
So some implementations will accept single quotes - but you should not rely on this. Many popular implementations are quite restrictive in this regard and will reject JSON that contains single quoted strings and/or escaped single quotes.
Finally to tie this back to the original question, jQuery.parseJSON first attempts to use the browser's native JSON parser or a loaded library such as json2.js where applicable (which on a side note is the library the jQuery logic is based on if JSON is not defined). Thus jQuery can only be as permissive as that underlying implementation:
parseJSON: function( data ) {
...
// Attempt to parse using the native JSON parser first
if ( window.JSON && window.JSON.parse ) {
return window.JSON.parse( data );
}
...
jQuery.error( "Invalid JSON: " + data );
},
As far as I know these implementations only adhere to the official JSON specification and do not accept single quotes, hence neither does jQuery.
If you need a single quote inside of a string, since \' is undefined by the spec, use \u0027 see http://www.utf8-chartable.de/ for all of them
edit: please excuse my misuse of the word backticks in the comments. I meant backslash. My point here is that in the event you have nested strings inside other strings, I think it can be more useful and readable to use unicode instead of lots of backslashes to escape a single quote. If you are not nested however it truly is easier to just put a plain old quote in there.
I understand where the problem lies and when I look at the specs its clear that unescaped single quotes should be parsed correctly.
I am using jquery`s jQuery.parseJSON function to parse the JSON string but still getting the parse error when there is a single quote in the data that is prepared with json_encode.
Could it be a mistake in my implementation that looks like this (PHP - server side):
$data = array();
$elem = array();
$elem['name'] = 'Erik';
$elem['position'] = 'PHP Programmer';
$data[] = json_encode($elem);
$elem = array();
$elem['name'] = 'Carl';
$elem['position'] = 'C Programmer';
$data[] = json_encode($elem);
$jsonString = "[" . implode(", ", $data) . "]";
The final step is that I store the JSON encoded string into an JS variable:
<script type="text/javascript">
employees = jQuery.parseJSON('<?=$marker; ?>');
</script>
If I use "" instead of '' it still throws an error.
SOLUTION:
The only thing that worked for me was to use bitmask JSON_HEX_APOS to convert the single quotes like this:
json_encode($tmp, JSON_HEX_APOS);
Is there another way of tackle this issue? Is my code wrong or poorly written?
Thanks
When You are sending a single quote in a query
empid = " T'via"
empid =escape(empid)
When You get the value including a single quote
var xxx = request.QueryString("empid")
xxx= unscape(xxx)
If you want to search/ insert the value which includes a single quote in a query
xxx=Replace(empid,"'","''")
Striking a similar issue using CakePHP to output a JavaScript script-block using PHP's native json_encode. $contractorCompanies contains values that have single quotation marks and as explained above and expected json_encode($contractorCompanies) doesn't escape them because its valid JSON.
<?php $this->Html->scriptBlock("var contractorCompanies = jQuery.parseJSON( '".(json_encode($contractorCompanies)."' );"); ?>
By adding addslashes() around the JSON encoded string you then escape the quotation marks allowing Cake / PHP to echo the correct javascript to the browser. JS errors disappear.
<?php $this->Html->scriptBlock("var contractorCompanies = jQuery.parseJSON( '".addslashes(json_encode($contractorCompanies))."' );"); ?>
I was trying to save a JSON object from a XHR request into a HTML5 data-* attribute. I tried many of above solutions with no success.
What I finally end up doing was replacing the single quote ' with it code ' using a regex after the stringify() method call the following way:
var productToString = JSON.stringify(productObject);
var quoteReplaced = productToString.replace(/'/g, "'");
var anchor = '<a data-product=\'' + quoteReplaced + '\' href=\'#\'>' + productObject.name + '</a>';
// Here you can use the "anchor" variable to update your DOM element.
Interesting. How are you generating your JSON on the server end? Are you using a library function (such as json_encode in PHP), or are you building the JSON string by hand?
The only thing that grabs my attention is the escape apostrophe (\'). Seeing as you're using double quotes, as you indeed should, there is no need to escape single quotes. I can't check if that is indeed the cause for your jQuery error, as I haven't updated to version 1.4.1 myself yet.

Categories

Resources