Prevent XSS by converting any data to a String in Javascript? [duplicate]


I'm making a javascript code editor for users on my site. One of the features I built was a custom console.
Users can write console.log in their code and the logged string gets appended to a div on the page doing something like this:
function toConsole(str) {
var myconsole = document.getElementById("console-text");
var message = document.createElement("span");
message.append(str);
myconsole.append(message);
}
str is set to whatever string the user inputs into console.log. Can appending this string run malicious code on my page? (The .append() jQuery api page says 'yes' but I can't seem to get it to interpret anything I write as html)
If so how can I prevent this and how can I test to make sure it's safe?

You can use text() to insert the content as a textNode, which will cause it not to be rendered by the page as markup, but as plain text.
var stringContainingHtmlAndJavascript = '<div><b>This will be bold</b></div>';
$(document.body).text(stringContainingHtmlAndJavascript);
$(document.body).append(stringContainingHtmlAndJavascript);
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
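For the plain-DOM toConsole from the question, the same idea can be sketched without jQuery. This is only a minimal sketch: the name appendAsText and the doc parameter are assumptions introduced here (in a page you would pass the global document), not something from the question.

```javascript
// Hypothetical helper (not from the question): append `str` to `target` as inert text.
// `doc` is normally the global `document`; it is a parameter only so the
// function can be exercised outside a browser.
function appendAsText(doc, target, str) {
  var message = doc.createElement("span");
  // createTextNode() stores the string as character data; it is never parsed as HTML.
  message.appendChild(doc.createTextNode(String(str)));
  target.appendChild(message);
  return message;
}

// In a page (assumed usage):
// appendAsText(document, document.getElementById("console-text"), "<b>not bold</b>");
```

Because the string only ever becomes a text node, markup such as an img tag with an onerror attribute shows up literally instead of being created as an element.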

As Taplar said, there are different ways.
You can also add the HTML to a temporary element and extract the text afterwards.
See my third example for this.
$(function() {
var stringContainingHtmlAndJavascript = '<div><b>This will be bold</b></div>';
$('.test1').append(stringContainingHtmlAndJavascript);
$('.test2').text(stringContainingHtmlAndJavascript);
$('.test3').text($('<i/>').append(stringContainingHtmlAndJavascript).text());
});
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<div class="test1"></div>
<div class="test2"></div>
<div class="test3"></div>

I suggest .parseHTML(), with a caveat that is stated in the documentation.
.parseHTML(), as the method name says, parses a string to interpret the HTML.
The third argument, keepScripts, defaults to false. Setting it to true would open the gates wide to scripts.
So it "normally removes" the script tags. If the string contains nothing but a script, the parsed result is empty and $.parseHTML(str)[0] is undefined (like demo case #3). So you will probably need to add an if condition to avoid appending the text "undefined".
So, in the demo below, I used your posted "script to append" almost as-is. I just added the HTML parsing method.
IMPORTANT: cases #1 to #4 are safe, but #5 is a breach. If there is an inline on[event] attribute in the parsed HTML, it passes through the "script filter" and may execute.
$(".console_ok").on("click",function(){
toConsole( $(this).prev(".console_input").val() );
});
function toConsole(str) {
str = $.parseHTML(str)[0];
var myconsole = document.getElementById("console-text");
var message = document.createElement("span");
message.append(str);
myconsole.append(message);
}
input{
width: 60em;
}
#console-text{
height:8em;
width:20em;
background-color: #bbb;
border: 1px solid black;
}
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
console.log test #1:<input class="console_input" value="Just text">
<button class="console_ok">OK</button><br>
<br>
console.log test #2:<input class="console_input" value="<h1>Some HTML</h1>">
<button class="console_ok">OK</button><br>
<br>
console.log test #3:<input class="console_input" value="<script>alert('A script!!!');</script>">
<button class="console_ok">OK</button><br>
<br>
console.log test #4:<input class="console_input" value="<div style='height:20px;background-color:red;'><script>alert('A script!!!');</script>And some <b>bold</b> text...</div>">
<button class="console_ok">OK</button><br>
<br>
console.log test #5:<input class="console_input" value="<img src='invalid-path' onerror='alert(`JS EXECUTES HERE!!!`);'>">
<button class="console_ok">OK</button><br>
<br>
My console:<br>
<div id="console-text"></div>
(Please run in full page mode)
You will notice the [0] after $.parseHTML(str). $.parseHTML() returns an array of DOM nodes, and [0] takes the first one, as your function is plain JS. Your function could be written with jQuery like this too (here every parsed node is appended, not only the first):
function toConsole(str) {
str = $.parseHTML(str);
var myconsole = $("#console-text");
var message = $("<span>");
message.append(str);
myconsole.append(message);
}

jquery.append() and jquery.html() are not secure against XSS attacks unless you sanitize the data before you display it to the user.
function encodeHTML(s) {
return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/"/g, '&quot;');}
but jquery.text() is safe because it does not render the HTML code and inserts it as raw text.
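A slightly fuller variant of such an encoder might look like the sketch below. The name escapeHtml is an assumption made for illustration, and the choice to also escape > and the single quote is a common precaution rather than something the answer states.

```javascript
// Hypothetical helper: escape the characters that are significant in HTML
// text and in quoted attribute values. "&" must be replaced first so that
// the entities produced by the later replacements are not escaped again.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

console.log(escapeHtml("<img src='x' onerror=\"alert(1)\">"));
// prints: &lt;img src=&#39;x&#39; onerror=&quot;alert(1)&quot;&gt;
```

The escaped string can then be passed to .append() or .html() and will be displayed as text. Escaping like this covers element content and quoted attribute values only; it is not enough for values placed inside script blocks, URLs, or unquoted attributes.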

Related

Passing HTML value to embedded script

So I have a HTML file with an embedded script. A Java application sends a value to this HTML file. Now I wonder how to pass this value from the HTML down to the script. Is this even possible?
Here is the simplified HTML file with my approach:
<html>
<body>
<div id="test">
[VALUE_FROM_BACKEND] // prints "let valueFromBackend = 1234"
</div>
<script>
console.log(document.getElementById('test').value);
// should return: let valueFromBackend = 1234;
// actually returns: undefined
</script>
</body>
</html>
Unfortunately, I can't pass the value from the Java application directly to the script. I got the above approach from here, but this doesn't work.
Other solutions only focus on getting values from remote HTML pages, declaring the HTML files's source in the script tag. But since it is an embedded script here, this also seems not to work.
Does anyone know how to deal with the situation? Help will be much appreciated.

Only form controls such as input elements have a value property in JavaScript. A div does not have one, which is why your code returns undefined.
To access the text inside a regular HTML element, such as a div, use element.innerText instead.
Here is a working code snippet you can try out:
console.log(document.getElementById('test').innerText);
<div id="test">
let valueFromBackend = 1234
</div>

Since you want to get the content of a div element, the syntax is:
document.getElementById('test').innerHTML
Remember that getElementById().value works for input elements; use getElementById().innerHTML for elements like div.
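Once the text has been read from the div, the script still has to pull the actual value out of it. A minimal sketch follows, assuming the backend really prints text of the form "let valueFromBackend = 1234" as in the question; the helper name extractBackendValue is hypothetical.

```javascript
// Hypothetical helper: pull the value out of text such as
// "let valueFromBackend = 1234" (the exact format is an assumption
// based on the question's example).
function extractBackendValue(text) {
  // Capture everything after "=" up to the next whitespace or semicolon.
  var match = /=\s*([^;\s]+)/.exec(String(text));
  return match ? match[1] : null;
}

// In a page (assumed usage):
// var raw = document.getElementById("test").textContent;
// var value = Number(extractBackendValue(raw));
console.log(extractBackendValue("\n  let valueFromBackend = 1234\n")); // prints: 1234
```

The function returns a string (or null when no "=" is found), so convert it with Number() if a numeric value is needed.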

How can I make HTML code non-execute?

What I want to do is allow the user to input a string then display that string in the web page inside a div element, but I don't want the user to be able to add a bold tag or anything that would actually make the HTML text bold. How could I make it so the text entered by the user does not get converted into HTML code, if the text has an HTML tag in it?

Use createTextNode(value) and append it to your element (the standard solution), or set innerText (non-standard when this was written; textContent is the standard equivalent) instead of innerHTML.
For a JQuery solution look at Dan Weber's answer.

here's a neat little function to sanitize untrusted text:
function sanitize(ht){ // tested in ff, ch, ie9+
return new Option(ht).innerHTML;
}
example input/output:
sanitize(" Hello <img src=data:image/png, onmouseover=alert(666) onerror=alert(666)> World");
// == " Hello &lt;img src=data:image/png, onmouseover=alert(666) onerror=alert(666)&gt; World"
It achieves the same result as setting elm.textContent=str;, but because it is a function it is easier to use inline, for example to run markdown after you sanitize() so that you can pretty-format input (e.g. linking URLs) without running arbitrary HTML from the user.
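The "sanitize first, then format" order mentioned above can be sketched in plain JavaScript as follows. The names escapeText, linkify and renderUserText are hypothetical, and the URL pattern is a deliberately simple assumption (http and https only), not a complete URL matcher.

```javascript
// Hypothetical helper: escape untrusted text so that it contains no markup.
function escapeText(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Turn http(s) URLs in ALREADY ESCAPED text into links. Because the input
// was escaped first, the only tags in the result are the <a> tags added here.
function linkify(escapedText) {
  return escapedText.replace(/https?:\/\/[^\s<]+/g, function (url) {
    return '<a href="' + url + '">' + url + "</a>";
  });
}

// Escape first, format second; the order matters.
function renderUserText(untrusted) {
  return linkify(escapeText(untrusted));
}

console.log(renderUserText("see http://example.com/?a=1&b=2 <b>x</b>"));
```

Running the formatter before escaping would escape the links it just created, and skipping the escape step would let the user's own tags through.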

Use .text() rather than .html() when setting the text in the div. This will render it as text instead of HTML.
$(document).ready(function() {
// Handler for .ready() called.
$("#change-it").click(function() {
var userLink = $('#usr-input').val().replace(/.*?:\/\//g, "");
$('#users-text').text(userLink);
});
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
<input type="text" class="form-control" id="usr-input">
<br>
<button id="change-it" type="button">Update Text</button>
<br>
<div id="users-text"></div>

Why not simply use .text() ?
$('#in').on('keyup', function(e) {
$('#out').text($(this).val());
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
<input id="in">
<br>
<div id="out"></div>

How to validate or wrap user inputted HTML to fix unclosed tags

I have text boxes in a form where users can input formatted text or raw HTML. It all works fine, however if a user doesn't close a tag (like a bold tag), then it ruins all HTML formatting after it (it all becomes bold).
Is there a way to either validate the user's input, automatically close tags, or somehow wrap the user input in an element to stop it leaking over?

You may try jquery-clean
$.htmlClean($myContent);

Is there a way to either validate the user's input, automatically close tags, or somehow wrap the user input in an element to stop it leaking over?
Yes: When the user is done editing the text area, you can parse what they've written using the browser, then get an HTML version of the parsed result from the browser:
var div = $("<div>");
div.html($("#the-textarea").val());
var html = div.html();
Live example — type an unclosed tag in and click the button:
$("input[type=button]").on("click", function() {
var div = $("<div>");
div.html($("#the-textarea").val());
var html = div.html();
$(document.body).append("<p>You wrote:</p><hr>" + html + "<hr>End of what you wrote.");
});
<p>Type something unclosed here:</p>
<textarea id="the-textarea" rows="5" cols="40"></textarea>
<br><input type="button" value="Click when ready">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
Important Note: If you're going to store what they write and then display it to anyone else, there is no client-side solution, including the above, which is safe. Instead, you must use a server-side solution to "sanitize" the HTML you get from them, to remove (for instance) malicious content, etc. All the above does is help you get mostly-well-formed markup, not safe markup.
Even if you're just displaying it to them, it would still be best to sanitize it, since they can work around any client-side pre-processing you do.

You could try and use : http://ejohn.org/blog/pure-javascript-html-parser/ .
But if the user is entering the html by hand you could just check to have all tags closed properly. If not, just display an error message to the user.
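A check of that kind could be sketched with a simple stack, as below. This is a rough illustration under stated assumptions, not a real HTML parser: it ignores comments, assumes no ">" inside attribute values, skips self-closed tags, and the name hasBalancedTags is hypothetical.

```javascript
// Elements that never take a closing tag in HTML.
var VOID_TAGS = ["area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"];

// Rough sketch: report whether every opening tag in `html` has a matching
// closing tag in the right order.
function hasBalancedTags(html) {
  var stack = [];
  var tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*?(\/?)>/g;
  var match;
  while ((match = tagPattern.exec(html)) !== null) {
    var isClosing = match[1] === "/";
    var name = match[2].toLowerCase();
    var selfClosed = match[3] === "/";
    // Void and self-closed tags neither open nor close anything.
    if (VOID_TAGS.indexOf(name) !== -1 || selfClosed) continue;
    if (isClosing) {
      // A closing tag must match the most recently opened tag.
      if (stack.pop() !== name) return false;
    } else {
      stack.push(name);
    }
  }
  // Anything left on the stack was opened but never closed.
  return stack.length === 0;
}

console.log(hasBalancedTags("<b>bold</b>")); // prints: true
console.log(hasBalancedTags("<b>bold"));     // prints: false
```

When the function returns false, the form can show an error message instead of accepting the input. It only checks tag balance, so it says nothing about whether the markup is safe to display.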

You can create a jQuery element using the text and then get its HTML, like so
Sample
<textarea>
<div>
<div>
<span>some content</span>
<span>some content
</div>
</textarea>
Script
alert($($('textarea').text()).html());
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>

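A simple way to check whether the entered markup is well-formed is to let the browser parse it with DOMParser and then check whether the result contains a parser error. Note that parsing as "text/xml" applies XML rules, so valid HTML such as an unclosed <br> is reported as invalid:
function checkHTML(html) {
var dom = new DOMParser().parseFromString(html, "text/xml");
// Browsers put the <parsererror> element in different places, so search the whole document.
return dom.querySelector('parsererror') === null;
}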
$('button').click(function() {
var html = $('textarea').val();
var isValid = checkHTML(html);
console.log(isValid);
$('div').html(isValid ? html : 'HTML is not valid!');
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<textarea cols="80" rows="7"><div>Some HTML</textarea> <button style="vertical-align:top">Check</button>
<div></div>

Can't get text to line break with <br> after inserting with innerHTML

I'm relatively new to HTML, js, coming from Delphi. I have reviewed the following answers but they don't seem to work for me, or I am not understanding what they are saying:
Multi line print
.innerHTML <br> breaking
html <br> and innerHTML problem
Problem: all my text is printing in one line, rather than line breaking.
Below is the HTML source.
<!DOCTYPE HTML>
<html>
<head>
<title>Test Module</title>
</head>
<body >
<p id="demo"></p>
<button id="btnRPCTest" onclick="RPCTest()">Test RPC call</button>
<p id="RPCTarget">(RPC output here...)</p>
<script id="broker" src="scripts/rpcbroker.js" type="text/javascript"></script>
<script>
function RPCTest() {
var Div = document.getElementById("RPCTarget")
Broker_CallV("XUS INTRO MSG", []);
var text = Broker_ResultsDelim("<br />\n");
Div.innerHTML = text;
}
</script>
</body>
</html>
Above, the code makes a remote procedure call (RPC) to the server, and the results are put into a custom TStringList object that I created, which at its core is an array of strings.
Below is the code that gets back the results as a string, delimited by the specified parameter ADelim.
function Broker_ResultsDelim(ADelim) {
return TStringList.GetTextDelim(RPCBrokerV.Results, ADelim);
}
Below, the array of strings should be concatenated into one long string, with contained line breaks.
GetTextDelim: function (Self, ADelim) {
return (Self.FData).join(ADelim) + ADelim;
},
When I run this code, and step through it with the Chrome developer console, I can type 'text' in the console, and the line breaks are correct.
> text
"
...............................................................
.................#.............................................
...###############.......##############..........########......
.################......################........####...####.....
.##.....####..........##....####.....###......##.......####....
.###....###...##............###......##......##................
..##....#######.............###.....###.....###................
.......###..##.............##########.......###....#########...
.......###.................###...............##........###.....
.......###.................###...............###......####.....
......###.................###.................####....###......
...###########........############.............##########......
......................................................###......
................................................########.......
...............................................................
<br />
"
But after the text gets inserted into the div, it is all on one line. I can't demonstrate that here, because when I copy the text (which appears on the screen to be all one line), and paste it into this question on StackOverflow, suddenly it is formatted correctly, with line breaks.
This makes me think that the text does contain the line break BR codes, but they are being ignored in the div.
I would think that this should be an easy question for someone knowledgeable in HTML, but it's got me scratching my head in confusion. Thanks in advance for the help.
Kevin

I'm going to ignore how you're generating the text. I'm going to assume that it is returned in a format like this, where there are actual line breaks ("\n") at the end of every line:
var code =
"\
...............................................................\
.................#.............................................\
...###############.......##############..........########......\
.################......################........####...####.....\
.##.....####..........##....####.....###......##.......####....\
.###....###...##............###......##......##................\
..##....#######.............###.....###.....###................\
.......###..##.............##########.......###....#########...\
.......###.................###...............##........###.....\
.......###.................###...............###......####.....\
......###.................###.................####....###......\
...###########........############.............##########......\
......................................................###......\
................................................########.......\
...............................................................\
"
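I used \ to continue the string literal across lines. A trailing backslash does not itself add a line break, so the real data needs a "\n" at the end of each line, as assumed above. Now, all you have to do is: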
CSS
html {
font-family: Menlo, Monaco, Consolas, "Courier New", monospace;
/* so that the text is monospace and lines up correctly */
}
HTML
<div id="holder">
</div>
JavaScript
$("#holder").append(code.split("\n").join("<br>"));
// or, without jQuery
document.getElementById("holder").innerHTML = code.split("\n").join("<br>");
fiddle
Your second option is to, as Hardy mentioned in the comments, use a <pre> tag. This will maintain the formatting that you give it, so all you would have to do is insert code into a <pre> tag.
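The first option (replacing line breaks with br tags) can be wrapped in a small function. In this sketch the name textToHtmlLines is hypothetical, and escaping &, < and > before inserting the tags is an added assumption: it treats the RPC output as plain text that should not be interpreted as markup.

```javascript
// Hypothetical helper: turn text with "\n" (or "\r\n") line breaks into an
// HTML string with <br> tags, treating the text itself as plain text.
function textToHtmlLines(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .split(/\r?\n/)
    .join("<br>");
}

// In a page (assumed usage):
// document.getElementById("holder").innerHTML = textToHtmlLines(code);
console.log(textToHtmlLines(".#.\n###\n.#."));
// prints: .#.<br>###<br>.#.
```

With a pre element none of this is needed: assigning the raw text to its textContent keeps the line breaks as they are.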

Use <pre> tag in your javascript like:
<script>
function RPCTest() {
var Div = document.getElementById("RPCTarget")
Broker_CallV("XUS INTRO MSG", []);
var text = Broker_ResultsDelim("\n");
Div.innerHTML = "<pre>" + text + "</pre>";
}
</script>

You only have one <br/> tag in your output. Notice that it is at the end of the console output. You need to replace the \n line breaks with <br />, or replace
<p id="RPCTarget">(RPC output here...)</p>
with
<pre id="RPCTarget">(RPC output here...)</pre>
and remove the trailing <br />.

Showing text from resources.resx in JavaScript

This is example code in ASP.NET MVC 3 Razor:
@section header
{
<script type="text/javascript">
$(function() {
alert('@Resources.ExampleCompany');
});
</script>
}
<div>
<h1>@Resources.ExampleCompany</h1>
</div>
The code above is just an example, but it shows my problem with encoding. The variable @Resources.ExampleCompany comes from the file resources.resx, with the value ExampleCompany = "Twoja firma / Twój biznes".
In JavaScript, the alert shows "Twoja firma / Tw&#243;j biznes".
Why is the character 'ó' shown as '&#243;'? What am I doing wrong?
In the HTML tag, <h1>@Resources.ExampleCompany</h1> is displayed correctly.
UPDATE:
Mark Schultheiss wrote a good hint and my "ugly solution" is:
var companySample = "@Resources.ExampleCompany";
$('#temp').append(companySample);
alert($('#temp').text());
Now the character is ó and looks good, but this is still not an answer to my issue.

According to HTML Encoding Strings - ASP.NET Web Forms VS Razor View Engine, the @ syntax automatically HTML encodes its output, and the solution is to use the Raw extension method (e.g., @Html.Raw(Resources.ExampleCompany)) to write the value without HTML encoding. Try that and let us know if that works.

Some of this depends upon WHAT you do with the text.
For example, using the tags:
<div id='result'>empty</div>
<div id='other'>other</div>
And code (since you are using jQuery):
var whatitis="Twoja firma / Tw&#243;j biznes";
var whatitisnow = unescape(whatitis);
alert(whatitis);
alert(whatitisnow);
$('#result').append(whatitis+" changed to:"+whatitisnow);
$('#other').text(whatitis+" changed to:"+whatitisnow);
In the browser, the "result" tag shows both correctly (as you desire) whereas the "other" shows it with the escaped character. And BOTH alerts show it with the escaped character.
See here for example: http://jsfiddle.net/MarkSchultheiss/uJtw3/.

I use following trick:
<script type="text/javascript">
$('<div/>').html("@Resources.ExampleCompany").text();
</script>
Maybe it will help.
UPDATE
I have tested this behavior of Razor more thoroughly and I've found that:
1. When the text is put as normal content of HTML, the @Html.Raw method simply helps and writes the char 'ó' without HTML encoding (not as: &#243;)
example:
<div> @Html.Raw("ó") </div>
example:
<script type="text/javascript">
var a = $('<div/>').html('@("ó")').text();// or var a = '@Html.Raw("ó")';
console.log(a); // it shows: ó
</script>
2. But if it is put inside HTML tags as an attribute, then Razor converts it to &#243; and @Html.Raw doesn't help at all
example:
<meta name="description" content="@("ó")" />
You can fix it by putting the entire tag into the Resource (as in that post) or into a string (as in my example)
@("<meta name="description" content="ó" />")
So sometimes somebody may be a little confused that the answers help others but not him.

I had a similar issue, but in my case I was assigning a value from a Resource to a JavaScript variable. There was the same problem with the encoding of the letter ó. Afterwards this variable was bound to an HTML object (by a knockout binding, to be precise). In my situation the code below did the trick:
var label = '@Html.Raw(Resource.ResourceName)';
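If the encoded value has already reached the script, for example as "Tw&#243;j", the numeric character references can also be decoded on the client without going through a DOM element. This is a sketch of an alternative technique, not what the answers above use; the name decodeNumericEntities is hypothetical, and named entities such as &amp; are deliberately left alone.

```javascript
// Hypothetical helper: decode numeric character references such as
// "&#243;" (decimal) or "&#xF3;" (hexadecimal). Named entities like
// "&amp;" are left untouched.
function decodeNumericEntities(text) {
  return String(text).replace(/&#(x[0-9a-fA-F]+|[0-9]+);/g, function (whole, code) {
    var isHex = code.charAt(0) === "x";
    var codePoint = parseInt(isHex ? code.slice(1) : code, isHex ? 16 : 10);
    // Leave out-of-range references as they are rather than throwing.
    return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : whole;
  });
}

console.log(decodeNumericEntities("Twoja firma / Tw&#243;j biznes"));
// prints: Twoja firma / Twój biznes
```

Only decode values that will be used as plain text (for example in alert() or with .text()); a decoded string should not be inserted as HTML without escaping it again.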
