jQuery don't parse escaped HTML in .html() method

Take for example this HTML:
<td onclick="$(this).html('Wanted HTML: <br>; Unwanted HTML: &lt;script&gt;alert('xss')&lt;/script&gt;')">
Click to Show</td>
As you can see, I have already escaped (using PHP) the unwanted HTML to entities. But when you click the box it executes the JavaScript.
If I change .html to .text, it displays the line breaks literally as well.
How can I have it show the <br>s as line breaks, but the <s and >s as literal less-than and greater-than signs when you click the box?

The problem is that your characters are being decoded in the onclick, before they reach the JavaScript function.
You need to double-encode. So your example would become:
<td onclick="$(this).html('Wanted HTML: &lt;br&gt;; Unwanted HTML: &amp;lt;script&amp;gt;alert(&amp;#39;xss&amp;#39;)&amp;lt;/script&amp;gt;')">
Click to Show
</td>
Notice that I encoded (single encoded) the tags you do want. I also added the missing semicolon on the first &gt;.
Of course, the better solution is to remove this from the HTML entirely. Most developers agree that JavaScript is for interactivity and HTML is for content, and that the two should mix as little as possible (with the JavaScript hooking into the content through calls such as addEventListener).
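To make the "keep it out of the HTML" suggestion concrete, here is a rough sketch of doing the escaping in JavaScript and attaching the click handler with addEventListener. The td.reveal selector and the data-payload attribute are made up for illustration and are not from the question; treat this as one possible arrangement rather than the definitive fix.

```javascript
// Escape the characters that are significant in HTML text and attribute values.
// "&" is handled first so the entities produced below are not escaped again.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// The trusted <br> is written as real markup; only the untrusted part is escaped.
function buildCellHtml(untrusted) {
  return "Wanted HTML: <br> Unwanted HTML: " + escapeHtml(untrusted);
}

// Browser-only wiring; skipped when no DOM is available (for example under Node).
if (typeof document !== "undefined") {
  // "td.reveal" and "data-payload" are hypothetical names used for this sketch.
  document.querySelectorAll("td.reveal").forEach(function (cell) {
    cell.addEventListener("click", function () {
      cell.innerHTML = buildCellHtml(cell.dataset.payload || "");
    });
  });
}
```

With this arrangement the server only needs to encode the payload once (as a normal attribute value), because nothing is passed through an inline onclick attribute any more.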

Related

Can I hide a specific character (,) from the browser?

Complete noob here.
The CMS I'm forced to use is spitting out commas (,) on a page and I want to hide them from the viewer as they are messing up my layout.
The code (that I CANNOT edit) typically looks like this:
<img class="" src="https://via.placeholder.com/350x150"> , <img class="" src="https://via.placeholder.com/350x150"> , <img class="" src="https://via.placeholder.com/350x150"> ,
I can, however, wrap the above HTML in a div. I can also add any JavaScript or jQuery to either the header or footer of the page. The page contains multiple blocks of images, sometimes with 4 or 5 images together, and all of them are currently comma-separated. These commas are appearing in the browser, and they are what I want to remove.
A working demo where these commas are removed would be amazing. Thanks all who read this.

Not the prettiest solution but I think you can get the point. Just strip the commas from the contents of the div, since you said you can do that.
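// container is the jQuery object for the wrapping div; write the comma-free markup back into it
container.html(container.html().replace(/,/g, ""));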
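A slightly more cautious variant, sketched without jQuery: instead of deleting every comma, it only removes commas that sit directly after a tag, so commas inside attribute values such as alt text are left alone. The .image-block class is an assumed name for the wrapper div, not something from the question.

```javascript
// Remove a comma (and the whitespace around it) that directly follows a tag's
// closing ">". Commas elsewhere, for example inside alt="a, b", are untouched.
// Note: text that legitimately starts with a comma right after a tag is removed too.
function stripCommasAfterTags(html) {
  return html.replace(/>\s*,\s*/g, "> ");
}

// Browser-only wiring; run it from the footer so the blocks already exist.
if (typeof document !== "undefined") {
  // ".image-block" is a hypothetical class on the wrapper div.
  document.querySelectorAll(".image-block").forEach(function (block) {
    block.innerHTML = stripCommasAfterTags(block.innerHTML);
  });
}
```

Because the rule is applied per wrapper, it should cope with any number of images in a block.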

Insert emoji with zero width joiner using Javascript

I have not been able to successfully insert an emoji into the DOM using Javascript when I am given the codepoints and zero width joiners are used.
Consider this emoji: 👩‍👩‍👦
I am able to create a string that looks like this:
&#x1F469;&zwj;&#x1F469;&zwj;&#x1F466;
and insert it into the innerHTML of an element, but the 3 characters end up getting displayed instead of the single combined character. If you look at the HTML on this page for this character, you can see that it is formatted in the same way as my string:
https://emojipedia.org/family-woman-woman-boy/
This is only an issue when zero width joiners are used.
So doing this:
el.innerHTML = "&#x1F469;&zwj;&#x1F469;&zwj;&#x1F466;"
should result in a single character, but it doesn't. How can I get the single character to display? NOTE: the character cannot just be added by typing the text into an editor. The content is generated by JavaScript.

Not really sure what the question is here, but if you have a good UTF-8/Unicode editor you can of course just paste the emoji into your text file.
If this is problematic, you could build it up using HTML escaping.
Below I have done both: the first just pastes the emoji into the editor (unfortunately the SO editor is not the best here), and the second uses HTML escaping.
Hope this helps.
Update: using your version also seems to work for me in Chrome. What browser are you using?
document.querySelector("#container").innerHTML = "👩‍👩‍👦";
document.querySelector("#container2").innerHTML =
"&#x1F469;&#x200D;&#x1F469;&#x200D;&#x1F466;";
document.querySelector("#container3").innerHTML =
"&#x1F469;&zwj;&#x1F469;&zwj;&#x1F466;";
<div id="container">
</div>
<div id="container2">
</div>
<div id="container3">
</div>
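Since the question says the content is generated from code points, another option is to skip HTML entities altogether and build the string with String.fromCodePoint. This is only a sketch: the dash-separated hex input format is an assumption for illustration, so adapt the parsing to however the code points actually arrive.

```javascript
// Build a string from a dash-separated list of hex code points,
// e.g. "1F469-200D-1F469-200D-1F466" (the input format is an assumption).
function emojiFromHex(sequence) {
  var codePoints = sequence.split("-").map(function (hex) {
    return parseInt(hex, 16);
  });
  return String.fromCodePoint.apply(null, codePoints);
}

// Woman, ZWJ, woman, ZWJ, boy: rendered as one glyph where the font supports it.
var family = emojiFromHex("1F469-200D-1F469-200D-1F466");

// Browser-only: textContent is enough because the string holds real characters,
// not entities that would need parsing.
if (typeof document !== "undefined") {
  var target = document.querySelector("#container");
  if (target) {
    target.textContent = family;
  }
}
```

Whether the sequence shows up as a single glyph still depends on the browser and the installed emoji font; if either lacks the combined glyph, the three separate characters are the expected fallback.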

Invalid location of <script> tag within a HTML <pre> tag

I am going through the example given in JavaScript The Complete Reference 3rd Edition.
The O/P can be seen here, given by the author.
<body>
<h1>Standard Whitespace Handling</h1>
<script>
// STRINGS AND (X)HTML
document.write("Welcome to JavaScript strings.\n");
document.write("This example illustrates nested quotes 'like this.'\n");
document.write("Note how newlines (\\n's) and ");
document.write("escape sequences are used.\n");
document.write("You might wonder, \"Will this nested quoting work?\"");
document.write(" It will.\n");
document.write("Here's an example of some formatted data:\n\n");
document.write("\tCode\tValue\n");
document.write("\t\\n\tnewline\n");
document.write("\t\\\\\tbackslash\n");
document.write("\t\\\"\tdouble quote\n\n");
</script>
<h1>Preserved Whitespace</h1>
<pre>
<script> // in Eclipse IDE, at this line invalid location of tag(script)
// STRINGS AND (X)HTML
document.write("Welcome to JavaScript strings.\n");
document.write("This example illustrates nested quotes 'like this.'\n");
document.write("Note how newlines (\\n's) and ");
document.write("escape sequences are used.\n");
document.write("You might wonder, \"Will this nested quoting work?\"");
document.write(" It will.\n");
document.write("Here's an example of some formatted data:\n\n");
document.write("\tCode\tValue\n");
document.write("\t\\n\tnewline\n");
document.write("\t\\\\\tbackslash\n");
document.write("\t\\\"\tdouble quote\n\n");
</script>
</pre>
</body>
(X)HTML automatically “collapses” multiple whitespace characters down to one whitespace. So, for example, including multiple consecutive tabs in your HTML shows up as only one space character. In this example, the pre tag is used to tell the browser that the text is preformatted and that it should not collapse the white space inside of it. Similarly, we could use the CSS white-space property to modify standard white space handling. Using pre allows the tabs in the example to be displayed correctly in the output.
So, how do I get rid of this warning, and do I really need to be concerned about it? I think I am missing something, as my intuition is that the authors are not wrong.

There is nothing wrong with having a script inside a pre tag. It is just an Eclipse IDE validation issue. If you open this HTML in the browser, everything works fine and no warnings are displayed.
Also, if you want to show a script tag as text content inside a pre tag, have a look at this question: script in pre

Painlessly pass HTML to javascript

I need to pass html to javascript so that I can show the html on demand.
I can do it using textareas by having a textarea tag with the html content on the page, like so: <textarea id="html">{whatever html I want except other textareas}</textarea>
then using jquery I can present it on the page:
$("#target").html($("#html").val());
What I want to know is how to do it properly, without having to use textareas or having the html present in the <body> of the page at all?

You could use jquery templates. It's a bit more complex, but offers lots of other nice features.
https://github.com/codepb/jquery-template

Just save it in a variable:
<script type="text/javascript">
var myHTML = '<div>Foo Bar</div>';
</script>

As far as I know there is no painless way to do this due to the nature of html and javascript.
You can store your html as a string in a javascript variable such as:
var string = '<div class="someClass">your text here</div>';
However, you should note that strings are enclosed within either ' or ", and if you use the enclosing quote in your HTML you will prematurely end the string and cause errors from invalid JavaScript.
You can decide to use only one type of quote in your HTML, say ", and then use ' to hold strings in JavaScript, but a more robust way is to escape the quotes in your HTML like so:
<div class=\"someClass\">your text here</div>
Putting \ before a quote tells JavaScript to treat it as a literal character rather than as the end of the string. When you print the string out, the quote is still there but the \ is not, giving you functioning HTML.
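If the target browsers support ES2015 (an assumption, since the question doesn't say), a template literal avoids the quote juggling: both ' and " can appear unescaped between backticks. A minimal sketch, reusing the #target id from the question:

```javascript
// Backtick-delimited template literal: single and double quotes need no escaping.
// Only a literal backtick or "${" inside the markup would have to be escaped.
var label = "your text here";
var myHTML = `<div class="someClass">It's "${label}"</div>`;

// Browser-only: show the markup on demand in the element with id "target".
if (typeof document !== "undefined") {
  var target = document.querySelector("#target");
  if (target) {
    target.innerHTML = myHTML;
  }
}
```

Keep in mind that ${...} inserts values verbatim, so anything user-supplied still has to be HTML-escaped before it is interpolated.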

Just like remy mentioned, you can use jQuery templates, and it's even cooler if you combine it with mustache! (which supports a lot of platforms)
Plus the mustache jQuery plugin is way more advanced than jQuery templates.
https://github.com/jonnyreeves/jquery-Mustache

Getting unparsed (raw) HTML with JavaScript

I need to get the actual html code of an element in a web page.
For example, if the actual HTML code inside the element is "How to&nbsp;fix"
Running this JavaScript:
document.getElementById('myE').innerHTML
Gives me "How to fix" which is the parsed HTML.
How can I get the unparsed "How to&nbsp;fix" using JavaScript?

You cannot get the actual HTML source of part of your web page.
When you give a web browser an HTML page, it parses the HTML into DOM nodes that are the definitive version of your document as far as the browser is concerned. The DOM keeps the significant information from the HTML, like the fact that you used the Unicode character U+00A0 Non-Breaking Space before the word fix, but not the irrelevant information that you wrote it as an entity reference (&nbsp;) rather than just typing the raw character.
When you ask the browser for an element node's innerHTML, it doesn't give you the original HTML source that was parsed to produce that node, because it no longer has that information. Instead, it generates new HTML from the data stored in the DOM. The browser decides on how to format that HTML serialisation; different browsers produce different HTML, and chances are it won't be the same way you formatted it originally.
In particular,
element names may be upper- or lower-cased;
attributes may not be in the same order as you stated them in the HTML;
attribute quoting may not be the same as in your source. IE often generates unquoted attributes that aren't even valid HTML; all you can be sure of is that the innerHTML generated will be safe to use in the same browser by writing it to another element's innerHTML;
it may not use entity references for anything but characters that would otherwise be impossible to include directly in text content: ampersands, less-thans and attribute-value-quotes. Instead of returning &nbsp; it may simply give you the raw U+00A0 character.
You may not be able to see that that's a non-breaking space, but it still is one, and if you insert that HTML into another element it will act as one. You shouldn't need to rely anywhere on a non-breaking space character being entity-escaped to &nbsp;, but if you do, for some reason, you can get that by doing:
x = el.innerHTML.replace(/\xA0/g, '&nbsp;');
but that's only escaping U+00A0 and not any of the other thousands of possible Unicode characters, so it's a bit questionable.
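If you did want to cover more than U+00A0, one possible approach is to turn every character outside printable ASCII into a numeric character reference. This is a sketch of that idea, not a way to recover the original source: it produces &#xA0; rather than &nbsp;, and it cannot know which characters were entity-escaped in the first place.

```javascript
// Replace every character above U+007E with a hexadecimal character reference.
// Array.from iterates by code point, so astral characters (emoji) stay intact.
function escapeNonAscii(html) {
  return Array.from(html, function (ch) {
    var codePoint = ch.codePointAt(0);
    return codePoint > 0x7e
      ? "&#x" + codePoint.toString(16).toUpperCase() + ";"
      : ch;
  }).join("");
}

// Example: the non-breaking space becomes a visible reference again.
var escaped = escapeNonAscii("How to\u00A0fix");
```

In the browser this would typically be applied to el.innerHTML, after which the result is safe to display somewhere that shows raw text.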
If you really really need to get your page's actual source HTML, you can make an XMLHttpRequest to your own URL (location.href) and get the full, unparsed HTML source in the responseText. There is almost never a good reason to do this.

What you have should work:
Element test:
<div id="myE">How to&nbsp;fix</div>
JavaScript test:
alert(document.getElementById("myE").innerHTML); //alerts "How to&nbsp;fix"
You can try it out here. Make sure that wherever you're displaying the result isn't rendering the &nbsp; as a space, which is likely the case. If you want to show it somewhere that's designed for HTML, you'll need to escape it.

You can use a script tag instead, which will not parse the HTML. This is more relevant when there are angle brackets, like loading a lodash or underscore template.
document.getElementById("asDiv").value = document.getElementById("myDiv").innerHTML;
document.getElementById("asScript").value = document.getElementById("myScript").innerHTML;
<div id="myDiv">
<h1>
<%= ${var} %> %>
How to fix
</h1>
</div>
<script id="myScript" type="text/template">
<h1>
<%= ${var} %>
How to fix
</h1>
</script>
<textarea rows="10" cols="40" id="asDiv"></textarea>
<textarea rows="10" cols="40" id="asScript"></textarea>
Because the HTML in a div is parsed, the inner HTML for angle brackets comes back as &lt;, but in a script tag it does not.
