I need to fetch HTML through a JSONP-like method and display it in a div element. This works with simple HTML such as images and text. Now, however, the HTML can contain script tags, and when I insert it into the div the JavaScript doesn't run. Here is a minimal example:
var div = document.getElementById('mydiv');
div.innerHTML = '<script type="text/javascript"> console.log('I run!'); </script>';
This doesn't work. I also tried:
var otherdiv = document.createElement('div');
otherdiv.innerHTML = '<script type="text/javascript"> console.log('I run!'); </script>';
div.appendChild(otherdiv);
But that doesn't work either.
How can I insert the code in a way that the JavaScript runs?
The most straightforward way is to extract all JavaScript and eval it.
You can extract JavaScript with a regular expression like:
var match = html.match(new RegExp("<script[^>]*>(.+)</script>"));
console.log(match);
This is a far from optimal way of executing JavaScript, very error prone and possibly unsafe. I would strongly advise against it.
You should use a script loader like LAB, yep/nope, or Frame.js. The browser has built-in restrictions, and script loading is the best practice for including scripts on a page.
You would also have to escape the single quotes inside the string:
console.log(\'I run!\');
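For what it's worth, the regular expression in this answer captures only one script, and only when its body sits on a single line, because . does not match line breaks and .+ is greedy. Below is a minimal sketch of a variant that collects the body of every script element. The helper name extractScripts is made up for illustration, and the caveats above still apply: regular expressions are a fragile way to read HTML, and nothing here makes running the extracted code safe.

```javascript
// Hypothetical helper: returns the text of every <script> element in an HTML string.
function extractScripts(html) {
  // [\s\S]*? matches across line breaks and stops at the first closing tag.
  var pattern = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  var bodies = [];
  var match;
  while ((match = pattern.exec(html)) !== null) {
    bodies.push(match[1]);
  }
  return bodies;
}

// The closing tags are split so this sample is also safe inside an inline script.
var sample = '<p>hi</p><script type="text/javascript">\nconsole.log("I run!");\n</scr' +
  'ipt><script>var a = 1;</scr' + 'ipt>';
var scripts = extractScripts(sample);
console.log(scripts.length);
```

Scripts that only have a src attribute come back as empty strings, so they would need separate handling.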
Related
I have this variable, which contains the HTML code of a script element:
<script>
var script = "<script>console.log('script here')</script>"
</script>
How do we programmatically escape the / in the closing tag </script> so that it looks like the code below?
<script>
var script = "<script>console.log('script here')<\/script>"
</script>
It does not work as you think.
The first fragment of code does not work because the browser finds the </script> piece in the string and thinks that it is the closing tag of the script element. It treats the rest of the script and the real </script> closing tag as regular text and displays it in the page (except for the </script> tag).
This means that only a fragment of your script is parsed, the parser finds a syntax error in it (the string is not closed) and the script does not run.
There is no way to fix this using JavaScript code. It is not a coding problem. It is an HTML problem (kind of) and its only solution is to write the HTML in a way that avoids the issue.
The HTML document contains a closing tag </script> inside the body of a script element. For normal HTML content (a paragraph, for example) the solution is straightforward: use &lt; and &gt; to encode < and >:
<p> This is a paragraph that contains a &lt;/p&gt; closing tag</p>
You should do it anyway everywhere you want < and > to represent themselves (to be rendered and not interpreted as tag markers) to produce correct HTML.
This simple solution is not possible in the <script> element because the content of the <script> element is not parsed by the HTML parser. It only finds the first appearance of the </script> closing tag and passes the content to the JavaScript parser. And the JavaScript parser does not understand &lt; and &gt;.
However, there is a simple solution for your problem. Make sure that the script does not contain the string </script> and everything will work without problems.
Usually, this is done either by writing:
var script = "<script>console.log('script here')<\/script>"
or by splitting the string in two sub-strings in the middle of the script word:
var script = "<script>console.log('script here')</scr" + "ipt>"
The first solution looks a little better.
Another, even easier, solution is to not put the JavaScript code into an inline script element but keep it in a .js file and link that file into the HTML document:
<script src="my-fancy-script.js"></script>
The file my-fancy-script.js looks like this:
var script = "<script>console.log('script here')</script>"
This way, the content of the my-fancy-script.js file is passed directly to the JavaScript parser that is not fooled by any appearance of </script> in the code.
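An approach could be the following. Note that the backslash has to be doubled in the replacement, because in a string literal '<\/script>' is the same string as '</script>':
var script = "<script>console.log('script here')</script>";
script = script.replace('</script>', '<\\/script>');
The same for the opposite:
var script = "<script>console.log('script here')<\\/script>";
script = script.replace('<\\/script>', '</script>');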
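A string pattern passed to replace() only touches the first occurrence, so to escape every closing tag in a string a regular expression with the g flag can be used instead. This is only a sketch: escapeScriptClose is a hypothetical name, and the backslash is doubled because in a JavaScript string literal \/ is just /.

```javascript
// Hypothetical helper: turns every "</script" into "<\/script" (case-insensitive).
function escapeScriptClose(source) {
  // '<\\/$1' is the three characters <, \, / followed by the captured tag name.
  return source.replace(/<\/(script)/gi, '<\\/$1');
}

// The closing tag is split so this sample is itself safe inside an inline script.
var raw = "<script>console.log('script here')</scr" + "ipt>";
var escaped = escapeScriptClose(raw);
console.log(escaped);
```

The escaped text is what you would write into the HTML source. Once the JavaScript parser reads it back as a string literal, the backslash disappears and the value is the plain closing tag again.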
As this answer indicates, a good way to parse HTML in JavaScript is to simply re-use the browser's HTML-parsing capabilities like so:
var el = document.createElement( 'html' );
el.innerHTML = "<html><head><title>titleTest</title></head><body><a href='test0'>test01</a><a href='test1'>test02</a><a href='test2'>test03</a></body></html>";
// process 'el' as desired
However, this triggers loading extra pages for certain HTML strings, for example:
var foo = document.createElement('div')
foo.innerHTML = '<img src="http://example.com/img.png">';
As soon as this example runs, the browser attempts to load the image at that URL.
How might I process HTML from JavaScript without this behavior?
I don't know if there is a perfect solution for this, but since this is merely for processing, you can replace all src attributes with something like notSrc="xyz.com" before assigning innerHTML. This way they won't be loaded, and if you need them later in processing you can account for the renamed attribute.
The browser will mainly load images, scripts, and CSS files. Renaming src takes care of the first two; CSS can be handled the same way by replacing the href attribute.
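A rough sketch of that renaming step, assuming a data-inert- prefix instead of notSrc (both the prefix and the function name are made up here). It is a plain string replacement, so it will also rewrite matching text outside of tags, and it does not cover other loading attributes such as srcset.

```javascript
// Hypothetical helper: renames src/href attributes so the browser has nothing to fetch.
function neutralizeUrls(html) {
  // Matches a src or href attribute name preceded by whitespace and followed by "=".
  return html.replace(/(\s)(src|href)(\s*=)/gi, '$1data-inert-$2$3');
}

var safe = neutralizeUrls(
  '<img src="http://example.com/img.png"><link rel="stylesheet" href="a.css">'
);
console.log(safe);
```

After assigning the result to innerHTML, the original URLs are still available through the renamed attributes.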
If you want to parse an HTML response without loading unnecessary resources such as images or scripts, use DOMImplementation's createHTMLDocument() to create a new document that is not connected to the one displayed by the browser but otherwise behaves like a normal document.
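A minimal sketch of that idea. The function name parseInert and its optional doc parameter are assumptions added for illustration; in a page you would call it with the HTML string only.

```javascript
// Sketch: parse an HTML string inside a detached document.
// `doc` is a hypothetical parameter; it defaults to the page's own document.
function parseInert(html, doc) {
  var inert = (doc || document).implementation.createHTMLDocument('');
  // Markup assigned here is parsed, but it is not part of the visible page.
  inert.body.innerHTML = html;
  return inert;
}

// Usage in a browser:
//   var parsed = parseInert('<img src="http://example.com/img.png">');
//   var images = parsed.getElementsByTagName('img');
```

Nodes from the detached document can be inspected or cleaned first and then moved into the page with document.importNode() or adoptNode().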
I have written some code that takes a string of html and cleans away any ugly HTML from it using jQuery (see an early prototype in this SO question). It works pretty well, but I stumbled on an issue:
When using .append() to wrap the html in a div, all script elements in the code are evaluated and run (see this SO answer for an explanation why this happens). I don't want this, I really just want them to be removed, but I can handle that later myself as long as they are not run.
I am using this code:
var wrapper = $('<div/>').append($(html));
I tried to do it this way instead:
var wrapper = $('<div>' + html + '</div>');
But that just brings forth the "Access denied" error in IE that the append() function fixes (see the answer I referenced above).
I think I might be able to rewrite my code to not require a wrapper around the html, but I am not sure, and I'd like to know if it is possible to append html without running scripts in it, anyway.
My questions:
How do I wrap a piece of unknown html without running scripts inside it, preferably removing them altogether?
Should I throw jQuery out the window and do this with plain JavaScript and DOM manipulation instead? Would that help?
What I am not trying to do:
I am not trying to put some kind of security layer on the client side. I am very much aware that it would be pointless.
Update: James' suggestion
James suggested that I should filter out the script elements, but look at these two examples (the original first and the James' suggestion):
jQuery("<p/>").append("<br/>hello<script type='text/javascript'>console.log('gnu!'); </script>there")
keeps the text nodes but writes gnu!
jQuery("<p/>").append(jQuery("<br/>hello<script type='text/javascript'>console.log('gnu!'); </script>there").not('script'))
Doesn't write gnu!, but also loses the text nodes.
Update 2:
James has updated his answer and I have accepted it. See my latest comment to his answer, though.
How about removing the scripts first?
var wrapper = $('<div/>').append($(html).not('script'));
Create the div container
Use plain JS to put html into div
Remove all script elements in the div
Assuming script elements in the html are not nested in other elements:
var wrapper = document.createElement('div');
wrapper.innerHTML = html;
$(wrapper).children().remove('script');
If the script elements may be nested inside other elements, use find() instead:
var wrapper = document.createElement('div');
wrapper.innerHTML = html;
$(wrapper).find('script').remove();
This works for the case where html is just text and where html has text outside any elements.
You should remove the script elements:
var wrapper = $('<div/>').append($(html).remove("script"));
Second attempt:
node-validator can be used in the browser:
https://github.com/chriso/node-validator
var str = sanitize(large_input_str).xss();
Alternatively, PHPJS has a strip_tags function (regex/evil based):
http://phpjs.org/functions/strip_tags:535
The scripts in the html kept executing for me with all the simple methods mentioned here. Then I remembered that jQuery has a tool for this (since 1.8): jQuery.parseHTML. There's still a catch: according to the documentation, event handlers in attributes (e.g. <img onerror>) will still run.
This is what I'm using:
var $dom = $($.parseHTML(d));
$dom will be a jQuery object with the elements found.
Can anyone explain what happens when you use JavaScript to insert a JavaScript-based widget?
Here's my JS code:
var para = document.getElementsByTagName("p");
var cg = document.createElement("div");
cg.setAttribute("class", "twt");
cg.innerHTML = '<a href="http://twitter.com/share" class="twitter-share-button" ' +
  'data-count="vertical" data-via="xah_lee">Tweet</a>' +
  '<script type="text/javascript" src="http://platform.twitter.com/widgets.js"></script>';
document.body.insertBefore(cg, para[1]);
It inserts the Twitter widget before the second paragraph (para[1]). As you can see above, the Twitter widget calls a script that shows how many times the page has been tweeted.
It doesn't work in Firefox or Chrome, but semi-works in IE8. What is the expected behavior when this happens? Is the newly inserted JS code supposed to execute? If so, how does it differ from the code being on the page itself?
In order to execute the JS code you insert into a DIV via innerHTML, you need to do something like the following (courtesy of Yuriy Fuksenko at http://www.coderanch.com/t/117983/HTML-JavaScript/Execute-JavaScript-function-present-HTML )
function setAndExecute(divId, innerHTML) {
  var div = document.getElementById(divId);
  div.innerHTML = innerHTML;
  var x = div.getElementsByTagName("script");
  for (var i = 0; i < x.length; i++) {
    eval(x[i].text);
  }
}
A slightly more advanced approach is here: http://zeta-puppis.com/2006/03/07/javascript-script-execution-in-innerhtml-the-revenge/ - look for <script> tags, take their content and create a new element into the <head>.
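A sketch along those lines, with one difference stated plainly: instead of moving the code into the <head>, it replaces each inserted script element in place with a freshly created one, which the browser does execute (this also covers scripts that only have a src attribute). The function name setAndRunScripts and the optional doc parameter are made up for illustration and are not taken from the linked article.

```javascript
// Sketch: insert HTML into a container, then swap each inert script for a fresh one.
// `doc` is a hypothetical parameter; it defaults to the page's own document.
function setAndRunScripts(container, html, doc) {
  doc = doc || document;
  container.innerHTML = html;
  // Copy the live collection first, because replacing nodes changes it.
  var inert = Array.prototype.slice.call(container.getElementsByTagName('script'));
  inert.forEach(function (oldScript) {
    var fresh = doc.createElement('script');
    // Carry over attributes such as src and type.
    Array.prototype.forEach.call(oldScript.attributes, function (attr) {
      fresh.setAttribute(attr.name, attr.value);
    });
    fresh.text = oldScript.text;
    // A script element created through the DOM runs when it is inserted.
    oldScript.parentNode.replaceChild(fresh, oldScript);
  });
}

// Usage in a browser:
//   setAndRunScripts(document.getElementById('mydiv'), htmlFromServer);
```

External scripts inserted this way load asynchronously, so code that depends on them should not assume they have finished loading when the function returns.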
innerHTML does not work to insert script tags (because the linked script, in most browsers, will fail to execute). Really, you should insert the script tag once on the server side and insert only the link at the location of each post (that is, if you are adding this to a blog home page that shows multiple posts, each with their own URLs).
If, for some reason, you decide that you must use one snippet of JavaScript to do it all, at least import the tweet button script in a way that will work, for example, the Google Analytics way or the MediaWiki way (look for the importScriptURI function). (Note that I do not know the specifics of the tweet button, so it might not even work.)
Is it possible to get in some way the original HTML source without the changes made by the processed Javascript? For example, if I do:
<div id="test">
<script type="text/javascript">document.write("hello");</script>
</div>
If I do:
alert(document.getElementById('test').innerHTML);
it shows:
<script type="text/javascript">document.write("hello");</script>hello
In simple terms, I would like the alert to show only:
<script type="text/javascript">document.write("hello");</script>
without the final hello (the result of the processed script).
I don't think there's a simple solution to just "grab original source" as it'll have to be something that's supplied by the browser. But, if you are only interested in doing this for a section of the page, then I have a workaround for you.
You can wrap the section of interest inside a "frozen" script:
<script id="frozen" type="text/x-frozen-html">
The type attribute I just made up, but it will force the browser to ignore everything inside it. You then add another script tag (proper javascript this time) immediately after this one - the "thawing" script. This thawing script will get the frozen script by ID, grab the text inside it, and do a document.write to add the actual contents to the page. Whenever you need the original source, it's still captured as text inside the frozen script.
And there you have it. The downside is that I wouldn't use this for the whole page... (SEO, syntax highlighting, performance...) but it's quite acceptable if you have a special requirement on part of a page.
Edit: Here is some sample code. Also, as #FlashXSFX correctly pointed out, any script tags within the frozen script will need to be escaped. So in this simple example, I'll make up a <x-script> tag for this purpose.
<script id="frozen" type="text/x-frozen-html">
<div id="test">
<x-script type="text/javascript">document.write("hello");</x-script>
</div>
</script>
<script type="text/javascript">
// Grab contents of frozen script and replace `x-script` with `script`
function getSource() {
return document.getElementById("frozen")
.innerHTML.replace(/x-script/gi, "script");
}
// Write it to the document so it actually executes
document.write(getSource());
</script>
Now whenever you need the source:
alert(getSource());
See the demo: http://jsbin.com/uyica3/edit
A simple way is to fetch it from the server again. It will most probably be in the cache. Here is my solution using jQuery.get(). It takes the original URI of the page and loads the data with an Ajax call:
$.get(document.location.href, function(data,status,jq) {console.log(data);})
This will print the original code without any javascript. It does not do any error handling!
If you don't want to use jQuery to fetch the source, consult the answer to this question: How to make an ajax call without jquery?
Could you send an Ajax request to the same page you're currently on and use the result as your original HTML? This is foolproof given the right conditions, since you are literally getting the original HTML document. However, this won't work if the page changes on every request (with dynamic content), or if, for whatever reason, you cannot make a request to that specific page.
Brute force approach
var orig = document.getElementById("test").innerHTML;
alert(orig.replace(/<\/script>[.\n\r]*.*/i,"</script>"));
EDIT:
This could be better
var orig = document.getElementById("test").innerHTML + "<<>>";
alert(orig.replace( /<\/script>[^(<<>>)]+<<>>/i, "<\/script>"));
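One thing to keep in mind with both patterns: inside square brackets the characters . ( ) < > are literal, so [.\n\r] matches only dots and line breaks, and [^(<<>>)] means "any character except a parenthesis or angle bracket" rather than "anything up to the <<>> marker". Here is a sketch of a variant that handles several scripts in one string by dropping the text that directly follows each script element. The name stripWrittenText is hypothetical, and it only removes plain text, so markup written by document.write would survive.

```javascript
// Hypothetical helper: drops the text that directly follows each script element.
function stripWrittenText(html) {
  // Keep the script element ($1); discard the following characters up to the next "<".
  return html.replace(/(<script\b[^>]*>[\s\S]*?<\/script>)[^<]*/gi, '$1');
}

// The closing tag is split so this sample is also safe inside an inline script.
var after = '<script type="text/javascript">document.write("hello");</scr' + 'ipt>hello';
var cleaned = stripWrittenText(after);
console.log(cleaned);
```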
If you override document.write to add some identifiers at the beginning and end of everything written to the document by the script, you will be able to remove those writes with a regular expression.
Here's what I came up with:
<script type="text/javascript" language="javascript">
var docWrite = document.write;
document.write = myDocWrite;
function myDocWrite(wrt) {
docWrite.apply(document, ['<!--docwrite-->' + wrt + '<!--/docwrite-->']);
}
</script>
Added your example somewhere in the page after the initial script:
<div id="test">
<script type="text/javascript"> document.write("hello");</script>
</div>
Then I used this to alert what was inside:
var regEx = /<!--docwrite-->(.*?)<!--\/docwrite-->/gm;
alert(document.getElementById('test').innerHTML.replace(regEx, ''));
If you want the pristine document, you'll need to fetch it again. There's no way around that. If it weren't for the document.write() (or similar code that would run during the load process) you could load the original document's innerHTML into memory on load/domready, before you modify it.
I can't think of a solution that would work the way you're asking. The only code that Javascript has access to is via the DOM, which only contains the result after the page has been processed.
The closest I can think of to achieve what you want is to use Ajax to download a fresh copy of the raw HTML for your page into a Javascript string, at which point since it's a string you can do whatever you like with it, including displaying it in an alert box.
A tricky way is to use a <style> tag as the template, so that you no longer need to rename script to x-script.
console.log(document.getElementById('test').innerHTML);
<style id="test" type="text/html+template">
<script type="text/javascript">document.write("hello");</script>
</style>
But I do not like this ugly solution.
I think you want to traverse the DOM nodes:
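function getScriptSource(id) { // illustrative name; a function is needed for return to be valid
  var childNodes = document.getElementById(id).childNodes, i, output = [];
  for (i = 0; i < childNodes.length; i++) {
    // innerHTML is the script body only; outerHTML would include the tags
    if (childNodes[i].nodeName == "SCRIPT") output.push(childNodes[i].innerHTML);
  }
  return output.join('');
}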