I wanted to write the contents of <head> inside a <p> using JavaScript for an exercise.
I tried with this, but it doesn't work:
var contentHead = document.getElementsByTagName('HEAD')[0].innerHTML;
document.getElementById('paragraph1').innerHTML = contentHead;
You can use .outerHTML. E.g.
const paragraph = document.getElementById("my-paragraph");
paragraph.textContent = document.head.outerHTML;
Edit: As Heretic Monkey pointed out, we have changed from .innerHTML to .textContent. This is because using .innerHTML will actually render the content into the paragraph, causing any scripts to run etc. If you want to be able to view the source rather than execute the content, you must use textContent.
Here's an example of what I'm trying to edit:
<script id="login-popup" type="text/template">
<h3 id="cover-msg" class="modal-title">You need to login to do that.</h3>
</script>
I would like to add the class "title" to the h3 tag. This is being done via a Chrome extension, so I can't control the HTML that is rendered.
Here's the caveat: I can't assume that the template will always be the same, so I can't just replace or edit the entire thing. I need to be able to select certain elements within the text and only add things as needed.
The problem I'm having is that the template seems to just be plain text. So I can't select it with something like #login-popup #cover-msg. Please correct me if I'm wrong.
Is it possible to do this with JavaScript/jQuery?
You can follow this type of procedure which gets the text out of the script tag, inserts it into a DOM element so you can use DOM manipulation on it, then gets the resulting HTML out of that DOM element. This allows you to avoid any manual parsing of the HTML text yourself:
var t = document.getElementById("login-popup");
var div = document.createElement("div");
div.innerHTML = t.innerHTML;
$(div).find("h3").addClass("title");
t.innerHTML = div.innerHTML;
It follows this process:
Get the innerHTML from the script tag.
Create a temporary div.
Put the HTML into the temporary div, where you can then treat it as DOM elements.
Using a DOM query, find the <h3>.
Add the class to it.
Get the HTML back out of the temporary div.
Put the HTML back into the script tag as the modified version of the template.
It works here: http://jsfiddle.net/jfriend00/mqnf1mmp/.
Just for you guys to note, of course I have read this first:
Javascript get text inside a <span> element
However, my case is not that easy, not least because I need to do it natively, without jQuery.
Supposing we have this on an arbitrary web page:
<span id="entry1" class="entries">
<img src="http://whereyourpicis.at/pic.jpg" border="0">
++ This is the plain text we want to get from the SPAN block. ++
<span id="nested2"><a onclick="doSomething()">Action!</a></span>
</span>
I've tried anything imaginable, but I can't say any of the "solutions" I tried was a good one, since it feels like a total kludge taking the whole innerHTML and then doing some sed-style regex magic on it.
There must be a more elegant way to accomplish this, which is why I'm asking here.
BTW I've also found out that even nextSibling() cannot work here.
I am not sure this is what you need, because you didn't specify the exact output you expect for your example code.
If you literally need to strip HTML from text in JavaScript, you could use a function like this:
function strip(html)
{
var tmp = document.createElement("DIV");
tmp.innerHTML = html;
return tmp.textContent || tmp.innerText || "";
}
Please check this: http://jsfiddle.net/shershen08/7fFWn/3/
If you want to get only the text nodes within an element, I think you'll need to iterate over the element's childNodes and fetch the text nodes. Here's a quick-and-dirty example of a function that will fetch only the text nodes from a given element (it also skips any text nodes that are just whitespace, since those are often added as a result of HTML formatting but don't really mean anything to a human).
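The function itself did not survive in this copy of the answer, so what follows is only a minimal sketch of the idea, not the original code. The name getDirectText is made up for illustration. It looks only at the element's direct children, so text inside nested elements such as the inner span is ignored.

```javascript
// Collect the text of an element's direct child text nodes,
// skipping nodes that contain only whitespace.
function getDirectText(element) {
  var TEXT_NODE = 3; // same value as Node.TEXT_NODE in browsers
  var parts = [];
  for (var i = 0; i < element.childNodes.length; i++) {
    var node = element.childNodes[i];
    // Keep only text nodes that have at least one non-whitespace character.
    if (node.nodeType === TEXT_NODE && /\S/.test(node.nodeValue)) {
      parts.push(node.nodeValue.trim());
    }
  }
  return parts;
}

// In a browser you would call it like this:
// var texts = getDirectText(document.getElementById("entry1"));
```

For the markup in the question this should give back a one-element array holding the "++ This is the plain text ... ++" line, since the img and the nested span are element nodes and the remaining text nodes are whitespace only.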
Say, for instance, I have the following line:
var arrowBase = document.createElement('div')
Now within this div tag I want to add some HTML (i.e. text).
Then I tried the following:
arrowBase.innerHTML('hello');
However, this does nothing.
I have also tried: arrowBase.HTML('hello');
But once again without any result.
I know this is rather simple, but in my search I couldn't find the answer. I hope someone is able to help me out here.
Read the docs: innerHTML is a property, not a method.
arrowBase.innerHTML = 'hello';
arrowBase.textContent = "HELLO"
does much the same thing, but treats the string as plain text, whereas with innerHTML you can specify HTML tags along with the text.
I have written some code that takes a string of html and cleans away any ugly HTML from it using jQuery (see an early prototype in this SO question). It works pretty well, but I stumbled on an issue:
When using .append() to wrap the html in a div, all script elements in the code are evaluated and run (see this SO answer for an explanation why this happens). I don't want this, I really just want them to be removed, but I can handle that later myself as long as they are not run.
I am using this code:
var wrapper = $('<div/>').append($(html));
I tried to do it this way instead:
var wrapper = $('<div>' + html + '</div>');
But that just brings forth the "Access denied" error in IE that the append() function fixes (see the answer I referenced above).
I think I might be able to rewrite my code to not require a wrapper around the html, but I am not sure, and I'd like to know if it is possible to append html without running scripts in it, anyway.
My questions:
How do I wrap a piece of unknown html
without running scripts inside it,
preferably removing them altogether?
Should I throw jQuery out the window
and do this with plain JavaScript and
DOM manipulation instead? Would that help?
What I am not trying to do:
I am not trying to put some kind of security layer on the client side. I am very much aware that it would be pointless.
Update: James' suggestion
James suggested that I should filter out the script elements, but look at these two examples (the original first, then James' suggestion):
jQuery("<p/>").append("<br/>hello<script type='text/javascript'>console.log('gnu!'); </script>there")
keeps the text nodes but writes gnu!
jQuery("<p/>").append(jQuery("<br/>hello<script type='text/javascript'>console.log('gnu!'); </script>there").not('script'))
Doesn't write gnu!, but also loses the text nodes.
Update 2:
James has updated his answer and I have accepted it. See my latest comment to his answer, though.
How about removing the scripts first?
var wrapper = $('<div/>').append($(html).not('script'));
Create the div container
Use plain JS to put html into div
Remove all script elements in the div
Assuming script elements in the html are not nested in other elements:
var wrapper = document.createElement('div');
wrapper.innerHTML = html;
$(wrapper).children().remove('script');
If the script elements may be nested inside other elements, use find() instead:
var wrapper = document.createElement('div');
wrapper.innerHTML = html;
$(wrapper).find('script').remove();
This works for the case where html is just text and where html has text outside any elements.
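Since the question also asks whether plain JavaScript would help, here is a rough jQuery-free sketch of the same three steps. The function name wrapWithoutScripts and the doc parameter are only illustrative (in a page you would pass document). It relies on the fact that scripts inserted through innerHTML are not executed; inline handlers such as <img onerror> may still fire, so this is not a sanitizer.

```javascript
// Wrap an HTML string in a div and remove every <script> element from it.
function wrapWithoutScripts(html, doc) {
  var wrapper = doc.createElement("div");
  // Script elements created by assigning innerHTML do not run.
  wrapper.innerHTML = html;
  var scripts = wrapper.getElementsByTagName("script");
  // The collection is live in browsers, so remove from the end.
  for (var i = scripts.length - 1; i >= 0; i--) {
    scripts[i].parentNode.removeChild(scripts[i]);
  }
  return wrapper;
}

// In a browser:
// var wrapper = wrapWithoutScripts(html, document);
```

The returned div can then be handed to jQuery with $(wrapper) if the rest of the cleanup code expects a jQuery object.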
You should remove the script elements:
var wrapper = $('<div/>').append($(html).remove("script"));
Second attempt:
node-validator can be used in the browser:
https://github.com/chriso/node-validator
var str = sanitize(large_input_str).xss();
Alternatively, PHPJS has a strip_tags function (regex/evil based):
http://phpjs.org/functions/strip_tags:535
The scripts in the HTML kept executing for me with all the simple methods mentioned here. Then I remembered that jQuery has a tool for this (since 1.8): jQuery.parseHTML. There's still a catch: according to the documentation, event handlers in attributes (e.g. <img onerror>) will still run.
This is what I'm using:
var $dom = $($.parseHTML(d));
$dom will be a jQuery object containing the parsed elements.
I am currently loading a lightbox-style popup that loads its HTML from an XHR call. This content is then displayed in a 'modal' popup using element.innerHTML = content. This works like a charm.
In another section of this website I use a Flickr 'badge' (http://www.elliotswan.com/2006/08/06/custom-flickr-badge-api-documentation/) to load Flickr images dynamically. This is done by including a script tag that loads a Flickr JavaScript file, which in turn executes some document.write statements.
Both of them work perfectly when included in the HTML. Only when loading the flickr badge code inside the lightbox, no content is rendered at all. It seems that using innerHTML to write document.write statements is taking it a step too far, but I cannot find any clue in the javascript implementations (FF2&3, IE6&7) of this behavior.
Can anyone clarify if this should or shouldn't work? Thanks.
In general, script tags aren't executed when using innerHTML. In your case, this is good, because the document.write call would wipe out everything that's already in the page. However, that leaves you without whatever HTML document.write was supposed to add.
jQuery's HTML manipulation methods will execute scripts in HTML for you. The trick is then capturing the calls to document.write and getting the HTML in the proper place. If it's simple enough, then something like this will do:
var content = '';
document.write = function(s) {
content += s;
};
// execute the script
$('#foo').html(markupWithScriptInIt);
$('#foo .whereverTheDocumentWriteContentGoes').html(content);
It gets complicated though. If the script is on another domain, it will be loaded asynchronously, so you'll have to wait until it's done to get the content. Also, what if it just writes the HTML into the middle of the fragment without a wrapper element that you can easily select? writeCapture.js (full disclosure: I wrote it) handles all of these problems. I'd recommend just using it, but at the very least you can look at the code to see how it handles everything.
EDIT: Here is a page demonstrating what sounds like the effect you want.
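One thing the short snippet above leaves out is putting document.write back afterwards. As a sketch only (this is not how writeCapture.js does it, and the name captureWrites is invented here), the capture can be wrapped in a helper that restores the original methods. It only covers scripts that run synchronously inside the callback, so it does not address the cross-domain case.

```javascript
// Run fn while collecting everything passed to doc.write / doc.writeln,
// then restore the original methods and return the collected HTML.
function captureWrites(doc, fn) {
  var originalWrite = doc.write;
  var originalWriteln = doc.writeln;
  var content = "";
  doc.write = function () {
    // document.write accepts several strings and concatenates them.
    content += Array.prototype.join.call(arguments, "");
  };
  doc.writeln = function () {
    content += Array.prototype.join.call(arguments, "") + "\n";
  };
  try {
    fn();
  } finally {
    // Restore even if the callback throws.
    doc.write = originalWrite;
    doc.writeln = originalWriteln;
  }
  return content;
}

// In a browser, with jQuery as in the answer above:
// var written = captureWrites(document, function () {
//   $('#foo').html(markupWithScriptInIt);
// });
// $('#foo .whereverTheDocumentWriteContentGoes').html(written);
```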
I created a simple test page that illustrates the problem:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
<title>Document Write Testcase</title>
</head>
<body>
<div id="container">
</div>
<div id="container2">
</div>
<script>
// This doesn't work!
var container = document.getElementById('container');
container.innerHTML = "<script type='text/javascript'>alert('foo');document.write('bar');<\/script>";
// This does!
var container2 = document.getElementById('container2');
var script = document.createElement("script");
script.type = 'text/javascript';
script.innerHTML = "alert('bar');document.write('foo');";
container2.appendChild(script);
</script>
</body>
</html>
This page alerts 'bar' and prints 'foo', while I expected it to also alert 'foo' and print 'bar'. But, unfortunately, since the script tag is part of a larger HTML page, I cannot single out that tag and append it like the example above. Well, I can, but that would require scanning innerHTML content for script tags, and replacing them in the string by placeholders, and then inserting them using the DOM. That doesn't sound trivial.
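For what it's worth, a variant of that idea avoids scanning the string: assign innerHTML first, then swap each inert script element for a freshly created one, which is the same createElement approach that worked in the second half of the test page. This is only a sketch under that assumption, and the names activateScripts, container and doc are made up for illustration.

```javascript
// Replace each inert <script> inside container with a newly created
// script element that carries the same attributes and code.
function activateScripts(container, doc) {
  // Copy to a plain array so replacing nodes does not disturb iteration.
  var inert = Array.prototype.slice.call(container.querySelectorAll("script"));
  inert.forEach(function (oldScript) {
    var newScript = doc.createElement("script");
    for (var i = 0; i < oldScript.attributes.length; i++) {
      var attr = oldScript.attributes[i];
      newScript.setAttribute(attr.name, attr.value);
    }
    newScript.textContent = oldScript.textContent;
    // Inserting the new element is what triggers execution in a browser.
    oldScript.parentNode.replaceChild(newScript, oldScript);
  });
  return inert.length;
}

// In a browser:
// container.innerHTML = content;
// activateScripts(container, document);
```

This only gets the scripts to run. Any document.write call inside them still has to be dealt with separately once the page has finished loading.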
Use document.writeln(content); instead of document.write(content).
However, the better method is using the concatenation of innerHTML, like this:
element.innerHTML += content;
Assigning with element.innerHTML = content; replaces the element's old content with the new content.
Using the += operator, as in element.innerHTML += content, instead appends your content after the old content (similar to what document.write does).
document.write is about as deprecated as they come. Thanks to the wonders of JavaScript, though, you can just assign your own function to the write method of the document object which uses innerHTML on an element of your choosing to append the supplied content.
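A bare-bones sketch of that override might look like the following. The names redirectWrites, doc and target are invented for illustration. Note that innerHTML += reparses the element's content on every call, so markup split across several write calls (an opening tag in one call, the closing tag in the next) will probably not come out as intended.

```javascript
// Make doc.write append to the given element instead of the document.
// Returns a function that restores the original write method.
function redirectWrites(doc, target) {
  var originalWrite = doc.write;
  doc.write = function () {
    // document.write accepts several strings and concatenates them.
    target.innerHTML += Array.prototype.join.call(arguments, "");
  };
  return function restore() {
    doc.write = originalWrite;
  };
}

// In a browser:
// var restore = redirectWrites(document, document.getElementById("container"));
// ...run the code that calls document.write...
// restore();
```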
Can I get some clarification first to make sure I get the problem?
document.write calls add content to the markup at the point where the script that calls them executes. For example, if you define a function containing document.write calls in one place but call it elsewhere, the output appears where the function is called, not where it is defined.
Therefore for this to work at all the Flickr document.write statements will need to be part of the content in element.innerHTML = content. Is this definitely the case?
You might quickly test if this should work at all by adding a single and simple document.write call in the content that is set as the innerHTML and see what this does:
<script>
var content = "<p>1st para</p><script>document.write('<p>2nd para</p>');<\/script>";
element.innerHTML = content;
</script>
If that works, the concept of document.write working in content set as the innerHTML of an element might just work.
My gut feeling is that it won't work, but it should be pretty straightforward to test the concept.
So you're using a DOM method to create a script element and append that to an existing element and this then causes the content of the appended script element to execute? That sounds good.
You say that the script tag is part of a larger HTML page and therefore cannot be singled out. Can you not give the script tag an ID and target it? I'm probably missing something obvious here.
In theory, yes, I can single out a script tag that way. The problem is that we potentially have dozens of situations where this occurs, so I am trying to find some cause or documentation of this behavior.
Also, the script tag does not seem to be a part of the DOM anymore after it gets loaded. In our environment, my container div remains empty, so I cannot fetch the script tag. It should work, though, because in my example above the script does not get executed, but is still part of the DOM.