JavaScript - controlling the insertion point for document.write

I would like to create a page that runs a third-party script which calls document.write after the DOM has fully loaded.
My page is not XHTML. My problem is that document.write overwrites my own page (which is what it does once the DOM has loaded).
I tried overriding the document.write function (in a way similar to http://ejohn.org/blog/xhtml-documentwrite-and-adsense/), but that doesn't cover cases where the document.write calls contain partial tags.
An example that would break the above code is:
document.write("<"+"div");
document.write(">"+"Done here<"+"/");
document.write("div>");
Is there some way to modify the document.write insertion point through JavaScript? Does anyone have a better idea how to do this?

If you're dealing with 3rd party scripts, simply replacing document.write to capture the output and stick it in the right place isn't good enough, since they could change the script and then your site would break.
writeCapture.js does what you need (full disclosure: I'm the author). It basically rewrites the script tags so that each one captures its own document.write output and puts it in the correct place. The usage (using jQuery) would be something like:
$(document.body).writeCapture().append('<script type="text/javascript" src="http://3rdparty.com/foo.js"></script>');
Here I'm assuming that you want to append to the end of the body. All jQuery selectors and manipulation methods will work with the plugin, so you can inject it anywhere and however you want. It can also be used without jQuery, if that is a problem.

It is possible to override the document.write method. So you can buffer the strings sent to document.write and output the buffer wherever you like. However changing a script from synchronous to asynchronous can cause errors if not handled correctly. Here's an example:
Simplified document.write replacement
(function() {
    // WARNING: This is just a simplified example
    // to illustrate a problem.
    // Do NOT use this code!
    var buffer = [];
    document.write = function(str) {
        // Every time document.write is called, push
        // the data into buffer. document.write can
        // be called from anywhere, so we also need
        // a mechanism for multiple positions if
        // that's needed.
        buffer.push(str);
    };
    function flushBuffer() {
        // Join everything in the buffer into one string and put
        // it inside the element where we want the output.
        var output = buffer.join('');
        document.getElementById("ad-position-1").innerHTML = output;
    }
    // Inject the third-party script dynamically and
    // call flushBuffer when the script is loaded
    // (and executed).
    var script = document.createElement("script");
    script.onload = flushBuffer;
    script.src = "http://someadserver.com/example.js";
    // The script only starts loading once it is in the document.
    document.getElementsByTagName("head")[0].appendChild(script);
})();
Content of http://someadserver.com/example.js
var flashAdObject = "<object>...</object>";
document.write("<div id='example'></div>");
// Since we buffer the data the getElementById will fail
var example = document.getElementById("example");
example.innerHTML = flashAdObject; // TypeError: example is null
I've documented the different problems I've encountered when writing and using my document.write replacement: https://github.com/gregersrygg/crapLoader/wiki/What-to-think-about-when-replacing-document.write
But the danger of using a document.write replacement is all the unknown problems that may arise. Some are not even possible to work around.
document.write("<scr"+"ipt src='http://someadserver.com/adLib.js'></scr"+"ipt>");
adLib.doSomething(); // ReferenceError: adLib is not defined
Luckily I haven't come across the above problem in the wild, but that doesn't guarantee it won't happen ;)
Still want to try it out? Try out crapLoader (mine) or writeCapture.
You should also check out friendly iframes. Basically it creates a same-domain iframe and loads everything there instead of in your document. Unfortunately I haven't found any good libraries for handling this yet.

Original Answer before the edit:
Basically the problem of document.write is that it does not work in XHTML documents. The most broad solution then (as harsh as it may seem) is to not use XHTML/xml for your page. Due to IE+XHTML and the mimetype problem, Google Adsense breaking (may be a good thing :), and the general shift towards HTML5 I don't think it's as bad as it seems.
However if you'd really like to use XHTML for your page, then John's script that you linked to is the best you've got at this point. Just sniff for IE on the server. If the request is from IE, don't do anything (and don't serve the application/xhtml+xml mimetype!). Otherwise drop it into the <head> of your page and you're off to the races.
Re-reading your question, is there a specific problem you have with John's script? It is known to fail in Safari 2.0 but that's about it.

You may be interested in the JavaScript library I developed, which lets you load third-party scripts that use document.write after window.onload. Internally, the library overrides document.write, appends DOM elements dynamically, and runs any included scripts, which may use document.write as well.
Unlike John Resig's solution (which was part of the inspiration for my own code), the module I developed supports partial writes such as the example you give with the div:
document.write("<"+"div");
document.write(">"+"Done here<"+"/");
document.write("div>");
My library will wait for the end of the script before parsing and rendering the markup. In the above example, it would run once with the full string "<div>Done here</div>" instead of 3 times with partial markup.
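The buffering idea described above can be sketched roughly as follows. This is only an illustration of the technique, not the library's actual code: createWriteBuffer and its write/flush methods are hypothetical names, and the sketch leaves out the parsing and DOM insertion that a real replacement would need.

```javascript
// Hypothetical helper: collects document.write-style fragments and
// returns them as one string once the script has finished running.
function createWriteBuffer() {
  var parts = [];
  return {
    // Stand-in for document.write: accepts one or more fragments.
    write: function () {
      for (var i = 0; i < arguments.length; i++) {
        parts.push(String(arguments[i]));
      }
    },
    // Join the fragments into complete markup and reset the buffer.
    flush: function () {
      var markup = parts.join("");
      parts = [];
      return markup;
    }
  };
}

var buffer = createWriteBuffer();
buffer.write("<" + "div");
buffer.write(">" + "Done here<" + "/");
buffer.write("div>");
var markup = buffer.flush(); // "<div>Done here</div>"
```

In a page, document.write would be pointed at buffer.write while the third-party script runs, and the flushed markup would be inserted at the chosen element afterwards.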
I have set up a demo, in which I load 3 Google Ads, an Amazon widget as well as Google Analytics dynamically.

FWIW, I found postscribe to be the best option out there these days - it handles wrapping a pesky ad rendering module like a charm allowing our page to load without being blocked.

In order to alter the content of the page after the DOM has rendered, you need to either use a JavaScript library to append HTML or text at certain points (jQuery, MooTools, Prototype, ...) or just use the innerHTML property of each DOM element to alter or append text. The innerHTML approach works cross-browser and doesn't require any libraries.

There are better ways to do this.
2 ways
1) Append
<html><head>
<script>
window.onload = function () {
var el = document.createTextNode('hello world');
document.body.appendChild(el);
}
</script></head><body></body></html>
2) InnerHTML
<html><head><script>
window.onload = function () {
document.body.innerHTML = 'hello world';
}
</script></head><body></body></html>

Related

Is there a technique to use a W ^ X mechanic in HTML5 to fight XSS?

I wonder if it is possible to tell the browser to only execute JS code that is in the tags in the initial loading of the page, and thus not execute any tags that were inserted dynamically by JS code with
element.innerHTML = "<script>XSSCode</script>"
I think this might make many XSS attacks impossible.
Edit: el.innerHTML is only one example of adding a new script
tag to a Webpage.

There are two different proposals in your question.
...if it is possible to tell the browser to only execute JS code that
is in the tags in the initial loading of the page
You would then prevent code splitting and force everyone to bundle full scripts. Also, there is document.head.appendChild(...), where child is document.createElement("script") - this functionality kind of creates a script for "initial loading of the page" right in the <head/>. It is a bad idea in many ways to prevent appending scripts to document's DOM.
tags that were inserted dynamically
...it is a different proposal. If you mean that the browser should still allow JS to create script tags from within the code, but that el.innerHTML should not allow script tags at all, this might not be too limiting in certain cases. You can achieve it, for example, by overriding the innerHTML setter on Element.prototype. It is still a bad idea, but might help prevent a certain attack.
An example of the code is in accepted answer here:
Change innerHTML set on the fly
In essence, you would do:
var originalSet = Object.getOwnPropertyDescriptor(Element.prototype, 'innerHTML').set;
Object.defineProperty(Element.prototype, 'innerHTML', {
    set: function (value) {
        // Strip the opening of any script tag; the "i" flag also
        // catches variants such as <SCRIPT
        var new_value = value.toString().replace(/<script/gi, "");
        // Call the original setter
        return originalSet.call(this, new_value);
    }
});
It is not very robust and I would not use it in production. But I could imagine it might help to detect a problem with third party script.

Difference between document.write(X) & document.getElementById("").innerHTML = X

New to JavaScript, apologies if this is a dumb question. The two statements in the title seem to be doing the same thing. Are there any particular differences that I need to be aware of?

They do not do the same thing. document.write will just append to the page as it's loading, wherever in the page the <script> tag happens to be. If you call document.write after the page is loaded, it will erase the entire page before appending.
On the other hand, document.getElementById("").innerHTML = '' replaces the HTML of a certain element with what you give it (you can also append with .innerHTML += '').
It's highly suggested not to use document.write in your page.

document.write can be used to emit markup during the parsing of the page. It cannot be used for modifying the page after it's parsed. The output of document.write goes straight into the parser as though it had been in the HTML document in the first place.
innerHTML, which is not a function but rather a property, exists on all DOM element instances, and can be used to set their content, using markup. This, along with the various DOM methods available on instances, is the primary way that dynamic web pages are done.

remove script after load in memory [duplicate]

As the title says, if I remove a script tag from the DOM using:
$('#scriptid').remove();
Does the javascript itself remain in memory or is it cleaned?
Or... am I completely misunderstanding the way in which browsers treat javascript? Which is quite possible.
For those interested in my reason for asking see below:
I am moving some common javascript interactions from static script files into dynamically generated ones in PHP. Which are loaded on demand when a user requires them.
The reason for doing this is to move the logic server-side and run a small script, returned from the server, client-side, rather than have a large script containing a huge amount of logic client-side.
This is a similar approach to what facebook does...
Facebook talks frontend javascript
If we take a simple dialog for instance. Rather than generating the html in javascript, appending it to the dom, then using jqueryUI's dialog widget to load it, I am now doing the following.
Ajax request is made to dialog.php
Server generates html and javascript that is specific to this dialog then encodes them as JSON
JSON is returned to client.
HTML is appended to the <body> then once this is rendered, the javascript is also appended into the DOM.
The javascript is executed automatically upon insertion and the dynamic dialog opens up.
Doing this has reduced the amount of JavaScript on my page dramatically; however, I am concerned about cleanup of the inserted JavaScript.
Obviously once the dialog has been closed it is removed from the DOM using jQuery:
$('#dialog').remove();
The javascript is appended with an ID and I also remove this from the DOM via the same method.
However, as stated above, does using jQuery's .remove() actually clean out the JavaScript from memory, or does it simply remove the <script> element from the DOM?
If so, is there any way to clean this up?

No. Once a script is loaded, the objects and functions it defines are kept in memory. Removing a script element does not remove the objects it defines. This is in contrast to CSS files, where removing the element does remove the styles it defines. That's because the new styles can easily be reflowed. Can you imagine how hard it would be to work out what a script tag created and how to remove it?
EDIT: However, if you have a file that defines myFunction, then you add another script that redefines myFunction to something else, the new value will be kept. You can remove the old script tag if you want to keep the DOM clean, but that's all removing it does.
EDIT2: The only real way to "clean up" functions that I can think of is to have a JS file that basically calls delete window.myFunction for every possible object and function your other script files may define. For obvious reasons, this is a really bad idea.
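If you control the dynamically generated scripts, one way to make that kind of cleanup less painful is to have each script attach everything to a single namespace object, so that one delete drops every global reference. This is a minimal sketch under that assumption: dialogScripts, openDialog and closeDialog are made-up names, and memory is only reclaimed once nothing else references the functions, at a time chosen by the garbage collector.

```javascript
// Use window in a browser; fall back to globalThis elsewhere.
var root = typeof window !== "undefined" ? window : globalThis;

// Hypothetical namespace that a generated script would populate.
root.dialogScripts = {
  openDialog: function (id) { return "open " + id; },
  closeDialog: function (id) { return "close " + id; }
};

var opened = root.dialogScripts.openDialog("dialog");

// Cleanup: one delete removes the only global reference to both functions.
delete root.dialogScripts;

var stillDefined = typeof root.dialogScripts !== "undefined"; // false
```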

If your scripts have already executed, removing the DOM elements is not going to get rid of them. Go to any page with JavaScript, open up your preferred JavaScript console and type $("script").remove(). Everything keeps running.
And this demonstrates @Kolink's answer:
http://jsfiddle.net/X2mk8/2/
HTML:
<div id="output"></div>
<script id="yourDynamicGeneratedScript">
function test(n) {
$output = $("#output")
$output.append("test " + n + "<br/>")
}
test(1);
</script>
Javascript:
$("script").remove();
// or $("#yourDynamicGeneratedScript").remove();
test(2);
test(3);
test(4);
function test(n) {
$output = $("#output")
$output.append("REDEFINED! " + n + "<br/>")
}
test(5);
test(6);
test(7);

When using CasperJS, is it possible to interact with the DOM of a loaded page before any inline or external Javascript is executed?

The situation I have is that I'm opening a page using CasperJS.
The page in question has some Javascript (a combination of both inline and external) that removes several HTML elements from the document.
However, I want to be able to retrieve those elements using something like getElementsByXPath() within CasperJS before they are removed. Is this possible?
When I dump out the value of getPageContent(), the elements are not in there. However, if I set casper.page.settings.javascriptEnabled = false; before calling the page, getPageContent() now shows the raw HTML before any Javascript is executed, and the missing HTML tags are there. The problem now, though, is that disabling Javascript prevents any usage of evaluate(), so I still can't retrieve the elements. I could probably do it using a regex of some sort on the raw content, but I was hoping there could be a cleaner method of doing it.
Any suggestions welcome!

I've never heard of anyone doing this. I wouldn't say using regex is a bad idea. I usually scrape with a combination of CasperJS XPath and Python regex; it works extremely well, and I personally don't think it's any messier than trying to intercept JavaScript before the page is loaded.
That being said, CasperJS allows you to inject JavaScript, and you could use jQuery from it if it's available on the page you're requesting. The code below fires before anything is loaded. You actually have to go out of your way to add code to prevent this from firing before the page loads.
<script type='text/javascript'>
alert("Stop that parsing!");
</script>

How to dynamically add a Javascript function (and invoke)

Based on a click event on the page, I fetch a block of HTML and script via Ajax. I am able to take the script element and append it to the head element; however, WebKit-based browsers are not treating it as script (i.e. I cannot invoke a function declared in the appended script).
Using the Chrome Developer Tools I can see that my script node is indeed there, but it shows up differently than a script block that is not added dynamically: a non-dynamic script has a text child element, and I cannot figure out a way to duplicate this for the dynamic script.
Any ideas or better ways to be doing this? The driving force is there is potentially a lot of html and script that would never be needed unless a user clicks on a particular tab, in which case the relevant content (and script) would be loaded. Thanks!

You could try using jQuery... it provides a method called .getScript that will load the JavaScript dynamically in the proper way. And it works fine in all well known browsers.

How about calling eval() on the content you receive from the server? Of course, you have to cut off the <script> and </script> parts.
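The suggestion above could look roughly like the sketch below. runInlineScripts is a hypothetical helper name, the regular expression assumes simple inline script blocks (scripts loaded through a src attribute are not fetched), and eval runs whatever the server sends, so this only makes sense for content you fully trust.

```javascript
// Hypothetical helper: pulls the code out of inline <script> blocks in a
// fetched HTML fragment and evaluates it in the global scope.
function runInlineScripts(html) {
  var pattern = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  var match;
  var count = 0;
  while ((match = pattern.exec(html)) !== null) {
    // Indirect eval, so that declarations end up global rather than
    // local to this function.
    (0, eval)(match[1]);
    count++;
  }
  return count;
}

// Example fragment, standing in for the Ajax response.
var fragment = '<div id="tab"></div>' +
  '<script type="text/javascript">function greetFromTab() { return "hi"; }<\/script>';
var executed = runInlineScripts(fragment);
```

After the call, greetFromTab is available as a global function, which is the behaviour the question was missing when the script node was appended to the head.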

If you're using a library like jQuery just use the built-in methods for doing this.
Otherwise you'd need to append it to the document rather than the head like this:
document.write("<scr" + "ipt type=\"text/javascript\" src=\"http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js\"></scr" + "ipt>");
In all honesty, I have no idea why the script tag is cut like that, but a lot of examples do that so there's probably a good reason.
You'll also need to account for the fact that loading the script might take quite a while, so after you've appended this to the body you should set up a timer that checks if the script is loaded. This can be achieved with a simple typeof check on any global variable the script exports.
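The timer-based check described above might be sketched like this. waitForGlobal and its parameters are hypothetical names chosen for illustration; the idea is simply to repeat a typeof check on a global the script is expected to export, and to give up after a fixed number of attempts.

```javascript
// Hypothetical helper: polls until a global with the given name exists,
// then calls onReady with it. Calls onTimeout once maxTries checks fail.
function waitForGlobal(name, onReady, onTimeout, intervalMs, maxTries) {
  var root = typeof window !== "undefined" ? window : globalThis;
  var tries = 0;
  (function check() {
    if (typeof root[name] !== "undefined") {
      onReady(root[name]);
    } else if (++tries >= maxTries) {
      onTimeout();
    } else {
      // Not there yet: look again after intervalMs milliseconds.
      setTimeout(check, intervalMs);
    }
  })();
}
```

A call might look like waitForGlobal("jQuery", init, giveUp, 50, 100), where init and giveUp are your own callbacks.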
Or you could just do an eval() on the actual javascript body, but there might be some caveats.
Generally speaking though, I'd leave this kind of thing up to the browser cache and just load the javascript on the page that your tabs are on. Just try not to use any onload events, but rather call whatever initializers you need when the tab is displayed.
