I'm prerendering my HTML pages for search engine bots via PhantomJS through Selenium, so that they can see the fully loaded content. Currently, after PhantomJS has reached the page, I wait 5 seconds to be sure everything is loaded.
Instead of waiting those 5 seconds every time, one solution I'm considering is to wait until an html-ready attribute on the <body> tag is set to true:
<html ng-app>
<head>...</head>
<body html-ready="{{htmlReady}}">
...
</body>
</html>
.controller("AnyController", function($scope, $rootScope, AnyService) {
$rootScope.htmlReady = false;
AnyService.anyLongAction(function(anyData) {
$scope.anyData = anyData;
$rootScope.htmlReady = true;
});
})
The question is: will the html-ready attribute always be set to true only after every view update has been applied (e.g. displaying the anyData)? In other words, is it possible that, for some period of time, the html-ready attribute is true while the page is not fully rendered yet? If yes, how can that be handled?
It should be set after the digest has finished, which makes it more likely to work as expected.
AnyService.anyLongAction(function(anyData) {
$scope.anyData = anyData;
$timeout(function () {
$rootScope.htmlReady = true;
}, 0, false);
});
But this is of no use to the app itself. You have to watch for changes in every single place; Angular doesn't offer anything to make the task easier.
Fortunately, you are free to abstract from Angular and keep it simple.
var ignoredElements = [];
ignoredElements = ignoredElements.concat($('.continuously-updating-widget').toArray());
var delay = 200; // add to taste
var timeout;
var ready = function () {
$('body').off('DOMSubtreeModified');
clearTimeout(timeout); // cancel a pending debounce so ready() runs only once
clearTimeout(timeoutLimit);
alert('ready');
};
$('body').on('DOMSubtreeModified', function (e) {
if (ignoredElements.indexOf(e.target) < 0) {
clearTimeout(timeout);
timeout = setTimeout(ready, delay);
}
});
var timeoutLimit = setTimeout(ready, 5000);
Feel free to angularify it if needed, though it isn't production code anyway.
It is a good idea to wrap the handler in a throttle function (the event fires very frequently). If the page makes remote requests that can take longer than the timeout delay, it may be better to combine this approach with promises from the async services and resolve them with $q.all. Still, this is much better than looking after every single directive and service.
DOMSubtreeModified is deprecated (mutation events were never well supported, and MutationObserver is recommended instead), but at the time of writing current versions of Firefox and Chrome support it, and it should be OK for Selenium.
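As a rough sketch of the same "wait until the DOM has been quiet for a while" idea built on MutationObserver: the helper name createQuietDetector and its options (quietMs, maxMs, timers) are made up for this example, and the filtering of ignored elements from the snippet above is left out for brevity. The browser-only wiring is guarded so the helper itself can be tried anywhere.

```javascript
// Sketch only: createQuietDetector and its option names are hypothetical.
function createQuietDetector(onReady, options) {
  var quietMs = options.quietMs; // how long the DOM must stay unchanged
  var maxMs = options.maxMs;     // hard limit, like the 5000 ms fallback above
  // Timers are injectable so the logic can be exercised without real delays.
  var timers = options.timers || {
    set: function (fn, ms) { return setTimeout(fn, ms); },
    clear: function (id) { clearTimeout(id); }
  };
  var done = false;
  var quietTimer = null;

  function finish() {
    if (done) { return; } // make sure onReady fires only once
    done = true;
    timers.clear(quietTimer);
    timers.clear(limitTimer);
    onReady();
  }

  var limitTimer = timers.set(finish, maxMs);

  return {
    // Call on every DOM mutation: restarts the quiet-period countdown.
    touch: function () {
      if (done) { return; }
      timers.clear(quietTimer);
      quietTimer = timers.set(finish, quietMs);
    },
    isDone: function () { return done; }
  };
}

// Browser-only wiring; skipped where MutationObserver or document is missing.
if (typeof MutationObserver !== 'undefined' && typeof document !== 'undefined') {
  var observer;
  var detector = createQuietDetector(function () {
    observer.disconnect();
    document.body.setAttribute('html-ready', 'true');
  }, { quietMs: 200, maxMs: 5000 });
  observer = new MutationObserver(detector.touch);
  observer.observe(document.body, {
    childList: true, subtree: true, attributes: true, characterData: true
  });
}
```

Selenium could then poll for the html-ready attribute on the body instead of sleeping for a fixed time. The 200 ms and 5000 ms values are simply carried over from the snippet above and would need tuning.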
Short answer
No. It isn't guaranteed that your markup will be completely rendered when html-ready is set.
Long answer
To the best of my knowledge it's not possible to accurately determine when Angular has finished updating the DOM after the model changed. In general it happens very fast and it doesn't take more than a few cycles to finish, but that's not always the case.
Correctly detecting when a page has finished loading/rendering is actually quite a challenge, and if you take a look at the source code of specialized tools, like prerender, you'll see that they use several different checks in order to try to decide whether a page is ready or not. And even so it doesn't work 100% of the time (Phantom may crash, a request may take longer than usual to complete, and so on).
If you really want to come up with your own solution for this problem, I suggest that you take a look at prerender's source code (or another similar project) to get some inspiration.
Related
I was under the impression that all DOM manipulations were synchronous.
However, this code is not running as I expect it to.
RecordManager.prototype._instantiateNewRecord = function(node) {
this.beginLoad();
var new_record = new Record(node.data.fields, this);
this.endLoad();
};
RecordManager.prototype.beginLoad = function() {
$(this.loader).removeClass('hidden');
};
RecordManager.prototype.endLoad = function() {
$(this.loader).addClass('hidden');
};
The Record constructor function is very large and it involves instantiating a whole bunch of Field objects, each of which instantiates some other objects of their own.
This results in a 1-2 second delay and I want to have a loading icon during this delay, so it doesn't just look like the page froze.
I expect the flow of events to be:
show loading icon
perform record instantiation operation
hide loading icon
Except the flow ends up being:
perform record instantiation operation
show loading icon
hide loading icon
So, you never even see the loading icon at all. I only know it's being shown briefly because the updates in the Chrome developer tools DOM viewer lag behind a little bit.
Should I be expecting this behavior from my code? If so, why?
Yes, this is to be expected. Although the DOM may have been updated, you won't see the change until the browser has a chance to repaint. The repaint is queued the same way as everything else in the browser (i.e. it won't happen until the current block of JavaScript has finished executing), though pausing in a debugger will generally allow it to happen.
In your case, you can fix it using setTimeout with an immediate timeout:
RecordManager.prototype._instantiateNewRecord = function(node) {
var self = this; // inside the timeout callback, `this` is no longer the RecordManager
self.beginLoad();
setTimeout(function() {
var new_record = new Record(node.data.fields, self);
self.endLoad();
}, 0);
};
This will allow the repaint to happen before executing the next part of your code.
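A variation that is sometimes used instead of a bare zero-delay timeout is to wait for an animation frame first and only then schedule the heavy work. The sketch below is not part of the answer above: afterNextPaint is a hypothetical helper name, and the setTimeout fallback for environments without requestAnimationFrame is an assumption added so that the snippet runs outside a browser.

```javascript
// Hypothetical helper: run `work` after the browser has had a chance to paint.
function afterNextPaint(work) {
  var raf = (typeof requestAnimationFrame === 'function')
    ? function (cb) { return requestAnimationFrame(cb); }
    : function (cb) { return setTimeout(cb, 16); }; // fallback outside browsers (assumption)

  raf(function () {
    // The frame callback runs just before painting; a zero-delay timer
    // scheduled from here is typically run after that paint.
    setTimeout(work, 0);
  });
}

var log = [];
log.push('show loader');        // e.g. this.beginLoad()
afterNextPaint(function () {
  log.push('heavy work');       // e.g. new Record(...)
  log.push('hide loader');      // e.g. this.endLoad()
});
log.push('handler returned');   // the handler finishes before any heavy work starts
```

Whether the extra frame is worth it depends on the browser; the plain setTimeout shown above is usually enough.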
JavaScript is always synchronous. It mimics multi-threaded behavior when it comes to ajax calls and timers, but once a callback runs, it blocks as usual.
That said, you most likely have a setTimeout somewhere in that constructor (or in a method it calls), even if it's just setTimeout(fnc, 0).
This is a very simple use case. Show an element (a loader), run some heavy calculations that eat up the thread and hide the loader when done. I am unable to get the loader to actually show up prior to starting the long running process. It ends up showing and hiding after the long running process. Is adding css classes an async process?
See my jsbin here:
http://jsbin.com/voreximapewo/12/edit?html,css,js,output
To explain what a few others have pointed out: This is due to how the browser queues the things that it needs to do (i.e. run JS, respond to UI events, update/repaint how the page looks etc.). When a JS function runs, it prevents all those other things from happening until the function returns.
Take for example:
function work() {
var arr = [];
for (var i = 0; i < 10000; i++) {
arr.push(i);
arr.join(',');
}
document.getElementsByTagName('div')[0].innerHTML = "done";
}
document.getElementsByTagName('button')[0].onclick = function() {
document.getElementsByTagName('div')[0].innerHTML = "thinking...";
work();
};
(http://jsfiddle.net/7bpzuLmp/)
Clicking the button here will change the innerHTML of the div, and then call work, which should take a second or two. And although the div's innerHTML has changed, the browser doesn't have chance to update how the actual page looks until the event handler has returned, which means waiting for work to finish. But by that time, the div's innerHTML has changed again, so that when the browser does get chance to repaint the page, it simply displays 'done' without displaying 'thinking...' at all.
We can, however, do this:
document.getElementsByTagName('button')[0].onclick = function() {
document.getElementsByTagName('div')[0].innerHTML = "thinking...";
setTimeout(work, 1);
};
(http://jsfiddle.net/7bpzuLmp/1/)
setTimeout works by putting a call to a given function at the back of the browser's queue after the given time has elapsed. The fact that it's placed at the back of the queue means that it'll be called after the browser has repainted the page (since the previous HTML changing statement would've queued up a repaint before setTimeout added work to the queue), and therefore the browser has had chance to display 'thinking...' before starting the time consuming work.
So, basically, use setTimeout.
Let the current frame render and start the process afterwards with setTimeout(fn, 1).
Alternatively, you could read a layout property such as element.clientWidth to force a synchronous layout, although that by itself does not guarantee that the page is painted.
As more of a "what is possible" answer: you can run your calculations on a separate thread using HTML5 Web Workers.
This will not only make your loading icon appear but also keep it loading.
More info about web workers : http://www.html5rocks.com/en/tutorials/workers/basics/
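A minimal sketch of that idea, assuming the heavy work can be written as a pure function that does not touch the DOM. The names isPrime and startPrimeWorker are made up for illustration; the worker is built from the function's source via a Blob URL so that no separate file is needed, and the browser-only part is skipped where Worker is unavailable.

```javascript
// Pure function that does the heavy work; it must not touch the DOM.
function isPrime(n) {
  if (n < 2) { return false; }
  for (var i = 2; i * i <= n; i++) {
    if (n % i === 0) { return false; }
  }
  return true;
}

// Browser-only part: build a worker from the function's source text.
function startPrimeWorker(n, onResult) {
  var source = isPrime.toString() +
    '\nonmessage = function (e) { postMessage(isPrime(e.data)); };';
  var url = URL.createObjectURL(new Blob([source], { type: 'application/javascript' }));
  var worker = new Worker(url);
  worker.onmessage = function (e) {
    onResult(e.data);          // runs on the main thread, so DOM access is fine here
    worker.terminate();
    URL.revokeObjectURL(url);
  };
  worker.postMessage(n);
}

if (typeof Worker !== 'undefined' && typeof Blob !== 'undefined') {
  // The main thread stays free to animate a loader while this runs.
  startPrimeWorker(1000003, function (result) {
    console.log(result ? 'prime' : 'not prime');
  });
}
```

Because the main thread is never blocked, a loading indicator shown before startPrimeWorker is called can keep animating until the result callback hides it.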
To see the problem in action, see this jsbin. Clicking on the button triggers the buttonHandler(), which looks like this:
function buttonHandler() {
var elm = document.getElementById("progress");
elm.innerHTML = "thinking";
longPrimeCalc();
}
You would expect that this code changes the text of the div to "thinking", and then runs longPrimeCalc(), an arithmetic function that takes a few seconds to complete. However, this is not what happens. Instead, "longPrimeCalc" completes first, and then the text is updated to "thinking" after it's done running, as if the order of the two lines of code were reversed.
It appears that the browser does not run "innerHTML" code synchronously, but instead creates a new thread for it that executes at its own leisure.
My questions:
What is happening under the hood that is leading to this behavior?
How can I get the browser to behave the way I would expect, that is, force it to update the "innerHTML" before it executes "longPrimeCalc()"?
I tested this in the latest version of Chrome.
Your surmise is incorrect. The .innerHTML update does complete synchronously (and the browser most definitely does not create a new thread). The browser simply does not bother to update the window until your code is finished. If you were to interrogate the DOM in some way that required the view to be updated, then the browser would have no choice.
For example, right after you set the innerHTML, add this line:
var sz = elm.clientHeight; // whoops that's not it; hold on ...
edit — I might figure out a way to trick the browser, or it might be impossible; it's certainly true that launching your long computation in a separate event loop will make it work:
setTimeout(longPrimeCalc, 10); // not 0, at least not with Firefox!
A good lesson here is that browsers try hard not to do pointless re-flows of the page layout. If your code had gone off on a prime number vacation and then come back and updated the innerHTML again, the browser would have saved some pointless work. Even if it's not painting an updated layout, browsers still have to figure out what's happened to the DOM in order to provide consistent answers when things like element sizes and positions are interrogated.
I think the way it works is that the currently running code completes first, then all the page updates are done. In this case, calling longPrimeCalc causes more code to be executed, and only when it is done does the page update change.
To fix this you have to have the currently running code terminate, then start the calculation in another context. You can do that with setTimeout. I'm not sure if there's any other way besides that.
Here is a jsfiddle showing the behavior. You don't have to pass a callback to longPrimeCalc, you just have to create another function which does what you want with the return value. Essentially you want to defer the calculation to another "thread" of execution. Writing the code this way makes it obvious what you're doing (Updated again to make it potentially nicer):
function defer(f, callback) {
var proc = function() {
var result = f(); // declared locally to avoid leaking a global
if (callback) {
callback(result);
}
};
setTimeout(proc, 50);
}
function buttonHandler() {
var elm = document.getElementById("progress");
elm.innerHTML = "thinking...";
defer(longPrimeCalc, function (isPrime) {
if (isPrime) {
elm.innerHTML = "It was a prime!";
}
else {
elm.innerHTML = "It was not a prime =(";
}
});
}
When looking to improve a page's performance, one technique I haven't heard mentioned before is using setTimeout to prevent JavaScript from holding up the rendering of a page.
For example, imagine we have a particularly time-consuming piece of jQuery inline with the html:
$('input').click(function () {
// Do stuff
});
If this code is inline, we are holding up the perceived completion of the page while the piece of jQuery is busy attaching a click handler to every input on the page.
Would it be wise to spawn a new thread instead:
setTimeout(function() {
$('input').click(function () {
// Do stuff
})
}, 100);
The only downside I can see is that there is now a greater chance the user clicks on an element before the click handler is attached. However, this risk may be acceptable and we have a degree of this risk anyway, even without setTimeout.
Am I right, or am I wrong?
The actual technique is to use setTimeout with a time of 0.
This works because JavaScript is single-threaded. A timeout doesn't cause the browser to spawn another thread, nor does it guarantee that the code will execute in the specified time. However, the code will be executed when both:
The specified time has elapsed.
Execution control is handed back to the browser.
Therefore calling setTimeout with a time of 0 can be considered as temporarily yielding to the browser.
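The two conditions above can be seen in a tiny example: even with a delay of 0, the callback does not run until the code that scheduled it has finished. The variable name order is just for illustration.

```javascript
var order = [];

order.push('before');
setTimeout(function () {
  // Runs only after the current script has finished and control
  // has been handed back to the event loop.
  order.push('timeout');
  console.log(order.join(' -> ')); // before -> after -> timeout
}, 0);
order.push('after');
```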
This means if you have long running code, you can simulate multi-threading by regularly yielding with a setTimeout. Your code may look something like this:
var batches = [...]; // Some array
var currentBatch = 0;
// Start long-running code, whenever browser is ready
setTimeout(doBatch, 0);
function doBatch() {
if (currentBatch < batches.length) {
// Do stuff with batches[currentBatch]
currentBatch++;
setTimeout(doBatch, 0);
}
}
Note: While it's useful to know this technique in some scenarios, I highly doubt you will need it in the situation you describe (assigning event handlers on DOM ready). If performance is indeed an issue, I would suggest looking into ways of improving the real performance by tweaking the selector.
For example if you only have one form on the page which contains <input>s, then give the <form> an ID, and use $('#someId input').
setTimeout() can be used to improve the "perceived" load time, but not the way you've shown it. Using setTimeout() does not cause your code to run in a separate thread. Instead, setTimeout() simply yields the thread back to the browser for (approximately) the specified amount of time. When it's time for your function to run, the browser will yield the thread back to the JavaScript engine. In JavaScript there is never more than one thread (unless you're using something like "Web Workers").
So, if you want to use setTimeout() to improve performance during a computation-intensive task, you must break that task into smaller chunks, and execute them in-order, chaining them together using setTimeout(). Something like this works well:
function runTasks( tasks, idx ) {
idx = idx || 0;
tasks[idx++]();
if( idx < tasks.length ) {
setTimeout( function(){ runTasks(tasks, idx); },1);
}
}
runTasks([
function() {
/* do first part */
},
function() {
/* do next part */
},
function() {
/* do final part */
}
]);
Note:
The functions are executed in order. There can be as many as you need.
When the first function returns, the next one is called via setTimeout().
The timeout value I've used is 1. This is sufficient to cause a yield, and the browser will take the thread if it needs it, or allow the next task to proceed if there's time. You can experiment with other values if you feel the need, but usually 1 is what you want for these purposes.
You are correct, there is a greater chance of a "missed" click, but with a low timeout value, it's pretty unlikely.
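When the work is a long list of small items rather than a handful of named steps, the same chaining idea can be combined with a time budget per slice. This is only a sketch: processInSlices and its options (budgetMs, now, schedule, onDone) are hypothetical names, and the clock and scheduler are injectable purely to make the behaviour easy to check.

```javascript
// Hypothetical helper: process `items` in slices of roughly `budgetMs` each,
// yielding to the browser between slices.
function processInSlices(items, handleItem, options) {
  var budgetMs = options.budgetMs;
  var now = options.now || Date.now;
  var schedule = options.schedule || function (fn) { setTimeout(fn, 0); };
  var onDone = options.onDone || function () {};
  var index = 0;

  function runSlice() {
    var sliceStart = now();
    // Always handle at least one item so that progress is guaranteed.
    do {
      handleItem(items[index], index);
      index++;
    } while (index < items.length && now() - sliceStart < budgetMs);

    if (index < items.length) {
      schedule(runSlice); // yield, then continue with the next slice
    } else {
      onDone();
    }
  }

  if (items.length > 0) {
    schedule(runSlice);
  } else {
    onDone();
  }
}

// Usage: sum a few numbers without blocking for longer than about 10 ms at a time.
var total = 0;
processInSlices([1, 2, 3, 4], function (n) { total += n; }, {
  budgetMs: 10,
  onDone: function () { console.log('total:', total); } // total: 10
});
```

A budget somewhere below one frame (about 16 ms at 60 Hz) is a common starting point, but the right value depends on the page.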
I have a function called save(), this function gathers up all the inputs on the page, and performs an AJAX call to the server to save the state of the user's work.
save() is currently called when a user clicks the save button, or performs some other action which requires us to have the most current state on the server (generate a document from the page for example).
I am adding in the ability to auto save the user's work every so often. First I would like to prevent an AutoSave and a User generated save from running at the same time. So we have the following code (I am cutting most of the code and this is not a 1:1 but should be enough to get the idea across):
var isSaving=false;
var timeoutId;
var timeoutInterval=300000;
function save(showMsg)
{
//Don't save if we are already saving.
if (isSaving)
{
return;
}
isSaving=true;
//disables the autoSave timer so if we are saving via some other method
//we won't kick off the timer.
disableAutoSave();
if (showMsg) { /* show a saving popup */ }
var params=CollectParams();
PerformCallBack(params,endSave,endSaveError);
}
function endSave()
{
isSaving=false;
//hides popup if it's visible
//Turns auto saving back on so we save x milliseconds after the last save.
enableAutoSave();
}
function endSaveError()
{
alert("Ooops");
endSave();
}
function enableAutoSave()
{
timeoutId=setTimeout(function(){save(false);},timeoutInterval);
}
function disableAutoSave()
{
clearTimeout(timeoutId);
}
My question is: is this code safe? Do the major browsers allow only a single thread to execute at a time?
One thought I had is that it would be bad for the user to click save and get no response because we are autosaving (and I know how to modify the code to handle this). Does anyone see any other issues here?
JavaScript in browsers is single threaded. You will only ever be in one function at any point in time. Functions will complete before the next one is entered. You can count on this behavior, so if you are in your save() function, you will never enter it again until the current one has finished.
Where this sometimes gets confusing (and yet remains true) is when you have asynchronous server requests (or setTimeouts or setIntervals), because then it feels like your functions are being interleaved. They're not.
In your case, while two save() calls will not overlap each other, your auto-save and user save could occur back-to-back.
If you just want a save to happen at least every x seconds, you can do a setInterval on your save function and forget about it. I don't see a need for the isSaving flag.
I think your code could be simplified a lot:
var intervalTime = 300000;
var intervalId = setInterval(function () { save('my message'); }, intervalTime);
function save(showMsg)
{
if (showMsg) { /* show a saving popup */ }
var params=CollectParams();
PerformCallBack(params, endSave, endSaveError);
// You could even reset your interval now that you know we just saved.
// Of course, you'll need to know it was a successful save.
// Doing this will prevent the user clicking save only to have another
// save bump them in the face right away because an interval comes up.
clearInterval(intervalId);
intervalId = setInterval(function () { save('my message'); }, intervalTime);
}
function endSave()
{
// no need for this method
alert("I'm done saving!");
}
function endSaveError()
{
alert("Ooops");
endSave();
}
All major browsers only support one javascript thread (unless you use web workers) on a page.
XHR requests can be asynchronous, though. But as long as you disable the ability to save until the current request to save returns, everything should work out just fine.
My only suggestion, is to make sure you indicate to the user somehow when an autosave occurs (disable the save button, etc).
All the major browsers currently run JavaScript on a single thread (as long as you don't use web workers, which a few browsers support), so this approach is safe.
For a bunch of references, see Is JavaScript Multithreaded?
Looks safe to me. JavaScript is single threaded (unless you are using web workers).
It's not quite on topic, but this post by John Resig covers JavaScript threading and timers:
http://ejohn.org/blog/how-javascript-timers-work/
I think the way you're handling it is best for your situation. By using the flag you're guaranteeing that the asynchronous calls aren't overlapping. I've had to deal with asynchronous calls to the server as well and also used some sort of flag to prevent overlap.
As others have already pointed out, JavaScript is single threaded, but asynchronous calls can be tricky if you're expecting things to stay the same or not happen during the round trip to the server.
One thing, though, is that I don't think you actually need to disable the auto-save. If the auto-save tries to happen when a user is saving then the save method will simply return and nothing will happen. On the other hand you're needlessly disabling and reenabling the autosave every time autosave is activated. I'd recommend changing to setInterval and then forgetting about it.
Also, I'm a stickler for minimizing global variables. I'd probably refactor your code like this:
var saveWork = (function() {
var isSaving=false;
var timeoutId;
var timeoutInterval=300000;
function endSave() {
isSaving=false;
//hides popup if it's visible
}
function endSaveError() {
alert("Ooops");
endSave();
}
function _save(showMsg) {
//Don't save if we are already saving.
if (isSaving)
{
return;
}
isSaving=true;
if (showMsg) { /* show a saving popup */ }
var params=CollectParams();
PerformCallBack(params,endSave,endSaveError);
}
return {
save: function(showMsg) { _save(showMsg); },
enableAutoSave: function() {
timeoutId=setInterval(function(){_save(false);},timeoutInterval);
},
disableAutoSave: function() {
clearInterval(timeoutId);
}
};
})();
You don't have to refactor it like that, of course, but like I said, I like to minimize globals. The important thing is that the whole thing should work without disabling and reenabling autosave every time you save.
Edit: I forgot that I had to create a private save function so that it can be referenced from enableAutoSave.
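One possible way to address the concern raised in the question (a user clicking save while an autosave is in flight and getting no response) is to remember the request instead of dropping it. The sketch below is only an illustration: createSaver is a made-up name, and the send(onSuccess, onError) parameter stands in for the PerformCallBack call from the question.

```javascript
// Hypothetical helper; `send(onSuccess, onError)` stands in for PerformCallBack.
function createSaver(send) {
  var isSaving = false;
  var saveRequested = false;

  function finish() {
    isSaving = false;
    if (saveRequested) {     // someone asked for a save while we were busy
      saveRequested = false;
      save();
    }
  }

  function save() {
    if (isSaving) {
      saveRequested = true;  // remember the request instead of dropping it
      return false;
    }
    isSaving = true;
    send(finish, finish);    // success and error both release the flag
    return true;
  }

  return {
    save: save,
    isSaving: function () { return isSaving; }
  };
}

// Possible wiring (not executed here):
//   var saver = createSaver(function (ok, err) {
//     PerformCallBack(CollectParams(), ok, err);
//   });
//   setInterval(saver.save, 300000);   // autosave
//   saveButton.onclick = saver.save;   // manual save
```

Because JavaScript runs one function at a time, the two flags cannot be changed halfway through save() or finish(), which is what makes this simple guard sufficient.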