How do I know a page is really fully loaded? - javascript

I am using Python's webkit.WebView with GTK to crawl a web page. However, the page's content is loaded dynamically by JavaScript.
The WebView "load-finished" event is not sufficient to handle this. Is there any indicator or event that tells me the page is really fully loaded, including the content produced by JavaScript?
Thanks,

There is no real way to determine if that page is fully loaded.
One method is to determine the amount of time since the last request. However, some pages will make repeated requests continually. This is common with tracking scripts and some ad scripts.
What I would do is wait a set amount of time after the web view has said it finished loading, 5 seconds or so. It isn't perfect, but it is the best you can get, as there is no way to determine what "fully loaded" means for an arbitrary page.
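The two heuristics above (time since the last request, plus a fixed upper bound) can be combined in a small helper. This is only a sketch: the class name, method names, and default timings are made up for illustration, and wiring `note_request()` to whatever request-related signals your WebView exposes is assumed and not shown.

```python
import time


class QuietPeriodDetector:
    """Hypothetical helper: a page counts as "settled" once no request has
    been seen for quiet_seconds, or once max_wait_seconds have passed."""

    def __init__(self, quiet_seconds=5.0, max_wait_seconds=30.0,
                 clock=time.monotonic):
        self.quiet_seconds = quiet_seconds
        self.max_wait_seconds = max_wait_seconds
        self._clock = clock
        self._started = clock()
        self._last_activity = self._started

    def note_request(self):
        """Call whenever the view starts or finishes a resource request."""
        self._last_activity = self._clock()

    def is_settled(self):
        now = self._clock()
        quiet = now - self._last_activity >= self.quiet_seconds
        # The hard cap guards against pages that never stop making requests
        # (tracking scripts, ads).
        timed_out = now - self._started >= self.max_wait_seconds
        return quiet or timed_out
```

In a GTK application you would typically poll `is_settled()` from a timer callback after "load-finished" fires, rather than sleeping in the main loop.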

Related

How to detect load requests of iframe content elements?

I'm trying to create a secure Darknet with WebRTC DataChannels in pure HTML. I'm not interested in knowing when an iframe has fully loaded; instead, I want to capture the requests for the iframe's content elements (inline images and so on) using a custom scheme, so that the parent page (the one connected to the Darknet) can perform the real request and respond with the actual data. With the mozbrowserlocationchange event of the FirefoxOS Browser API objects (an extension to iframes), I could capture the user's navigation, cancel it, make the real request on the Darknet, and later inject the real content fetched by the parent page into the iframe. But how could I do the same with the inline images and scripts on the loaded page? Or is this not currently possible, and should I ask them to add this functionality?
Obviously, I don't have any control over the content pages loaded in the iframes, so they could be created by anyone and in any way. Also, I'm using the Browser API only because it seems the most useful for what I'm trying to do; ideally, this would be achievable with plain iframes... :-)
Update:
A half-solution I have been thinking about: since I can capture the mozbrowserlocationchange event, I could make the real request for the HTML page myself and, before filling the iframe with it, request its linked images and scripts and inline them, to prevent the iframe from making more requests. This would only support very simple web pages compared to current web standards (no AJAX, no async loading of script tags...), but it would definitely be usable up to a point :-)
Anyway, is there any other better alternative?
That sounds like something that would be possible (straightforward, even) as soon as Service Controllers (previously known as NavigationControllers) are implemented, but I do not know of any way to accomplish this with any currently available method.
No wonder you didn't find info about this - the proposal is called "Service workers" (though, previously this was called Event workers, and even before that, they were called - guess what - navigation controllers). This is a lively spec! ;) Find the working draft on GitHub: https://github.com/slightlyoff/ServiceWorker/ with a lengthy explainer document that should get you going.
Also, there is a document with the current Chrome (Blink) implementation plans.

Django load parts of pages dynamically as they become available

I'm making a Django page that has a sidebar with some info that is loaded from external websites (e.g. bus arrival times).
I'm new to web development and I recognize this as a bottleneck. As it is, the page hangs for a fraction of a second as it loads the data from the other sites. It doesn't display anything until it gets this info because it runs python scripts to get the data before baking it into the html.
Ideally, it would display the majority of the page loaded directly off my web server and then have a little "loading" gif or something until it actually manages to grab the data before displaying that.
How can I achieve this? I presume javascript will be useful? How can I get it to integrate with my existing poller scripts?
You probably don't need up-to-the-second information, so have another process load the data into a cache, and have your website read it from the local cache.
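A minimal sketch of that cache idea follows. The class and its method names are hypothetical, and it keeps the data in memory for brevity; in a real Django project the same pattern would more likely sit on top of Django's cache framework or a small database table, with the poller scripts run by a separate process.

```python
import threading
import time


class CachedFeed:
    """Hypothetical wrapper: a background job calls refresh(); page views
    call read(), which never touches the slow external site."""

    def __init__(self, fetch, max_age_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch          # the existing (slow) poller function
        self._max_age = max_age_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._value = None
        self._fetched_at = None

    def refresh(self):
        value = self._fetch()        # slow call happens outside the lock
        with self._lock:
            self._value = value
            self._fetched_at = self._clock()

    def read(self):
        """Return (value, is_fresh); value is None before the first refresh."""
        with self._lock:
            if self._fetched_at is None:
                return None, False
            fresh = self._clock() - self._fetched_at <= self._max_age
            return self._value, fresh
```

The view then renders whatever `read()` returns immediately; the `is_fresh` flag lets the template show a "last updated" or "loading" hint when the background job has fallen behind.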
The easiest but not most beautiful way to integrate something like this would be with iframes. Just make iframes for the secondary stuff, and they will load themselves in due time. No javascript required.

How can I stop loading a web page if it is equipped with a frame-buster buster?

How can I stop loading a web page if it uses a frame-buster buster as mentioned in this question, or an even stronger X-Frame-Options: deny like stackoverflow.com? I am creating a web application that loads external web pages into an <iframe> via JavaScript, but if the user accidentally lands on websites like google.com or stackoverflow.com, which have a function to bust a frame-buster, I just want to quit loading. On stackoverflow.com, it shows a pop-up message asking to disable the frame and proceed, but I would rather stop loading the page. On Google, it removes the frame without asking. I have absolutely no intent of clickjacking, and at the moment, I only use this application myself. It is inconvenient that every time I land on such sites, the frames are broken. I just do not need to continue loading these pages.
Edit
Seeing the answers so far, it seems that I can't detect this before loading. Then, is it possible to load the page in a different tab, and then see if it does not have the frame-buster buster, and then if it doesn't, then load that into the <iframe> within the original tab?
Edit 2
I can also retrieve the headers or the web page as an HTML string through the scripting language (Ruby) that I am using. So I think I do indeed have access to the information before loading it into an <iframe>.
There's no way to detect this before loading the page since the frame busting is done via a header or is triggered via JavaScript as the page is loading.
Without a server backend you won't be able to, as you are pretty limited in the amount of tinkering you can do in JavaScript due to cross-domain policies.
You might want to consider creating some sort of a blacklist for URLs to stay away from...
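If a server-side script can fetch the response headers first (as Edit 2 in the question suggests), the header-based case can be checked before the URL is handed to the <iframe>. The sketch below is in Python rather than the asker's Ruby, and the function name is hypothetical. It only covers headers: frame busting done in JavaScript cannot be detected this way.

```python
def blocks_framing(headers):
    """Hypothetical check: True if the response headers forbid embedding
    the page in a frame on another origin.

    `headers` is a plain dict of header name -> value, as obtained from
    whatever HTTP client the backend uses.
    """
    normalized = {name.lower(): value for name, value in headers.items()}

    # X-Frame-Options: DENY and SAMEORIGIN both rule out a third-party frame.
    xfo = normalized.get("x-frame-options", "").strip().lower()
    if xfo in ("deny", "sameorigin"):
        return True

    # Content-Security-Policy: frame-ancestors is the newer mechanism.
    csp = normalized.get("content-security-policy", "")
    for directive in csp.split(";"):
        parts = directive.split()
        if parts and parts[0].lower() == "frame-ancestors":
            sources = [p.lower() for p in parts[1:]]
            # Conservative assumption: anything short of a wildcard is
            # treated as "our page is not allowed to frame this".
            return "*" not in sources

    return False
```

A backend endpoint could run this on the headers of a HEAD or GET response and tell the front end to skip the URL, which complements the blacklist idea for sites that rely on scripts instead of headers.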

javascript file size and page loading times

I'm making an ASP.NET web forms web app. I've just started with the client side scripts. I'm planning to put quite a lot of JavaScript code in a file that will be loaded on each page. I want to know some general guidelines about when to start worrying about the size of the file, in consideration of the users and their page loading times.
The users will mostly be using Internet Explorer 7 and 8, but I suppose the script still will be cached after the first visit? If not, is there any way to make IE cache the file?
They'll be cached, like you suppose, after the first visit, so you don't need to worry unless it actually becomes an issue.

Why do some websites (like facebook) load scripts in an iframe?

Why do some websites (like facebook) load scripts in an iframe?
Is this to allow the site to load more than 2 resources at a time because the iframe's resources are at different URLs?
What you are seeing might be an application of "Comet" communication, using a hidden iframe as a data channel. A short explanation of the technique according to Wikipedia:
A basic technique for dynamic web application is to use a hidden IFrame HTML element (an inline frame, which allows a website to embed one HTML document inside another). This invisible IFrame is sent as a chunked block, which implicitly declares it as infinitely long (sometimes called “forever frame”). As events occur, the iframe is gradually filled with script tags, containing JavaScript to be executed in the browser. Because browsers render HTML pages incrementally, each script tag is executed as it is received.
This could be used for something like a chat, where messages are expected to appear without noticeable delay, and preferably without periodic "polling" for new data. If this is what you have come across, you should see several <script> elements in the frame, and more should be added as time goes by.
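To make the "forever frame" idea concrete, here is a rough sketch of the server side as a Python generator that a streaming (chunked) HTTP response could iterate over. It is not Facebook's implementation; the function name and the `parent.onCometEvent` callback are hypothetical, and the web framework that would stream these chunks is not shown.

```python
import json


def forever_frame_chunks(events, callback="parent.onCometEvent"):
    """Yield the pieces of a never-ending HTML document for a hidden iframe.

    `events` is any iterable (possibly blocking or infinite) of
    JSON-serializable messages. `callback` names a function that the
    parent page is assumed to define.
    """
    # Opening markup is sent once; the document is never closed.
    yield "<!DOCTYPE html><html><body>\n"
    for event in events:
        # Escape "</" so a payload cannot terminate the script tag early.
        payload = json.dumps(event).replace("</", "<\\/")
        # The browser executes each script tag as soon as it arrives.
        yield "<script>{}({});</script>\n".format(callback, payload)
```

Each yielded script tag corresponds to one of the `<script>` elements you would see accumulating in the frame.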
EDIT
So to actually address your question... I don't know! The following information might be helpful, however:
Facebook prepends all of the JS variables and functions with your application ID.
var ID;
becomes
var a1262682068026_ID;
This limits the scope of your javascript to only your application so you can't use the DOM to get at their friends, phone number, email, address, etc, unless authorized. It makes a little sub-sandbox for you to play in.
More info on scoping here:
Facebook Docs
JavaScript loaded in an iframe has no access to the parent page's objects (cross-domain restriction).
They load Comet (aka HTTP Push, long-lived, etc.) connections in an iframe because Internet Explorer eventually drops the connection:
http://cometdaily.com/2007/10/25/http-streaming-and-internet-explorer/
As it is effectively a continuous long poll, this is a blocker. This hack also gets around IE's 2-connection limit, leading to better responsiveness. Background info:
http://alex.dojotoolkit.org/2006/02/what-else-is-burried-down-in-the-depths-of-googles-amazing-javascript/
