PhantomJS page.open with multiple URLs slows down - javascript

I have a set of URLs that I open with page.open(). After processing the contents, I call page.release() on the page and then recursively call the function to open the next page. The webpage has JavaScript on it, and I test for a condition that shows when the JavaScript has loaded its results. The first page.open() call loads the JS in 1 second, but all subsequent calls take about 6 seconds. I am using page.release(), the pages loaded aren't blank, and PhantomJS is not crashing. I am wondering why this is happening. I've also tried using page.close().
var doAnalysis = function (i) {
    var url = 'http://myurl.com';
    page.open(url, function (status) {
        // get html and process it
        page.release();
        doAnalysis(i + 1); // i++ would pass the old value of i
    });
};

Related

How to ensure webpage DOM is fully loaded before using jQuery

I am trying to load the DOM of the currently open webpage in a Google Chrome extension that I am developing, and then retrieve some elements and store them in variables for manipulation. Currently, I am just trying to grab the sender email from the span with the gD class. As of now, it outputs whatever I have in the doStuffWithDOM function to the console twice. I have tried several ways of ensuring that the page is fully loaded before calling the function, such as checking tab.status == 'complete' and/or changeInfo.status == 'complete', but it still calls the function multiple times. Any suggestions on how to ensure the webpage's DOM is fully loaded before calling the doStuffWithDOM function?
background.js
function doStuffWithDOM(domContent) {
    var senderName = $(domContent).find("span.gD");
    console.log(senderName);
}
chrome.tabs.onUpdated.addListener(function (id, changeInfo, tab) {
    // Send the message after the webpage has loaded
    if (changeInfo.status == 'complete' && tab.status == 'complete') {
        chrome.tabs.sendMessage(tab.id, { text: "report_back" }, function (response) {
            doStuffWithDOM(response);
        });
    }
});
console output:
test
jQuery.fn.init [prevObject: jQuery.fn.init(62)]
test
jQuery.fn.init [span.gD, prevObject: jQuery.fn.init(70)]

How to follow a document.location.reload in PhantomJS?

I've loaded a page in PhantomJS (using it from NodeJS) and there's a JS function on that page doRedirect() which contains
...
document.cookie = "key=" + assignedKey
document.location.reload(true)
I run doRedirect() from PhantomJS like this
page.evaluate(function() {
    return doRedirect()
}).then(function(result) {
    // result is null here
})
I'd like PhantomJS to follow the document.location.reload(true) and return the contents of that new page. How can this be done?
document.location.reload() doesn't navigate anywhere; it reloads the page. It's like clicking the refresh button in your browser. It all happens in the frontend, not on the server, which is where an HTTP 3xx redirect would come from.
Simply call that function, wait for PhantomJS to finish loading the page, then ask it for the contents.
You can wait for PhantomJS to finish loading by using the page.onLoadFinished event. Additionally, you might need to use setTimeout() after load to wait some additional amount of time for page content to load asynchronously.
var webPage = require('webpage');
var page = webPage.create();
page.onLoadFinished = function(status) {
    // page has loaded, but wait extra time for async content
    setTimeout(function() {
        // do your work here
    }, 2000); // milliseconds, 2 seconds
};
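Putting those steps together, a minimal sketch might look like the following. It is written against the plain PhantomJS webpage API (page.evaluate, page.onLoadFinished, page.content). The helper name reloadAndGetContent and the settle delay are assumptions for illustration only, and a NodeJS bridge that returns promises would need the calls adapted.

```javascript
// Hypothetical helper: trigger the in-page reload, wait for the load to
// finish, then hand the reloaded page's HTML to a callback.
function reloadAndGetContent(page, settleMs, done) {
    page.onLoadFinished = function (status) {
        page.onLoadFinished = null; // handle only this one reload
        if (status !== 'success') {
            done(new Error('reload failed: ' + status));
            return;
        }
        // Give asynchronously loaded content some time to arrive.
        setTimeout(function () {
            done(null, page.content);
        }, settleMs);
    };
    // doRedirect() is the page's own function from the question.
    page.evaluate(function () {
        doRedirect();
    });
}
```

It could be called as reloadAndGetContent(page, 2000, function (err, html) { ... }), where html is the content of the page after the reload.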

Chrome extension to refresh a page every minute and run a (javascript) script every time it refreshes

What I want:
My purpose is to check whether new content has been added to a page (that I do not own), so I was thinking of writing a script that saves the last content added in a cookie and refreshes the page every minute. If the cookie doesn't match the last content added, that means there is new content and I would receive a notification.
Let's try with pseudocode:
main_file:
include: functions.js;
cookie last_content_added = get_first_paragraph();
//Refresh script
do (every_minute){
    page_reload();
}
when.page.reload.complete {
    run script_check_content
}
functions.js
script_check_content{
    var content_check = get_first_paragraph();
    if (content_check == cookie[last_content_added])
    {
        //do nothing
    }
    else{
        //new content was added
        play.notification.mp3
        cookie[last_content_added] = get_first_paragraph();
    }
}
Is there an easier solution that I am not thinking of?
I'm new to Chrome extensions. If you could separate the code into different files, as in a real extension, I would appreciate it very much.
I recommend using chrome.tabs.query, which gets all tabs that have the specified properties (or all tabs if no properties are specified), together with chrome.tabs.executeScript to inject JavaScript code that calls window.location.reload() to refresh the page.
Here's sample code that gets the current tab and reloads it using chrome.tabs methods:
// active: true selects the current tab of the current window
chrome.tabs.query({active: true, currentWindow: true}, function (arrayOfTabs) {
    var code = 'window.location.reload();';
    chrome.tabs.executeScript(arrayOfTabs[0].id, {code: code});
});
Also, add a chrome.webNavigation.onCompleted listener to be notified when the page is completely loaded and initialized:
chrome.webNavigation.onCompleted.addListener(function (details) { /* the page in details.tabId has finished loading */ });
Take a look at MutationObserver; it provides a way to react to changes in the DOM. You can supply a callback that reacts to DOM changes, so you don't need a timer.
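As a rough sketch of the MutationObserver approach, assuming the page inserts new content without a full reload: the helper below would run in a content script and report whenever the text of the first paragraph changes. The name watchFirstParagraph and the choice of the first p element are assumptions borrowed from the pseudocode in the question.

```javascript
// Hypothetical helper: call onNewContent whenever the text of the first
// <p> inside `root` differs from the last value seen.
function watchFirstParagraph(root, onNewContent) {
    function firstParagraphText() {
        var p = root.querySelector('p');
        return p ? p.textContent : '';
    }
    var last = firstParagraphText();
    var observer = new MutationObserver(function () {
        var current = firstParagraphText();
        if (current !== last) {
            last = current;
            onNewContent(current);
        }
    });
    // Watch for added/removed nodes and text edits anywhere under root.
    observer.observe(root, { childList: true, subtree: true, characterData: true });
    return observer; // call observer.disconnect() to stop watching
}
```

For example, watchFirstParagraph(document.body, function (text) { /* notify */ }) would replace both the refresh timer and the cookie comparison, as long as the page updates itself in place.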

Chrome extensions: best method for communicating between background page and a web site page script

What I want to do is run the go() function in the image.js file. I've googled around, and I understand that it is not possible to run inline scripts.
What is the best method to call the JavaScript I want? Events? Messages? Requests? Any other way?
Here is my code so far:
background.js
chrome.browserAction.onClicked.addListener(function(tab) {
    var viewTabUrl = chrome.extension.getURL('image.html');
    var newURL = "image.html";
    chrome.tabs.create({
        url : newURL
    });
    var tabs = chrome.tabs.query({}, function(tabs) {
        for (var i = 0; i < tabs.length; i++) {
            var tab = tabs[i];
            if (tab.url == viewTabUrl) {
                //here i want to call go() function from image.js
            }
        }
    });
});
image.html
<html>
<body>
<script src="js/image.js"></script>
</body>
</html>
image.js
function go(){
    alert('working!');
}
There are various ways to achieve this. Depending on what exactly you are trying to achieve (which is not clear from your question), one way might be better than another.
An easy way would be to inject a content script and communicate with it through Message Passing, but it is not possible to inject content scripts into a page with the chrome-extension:// scheme (despite what the docs say - there is an open issue for correcting the docs).
So, here is one possibility: Use window.postMessage
E.g.:
In background.js:
var viewTabURL = chrome.extension.getURL("image.html");
// You need to open the tab like this in order to be able to use `postMessage()`
var win = window.open(viewTabURL);
function requestToInvokeGo() {
    win.postMessage("Go", viewTabURL);
}
image.js:
window.addEventListener("message", function(evt) {
    if (location.href.indexOf(evt.origin) !== -1) {
        /* OK, I know this guy */
        if (evt.data === "Go") {
            /* Master says: "Go" */
            alert("Went !");
        }
    }
});
In general, the easiest method to communicate between the background page and extension views is via direct access to the respective window objects. That way you can invoke functions or access defined properties in the other page.
Obtaining the window object of the background page from another extension page is straightforward: use chrome.extension.getBackgroundPage(), or chrome.runtime.getBackgroundPage(callback) if it's an event page.
To obtain the window object of an extension page from the background page you have at least three options:
Loop through the results of chrome.extension.getViews({type:'tab'}) to find the page you want.
Open the page in the first place using window.open, which directly returns the window object.
Make code in the extension page call a function in the background page to register itself, passing its window object as a parameter. See for instance this answer.
Once you have a reference to the window object of your page, you can call its functions directly: win.go()
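As an illustration of the first option, a small sketch of looping over the extension views to find one by URL might look like this. The helper name findViewByUrl is hypothetical, and the usage in the comments assumes the image.html tab is already open.

```javascript
// Hypothetical helper: return the first extension view (a window object)
// whose URL matches `url`, or null if none is open.
function findViewByUrl(views, url) {
    for (var i = 0; i < views.length; i++) {
        if (views[i].location.href === url) {
            return views[i];
        }
    }
    return null;
}

// Usage from the background page:
// var win = findViewByUrl(chrome.extension.getViews({ type: 'tab' }),
//                         chrome.extension.getURL('image.html'));
// if (win) { win.go(); }
```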
As a side note, in your case you are opening an extension view, and then immediately want to invoke a function in it without passing any information from the background page. The easiest way to achieve that would be to simply make the view run the function when it loads. You just need to add the following line to the end of your image.js script:
go();
Note also that the code in your example will probably fail to find your tab, because chrome.tabs.create is asynchronous and will return before your tab is created.

Listen again to `$(window).load` after ajax request

I'm loading an AJAX request of another HTML page, which is then inserted into a DOM element of the current page.
The page I'm getting through AJAX includes link references to stylesheets, as well as multiple images, which must be loaded from the server.
I want to execute code after all resources from the AJAX call loads, including referenced stylesheets and images.
Note that these stylesheets and images are not directly loaded from AJAX but are loaded as a result of the insertion of the HTML from the AJAX call.
Thus, I'm not looking for the success: callback, but rather something like another $(window).load(function () { ... }); after the AJAX call. (I've tried listening again to $(window).load, without success.)
Let me know if you need more code.
Checking whether a stylesheet has been loaded is difficult to do -- especially cross-browser. This article suggests having an element that will be changed by a known rule in the loaded stylesheet and polling to check whether its style has been changed to detect loading.
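A rough sketch of that polling idea follows. The helper pollUntil is a generic, hypothetical function; the probe element and the 42px rule in the usage comment are assumptions that would have to exist in your page and in the loaded stylesheet.

```javascript
// Hypothetical helper: call check() every intervalMs until it returns true
// or timeoutMs has elapsed, then report the outcome via done(loaded).
function pollUntil(check, intervalMs, timeoutMs, done) {
    var start = Date.now();
    (function tick() {
        if (check()) {
            done(true);
        } else if (Date.now() - start >= timeoutMs) {
            done(false);
        } else {
            setTimeout(tick, intervalMs);
        }
    })();
}

// Usage (assumes the loaded stylesheet contains `#css-probe { width: 42px; }`
// and that an empty <div id="css-probe"> exists in the page):
// pollUntil(function () {
//     return $('#css-probe').width() === 42;
// }, 50, 5000, function (loaded) { /* run code or handle the timeout */ });
```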
Images are easier, and I would expect they take a lot longer to load so you can probably get away with only checking image loading.
success: function (html) {
    var imageloads = [];
    $(html).find("img").each(function () {
        var dfd = $.Deferred();
        $(this).on('load', function () {
            dfd.resolve();
        });
        //Image was cached?
        if (this.complete) {
            $(this).trigger('load');
        }
        imageloads.push(dfd);
    });
    $.when.apply(undefined, imageloads).done(function () {
        //images finished loading
    });
}
