How to follow a document.location.reload in PhantomJS? - javascript

I've loaded a page in PhantomJS (using it from NodeJS) and there's a JS function on that page doRedirect() which contains
...
document.cookie = "key=" + assignedKey
document.location.reload(true)
I run doRedirect() from PhantomJS like this
page.evaluate(function() {
return doRedirect()
}).then(function(result) {
// result is null here
})
I'd like PhantomJS to follow the document.location.reload(true) and return the contents of that new page. How can this be done?

document.location.reload() doesn't navigate anywhere; it reloads the current page, like clicking the refresh button in your browser. It all happens on the client, not on the server, which is where an HTTP 3xx redirect (such as 301 or 302) would come from.
Simply call that function, wait for PhantomJS to finish loading the page, then ask it for the contents.
You can wait for PhantomJS to finish loading by using the page.onLoadFinished event. Additionally, you might need to use setTimeout() after load to wait some additional amount of time for page content to load asynchronously.
var webPage = require('webpage');
var page = webPage.create();
page.onLoadFinished = function(status) {
// page has loaded, but wait extra time for async content
setTimeout(function() {
// do your work here
}, 2000); // milliseconds, 2 seconds
};
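Putting those steps together, a minimal sketch might look like the following. It assumes the plain PhantomJS API (a Node bridge exposes the same event through its own wrapper), and reloadAndGetContent, settleMs and done are hypothetical names; doRedirect() is the function from the question.

```javascript
// Sketch only: `page` is a PhantomJS WebPage object that has already loaded the page.
function reloadAndGetContent(page, settleMs, done) {
  var handled = false; // react to the first load event only

  page.onLoadFinished = function (status) {
    if (handled) { return; }
    handled = true;
    if (status !== 'success') {
      done(new Error('Reload failed: ' + status));
      return;
    }
    // Give asynchronous page scripts some time before reading the markup
    setTimeout(function () {
      done(null, page.content);
    }, settleMs);
  };

  // Runs in the page context: sets the cookie, then calls reload(true)
  page.evaluate(function () {
    doRedirect();
  });
}

// Hypothetical usage:
// reloadAndGetContent(page, 2000, function (err, html) {
//   console.log(err ? err.message : html);
//   phantom.exit();
// });
```

The handler is registered before doRedirect() runs so that the load event of the reload cannot be missed.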

Related

How to play sound in setInterval function?

I am trying to refresh the page at an interval and check whether the table data has changed; if it has, I want to play a notification sound. The sound is 60 seconds long, but it plays for just a second before the page refreshes. How can I pause execution until the sound has played completely and then continue?
var foo = setInterval(function(){
window.location.reload(1);
var myTable = document.getElementById('admin_id');
var rows = myTable.rows;
var firstRow = rows[0];debugger;
try {
if ($.cookie('new') === null || $.cookie('new') === "" )
{
$.cookie("new", firstRow["cells"][10].innerText);
}
else
{
var cookieValue = $.cookie("new");
if(cookieValue!=firstRow["cells"][10].innerText){
$.cookie("new", firstRow["cells"][10].innerText);
//clearInterval(foo);
document.getElementById('id_audio').play();
}
}
}catch(err){alert(err.message);};
}, 1800000);
//Audio TAg Code
<audio id="id_audio" src="sound.mp3" preload="auto"></audio>
When you reload the page, everything in the page (including your code) is thrown away and replaced by the new page. The code above after the reload call runs briefly (still on the old page), and when it returns the browser terminates any ongoing processes the page is doing and loads the page fresh. Your code does not wait at the reload for the page to reload and then keep going.
Instead of what you're doing, have your code that checks for the new cookie run on page load. Then when you're reloading, just reload — your code will run on the reloaded page (on load).
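As a rough sketch of that approach, the comparison can be pulled into a function that runs once per page load. The names checkForChange, store and playSound are hypothetical; store stands in for the cookie used in the question.

```javascript
// Sketch only: `store` is any object with get(key) and set(key, value),
// standing in for the question's cookie; `playSound` is a callback.
function checkForChange(store, currentValue, playSound) {
  var previous = store.get('new');
  if (previous === null || previous === undefined || previous === '') {
    store.set('new', currentValue); // first visit: remember the value, no sound
    return false;
  }
  if (previous !== currentValue) {
    store.set('new', currentValue);
    playSound(); // no reload follows immediately, so the sound keeps playing
    return true;
  }
  return false;
}

// Hypothetical wiring in the browser, run once per page load:
// window.addEventListener('load', function () {
//   var cell = document.getElementById('admin_id').rows[0].cells[10];
//   checkForChange(myStore, cell.innerText, function () {
//     document.getElementById('id_audio').play();
//   });
//   setTimeout(function () { location.reload(); }, 1800000);
// });
```

Because the reload is only scheduled for 30 minutes later, a 60 second sound has time to finish before the page is replaced.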
If you don't want the code run on every page load, but just on reloads, set a flag in session storage before reloading and check it on load:
When reloading:
sessionStorage.reloadedAt = Date.now();
location.reload(1);
Then on load:
if (+sessionStorage.reloadedAt > Date.now() - 30000) {
// Do your checks
}
30000 ms = thirty seconds; adjust as appropriate.

PhantomJS page.injectJs doesn't work

I'm currently trying to write the source code of pages to text files, given a list of URLs. Everything works well, but I also want to inject a JavaScript file. The problem is that the file is not included properly: only the last pages loaded are complete, while the others are incomplete.
//phantomjs C:\PhantomJS\Script\test1.js
var fs = require('fs');
var numeroEpisode = 0;
var maxEpisode = 10;
var fichierLien = fs.read('C:\\PhantomJS\\Fichier\\lien.txt');
var ListeLien = fichierLien.split(/[\n]/);
var page = require('webpage').create();
function GetPage()
{
if (numeroEpisode > maxEpisode)
{
phantom.exit();
}
page.open(ListeLien[numeroEpisode], function(status)
{
if(status !== 'success')
{
console.log('Impossible de charger la page.');
}
else
{
console.log('URL: '+ListeLien[numeroEpisode]+'');
page.injectJs('http://mylink.com', function() { });
var path = 'C:\\PhantomJS\\Fichier\\episode_'+numeroEpisode+'.html';
fs.write(path, page.content, 'w');
setTimeout(GetPage, 15000); // run again in 15 seconds
numeroEpisode++;
}
});
}
GetPage();
Don't mix up page.injectJs() and page.includeJs().
injectJs(filename): Loads a local JavaScript file into the page and evaluates it synchronously.
includeJs(url, callback): Loads a remote JavaScript file from the specified URL and evaluates it. Since it has to request a remote resource, this is done asynchronously. The passed callback is called as soon as the operation has finished. If you don't use the callback, your code will most likely run before the remote JavaScript has been included. Use that callback:
page.includeJs('http://mylink.com', function() {
var path = 'C:\\PhantomJS\\Fichier\\episode_'+numeroEpisode+'.html';
fs.write(path, page.content, 'w');
numeroEpisode++;
setTimeout(GetPage, 15000); // run again in 15 seconds
});
Since the JavaScript that you load changes something on the page, you probably need to load it after all of the page's scripts have run. If this is a JavaScript-heavy page, you need to wait a little. You can wait a fixed amount of time:
setTimeout(function(){
page.includeJs('http://mylink.com', function() {
//...
});
}, 5000); // 5 seconds
or use a waitFor-style helper to wait until an element appears that indicates the page has finished loading. This can be tricky to get right.
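Note that waitFor is not built into PhantomJS; it is a small polling helper (the PhantomJS examples ship one in waitfor.js). A minimal sketch of such a helper, with hypothetical parameter names, could look like this:

```javascript
// Sketch only: polls `condition` until it returns true or the timeout passes.
function waitFor(condition, onReady, onTimeout, timeoutMs, intervalMs) {
  var deadline = Date.now() + timeoutMs;

  (function poll() {
    if (condition()) {
      onReady();
      return;
    }
    if (Date.now() >= deadline) {
      onTimeout();
      return;
    }
    setTimeout(poll, intervalMs); // check again later
  })();
}

// Hypothetical usage ('#content' is an assumed selector):
// waitFor(function () {
//   return page.evaluate(function () {
//     return !!document.querySelector('#content');
//   });
// }, function () {
//   page.includeJs('http://mylink.com', function () { /* write the file */ });
// }, function () {
//   console.log('Timed out waiting for the page');
//   phantom.exit(1);
// }, 10000, 250);
```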
If you still want to use injectJs() instead of includeJs() (for example because of its synchronous nature), then you need to download the external JavaScript file to your machine and then you can use injectJs().

Refresh just home page

I am currently designing a system which includes a homepage that shows the person who logs in only the work they have to do. I have been asked to set up this homepage to refresh every 3 minutes, which I have done using this code:
function startTimer() {
var now = new Date();
var minutes = now.getMinutes();
var seconds = now.getSeconds();
var secTime = minutes*60*seconds;
if(secTime % (3*60) == 0){
var refreshTime = 3*60*1000;
} else {
var refreshTime = (secTime % (3*60)) * 1000;
}
setTimeout('refresh()', refreshTime);}
function refresh() {
window.location.href = 'myURL';
}
startTimer();
The problem I currently have is that when I navigate away from this page but stay within the system, it keeps returning me to the homepage and I lose what I am working on.
Is there a way to keep refreshing the homepage for those who haven't moved away from it, and stop when someone does?
I am very new to JavaScript, so please be patient if I ask a lot of questions.
Thank you in advance for any help given.
I assume you are using a shared javascript file on all pages of the site which is why the timer will keep running on every page. You could make sure that the timer only runs on the homepage by checking the page url and wrap your startTimer function inside this check:
if (document.location.href == "http://www.yourhomepage.com"){
startTimer();
}
Replace http://www.yourhomepage.com with whatever url your homepage is on. This will only work if your pages are separate html files. If you are using a hashbang method whereby the document doesn't change, this will not work.
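Comparing the full href is brittle, since a trailing slash or a query string makes the strict comparison fail. One hedged alternative is to compare only the path. The function isHomePage and the '/' home path below are assumptions for illustration:

```javascript
// Sketch only: `loc` is a Location-like object with a `pathname` property.
function isHomePage(loc, homePath) {
  // Ignore trailing slashes so '/home' and '/home/' both match
  var normalize = function (p) {
    return p.replace(/\/+$/, '') || '/';
  };
  return normalize(loc.pathname) === normalize(homePath);
}

// Hypothetical usage in the browser:
// if (isHomePage(window.location, '/')) {
//   startTimer();
// }
```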
You can use Ajax to refresh the work log part of the page instead of refreshing the whole page.
When the timer fires, your code redirects you to the home page because of window.location.href = 'myURL';. The location changes, so you are sent to 'myURL' every time.
You want to refresh only part of your page. To do that, send an XMLHttpRequest (an Ajax request), which loads content into the current page without reloading it. https://developer.mozilla.org/fr/docs/XMLHttpRequest
When the response has loaded, insert the returned text into the page.
Then call the function that sends the request every refreshTime milliseconds, like this:
function sendAjax(){
// ... ajax request
// refreshTime = 3 * 60 * 1000;
setTimeout( sendAjax, refreshTime );
}
sendAjax();
Don't put quotes around the function name in setTimeout. setTimeout needs the function itself (a reference to it, not its name as a string) and a delay.
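To make the outline above concrete, here is a minimal sketch that fills in the request. The url and target parameters are hypothetical: url would be an endpoint returning the work log as an HTML fragment, and target the element that displays it.

```javascript
// Sketch only: fetches a fragment, inserts it, then schedules the next request.
function sendAjax(url, target, refreshTime) {
  var xhr = new XMLHttpRequest();
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) { return; } // not finished yet
    if (xhr.status === 200) {
      target.innerHTML = xhr.responseText; // update only this part of the page
    }
    // Schedule the next refresh whether or not this one succeeded
    setTimeout(function () {
      sendAjax(url, target, refreshTime);
    }, refreshTime);
  };
  xhr.open('GET', url, true);
  xhr.send();
}

// Hypothetical usage:
// sendAjax('/worklog', document.getElementById('worklog'), 3 * 60 * 1000);
```

Scheduling the next call from the completion handler, rather than with setInterval, avoids overlapping requests when a response is slow.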

Load a javascript/ajax call on click using phantomjs

I am trying to build a web scraper that downloads the HTML source after information is received from an Ajax call triggered by a click.
Simply put, I first download the webpage; clicking the Next button then loads a new set of images via an Ajax call, and I need to capture the HTML source after clicking Next.
The next click source looks something like this
Next Page
And on the same page is the javascript function nextpage which handles the ajax call.
Is there a way to do this using phantomjs? I am very new to phantomjs so let me know if anything is not clear.
Currently I am only able to load the contents from original webpage.
var page = require('webpage').create();
page.open('somewebpage', function (status) {
if (status !== 'success') {
console.log('Unable to access network');
} else {
var p = page.evaluate(function () {
return document.getElementsByTagName('html')[0].innerHTML
});
console.log(p);
}
phantom.exit();
});
Thanks
Try:
var content = page.evaluate(function () {
// keep the expression on the same line as return, otherwise undefined is returned
return (new XMLSerializer()).serializeToString(document);
});
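Serializing the document only captures whatever is loaded at that moment, so the Ajax call has to be triggered and given time to finish first. A minimal sketch under stated assumptions: nextpage() is the page's own function from the question, while clickNextAndCapture, delayMs and the fixed wait are hypothetical choices.

```javascript
// Sketch only: `page` is a PhantomJS WebPage that has finished loading.
function clickNextAndCapture(page, delayMs, done) {
  // Runs in the page context and starts the Ajax request
  page.evaluate(function () {
    nextpage();
  });

  // Fixed wait for the Ajax response (an assumption; tune it or poll instead)
  setTimeout(function () {
    var html = page.evaluate(function () {
      return document.documentElement.outerHTML;
    });
    done(html);
  }, delayMs);
}

// Hypothetical usage inside the success branch of page.open:
// clickNextAndCapture(page, 3000, function (html) {
//   console.log(html);
//   phantom.exit(); // exit only after the capture, not right after page.open
// });
```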

Phantomjs page.open multiple url's slows down

I have a set of URLs which I open with page.open(). After processing the contents, I call page.release() on the page and then recursively call the function to open the next page. The webpage has JavaScript on it, and I test for a condition that shows when the JavaScript has loaded its results. The first page.open() call loads the JS in 1 second, but all subsequent calls take about 6 seconds. I am using page.release(), the pages loaded aren't blank, and PhantomJS is not crashing. I am wondering why this is happening. I've also tried using page.close().
doAnalysis = function (i) {
var url = 'http://myurl.com';
page.open(url, function(status) {
//get html and process it
page.release();
doAnalysis(i++);
});
}
