How to parse iframe on a website with CasperJS - javascript

I want to obtain information from a website that contains iframes.
When I parse this website with CasperJS using this code:
casper.then(function() {
    casper.waitFor(function() {
        return this.withFrame('mainframe', function() {});
    }, function() {
        this.withFrame('mainframe', function() {
            console.log(this.echo(this.getHTML()));
        });
    });
});
My problem is that the result contains the content of only one iframe.
How can I obtain the content of every iframe present on the website?

CasperJS doesn't specifically provide a function to wait for an iframe to load. However, you can use the waitForResource() step function to wait for the iframe resource and then act on it.
casper.waitForResource(function check(res){
    var iframeSrc = this.getElementAttribute("iframe[name='mainframe']", "src");
    return res.url.indexOf(iframeSrc) > -1;
}, function(){
    // success
});
When the resource is received, then you can wait inside of the iframe for a specific selector in order to continue with your script as soon as the iframe is fully loaded:
casper.waitForResource(function check(res){
    var iframeSrc = this.getElementAttribute("iframe[name='mainframe']", "src");
    return res.url.indexOf(iframeSrc) > -1;
}).withFrame('mainframe', function(){
    this.waitForSelector("someSelectorForAnElement thatIsLoadedAtTheEnd", function(){
        this.echo(this.getHTML());
    });
});
casper.waitFor(function() {
return this.withFrame('mainframe', function() {});
}, function() {
This code doesn't wait at all. CasperJS supports a Fluent API, so that you can chain multiple step functions together like this:
casper.start(url)
.then(function(){...})
.wait(3000)
.then(function(){...})
.run();
This means that the result of withFrame() is the casper object which is evaluated to true for that check function. There is no waiting going on.
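To see why such a check passes immediately, here is a minimal sketch. It uses a hypothetical stand-in object (FakeCasper is an assumption made for illustration and is not part of CasperJS) that only mimics the fluent pattern of returning the object itself from a step function.

```javascript
// Hypothetical stand-in that mimics a fluent API; this is NOT CasperJS.
function FakeCasper() {
  this.steps = [];
}

// Like a fluent step function: schedule the callback and return the object itself.
FakeCasper.prototype.withFrame = function (name, callback) {
  this.steps.push({ frame: name, callback: callback });
  return this;
};

var fake = new FakeCasper();

// A check function written like the one in the question.
function check() {
  return fake.withFrame('mainframe', function () {});
}

var result = check();
var passesImmediately = Boolean(result); // any object is truthy
console.log(result === fake, passesImmediately);
```

Because the check returns an object rather than a boolean condition, a waitFor-style loop would treat it as satisfied on the first try.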
console.log(this.echo(this.getHTML()));
doesn't make sense, because casper.echo() already prints to the console. Use either
console.log(this.getHTML());
or
this.echo(this.getHTML());
but not both.

Related

Javascript - run function in iFrame after parent window has loaded

Is there a way to know if parent window has loaded from within iframe?
I want to run a function which is in iFrame but I need to run it after all the parent windows are loaded and the event listener will be inside the iframe.
I tried the following but it runs before parent windows are loaded.
window.addEventListener('load', function () {
    alert("It's loaded!")
});
One way would be to add the iframe dynamically:
parent:
window.addEventListener('load', function () {
    var iframe = document.createElement('iframe');
    iframe.src = "https://www.example.com";
    document.body.appendChild(iframe);
});
iframe:
window.addEventListener('load', function () {
    alert('hello world!');
    //doSomethingUseful();
});
This way, you could be certain that they will load in a specific order. However, as they'd be loading in series, the increase in total page load time could become noticeable.
Alternatively, you could use this approach. This one may not work as is, if one page happens to finish loading before the other. If you do opt for this approach, it may be necessary to communicate in both directions so that the first page to load finds out when the second page has loaded. That may look like this:
parent:
newEvent();
window.document.addEventListener('myCustomEventI', newEvent, false);
function newEvent() {
    var data = { loaded: true }
    var event = new CustomEvent('myCustomEventP', { detail: data })
    window.parent.document.dispatchEvent(event);
}
iframe:
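// Listen on the parent document, because that is where the parent dispatches its event,
// and register the listener before announcing that this page has loaded.
window.parent.document.addEventListener('myCustomEventP', handleEvent, false);
newEvent();
function newEvent() {
    var data = { loaded: true }
    var event = new CustomEvent('myCustomEventI', { detail: data })
    window.parent.document.dispatchEvent(event);
}
function handleEvent(e) {
    alert('both loaded!');
    //doSomethingUseful();
}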
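The two-way communication described above comes down to tracking which side has reported in, whatever the order. A minimal sketch of that bookkeeping follows; createLoadLatch and markLoaded are hypothetical names introduced here, not browser APIs, and in a real page each side would call markLoaded from its own load handler or on receiving the other side's event.

```javascript
// Hypothetical helper (not a browser API): calls onAllLoaded once every named
// party has reported that it finished loading, in whatever order that happens.
function createLoadLatch(names, onAllLoaded) {
  var pending = Object.create(null);
  var remaining = 0;
  var fired = false;
  names.forEach(function (name) {
    if (!pending[name]) {
      pending[name] = true;
      remaining += 1;
    }
  });
  return function markLoaded(name) {
    if (pending[name]) {
      pending[name] = false;
      remaining -= 1;
    }
    if (remaining === 0 && !fired) {
      fired = true;
      onAllLoaded();
    }
  };
}

var calls = 0;
var markLoaded = createLoadLatch(['parent', 'iframe'], function () {
  calls += 1; // stands in for doSomethingUseful()
});

markLoaded('iframe'); // the iframe happens to finish first
markLoaded('iframe'); // duplicate reports are ignored
markLoaded('parent'); // both sides are loaded now, so the callback runs once
console.log(calls);
```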

AJAX on Button Click runs incrementally

I've implemented a simple AJAX call that is bound to a button. On click, the call takes the value of an input box and forwards it to a FLASK server using getJSON. Using the supplied value (a URL), the server sends a request to a website, and the HTML of that website is sent back.
The issue is the AJAX call seems to run multiple times, incrementally depending on how many times it has been clicked.
example;
(click)
1
(click)
2
1
(click)
3
2
1
Because I am sending requests from a FLASK server to another website, it effectively looks like I'm trying to DDOS the server. Any idea how to fix this?
My AJAX code;
var requestNumber = 1; //done for testing purposes
//RUNS PROXY SCRIPT
$("#btnProxy").bind("click", function() //#btnProxy is the button
{
    $.getJSON("/background_process", //background_process is my FLASK route
        {txtAddress: $('input[name="Address"]').val(), //Address is the input box
        },
        console.log(++requestNumber), //increment on function call
        function(data)
        {$("#web_iframe").attr('srcdoc', data.result); //the FLASK route retrieves the html of a webpage and returns it in an iframe srcdoc.
        });
    return false;
});
My FLASK code (Though it probably isn't the cause)
@app.route('/background_process')
def background_process():
    address = None
    try:
        address = request.args.get("txtAddress")
        resp = requests.get(address)
        return jsonify(result=resp.text)
    except Exception as e:
        return(str(e))
Image of my tested output (I've suppressed the FLASK script)
https://snag.gy/bikCZj.jpg
One of the easiest things to do would be to disable the button after the first click and only enable it after the AJAX call is complete:
var btnProxy = $("#btnProxy");
//RUNS PROXY SCRIPT
btnProxy.bind("click", function () //#btnProxy is the button
{
    btnProxy.attr('disabled', 'disabled');//disable the button before the request
    $.getJSON("/background_process", //background_process is my FLASK route
        {
            txtAddress: $('input[name="Address"]').val(), //Address is the input box
        },
        function (data) {
            $("#web_iframe").attr('srcdoc', data.result); //the FLASK route retrieves the html of a webpage and returns it in an iframe srcdoc.
            btnProxy.attr('disabled', null);//enable button on success
        });
    return false;
});
You can try with preventDefault() and see if it fits your needs.
$("#btnProxy").bind("click", function(e) {
    e.preventDefault();
    $.getJSON("/background_process",
        {txtAddress: $('input[name="Address"]').val(),
        },
        console.log(++requestNumber),
        function(data)
        {$("#web_iframe").attr('srcdoc', data.result);
        });
    return false;
});
Probably you are binding the click event multiple times.
$("#btnProxy").bind("click", function() { ... } );
Possible alternative solutions:
a) Bind the click event only on document load:
$(function() {
    $("#btnProxy").bind("click", function() { ... } );
});
b) Use setTimeout and clearTimeout to filter multiple calls:
var to=null;
$("#btnProxy").bind("click", function() {
    if(to) clearTimeout(to);
    to=setTimeout(function() { ... },500);
});
c) Clear other bindings before setting your handler:
$("#btnProxy").off("click");
$("#btnProxy").bind("click", function() { ... } );
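The effect of repeated binding, and of option c, can be reproduced without a browser. The sketch below uses a hypothetical FakeButton object as a stand-in for the jQuery-wrapped button (it is an assumption for illustration, not jQuery), and sendRequest stands in for the $.getJSON call.

```javascript
// Hypothetical minimal event target standing in for a jQuery-wrapped button.
function FakeButton() {
  this.handlers = [];
}
FakeButton.prototype.bind = function (handler) {
  this.handlers.push(handler);
  return this;
};
FakeButton.prototype.off = function () {
  this.handlers = [];
  return this;
};
FakeButton.prototype.click = function () {
  this.handlers.forEach(function (handler) { handler(); });
  return this;
};

var requests = 0;
function sendRequest() { requests += 1; } // stands in for the $.getJSON call

// Binding inside code that runs three times leaves three handlers behind.
var leaky = new FakeButton();
for (var i = 0; i < 3; i += 1) {
  leaky.bind(function () { sendRequest(); });
}
leaky.click();
var requestsWithoutOff = requests;

// Clearing old bindings first keeps exactly one handler (option c).
requests = 0;
var clean = new FakeButton();
for (var j = 0; j < 3; j += 1) {
  clean.off().bind(function () { sendRequest(); });
}
clean.click();
var requestsWithOff = requests;

console.log(requestsWithoutOff, requestsWithOff);
```

If one click sends several requests, counting the handlers attached to the button is a quick way to confirm that the binding code itself runs more than once.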

ajaxComplete with getJSON causing loop

I am using ajaxComplete to run some functions after dynamic content is loaded to the DOM. I have two separate functions inside ajaxComplete which uses getJSON.
Running any of the functions once works fine.
Running any of them a second time causes a loop because they are using getJSON.
How do I get around this?
I'm attaching a small part of the code. If the user has voted, clicking the comments button will cause the comments box to open and close immediately.
$(document).ajaxComplete(function() {
    // Lets the user vote on a match
    $('.btn-vote').click(function() {
        ......
        $.getJSON(path + 'includes/ajax/update_votes.php', { id: gameID, vote: btnID }, function(data) {
            ......
        });
    });
    // Connects a match with a disqus thread
    $('.btn-comment').click(function() {
        var parent = $(this).parents('.main-table-drop'), comments = parent.next(".main-table-comment");
        if (comments.is(':hidden')) {
            comments.fadeIn();
        } else {
            comments.fadeOut();
        }
    });
});
I solved the problem by checking the URL of the AJAX request that loads the DOM content:
$(document).ajaxComplete(function(event, xhr, settings) {
    var url = settings.url, checkAjax = 'list_matches';
    if (url.indexOf(checkAjax) >= 0) { ... }
});
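The reason the URL check helps is that every completed getJSON call also fires ajaxComplete, so without a filter the click handlers get bound again after each vote, and a doubled comment handler opens and then closes the box. A minimal sketch of the filter as a standalone function follows; isListRequest is a hypothetical helper name, and the example URLs are made up apart from the file names that appear in the question and answer.

```javascript
// Hypothetical helper: decides whether a finished AJAX request is the one
// that loaded the match list. 'list_matches' is the marker used above.
function isListRequest(settings, marker) {
  return Boolean(settings) &&
    typeof settings.url === 'string' &&
    settings.url.indexOf(marker) >= 0;
}

// Made-up URLs for illustration only.
var listSettings = { url: 'includes/ajax/list_matches.php?page=2' };
var voteSettings = { url: 'includes/ajax/update_votes.php?id=7&vote=1' };

var rebindAfterList = isListRequest(listSettings, 'list_matches');
var rebindAfterVote = isListRequest(voteSettings, 'list_matches');
console.log(rebindAfterList, rebindAfterVote);
```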

casperjs - How to skip part of code on die?

How can I skip part of the code when die() is reached? The script has to jump to the next label (for example, they could all be named LABEL1).
casper.start('http://google.com');
casper.waitForSelector("input[name='q']",
    function success() {
        this.echo('Google.com page loaded');
    },
    function fail() {
        this.die('Google.com page WAS NOT loaded'); //meet die() function,
        //need to jump on LABEL1 without stopping script
    });
casper.then(function(){
    this.fillSelectors('body', {
        "input[name='q']": 'stackoverflow',
    }, true);
    this.echo('Filled form with search word - stackoverflow');
});
//here can be random number of casper steps
casper.then(function() {
    this.captureSelector("search_results.png", "html");
});
//steps, steps
casper.then(function() {
    this.echo("search_results.png");
});
//LABEL1
casper.thenOpen('http://wikipedia.org', function() {
    this.echo('HELLO');
});
casper.run();
I can't use suites for this purpose, because I have a custom casper module.
You could probably try emitting a custom event which would execute your required step instead of calling a die(). http://docs.casperjs.org/en/latest/events-filters.html#emitting-you-own-events
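As a rough illustration of the "skip ahead instead of dying" idea, here is a flag-based sketch. Note that this is a different technique from the custom event suggested above, and runSteps, skipTo and the label property are all hypothetical names; the runner only imitates a queue of steps and is not CasperJS.

```javascript
// Hypothetical mini step runner; it only imitates queued steps and is NOT CasperJS.
function runSteps(steps) {
  var log = [];
  var skipUntil = null; // label to jump to, or null when running normally
  var ctx = {
    log: function (msg) { log.push(msg); },
    skipTo: function (label) { skipUntil = label; }
  };
  steps.forEach(function (step) {
    if (skipUntil !== null) {
      if (step.label !== skipUntil) { return; } // still skipping
      skipUntil = null; // reached the label, resume normal execution
    }
    step.run(ctx);
  });
  return log;
}

var log = runSteps([
  { run: function (ctx) {
      ctx.log('Google.com page WAS NOT loaded');
      ctx.skipTo('LABEL1'); // instead of die()
    } },
  { run: function (ctx) { ctx.log('Filled form'); } },
  { run: function (ctx) { ctx.log('Captured results'); } },
  { label: 'LABEL1', run: function (ctx) { ctx.log('HELLO'); } }
]);
console.log(log);
```

In a real CasperJS script the same effect would need each intermediate step to check such a flag before doing its work, since the steps are queued ahead of time.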

automatic clicks on links and doing something with every page's DOM

I have some links in a web page. What I want to do:
Trigger a click event on every link
When the page behind each link has loaded, do something with that page's DOM (fillProducts here)
What I have tried:
function start(){
    $('.category a').each(function(i){
        $.when($(this).trigger('click')).done(function() {
            fillProducts() ;
        });
    })
}
Thanks
What you want to do is much more complicated than you seem to be giving it credit for. If you could scrape webpages, including AJAX content, in 7 lines of js in the console of a web browser you'd put Google out of business.
I'm guessing at what you want a bit, but I think you want to look at using a headless browser, e.g. PhantomJs. You'll then be able to scrape the target pages and write the results to a JSON file (other formats exist) and use that to fillProducts - whatever that does.
Also, are you stealing data from someone else's website? Cause that isn't cool.
Here's a solution that may work for you if they are sending their ajax requests using jQuery. If they aren't you're going to need to get devilishly hacky to accomplish what you're asking (eg overriding the XMLHttpRequest object and creating a global observer queue for ajax requests). As you haven't specified how they're sending the ajax request I hope this approach works for you.
$.ajaxSetup({
    complete: function(jQXHR) {
        if (interested) {
            //do your work
        }
    }
});
The code below will click a link, wait for the ajax request to be sent and completed, run your fillProducts function and then click the next link. Adapting it to run all the clicks wouldn't be difficult:
function start(){
    var links = $('.category a');
    var i = 0;
    var done = function() {
        $.ajaxSetup({
            complete: $.noop//remove your handler
        });
    }
    var clickNext = function() {
        $(links.get(i++)).click();//click current link then increment i
    }
    $.ajaxSetup({
        complete: function(jQXHR) {
            if(i < links.length) {
                fillProducts();
                clickNext();
            } else {
                done();
            }
        }
    });
    clickNext();
}
If this doesn't work for you try hooking into the other jqXHR events before hacking up the site too much.
Edit: here's a more reliable method in case they override the complete setting:
(function() {
    var $ajax = $.ajax;
    var $observer = $({});
    //observer pattern from addyosmani.com/resources/essentialjsdesignpatterns/book/#observerpatternjquery
    var obs = window.ajaxObserver = {
        subscribe: function() {
            $observer.on.apply($observer, arguments);
        },
        unsubscribe: function() {
            $observer.off.apply($observer, arguments);
        },
        once: function() {
            $observer.one.apply($observer, arguments);
        },
        publish: function() {
            $observer.trigger.apply($observer, arguments);
        }
    };
    $.ajax = function() {
        var $promise = $ajax.apply(null, arguments);
        obs.publish("start", $promise);
        return $promise;
    };
})();
Now you can hook into $.ajax calls via
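//the observer exposes subscribe/once rather than on/one; jQuery passes the event object first
ajaxObserver.subscribe("start", function(event, $xhr) {//whenever a $.ajax call is started
    $xhr.done(function(data) {
        //do stuff
    })
});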
So you can adapt the other snippet like
function start(){
    var links = $('.category a');
    var i = 0;
    var clickNextLink = function() {
        //use once() from the observer above; the handler receives the event object first
        ajaxObserver.once("start", function(event, $xhr) {
            $xhr.done(function(data) {
                if(i < links.length) {
                    fillProducts();
                    clickNextLink();
                } else {
                    done();
                }
            });
        })
        $(links.get(i++)).click();//click current link then increment i
    }
    clickNextLink();
}
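The pattern behind these snippets is "process one item, wait until it reports completion, then move to the next". A minimal browser-free sketch of that control flow follows; runSequentially and clickAndWait are hypothetical names, and the task here completes synchronously, whereas in a page it would call next() from the AJAX completion handler.

```javascript
// Hypothetical helper: runs task(item, next) for each item in order and
// only moves on when the task calls next(). Calls done() after the last one.
function runSequentially(items, task, done) {
  var index = 0;
  function next() {
    if (index >= items.length) {
      done();
      return;
    }
    var item = items[index];
    index += 1;
    task(item, next);
  }
  next();
}

var visited = [];
var finished = false;

// clickAndWait stands in for clicking a link and waiting for its AJAX call;
// here it completes synchronously so the sketch runs outside a browser.
function clickAndWait(link, next) {
  visited.push('filled products for ' + link); // stands in for fillProducts()
  next();
}

runSequentially(['a', 'b', 'c'], clickAndWait, function () {
  finished = true;
});
console.log(visited, finished);
```

Written this way, every item, including the last one, is handled before done() runs.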
try this:
function start(){
    $('.category a').each(function(i){
        $(this).click();
        fillProducts() ;
    })
}
I get you now. This is like saying:
When Facebook loads, I want to remove the adverts by targeting a specific class, and then alter the view that I actually see.
https://addons.mozilla.org/en-US/firefox/addon/greasemonkey/
Greasemonkey is a plugin for Firefox. It allows you to create a JavaScript file that can target a specific element or elements within the rendered HTML content.
In order to catch the AJAX request traffic, you just need to catch it within your console.
I cannot give you a tutorial on Greasemonkey, but you can get a Greasemonkey script for Facebook and use that as a guide.
http://mashable.com/2008/12/25/facebook-greasemonkey-scripts/
Hope this is it.
