Hiding and showing several iframes in one page every x seconds - javascript

What would be the most efficient way to show and hide iframes in a page every x seconds?
I was thinking of using setInterval with a function that calls jQuery's hide and show, but this seems inefficient and not very scalable if I needed to show 1 out of 10 iframes on a page and hide the rest.
$(document).ready(function(){
    setInterval(function() {
        // .is(":visible") returns a boolean; .not(":visible") returns a jQuery object, which is always truthy
        if ($('#basic').is(":visible") && !$('#advanced').is(":visible")) {
            $('#basic').hide();
            $('#advanced').show();
        } else if (!$('#basic').is(":visible") && $('#advanced').is(":visible")) {
            $('#basic').show();
            $('#advanced').hide();
        } else if ($('#basic').is(":visible") && $('#advanced').is(":visible")) {
            $('#basic').hide();
            $('#advanced').show();
        }
    }, 30000);
});
Each id refers to 1 iframe so right now I am only dealing with 2 iframes. The reason I have that last if else statement is because both iframes are being displayed when I load the page.

Just a snippet since the OP asked for it. Posting as an answer so I can format the code a bit better. (warning: I haven't 100% tested it myself yet, this isn't meant as a copy/paste implementation.)
This should show all the hidden frames and hide all the visible frames every 30 sec.
You can easily extend it to show or hide only specific nodes referenced by id, since frames[theFramesID] gives you the reference and the visible status. If you don't need to access specific frames, you can simplify this and use an array instead of an object.
Just using some form of loop and caching your nodes, instead of re-querying the same node over and over, will already improve scalability, since you don't need to change the code when you add another frame.
One could probably replace the vanilla functions I used with jQuery-specific ones if needed. jQuery collections have no reduce of their own, so the snippet converts the collection to a plain array with .toArray() first.
$(document).ready(function(){
    // Cache every iframe once, keyed by id, together with its current visibility
    var frames = $('iframe').toArray().reduce(function ( accumulator, frame ) {
        var $frame = $(frame); // wrap the DOM node so .is(), .hide() and .show() are available
        accumulator[frame.id] = {
            'reference' : $frame,
            'visible' : $frame.is(":visible")
        };
        return accumulator;
    }, {});
    setInterval(function() {
        // Flip the state of every cached frame
        Object.keys(frames).forEach(function ( id ) {
            var frame = frames[id];
            if (frame.visible) {
                frame.visible = false;
                frame.reference.hide();
            }
            else {
                frame.visible = true;
                frame.reference.show();
            }
        });
    }, 30000);
});
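For the "show 1 out of 10 iframes" case mentioned in the question, a single index that advances on every tick may be simpler than tracking visibility per frame. The following is only a minimal sketch, not tested against a real page. createRotator is a hypothetical helper name, and each item is merely assumed to expose show() and hide(), as jQuery objects do.

```javascript
// Hypothetical helper: keeps exactly one item visible and advances on each call to next().
// `items` is assumed to be an array of objects exposing show() and hide().
function createRotator(items) {
    var current = 0;
    // Initial state: show the first item, hide all the others.
    items.forEach(function (item, i) {
        if (i === current) { item.show(); } else { item.hide(); }
    });
    return {
        next: function () {
            if (items.length === 0) { return -1; }
            items[current].hide();
            current = (current + 1) % items.length; // wrap around after the last item
            items[current].show();
            return current; // index of the item that is visible now
        }
    };
}

// Possible browser usage with jQuery (assumption: every iframe on the page takes part):
// $(document).ready(function () {
//     var frames = $('iframe').toArray().map(function (el) { return $(el); });
//     var rotator = createRotator(frames);
//     setInterval(rotator.next, 30000);
// });
```

Adding an eleventh iframe then requires no code change, because the rotator only depends on the length of the array.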

Related

How to scroll down with Phantomjs to load dynamic content

I am trying to scrape links from a page that generates content dynamically as the user scrolls down to the bottom (infinite scrolling). I have tried different things with PhantomJS but am not able to gather links beyond the first page. Let's say the element at the bottom which loads content has the class .has-more-items. It is available until the final content is loaded while scrolling and then becomes unavailable in the DOM (display:none). Here are the things I have tried:
Setting viewportSize to a large height right after var page = require('webpage').create();
page.viewportSize = { width: 1600, height: 10000 };
Using page.scrollPosition = { top: 10000, left: 0 } inside page.open, which has no effect:
page.open('http://example.com/?q=houston', function(status) {
if (status == "success") {
page.scrollPosition = { top: 10000, left: 0 };
}
});
I also tried putting it inside the page.evaluate function, but that gives
Reference error: Can't find variable page
I tried using jQuery and plain JS code inside page.evaluate and page.open, but to no avail:
$("html, body").animate({ scrollTop: $(document).height() }, 10,
function() {
//console.log('check for execution');
});
as it is and also inside document.ready. Similarly for the plain JS code:
window.scrollBy(0,10000)
as it is and also inside window.onload.
I have been stuck on this for 2 days now and am not able to find a way. Any help or hint would be appreciated.
Update
I have found a helpful piece of code at https://groups.google.com/forum/?fromgroups=#!topic/phantomjs/8LrWRW8ZrA0
var hitRockBottom = false;
while (!hitRockBottom) {
    // Scroll the page (not sure if this is the best way to do so...)
    page.scrollPosition = { top: page.scrollPosition + 1000, left: 0 };
    // Check if we've hit the bottom
    hitRockBottom = page.evaluate(function() {
        return document.querySelector(".has-more-items") === null;
    });
}
Here .has-more-items is the class of the element I want to access. It is initially at the bottom of the page; as we scroll down, it moves further down until all data is loaded, and then it becomes unavailable.
However, when I tested it, it clearly runs into an infinite loop without scrolling down (I render pictures to check). I have also tried replacing page.scrollPosition = { top: page.scrollPosition + 1000, left: 0 }; with each of the lines below (one at a time):
window.document.body.scrollTop = '1000';
location.href = ".has-more-items";
page.scrollPosition = { top: page.scrollPosition + 1000, left: 0 };
document.location.href=".has-more-items";
But nothing seems to work.
Found a way to do it and tried to adapt it to your situation. I didn't test the best way of finding the bottom of the page because I had a different context, but check the solution below. The thing here is that you have to wait a little for the page to load, and JavaScript works asynchronously, so you have to use setInterval or setTimeout to achieve this.
page.open('http://example.com/?q=houston', function () {
    // Check for the bottom div and scroll down from time to time
    window.setInterval(function() {
        // Check whether an element with class "has-more-items" is still in the DOM.
        // According to the question, it is present while more content can be loaded.
        var hasMore = page.evaluate(function() {
            return document.querySelector('.has-more-items') !== null;
        });
        if (hasMore) { // More content can still be loaded
            page.evaluate(function() {
                // Scroll to the bottom of page
                window.document.body.scrollTop = document.body.scrollHeight;
            });
        }
        else { // Element is gone: everything has been loaded
            // Do what you want
            // ...
            phantom.exit();
        }
    }, 500); // Number of milliseconds to wait between scrolls
});
I know that it has been answered a long time ago, but I also found a solution to my specific scenario. The result is a piece of javascript that scrolls to the bottom of the page. It is optimized to reduce waiting time.
It is not written for PhantomJS by default, so that will have to be modified. However, for a beginner or someone who doesn't have root access, an Iframe with injected javascript (run Google Chrome with --disable-javascript parameter) is a good alternative method for scraping a smaller set of ajax pages. The main benefit is that it's easily debuggable, because you have a visual overview of what's going on with your scraper.
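function ScrollForAjax (iframeselector) {
    // These variables are implicit globals; scrolltime has to persist between calls.
    scrollintervals = 50;   // delay between two checks, in milliseconds
    scrollmaxtime = 1000;   // stop after this many milliseconds without growth
    if(typeof(scrolltime)=="undefined"){
        scrolltime = 0;
    }
    // Height before scrolling
    scrolldocheight1 = $(iframeselector).contents().find("body").height();
    $("body").scrollTop(scrolldocheight1);
    setTimeout(function(){
        // Height after scrolling and waiting
        scrolldocheight2 = $("body").height();
        if(scrolltime>=scrollmaxtime){
            // Timed out: assume everything is loaded and start scraping
            scrolltime = 0;
            $("body").scrollTop(0);
            ScrapeCurrentPage(iframeselector);
        }
        else if(scrolldocheight2>scrolldocheight1){
            // The page grew: reset the timer and scroll again
            scrolltime = 0;
            ScrollForAjax (iframeselector);
        }
        else if(scrolldocheight1>=scrolldocheight2){
            // No growth yet: keep waiting
            ScrollForAjax (iframeselector);
        }
    },scrollintervals);
    scrolltime += scrollintervals;
}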
scrollmaxtime is a timeout variable. Hope this is useful to someone :)
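The same idea (keep scrolling until the height stops growing for a while) can be written without globals. This is a rough sketch under stated assumptions, not PhantomJS-specific code: scrollUntilStable and its option names are hypothetical, and getHeight, scrollToBottom and done are callbacks you would have to supply for your own environment.

```javascript
// Hypothetical helper: scrolls repeatedly until the measured height has not grown
// for `maxIdleMs` milliseconds, then calls `done` with the final height.
// Assumed options: getHeight(), scrollToBottom(), done(height),
// plus optional intervalMs, maxIdleMs and schedule (defaults to setTimeout).
function scrollUntilStable(opts) {
    var intervalMs = opts.intervalMs || 50;
    var maxIdleMs = opts.maxIdleMs || 1000;
    var schedule = opts.schedule || setTimeout;
    var idleMs = 0;
    var lastHeight = opts.getHeight();

    function tick() {
        opts.scrollToBottom();
        schedule(function () {
            var height = opts.getHeight();
            if (height > lastHeight) {
                // New content arrived: remember the height and reset the idle timer.
                lastHeight = height;
                idleMs = 0;
            } else {
                idleMs += intervalMs;
            }
            if (idleMs >= maxIdleMs) {
                opts.done(lastHeight);
            } else {
                tick();
            }
        }, intervalMs);
    }

    tick();
}
```

Here maxIdleMs plays the role of scrollmaxtime: it bounds how long the loop waits for new content before giving up.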
The "correct" solution didn't work for me. And, from what I've read, CasperJS doesn't use window (but I may be wrong on that), which makes me doubt that window works.
The following works for me in the Firefox/Chrome console but doesn't work in CasperJS (within the casper.evaluate function).
$(document).scrollTop($(document).height());
What did work for me in CasperJS was:
casper.scrollToBottom();
casper.wait(1000, function waitCb() {
casper.capture("loadedContent.png");
});
This also worked when moving casper.capture into Casper's then function.
However, the above solution won't work on some sites like Twitter; jQuery seems to break the casper.scrollToBottom() function, and I had to remove the clientScripts reference to jQuery when working within Twitter.
var casper = require('casper').create({
clientScripts: [
// 'jquery.js'
]
});
Some websites (e.g. BoingBoing.net) seem to work fine with jQuery and CasperJS scrollToBottom(). Not sure why some sites work and others don't.
The code snippet below works just fine for Pinterest. I researched a lot on how to scrape Pinterest without PhantomJS, but it is impossible to find the infinite scroll trigger link. I think the code below will help with scraping other infinite-scroll web pages.
page.open(pageUrl).then(function (status) {
    var count = 0;
    // Scrolls to the bottom of page
    function scroll2btm() {
        if (count < 500) {
            page.evaluate(function(limit) {
                window.scrollTo(0, document.body.scrollHeight || document.documentElement.scrollHeight);
                return document.getElementsByClassName('pinWrapper').length; // use desired contents (eg. pin) selector for count presence number
            }).then(function(c) {
                count = c;
                console.log(count); // print no of content found to check
            });
            setTimeout(scroll2btm,3000);
        } else {
            // required number of item found
        }
    }
    scroll2btm();
});

I need a new way to detect if there has been a change to an element's HTML

Right now I'm trying to find a way to detect when an element's HTML has changed.
I'm currently trying:
var a, b;
setInterval(function() {
    a = $('#chat').text();
}, 150);
setInterval(function() {
    b = $('#chat').text();
    if (a !== b) {
        alert("There has been a new message.");
    }
}, 200);
What I do is read the text of #chat every 150 milliseconds, read it again every 200 milliseconds, and then check whether variable a differs from variable b. In the future I will do something with that, but for now I just alert something.
You can see it live here: http://jsfiddle.net/MT47W/
Obviously this way is not working and is not very accurate at all.
Is there a better/different way to achieve this?
Thanks for any help. I've been trying to figure out a better way to do this for about a week now, but I just can't find a fix, and I'm hoping I posted this problem in the right place.
Use a var to store the element's current text, then check against it in a setInterval and update the var to store the current text after checking:
var a = $('#chat').text();
setInterval(function() {
if (a !== $('#chat').text()) { //checks the stored text against the current
alert("There has been a new message."); //do your stuff
}
a = $('#chat').text(); //updates the global var to store the current text
}, 150); //define your interval time, every 0.15 seconds in this case
You may as well store the value in the .data() of the element to avoid using globals.
Example using .data():
$('#chat').data('curr_text', $('#chat').text());
setInterval(function() {
if ($('#chat').data('curr_text') !== $('#chat').text()) {
alert("There has been a new message.");
}
$('#chat').data('curr_text', $('#chat').text());
}, 150);
Another approach, to save the client's memory: you can just store the number of child divs your #chat element has:
$('#chat').data('n_msgs', $('#chat').children().length);
setInterval(function() {
if ($('#chat').data('n_msgs') !== $('#chat').children().length) {
alert("There has been a new message.");
}
$('#chat').data('n_msgs', $('#chat').children().length);
}, 150);
EDIT: Here's my very final addition, with a DOM mutation event listener:
$('#chat').on('DOMNodeInserted', function() {
alert("There has been a new message.");
});
(Not tested in IE < 8.)
Note: As noted in the comments, even though mutation events are still supported they're classed as deprecated by W3C due to the performance loss and some incompatibilities across different browsers, therefore it's suggested to use one of the solutions above and only use DOM Mutation events when there's no other way around.
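The usual replacement for the deprecated mutation events is the MutationObserver API. Below is a minimal sketch, assuming new messages are appended as direct children of #chat. watchForNewNodes is a hypothetical helper name, and its third parameter exists only so that the sketch can be exercised outside a browser.

```javascript
// Hypothetical helper: calls onAdded(addedNodes) whenever child nodes are added to `target`.
// `ObserverCtor` is optional; in a browser the built-in MutationObserver is used.
function watchForNewNodes(target, onAdded, ObserverCtor) {
    var Ctor = ObserverCtor || MutationObserver;
    var observer = new Ctor(function (mutations) {
        mutations.forEach(function (mutation) {
            if (mutation.type === 'childList' && mutation.addedNodes.length > 0) {
                onAdded(mutation.addedNodes);
            }
        });
    });
    // childList: true reports nodes added to or removed from `target` itself.
    observer.observe(target, { childList: true });
    return observer; // call observer.disconnect() to stop watching
}

// Possible browser usage:
// watchForNewNodes(document.getElementById('chat'), function () {
//     alert("There has been a new message.");
// });
```

As with the DOMNodeInserted version, no timing loop is needed. If messages are nested deeper inside #chat, the observe options would need subtree: true as well.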
Just checking the last chat would improve efficiency and also do what you want. The only way that it would not work is if the same person sends the same message twice - which most likely will not happen.
I hope this would work:
var lastMessage = $('#chat .body').last().text();
function checkMessages(){
var newLastMessage = $('#chat .body').last().text();
if(lastMessage !== newLastMessage && $('#chat .body').last().length > 0){
//message has changed
alert("There has been a new message.");
lastMessage = $('#chat .body').last().text();
}
setTimeout(function(){checkMessages();},1000);
}
checkMessages();
Use crc32 or md5 hashing to check whether data has changed. Just get the html content of the div that you want to check, hash it as a string with either crc32 or md5 and you'll get a string that will represent that content. Use a hidden timestamp etc to be sure that multiple messages by one user get a different hash. If you do this in a setInterval callback, you should be good.
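To illustrate the hashing idea, here is a small sketch. Note the swap: instead of crc32 or md5 it uses a simple 32-bit FNV-1a hash, only because that fits in a few lines without a library. fnv1a and createChangeDetector are hypothetical names. The hash works on UTF-16 code units, so for non-ASCII text its values will differ from byte-based FNV-1a implementations.

```javascript
// 32-bit FNV-1a over the string's UTF-16 code units (stand-in for crc32/md5).
function fnv1a(str) {
    var hash = 0x811c9dc5; // FNV offset basis
    for (var i = 0; i < str.length; i++) {
        hash ^= str.charCodeAt(i);
        hash = Math.imul(hash, 0x01000193); // multiply by the FNV prime, keeping 32 bits
    }
    return hash >>> 0; // convert to an unsigned 32-bit number
}

// Hypothetical helper: returns a function that reports whether the content
// returned by getContent() has changed since the previous call.
function createChangeDetector(getContent) {
    var lastHash = fnv1a(getContent());
    return function hasChanged() {
        var hash = fnv1a(getContent());
        var changed = hash !== lastHash;
        lastHash = hash;
        return changed;
    };
}

// Possible browser usage (assumption: #chat holds the messages):
// var chatChanged = createChangeDetector(function () { return $('#chat').html(); });
// setInterval(function () {
//     if (chatChanged()) { alert("There has been a new message."); }
// }, 150);
```

Only one number is kept between checks instead of the full content. Two different contents can in principle produce the same hash, so a change could occasionally be missed.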
Although it is not highly recommended, it is also possible to use Paul Irish's duck punching method to tie into jQuery's append function, if you know for sure that the append function is how content is being added:
(function($) {
// store original reference to the method
var _old = $.fn.append;
$.fn.append = function(arg) {
// append as usual
_old.apply(this, arguments);
// do your own checking here
// "this" is a jQuery object
if (this[0].id === "chat") {
alert('there is a new message!');
}
return this;
};
})(jQuery);
With this method, no timing loops are needed.

Copy content from <div> into a form field every time it changes using javascript

What I need to do is copy the content of a div with id #logo to a form field with id #input_2_15.
The content of the div is an image (<img src.../>), but this changes. I have the code to copy the content to the input field when the page loads, but I need code that copies the content every time the image changes (and it does so without refreshing the page). How can I do this?
Also, is it possible to get the function to copy only the image name, e.g. 12345.png, rather than the whole <img src=..../>?
Miro
You mean
function getUrl(id) {
// function to return the source of an image inside an object with given ID
return $("#"+id).find("img").attr("src");
}
$(document).ready(function() {
var currentImage = getUrl("logo"); // get the url of the div now (empty I guess)
$("#ajaxtrigger").click(function() { // some link with ID ajaxtrigger
$("#logo").load("someurlreturningsomehmtl",function(){ // loads the image
if (currentImage != getUrl("logo")) { // did it change?
currentImage = getUrl("logo"); // save the name
$("#input_2_15").val(currentImage); // update the field
}
})
});
});
$('#logo').find('img').attr('src');
This will give you the image location. Now just append this to your img src in the second div.
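For the second part of the question (copying only the image name, e.g. 12345.png), the src string can be cut at the last slash. A minimal sketch; fileNameFromSrc is a hypothetical helper name, and it assumes the file name is the last path segment of the URL.

```javascript
// Hypothetical helper: returns the last path segment of an image URL.
function fileNameFromSrc(src) {
    if (!src) { return ''; }
    // Drop any fragment or query string, then keep what follows the last slash.
    var path = src.split('#')[0].split('?')[0];
    return path.substring(path.lastIndexOf('/') + 1);
}

// Possible browser usage with the ids from the question:
// $("#input_2_15").val(fileNameFromSrc($('#logo').find('img').attr('src')));
```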
I see this as a case for a "publish and subscribe" ("Pub/Sub") approach.
In this article, we learn four ways to do Pub/Sub with jQuery 1.7, and I chose Option 1 which exploits a new feature of jquery 1.7, namely its $.Callbacks feature. The article gives a good understanding, which I will not try to better here.
The code below is a slightly modified version of Option 1, avoiding need for a global var:
$.Topic = function(id) {
var callbacks, topic = id && $.Topic.topics[id];
if (!topic) {
callbacks = $.Callbacks();
topic = {
publish: callbacks.fire,
subscribe: callbacks.add,
unsubscribe: callbacks.remove
};
if (id) {
$.Topic.topics[id] = topic;
}
}
return topic;
};
$.Topic.topics = {};//avoid global var by making `topics` a property of the static function `jQuery.Topic`.
$(function() {
//A function to change the logo and fire a publisher.
function changeLogoSrc(src) {
$("#logo img").attr('src', src);
$.Topic('logoSrcChanged').publish(src);
}
// A subscriber which listenes for the 'logoSrcChanged' publisher
// and responds by writing the src string to the required form field
$.Topic('logoSrcChanged').subscribe(function(src) {
$("#input_2_15").val(src);
});
});
Thus the code for changing the logo and for updating the form field are, to use the correct jargon, effectively "decoupled".
This approach is arguably overblown for something simple, but would be useful in a more extensive environment where many pub/subs are required.

How to check the status each 10 seconds with JavaScript?

I have a server (mysite.com/status), which returns number of active tasks (just a single integer).
How can I check number of active tasks each 10 seconds with JavaScript and show user something like:
Number of remaining tasks: XXX
And, if number of tasks is 0, then I should load another page.
Make a function set a new timeout calling itself.
function checkTasks(){
// Make AJAX request
setTimeout(checkTasks, 10000);
}
checkTasks(); // Start task checking
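The "Make AJAX request" placeholder could be filled in roughly as follows. This is a sketch under assumptions, not a drop-in implementation: pollTasks and its parameters are hypothetical names, fetchStatus(callback) is assumed to call back with the response text, and a response that is not a number is simply passed on as NaN.

```javascript
// Hypothetical helper: polls until the server reports 0 remaining tasks.
// The next request is only scheduled after the previous response has arrived.
// `schedule` is optional and defaults to setTimeout.
function pollTasks(fetchStatus, onUpdate, onDone, intervalMs, schedule) {
    var later = schedule || setTimeout;
    function check() {
        fetchStatus(function (text) {
            var remaining = parseInt(text, 10);
            if (remaining === 0) {
                onDone(); // nothing left: stop polling
            } else {
                onUpdate(remaining);
                later(check, intervalMs);
            }
        });
    }
    check();
}

// Possible browser usage with jQuery (the '/status' URL and the #status element are assumptions):
// pollTasks(
//     function (cb) { $.get('/status', cb); },
//     function (n) { $('#status').text('Number of remaining tasks: ' + n); },
//     function () { window.location = '/another/page'; },
//     10000
// );
```

Compared with setInterval, rescheduling from inside the response handler means slow responses cannot pile up.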
with jQuery for AJAX functions... (untested code)
setInterval(function(){
$.get('http://example.com/status',function(d){
// where there is html element with id 'status' to contain message
$('#status').text('Number of remaining tasks: '+d)
if(d == 0){
window.location = '/another/page'
}
})
},10000)
I have thought of a different approach, without any AJAX. As you just need to show simple plain data, just use an <iframe> to show that dynamic data, then "reload" the frame every 10 seconds with simple JS.
HTML would be:
Number of remaining tasks: <iframe id="myFrame" src="mysite.com/status"></iframe>
And the JavaScript:
window.onload = function() {
window.setTimeout(ReloadTime, 10000);
};
function ReloadTime() {
var oFrame = document.getElementById("myFrame");
oFrame.src = oFrame.src;
window.setTimeout(ReloadTime, 10000);
}
Live test case, using time for the sake of example.
With some CSS you can make the frame look like part of the text, just set fixed width and height and remove the borders.
// elem is the element node where you want to display the message
// (the id 'status' is an assumption; use your own element)
var elem = document.getElementById('status');
setInterval(function(){
    var status = checkStatus(); // supposing that the function which returns status is called checkStatus
    if (status == 0) {
        window.location = '/another/page';
    }
    else {
        elem.innerHTML = "Number of remaining tasks: " + status;
    }
}, 10000);
Using the JavaScript library jQuery, you can set up a repeating loop that gets the data from a page and then sets the inner HTML of an element.
http://jsfiddle.net/cY6wX/14/
This code is untested
edit: demonstration updated
Also I did not use the jquery selector for setting the value in case you do not want to use jquery.

Navigating / scraping hashbang links with javascript (phantomjs)

I'm trying to download the HTML of a website that is almost entirely generated by JavaScript. So, I need to simulate browser access and have been playing around with PhantomJS. Problem is, the site uses hashbang URLs and I can't seem to get PhantomJS to process the hashbang -- it just keeps calling up the homepage.
The site is http://www.regulations.gov. The default takes you to #!home. I've tried using the following code (from here) to try and process different hashbangs.
if (phantom.state.length === 0) {
if (phantom.args.length === 0) {
console.log('Usage: loadreg_1.js <some hash>');
phantom.exit();
}
var address = 'http://www.regulations.gov/';
console.log(address);
phantom.state = Date.now().toString();
phantom.open(address);
} else {
var hash = phantom.args[0];
document.location = hash;
console.log(document.location.hash);
var elapsed = Date.now() - new Date().setTime(phantom.state);
if (phantom.loadStatus === 'success') {
if (!first_time) {
var first_time = true;
if (!document.addEventListener) {
console.log('Not SUPPORTED!');
}
phantom.render('result.png');
var markup = document.documentElement.innerHTML;
console.log(markup);
phantom.exit();
}
} else {
console.log('FAIL to load the address');
phantom.exit();
}
}
This code produces the correct hashbang (for instance, I can set the hash to '#!contactus'), but it doesn't dynamically generate any different HTML, just the default page. It does, however, correctly output that hash when I call document.location.hash.
I've also tried to set the initial address to the hashbang, but then the script just hangs and doesn't do anything. For example, if I set the url to http://www.regulations.gov/#!searchResults;rpp=10;po=0 the script just hangs after printing the address to the terminal and nothing ever happens.
The issue here is that the content of the page loads asynchronously, but you're expecting it to be available as soon as the page is loaded.
In order to scrape a page that loads content asynchronously, you need to wait to scrape until the content you're interested in has been loaded. Depending on the page, there might be different ways of checking, but the easiest is just to check at regular intervals for something you expect to see, until you find it.
The trick here is figuring out what to look for - you need something that won't be present on the page until your desired content has been loaded. In this case, the easiest option I found for top-level pages is to manually input the H1 tags you expect to see on each page, keying them to the hash:
var titleMap = {
'#!contactUs': 'Contact Us',
'#!aboutUs': 'About Us'
// etc for the other pages
};
Then in your success block, you can set a recurring timeout to look for the title you want in an h1 tag. When it shows up, you know you can render the page:
if (phantom.loadStatus === 'success') {
    // poll every 300 milliseconds
    var timeoutId = window.setInterval(function () {
        // check for title element you expect to see
        var h1s = document.querySelectorAll('h1');
        if (h1s.length === 0) {
            console.log('No H1 tags found.');
            return;
        }
        var found = false;
        // h1s is a node list, not an array, hence the
        // weird syntax here
        Array.prototype.forEach.call(h1s, function(h1) {
            if (!found && h1.textContent.trim() === titleMap[hash]) {
                // we found it!
                found = true;
                console.log('Found H1: ' + h1.textContent.trim());
                phantom.render('result.png');
                console.log("Rendered image.");
                // stop the cycle
                window.clearInterval(timeoutId);
                phantom.exit();
            }
        });
        if (!found) {
            console.log('Found H1 tags, but not ' + titleMap[hash]);
        }
    }, 300);
}
The above code works for me. But it won't work if you need to scrape search results - you'll need to figure out an identifying element or bit of text that you can look for without having to know the title ahead of time.
Edit: Also, it looks like the newest version of PhantomJS now triggers an onResourceReceived event when it gets new data. I haven't looked into this, but you might be able to bind a listener to this event to achieve the same effect.
