PhantomJS executing JavaScript in a popup for data extraction

So I have a web page with some photos of people. When you click on a photo of the person the JavaScript is executed and produces a popup with some more detailed information such as a description etc.
The link for each photo is as follows:
First I want to start with the basics: extracting the description etc. for one single person. So I want to execute the JavaScript above to open the popup window, and once I'm on the popup window, extract the content of the divs in it.
I've looked at PhantomJS and I really don't know where to start. I've used Cheerio to get some simple information from the page, and I want to move on to executing the popup window through JS and then extracting data from that.
Any help would be appreciated as I'm a bit of a newbie to screen scraping in general.

You can do this analogously to how CasperJS does it.
var page = require('webpage').create();
page.open(yourUrl, function (status) {
    if (status !== 'success') {
        console.log('Failed to load ' + yourUrl);
        phantom.exit(1);
        return;
    }
    // the main page is loaded, so every page created from now on could be a popup
    page.onPageCreated = function onPageCreated(popupPage) {
        popupPage.onLoadFinished = function onLoadFinished() {
            popupPage.evaluate(function () {
                // do something in popup page context like extracting data
            });
        };
    };
    // click to trigger the popup
    page.evaluate(function () {
        document.querySelector("a.seeMore").click();
        // or something from here: http://stackoverflow.com/questions/15739263/phantomjs-click-an-element
    });
});
Do not forget to overwrite/nullify page.onPageCreated before navigating away from the main page.

Related

How do I link to an internal page and click on an element on the linked page with vanilla JS?

I can't figure out how this is done:
I want to link to an internal page with an addEventlistener function. When the linked page loads I want to open a div that is hidden. So something along the line of:
button.addEventListener("click", () => {
    const hiddenDiv = document.getElementById("hidden-div")
    window.open("/newPage.html" + hiddenDiv.click())
});
I know the code above won't work, but I don't know how it could work, since window.open() ends the execution of the code.
Is there a way to keep the function around after the new page loads (without localStorage)? Is there a way to handle it via the URL? Or should it be done with localStorage?
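One way to handle it via the URL is to put the id of the element in the fragment and let the linked page act on it when it loads. The sketch below is only an illustration of that idea: the "open" key, the "go-button" id and the two helper names are made up for the example, and it assumes the div is hidden with the hidden attribute and that the same script is included on both pages.

```javascript
// Build a link that carries the id of the element to reveal in the URL fragment.
function buildLink(path, elementId) {
  return path + "#open=" + encodeURIComponent(elementId);
}

// Read the id back out of a fragment such as "#open=hidden-div"; null if absent.
function parseOpenTarget(hash) {
  var match = /^#open=(.+)$/.exec(hash || "");
  return match ? decodeURIComponent(match[1]) : null;
}

// Browser-only wiring; skipped when there is no DOM (for example under Node).
if (typeof document !== "undefined") {
  // On the linking page: navigate instead of trying to click across pages.
  var button = document.getElementById("go-button"); // assumed id
  if (button) {
    button.addEventListener("click", function () {
      window.location.href = buildLink("/newPage.html", "hidden-div");
    });
  }
  // On the linked page: reveal the requested element once the DOM is ready.
  document.addEventListener("DOMContentLoaded", function () {
    var id = parseOpenTarget(window.location.hash);
    var target = id && document.getElementById(id);
    if (target) {
      target.hidden = false; // assumes the div uses the `hidden` attribute
    }
  });
}
```

If the div is opened by its own click handler rather than by the hidden attribute, calling target.click() in place of target.hidden = false should have the same effect.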

Stop Safari extension from caching messages

I'm having problems creating a Safari extension, and it has been causing me headaches.
The problem is this: I have created an extension that gets images from a web page using an injected script. I have a function in the popover that displays the web images and allows the user to click and send the selected image to a backend. This all goes through the global page which handles signals and messages. The scenario is this:
When the popover opens, it sends a signal to the global page to initiate the response (image URLs) from the injected script.
When the global page gets the message from the injected script, it calls the function in the popover passing in the data from the injected script as an argument.
The popover shows all images returned from the global page via the injected script.
The problem is that every time I open the popover, it appends the images from the last call instead of giving a fresh list of images. For instance, on first popover open, I get one image (assuming the page has only one image). If I close the popover and open it again, I get two of the same image. If I close and open the popover a third time, it appends the first image with the one from the second time and gives me 3 of the same image. On the fourth open of the popover I get 1 + 1 + 1 + 1, so 4 images. So it seems to be appending the messages and not giving me a fresh message every time.
My question is: how can I destroy the messages that are being cached after each popover closes? I hope I am being clear. Perhaps something else is happening with my code that I am not aware of. Please help if you can. Here is my code from the global HTML:
function popoverHandler(event) {
    // check for popover opening
    if (event.target.identifier === "MyPopUp") {
        // send message to injected script to send page info
        safari.application.activeBrowserWindow.activeTab.page.dispatchMessage("getContent", '', false);
        // this works fine, I get this message every time popover opens
        console.log('getContent message sent');
        // listen for message containing page info from injected script
        safari.application.addEventListener('message', function (messageEvent) {
            // only get message from current tab
            if (messageEvent.name === "pageInfo" && messageEvent.message.url === safari.application.activeBrowserWindow.activeTab.url) {
                pageInfo = messageEvent.message;
                // the problem seems to be in here. Every time I open the popover,
                // I get the current page info plus all the page info messages from
                // the previous time I opened the popover, all duplicates of the
                // previous messages
                console.log(pageInfo);
                // call a function in the popover, passing the pageInfo data
                // received from the injected script
                safari.extension.popovers[0].contentWindow.onPageDetailsReceived(pageInfo);
            }
        });
    }
}
Ok, so I was able to solve the problem. First of all, I separated the popoverHandler from the eventListener. For some reason, it was firing the function too many times and returning several lists of the same images. The major issue, however, was that in the popover.js, I was storing the list of images as a var. When I removed the var, the data stopped persisting and I was getting a fresh list every time.
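A likely reason the function fired too many times is that the original popoverHandler added a new anonymous message listener on every popover open, so the number of listeners grew by one each time. The sketch below reproduces that effect outside Safari. It uses a plain EventTarget as a stand-in for safari.application, so it is an analogy rather than extension code, and all names in it are made up for the example.

```javascript
// `buggyApp` and `fixedApp` stand in for safari.application (an assumption).
var buggyApp = new EventTarget();
var fixedApp = new EventTarget();
var buggyReceived = [];
var fixedReceived = [];

// Buggy shape: a fresh anonymous listener is added on every popover open.
function openPopoverBuggy() {
  buggyApp.addEventListener("message", function (event) {
    buggyReceived.push(event.type);
  });
}

// Fixed shape: the listener is registered once, outside the popover handler.
fixedApp.addEventListener("message", function (event) {
  fixedReceived.push(event.type);
});
function openPopoverFixed() {
  // only ask the injected script for data here; no listener registration
}

// Open the popover three times, with one message arriving after each open.
var buggyCounts = [];
var fixedCounts = [];
for (var i = 0; i < 3; i++) {
  buggyReceived.length = 0;
  fixedReceived.length = 0;
  openPopoverBuggy();
  openPopoverFixed();
  buggyApp.dispatchEvent(new Event("message"));
  fixedApp.dispatchEvent(new Event("message"));
  buggyCounts.push(buggyReceived.length);
  fixedCounts.push(fixedReceived.length);
}
// buggyCounts is [1, 2, 3]; fixedCounts is [1, 1, 1]
console.log(buggyCounts, fixedCounts);
```

The growing counts in the buggy version match the 1, 2, 3, 4 images described in the question.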

reload php page with javascript

I have drawn a chess board in a php page. Every piece is set as draggable, and every tile as droppable. Once a piece is dropped on a tile, I'd like to reload the php page so the board can be drawn anew along with new positions.
How can I do that: reloading the php page with javascript, without displaying a window asking for confirmation such as "To display this page, Firefox must send information that will repeat any action (such as a search or order confirmation) that was performed earlier. ->Cancel; Resend" ?
Or perhaps there are better solutions?
If you want to avoid a refresh re-posting data (for any reason, including the user clicking the reload button), then use the POST-REDIRECT-GET pattern (http://en.wikipedia.org/wiki/Post/Redirect/Get). Read that, it will explain what to do.
Quick solution: you could try:
window.location.reload(true); // in Firefox, true forces a reload from the server, bypassing the cache; it does not change the request method
Use GET, instead of POST and the dialog box you are getting will go away.
Good luck!
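To make the POST-REDIRECT-GET idea concrete, here is a rough sketch of the server-side decision, written in JavaScript only to keep it short. The board object, the from/to field names and the /board path are assumptions for illustration. In PHP the redirect step would be a header('Location: ...') call followed by exit.

```javascript
// Hypothetical in-memory board; a real app would keep this in a session or database.
var board = { e2: "white pawn" };

// Returns a plain description of the HTTP response for a request to /board.
function handleBoardRequest(method, form) {
  if (method === "POST") {
    // Apply the move carried by the form fields `from` and `to` (assumed names).
    board[form.to] = board[form.from];
    delete board[form.from];
    // 303 See Other makes the browser follow up with a GET, so a later
    // refresh repeats the GET rather than the POST.
    return { status: 303, headers: { Location: "/board" }, body: "" };
  }
  // GET: just render the current state; safe to reload any number of times.
  return {
    status: 200,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(board)
  };
}
```

Because the last request the browser made is the GET, reloading the page no longer triggers the "resend" dialog.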
Use
window.location.reload();
to refresh the page. To reload automatically after a delay:
<script>
var timer = null;
function auto_reload()
{
    window.location = 'http://domain.com/page.php'; // your page location
}
</script>
<!-- Reload the page 10 seconds after it loads; every load restarts the timer, so this repeats. -->
<body onload="timer = setTimeout(auto_reload, 10000);">
reference http://davidwalsh.name/automatically-refresh-page-javascript-meta-tags

Problems with refresh function for page, in Appcelerator

Titanium SDK version: 1.6.1
iPhone SDK version: 4.2
I am using JavaScript.
I am developing an app that fetches information from an API. On this page of the app, I have two ways of "refreshing" the content: when the window is focused and when I tap the refresh button.
The problem is that every time I refresh the page, there is a "copy" of the old content under the new content. It is as if the app keeps layering new copies of the content on top of the others on each refresh.
What am I doing wrong in my code? Is there a way to "clear" the page before each refresh? I can imagine that this issue eats a lot of memory.
You can find my code for the page here: http://pastie.org/1778830
This is a common architectural problem: you should separate creating the table from loading the table's data.
You create the table once when the window is created, and you load the data into the table multiple times. The pseudo code below should give you the basic idea.
var win = Ti.UI.currentWindow;
(function () {
    var table;
    // create the table
    function initializeWindow() {
    }
    // load the data, and update table
    function loadWindowData() {
    }
    initializeWindow();
    loadWindowData();
    // called whenever you want to update window data.
    Ti.App.addEventListener('app:refreshTable', loadWindowData);
})();
TableView -
table.setData([]); // First Clear
table.setData(tableData); // Updated Content
ListView -
listView.sections[0].setItems([]);//First Clear
listView.sections[0].setItems(tableData); // Updated Content
To update content inside a window, you can use the "open" listener.
win.addEventListener("open",loadData);

jQuery code repeating problem

I have a piece of code in jQuery that I use to get the contents of an iFrame after you click a link and once the content is completed loading. It works, but I have a problem with it repeating - at least I think that is what it is doing, but I can't figure out why or how.
jQuery JS:
$(".pageSaveButton").bind("click",function(){
    var theID = $(this).attr("rel");
    $("#fileuploadframe").load(function(){
        var response = $("#fileuploadframe").contents().find("html").html();
        $.post("siteCreator.script.php",
            {action:"savePage",html:response, id: theID},
            function(data){
                alert(data);
            });
    });
});
HTML Links ( one of many ):
<a href="templates/1000/files/index.php?pg=0&preview=false"
target="fileuploadframe" class="pageSaveButton" rel="0">Home</a>
So when you click the link, the page that is linked to is opened into the iframe, then the JS fires and waits for the content to finish loading and then grabs the iframe's content and sends it to a PHP script to save to a file. I have a problem where when you click multiple links in a row to save multiple files, the content of all the previous files are overwritten with the current file you have clicked on. I have checked my PHP and am pretty positive the fault is with the JS.
I have noticed that - since I have the PHP's return value alerted - that I get multiple alert boxes. If it is the first link you have clicked on since the main page loaded - then it is fine, but when you click on a second link you get the alert for each of the previous pages you clicked on in addition to the expected alert for the current page.
I hope I have explained well, please let me know if I need to explain better - I really need help resolving this. :) (and if you think the php script is relevant, I can post it - but it only prints out the $_POST variables to let me know what page info is being sent for debugging purposes.)
Thanks ahead of time,
Key
From jQuery .load() documentation I think you need to change your script to:
$(".pageSaveButton").bind("click",function(){
    var theID = $(this).attr("rel");
    var lnk = $(this).attr("href");//LINK TO LOAD
    $("#fileuploadframe").load(lnk,
        function(){
            //EXECUTE AFTER LOAD IS COMPLETE
            var response = $("#fileuploadframe").contents().find("html").html();
            $.post("siteCreator.script.php",
                {
                    action:"savePage",
                    html:response,
                    id: theID
                },
                function(data){alert(data);}
            );
        });
});
As for the multiple responses, you can use something like blockui to disable any further clicks till the .post call returns.
This is because the line
$("#fileuploadframe").load(function(){
gets executed every time you click a link, so each click adds another load handler to the iframe. Only add the load handler to the iframe once, on document.ready.
If a user has the ability via your UI to click multiple links that trigger this function, then you are going to run into this problem no matter what, since you use a single iframe. I would suggest creating an iframe per save process; that way the rendering of one will not affect the other.
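A minimal sketch of the "bind the load handler once" suggestion could look like the following. createFrameSaver and markPending are names made up for the example, and it uses .on(), which needs jQuery 1.7 or later (the question uses the older .bind()). It still shares a single iframe, so quick consecutive clicks can overlap as described above.

```javascript
// Binds one load handler to the frame and returns a function that records
// which page id the next load belongs to. `frame` is a jQuery-like object;
// `save` is a hypothetical callback that receives (id, html).
function createFrameSaver(frame, save) {
  var pendingId = null;
  frame.on("load", function () {
    if (pendingId === null) {
      return; // a load that no click asked for
    }
    var id = pendingId;
    pendingId = null;
    save(id, frame.contents().find("html").html());
  });
  return function markPending(id) {
    pendingId = id;
  };
}

// Browser wiring, skipped when jQuery is not present.
if (typeof jQuery !== "undefined") {
  jQuery(function ($) {
    var markPending = createFrameSaver($("#fileuploadframe"), function (id, html) {
      $.post("siteCreator.script.php",
        { action: "savePage", html: html, id: id },
        function (data) { alert(data); });
    });
    // The click only records the id; the link's target attribute loads the frame.
    $(".pageSaveButton").on("click", function () {
      markPending($(this).attr("rel"));
    });
  });
}
```

With this shape each iframe load triggers exactly one $.post, no matter how many links were clicked before it.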
