I have a small app which calls a URL and scrapes the data returned from it. I now want to do something similar for another site, but this site uses JavaScript and the results are not included in the HTML. I've found a way to retrieve the data by using "stringByEvaluatingJavaScript", but to complicate things, the results I want are displayed on the webpage only after I click a button / function on the website.
I.e., to display the results I want, I have to:
1) go to the website (data is displayed, but not what I want)
2) click one of the options on the site (the data I really want is displayed)
The URL of this page never changes, as expected with JavaScript. So I want to know if there's a way to call the page so that when it is displayed, it is already on the option I want, e.g. "https://example.com/page1?option" etc...
I don't know if this is possible since I don't know JavaScript but technically I think it should be?
Thanks.
I would use the Developer Tools/javascript console on your browser
(Chrome has a pretty good one) to see what the browser sends to the
server when you click on the button, then use that as the basis for
your query. – cowbert
@cowbert's suggestion really did the trick! Upon digging more, I found more results in the Chrome console, and one of them actually has the link to the data, which is what I need!
Thank you to all who contributed! This is my first post here so if I didn't do something right, please forgive me.
A co-worker took this url: https://www.rbi.org.in/Scripts/BS_PressReleaseDisplay.aspx which has month/year pagination via Javascript (see the elements on the right) and was able to give me this url:
https://www.rbi.org.in/Scripts/BS_PressReleaseDisplay.aspx?__EVENTTARGET=&__EVENTARGUMENT=&__VIEWSTATE=%2FwEPDwUKMTg0MTg0MzQ2NmRk1lDKkbV9IbwhES0FyX%2BlSLhp%2FzA%3D&__VIEWSTATEGENERATOR=380F4D6F&__EVENTVALIDATION=%2FwEdAAiUUGGuo52vbcR6TOSGc2%2FnlK%2BXrsQEVyjeDxQ0A4GYXFBwzdjZXczwplb2HKGyLlqLrBfuDtX7nV3nL%2B5njT0xZDpy7WJnvc3tgXY08CYLJD%2BrfdwJAuBoVBISURIXWlx9xf1loRXvygROM%2FA1O%2BNHJounKCGGAHd04zzVhBPZz4BK5Wx46wqhV0iQkxGw1Nhr9A6c&hdnYear=2016&hdnMonth=12&UsrFontCntr%24txtSearch=&UsrFontCntr%24btn=
where I can replace the year after hdnYear and the month after hdnMonth with any year and month, and it will bring me directly to that page. I asked him how he did it, and he said "I used the Network tab in Chrome dev tools." That's about all I could get out of him.
Does anyone know exactly how this is done? For example, I'm now trying to discover a similar way to get the actual URL for each page of this site: http://www.ojk.go.id/id/regulasi/otoritas-jasa-keuangan/peraturan-dan-keputusan-dewan-komisioner/Default.aspx by looking at the Network tab as I change pages. There is nothing I can see in there that's similar to the above example.
This is how it was done for the rbi.org.in URL you've mentioned
Open Chrome and go to the URL you've given
Right click on the page and select Inspect
Click on the Network tab.
Click on one of the year/month links on the website (the pagination you referred to)
In the Network tab, you'll see a list of GET/POST requests being made by the client (ie, the browser) to the server.
In the Filter box (on the top-left of the Network tab), type in the search filter method:POST.
Click on the entry in the Name column. This will open up more details about the POST request. Scroll down to the section titled Form Data.
Click on the view encoded button in the Form Data section
These are the parameters your friend included in the URL. You'll notice hdnYear and hdnMonth also listed in there. The URL your friend gave can be obtained by clicking on view source
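To make the last step concrete, here is a minimal sketch of how the captured form fields could be turned back into a URL like the one in the question. The field names are the ones visible in that URL. The helper name `buildPageUrl` and the placeholder token values are my own, and whether the server keeps accepting old `__VIEWSTATE` / `__EVENTVALIDATION` values is an assumption you would have to check.

```javascript
// Field names as shown in the Form Data panel. The long token values are
// placeholders: paste the ones DevTools shows for your own request.
const capturedFormData = {
  __EVENTTARGET: '',
  __EVENTARGUMENT: '',
  __VIEWSTATE: 'PASTE_VALUE_FROM_DEVTOOLS',
  __VIEWSTATEGENERATOR: 'PASTE_VALUE_FROM_DEVTOOLS',
  __EVENTVALIDATION: 'PASTE_VALUE_FROM_DEVTOOLS',
  hdnYear: '2016',
  hdnMonth: '12',
  'UsrFontCntr$txtSearch': '',
  'UsrFontCntr$btn': ''
};

// Hypothetical helper: copy the captured fields, swap in the year and month,
// and append everything to the page URL as a query string.
function buildPageUrl(baseUrl, formData, year, month) {
  const params = new URLSearchParams(formData);
  params.set('hdnYear', String(year));
  params.set('hdnMonth', String(month));
  // toString() percent-encodes the values, like "view encoded" in DevTools.
  return baseUrl + '?' + params.toString();
}

console.log(buildPageUrl(
  'https://www.rbi.org.in/Scripts/BS_PressReleaseDisplay.aspx',
  capturedFormData, 2015, 3));
```

Only `hdnYear` and `hdnMonth` change between pages here; the other fields are passed through untouched.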
Well I can't really tell you how to exactly reproduce this in the site you're trying to, but I can tell you what your co-worker did.
In the page https://www.rbi.org.in/Scripts/BS_PressReleaseDisplay.aspx:
Open the Network tab in dev tools and clear the log if there's anything there.
Click on a year and month
On the network log search for BS_PressReleaseDisplay.aspx in the "Name" column and click on it
Inside the Headers tab go to "Form Data" and click on "view source"
And that's it: those are the URL parameters that your co-worker gave you. You can try doing this on the site you want to reproduce it on, by clicking on another page and searching for Default.aspx, but you'll have to figure out what each parameter means to find which one is the page number or whatever you're looking for (check it in the parsed view for easier reading).
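If the parsed view in DevTools is awkward to work with, the encoded "view source" string can also be split into name/value pairs in code. This is only a sketch: `parseFormData` is a made-up helper, and the sample string is a shortened version of the query string from the question with the long tokens left out.

```javascript
// Hypothetical helper: decode an application/x-www-form-urlencoded string
// (what "view source" shows) into a plain object of field names and values.
function parseFormData(encoded) {
  const fields = {};
  for (const [name, value] of new URLSearchParams(encoded)) {
    fields[name] = value;
  }
  return fields;
}

// Shortened sample; the real string also carries the __VIEWSTATE tokens.
const sampleFormData =
  '__EVENTTARGET=&__EVENTARGUMENT=&hdnYear=2016&hdnMonth=12' +
  '&UsrFontCntr%24txtSearch=&UsrFontCntr%24btn=';

console.log(parseFormData(sampleFormData));
```

Comparing the objects from two different clicks should show which field actually changes between pages.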
Screenshots:
http://prnt.sc/emsl2w
http://prnt.sc/emsm2z
Hope this helps you.
The URL he sent you has URL parameters (a query string) that are read by the server, which then sends you the selected page.
So basically the server picks up the request and reads these parameters, which are then most likely passed into a method of some sort that queries a database and returns the result to you.
If you're the owner of the linked website, you can implement such a solution; otherwise you're stuck, since it requires coding on the backend.
I am replacing the showModalDialog function which no longer works in Chrome and FF. We have many applications using that. The problem is, pop up windows do post instructions to the web server and update the database. For instance if there's a list of accounts on screen and edit is clicked on one of the accounts, an edit page appears as a pop up, posts changes back to the web server, then the list is refreshed with changes. The entire list may be refreshed or just text that changed.
I made a javascript function to do pop up content using overlays. I thought it would be simple to replace showModalDialog calls with the javascript function, but I did not consider post instructions sent by the pop up page to update the database, and complexity to facilitate that. Posting can be done via ajax-like functionality, encapsulated in a set of functions. Before I start writing code to do this I'd like to know what other people have done in this circumstance. Thanks
I wrote some javascript to do everything I want. Since my pop up windows had javascript, I needed to run javascript upon rendering modal content, and also when the modal content went away. This will produce any number of overlays on top of each other, managing each. Content can optionally appear in a frame with a title bar, closely matching the functionality of showModalDialog.
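For anyone wondering what "managing each" overlay might look like, here is a rough sketch of just the bookkeeping part. It is not the code from the download below. `ModalStack`, `onOpen` and `onClose` are names I made up, and the DOM work (creating the overlay element, posting the form) is assumed to happen inside the callbacks.

```javascript
// Hypothetical overlay bookkeeping: a stack of open modals, each with
// optional hooks that run when it is rendered and when it goes away.
class ModalStack {
  constructor() {
    this.stack = [];
  }

  // Push a modal and run its onOpen hook (e.g. scripts the content needs).
  // Returns the new depth, which could be used to pick a z-index.
  open(modal) {
    this.stack.push(modal);
    if (modal.onOpen) modal.onOpen();
    return this.stack.length;
  }

  // Pop the topmost modal and hand it a result, similar to the value
  // showModalDialog used to return to the caller.
  close(result) {
    const modal = this.stack.pop();
    if (!modal) return undefined;
    if (modal.onClose) modal.onClose(result);
    return modal;
  }

  get depth() {
    return this.stack.length;
  }
}
```

In the account-list example, the edit pop-up's `onClose` would be the place to refresh the list (or just the changed text) after the post to the server has finished.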
Download at http://bikehappy.org/modal.html . If used, please give feedback saying if it works and provide update suggestions.
So I've scoured the internet for a way to hide the URL at the bottom of a page printed using window.print, and it seems the only way to do it is for the user to disable the option in their page settings.
That's not ideal: I'm trying to keep the address of our ServiceNOW instance from being printed on a form that will be given to our client's customers.
So, with hiding it out of the question, is there a way I can mask it to say something other than the actual URL?
I'm not sure but I think solution 3 here is doing something like this
http://www.codeproject.com/Questions/424312/Can-anyone-help-me-to-hide-Header-Footer-and-Page
But I don't understand that at all.
Is it possible to fool the print dialog into thinking we are on a different page without actually redirecting?
By the way, I don't see it being useful for this solution, but I cannot use jQuery (so many things would have been easier if I could); for some reason it will not work in our ServiceNOW instance.
I have a page A that displays some text from my database. The text is editable and gets autosaved using AJAX. If the user would go away from that page, and then go back to page A using browsers history functionality, the page would not have the latest data (since we went back in history). And the user would edit the old data, which would overwrite the latest data on the server when it gets autosaved.
I assume this is purely a front-end issue, where my server can do nothing about this. What solutions could be applied? If it were possible to detect with JavaScript that the user went back in history, then I could simply display a text saying that the user has to refresh the page. But is that even possible? Or are there any better solutions?
There are lots of options and strategies for a situation like this.
The first thing you can do is to try to disable caching on the page. You can use meta tags to do this.
You can also keep track of when the user presses the back button using libraries such as this one. You can respond either on the server or on the client, although you want to be careful because a disabled back button can annoy users.
Should you ever happen to consider using a javascript framework such as AngularJS you can probably keep track of the back button using the framework.
Finally you can solve issues like this with careful page design. If the data on a page can change you might load the current data via ajax before the user has a chance to edit it. By doing this - your "load" code will run even if the user does use the back button. Take a look at this stack for more information on that!
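As a rough illustration of that last idea, the browser's `pageshow` event fires both on a normal load and when a page is restored through back/forward navigation, and `event.persisted` tells you whether it came from the back/forward cache. The helper name `installFreshDataLoader` and the `loadLatest` callback are hypothetical; the actual AJAX request to your server would go inside that callback.

```javascript
// Hypothetical helper: run loadLatest every time the page is shown,
// including when the browser restores it after the user presses Back.
function installFreshDataLoader(target, loadLatest) {
  target.addEventListener('pageshow', function (event) {
    // persisted is true when the page was restored from the
    // back/forward cache instead of being loaded again.
    loadLatest({ restoredFromCache: Boolean(event.persisted) });
  });
}
```

In the page you would call something like `installFreshDataLoader(window, reloadText)`, where `reloadText` (your own function) fetches the current text before the user gets a chance to edit the stale copy.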
Hope these suggestions help a bit!
If you are using jQuery, then use:
$(document).on('pageshow', '#Content' ,function()
in place of
$(document).ready(function()
It will solve your problem: the JavaScript in the handler will then run whenever that particular page is shown, not only the first time the document is ready.
I have a little web app (which only has 1 page) that allows the user to enter text and select some options. The input text and selections are displayed in another div in the form of a table. You may want to refer to the example here: http://jsfiddle.net/xaKXM/5/
In this fiddle, you can type anything, and after you click submit it will take the text input and append it to another table, #configtableTable:
$('#labels #labelTable tr:last').after(addmore);
$('#configtable #configtableTable tr:last').after(displaymore);
I'm using CherryPy as a mini web server (so most of the code is written in Python) and I know that it has sessions, but I have no idea how to use them, as the example given is not really what I want to see.
FYI, I'm not using PHP at all and everything is in a single page; I simply show and hide the sections. But I want the page to keep showing #configtableTable and hiding #labelTable even after a refresh. Note that the fiddle is just part of the web app, which will only show all these after getting a reply from another device.
I'm not sure about cookies, because all the links I've found seem broken. How about jQuery session? Is it applicable in my case? I need some examples of how to apply it though :(
okay, to conclude my questions:
1. Can I save the page state after a refresh, and how? Which of the methods mentioned above is worth trying? Are there any examples for me to refer to, or any other suggestions?
2. Can I simply DISABLE refresh or back after reaching a page?
Thanks everyone in advance :)
Don't disable Refresh and / or back navigation. It's a terrible idea: users have a certain expectation of what actions those buttons will perform, and modifying that leads to a bad user experience.
As for saving state, while you could use session or cookies, if you don't need that data server side, you can save the state on client side as well.
For example, you could use localStorage
Alternatively, you could create an object out of the data in the table, JSON.stringify() it and append it to the url like this: example.com#stateData.
With either option, at page load you'd have to check whether there is state data. If there is, use it to recreate the table instead of displaying the form.
The disadvantage of the first is that not all browsers support localStorage.
The disadvantage of the second is that URLs have a length limit, so this solution won't necessarily work for you if you're expecting large amounts of data.
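Here is a minimal sketch of both options, under a few assumptions: the storage key `appState` and the helper names are made up, and the state is taken to be a plain object such as `{ view: 'configtable', rows: [...] }` built from your table.

```javascript
// Option 1: keep the state in a storage object such as window.localStorage.
function saveState(storage, state) {
  storage.setItem('appState', JSON.stringify(state));
}

function loadState(storage) {
  const raw = storage.getItem('appState');
  return raw === null ? null : JSON.parse(raw);
}

// Option 2: put the state in the URL fragment (example.com#stateData).
function stateToHash(state) {
  return '#' + encodeURIComponent(JSON.stringify(state));
}

function stateFromHash(hash) {
  if (!hash || hash.length < 2) return null;
  try {
    return JSON.parse(decodeURIComponent(hash.slice(1)));
  } catch (e) {
    // A hand-edited or truncated fragment: treat it as "no saved state".
    return null;
  }
}
```

In the browser you would call `saveState(window.localStorage, state)` after each submit, or set `location.hash = stateToHash(state)`. At page load, if `loadState(window.localStorage)` or `stateFromHash(location.hash)` returns something, rebuild #configtableTable from it and hide #labelTable instead of showing the form.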
EDIT
It appears that Midori does support most HTML5 features, including localStorage; however, it's turned off by default (I'm trying to find a better reference). If you can, just point Midori to html5test to see what HTML5 features it supports.