Download webpages with javascript dynamic content - javascript

I have a PHP script that downloads some webpages. The problem is that the downloaded files lack the dynamic content that is written by JavaScript.
I suppose I need a JavaScript engine or something similar. Is there a PHP library or command-line program for downloading a webpage with all its dynamic content?
Example of what I need: I want to download the webpage www.example.com/product.html.
Now: I'm able to download the code:
<h1></h1>
What I want: I want to download the code:
<h1>Title written by javascript</h1>

This happens because the page's JavaScript is never run. You are downloading only the HTML file; the JavaScript files it references are not fetched, and nothing executes them.
The dynamic content appears only once those scripts have been loaded and executed against the page, which requires a JavaScript engine such as a browser.
Another workaround:
You can use a web automation library such as Selenium, which opens the page in a real browser, lets the browser execute the JavaScript and build the DOM, and then lets you save the resulting HTML.
One more:
You can use PhantomJS to load the HTML page, run its JavaScript, and give you the final output to save.
https://superuser.com/questions/448514/command-line-browser-with-js-support

Related

How do I automate opening and downloading a webpage?

There is a website that has an HTML <video> element in it when loaded. However, this element isn't present if I just download the page with wget, so I guess it gets added by a script that's only run when the page is opened in a browser. I need the video's direct link, in an automated fashion.
Could you please tell me if I have the right idea, and if there is a possible solution? Could I for example run a browser from the command line, let it load the page and all of the referenced content, then save the .html file?
You could use headless Chrome, potentially with Puppeteer scripting for that.
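A minimal sketch of the headless Chrome route, driven from Python. It relies on Chrome's `--headless` and `--dump-dom` flags, which print the DOM after the page has loaded. The binary names in `CHROME_CANDIDATES` and the helper names are assumptions for illustration; check what your system actually installs.

```python
import shutil
import subprocess

# Executable names to try; which one exists depends on the system (assumption).
CHROME_CANDIDATES = ("google-chrome", "chromium", "chromium-browser", "chrome")


def build_dump_dom_command(url, chrome_binary="google-chrome"):
    """Return the argv list that asks headless Chrome to print the rendered DOM."""
    return [chrome_binary, "--headless", "--disable-gpu", "--dump-dom", url]


def find_chrome():
    """Return the first Chrome-like binary found on PATH, or None."""
    for name in CHROME_CANDIDATES:
        path = shutil.which(name)
        if path:
            return path
    return None


def fetch_rendered_html(url, timeout=60):
    """Run headless Chrome and return the page's HTML after scripts have run."""
    chrome = find_chrome()
    if chrome is None:
        raise RuntimeError("no Chrome/Chromium binary found on PATH")
    result = subprocess.run(
        build_dump_dom_command(url, chrome),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return result.stdout
```

Calling `fetch_rendered_html("https://www.example.com/")` would return the serialized DOM as a string, which can then be saved or searched for the `<video>` element. Content that only appears after user interaction or long delays may still need Puppeteer scripting.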
Though, depending on the details, there may be easier options that would get you what you need. It sounds like you're currently trying to scrape a third party website using wget. Instead of, or in addition to, requesting the .html content with wget, you could request the relevant javascript file and then extract the video url from there.
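If the video URL appears as a plain string literal in the script file, a regular expression may be enough to pull it out. The sketch below assumes quoted absolute URLs ending in `.mp4`, `.webm`, or `.m3u8`; that extension list and the function name are illustrative assumptions, and sites that build the URL at runtime will not match.

```python
import re

# Matches quoted absolute URLs ending in a common video extension,
# optionally followed by a query string. The extension list is an
# assumption; adjust it to what the site actually serves.
VIDEO_URL_RE = re.compile(
    r"""["'](https?://[^"'\s]+?\.(?:mp4|webm|m3u8)(?:\?[^"'\s]*)?)["']""",
    re.IGNORECASE,
)


def extract_video_urls(script_text):
    """Return the unique video URLs in script_text, in order of appearance."""
    urls = []
    for match in VIDEO_URL_RE.finditer(script_text):
        url = match.group(1)
        if url not in urls:
            urls.append(url)
    return urls
```

The script text itself can be fetched with wget or any HTTP client and passed to `extract_video_urls` as a string.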

Running JavaScript code on PDF pages on my domain?

Is it possible to run JavaScript code on, for example, mydomain.com/pdf/pdf-example.pdf?
Any ideas how to accomplish this?
My intention is to add live chat code so that we can help customers while they are looking at the PDF.
You can use pdf.js to render a PDF file onto a standard HTML page. So instead of linking directly to the PDF file, you would create a new HTML page in which you embed the pdf.js viewer along with any additional custom JavaScript. Here is a demo. The downside to pdf.js is that browser support is limited.
Another option is PDFObject, which lets you embed PDFs within pages. It uses the browser's native PDF viewer for rendering.

How to manipulate a local text file from an HTML page

I've generated an HTML file that sits on my local disk, and which I can access through my browser. The HTML file is basically a list of links to external websites. The HTML file is generated from a local text file, which is itself a list of links to the remote sites.
When I click on one of the links in the HTML document, as well as the browser loading the relevant site (in a new tab), I want the site to be removed from the list of sites in the local text file.
I've looked at Javascript, Flask (Python), and CherryPy (Python), but I'm not sure these are valid solutions.
Could someone advise on where I should look next? I'd prefer to do this with Python somehow, because it's what I'm familiar with, but I'm open to anything.
Note that I'm running on a Linux box.
First, Javascript cannot modify the local filesystem from the context of a webpage. To allow that would be a massive security concern.
Any server-side web framework can do this, and Flask is a great one to use because it's so lightweight. The general steps you would want to take are:
1. When / is requested, load the list of links.
2. Change each link to point to /goto?line=<line_number>.
3. Display the list to the user.
Then when you click a link:
1. When /goto is requested, load the list of links.
2. Look up the real URL on the requested line, then remove that line from the list.
3. Save the list of links.
4. Return status code 302, with the real URL as the Location header.
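The steps above can be sketched as follows. To keep it runnable without installing anything, this version uses a plain WSGI app from the standard library rather than Flask; the same functions could be called from Flask routes. The file layout (one URL per line) and all function names are assumptions for illustration.

```python
import html
from pathlib import Path
from urllib.parse import parse_qs


def load_links(path):
    """Read the text file and return its non-empty lines (one URL per line)."""
    lines = Path(path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def save_links(path, links):
    """Write the remaining links back, one per line."""
    Path(path).write_text("".join(link + "\n" for link in links))


def render_index(links):
    """Build the HTML list; each link points at /goto?line=<n>, not the real URL."""
    items = "".join(
        f'<li><a href="/goto?line={n}" target="_blank">{html.escape(url)}</a></li>\n'
        for n, url in enumerate(links)
    )
    return f"<ul>\n{items}</ul>\n"


def handle_goto(path, line_number):
    """Remove entry line_number from the file and return (status, headers)."""
    links = load_links(path)
    if not 0 <= line_number < len(links):
        return "404 Not Found", []
    target = links.pop(line_number)  # look up the real URL, then drop the line
    save_links(path, links)
    return "302 Found", [("Location", target)]


def make_app(path):
    """Return a WSGI app serving / and /goto for the link file at path."""
    def app(environ, start_response):
        if environ.get("PATH_INFO") == "/goto":
            query = parse_qs(environ.get("QUERY_STRING", ""))
            try:
                line_number = int(query["line"][0])
            except (KeyError, ValueError):
                start_response("400 Bad Request", [("Content-Type", "text/plain")])
                return [b"missing or invalid line parameter"]
            status, headers = handle_goto(path, line_number)
            start_response(status, headers + [("Content-Type", "text/plain")])
            return [b""]
        body = render_index(load_links(path)).encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
        return [body]
    return app


# To serve it locally (links.txt is a hypothetical file name):
#   from wsgiref.simple_server import make_server
#   make_server("127.0.0.1", 8000, make_app("links.txt")).serve_forever()
```

One caveat with line-number links: after a removal the remaining lines shift, so an index page loaded before the click carries stale numbers. Reloading the index after each click, or keying links by URL instead of position, avoids removing the wrong entry.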
There are many ways to do this.
Here are the three easiest:
1. Use JavaScript.
2. Install WampServer or similar and use PHP to modify the file.
3. Don't use the browser to delete the link; instead, use a .bat file to open the browser and remove the link from the text file.

Will JavaScript in a PDF run in an iOS WebView?

I have a PDF file that includes JavaScript code for validating the data entered in the PDF. This runs in Adobe Reader. I would like to include the same functionality in my iPad app.
If I load this PDF in a UIWebView, will the JS code run, or does it run only in Adobe Reader?
You cannot execute JavaScript from within a PDF in a UIWebView, but you might be able to implement those functions externally.
The following might help if you just need a UI and you are willing to re-develop the validation logic. Mozilla's pdf.js has an example at https://github.com/mozilla/pdf.js/tree/master/examples/acroforms that uses canvas to display the PDF content with HTML/JS and overlays HTML input elements. However, it does not yet execute the JavaScript embedded in the PDF.

How to disable PDF file download option using jQuery?

How can I disable PDF download using jQuery or JavaScript?
On my website I am loading some PDF files in an iframe. I need to protect my files.
So how can I disable options such as downloading and printing the PDF file?
Please help me.
My website is built with HTML, jQuery, MySQL and PHP.
You are delivering the PDF file directly to the browser, where it is displayed using the Adobe Reader ActiveX control. The file is shown only after it has been downloaded into the user's temp directory, so how could the download be prevented?
So it is not possible using ANY JavaScript library.
The only way to secure your master PDF files is to create an image of each page and present those images to the user on the web via your own interface (HTML, Flash, etc.).
You may use ImageMagick along with Ghostscript for this.
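As a rough sketch of that conversion step, the helper below builds an ImageMagick command that renders each PDF page to a PNG (ImageMagick delegates PDF rendering to Ghostscript). The `convert` binary name, the 150 DPI density, the output pattern, and the function names are assumptions; ImageMagick 7 installations typically use `magick` instead of `convert`.

```python
import subprocess


def build_pdf_to_images_command(pdf_path, output_pattern="page-%03d.png",
                                density=150, binary="convert"):
    """Return the argv list that renders each PDF page to a numbered image."""
    # -density must come before the input file so it applies to PDF rasterizing.
    return [binary, "-density", str(density), pdf_path, output_pattern]


def convert_pdf_to_images(pdf_path, **options):
    """Run the conversion; raises CalledProcessError if ImageMagick fails."""
    subprocess.run(build_pdf_to_images_command(pdf_path, **options), check=True)
```

For example, `convert_pdf_to_images("master.pdf")` would write page-000.png, page-001.png, and so on into the current directory, and those images can then be served in place of the PDF.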
You may go through www.veryinteractivepeople.com/?p=521
Hope this helps...:)
