Fastest way to obtain web source with javascript in android - javascript

I'm currently trying to get the source of a page from Android.
From checking the source beforehand, I know it contains JavaScript.
To be able to parse it correctly with jsoup, I had to do the following steps:
Load the URL into a WebView.
Use Jsoup.parse() on the HTML from the WebView to get the source with the JavaScript in it.
wb_result.setVisibility( View.GONE );
wb_result.getSettings().setSaveFormData( false );
wb_result.getSettings().setBlockNetworkLoads( true );
wb_result.addJavascriptInterface( new MyJavaScriptInterface( this ), "HtmlViewer" );
wb_result.setWebViewClient( new WebViewClient() {
    @Override
    public void onPageFinished(WebView view, String url) {
        wb_result.loadUrl( "javascript:window.HtmlViewer.showHTML" +
            "('<html>'+document.getElementsByTagName('html')[0].innerHTML+'</html>');" );
    }
} );
In my opinion, it is a little odd to keep a WebView in my activity with its visibility set to GONE, only to use it as an intermediate step to get the source I need. It is also slow.
I was wondering what the fastest way is to obtain page source that contains JavaScript.
I read that Chrome Custom Tabs should be faster, but from what I saw I can't hide the opened tab, and it would disrupt the flow of the app.
Specifically, the URL whose source I'm trying to get is - link.
Any modern ideas? All the solutions I saw are from 2016.
Thank you

Related

unable to execute javascript code in Android WebView from Service

I'm attempting to create an Android service that performs a task using JavaScript. I came across this post which describes how to run JavaScript code inside of a WebView within a Service using the WindowManager. I am able to create a WebView with an .html and .js file with no problem. It is once I try to pass data from the Android .java service to the WebView that I run into an issue.
I have tried doing so in this fashion:
final WindowManager windowManager = (WindowManager) getSystemService(WINDOW_SERVICE);
...
wv = new WebView(this);
wv.setWebViewClient(new WebViewClient());
wv.getSettings().setJavaScriptEnabled(true);
wv.loadUrl("file:///android_asset/www/test.html");
windowManager.addView(wv, params); // params set using method from linked post above
wv.evaluateJavascript("console.log('hello');", null);
wv.loadUrl("javascript:console.log('blah')");
Neither the call to evaluateJavascript() nor loadUrl() appears to have any effect on the WebView (I access the console using the Chrome developer tools).
I have tested that in test.html I can add a <script> tag and output text to the console with no issue.
I've also tried calling the functions before adding the view to no avail.
What kind of data do you want to pass to JavaScript? You could use WebView.addJavascriptInterface() to "plant" methods on the HTML document so you can call them from JavaScript, run native code, and return data back to JavaScript. Will that help?
If you're trying to execute this from a Service, you'll need to post a Runnable on the UI thread. Read the documentation for evaluateJavascript().
It explicitly says it must be called on the UI thread. I think you can just do:
wv.post(new Runnable() {
    @Override
    public void run() {
        wv.evaluateJavascript("console.log('hello');", null);
    }
});
Other than that, should you include tags?

Android Headless Browsing through WebView?

I am trying to create an Android app for a website that is not mine; it is a search engine for restaurants. They have no API to work with, so I want to browse their website headlessly, put the search query into the HTML form, click the submit button, then parse the results and use them in my application code. After doing loads of research here (Question 1, Question 2, Question 3, and many more), I am finally asking. All I know so far is that if I wanted to do the same on Google.com, I would write:
myWebView.getSettings().setJavaScriptEnabled(true);
myWebView.loadUrl("http://www.google.com/");
myWebView.setWebViewClient(new WebViewClient() {
    @Override
    public void onPageFinished(WebView view, String url) {
        //Load HTML
        myWebView.loadUrl("javascript:document.getElementById('q') =" + "StackOverFlow" + "; document.getElementByName('btnK').click();");
    }
});
In the above code I am trying to enter the search term "StackOverFlow" and click the search button, but it's not working. Kindly help me fix this code or point me in the right direction.
It's been a while, but for the sake of letting others know: since API level 19, evaluateJavascript is the preferred way to run JavaScript in a WebView, rather than loadUrl with a javascript: URL. Try using evaluateJavascript.
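A minimal sketch of how the script from the question could be assembled before handing it to evaluateJavascript. It assumes the field and button are reachable by the names 'q' and 'btnK' used in the question (the real page may differ), and the class and method names are hypothetical. The search term is escaped into a quoted JavaScript string literal, and getElementsByName is indexed because it returns a list of elements.

```java
final class SearchScript {

    // Wraps a Java string in double quotes and escapes the characters that
    // would otherwise break a JavaScript string literal. Not exhaustive.
    static String jsStringLiteral(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '"':  sb.append("\\\""); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                default:   sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Builds JavaScript that fills the field named 'q' and clicks 'btnK'.
    // Both names are taken from the question and are assumptions.
    static String build(String query) {
        return "document.getElementsByName('q')[0].value = "
                + jsStringLiteral(query) + ";"
                + "document.getElementsByName('btnK')[0].click();";
    }
}
```

Inside onPageFinished, the result could then be passed along as view.evaluateJavascript(SearchScript.build("StackOverFlow"), null).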
Since you've also mentioned headless browsing, I would recommend overriding shouldInterceptRequest in your client to redirect all unnecessary files (such as CSS, images, and perhaps JS, depending on the site) to an empty InputStream.
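One way to decide which requests count as unnecessary is a plain helper that looks at the file extension of the requested URL. The sketch below is only an illustration: the class name, method name, and the list of blocked extensions are assumptions, not part of the WebView API, and should be adjusted per site.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class ResourceFilter {

    // Extensions treated as unnecessary for scraping (an assumed list).
    private static final Set<String> BLOCKED = new HashSet<>(Arrays.asList(
            "css", "png", "jpg", "jpeg", "gif", "svg", "ico", "woff", "woff2"));

    // Returns true when the URL path ends in one of the blocked extensions.
    static boolean shouldBlock(String url) {
        // Ignore the query string and fragment when looking for the extension.
        int cut = url.length();
        int q = url.indexOf('?');
        if (q >= 0) cut = Math.min(cut, q);
        int h = url.indexOf('#');
        if (h >= 0) cut = Math.min(cut, h);
        String path = url.substring(0, cut);

        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        if (dot < 0 || dot < slash) {
            return false; // no extension in the last path segment
        }
        String ext = path.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BLOCKED.contains(ext);
    }
}
```

In shouldInterceptRequest, a true result could be answered with a WebResourceResponse wrapping an empty ByteArrayInputStream, and a false result with null so the WebView loads the resource normally.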
Call myWebView.loadUrl("http://www.google.com/"); after setting the WebViewClient that overrides onPageFinished, not before.

Android WebView: how to load page after javascript execution

Relating to this SO question,
I want to remove the notifications as well as the flicker that occurs when displaying TeX in a WebView in android.
I tried the AsyncTask method, to preload the page and then "download" it, before loading the data into the WebView but I'm stuck on an IllegalStateException: Target host must not be null, or set in parameters. This error is coming from the following code:
protected String doInBackground(String... url) {
String r = "";
try {
HttpClient hc = new DefaultHttpClient();
HttpGet get = new HttpGet(URLEncoder.encode(url[0], "UTF-8"));
// get.setHeader("host", "http://myapp");
HttpResponse hr = hc.execute(get);
...
}
The commented-out line is what I tried in order to fix this error, but it never worked.
Now I'm thinking I can't get around this because the url[0] that I'm loading is in the form: javascript:somescript(input) executed from a JS file stored in my assets, and maybe HttpGet requires a web server?
How can I hide the notifications and finish the JavaScript before updating the WebView then?
The HttpGet class is for an HTTP request, so the URL must start with http://.
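As a rough illustration of that constraint, the check below uses java.net.URI to test whether a string is an absolute http or https URL with a host before it is handed to an HTTP client. The class and method names are hypothetical. Note also that passing a whole URL through URLEncoder.encode escapes the "://" part, so an encoded URL no longer has a recognizable scheme or host either.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class HttpUrlCheck {

    // Returns true only for absolute http/https URLs that name a host.
    static boolean isHttpUrl(String url) {
        try {
            URI uri = new URI(url);
            String scheme = uri.getScheme();
            return scheme != null
                    && (scheme.equalsIgnoreCase("http") || scheme.equalsIgnoreCase("https"))
                    && uri.getHost() != null;
        } catch (URISyntaxException e) {
            // Strings that are not valid URIs cannot be fetched over HTTP.
            return false;
        }
    }
}
```

A javascript: string such as the one in the question fails this check, which is consistent with the exception about a missing target host.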
If you are using MathJax as the TeX viewer, then, referring to this Stack Overflow question, you could add messageStyle: "none" to your MathJax configuration. This disables MathJax's loading notifications in any environment.

Capture Browser Web Page [duplicate]

Is it possible to take a screenshot of a webpage with JavaScript and then submit it back to the server?
I'm not so concerned with browser security issues, etc., as the implementation would be for an HTA. But is it possible?
Google is doing this in Google+ and a talented developer reverse engineered it and produced http://html2canvas.hertzen.com/ . To work in IE you'll need a canvas support library such as http://excanvas.sourceforge.net/
I have done this for an HTA by using an ActiveX control. It was pretty easy to build the control in VB6 to take the screenshot. I had to use the keybd_event API call because SendKeys can't do PrintScreen. Here's the code for that:
Declare Sub keybd_event Lib "user32" _
(ByVal bVk As Byte, ByVal bScan As Byte, ByVal dwFlags As Long, ByVal dwExtraInfo As Long)
Public Const CaptWindow = 2
Public Sub ScreenGrab()
keybd_event &H12, 0, 0, 0
keybd_event &H2C, CaptWindow, 0, 0
keybd_event &H2C, CaptWindow, &H2, 0
keybd_event &H12, 0, &H2, 0
End Sub
That only gets you as far as getting the window to the clipboard.
Another option, if the window you want a screenshot of is an HTA would be to just use an XMLHTTPRequest to send the DOM nodes to the server, then create the screenshots server-side.
Another possible solution that I've discovered is http://www.phantomjs.org/ which allows one to very easily take screenshots of pages and a whole lot more. Whilst my original requirements for this question aren't valid any more (different job), I will likely integrate PhantomJS into future projects.
I wonder if this is possible by rendering the whole body element into a canvas and then using canvas2image?
http://www.nihilogic.dk/labs/canvas2image/
A possible way to do this, if you are running on Windows and have .NET installed:
public Bitmap GenerateScreenshot(string url)
{
// This method gets a screenshot of the webpage
// rendered at its full size (height and width)
return GenerateScreenshot(url, -1, -1);
}
public Bitmap GenerateScreenshot(string url, int width, int height)
{
// Load the webpage into a WebBrowser control
WebBrowser wb = new WebBrowser();
wb.ScrollBarsEnabled = false;
wb.ScriptErrorsSuppressed = true;
wb.Navigate(url);
while (wb.ReadyState != WebBrowserReadyState.Complete) { Application.DoEvents(); }
// Set the size of the WebBrowser control
wb.Width = width;
wb.Height = height;
if (width == -1)
{
// Take Screenshot of the web pages full width
wb.Width = wb.Document.Body.ScrollRectangle.Width;
}
if (height == -1)
{
// Take Screenshot of the web pages full height
wb.Height = wb.Document.Body.ScrollRectangle.Height;
}
// Get a Bitmap representation of the webpage as it's rendered in the WebBrowser control
Bitmap bitmap = new Bitmap(wb.Width, wb.Height);
wb.DrawToBitmap(bitmap, new Rectangle(0, 0, wb.Width, wb.Height));
wb.Dispose();
return bitmap;
}
And then via PHP you can do:
exec("CreateScreenShot.exe -url http://.... -save C:/shots domain_page.png");
Then you have the screenshot in the server side.
This might not be the ideal solution for you, but it might still be worth mentioning.
Snapsie is an open source ActiveX object that enables Internet Explorer screenshots to be captured and saved. Once the DLL file is registered on the client, you should be able to capture the screenshot and upload the file to the server from within JavaScript. Drawbacks: the DLL file must be registered on the client, and it works only with Internet Explorer.
We had a similar requirement for reporting bugs. Since it was for an intranet scenario, we were able to use browser addons (like Fireshot for Firefox and IE Screenshot for Internet Explorer).
This question is old but maybe there's still someone interested in a state-of-the-art answer:
You can use getDisplayMedia:
https://github.com/ondras/browsershot
SnapEngage uses a Java applet (1.5+) to take a browser screenshot. AFAIK, java.awt.Robot should do the job; the user just has to permit the applet to do it (once).
And I have just found a post about it:
Stack Overflow question JavaScript code to take a screenshot of a website without using ActiveX
Blog post How SnapABug works – and what they should do
I found that dom-to-image did a good job (much better than html2canvas). See the following question & answer: https://stackoverflow.com/a/32776834/207981
This question asks about submitting this back to the server, which should be possible, but if you're looking to download the image(s) you'll want to combine it with FileSaver.js, and if you want to download a zip with multiple image files all generated client-side take a look at jszip.
You can achieve that using HTA and VBScript. Just call an external tool to take the screenshot. I forget the name, but Windows Vista ships with a screenshot tool, so you don't even need an extra install.
As far as automation goes, it depends entirely on the tool you use. If it has an API, I am sure you can trigger the screenshot and save it through a couple of Visual Basic calls without the user noticing.
Since you mentioned HTA, I am assuming you are on Windows and (probably) know your environment (e.g. OS and version) very well.
If you are willing to do it on the server side, there are options like PhantomJS, which is now deprecated. The best way to go would be Headless Chrome with something like Puppeteer on Node.JS. Capturing a web page using Puppeteer would be as simple as follows:
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://example.com');
await page.screenshot({path: 'example.png'});
await browser.close();
})();
However it requires headless chrome to be able to run on your servers, which has some dependencies and might not be suitable on restricted environments. (Also, if you are not using Node.JS, you might need to handle installation / launching of browsers yourself.)
If you are willing to use a SaaS service, there are many options such as
Restpack
UrlBox
Screenshot Layer
A great solution for screenshot taking in Javascript is the one by https://grabz.it.
They have a flexible and simple-to-use screenshot API which can be used by any type of JS application.
If you want to try it, at first you should get the authorization app key + secret and the free SDK
Then, in your app, the implementation steps would be:
<!-- include the grabzit.min.js library in the web page where you want the capture to appear -->
<script src="grabzit.min.js"></script>
<!-- use the key and the secret to log in and capture the URL -->
<script>
GrabzIt("KEY", "SECRET").ConvertURL("http://www.google.com").Create();
</script>
The screenshot can be customized with different parameters. For example:
<script>
GrabzIt("KEY", "SECRET").ConvertURL("http://www.google.com",
{"width": 400, "height": 400, "format": "png", "delay": 10000}).Create();
</script>
That's all.
Then simply wait a short while and the image will automatically appear at the bottom of the page, without you needing to reload the page.
There are other functionalities to the screenshot mechanism which you can explore here.
It's also possible to save the screenshot locally. For that you will need to utilize GrabzIt server side API. For more info check the detailed guide here.
As of April 2020, there is the GitHub library html2canvas:
https://github.com/niklasvh/html2canvas
GitHub 20K stars | Azure Pipelines: Succeeded | Downloads 1.3M/mo
Quote: "JavaScript HTML renderer. The script allows you to take "screenshots" of webpages or parts of it, directly on the users browser. The screenshot is based on the DOM and as such may not be 100% accurate to the real representation as it does not make an actual screenshot, but builds the screenshot based on the information available on the page."
I made a simple function that uses rasterizeHTML to build an SVG and/or an image from the page contents.
Check it out :
https://github.com/orisha/tdg-screen-shooter-pure-js

Run custom javascript code after loading any website

I am working on collecting web browser performance measurements, so I need to access the browser's window.performance object.
To collect this data I have written a JavaScript file, collect.js, which I need to add to the DOM of each page I want to test, e.g. www.google.com, www.facebook.com, and so on.
I also need to run this test for about 1000 websites, so any manual approach is out of the question. It needs to be automated somehow.
How could I go about doing this?
EDIT: I need to run these tests in an Android browser, so I need mobile-oriented solutions.
You can create a simple android app with a WebView component. This way you can control which URLs are loaded and also insert your JS code.
http://developer.android.com/guide/tutorials/views/hello-webview.html
EDIT
You can run any javascript like this:
Implement a custom WebViewClient:
public class WebClient extends WebViewClient {
    @Override
    public boolean shouldOverrideUrlLoading(WebView view, String url) {
        view.loadUrl(url);
        return true;
    }
    @Override
    public void onPageFinished(WebView view, String url) {
        // Execute your javascript below
        view.loadUrl("javascript:...");
    }
}
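For the collect.js case from the question, the JavaScript run in onPageFinished could be a small snippet that appends a script element to the page. The sketch below only builds that snippet as a string. The class name, method name, and the example URL are hypothetical, and it assumes collect.js is hosted somewhere the loaded page is allowed to fetch it from.

```java
final class ScriptInjector {

    // Builds JavaScript that appends a <script> tag pointing at scriptUrl.
    // Single quotes and backslashes are escaped because the URL is placed
    // inside a single-quoted JavaScript string.
    static String injectionSnippet(String scriptUrl) {
        String escaped = scriptUrl.replace("\\", "\\\\").replace("'", "\\'");
        return "(function(){var s=document.createElement('script');"
                + "s.src='" + escaped + "';"
                + "document.head.appendChild(s);})();";
    }
}
```

The returned string could be passed to view.evaluateJavascript(snippet, null), or prefixed with "javascript:" for view.loadUrl, once the page has finished loading.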
If you are looking for an automated solution, try PhantomJS, which provides an automated headless web browser and also gives access to network traffic.
Perhaps you can try a "bookmarklet":
http://www.bookmarklets.com/
The advantage over a Greasemonkey script is that it can run on both Firefox and Internet Explorer.
