C# Collecting data from website after scripts are loaded - javascript

I want to download some content from an HTML website, but the data I need appears only after the JavaScript has run (as far as I know). I tried WebClient, but it returns only the raw HTML without any changes made by JS, and as far as I know there is nothing more I can do with it. Now I'm trying the WebBrowser control in WPF and Windows Forms. I navigate to my URL, but I get JS errors and the scripts still don't load.
webBrowser1.Navigate(new Uri("http://www.polskieszlaki.pl/atrakcje/woj-slaskie/"));
How can I get the webpage fully loaded with all scripts?
By the way, I don't need a web browser as such. I just need to collect some data, so the HTML as it looks after the scripts have run is enough for me.

Related

Call JavaScript PostBack via another Script while parsing Page

I am writing a plugin for Google Chrome which extracts some data from a webpage and saves it to a local DB. I have covered all the parsing of the pages, but some info is on a different tab and for now requires me to navigate manually, because the link is not a plain URL but a JavaScript call. My question is: is there a way for my script to call a script on the page I am parsing to force the postback, and if so, how would I go about it? If not, is there another way to solve this?
Below are some links I would like to call from my script on the webpage:
Tax
OC19064888

Is there any way to run asp.net web browser control on server?

ASP.NET. VB.NET 3.5
In order to scrape image URLs automatically from some of our clients' websites, we want to inspect the DOM after JavaScript has finished running, because the rendered HTML often changes as a result of onload() JavaScript. The article:
Get the final generated html source using c# or vb.net
shows how to do that with a form containing a web browser control on the client, but is there a way to do it all on the server (since our process is called in a background thread anyway when the client navigates off a certain aspx page)?
TIA
See my comment for other links for a response.

reading dynamic/java page using webrequest

I have a VB.NET (2010) forms project, and I want to read a webpage that has JavaScript on it, generating log output.
But when I make a request to the webpage, I only get the static part of the page.
When I open the URL in a browser, it displays dynamic content and is updated regularly.
What's the best way to execute JavaScript remotely from VB.NET?

Show external website in my aspx page and include own javascript

What I am trying to accomplish:
A visitor to my website should be able to load an external website into my website and click elements on this external website to retrieve the XPath of the element. Like Firebug, but completely online.
I have already managed to create a piece of javascript to click elements on MY website and return the XPath of the element.
Now I need to know how to show an external website in my ASP.NET WebForms page and inject my own javascript.
I tried to use a literal control and download the external website's HTML code with a WebClient.
WebClient webClient = new WebClient();
string result = webClient.DownloadString(txtURL.Text);
litWebsite.Text = result;
The problem here is that the external website's design will be broken if I don't account for its CSS references.
Maybe this is a completely wrong approach.
Any ideas?
Thank you!
I'm not sure it's possible.
Sure, you could display the web page via an iframe, but unless you own the content, I'm not sure you can control it with your own JavaScript.
Maybe I'm wrong, but I've always heard that you can do very little with other sites' web content in your own project.

Block JS from loading on certain domains

I have a web service that works through giving users javascript to embed in their code. Users can also place that code on other sites to make it work there. However I also need to allow users to create a blacklist of sites that the JS should not function on. For example, a competitor or an inappropriate site.
Is there a way to check where our JS files are being loaded from, and block loading or break functionality on a per account basis?
Edit: The JavaScript loads an iframe on the site, so another solution would be to somehow block certain domains from loading an iframe from our server, or to serve different content to that iframe.
Edit 2: We're also trying to avoid doing this from within the JS, because it could be downloaded and modified to get past the block.
Inspecting the url of the page
Yes, the JavaScript file, when it starts executing, can inspect window.location.href and see if the URL of the main document is OK.
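As a rough sketch of that check: the function below compares the host of the page URL against a blocklist. The names isBlockedPage and blockedHosts and the example domains are made up for illustration; in a browser the page URL would come from window.location.href.

```javascript
// Hypothetical blocklist; in practice it would come from the account's settings.
const blockedHosts = ["competitor.example", "bad.example"];

// Returns true when the page URL's host is a blocked host or a subdomain of one.
function isBlockedPage(pageUrl, blocked) {
  const host = new URL(pageUrl).hostname.toLowerCase();
  return blocked.some((b) => host === b || host.endsWith("." + b));
}

// In a browser, the embed script could start with something like:
//   if (isBlockedPage(window.location.href, blockedHosts)) { /* do nothing */ }
```

Matching on the parsed hostname rather than on the whole URL string avoids false hits when a blocked name merely appears in a path or query string.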
To see where the script was loaded from
It can also go through the DOM, looking for the script node that brought in the JavaScript file itself, and see where the JS was loaded from.
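A minimal sketch of that lookup, assuming the embed file is called widget.js (a made-up name). The function takes any list of objects with a src property, so in a browser it could be fed Array.from(document.scripts); document.currentScript.src is a shorter alternative while the script is first executing.

```javascript
// Finds the src of the script element whose path ends with the given file name.
// `scripts` is any iterable of objects with a `src` string (absolute URL).
function findOwnScriptSrc(scripts, fileName) {
  for (const s of scripts) {
    if (!s.src) continue; // inline scripts have no src
    if (new URL(s.src).pathname.endsWith("/" + fileName)) return s.src;
  }
  return null;
}

// In a browser:
//   const src = findOwnScriptSrc(Array.from(document.scripts), "widget.js");
//   const loadedFrom = src ? new URL(src).hostname : null;
```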
However
Anyone can load the JavaScript into a text editor, change it to eliminate the tests, and then host the modified JS on their own server. Obfuscating or minifying the JS can slow someone down, but obscurity is not security.
One thing you could do is have the JavaScript load another JavaScript file that you serve from your server at a given URL. The trick here is that the URL does not point to a file but to a server endpoint that returns JavaScript. Then you have that endpoint check the routes for that user and decide whether to return the working JavaScript or an error JavaScript of some kind.
This blog post shows how to do it in PHP: dynamic-javascript-with-php
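The decision such an endpoint makes could look roughly like the sketch below. It assumes the endpoint uses the request's Referer header to tell which site is embedding the script, which the answer above does not specify; the account ID, blocklist, and function name are all hypothetical. Browsers may omit or trim the Referer and non-browser clients can forge it, so this only raises the bar.

```javascript
// Hypothetical per-account blocklists; a real service would load these from storage.
const blocklists = {
  acct42: ["competitor.example"],
};

// Decides which JavaScript body to return for one request.
// `referer` is the raw Referer header value, which may be missing.
function scriptFor(accountId, referer, lists) {
  const blocked = lists[accountId] || [];
  let host = "";
  try {
    host = new URL(referer).hostname.toLowerCase();
  } catch (e) {
    // Missing or malformed Referer: host stays empty, so the request is allowed here.
  }
  const denied = blocked.some((b) => host === b || host.endsWith("." + b));
  return denied
    ? "console.warn('This widget is not enabled for this site.');"
    : "/* real widget code would be emitted here */";
}
```

A server handler would call scriptFor with the account ID from the URL and the incoming Referer header, then send the result with a Content-Type of application/javascript. Whether to allow or deny requests with no Referer is a policy choice; this sketch allows them.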
