Get Content of a page on another domain - javascript

We want to get the HTML content of a page on another domain. The following constraints apply:
1- The login page has an "I am not a robot" reCAPTCHA.
2- Loading the page in an iframe is restricted.
3- We cannot use jQuery's get or load methods because of cross-domain restrictions.
With these limitations, is it possible to develop a crawler, or even use some client-side code, to get the data?
Thanks

Actually, no, not from the client side alone.
But you can use a backend server.
Let the server download the page and send it to the client.
This avoids the CORS restrictions, because the same-origin policy is enforced by browsers, not by server-side HTTP clients.
As for the captcha: if access to the page is gated by the captcha, then again there isn't much you can do. If it were that easy, the captcha wouldn't be used in the first place.
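A minimal sketch of such a backend, assuming Node 18+ (for the built-in fetch). The /proxy path, the url query parameter, the port, and the allow-list of hosts are all hypothetical choices, not something from the question. It does nothing about the captcha; it only shows the server fetching a page on the client's behalf.

```javascript
const http = require('node:http');

// Hypothetical allow-list: only these hosts may be fetched through the proxy.
const ALLOWED_HOSTS = new Set(['example.com', 'www.example.com']);

// Parse "/proxy?url=..." and return the target URL, or null if it is not allowed.
function parseTarget(requestUrl, allowedHosts = ALLOWED_HOSTS) {
  const { pathname, searchParams } = new URL(requestUrl, 'http://localhost');
  if (pathname !== '/proxy') return null;
  const raw = searchParams.get('url');
  if (!raw) return null;
  let target;
  try {
    target = new URL(raw);
  } catch {
    return null; // not an absolute URL
  }
  const okProtocol = target.protocol === 'http:' || target.protocol === 'https:';
  return okProtocol && allowedHosts.has(target.hostname) ? target : null;
}

// Download the requested page on the server and relay it to the browser.
async function handleRequest(req, res) {
  const target = parseTarget(req.url);
  if (!target) {
    res.writeHead(400, { 'Content-Type': 'text/plain' });
    res.end('Missing or disallowed url parameter');
    return;
  }
  try {
    const upstream = await fetch(target); // the server, not the browser, makes this request
    const body = await upstream.text();
    res.writeHead(upstream.status, { 'Content-Type': 'text/html; charset=utf-8' });
    res.end(body);
  } catch (err) {
    res.writeHead(502, { 'Content-Type': 'text/plain' });
    res.end('Upstream fetch failed');
  }
}

// Not called automatically; run startProxy() to listen.
function startProxy(port = 3000) {
  return http.createServer(handleRequest).listen(port);
}
```

The allow-list is there because a proxy that fetches any URL it is given can be abused to reach hosts the server can see but the public cannot.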

Related

Use JavaScript to crawl a website -> Possible and which IP is shown on the crawled site

Is it possible to crawl a website from within an Angular app? I am talking about requesting a website from Angular, not crawling an Angular app. If so, I am wondering which IP will be shown to the crawled website. Since JavaScript is client-side, I would guess it is the IP of the client, not of the server (as it probably would be with Node.js). As far as I know, what we can use in JS is mostly browser-implemented functionality, so is it even possible to crawl websites with methods from JavaScript (or Angular)?
Best Regards
Buzz
In theory, you can make an AJAX request to fetch the data and read the response as text. That would give you the remote document as a string. The browser wouldn't load the JavaScript and CSS referenced in that document, though. That might not be a problem, but CORS is. For security reasons, browsers prevent scripts from reading responses from another origin unless that origin allows it (otherwise, it would be too easy for a malicious page to read your data on other sites). See here for details: https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS
If you have control over the second domain, you can configure the server there to send Access-Control-Allow-Origin headers to the browser to allow access from the Angular App.
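As a rough illustration of what the second domain's server would send, here is a small helper that picks the CORS response headers for a given request origin. The origin https://app.example.com and the function name are made up for this sketch; how you attach the headers depends on the server software in use.

```javascript
// Hypothetical list of origins allowed to read responses from this server.
const ALLOWED_ORIGINS = ['https://app.example.com'];

// Return the CORS response headers for the value of a request's Origin header.
function corsHeadersFor(origin, allowedOrigins = ALLOWED_ORIGINS) {
  if (!origin || !allowedOrigins.includes(origin)) {
    return {}; // no CORS headers: the browser will block cross-origin reads
  }
  return {
    'Access-Control-Allow-Origin': origin,
    'Vary': 'Origin', // the response differs per origin, so caches must key on it
  };
}
```

In a plain Node handler this could be used as `res.writeHead(200, { 'Content-Type': 'text/html', ...corsHeadersFor(req.headers.origin) })`.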
Note: You could use an iframe to load the other website but when the domains of the current document and the one in the iframe don't match, then you can't access the contents of the iframe from JavaScript.
One way to work around this is to install a proxy on your server. The browser can then ask your server for the pages in question. In this case, the remote web site will get the IP of your server.
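On the browser side, the call to such a proxy could look roughly like this. It assumes your own server exposes a hypothetical /proxy endpoint that takes the remote address in a url query parameter; both names are placeholders.

```javascript
// Build the same-origin proxy URL for a remote page.
// '/proxy' and the 'url' parameter are assumptions about your own server.
function buildProxyUrl(targetUrl, proxyPath = '/proxy') {
  return `${proxyPath}?url=${encodeURIComponent(targetUrl)}`;
}

// Fetch the remote page's HTML through the proxy; resolves to a string.
async function fetchRemoteHtml(targetUrl) {
  const response = await fetch(buildProxyUrl(targetUrl));
  if (!response.ok) {
    throw new Error(`Proxy responded with HTTP ${response.status}`);
  }
  return response.text();
}
```

Because the request goes to your own origin, CORS does not apply to it, and the remote site only ever sees your server's IP.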

including external URL javascript file

I have a main header page that is included in many different applications across a couple of different languages, including Java and classic ASP. The file (file.js) is going to be obsolete soon. We are moving to an "out-of-the-box" solution, a new header created by another group. They gave us a link ("google.com") that we need to use to show this new header. I was wondering if there is a simple solution I could implement in my file.js that would show this content to the users. I know an easy way to do it in JSP is
<c:import url="http://google.com"/>
but this won't work in the js file, nor will it work in the jsp. Is there a way for me to do this?
Thank you,
Explosive_donut
Obviously the URL you are really given isn't Google. I suppose the second team is able to modify their own (document) headers sent to clients.
The first way I can think of is to use AJAX to retrieve the contents of the URL, then create a div (or select an existing one) and set its content to the response.
Unfortunately, AJAX is subject to the Same Origin Policy, which can be relaxed with CORS (Cross-Origin Resource Sharing). To allow CORS, the remote server needs to send the appropriate Access-Control-Allow-Origin response header; the browser adds the Origin request header on its own. Check out the link for more information.
If you need any more information and/or tutorials, let me know in the comments.

jQuery ajax submitting a form and sending data to alternative domain

I am currently looking into a couple of possibilities for a microsite that I am building. The microsite sits on a different domain from the main site, and we want to use some of the forms from the main site. However, we don't want the user to see the main site's thank-you page after a form submission.
My question is: is it possible to submit the form on the microsite to the action of the main site's form? Essentially, I want to submit a form that lives on http://domain1.com to http://domain2.com.
Will I be able to do this, given cross-site scripting restrictions etc.?
You are fighting with http://en.wikipedia.org/wiki/Same_origin_policy.
A possible solution is to use a local proxy like http://developer.yahoo.com/javascript/howto-proxy.html
Using plain AJAX this is not possible, no.
You'd need to use a local proxy of some kind to achieve this. The form should submit its data to a server-side script on the same domain. That script should submit the data (using cURL or the like) to the remote location and pass any response back to the form.
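The answer mentions cURL; the same forwarding step can be sketched in JavaScript on the server, assuming Node 18+ for the built-in fetch. The function names are invented for this example, and the remote action URL is whatever the main site's form posts to.

```javascript
// Turn submitted form fields into fetch() options for a urlencoded POST.
function buildForwardOptions(fields) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: new URLSearchParams(fields).toString(),
  };
}

// Forward the fields to the remote form action and return its status and body.
// Runs on the server, so the browser's same-origin policy is not involved.
async function forwardForm(fields, remoteAction) {
  const response = await fetch(remoteAction, buildForwardOptions(fields));
  return { status: response.status, body: await response.text() };
}
```

The same-domain script would call forwardForm with the fields it received and then render its own confirmation, so the user never sees the remote site's thank-you page.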
What you are trying to do is possible, but with some restrictions. Newer browsers support cross-domain AJAX using the Access-Control-Allow-Origin etc. headers.
You could also use cross-domain messaging (window.postMessage). For backwards compatibility with older browsers, easyXDM is an option.
Another option is to build a hidden iframe, create a form, and send the data there using a normal form with its action pointing to the other domain.
Remember though that Cross Site Request Forgery may be a problem. How do you stop other sites from posting to that same URL?
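The hidden-iframe option could be sketched as follows. The frame name is a placeholder, and the doc parameter exists only so the function can be exercised outside a browser; in a page you would omit it. Note that the script cannot read the response, since the iframe ends up on the other domain.

```javascript
// Post `fields` to `actionUrl` through a form that targets a hidden iframe.
function postViaHiddenIframe(actionUrl, fields, doc = document) {
  const iframe = doc.createElement('iframe');
  iframe.name = 'hidden-post-target'; // hypothetical frame name
  iframe.style.display = 'none';

  const form = doc.createElement('form');
  form.method = 'POST';
  form.action = actionUrl;
  form.target = iframe.name; // the response loads in the iframe, not the page
  form.style.display = 'none';

  // One hidden input per field to send.
  for (const [name, value] of Object.entries(fields)) {
    const input = doc.createElement('input');
    input.type = 'hidden';
    input.name = name;
    input.value = String(value);
    form.appendChild(input);
  }

  doc.body.appendChild(iframe);
  doc.body.appendChild(form);
  form.submit();
  return form;
}
```

Since the response stays inside the invisible frame, the user remains on the microsite and never sees the other site's thank-you page.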
You should be able to do this by setting the main site's URL as the action of the microsite form. You can also achieve it via AJAX, by sending a call to the main site's URL and fetching the result, although reading the result requires the main site to allow it via CORS.

Client Side Script to Display all Images from another website

I want to have a form on my page take user input, a URL to be precise, and once that field is complete, have the script go to the destination URL that was entered (in the background), and display all images on that page in a thumbnail view for the user to select. I have been able to get it to work using php but want a client side solution. Any suggestions?
JavaScript is restricted by the same origin policy. It will not be able to read the other site unless that site supports CORS. The other option is to use a local proxy (in a server-side language) on your domain to fetch the content.
A client-side solution would be tricky. Most browsers don't allow cross-domain AJAX calls.
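Once the page's HTML is available as a string (from a CORS-enabled site or through a proxy on your domain), listing its images in the browser might look like this sketch. The function names are invented here; DOMParser is a browser API, so imageUrlsFromHtml only runs in a page, while resolveImageUrls is plain JavaScript.

```javascript
// Resolve raw src values against the page URL and drop empties and duplicates.
function resolveImageUrls(srcValues, pageUrl) {
  const absolute = srcValues
    .filter(Boolean)
    .map((src) => new URL(src, pageUrl).href);
  return [...new Set(absolute)];
}

// Browser only: parse the HTML without executing it and collect <img> sources.
function imageUrlsFromHtml(html, pageUrl) {
  const doc = new DOMParser().parseFromString(html, 'text/html');
  const srcs = Array.from(doc.querySelectorAll('img'), (img) => img.getAttribute('src'));
  return resolveImageUrls(srcs, pageUrl);
}
```

Each returned URL can then be assigned to the src of a thumbnail img element; displaying a cross-origin image is allowed even though reading the page that referenced it was not.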
Take a look at http://en.wikipedia.org/wiki/Same_origin_policy

Upload contents of current html page from a bookmarklet

What would be a good way to upload the html content of the current page viewed in the browser to another server from a bookmarklet?
Assume the URL is on a server that requires authentication, so I want to avoid fetching the page on the server side. I would rather see if it's possible to get the contents and upload them directly from within the browser.
Thanks in advance for any suggestions
Elisha
Considering that you are most probably going to have a situation in which the page being viewed in the browser is on a different domain from the domain you want to send the data to, an AJAX request will definitely fail (due to Cross-Domain restrictions). So doing this server side would be your best bet.
1- Retrieve location.href with XHR into a string
2- Create a FORM with the desired cross-site action
3- POST the data to the server
4- ?????
5- PROFIT!
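The first three steps could be sketched like this for a bookmarklet. The field name html and the function names are assumptions, and fetch is used in place of XHR. Re-fetching location.href returns the HTML as the server sent it (with the user's cookies, since it is a same-origin request); if you want the live DOM instead, document.documentElement.outerHTML is an alternative.

```javascript
// Build a hidden POST form carrying the page HTML in one field.
// `doc` is passed in so the builder can be exercised outside a browser.
function buildUploadForm(doc, actionUrl, html) {
  const form = doc.createElement('form');
  form.method = 'POST';
  form.action = actionUrl; // the cross-site upload endpoint
  form.style.display = 'none';

  const field = doc.createElement('input');
  field.type = 'hidden';
  field.name = 'html'; // hypothetical field name expected by the upload server
  field.value = html;
  form.appendChild(field);
  return form;
}

// Browser only: fetch the current page as a string, then post it cross-site.
async function uploadCurrentPage(actionUrl) {
  const response = await fetch(location.href, { credentials: 'same-origin' });
  const html = await response.text();
  const form = buildUploadForm(document, actionUrl, html);
  document.body.appendChild(form);
  form.submit(); // a plain form POST is not blocked by the same-origin policy
}
```

A regular form submission navigates the browser to the upload server's response; the bookmarklet never reads that response, which is why the cross-domain restriction on AJAX does not get in the way.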
