How can I load some website in my java-script so that I can parse it?
I want to get Html of e.g. www.google.com and I want to select all the tags in it using jquery.
You can't as jquery doesn't allow you to load external resources, unless in the page you want to parse is present the header:
header('Access-Control-Allow-Origin: http://thesitewhereyourjscodeishosted');
If you can't set it, you could use PHP:
<script>
var website = <?php echo file_get_contents("http://websitetoload"); ?>;
</script>
Due to browser security restrictions, Ajax requests are subjected to the same origin policy; the request can not be successfully retrieve data from a different domain, subdomain, port, or protocol.
But you can build a script on your server that requests that content or can use a proxy, then use jQuery ajax to hit the script on your server.
Working Fiddle
It's just proxying a request through Yahoo's servers and getting back a JSONP response even if the requested server doesn't support JSONP.
HTML:
<div id="example"></div>
JavaScript
$('#example').load('http://wikipedia.org');
Here is a similar question like yours Ways to circumvent the same-origin policy?
good luck!
You can easily set up node server that gets the content of the page, and then make an ajax request to your server and get whatever data you need.
Related
I want to capture the HTTP request header fields, primarily the Referer and User-Agent, within my client-side JavaScript. How may I access them?
Google Analytics manages to get the data via JavaScript that they have you embed in you pages, so it is definitely possible.
Related:
Accessing the web page's HTTP Headers in JavaScript
If you want to access referrer and user-agent, those are available to client-side Javascript, but not by accessing the headers directly.
To retrieve the referrer, use document.referrer.
To access the user-agent, use navigator.userAgent.
As others have indicated, the HTTP headers are not available, but you specifically asked about the referer and user-agent, which are available via Javascript.
Almost by definition, the client-side JavaScript is not at the receiving end of a http request, so it has no headers to read. Most commonly, your JavaScript is the result of an http response. If you are trying to get the values of the http request that generated your response, you'll have to write server side code to embed those values in the JavaScript you produce.
It gets a little tricky to have server-side code generate client side code, so be sure that is what you need. For instance, if you want the User-agent information, you might find it sufficient to get the various values that JavaScript provides for browser detection. Start with navigator.appName and navigator.appVersion.
This can be accessed through Javascript because it's a property of the loaded document, not of its parent.
Here's a quick example:
<script type="text/javascript">
document.write(document.referrer);
</script>
The same thing in PHP would be:
<?php echo $_SERVER["HTTP_REFERER"]; ?>
Referer and user-agent are request header, not response header.
That means they are sent by browser, or your ajax call (which you can modify the value), and they are decided before you get HTTP response.
So basically you are not asking for a HTTP header, but a browser setting.
The value you get from document.referer and navigator.userAgent may not be the actual header, but a setting of browser.
One way to obtain the headers from JavaScript is using the WebRequest API, which allows us to access the different events that originate from http or websockets, the life cycle that follows is this:
WebRequest Lifecycle
So in order to access the headers of a page it would be like this:
browser.webRequest.onHeadersReceived.addListener(
(headersDetails)=> {
console.log("Request: " + headersDetails);
},
{urls: ["*://hostName/*"]}
);`
The issue is that in order to use this API, it must be executed from the browser, that is, the browser object refers to the browser itself (tabs, icons, configuration), and the browser does have access to all the Request and Reponse of any page , so you will have to ask the user for permissions to be able to do this (The permissions will have to be declared in the manifest for the browser to execute them)
And also being part of the browser you lose control over the pages, that is, you can no longer manipulate the DOM, (not directly) so to control the DOM again it would be done as follows:
browser.webRequest.onHeadersReceived.addListener(
browser.tabs.executeScript({
code: 'console.log("Headers success")',
});
});
or if you want to run a lot of code
browser.webRequest.onHeadersReceived.addListener(
browser.tabs.executeScript({
file: './headersReveiced.js',
});
});
Also by having control over the browser we can inject CSS and images
Documentation: https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest/onHeadersReceived
I would imagine Google grabs some data server-side - remember, when a page loads into your browser that has Google Analytics code within it, your browser makes a request to Google's servers; Google can obtain data in that way as well as through the JavaScript embedded in the page.
var ref = Request.ServerVariables("HTTP_REFERER");
Type within the quotes any other server variable name you want.
I am having problems getting an image URL from a Wordpress JSON API and fill in an image tag.
Here's my non-working code:
$(document).ready(function() {
$.getJSON('http://interelgroup.com/api/get_post/?post_id=4683', {format: "json"}, function(data) {
$('#thumb').attr("src", data.post.thumbnail_images.full.url);
});
});
And the HTML is like:
<img id="thumb" src="#">
What am I doing wrong?
Help appreciated.
Thanks!
NOTE: My real case is dynamic (I am getting a dynamic list of post IDs and looping through them with $.each()), but for this case I am providing an example with an hardcoded post ID, so you can check the JSON returned.
Your problem is because you can't do cross request using Javascript, say websiteA.com wants to fetch information from websiteB.com with a plain XMLHttpRequest. That's forbidden by the Access Control.
A resource makes a cross-origin HTTP request when it requests a resource from a different domain than the one which served itself. For example, an HTML page served from http://domain-a.com makes an <img> src request for http://domain-b.com/image.jpg. Many pages on the web today load resources like CSS stylesheets, images and scripts from separate domains.
For security reasons, browsers restrict cross-origin HTTP requests initiated from within scripts. For example, XMLHttpRequest follows the same-origin policy. So, a web application using XMLHttpRequest could only make HTTP requests to its own domain. To improve web applications, developers asked browser vendors to allow XMLHttpRequest to make cross-domain requests.
If you know the owner of the website you're trying to read, what you can do is asking them to add your domain to the whitelist in the page headers. If you do so, then you can do as much as $.getJSON as you want.
Another alternative could be using some sort of backend code to read that website and serve it locally. Say your website is example.com, you could have a PHP script that runs on example.com/retrieve.php where you can query that website, adding the "parameter" you need. After that, since example.com is your own website you can just do a $.getJSON to yourself. There's a simple PHP proxy you can use here with a bit of explanation on why you can do it this way.
A third option would be editing the Javascript code to use Yahoo! YQL service. Although there's no guarantee that'll work forever, you can use it to query the website on your behalf and then use it to print whatever you want to the screen. The downside is that this maybe is not ethically correct if you do not own the website you're trying to fetch (plus, they can add a robots.txt file preventing the access).
Hope that helps.
JSONP solves the problem. Just need to add a callback parameter and specify it is a JSONP, like:
$(document).ready(function() {
$.getJSON('http://interelgroup.com/api/get_post/?post_id=4683&callback=?', {format: "jsonp"}, function(data) {
$('#thumb').attr("src", data.post.thumbnail_images.full.url);
});
});
More info here: Changing getJSON to JSONP
Info on JSONP: https://en.wikipedia.org/wiki/JSONP
I am able to execute a web service through the browser but when I try and execute it through xmlhttprequest in javascript I get this error: Origin [] is not allowed by Access-Control-Allow-Origin.
How can I call the webservice through javascript? I'd prefer not to use a framework and just use basic client-side javascript.
For reference this is an instance of the web service I'm trying to consume: http://musicbrainz.org/ws/1/track/?type=xml&query=track:alive
Thanks.
Not possible. You can't have an AJAX call to a different domain. It would need to be done by the server.
You cannot use XMLHttpRequest on a page hosted at one domain to retrieve information from another domain. The browser simply won't allow it.
One common workaround is quite reliable if you have some control over the resource you're retrieving. It's called "JSONP" and the technique is to simply append a <script> tag to the header (dynamically, using JavaScript). That script can of course be hosted on any domain, so there's no cross-site scripting restrictions. If the script were to consist simply of JSON data, it wouldn't do much good. But wrap that JSON data in a function call — a function you control on your side — and it works great.
someFunctionName( { ... } );
If the resource you're retrieving doesn't support JSONP, your only recourse is to write a script on your own server (hosted on the same domain as the page, of course) that retrieves the target data. You can then make a normal AJAX call to your own script.
XmlHttpRequest is for requests for the same domain - it is called "Same origin policy".
If you want to do cross-domain ajax request you have to use jsonp format.
I do not think that using plain JavaScript is good idea - much better solution
is ready-to-use framework e.g. jQuery or Mootools.
In jQuery you have got method $.getJSON to making simple JSON and JSONP requests.
More read about:
How exactly is the same-domain policy enforced?
http://api.jquery.com/jQuery.getJSON/
http://en.wikipedia.org/wiki/JSON#JSONP
NOT Possible, but you can create a proxy on your server, and access the data through the proxy:
Here is a proxy example in PHP:
proxy.php
<?php
//header('Content-type: text/xml');
$url = 'http://musicbrainz.org/ws/1/track/?type=xml&query=track:alivehttp://www.example.com/';
$xml = file_get_contents($url);
echo xml;
?>
You can now use proxy.php as your source url from javascript.
I am trying to get Json format data from this website .. http://www.livetraffic.sg/feeds/json
however when i use ajax.. i run into this particular error in my chrome console.
Error:XMLHttpRequest cannot load. Origin null is not allowed by Access-Control-Allow-Origin.
Is the external website preventing my from using information?
Thanks for your help!!!
Sample of my code:
url = "http://www.livetraffic.sg/home2/get_erp_gantry";
$().ready(function(){
$.get(resturl, function(data) {
//do something here with data
});
});
This is your browser enforcing the same-origin policy. You are not allowed to make requests to domains other than the domain your script was fetched from.
You will have to set up some server-side proxy on the same domain as the one your script is served from and have it supply the data. (You could also cache this data on the server if it would make sense.)
You cannot make cross-domain JSON requests. Your browser will not allow it. If the target domain allows JSONP requests http://en.wikipedia.org/wiki/JSONP#JSONP then you'll be able to use this work-around instead. Else you'll have to make the request server-side.
Simpler you can perform an ajax query to a local php page which contains
header("Content-type: application/json; charset=utf-8");
echo file_get_contents('http://www.livetraffic.sg/home2/get_erp_gantry');
You just must have allow_url_fopen true.
Thanks All! Manage to pull down the Json data from external website using a server side PHP script and then passing variables to my javascript :)
Let's say I have the main page loaded from http://www.example.com/index.html. On that page there is js code that makes an ajax request to http://n1.example.com//echo?message=hello. When the response is received a div on the main page is updated with the response body.
Will that work on all popular browsers?
Edit:
The obvious solution is to put a proxy in front of www.example.com and n1.example.com and set it so that every request going to a subresource of http://www.example.com/n1 gets proxied to http://n1.example.com/.
Cross domain is entirely a different subject. But cross sub-domain is relatively easy. All you need to do is to set the document.domain to be same in both the parent page and the iframe page.
document.domain = "yourdomain.com"
More info here
Note: this technique will only let you interact with iframes from parents of your domain. It does not alter the Origin sent by XMLHttpRequest.
All modern browsers support CORS and henceforth we should leverage this addition.
It works on simple handshaking technique were the 2 domains communicating trust each other by way of HTTP headers sent/received. This was long awaited as same origin policy was necessary to avoid XSS and other malicious attempts.
To initiate a cross-origin request, a browser sends the request with an Origin HTTP header. The value of this header is the site that served the page. For example, suppose a page on http://www.example-social-network.com attempts to access a user's data in online-personal-calendar.com. If the user's browser implements CORS, the following request header would be sent:
Origin: http://www.example-social-network.com
If online-personal-calendar.com allows the request, it sends an Access-Control-Allow-Origin header in its response. The value of the header indicates what origin sites are allowed. For example, a response to the previous request would contain the following:
Access-Control-Allow-Origin: http://www.example-social-network.com
If the server does not allow the cross-origin request, the browser will deliver an error to example-social-network.com page instead of the online-personal-calendar.com response.
To allow access to all pages, a server can send the following response header:
Access-Control-Allow-Origin: *
However, this might not be appropriate for situations in which security is a concern.
Very well explained here in below wiki page.
http://en.wikipedia.org/wiki/Cross-origin_resource_sharing
Another solution that may or may not work for you is to dynamically insert/remove script tags in your DOM that point to the target domain. This will work if the target returns json and supports a callback.
Function to handle the result:
<script type="text/javascript">
function foo(result) {
alert( result );
}
</script>
Instead of doing an AJAX request you would dynamically insert something like this:
<script type="text/javascript" src="http://n1.example.com/echo?callback=foo"></script>
Another workaround, is to direct the ajax request to a php (for example) page on your domain, and in that page make a cURL request to the subdomain.
The simplest solution I found was to create a php on your subdomain and include your original function file within it using a full path.
Example:
www.domain.com/ajax/this_is_where_the_php_is_called.php
Subdomain:
sub.domain.com
Create:
sub.domain.com/I_need_the_function.php
Inside I_need_the_function.php just use an include:
include_once("/server/path/public_html/ajax/this_is_where_the_php_is_called.php");
Now call sub.domain.com/I_need_the_function.php from your javascript.
var sub="";
switch(window.location.hostname)
{
case "www.domain.com":
sub = "/ajax/this_is_where_the_php_is_called.php";
break;
case "domain.com":
sub = "";
break;
default: ///your subdomain (or add more "case" 's)
sub = "/I_need_the_function.php";
}
xmlHttp.open("GET",sub,true);
The example is as simple as I can make it. You may want to use better formatted paths.
I hope this helps some one. Nothing messy here - and you are calling the original file, so any edits will apply to all functions.
New idea: if you want cross subdomain (www.domain.com and sub.domain.com) and you are working on apache. things can get a lot easier. if a subdomain actually is a subdirectory in public_html (sub.domain.com = www.domain.com/sub/. so if you have ajax.domain.com/?request=subject...you can do something like this: www.domain.com/ajax/?request=subject
works like a charm for me, and no stupid hacks, proxies or difficult things to do for just a few Ajax requests!
I wrote a solution for cross sub domain and its been working for my applications. I used iframe and setting document.domain="domain.com" on both sides. You can find my solution at :
https://github.com/emphaticsunshine/Cross-sub-domain-solution