How to fetch a Wikipedia webpage with AJAX or fetch() - javascript

I want to dynamically fetch a Wikipedia webpage in the browser in order to further process the XHTML with XSLTProcessor.
Unfortunately, this does not work because I can't get Wikipedia to send the "Access-Control-Allow-Origin" header in the HTTP response.
I tried to include the "origin" parameter as it is stated on https://www.mediawiki.org/wiki/Manual:CORS, but without success.
It is important to me to obtain the complete web page HTML as it is obtained by the browser when navigating to that page, so the MediaWiki API is out of the question for me.
This is what I have tried:
var url = "https://en.wikipedia.org/wiki/Star_Trek?origin=https://my-own-page.com";
fetch(url).then(function(response){
console.log(response);
});

Unfortunately, this does not work because I can't get Wikipedia to send the "Access-Control-Allow-Origin" header in the HTTP response.
No, you can't. It is up to Wikipedia to decide if they want to explicitly grant permission to JavaScript running on other sites access to their pages.
Since this would allow users' personal information to leak (e.g. logged in Wikipedia pages display the user's username, which could be used to enhance a phishing attack), this is clearly something undesirable.
var url = "https://en.wikipedia.org/wiki/Star_Trek?origin=https://my-own-page.com";
origin is an HTTP request header, not a query string parameter, and is automatically included in cross origin XMLHttpRequest/fetch requests without you needing to do anything special.

Related

Can I use javascript to get a request header in http get request? [duplicate]

I want to capture the HTTP request header fields, primarily the Referer and User-Agent, within my client-side JavaScript. How may I access them?
Google Analytics manages to get the data via JavaScript that they have you embed in you pages, so it is definitely possible.
Related:
Accessing the web page's HTTP Headers in JavaScript
If you want to access referrer and user-agent, those are available to client-side Javascript, but not by accessing the headers directly.
To retrieve the referrer, use document.referrer.
To access the user-agent, use navigator.userAgent.
As others have indicated, the HTTP headers are not available, but you specifically asked about the referer and user-agent, which are available via Javascript.
Almost by definition, the client-side JavaScript is not at the receiving end of a http request, so it has no headers to read. Most commonly, your JavaScript is the result of an http response. If you are trying to get the values of the http request that generated your response, you'll have to write server side code to embed those values in the JavaScript you produce.
It gets a little tricky to have server-side code generate client side code, so be sure that is what you need. For instance, if you want the User-agent information, you might find it sufficient to get the various values that JavaScript provides for browser detection. Start with navigator.appName and navigator.appVersion.
This can be accessed through Javascript because it's a property of the loaded document, not of its parent.
Here's a quick example:
<script type="text/javascript">
document.write(document.referrer);
</script>
The same thing in PHP would be:
<?php echo $_SERVER["HTTP_REFERER"]; ?>
Referer and user-agent are request header, not response header.
That means they are sent by browser, or your ajax call (which you can modify the value), and they are decided before you get HTTP response.
So basically you are not asking for a HTTP header, but a browser setting.
The value you get from document.referer and navigator.userAgent may not be the actual header, but a setting of browser.
One way to obtain the headers from JavaScript is using the WebRequest API, which allows us to access the different events that originate from http or websockets, the life cycle that follows is this:
WebRequest Lifecycle
So in order to access the headers of a page it would be like this:
browser.webRequest.onHeadersReceived.addListener(
(headersDetails)=> {
console.log("Request: " + headersDetails);
},
{urls: ["*://hostName/*"]}
);`
The issue is that in order to use this API, it must be executed from the browser, that is, the browser object refers to the browser itself (tabs, icons, configuration), and the browser does have access to all the Request and Reponse of any page , so you will have to ask the user for permissions to be able to do this (The permissions will have to be declared in the manifest for the browser to execute them)
And also being part of the browser you lose control over the pages, that is, you can no longer manipulate the DOM, (not directly) so to control the DOM again it would be done as follows:
browser.webRequest.onHeadersReceived.addListener(
browser.tabs.executeScript({
code: 'console.log("Headers success")',
});
});
or if you want to run a lot of code
browser.webRequest.onHeadersReceived.addListener(
browser.tabs.executeScript({
file: './headersReveiced.js',
});
});
Also by having control over the browser we can inject CSS and images
Documentation: https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest/onHeadersReceived
I would imagine Google grabs some data server-side - remember, when a page loads into your browser that has Google Analytics code within it, your browser makes a request to Google's servers; Google can obtain data in that way as well as through the JavaScript embedded in the page.
var ref = Request.ServerVariables("HTTP_REFERER");
Type within the quotes any other server variable name you want.

Make REST call in JavaScript without using JSON?

(extremely ignorant question, I freely admit)
I have a simple web page with a button and a label. When I click the button, I want to make a REST call to an entirely different domain (cross-domain, I know that much) and display the results (HTML) in the label.
With other APIs, I've played around with using JSON/P and adding a element on the fly, but this particular API doesn't support JSON so I'm not sure how to go about successfully getting through.
The code I have is:
function getESVData() {
$.get('http://www.esvapi.org/v2/rest/passageQuery?key=IP&passage=John+1', function (data) {
$('#bibleText').html(data);
app.showNotification("Note:", "Load performed.");
});
}
I get an "Access denied." Is there anyway to make this call successfully without JSON?
First off, JSON and JSONP are not the same. JSON is a way of representing information, and JSONP is a hack around the same-origin policy. JSONP works by requesting information from another domain, and that domain returns a script which calls a function (with the name you provided) with the information. You are indeed executing a script on your site that another domain gave to you, so you should trust this other domain.
Now when trying to make cross domain requests you basically have 3 options:
Use JSONP. This has limitations, including the fact that it only works for GET requests, and the server you are sending the request to has to support it.
Make a Cross Origin Resource Sharing (CORS) request. This also must be supported by the server you are sending the request to.
Set up a proxy on your own server. In this situation you set an endpoint on your site that simply relays requests. ie you request the information from your server, your server gets it from the other server and returns it to you.
For your situation, it the other server doesn't have support for other options, it seems like you will have to go with options 3.

How can I access HTTP request headers from Javascript? [duplicate]

I want to capture the HTTP request header fields, primarily the Referer and User-Agent, within my client-side JavaScript. How may I access them?
Google Analytics manages to get the data via JavaScript that they have you embed in you pages, so it is definitely possible.
Related:
Accessing the web page's HTTP Headers in JavaScript
If you want to access referrer and user-agent, those are available to client-side Javascript, but not by accessing the headers directly.
To retrieve the referrer, use document.referrer.
To access the user-agent, use navigator.userAgent.
As others have indicated, the HTTP headers are not available, but you specifically asked about the referer and user-agent, which are available via Javascript.
Almost by definition, the client-side JavaScript is not at the receiving end of a http request, so it has no headers to read. Most commonly, your JavaScript is the result of an http response. If you are trying to get the values of the http request that generated your response, you'll have to write server side code to embed those values in the JavaScript you produce.
It gets a little tricky to have server-side code generate client side code, so be sure that is what you need. For instance, if you want the User-agent information, you might find it sufficient to get the various values that JavaScript provides for browser detection. Start with navigator.appName and navigator.appVersion.
This can be accessed through Javascript because it's a property of the loaded document, not of its parent.
Here's a quick example:
<script type="text/javascript">
document.write(document.referrer);
</script>
The same thing in PHP would be:
<?php echo $_SERVER["HTTP_REFERER"]; ?>
Referer and user-agent are request header, not response header.
That means they are sent by browser, or your ajax call (which you can modify the value), and they are decided before you get HTTP response.
So basically you are not asking for a HTTP header, but a browser setting.
The value you get from document.referer and navigator.userAgent may not be the actual header, but a setting of browser.
One way to obtain the headers from JavaScript is using the WebRequest API, which allows us to access the different events that originate from http or websockets, the life cycle that follows is this:
WebRequest Lifecycle
So in order to access the headers of a page it would be like this:
browser.webRequest.onHeadersReceived.addListener(
(headersDetails)=> {
console.log("Request: " + headersDetails);
},
{urls: ["*://hostName/*"]}
);`
The issue is that in order to use this API, it must be executed from the browser, that is, the browser object refers to the browser itself (tabs, icons, configuration), and the browser does have access to all the Request and Reponse of any page , so you will have to ask the user for permissions to be able to do this (The permissions will have to be declared in the manifest for the browser to execute them)
And also being part of the browser you lose control over the pages, that is, you can no longer manipulate the DOM, (not directly) so to control the DOM again it would be done as follows:
browser.webRequest.onHeadersReceived.addListener(
browser.tabs.executeScript({
code: 'console.log("Headers success")',
});
});
or if you want to run a lot of code
browser.webRequest.onHeadersReceived.addListener(
browser.tabs.executeScript({
file: './headersReveiced.js',
});
});
Also by having control over the browser we can inject CSS and images
Documentation: https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest/onHeadersReceived
I would imagine Google grabs some data server-side - remember, when a page loads into your browser that has Google Analytics code within it, your browser makes a request to Google's servers; Google can obtain data in that way as well as through the JavaScript embedded in the page.
var ref = Request.ServerVariables("HTTP_REFERER");
Type within the quotes any other server variable name you want.

jQuery cross domain image upload

Ok, so basically.
I inject some javascript code into a web page and it uploads an image on that page to another server.
Now I have it working when I run it on my domain (of course), but I need to post the multipart/form-data request to a PHP file that I do not own.
Since it is a upload and not a simple request to just get data, I cannot use jsonp in the initial call since the response would not be in json.
Using James Padolsey's cross domain script, I am able to do $.get and $.post request across domains, but since I am using $.ajax it does not work.
He uses the Yahoo Query Language to acomplish this
This is basically how I am making the request
$.ajax({
url: 'http://website.com/upload.php',
type: 'POST',
contentType:'multipart/form-data',
data: postData,
success: successCallback,
error : function(XMLHttpRequest, textStatus, errorThrown) {
console.log('Error');
}
});
I want to make it completely JavaScript based to avoid making my server do the request.
So to re-cap, I can get the image bytes and make the request with javascript. But so far I cannot make it cross domain since I am $.ajax to set the content Type to "multipart/form-data".
Is there another way to make the request cross domain with or without the YQL?
Making the request with an iframe will not work since the domain of the iframe would change and I would not have access to the response.
This is a well known and difficult problem for web development, know as the Same Origin Policy
Javascript prevents access to most methods and properties to pages across different origins. The term "origin" is defined using the domain name, application layer protocol, and (in most browsers) port number of the HTML document running the script. Two resources are considered to be of the same origin if and only if all these values are exactly the same.
There are several ways around this.
Create your own proxy
Create a page that simply forwards the request to the other server, and returns its response
or, Use Apache's rules to form a proxy (see above link)
Use someone else's proxy
For GET requests which are typical Use YQL to access yahoo's proxy
For POST requests, if the 3rd party supports Open Data Tables
or, Use some other public proxy
See if the 3rd party conforms to the CORS specification
Cross domain POST query using Cross-Origin Resource Sharing getting no data back
If you are willing to allow a little flash on your page, try flXHR
it claims to implement the exact XHR api and also has a jquery plugin
These are pretty much your only options

Cross-domain website promotion

I'd like to offer a way to my users to promote my website, blog etc. on their website.
I can make a banner, logo whatever that they can embed to their site, but I'd like to offer dynamic content, like "the 5 newest entry's title from my blog".
The problem is the same origin policy. I know there is a solution (and I use it): they embed a simple div and a JavaScript file. The JS makes an XmlHttpRequest to my server and gets the data as JSONP, parses the data and inserts into the div.
But is it the only way? Isn't there a better way I could do this?
On the Internet there are tons of widget (or whatever, I don't know how they call...) that gain the data from another domain. How they do that?
A common theme of many of the solutions, instead, is getting JavaScript to call a proxy program (either on the client or the server) which, in turn, calls the web service for you.
The output can be written to the response stream and then is available, via the normal channels, such as the responseText and responseXML properties of XMLHttpRequest.
you can find more solution here :
http://developer.yahoo.com/javascript/howto-proxy.html
or here :
http://www.simple-talk.com/dotnet/asp.net/calling-cross-domain-web-services-in-ajax/
CORS is a different way than JSONP.
Plain AJAX. All your server has to do is to set a specific header: Access-Control-Allow-Origin
More here: http://hacks.mozilla.org/2009/07/cross-site-xmlhttprequest-with-cors/
If you go the JSONP route, you will implicitly ask your users to trust you, as they will give you full access to the resources of their page (content, cookies,...). If they know that they main complain.
While if you go the iframe route there is no problems.One famous example today of embeddable content by iframe is the Like button of facebook.
And making that server side with a proxy or other methods would be much more complex, as there are plenty of environments out there. I don't know other ways.
You can also set the HTTP Access-Control headers in the server side. This way you're basically controlling from the server side on whether the client who has fired the XMLHttpRequest is allowed to process the response. Any recent (and decent) webbrowser will take action accordingly.
Here's a PHP-targeted example how to set the headers accordingly.
header('Access-Control-Allow-Origin: *'); // Everone may process the response.
header('Access-Control-Max-Age: 604800'); // Client may cache this for one week.
header('Access-Control-Allow-Methods: GET, POST'); // Allowed request methods.
The key is Access-Control-Allow-Origin: *. This informs the client that requests originating from * (in fact, everywhere) is allowed to process the response. If you set it to for example Access-Control-Allow-Origin: http://example.com, then the webbrowser may only process the response when the initial page is been served from the mentioned domain.
See also:
MDC - HTTP Access Control

Categories

Resources