Cross Domain JavaScript Bookmarklet

I've been at this for several days, and searches, including here, haven't given me any solutions yet.
I am creating a bookmarklet that interacts with a POST API. I've gotten most of it working except the most important part: sending data from the iframe (I know, horrible! If anyone knows a better solution, please let me know) to the JavaScript on my domain (the same domain as the API, so communicating with the API is no problem).
From the page where the user clicks the bookmarklet, I need to get the following data to the JavaScript included in the iframe:
var title = pageData[0].title;
var address = pageData[0].address;
var lastmodified = pageData[0].lastmodified;
var referralurl = pageData[0].referralurl;
I first solved this by serializing the data as JSON and sending it through the name="" attribute of the iframe, but realized this breaks on about 20% of web pages: I get an "access denied" error. It's also not a very pretty method.
Does anyone have any idea how I can solve this? I am not looking to use POSTs that redirect; I want it all to be Ajax and as unobtrusive as possible. It's also worth noting that I use the jQuery library.
Thank you very much,
Ice

You should look into easyXDM; it's very easy to use. Check out one of the examples at http://consumer.easyxdm.net/current/example/methods.html
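For a rough idea of what that would look like here, a minimal sketch based on the linked methods example; the provider URL and the sendData method are placeholders you would define on a page hosted on your own (API) domain:
var rpc = new easyXDM.Rpc({
    // page on your domain that declares the local sendData method
    remote: "http://yourdomain.example/bookmarklet/provider.html"
}, {
    remote: {
        sendData: {} // stub for the method implemented by the provider page
    }
});

// Call the cross-domain method as if it were local.
rpc.sendData(title, address, lastmodified, referralurl, function (response) {
    // handle the API's response here
});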

After a lot of work I was able to find a solution using JSONP, which enables cross-domain JavaScript. It's very tricky with the CodeIgniter framework because passing data along the URLs requires a lot of encoding and making sure you don't have illegal characters. Also, I'm still looking into how secure it really is.
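For anyone curious what JSONP looks like under the hood, here is a rough sketch; the endpoint and parameter names are made up, and the server must wrap its JSON reply in a call to the named callback:
// Globally visible callback the server's response will invoke,
// e.g. handleResponse({"status": "ok"});
function handleResponse(data) {
    console.log(data);
}

// Data travels in the query string, hence all the encoding.
var script = document.createElement('script');
script.src = 'http://yourdomain.example/api/save' +
    '?callback=handleResponse' +
    '&title=' + encodeURIComponent(title) +
    '&address=' + encodeURIComponent(address);
document.getElementsByTagName('head')[0].appendChild(script);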

If I understand your question correctly, you might have some success by looking into using a script tag proxy. This is the standard way to do cross-domain Ajax in JavaScript frameworks like jQuery and ExtJS.
See the jQuery Ajax documentation.
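With jQuery, for example, you can let the library handle the script-tag injection via dataType: "jsonp". A minimal sketch (the URL is a placeholder; the server must support JSONP callbacks):
$.ajax({
    url: 'http://yourdomain.example/api/save',
    dataType: 'jsonp', // jQuery creates the script tag and callback for you
    data: { title: title, address: address },
    success: function (response) {
        console.log(response);
    }
});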

If you need to pass data to the iframe, and the iframe is actually including another page, but that other page is on the same domain as the parent (a lot of assumptions, I know), then the main page code can do this:
window.DATA_FOR_IFRAME = { 'whatever': 'stuff' }; // explicit global on the parent window
Then the code on the page included by the iframe can do this:
window.parent.DATA_FOR_IFRAME;
to get at the data :)

Related

How do I access the console of the website that I want to extract data from?

Sorry for the confusing title. I am a beginner in JavaScript and would like to build this little project to increase my skill level: an image extractor. The user should be able to enter a website address into a form input, press Extract, and see the links of all images on that page.
Question: how do I access the DOM of the website that was entered into the input field?
As mentioned by @Quentin in the comments, browsers enforce restrictions on cross-domain requests like this. The Same Origin Policy will prevent your site from pulling the HTML source of a page on a different domain.
Since this is a learning exercise, I'd recommend picking another task that doesn't get into the weeds of cross-origin request security issues. Alternatively, you could implement a "scraper" like this outside the browser using Node (JavaScript), Python, PHP, Ruby, or many other scripting languages.
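For instance, here is a minimal sketch of that out-of-browser approach in Node, using only the built-in https module (the target URL is a placeholder, and the regex is a crude stand-in for a real HTML parser):
var https = require('https');

https.get('https://example.com/', function (res) {
    var html = '';
    res.on('data', function (chunk) { html += chunk; });
    res.on('end', function () {
        // Crude <img src="..."> extraction; a real scraper would
        // use a proper HTML parser instead of a regex.
        var re = /<img[^>]+src=["']([^"']+)["']/g;
        var match;
        while ((match = re.exec(html)) !== null) {
            console.log(match[1]);
        }
    });
});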
You could try something like this if you already have the HTML content:
var html = document.createElement('html');
html.innerHTML = "<html><body><div><img src='image-url.png'></div></body></html>";
console.log(html.querySelector("img").src);
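In modern browsers, a DOMParser-based variant of the same idea avoids creating a throwaway element (a sketch; note that parsing "text/html" with DOMParser is not supported in older browsers):
var doc = new DOMParser().parseFromString(
    "<html><body><div><img src='image-url.png'></div></body></html>",
    "text/html"
);
console.log(doc.querySelector("img").src);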
If you also need to fetch the content via Ajax calls, I would suggest doing the whole thing server-side, using something like Scrapy.

PHP HttpRequest to create a web page - how to handle long response times?

I am currently using JavaScript and XMLHttpRequest on a static HTML page to create a view of a record in Zotero. This works nicely except for one thing: the page's HTML title.
I can of course also change the <title>...</title> tag, but if someone wants to post the view to, for example, Facebook, the static title of the web page will be shown there.
I can't think of any way to fix this with just a static page and JavaScript. I believe I need a dynamically created page from a server that does something similar to XMLHttpRequest.
For PHP there is HttpRequest. Now to the problem: in the JavaScript version I can use asynchronous calls, but with PHP I think I need synchronous calls. Is that something to worry about?
Is there perhaps some other way to handle this that I am not aware of?
UPDATE: It looks like those trying to answer are not at all familiar with Zotero. I should have been more clear. Zotero is a reference database located at http://zotero.org/. It has an API that can be used through XMLHttpRequest (which is what I said above).
Now I cannot use that in my scenario, which I described above. So I want to call the Zotero server from my server instead (through PHP or something else).
(If you are not familiar with the concepts, it might of course be hard to understand and answer the question.)
UPDATE 2: For those interested in how Facebook scrapes a URL you post there, please test here: https://developers.facebook.com/tools/debug
As you can see by testing there, no JavaScript is run.
Sorry, I'm not sure I understand what you are trying to ask. Are you just wanting to change the page's title?
Why not use JavaScript?
document.title = newTitle;
Facebook expects the title (or Open Graph og:title tags) to be present when it fetches the page. It won't execute any JavaScript for you to fill in the blanks.
A cool workaround would be to detect the Facebook scraper in PHP by parsing the User-Agent string, and to serve a version of the page with the information already filled in by PHP instead of JavaScript.
As far as I know, the Facebook scraper uses this User-Agent header: "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
You can check whether part of that string is present in the header and load the page accordingly.
if (strpos($_SERVER['HTTP_USER_AGENT'], 'facebookexternalhit') !== false) {
    // synchronously load the title and Open Graph tags here
} else {
    // load the page normally
}

Looking for a way to scrape HTML with JS

As the title suggests, I'm looking for a hopefully straightforward way of scraping all of the HTML from a web page, storing it in a string perhaps, and then navigating through that string to pull out the desired element.
Specifically, I want to scrape my Twitter page and display my profile picture inside a new div. I know there are several tools for doing just this, but would anyone have some code examples or suggestions for how I might do this myself?
Thanks a lot
UPDATE
After a very helpful response from T.J. Crowder I did some more searching online and found this resource.
In theory, this is easy. You just do an ajax call to get the text of the page, then use jQuery to turn that into a disconnected DOM, and then use all the usual jQuery tools to find and extract what you need.
$.ajax({
    url: "http://example.com/some/path",
    success: function(html) {
        var tree = $(html);
        var imgsrc = tree.find("img.some-class").attr("src");
        if (imgsrc) {
            // ...add the image to your page
        }
    }
});
But (and it's a big one) it's not likely to work, because of the Same Origin Policy, which prevents cross-origin ajax calls. Certain individual sites may have an open CORS policy, but most won't, and of course supporting CORS on IE8 and IE9 requires an extra jQuery plug-in.
So to do this with sites that don't allow your origin via CORS, there must be a server involved. It can be your server and you can grab the text of the page you want using server-side code and then send it to your page via ajax (or just build the bits you want into your page when you first render it). All of the usual server-side stacks (PHP, Node, ASP.Net, JVM, ...) have the ability to grab web pages. Or, in some cases, you may be able to use YQL as a cross-domain proxy, using their server rather than your own.

Run/inject javascript on page to get the html and post it to a URL

I used to just go to "View source" in the browser, grab all the HTML, and post it into a form on my page. But now that delayed loading of some of the content via Ajax has been implemented, I can't do this anymore.
It was not a problem doing it the old way, but that no longer works, since I'm missing important information.
Is it possible to somehow run JavaScript in the browser, like from a bookmark shortcut or something like that, so I can grab all the HTML (or, better yet, filter some of the data first) and then post it back to my site?
I have no idea what this is called or if it's even possible.
I guess a browser extension could do this, but making one for all browsers would be a pain, if this could instead be done with JavaScript.
All ideas are welcome.
If you are using jQuery, you could just use Ajax and send the HTML of the body (or whatever area of the page you want) to your server.
$.post('url-to-send.ext', { data: $('body').html() });
So, after a lot of searching... I finally found the answer to my own question.
Bookmarklets: http://en.wikipedia.org/wiki/Bookmarklet
As described here: http://www.learningjquery.com/2006/12/jquerify-bookmarklet, they let you inject jQuery into the site:
Create the following as a bookmark:
var s=document.createElement('script');
s.setAttribute('src','https://ajax.googleapis.com/ajax/libs/jquery/1.6.4/jquery.min.js');
document.getElementsByTagName('body')[0].appendChild(s);
Now it's just a matter of extending it to fetch the information I need. Neat little trick, I would say.
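As a rough sketch of what that extension might look like (the receive URL is a placeholder, and the receiving endpoint would need to accept a cross-origin POST, e.g. via CORS):
var s = document.createElement('script');
s.setAttribute('src', 'https://ajax.googleapis.com/ajax/libs/jquery/1.6.4/jquery.min.js');
s.onload = function () {
    // Once jQuery has loaded, grab the rendered HTML and post it home.
    jQuery.post('https://example.com/receive', {
        url: location.href,
        html: jQuery('body').html()
    });
};
document.getElementsByTagName('body')[0].appendChild(s);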

How do you get content from another domain with .load()?

Requesting data from any location on my domain with .load() (or any of jQuery's Ajax functions) works just fine.
Trying to access a URL on a different domain doesn't work, though. How do you do it? The other domain also happens to be mine.
I read about a trick you can do with PHP: you make a proxy that gets the content, then use jQuery's Ajax functions against that PHP location on your own server. But that's still using jQuery Ajax on your own server, so it doesn't count.
Is there a good plugin?
EDIT: I found a very nice plugin for jQuery that allows you to request content from other pages using any of the jQuery functions, in just the same way you would make a normal Ajax request on your own domain.
The post: http://james.padolsey.com/javascript/cross-domain-requests-with-jquery/
The plugin: https://github.com/jamespadolsey/jQuery-Plugins/tree/master/cross-domain-ajax/
This is because of the cross-domain policy, which, in short, means that using a client-side script (a.k.a. JavaScript...) you cannot request data from another domain. Luckily for us, this restriction does not exist in most server-side scripts.
So...
Javascript:
$("#google-html").load("google-html.php");
PHP in "google-html.php":
echo file_get_contents("http://www.google.com/");
would work.
Different domains = different servers, as far as your browser is concerned. Either use JSONP to do the request, or use PHP to proxy. You can use jQuery.ajax() to do a cross-domain JSONP request.
One really easy workaround is to use Yahoo's YQL service, which can retrieve content from any external site.
I've successfully done this on a few sites following this example which uses just JavaScript and YQL.
http://icant.co.uk/articles/crossdomain-ajax-with-jquery/using-yql.html
This example is a part of a blog post which outlines a few other solutions as well.
http://www.wait-till-i.com/2010/01/10/loading-external-content-with-ajax-using-jquery-and-yql/
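For reference, a minimal sketch of the YQL approach from those posts; it assumes Yahoo's public YQL endpoint and its html table, which may change or be retired:
var site = "http://www.example.com";
var yql = "http://query.yahooapis.com/v1/public/yql?q=" +
    encodeURIComponent('select * from html where url="' + site + '"') +
    "&format=xml&callback=?";

// The &callback=? makes jQuery treat this as a JSONP request.
$.getJSON(yql, function (data) {
    if (data.results && data.results[0]) {
        $("#container").html(data.results[0]); // inject the fetched markup
    }
});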
I know of another solution that works.
It does not require that you alter jQuery. It does require that you can stand up an ASP page on your domain. I have used this method myself.
1) Create a proxy.asp page like the one on this page: http://www.itbsllc.com/zip/proxyscripts.html
2) You can then use the jQuery load function and feed it proxy.asp?url=.......
There is an example on that link of how exactly to format it.
Anyway, you feed the foreign page URL and your desired MIME type as GET variables to your local proxy.asp page. The two MIME types I have used are text/html and image/jpg.
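A hypothetical call might look like the following; the exact query-string parameter names depend on how the proxy.asp from that link is written:
// Parameter names here are assumptions; check the linked script.
$("#target").load(
    "proxy.asp?url=" + encodeURIComponent("http://other-domain.example/page.html") +
    "&mimeType=" + encodeURIComponent("text/html")
);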
Note, if your target page has images with relative source links, those probably won't load.
I hope this helps.
