I am pretty new to front-end and trying to confirm conceptually first before I start to implement it.
For example, I want to return a static HTML file in response to the user's REST call. That way, I can open an HTML page on the user's side and get input. Along those lines, I want to insert Puppeteer code inside the static HTML file to navigate to certain websites before getting the user's input.
Does it make sense? If not, can you please explain why?
It doesn't make sense to me. Puppeteer is a Node.js library. It's meant to be run on a machine, not in the browser. Based on what you want to do, I would explore using an <iframe> to load the other websites and writing JavaScript to control the <iframe> and get whatever input you need.
I am developing a small labeling tool that given a URL should display a document hosted on that URL and allow a user to choose a label for that document.
I want to display the contents of the URL for this purpose. As far as I know, I can either get the URL content, parse the contents, and display or use an iframe option.
Without using parser
Iframes are not enabled for the target URL, the contents of which I want to display. Is there any other way to do this using javascript without using parser?
Using parser
I can crawl the contents of the URL, get everything between and dump it in the webpage area.
I'm new to javascript and front end development so I am not sure whether these are the only options.
Are there other options to do this?
If the parser is the only option, can I dump the HTML that I get from the remote URL? I understand that images and other media on the remote URL won't be displayed. Are there any other caveats to this method? More importantly, is this the best way to do this?
Most sites, such as CodePen, do it via an iframe like you mentioned.
Also, you can use Puppeteer (a Node.js library that drives headless Chrome) to do this sort of thing. Get the contents using web scraping, take a screenshot, or print a PDF. Pretty nifty library.
Most things that you can do manually in the browser can be done using Puppeteer! Here are a few examples to get you started:
Generate screenshots and PDFs of pages.
Crawl a SPA (Single-Page Application) and generate pre-rendered content (i.e. "SSR" (Server-Side Rendering)).
Automate form submission, UI testing, keyboard input, etc.
Create an up-to-date, automated testing environment. Run your tests directly in the latest version of Chrome using the latest JavaScript and browser features.
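As a rough illustration of running Puppeteer from Node.js (on a server or your own machine, not inside the page), the sketch below loads a URL, waits for the network to settle, and returns the rendered HTML. It assumes the puppeteer package is installed; the helper name fetchRenderedHtml and the example URL are made up for illustration.

```javascript
// Sketch only: assumes `npm install puppeteer` has been run.
// fetchRenderedHtml is a hypothetical helper name, not part of Puppeteer.
async function fetchRenderedHtml(url) {
  // Required lazily so this file can still be loaded where Puppeteer is absent.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // 'networkidle2' waits until at most two network connections remain open.
    await page.goto(url, { waitUntil: 'networkidle2' });
    // page.content() returns the DOM serialized after scripts have run.
    return await page.content();
  } finally {
    // Always close the browser, even if navigation failed.
    await browser.close();
  }
}

// Example usage (placeholder URL):
// fetchRenderedHtml('https://example.com').then(html => console.log(html.length));
```

The string this returns could then be sent to the labeling page by your own server, which avoids the iframe restriction on the target site.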
Hope this helps !
I want to get the INSPECT ELEMENT data of a website, say Truecaller, so that I can get the name of the person whose mobile number I searched for.
But whenever I write a Python script, it gives me the PAGE SOURCE, which does not contain the required information.
Kindly help me. I am a beginner, so kindly excuse any mistakes in the question.
TL;DR: Use Selenium (and PhantomJS)
The view page source will give you the HTML that was loaded when you made a request for the page (which is most likely what you are getting when you make a request from Python).
Since nowadays a lot of pages load things and modify the DOM after the initial html was loaded, you will not get most of the information you want just by looking into that initial response.
To get the inspect element information you will need some sort of web browser to actually go to the page, wait for the information you want to load, and then use it. However you still want to do this in your python script.
Enter selenium, which is a tool for browser automation (mostly used for testing webpages). You can create a python script that opens a browser page and executes whatever code you write for it to do (even wait for a while and search for an after load DOM element!). Your script will still open a browser (which is kind of weird I would guess).
Enter PhantomJS, another library that you can use to have a headless browser to do all your web testing without having to rely on the actual browser UI.
Using selenium only you might achieve your goals, but with phantomjs you can do that in an even cleaner way! Good Luck.
INSPECT ELEMENT and VIEW PAGE SOURCE are not the same.
View source shows you the original HTML source of the page. When you view source from the browser, you get the HTML as it was delivered by the server, not after javascript does its thing.
The inspector shows you the DOM as it was interpreted by the browser. This includes for example changes made by javascript which cannot be seen in the HTML source.
What you see in the element inspector is not the source code anymore. You see a JavaScript-manipulated version.
Instead of trying to execute all the scripts on your own, which may lead to multiple problems such as cross-origin security restrictions, search the network tab for the actual search request and its parameters. Then request the data from there; that is the trick.
Also, it seems like you need to be logged in to search on the URL you provided, so you may eventually need to supply the cookie, session, and header values just like a request from your browser would.
So what I want to say is: if the data you are looking for is not in the source, analyse where it is coming from.
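The "request the data from there" idea might look roughly like the sketch below. The question uses Python, but the sketch is in JavaScript (Node 18+ with the built-in fetch) to match the rest of this page. The endpoint, the parameter names, and the cookie value are all placeholders; the real ones have to be copied from the request shown in the network tab.

```javascript
// Sketch only: endpoint, query parameter names and cookie are placeholders;
// copy the real ones from the request shown in the browser's network tab.
function buildSearchRequest(endpoint, params, cookie) {
  const url = new URL(endpoint);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value);
  }
  const headers = { Accept: 'application/json' };
  // Send the session cookie only if the site requires a login.
  if (cookie) headers.Cookie = cookie;
  return { url: url.toString(), options: { method: 'GET', headers } };
}

// Requires Node 18+ (global fetch). Assumes the endpoint answers with JSON.
async function searchDirectly(endpoint, params, cookie) {
  const { url, options } = buildSearchRequest(endpoint, params, cookie);
  const response = await fetch(url, options);
  if (!response.ok) throw new Error('Request failed: ' + response.status);
  return response.json();
}
```

Keeping the request-building step separate makes it easy to compare the generated URL and headers with what the browser actually sent.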
I am trying to make a script but I can't really find a solution.
I'm trying to find a string from a website. The hard part here is that I can't use
document.documentElement.innerHTML.search("string")
Since I can't do it locally, I want to use something like this:
var link = "myweb.com"
link.documentElement.innerHTML.search("string")
At the moment, my script generates the link, opens it and closes it: I just need to search the webpage for the word "error."
Javascript run inside of a client's browser won't actually be able to retrieve another website's html for you (unless it is a different page on your own website). You may want to read about the Same-Origin Policy.
You can, however, use javascript as a language to do what you want - just not running inside of a browser. You can use something called Node.js, which is simply a program you can use to run javascript outside of a browser.
What it really boils down to is that if you want to scrape another website (which is the term for what you are trying to do), you typically need to make a scraper that runs on a server, and not a browser.
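A minimal sketch of such a server-side check, assuming Node 18+ (which ships a global fetch); the function names and the example URL are placeholders:

```javascript
// Sketch only: runs under Node 18+ (global fetch), not inside a web page.
// Returns true when the text occurs in the HTML, ignoring letter case.
function htmlContains(html, needle) {
  return html.toLowerCase().includes(needle.toLowerCase());
}

// Download a page on the server side and search it for a word.
async function pageContains(url, needle) {
  const response = await fetch(url);
  if (!response.ok) throw new Error('Request failed: ' + response.status);
  return htmlContains(await response.text(), needle);
}

// Example usage (placeholder URL):
// pageContains('https://example.com/', 'error').then(found => console.log(found));
```

Note that this only sees the HTML as the server delivered it; text that the page adds later with its own scripts will not be found this way.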
To be complete, a (probably shady) way to scrape another website is to:
Have your server-side code fetch another website's contents
Use AJAX to pass the contents to a client's browser
Have the client do all of the processing
Optionally send the scraped information back to your server
Here is a good article on scraping with nodeJS.
If you just need it to work on your own computer, you can make a userscript that will do this easily. If you want it to work as part of a hosted website, you need a server-side solution.
I'm trying to build a sample bookmarklet to grab the current webpage's source code and pass it to a validator. The validator is not an online website, but a folder with a bunch of JavaScript and HTML files. I'm trying to open the file:///C:/Users/Electrifyings/Desktop/Validator/Main.html file with the help of JavaScript bookmarklet code and put the source code in the textarea in the newly opened window, but it is not working for some reason that I'm not aware of.
Here is the sample code with algorithm:
javascript:(function(){var t = document.body.innerHTML;window.open('file:///C:/Users/RandomHero/Desktop/test.html',_self);document.getElementById("validator_textarea")=t;})()
Here are the steps:
Grab current web page source code in a variable.
Open locally stored HTML web page in current or new window or new tab (either way is fine with me, but no luck)
Put the source code from the variable into the validator textarea of the newly opened HTML file.
I have tried above code with a lot of variations, but got stuck on the part where it opens the new window. Either it's not opening the new window at all or it is opening blank window without loading the file.
Would love to get some help with this issue, thanks a lot.
Oh and btw,
Windows 7 x64, Tried IE, Firefox and Chrome. All latest and stable builds. I guess it's not a browser side issues, but something related to javascript code not opening the URI with file:/// protocol. Let me know if any more details are needed. :)
You wouldn't want a webpage you visit to be able to open up file://c:/Program Files/Quicken/YourSensitiveTaxInfo right? Because then if you make a mistake and go to a "bad" website (either a sleazy one or a good one that's been compromised by hackers), evil people on the intarweb would suddenly have access to your private info. That would suck.
Browser makers know this, and for that reason they put VERY strict limits to prevent Javascript code from accessing files on a user's local computer. This is what is getting in the way of your plan.
Solutions?
build the whole validator into the bookmarklet (not likely to work unless it's really small)
put your validator code up on the web somewhere
write a plug-in (because the user has to choose to install a plug-in, they get much more freedom than webpages ... even though for Firefox, Chrome, etc. plug-ins are basically just Javascript)
* * Edit * *
Extra bonus solution, if you don't limit yourself to a purely-client-side implementation:
Have your bookmarklet add a normal (HTML) form to the page.
Also add an iframe to the page (it's ok if you hide it with CSS styling)
Set the form's target attribute to point to the iframe. This will make it so that, when the user submits the form and the server replies back to that submission, the server's reply will go to the (hidden) iframe, instead of replacing the page as it normally would.
Add a file input to your form - you won't be able to access the file within that input using Javascript, but that's ok because your server will be doing the accessing, not your bookmarklet.
Write a server-side script which takes the form submissions, reads the file that came with it, and then parrots that file back as the response. In other words, you'll have a URL that you can POST to, and when it sees a file in the POST's contents, it will respond back with the contents of that file.
Now that you've got all that, the user can pick their validator file using the file input and upload it to your server; your server will respond back with the file it just got, and that file will appear as the contents of the iframe.
And now that you finally have the file that you worked so hard to get (inside your iframe), you can do $('#thatIframe').contents().find('body').html() and voila, you have access to your file. You can save the current page's source and then replace the whole page with that uploaded file (and then pass the saved page source back to the new validator page), or you can do whatever else you want with the contents of the uploaded validator file.
Of course, if the file doesn't vary from computer to computer, you can make all of that much simpler by just having a server that sends the validator file back; this could be a pure Apache server with no logic whatsoever, as all it would have to do is serve a static file.
Either way though, if you go with this approach and your new file upload script is not on the same server as your starting webpage, you will have a new security problem: cross-domain script limitations. However, these limitations are much less strict than local file access ones, so there are ways to work around them (JSONP, cross-site policy files, etc.). There are already tons of great Stack Overflow posts explaining these techniques, so I won't bother repeating them here.
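The form-and-hidden-iframe steps described above could be sketched roughly like this. The upload URL, the frame name, and the field name are placeholders, and the server-side echo script is assumed to exist separately.

```javascript
// Sketch only: uploadUrl, 'validatorFrame' and 'validatorFile' are placeholders.
// `doc` is the page's document; passing it in keeps the function easy to test.
function addUploadForm(doc, uploadUrl) {
  const frame = doc.createElement('iframe');
  frame.name = 'validatorFrame';
  frame.style.display = 'none'; // hidden, which the steps above allow

  const form = doc.createElement('form');
  form.method = 'POST';
  form.action = uploadUrl;
  form.enctype = 'multipart/form-data'; // needed for file uploads
  form.target = frame.name; // the server's reply loads into the iframe

  const fileInput = doc.createElement('input');
  fileInput.type = 'file';
  fileInput.name = 'validatorFile';
  form.appendChild(fileInput);

  const submit = doc.createElement('input');
  submit.type = 'submit';
  form.appendChild(submit);

  doc.body.appendChild(frame);
  doc.body.appendChild(form);
  return { form, frame };
}

// In a bookmarklet: addUploadForm(document, 'https://your.server.example/echo');
```

Once the user submits the form, the iframe's load event is the point at which its contents can be read, subject to the cross-domain limits mentioned above.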
Hope that helps.
I'm developing a modal/popup system for my users to embed in their sites, along the lines of what KissInsights and Hello Bar (example here and here) do.
What is the best practice for architecting services like this? It looks like users embed a bit of JS but that code then inserts additional script tag.
I'm wondering how it communicates with the web service to get the user's content, etc.
TIA
You're right that usually it's simply a script that the customer embeds on their website. However, what comes after that is a bit more complicated.
1. Embed a script
The first step as said is to have a script on the target page.
Essentially this script is just a piece of JavaScript code. It's pretty similar to what you'd have on your own page.
This script should generate the content on the customer's page that you wish to display.
However, there are some things you need to take into account:
You can't use any libraries (or if you do, be very careful what you use): These may conflict with what is already on the page, and break the customer's site. You don't want to do that.
Never override anything, as overriding may break the customer's site: This includes event listeners, native object properties, whatever. For example, always use addEventListener (or attachEvent in old versions of Internet Explorer) to register events, because these allow you to have multiple listeners.
You can't trust any styles: All styles of HTML elements you create must be inlined, because the customer's website may have its own CSS styling for them.
You can't add any CSS rules of your own: These may again break the customer's site.
These rules apply to any script or content you run directly on the customer site. If you create an iframe and display your content there, you can ignore these rules in any content that is inside the frame.
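A small sketch of a widget that tries to follow these rules: no libraries, nothing overridden, inline styles only, and no CSS rules added to the page. The banner text and the styling values are placeholders.

```javascript
// Sketch only: the widget text and styling are placeholders.
// Everything is local to this function, so it adds just one name to the page.
function createBanner(doc, text, onClose) {
  const bar = doc.createElement('div');
  // Inline styles only: no stylesheet rules that could affect the host page.
  bar.style.cssText =
    'position:fixed;top:0;left:0;right:0;padding:8px;' +
    'background:#222;color:#fff;font:14px sans-serif;z-index:2147483647;';
  bar.textContent = text;

  const close = doc.createElement('button');
  close.textContent = 'x';
  close.style.cssText = 'margin-left:12px;';
  // addEventListener adds a listener without replacing the page's own handlers.
  close.addEventListener('click', function () {
    if (bar.parentNode) bar.parentNode.removeChild(bar);
    if (onClose) onClose();
  });

  bar.appendChild(close);
  doc.body.appendChild(bar);
  return bar;
}

// On the customer's page: createBanner(document, 'Hello from the widget');
```

In a real embed the function itself would normally be wrapped in an immediately invoked function so that not even this one name ends up on the customer's global object.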
2. Process script on your server
Your embeddable script should usually be generated by a script on your server. This allows you to include logic such as choosing what to display based on parameters, or data from your application's database.
This can be written in any language you like.
Typically your script URL should include some kind of an identifier so that you know what to display. For example, you can use the ID to tell which customer's site it is or other things like that.
If your application requires users to log in, you can process this just like normal. The fact the server-side script is being called by the other website makes no difference.
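As a sketch of generating the script on the server from an identifier in the URL (for example /embed.js?id=abc123), something like the following could work in Node.js. The customer table, its fields, and the URL shape are invented for illustration; a real application would read them from its database.

```javascript
// Sketch only: the customer table and its fields are invented for illustration.
const customers = {
  abc123: { message: 'Free shipping this week', link: 'https://shop.example/offers' },
};

// Build the JavaScript source that is sent back for /embed.js?id=<customerId>.
function renderEmbedScript(customerId) {
  // hasOwnProperty guards against ids such as "constructor".
  const config = Object.prototype.hasOwnProperty.call(customers, customerId)
    ? customers[customerId]
    : null;
  if (!config) return '/* unknown customer id */';
  // JSON.stringify turns the settings into a valid JavaScript object literal.
  return '(function(){var config=' + JSON.stringify(config) + ';' +
    '/* widget code that reads config goes here */})();';
}
```

The returned string would be sent with a JavaScript content type, so the customer's page can load it with an ordinary script tag.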
3. Communication between the embedded script and your server or frames
There are a few tricks to this as well.
As you may know, XMLHttpRequest does not work across different domains unless the receiving server explicitly allows it with CORS headers, so you can't use it out of the box.
The simplest way to send data over from the other site would be to use an iframe and have the user submit a form inside the iframe (or run an XMLHttpRequest inside the frame, since the iframe's content resides on your own server so there is no cross domain communication).
If your embedded script displays content in an iframe dialog, you may need to be able to tell the script embedded on the customer site when to close the iframe. This can be achieved for example by using window.postMessage.
For postMessage, see http://ejohn.org/blog/cross-window-messaging/
For cross-domain communication, see http://softwareas.com/cross-domain-communication-with-iframes
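Closing the iframe dialog via window.postMessage might be sketched as follows. The origin https://widgets.example.com and the message shape { type: 'close' } are assumptions made for the example.

```javascript
// Sketch only: the origin and the message shape ({ type: 'close' }) are assumptions.
// This part runs in the script embedded on the customer's page.
function makeCloseHandler(trustedOrigin, frame) {
  return function onMessage(event) {
    // Ignore messages from any origin other than our own server.
    if (event.origin !== trustedOrigin) return false;
    if (!event.data || event.data.type !== 'close') return false;
    // Remove the dialog's iframe from the customer's page.
    if (frame.parentNode) frame.parentNode.removeChild(frame);
    return true;
  };
}

// In the embedding page:
//   window.addEventListener('message', makeCloseHandler('https://widgets.example.com', frame));
// Inside the iframe, to ask the parent page to close it:
//   window.parent.postMessage({ type: 'close' }, '*');
```

Checking event.origin matters because any window can post a message to the page; the '*' target on the sending side is only reasonable here because the message carries nothing sensitive.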
You could take a look here - it's an example of an API created using my JsApiToolkit, a framework for allowing service providers to easily create and distribute Facebook Connect-like tools to third-party sites.
The library is built on top of easyXDM for Cross Domain Messaging, and facilitates interaction via modal dialogs or via popups.
The code and the readme should be sufficient to explain how things fit together (it's really not too complicated once you abstract away things like the XDM).
About the embedding itself; you can do this directly, but most services use a 'bootstrapping' script that can easily be updated to point to the real files - this small file could be served with a cache pragma that would ensure that it was not cached for too long, while the injected files could be served as long living files.
This way you only incur the overhead of re-downloading the bootstrapper instead of the entire set of scripts.
Best practice is to put as little code as possible into your code snippet, so you don't ever have to ask the users to update their code. For instance:
<script type="text/javascript" src="http://your.site.com/somecode.js"></script>
This works fine if the author embeds it inside their page. Otherwise, if you need a bookmarklet, you can use this code to load your script on any page:
javascript:(function(){
  /* Create a script element pointing at the hosted code */
  var e=document.createElement('script');
  e.src='http://your.site.com/somecode.js';
  /* Appending it to the head makes the browser download and run it */
  document.head.appendChild(e);
})();
Now all your code will live at the above referenced URI, and whenever their page is loaded, a fresh copy of your code will be downloaded and executed. (not taking caching settings into account)
From that script, just make sure that you don't clobber namespaces, and check if a library exists before loading another. Use the safe jQuery object instead of $ if you are using that. And if you want to load more external content (like jQuery, UI stuff, etc.) use the onload handler to detect when they are fully loaded. For example:
function jsLoad(loc, callback){
  var e=document.createElement('script');
  e.src=loc;
  // Run the callback once the script has finished loading
  if (callback) e.onload = callback;
  document.head.appendChild(e);
}
Then you can simply call this function to load any js file, and execute a callback function.
jsLoad('http://link.to/some.js', function(){
// do some stuff
});
Now, a tricky way to communicate with your domain to retrieve data is to use javascript as the transport. For instance:
jsLoad('http://link.to/someother.js?data=xy&callback=getSome', function(){
var yourData = getSome();
});
Your server will have to dynamically process that route, and return some javascript that has a "getSome" function that does what you want it to. For instance:
function getSome(){
return {'some':'data','more':'data'};
}
That will pretty effectively allow you to communicate with your server and process data from anywhere your server can get it.
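On the server side, producing that response could look roughly like the sketch below. renderCallbackScript is a hypothetical helper, and validating the callback name is an addition that the description above does not mention.

```javascript
// Sketch only: renderCallbackScript is a hypothetical server-side helper.
// It builds the JavaScript returned for ...?data=xy&callback=getSome.
function renderCallbackScript(callbackName, data) {
  // Accept only plain identifiers so a caller cannot inject arbitrary code.
  if (!/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(callbackName)) {
    throw new Error('Invalid callback name');
  }
  // The generated script defines a function that returns the data.
  return 'function ' + callbackName + '(){return ' + JSON.stringify(data) + ';}';
}

// Example:
// renderCallbackScript('getSome', { some: 'data' })
//   returns 'function getSome(){return {"some":"data"};}'
```

Because the function name comes from the query string, checking it before echoing it back keeps the endpoint from serving attacker-chosen code to the customer's visitors.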
You can serve a dynamically generated JS file from your server (using, for example, PHP or Ruby on Rails to generate the file on each request) that is imported from the customer's web site like this:
<script type="text/javascript" src="//www.yourserver.com/dynamic.js"></script>
Then you need to provide a way for your customer to decide what they want the modal/popup to contain (e.g. text, graphics, links etc.). Either you create a simple CMS or you do it manually for each customer.
Your server can see where each request for the JS file is coming from and provide different JS code based on that. The JS code can for example insert HTML code into your customer's web site that creates a bar at the top with some text and a link.
If you want to access your customer's visitors' info, you probably need to either read it from the HTML code, make your customers provide the information you want in a specific way, or figure out a different way to access it from each customer's web server.