Is there a way to use jsdom in a foolproof sandbox? - javascript

I'm using jsdom to load web pages with my Node.js application.
Sometimes, I don't get the full DOM because some web pages use scripts to load their content dynamically after the onload event is triggered.
jsdom deactivates the execution of these scripts by default because it would cause a security flaw, as stated in their documentation:
"The jsdom sandbox is not foolproof, and code running inside the DOM's <script>s can, if it tries hard enough, get access to the Node.js environment, and thus to your machine."
I was wondering whether there is a way to make it foolproof with some workaround. I'm fairly new to Node.js development, and since it is a single-threaded environment, I'm not sure how to create a secure sandbox.

Node.js does not have this kind of security out of the box. If you'll be running untrusted, third-party code in your Node process, you'll need to use operating system tools to isolate and secure it.
Things you could look into:
Using a chroot jail.
Using a virtual machine.
Using a Docker container.
Using the jailed sandbox library (haven't used it myself, but it has a good reputation).
Do some research on these approaches and their limitations, and see which suits your purpose best. A virtual machine will offer the greatest isolation and least chance for error, I think, but it has the greatest overhead. All approaches could be made to work.
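As a rough illustration of the Docker option, here is a minimal sketch in which the parent process never runs page scripts itself and instead starts a locked-down container per page. The image name and its behaviour (taking a URL as its argument and printing the rendered HTML to stdout) are assumptions for the example, not something jsdom or Docker provides. The flags are standard `docker run` options, but which ones you need depends on your threat model.

```javascript
const { spawn } = require("node:child_process");

// Build the `docker run` argument list with some common hardening flags.
// `image` is a hypothetical image whose entrypoint loads `url` in jsdom
// (with scripts enabled) and prints the resulting HTML to stdout.
function buildDockerArgs(image, url) {
  return [
    "run",
    "--rm",                                 // remove the container when it exits
    "--read-only",                          // root filesystem is not writable
    "--cap-drop", "ALL",                    // drop all Linux capabilities
    "--security-opt", "no-new-privileges",  // block privilege escalation
    "--memory", "256m",                     // cap memory use
    "--pids-limit", "64",                   // cap the number of processes
    "--user", "1000:1000",                  // do not run as root in the container
    image,
    url,
  ];
}

// Run the container and resolve with whatever it wrote to stdout.
// No timeout is applied here; a real setup would want to stop the
// container if the page scripts hang.
function renderInContainer(image, url) {
  return new Promise((resolve, reject) => {
    const child = spawn("docker", buildDockerArgs(image, url), {
      stdio: ["ignore", "pipe", "inherit"],
    });
    let output = "";
    child.stdout.on("data", (chunk) => { output += chunk; });
    child.on("error", reject);
    child.on("close", (code) => {
      if (code === 0) resolve(output);
      else reject(new Error(`container exited with code ${code}`));
    });
  });
}
```

The network is left enabled because the page scripts need it to load their content. If that is a concern, the container can be attached to a restricted Docker network instead.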

Related

Node.js Automate Login and Upload

I currently have a node.js script that automatically creates a group of files and then zips them ready for being uploaded on a site. I'm trying to add one extra piece of functionality to the script that will log into the site and upload the file itself.
I've done some reading around and found a lot about headless browsers but not sure if that's the right path to go down as they seem to rely on other applications like chromium and they're focused on testing sites.
Does anyone know where I should start looking?
In my current project I am using Puppeteer, a library from Google. I personally found it very easy to use, and it even provides access to the DevTools Protocol that Google Chrome exposes.
Regarding your concern that headless browsers "seem to rely on other applications like chromium and they're focused on testing sites":
Yes, they are often used for testing, to ensure that the correct things are rendered on screen, etc. However, in many scenarios like yours, using a headless browser to interact with a website is entirely legitimate outside of testing.
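A minimal sketch of what the login-and-upload step could look like with Puppeteer follows. Every URL, selector, and field name in `opts` is a placeholder that has to be adapted to the actual site, and the code assumes classic form posts that trigger a page navigation. The function takes an already opened Puppeteer `page` so the browser setup stays separate.

```javascript
// Log in through a form, then attach a file to an upload input and submit it.
// `page` is a Puppeteer Page; all values in `opts` are hypothetical.
async function loginAndUpload(page, opts) {
  await page.goto(opts.loginUrl);
  await page.type(opts.userSelector, opts.username);
  await page.type(opts.passSelector, opts.password);

  // Start waiting for the navigation before clicking, so it is not missed.
  await Promise.all([
    page.waitForNavigation(),
    page.click(opts.loginSubmitSelector),
  ]);

  await page.goto(opts.uploadUrl);
  const fileInput = await page.$(opts.fileSelector);
  if (!fileInput) {
    throw new Error(`no element matches ${opts.fileSelector}`);
  }
  // Sets the file on an <input type="file"> element.
  await fileInput.uploadFile(opts.zipPath);

  // Assumes the upload form performs a normal (non-AJAX) submit.
  await Promise.all([
    page.waitForNavigation(),
    page.click(opts.uploadSubmitSelector),
  ]);
}

// Usage, with the puppeteer package installed:
//   const puppeteer = require("puppeteer");
//   const browser = await puppeteer.launch();
//   const page = await browser.newPage();
//   await loginAndUpload(page, { /* site-specific values */ });
//   await browser.close();
```

If the site offers an HTTP API for uploads, calling that directly is usually simpler than driving a browser. The approach above is for sites that only expose a web form.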

How to test code that uses browsers APIs

I'm using Node for running my unit-tests.
I have a JavaScript module that runs in the browser and that I'd like to test.
My code is "isomorphic", i.e. it avoids language features not available in Node, like exports.
But it uses browser-only APIs: XMLHttpRequest, FormData and File.
I have found Node implementations for each of them.
But the XMLHttpRequest one does not support upload.
So I'm looking for the simplest way to unit-test this code in an environment that provides these APIs.
The code does not need the DOM or other browser APIs, "only" these three.
I've already used PhantomJS for other needs but:
this will create another test workflow (minor issue),
it supports an older JavaScript version and it would force a complete rewrite of the code to test (major issue),
the code has a lot of NPM dependencies that probably won't be compatible (blocking issue).
As the code is Browserified all these issues may disappear but before going along this way I'd like to be sure.
Is there any chance of getting it to work with PhantomJS, CasperJS or the like?
Which other alternatives are available?
This is not how you test code that runs in a browser. If it runs in a browser, it needs to be tested in a browser.
You need to look into solutions based on the WebDriver spec. The big hairy monster in this ecosystem is Selenium. I'm currently researching this topic because of some issues we've had with using selenium-server. You should also look into Nightwatch and Leadfoot. Webdriver.io is the first thing a lot of people recommend, as it's a Node-based client that wraps (poorly) around Selenium. But the documentation is all over the place and we've run into frequent bugs using it.

Getting a greasemonkey script to interact with a running process?

Say I have a local daemon running on my machine, and I want to talk to the daemon from a Greasemonkey script. I know that one of the core concepts of site JavaScript is that it is isolated from everything else, but I was wondering if there was a workaround.
One of the ideas I had was to use a WebSocket to send data to the local daemon, but WebSockets are only available in WebKit-based browsers.
Three possibilities:
Give the daemon web-server capabilities and then use normal AJAX techniques via GM_xmlhttpRequest() to interact with it.
Instead of a GM script, make a Firefox add-on. Add-ons can interact with the local system in much more dangerous ways than a GM script can.
I do not recommend this last approach, but include it for completeness... It may be possible for the daemon to read and/or write Firefox cookies or localStorage. GM scripts can also, but XSS restrictions apply here (unlike with GM_xmlhttpRequest()).
You could get the daemon to accept HTTP requests, which are very easy to make from JavaScript. I think you are going to need to improve the daemon here, rather than the script itself. JavaScript is very secure, and Greasemonkey just takes that a step further.
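To make the "give the daemon web-server capabilities" idea concrete, here is a minimal sketch of a small HTTP front end, written in Node.js purely for illustration (the question does not say what the daemon is written in). The paths `/status` and `/command`, the JSON shapes, and port 8123 are all made up for the example.

```javascript
const http = require("node:http");

// Decide how to answer one request. Kept separate from the server so the
// logic can be exercised without opening a socket.
function handleRequest(method, path, body) {
  if (method === "GET" && path === "/status") {
    return { status: 200, body: JSON.stringify({ running: true }) };
  }
  if (method === "POST" && path === "/command") {
    let command;
    try {
      command = JSON.parse(body).command;
    } catch {
      return { status: 400, body: JSON.stringify({ error: "invalid JSON" }) };
    }
    if (typeof command !== "string") {
      return { status: 400, body: JSON.stringify({ error: "missing command" }) };
    }
    // A real daemon would act on the command here.
    return { status: 200, body: JSON.stringify({ received: command }) };
  }
  return { status: 404, body: JSON.stringify({ error: "not found" }) };
}

// Wrap the handler in an HTTP server; the caller decides where to listen.
function createDaemonServer() {
  return http.createServer((req, res) => {
    let body = "";
    req.on("data", (chunk) => { body += chunk; });
    req.on("end", () => {
      const reply = handleRequest(req.method, req.url, body);
      res.writeHead(reply.status, { "Content-Type": "application/json" });
      res.end(reply.body);
    });
  });
}

// Start it on the loopback interface only:
//   createDaemonServer().listen(8123, "127.0.0.1");
//
// From the userscript (needs `// @grant GM_xmlhttpRequest` in its header):
//   GM_xmlhttpRequest({
//     method: "GET",
//     url: "http://127.0.0.1:8123/status",
//     onload: (response) => console.log(response.responseText),
//   });
```

Listening on 127.0.0.1 keeps other machines out, but any local program or web page can still reach the port, so a shared secret in a request header is worth considering.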

Using node.js in production?

To those in the business of web development, is node.js ready for production use? How reliable is it?
Node.js is absolutely ready for production in terms of things like system stability, power, and performance. However, some features might still change before version 1, and there are a lot of mature tools on other platforms that don't quite exist yet for node (though new things are popping up on node every day).
Several businesses are already using node.js in production. For a few, check out https://github.com/joyent/node/wiki/Projects,-Applications,-and-Companies-Using-Node
outloud.fm uses it, and it seems to work pretty well.
I don't run that site, though, so I can't speak from personal experience.
Does it do what you want?
Does it run like you want it to without crashing for your needs?
Then it's production use ready.

Executing JavaScript with Python without X

I want to parse an HTML page that unfortunately requires JavaScript to show any content. To do so I use a small Python script that pulls the HTML code of the page, but after that I have to execute the JavaScript in a DOM context, which seems pretty hard.
To make it even harder, I want to use it in a server environment that has no X11 server.
Note: I already read about http://code.google.com/p/pywebkitgtk/ but it seems to need an X server.
You can simulate a browser environment using EnvJS. However, in order to make use of it, you will have to embed some kind of JavaScript runtime (e.g. Rhino) in your program (or spawn one as an external process).
You could try using Xvfb to have a fake frame buffer, so you won't need to run X11 (though it may be a dependency of Xvfb on your system). Most rendering engines don't have a headless mode, so something like Xvfb is necessary to run them. I used this technique successfully using XULRunner to navigate web pages, though not from python.
I'm still trying to figure this out myself, so take my answer with a grain of salt.
So far, I found http://blog.motane.lu/2009/06/18/pywebkitgtk-execute-javascript-from-python/, which describes the use and the quirks of Pywebkitgtk by someone who has similar needs to what we do.
Later, however, the writer of that blog post discovered that he couldn't get it to work with Xvfb, so he hunted some more and found a Qt WebKit binding (possibly in Qt itself, if I understand correctly): http://blog.motane.lu/2009/07/07/downloading-a-pages-content-with-python-and-webkit/. Apparently it's a much better solution than PyWebKitGTK.
Naturally, I'll be looking into the other solutions offered here--but I wanted to bring up the Qt solution, because to me, it seems the most likely candidate for what I want to do...and if not, then perhaps it will be for someone else, looking for an answer to this question! :-)
I use VNC or Xvfb for this purpose, combined with Firefox. After experimenting with the two, I settled on XTightVNC. We use it to create screenshots on demand for various test purposes. It's nice to use one of these because you're executing it in an actual browser, same as a user would be (though most users probably won't be using the same OS as your server).
The handy thing about using VNC is that you can connect remotely to set up and test the browser when needed.
This might help: http://code.google.com/p/pyv8/
