I am trying to write an HTML document parser in Python. I am very familiar with jQuery, and I would like to use its traversal functionality to parse these HTML files and return the data gathered with jQuery to my Python program.
Is there any way to run JavaScript from Python? Or is this just a pipe dream?
You might not need to do this. There is a Python module called PyQuery that directly emulates the API for jQuery. It works exactly as you would expect it to in almost every way. Give it a shot!
jQuery itself does not contain an HTML/XML parser at all. It uses the browser to do all its parsing. Thus, even if you figure out how to run JavaScript from Python, it won't do you any good.
jQuery doesn't parse HTML - it traverses the DOM. You'd need an entire rendering engine (e.g. WebKit) if you wanted to use jQuery to work on the HTML.
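Since jQuery relies on something else to parse the markup, the parsing step can stay in Python. The following is a minimal sketch, assuming only the standard library's `html.parser` module (it is not PyQuery, whose API is not shown here). The class name `LinkCollector`, the helper `collect_links`, and the sample markup are hypothetical, chosen for illustration:

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, roughly what $('a') would select."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


def collect_links(html_text):
    """Parse html_text and return the href values in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


sample = '<html><body><a href="/one">1</a><p><a href="/two">2</a></p></body></html>'
print(collect_links(sample))
```

This only handles one fixed query; a selector-style API such as the one PyQuery offers is more convenient once the queries get more varied.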
From your question, it seems you will need a Python-JavaScript bridge such as
Pyjamas http://pyjs.org/ , PyPy http://codespeak.net/pypy/dist/pypy/doc/ , or Skulpt http://www.skulpt.org/ . My personal favorite is PyXPCOM http://pyxpcomext.mozdev.org/ : it installs a Python backend directly into the Firefox browser, and using XPI stubs one can make bidirectional calls (mind you, it is very complicated).
Related
I am a beginner in JavaScript and decided to practice by solving problems with it. I found an online judge that accepts JavaScript V8 4.8.0 code.
So I searched online for a way to get that version of V8 onto my machine, but I couldn't find an easy one. All the pages explained how to build it, which seems to be a process I shouldn't need to go through.
Is there an easy way to compile and run command line apps written in Javascript on my machine?
Note: I don't want to use node.js because I tried using its I/O and, as a beginner, I found it somewhat complex.
Update: I found the package manager pbox.me, which provides a version of the V8 JavaScript engine, and I managed to install it.
Yet another problem appeared: whenever I try to run a JS file by typing d8 myfile.js at the command line, nothing happens, as if it were an empty program. I tried running the d8.exe file directly and it works, and I made sure its directory is in the PATH environment variable.
What am I doing wrong?
The easiest way to get started with JavaScript is probably to use it in a browser. You can type simple things directly into the browser's JavaScript console (check the menu); or you can embed your code in a simple HTML document.
If you want, you can even pretty easily implement the readline()/print() functions, so you can pretend to be doing stdin/stdout based I/O: just read from an array of strings, and send output to console.log (or create DOM nodes if you want to be fancy and/or learn how to generate dynamic website content by hand).
Side note: V8 4.8 is severely outdated, don't use it to execute code you haven't written yourself.
I've written / modified a couple custom snippets via the Ruby bundles (Ugh, Yuck!) but I'd like to get a little more complex...
a) Can I parse / modify the current document? (usually JavaScript)
b) Can I get at the tree of project files and read the contents?
c) Is it possible to write commands in not-Ruby? JavaScript or Python for example?
Specifically, I'd like to write something that automatically manages imports (something I miss from ActionScript editors) to cut down on manually typing:
var MyClass = require('path/to/MyClass');
and then manually sorting them
over and over and over...
You can do anything you like that you could do using Java/Eclipse. Unfortunately, no other languages are supported yet natively (i.e. JavaScript or Python), though you might try looking at some of the related pages here: http://code.google.com/p/jrfonseca/wiki/PythonMonkey
To your points, I would investigate https://wiki.appcelerator.org/display/tis/Interacting+with+Eclipse+or+Java as that will give some information on how to call Java classes from within Ruby.
For projects, I would look at the navigator framework.
For parsing/AST, I would suggest looking at the JavaScript parser/editor in the Aptana source code on github: https://github.com/aptana/studio3/tree/development/plugins/com.aptana.editor.js
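The sorting half of the import-management task does not depend on the editor at all. The following is a minimal sketch in Python, assuming every import sits on its own line in the `var Name = require('path');` form shown in the question and that the imports form one contiguous block at the top of the file. The function name `sort_requires` is hypothetical, and this is not an Aptana command; it only shows the text transformation such a command would perform:

```python
import re

# Matches lines such as: var MyClass = require('path/to/MyClass');
REQUIRE_RE = re.compile(r"^\s*var\s+(\w+)\s*=\s*require\((['\"])(.+?)\2\);?\s*$")


def sort_requires(source):
    """Return source with its leading run of require lines sorted by name.

    Only the contiguous block of require lines at the top is reordered
    (exact duplicate lines are dropped); everything after it is untouched.
    """
    lines = source.splitlines()
    end = 0
    while end < len(lines) and REQUIRE_RE.match(lines[end]):
        end += 1

    def sort_key(line):
        # Case-insensitive variable name first, full line as a tie-breaker.
        return (REQUIRE_RE.match(line).group(1).lower(), line)

    block = sorted(set(lines[:end]), key=sort_key)
    return "\n".join(block + lines[end:])
```

Adding a missing import automatically is the harder half, since it needs to know which names the file uses; that is where a real parser such as the one in the Aptana source would come in.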
I will try to summarize as best I can what I need and what is blocking me from doing it.
What I need
I need to append script tags to the head of an HTML file, BUT during my "build" process. I'm using Ant as an automated build tool, and I would like to avoid placing tokens in my HTML file and then replacing them with Ant. I would also like to avoid any halfway solution based on regular expression matching. What I would really like is to use plain JavaScript running in the Rhino JavaScript interpreter, execute it easily from an Ant task, and add the script tag dynamically.
What is blocking me?
I really don't know of any way to load an HTML file without issuing an HTTP GET or POST request. Because I'm building my code from source, I don't have it under an HTTP server, so I wish I could find some way to load the HTML DOM into a JavaScript variable and then write it back out with the new script tag that I need.
I need all the DOM manipulation features without having a browser that renders the HTML file.
Best!
Demian
From what I understand you would like to have a valid DOM object from an HTML file, as if you were running in a browser, but do it "offline"? e.g. be able to do a jQuery selector on the DOM and edit it?
You could start by looking into an embedded open source browser (http://www.chromium.org/).
But I would look into node.js; see the question "Can I use jQuery with Node.js?"
As far as I understand, this will allow you to do DOM traversal and modification without a browser.
Is it possible to run JavaScript with Python? Is there any library that makes this possible?
I need to execute some JavaScript, I know that this is possible with some Java libraries, but I prefer Python.
Can someone give me a clue on this?
Best Regards,
You can check spidermonkey
If you already use PyQt and QWebView in it, displaying custom html, the function evaluateJavaScript of QWebFrame may be useful for you:
# Python
def runJavaScriptText(self, jsText):
    jsText = 'hello()'  # just to fit javascript example
    self.webView.page().currentFrame().evaluateJavaScript(jsText)
// Javascript
function hello() {
    alert('hello');
};
Using spidermonkey would give you tighter integration with your code, but as a workaround, you could have the JavaScript run in a browser using Selenium remote-control:
http://seleniumhq.org/projects/remote-control/
(there are ways of doing that without needing a "physical" display for the browser, using VNC servers, for example)
Does it need to be CPython?
And does the JavaScript need a browser client environment?
If not, you can probably call Rhino from Jython.
(Note also that the Jython release is only at 2.5.2.)
Yes you can execute JavaScript from Python. I find the easiest way is through Python's bindings to the webkit library - here is an example. In my experience selenium and spidermonkey are harder to get working.
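If the JavaScript does not need a browser environment, one more low-tech route, not proposed in the answers above, is to hand the code to an external interpreter through `subprocess`. The sketch below assumes a `node` executable is on the PATH (an assumption, none of the answers in this thread use it) and relies on Node's `-p` flag, which evaluates an expression and prints the result. The function name `run_js` is hypothetical:

```python
import shutil
import subprocess


def run_js(expression, interpreter="node"):
    """Evaluate a JavaScript expression in an external process.

    Returns the printed result as a string, or None when the interpreter
    executable cannot be found on the PATH.
    """
    exe = shutil.which(interpreter)
    if exe is None:
        return None
    # `node -p EXPR` evaluates EXPR and prints the result to stdout.
    completed = subprocess.run(
        [exe, "-p", expression],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the script fails
        timeout=10,
    )
    return completed.stdout.strip()


print(run_js("[1, 2, 3].map(function (x) { return x * 2; }).join(',')"))
```

Starting a process per call is slow and only passes strings back and forth, so an in-process binding such as spidermonkey is still the better fit when the two sides need to exchange objects.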
So we're writing a full-text search framework for MongoDB. MongoDB is pretty much JavaScript-native, so we wrote the JavaScript library first, and it works.
Now I'm trying to write a Python framework for it, which will be partially in Python but will partially use those same stored JavaScript functions; the JavaScript functions are an intrinsic part of the library. On the other hand, the JavaScript framework does not depend on Python. Since they are pretty intertwined, it seems worthwhile to keep them in the same repository.
I'm trying to work out a way of structuring the whole project to give the javascript and python frameworks equal status (maybe a ruby driver or whatever in the future?), but still allow the python library to install nicely.
Currently it looks like this: (simplified a little)
javascript/jstest/test1.js
javascript/mongo-fulltext/search.js
javascript/mongo-fulltext/util.js
python/docs/indext.rst
python/tests/search_test.py
python/tests/__init__.py
python/mongofulltextsearch/__init__.py
python/mongofulltextsearch/mongo_search.py
python/mongofulltextsearch/util.py
python/setup.py
I've left out a few files for simplicity, but you get the general idea; it's pretty much a standard Python project... except that it depends critically on a whole bunch of JavaScript which is stored in a sibling directory tree.
What's the preferred setup for dealing with this kind of thing when it comes to setuptools? I can work out how to use package_data etc to install data files that live inside my python project as per the setuptools docs.
The problem arises if I want to use setuptools to install things, including the JavaScript files from outside the Python code tree, and then access them in a consistent way both when I'm developing the Python code and when it is easy_installed to someone's site.
Is that supported behaviour for setuptools? Should I be using paver or distutils2 or Distribute or something? (Basic distutils is not an option; the whole reason I'm doing this is to enable requirements tracking.) How should I be reading the contents of those files into Python scripts?
The short answer is that none of the Python distribution tools is going to do what you want, the exact way you want it. Even if you use distutils' data_files feature, you're still going to have to have your javascript files copied into your Python project directory (i.e., somewhere under the same directory as your setup.py.)
Given that, you might as well just copy the .js files to your package (i.e. alongside mongofulltextsearch/__init__.py) as part of your build process, and use package_data or include_package_data=True.
(Or alternatively, you could possibly use symlinks, externals, or some such, if your revision control system supports those. I believe that when building source distributions, the Python distribution tools convert symlinks to real files. At least, you could give that a try.)
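The copy-into-the-package step suggested above can be sketched as a small build helper. This is a minimal sketch under stated assumptions: the function names `copy_js_into_package` and `read_js` and the `js` subdirectory are hypothetical, and only the standard library is used. It copies the `.js` files that sit directly inside the sibling JavaScript directory and reads one back the way the installed package could:

```python
import shutil
from pathlib import Path


def copy_js_into_package(js_dir, package_dir, subdir="js"):
    """Copy the .js files directly inside js_dir into package_dir/subdir.

    Returns the sorted list of copied file names so a build script can
    log them. Files with other extensions are ignored.
    """
    target = Path(package_dir) / subdir
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(Path(js_dir).glob("*.js")):
        shutil.copy2(src, target / src.name)
        copied.append(src.name)
    return copied


def read_js(package_dir, name, subdir="js"):
    """Read one copied script back as text, relative to the package directory."""
    return (Path(package_dir) / subdir / name).read_text(encoding="utf-8")
```

Inside the package itself, `package_dir` would typically be `Path(__file__).parent`, and `setup.py` would then list the copied files with something like `package_data={"mongofulltextsearch": ["js/*.js"]}` so that they are installed alongside the Python modules.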