Images, OCR and greasemonkey - javascript

I have a webpage on which I'm doing tons of cool stuff using Greasemonkey. I'm actually pretty terrible at Greasemonkey/JavaScript but I'm learning quickly. Every once in a while I get a really terrible CAPTCHA validation which I want to automate. I have a command-line utility that can do this with local and remote files, but not with the file in question because it's behind a session.
tim#g2sv ocr-thingy my-image.png
135189
Works like a charm. I'm looking for a way to pass the image from the website (running the script) to the utility. I don't care how complicated it has to be, but at the moment I'm clueless. I've been thinking about exposing the utility as some type of REST-like API for Greasemonkey to interface with, but I don't know how to supply an image to an API other than by passing the URL, which doesn't work (as stated before). Greasemonkey (luckily) doesn't allow you to download a file and run software on my PC, so the most straightforward option is out.
I'm open to all suggestions; it's a fun side project for me, so the crazier the better ;) I would prefer an option that silently runs in the background (doesn't take or require focus like mouse- and keyboard-controlling software, e.g. Java Robot).
In case you're wondering, I'm not trying to break any laws or anything; the owner of the website knows I'm doing this and was interested to see if I was able to do it!

"I don't care how complicated it has to be, but at the moment I'm clueless."
Well, it's possible, but it is an "involved" process. Here are the high-level steps:
Approach 1:
Forget about Greasemonkey; write a Firefox add-on. Add-ons can interact with the file system and probably can get the image data without having to use Flash or Canvas.
Approach 2:
Use Greasemonkey and JS to send the image data to your server (using GM_xmlhttpRequest()). This is not simple; search around for how to do that.
Your server can be your own local machine running something like XAMPP or any one of the free web-application servers.
Your server uses PHP (or Coldfusion, or C#, or Python, etc.) to run your OCR program and do whatever you want with the results, including AJAXing them back to the GM script.
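To make the server side of Approach 2 concrete, here is a minimal sketch in Python using only the standard library. It assumes the script has already obtained the image as base64 text (for example by drawing the CAPTCHA onto a canvas and calling toDataURL(), then stripping the data-URL prefix) and POSTs it with GM_xmlhttpRequest(). The JSON request format, the port and the OCR_COMMAND value are assumptions for illustration; the only thing taken from the question is that the tool accepts an image path and prints the result.

```python
import base64
import json
import os
import subprocess
import tempfile
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: the OCR tool takes an image path as its last argument and
# prints the recognised text on stdout, as in the question's example.
OCR_COMMAND = ["ocr-thingy"]


def run_ocr(image_bytes, command=None):
    """Write the image to a temporary file, run the OCR command on it, return its output."""
    command = list(command or OCR_COMMAND)
    fd, path = tempfile.mkstemp(suffix=".png")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(image_bytes)
        result = subprocess.run(
            command + [path], capture_output=True, text=True, timeout=30, check=True
        )
        return result.stdout.strip()
    finally:
        os.remove(path)


class OcrHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Hypothetical request format: JSON body {"image": "<base64 PNG data>"}.
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
            text = run_ocr(base64.b64decode(payload["image"]))
            status, body = 200, {"text": text}
        except Exception as error:  # report any failure back to the script
            status, body = 500, {"error": str(error)}
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


if __name__ == "__main__":
    # Listen on localhost only, so nothing outside this machine can reach the tool.
    HTTPServer(("127.0.0.1", 8000), OcrHandler).serve_forever()
```

The userscript would then read the "text" field of the JSON response and fill it into the CAPTCHA input. Everything runs in the background, which matches the "no focus stealing" wish in the question.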

Related

How would a site, like w3schools, run many languages locally?

I'm working on a portfolio website, and I thought it would be cool to be able to run my scripts in python, java, js, rust, etc locally in the browser
Here's an example of this from w3schools
I'm guessing they have a translation layer to javascript, but in the event that they're actually running C++ here, does anyone know how this might be done or generalized?
They do not translate those languages to JavaScript and then run it on the client - that would be a herculean task to get right, especially for multiple languages. Rather, they (and the many similar sites like try it online) take the source code text from the client's browser, and run it on the server, and respond to the client with the result from the server.
To achieve something like this, you'd need
A backend (in any language)
Compilers/interpreters/etc for all languages you want to support
A way for the backend to programmatically call them with the user's input (which is often pretty easy)
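The three ingredients above can be sketched in a few lines of Python. This is only an illustration of the idea, not how w3schools actually does it: the RUNNERS table and the run_snippet name are made up for the example, and only one language is registered. Note that it offers no isolation at all; a public site would have to run submissions inside a sandbox or container.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical table: language name -> (file suffix, command prefix).
# A real site would list one entry per supported compiler or interpreter.
RUNNERS = {
    "python": (".py", [sys.executable]),
}


def run_snippet(language, source, timeout=5):
    """Run user-submitted source on the server and return (exit_code, stdout, stderr)."""
    suffix, command = RUNNERS[language]
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "main" + suffix)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(source)
        try:
            result = subprocess.run(
                command + [path], capture_output=True, text=True,
                timeout=timeout, cwd=workdir,
            )
        except subprocess.TimeoutExpired:
            # Runaway code is killed once the time limit expires.
            return None, "", "timed out"
        return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    print(run_snippet("python", "print(6 * 7)"))
```

The backend would call something like run_snippet() from its request handler and send the captured output back to the browser.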
For proof that w3schools takes the result from a server, rather than running it locally, examine the network tab of your browser, and you'll see it:

Is it possible to open .exe files inside a website that is displayed on the webpage itself instead of just opening it on the computer

Is it possible to add a load of programs to a website that can be opened within the webpage itself? For example, say the animation software Blender. Would it be possible to add all the Blender files to the website files and execute them within the webpage, creating the GUI inside a window-sized set of parameters?
Update based on input: Amazon recently released a service named AppStream 2.0 which does exactly what you asked for. It is a proprietary platform, though, and has its own learning curve just to set up. It works well with SolidWorks, so it should work with most other applications too (SolidWorks is providing a demo of its software via this service as of Nov18). It also uses a technique that doesn't require a live video stream and instead streams just the changes in image sprites, which is pretty interesting in itself.
Another alternative is Cameyo, which is also a paid service.
As for open-source alternatives, I don't yet know of any.
-----------------------------------------------------------------------------------------
The closest you will ever get to this is Flash, and even that is NOT an exe.
Having a GUI for a piece of software is a different matter. It depends on whether that software exposes an API, perhaps a service listening on a port. A client-side script could then make a GET request to that service on localhost, maybe.
Anyway, wanting to run an executable on the client side defeats the purpose of a website. Generally it is the other way round: say you want to provide a service via a website to compress a picture. You can pipe the submitted data to the relevant software on the server and then return the result as the response.

Add enhancements to a website (whether it be by C#, Chrome Extensions, etc.) -- Not sure what would work?

There is a website that I visit often... let's call it www.example.com. And, I am able to interact with parts of this website. The interactions send XMLHttpRequest and get a response back through Javascript, jQuery I believe.
I'm not sure what technology will let me achieve what I want to do, and where to start. Basically, I want to add additional options/shortcuts that the site does not provide. I thought about maybe using a macro, but trying to use macro recording software is just a pain in the butt.
I inspected the XMLHttpRequests being sent back and forth (using Google Chrome's Developer Tools) and noticed that they are simple JSON messages. I figured the best way to add enhancements to the site without waiting for the actual owners of the site to do so would be to simulate the website sending/receiving these XMLHttpRequests/responses and to make additional adjustments to the DOM to provide extra shortcuts.
I don't want to interfere with the original site's functionality, though; i.e. if I send a request and receive a response, I want both the original script and my script to process the response. So here is where I'm stuck: I'm not sure whether to go down the path of creating a C# application or a Google Chrome extension (I use Google Chrome) or something else altogether. Any pointers on what dev tools/languages will give me the ability to do what I want would be great. Thanks!
Chrome has built-in support for user scripts. You can use these to modify the page as you see fit and also to make requests. Without more details about what exactly you want to do with these AJAX requests, it's hard to advise further.
I'm not 100% sure what your question is, but as I understand it, you want to be able to make changes to a certain website. If these changes can be done with JS, I would recommend Greasemonkey for Firefox. It basically lets you run a custom script when you are visiting a certain webpage/domain. You can be as specific as you want about which pages use the script. Once your script loads jQuery, it is really easy to add any functionality.
https://addons.mozilla.org/en-US/firefox/addon/greasemonkey/
You can find pre-written scripts for tons of sites here:
http://userscripts.org/

Webserver virtual network

It's quite hard for me to figure out if this sort of thing has ever been implemented. I want to look for any libraries that may exist so I don't go about reinventing the wheel.
I have this idea of having a web app that connects the people who are on the site. Every user that is connected to the site may communicate to another user also on the site via the server. So the protocols will be implemented in JavaScript, and the server simply helps to identify users, and just echoes data to enable the communication. For instance I can use this to implement my game networking ideas in javascript, and easily test them without having my testers download any executables, they can just log onto the site.
Now obviously this isn't going to be an effective architecture for any kind of serious application. But I think if I can get it working I could build really cool networking apps without having any sort of download.
What I'm thinking about is using ajax for client->webserver and webserver->client (Comet?) and I can code up the webserver echo bit with PHP or a cgi script. And then I can implement an entirely separate protocol in JS that the webserver does not care or know about.
The reason for having the webserver echo everything is because I don't want to use java or anything else that I can open up sockets in. Why make it harder for me? Because I can and because I happen to be really enamored with javascript at the moment. It's the only web technology I trust. Screw java applets.
Does this make any sense to anyone? Am I crazy?
Don't know about the crazy part (there's a proposal at area51, go check that) but it's definitely doable.
You could use a plain old XMPP server and a javascript XMPP client (there are libraries - for example strophe)
You could do it with AJAX and a PHP backend: Making an AJAX Web Chat
You could use the fancy Websockets from HTML5: Start Using HTML5 WebSockets
You could use some existing component if you can find any (I couldn't find any I would use)
Cheers :)
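For the "server just echoes data" idea from the question, a bare-bones relay could look like the sketch below. It is written in Python rather than the PHP/CGI the question mentions, and it uses plain short polling instead of Comet or WebSockets. The Relay class, the JSON message format, the /poll URL and the port are all assumptions made up for this example.

```python
import json
from collections import defaultdict, deque
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


class Relay:
    """Holds one queue of pending messages per user; it never inspects payloads."""

    def __init__(self):
        self.inboxes = defaultdict(deque)

    def send(self, sender, recipient, payload):
        self.inboxes[recipient].append({"from": sender, "payload": payload})

    def poll(self, user):
        """Return and clear everything waiting for this user."""
        inbox = self.inboxes[user]
        messages = list(inbox)
        inbox.clear()
        return messages


relay = Relay()


class RelayHandler(BaseHTTPRequestHandler):
    def _reply(self, body):
        data = json.dumps(body).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_POST(self):
        # Hypothetical format: JSON body {"from": ..., "to": ..., "payload": ...}.
        length = int(self.headers.get("Content-Length", 0))
        message = json.loads(self.rfile.read(length))
        relay.send(message["from"], message["to"], message["payload"])
        self._reply({"ok": True})

    def do_GET(self):
        # Hypothetical format: GET /poll?user=alice
        query = parse_qs(urlparse(self.path).query)
        self._reply(relay.poll(query.get("user", [""])[0]))


if __name__ == "__main__":
    # HTTPServer handles one request at a time, so the queues need no locking here.
    HTTPServer(("127.0.0.1", 8001), RelayHandler).serve_forever()
```

Because the server treats each payload as opaque data, whatever protocol the JavaScript clients speak to each other lives entirely in the payloads, as the question intended.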

Writing a non-GUI bot using Mozilla Framework

I'm looking for a way to write a non-GUI bot using Mozilla Framework. The bot should be able to work like normal browser (automatically download relevant JS files, make XMLHTTPRequests, run JS operations, modify DOM), except no GUI will be needed.
I wonder if it is possible to build XULRunner without X or GTK/KDE (without any GUI dependencies), as I will run the bot on a FreeBSD 6.4 server.
It may sound a bit weird, but I need a bot that can operate like a browser (run JS, modify the DOM, submit forms) in a non-GUI environment.
I've looked into other options such as Lynx, Links, Hulahop, the Chrome V8 engine and WebKit JavaScriptCore, but have yet to find the desired result.
It's part of a school project, a thesis. We will use it to observe price changes of budget airlines and, after a year-long data collection, we need to deduce pricing strategy and customer behavior. It is a serious Final Year Project.
Any hint or help is greatly appreciated! Thank you in advance!
Regards.
You should be able to make progress with selenium. It's a record/test/play tool but its core is manipulating the DOM.
Update from Grundlefleck's comment: As for launching the actual tests there is selenium remote-control, which allows you to write your tests in Java, Ruby, plain HTML and other possible drivers.
Yes, it is possible (but it might very well require LOTS of code changes).
No, I do not know any of the details.
I would not recommend this approach for your purposes. From your comment, it sounds like you are trying to scrape webpages. If you really need to use JavaScript, you can use a stand-alone JavaScript-engine (Mozilla's is available here). Otherwise, I would use Beautiful Soup with Python or Twill. You might also want to read this question.
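For the case where the prices are present in the served HTML and no JavaScript is needed, the scraping step is small. The sketch below deliberately uses Python's built-in html.parser instead of Beautiful Soup so that it runs without extra packages; with Beautiful Soup the same extraction would be shorter. The class name "price" and the sample markup in the usage note are hypothetical, since real airline pages will differ.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class PriceParser(HTMLParser):
    """Collects the text of every element whose class list contains "price"."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # > 0 while inside a matching element
        self.prices = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            self.depth += 1          # nested tag inside a price element
        elif "price" in classes:
            self.depth = 1
            self.prices.append("")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.prices[-1] += data


def extract_prices(html):
    """Return the stripped text of each price element found in the HTML string."""
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    return [price.strip() for price in parser.prices]
```

Calling extract_prices('<span class="price">99.00</span>') returns a list with the single string "99.00". Fetching the page on a schedule and storing each result with a timestamp would give the year-long price series the project needs.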
