Get labels for images using Google Custom Search API - javascript

If I drag an image to google and search, I get results of other images that are similar.
I'd like javascript or C++ or C# code to send an arbitrary image and get back the top 10 similar images along with the labels (if any) that are associated with each.
I'm even OK with pre-uploading "my" image so it is accessible via URL. So I'd like to do:
customsearch.google.com?myurl=123.com/snap.jpg&limit=10
And I'd like to get back:
results:[
{"a.com/2003nissan.jpg":["nissan","red","2003","car"]},
{"cd.com/maxima_2001.jpg":["Maxima","2001","dealer","used"]},
{"b.com/fordf150.jpg":["f-150","truck","2007","driver"]}
]
That is of course a simplified example. I'm fine with wading through the API and getting the results in any format, but what I need to know is:
Is this possible?
If so, what basic steps do I take (e.g., get an API key, load X.js, ...)?
Any example or tutorial would be greatly appreciated.
Finally, if Bing, or anything else, can do the same, I'm not prejudiced.

While not a webservice (yet), OverFeat is a fully trained deep neural network, available for free as a command-line app. You just pass it an image, and it tells you the top 5 labels it thinks are related. It was trained on a database of, I think, 1 million images, and it chooses from 1000 different labels.
It even has an example of running it with a webcam. I'll be trying it out soon, but I do believe this will give pretty good results.
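If a web service is preferred, one hypothetical route is Google's Cloud Vision REST API, which does return labels for an image (the similar-images half of the question is a separate problem). A minimal sketch in JavaScript, where the API key and image URL are placeholders:

```javascript
// Sketch only: label detection via the Cloud Vision "images:annotate" endpoint.
// Builds the request body for LABEL_DETECTION on a publicly reachable image URL.
function buildLabelRequest(imageUrl, maxResults) {
  return {
    requests: [{
      image: { source: { imageUri: imageUrl } },
      features: [{ type: 'LABEL_DETECTION', maxResults: maxResults }]
    }]
  };
}

// Sends the request and returns the label strings (requires Node 18+ for fetch).
async function getLabels(imageUrl, apiKey) {
  const res = await fetch(
    'https://vision.googleapis.com/v1/images:annotate?key=' + apiKey,
    {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(buildLabelRequest(imageUrl, 10))
    }
  );
  const json = await res.json();
  // Each annotation carries a "description" (the label) and a confidence "score"
  return json.responses[0].labelAnnotations.map(a => a.description);
}
```

Usage would be along the lines of `getLabels('http://123.com/snap.jpg', API_KEY)`.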

Related

How to load HTML document including js.download elements with Python

I am trying to collect data from a web page displaying search results about cars on sale.
The structure of the online document is not too complex and I was able to single-out the interesting bits of information based on a certain attribute data-testid which every returned car record possesses.
I can find different interesting bits of information like price, immatriculation year, mileage and so on, based on substring characteristics of this attribute.
I use beautifulsoup to parse the HTML and requests to initially load the HTML document from the web.
Now, here's the issue. In a way that I cannot predict or find a logic to, the HTML returned by requests.get() is somehow incomplete. On a page of 100 results, which I can see when I inspect the page online (and there I can count 100 data-testid fields with the specific substring for price, 100 for mileage, and so on), the HTML returned by requests.get() (just like the one I obtain with a 'save-as' operation on the page itself) only contains a portion of these fields.
Also, their number is kind of unpredictable.
I started by simply asking why there is this discrepancy between the online and the saved HTML.
So far no full answer, but the hint in the comments was that the page loads dynamically through JavaScript.
I was happy to find that saving the page to disk, with all the files, somehow produced the full HTML I could then parse without further issues.
However, my joy only lasted for that specific search. When I needed to try a new one, I was suddenly back to square one.
With further investigation, I came to my current understanding, which is at the origin of the question: I noticed that, when I save the online page as 'Webpage, Complete' (which creates an .html file plus a folder), this combo surely contains ALL records. I can say that because if I go offline and double-click on the newly saved html, I can see all records which were online (100 in this case).
However, the HTML file itself only contains a few of them!!!
My deduction is, therefore, that the rest of the records must be 'hidden' in the folder created at saving time, and I would tend to say they could be embedded in those (many) *.js.download files.
My questions are:
is my assumption correct, i.e. are the other records stored in those files?
if yes, how can I make them 'explicit' when parsing the HTML with beautifulsoup?
UPDATE 07/05
I've tried to install and use requests_html as suggested in the comments and in this answer.
Its render() method looked promising; however, I'm probably not really understanding the mechanisms explained in the requests_html documentation here (the JS rendering portion), because even after the following operations (pseudo-code):
from requests_html import HTMLSession
session = HTMLSession()
r = session.get(URL)  # URL is the search-results page
r.html.render()  # downloads Chromium on first use, then executes the page's JavaScript
At this point, I was hoping to have 'forced' the site to 'spit out' ALL HTML, including those bits which remain somehow hidden and only show up in the live page.
However, a subsequent dump of r.html.html into a file still gives back the same old 5 records (why 5 now, when for other searches it returned 12, or even 60, is a complete mystery to me).
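For comparison, the same render-then-scrape idea can be sketched in Node with Puppeteer, which drives a headless Chrome so the page's JavaScript actually runs before the DOM is read. The selector and the data-testid substrings are assumptions based on the question:

```javascript
// Pure helper mirroring the substring matching described above:
// keep only the data-testid values containing a given fragment (e.g. 'price').
function groupByTestId(testIds, substring) {
  return testIds.filter(id => id.includes(substring));
}

// Render the page in headless Chrome and collect every data-testid value.
async function scrapeRenderedPage(url) {
  const puppeteer = require('puppeteer'); // npm install puppeteer
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // networkidle0 waits until the page's XHR/fetch traffic settles,
  // so dynamically loaded records are in the DOM before we read it
  await page.goto(url, { waitUntil: 'networkidle0' });
  const testIds = await page.$$eval('[data-testid]',
    els => els.map(el => el.getAttribute('data-testid')));
  await browser.close();
  return testIds;
}
```

With this, `groupByTestId(await scrapeRenderedPage(URL), 'price').length` should report all 100 price fields rather than the handful the raw HTML contains.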

Is there an npm package, or web API, for reading specific parts of an image?

I'm adding a new function to my Node Express server that will allow me to upload a driver's ELD daily log and get from that image / PDF the time driven, start time, end time, lunch, etc.
I've looked into converting the PDF into CSV / JSON / HTML, but the issue there is that the result is an unlabeled mess. So I am figuring I should try to somehow read the image and recreate a chart similar to the one already on the ELD log.
i.e., reading it would be segmented by, say, 15 minutes, or however many pixels.
IF line exists in segment call proceed and log data ELSE check segments "SB" "D" "ON" then recursively call
In the example shown above, this driver went on duty at 6:45am.
The files are provided in PDF format, and I am having issues extracting the data and having it be useful / labeled.
UPDATE: Thinking about it a bit more, this solution might be pretty resource-costly, especially if done on the server end, i.e. chopping up the image / leaving it in a buffer and reading off it... Maybe it would be better to just try to make sense of the garbage parsing from PDF to something else...
UPDATE 2: I may try to use Tesseract OCR, depending on how it outputs data.
Using it on a page like this:
I think the term you're looking for is OCR (optical character recognition). That's the name of the technology for converting text in images into actual text you can work with. Once you have that, decoding the text should be easy if it's in a standard format. There are plenty of OCR libraries for Node: https://www.npmjs.com/search?q=OCR. No need to reinvent the wheel and try to build your own OCR system :)
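As a sketch of that route, tesseract.js is one such Node OCR library. The `parseDutyTime` helper below is hypothetical: it assumes the OCR'd text contains times written like "6:45am", which matches the example in the question but not necessarily the real log layout:

```javascript
// Hypothetical helper: pull the first "h:mm am/pm" time out of OCR'd text
// and convert it to minutes since midnight (e.g. "6:45am" -> 405).
function parseDutyTime(text) {
  const m = text.match(/(\d{1,2}):(\d{2})\s*(am|pm)/i);
  if (!m) return null;
  let minutes = (parseInt(m[1], 10) % 12) * 60 + parseInt(m[2], 10);
  if (m[3].toLowerCase() === 'pm') minutes += 12 * 60;
  return minutes;
}

// Run OCR on the log image and extract the duty time from the result.
async function readLog(imagePath) {
  const Tesseract = require('tesseract.js'); // npm install tesseract.js
  // recognize() resolves to an object whose data.text holds the raw OCR output
  const { data: { text } } = await Tesseract.recognize(imagePath, 'eng');
  return parseDutyTime(text);
}
```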

Can I adjust or use a given (JavaScript) chart to create a different/new chart?

I am wondering, and sadly I don't know where else to ask the question.
I want to make an interactive chart using the top 5 downloaded movies and the current box office top 5. How I'm going to make it interactive is beyond me yet.
What I would like to know first is, if there is any way to manipulate or change the given chart from Mojo.
Right now I use:
http://boxofficemojo.com/about/data.htm
With the code given by Mojo:
<script type="text/javascript" language="javascript" src="http://www.boxofficemojo.com/data/js/wknd5.php?h=myclass1&r=myclass2"></script>
This just shows me the top 5 (see example at the previously provided URL).
Is there any way that I can use that chart and make a pie chart, or any other chart or graph, out of it?
And if this is possible will it still update with money and new movies every time the site does, like the given chart with the javascript code does right now?
Hopefully someone can help me or maybe has a different way of making a chart/graph out of this data (the box office top 5).
So, to be more clear: I would like to create my own chart/graph with the box office top 5 data. And I would also like it to be "live" and update itself whenever the top 5 changes or the numbers change.
Therefore it doesn't seem like the best idea to create my own JSON with the data, since it won't update when the data updates without me editing the JSON file.
The interactive part doesn't matter yet.
You want to create a chart with live data.
First thing is getting the data.
The script you are including basically creates the chart for you, so you don't have direct access to the data itself, and it would be a little tricky to use that.
You should google around for an API, preferably a REST API, that provides you with data on demand.
1. Find a suitable REST API and perform calls to fetch data
You perform HTTP requests to the API and it provides data - so on page-load for example you will request data from this API. So each time a user loads your website a call is made and it fetches the most recent data on movies.
2. Parse the data, perform calculations and build your own custom JSON
Then you need to parse that data: take from the raw feed the data you actually need, perform calculations if needed (from what I take from your question, it's gonna be simple elementary-school maths), then build your own custom JSON data structure that can be used to display the data.
3. Visualize the JSON
Include a JS library that renders charts. Chart.js is an excellent example, and it's dead simple to give it a JSON and render a really nice-looking chart.
That's how it usually goes.
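The three steps above can be sketched together. `toChartData` is the "build your own JSON" step; the movie API URL, the `title`/`gross` field names, and the `boxOffice` canvas id are all assumptions, since no real box office API is named here:

```javascript
// Step 2: turn a hypothetical top-5 array from some movie API
// into the { labels, datasets } structure Chart.js expects.
function toChartData(movies) {
  return {
    labels: movies.map(m => m.title),
    datasets: [{ data: movies.map(m => m.gross) }]
  };
}

// Steps 1 and 3: fetch fresh data on every page load, then render it.
// Because the fetch happens at load time, the chart stays "live" without
// you ever editing a JSON file by hand.
async function renderBoxOfficeChart(apiUrl) {
  const movies = await (await fetch(apiUrl)).json();        // step 1
  const ctx = document.getElementById('boxOffice').getContext('2d');
  new Chart(ctx, { type: 'pie', data: toChartData(movies) }); // step 3
}
```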

Run Database Stored RegEx against DOM

I have a question about how to approach a certain scenario before I get halfway through it and figure out it was not the best option.
I work for a large company that has a team that creates tools for teammates to use that aren't official enterprise tools. We have no access to the database directly, just access to an internal server to store our files on, and the ability to access the main site with JavaScript etc. (same domain).
What I am working on is a tool that has a ton of options in it that allow you to select what I will call "data points" on a page.
These are things like account status, balance, name, phone number, email, etc., and the tool saves those to an Excel sheet.
So you input account numbers, choose what you need, and then, using IE objects, it navigates to the page and scrapes the data you request.
My question is as follows..
I want to make the scraping part pretty dynamic in the way it works; I want to be able to add new data points on the fly.
My goal, or idea, is to store the regular expression needed to get a specific piece of data in the table with the "data point" option.
If I choose "Name", it knows the expression for name in the database to run against the DOM.
What would be the best way to go about creating that type of function in JavaScript / jQuery?
I need to pass a Regex to a function, have it run against the DOM and then return the result.
I have a feeling that there will be things that require more than 1 step to get the information etc.
I am just trying to think of the best way to approach it without having to hardcode 200+ expressions into the file, as the page may get updated and they would need to be changed.
Any ideas?
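A minimal sketch of the function being described, assuming each data point's regex is stored in the database as a plain string (the patterns below are made up for illustration):

```javascript
// Rebuild a RegExp from its database string form and run it against markup.
// Convention assumed here: capture group 1 holds the value we want.
function runDataPoint(pattern, flags, html) {
  const re = new RegExp(pattern, flags);
  const match = html.match(re);
  return match ? match[1] : null;
}

// In the real tool the input would be the live page, e.g.:
//   runDataPoint(dbRow.pattern, dbRow.flags, document.body.innerHTML)
// so adding a new data point means adding a database row, not editing code.
```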
IRobotSoft scraper may be the tool you are looking for. Check this forum and see if questions are similar to what you are doing: http://irobotsoft.org/bb/YaBB.pl?board=newcomer. It is free.
What it uses is not regular expressions but a language called HTQL, which may be more suitable for extracting data from web pages. It also supports regular expressions, but not as the main language.
It organizes all your actions well with a visual interface, so you can dynamically compose actions or tasks for changing needs.

How to parse an XML file in javascript? Large file, maybe use SAX?

G'day All,
I am pulling my hair out, getting headaches and my eyes hurt. I have been hither and thither and I seem to get whither.
This will be my first experience with XML and I would really like to get this working. It is a large file, well, large in my eyes: roughly 5 MB. I cannot imagine this file being loaded into memory to process; users will get a bit peeved with this.
Basically we are using a 3rd party's site to do our ecommerce, so we have no access to the database other than via the admin area.
What we want to do is make sure that there are no stuff-ups when it comes to addresses. Therefore we got this XML file put together listing all postcodes with areas and states:
<?xml version="1.0"?>
<POSTCODES>
  <PostCode id="2035">
    <Area>2035 1</Area>
    <Area>2035 2</Area>
    <Area>2035 3</Area>
    <State>NSW</State>
  </PostCode>
  <PostCode id="2038">
    <Area>2038 1</Area>
    <Area>2038 2</Area>
    <Area>2038 3</Area>
    <State>NSW</State>
  </PostCode>
  <PostCode id="2111">
    <Area>2111 1</Area>
    <Area>2111 2</Area>
    <Area>2111 3</Area>
    <State>NSW</State>
  </PostCode>
</POSTCODES>
Someone suggested SAX, but the conversation suddenly died when I asked how. The web is not helping, unless I am not looking properly. I see a lot of examples, but either they do not show how to read the file (they read from a textarea instead), or the example is in Java.
What do we want? The user enters a postcode of 2038. We want to go to the JavaScript with that data and have returned to us all the suburbs that fall within that postcode.
Anyone out there who can please tell me what to download and how to use it to get what I need?
Please, please, please. It is hard to see a grown man begging and crying but I am.
Sounds like you want a script on the server which will suggest suburbs based on the user's postcode selection? You could use jQuery's ajax functionality to do this.
You might also be able to use jQuery UI's autocomplete control to parse XML and make suggestions: http://jqueryui.com/demos/autocomplete/#xml
It's also possible to do this entirely in JavaScript without any script on the server side, but it would be pretty slow to load if the XML file is 5 MB. You might be able to get a significant reduction in file size, though, by gzipping it before transmission from the server.
If you need to parse this in JavaScript, you can use jQuery.
http://www.switchonthecode.com/tutorials/xml-parsing-with-jquery
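As a sketch of the lookup itself: build a postcode-to-suburbs index once from the XML string, then answer each user entry with a plain object lookup. The hand-rolled extraction below is only for illustration; for a real 5 MB file you'd want a proper parser (DOMParser in the browser, or a SAX-style streaming parser) rather than regexes:

```javascript
// Build an index { postcode: [areas...] } from the XML string in one pass.
// Matching is case-insensitive because some closing tags in the original
// file appear as </Postcode> rather than </PostCode>.
function buildIndex(xml) {
  const index = {};
  const blockRe = /<PostCode id="(\d+)">([\s\S]*?)<\/PostCode>/gi;
  let m;
  while ((m = blockRe.exec(xml)) !== null) {
    const areas = [];
    const areaRe = /<Area>([^<]*)<\/Area>/g;
    let a;
    // m[2] is everything between the PostCode tags; collect its Area values
    while ((a = areaRe.exec(m[2])) !== null) areas.push(a[1]);
    index[m[1]] = areas;
  }
  return index;
}

// Usage: buildIndex(xmlString)['2038'] returns the suburbs for postcode 2038.
```

Building the index once and reusing it keeps each lookup instant, however many postcodes the user enters.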
