Is it possible to use the GData JavaScript API, or any other JavaScript API, to retrieve the list of blog posts based on labels?
My usage case:
Each blog post has a label that marks its category. Some posts are labelled with 'Summary' plus the category they belong to.
I want to be able to display the summary of MyCategory(Label) on the label's page. e.g. http://myblog.blogspot.com/search/label/MyCategory
Is it possible to retrieve the list of blog posts matching 'Summary' and 'MyCategory'?
UPDATE:
more details:
it is a blog I have edit access to
the js can be placed on google sites or inside the blog html
the blog has 18k+ posts, so listing all posts and filtering is not an option.
myblog.blogspot.com refers to any Blogger blog, not an actual one; I was just illustrating Blogger's label-based filter.
I've read and re-read this question and the blogspot link a couple of times. It's difficult to understand.
I think it would help if you gave some more information:
where do you want to place this JavaScript? I mean: is it going to be placed on the same blog? I'm asking because this determines cross-site security requirements.
I have a strong feeling this is actually a question where you want to make a cross-domain request (load data from a different domain/server (blogspot.com) that you do not control), otherwise you'd be playing with 'Access-Control-Allow-Origin' on the server side.
Will this script be located in an online or a local (x)HTML source?
Could you please provide a more elaborate example (or sample) of an existing list that contains these labels, or do you want to crawl a blog like a spider/index robot?
If the above assumptions are correct, the first part of your problem is retrieving cross-domain data (which is hard nowadays using simple solutions like XMLHttpRequest aka AJAX).
You could then look at a server-side script of your own (PHP, for example) to fetch this data and send it (pre-parsed) to your browser application; effectively this is simply a proxy located on your own domain.
I have also heard of using a Java object (or Silverlight, or Flash, which nowadays also suffers from cross-domain security restrictions) to get around this modern-day cross-domain security.
You could then embed one or more of these objects (that retrieve the source) and communicate with them through JavaScript. A variation of this technique is also often used for cross-browser multiple file uploads.
There is a big chance there is already a solution (object) to this part of your problem here on StackOverflow.
If you fix this first part of the problem, the second part simply comes down to parsing (with a regex, for example) your retrieved 'label' data, then building new links from it to retrieve the 'summary' content you were after, using the same data-retrieval technique that was used to get the labels list in the first place.
Is this what you are after?
UPDATE:
In pure javascript/json there is an excellent topic here on SO.
Should you go with java, you could look at this.
In PHP you can use file_get_contents() or file_get_html(). See also this topic on SO.
UPDATE2: The accepted answer (from the comments below):
In Google's Blogger developer docs (API 2.0) you can find: RetrievingWithQuery.
Quote:
/category
Specifies categories (also known as labels) to filter the feed results. For example, blogger.com/feeds/blogID/posts/default/-/Fritz/Laurie returns entries with both the labels Fritz and Laurie.
You can also find a working piece of JavaScript that uses this technique over here: list-recent-posts-by-label
Now you can simply continue 'AJAX'ing your summaries out of this filtered list.
Good luck!
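To make the quoted feed path concrete, here is a minimal sketch of building that label-filtered feed URL; the blog address is a placeholder, and the JSONP usage in the comment assumes jQuery is available:

```javascript
// Build a Blogger feed URL that filters posts by one or more labels.
// The "/-/Label1/Label2" path segment requires posts to carry ALL listed
// labels, per the Blogger 2.0 "RetrievingWithQuery" docs quoted above.
function labelFeedUrl(blogUrl, labels, maxResults) {
  var path = labels.map(encodeURIComponent).join('/');
  return blogUrl + '/feeds/posts/default/-/' + path +
         '?alt=json-in-script&max-results=' + (maxResults || 25);
}

// Example (hypothetical blog address):
var url = labelFeedUrl('http://myblog.blogspot.com', ['Summary', 'MyCategory'], 10);
// → http://myblog.blogspot.com/feeds/posts/default/-/Summary/MyCategory?alt=json-in-script&max-results=10

// In the browser you would load this as JSONP, e.g. with a <script> tag or
// jQuery: $.ajax({ url: url + '&callback=?', dataType: 'jsonp', success: handle });
```

Because `alt=json-in-script` is JSONP, this sidesteps the cross-domain problem discussed above without needing a server-side proxy.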
Related
http://www.color-hex.com/color-palette/35967
Using JavaScript/jQuery I want to get the colors from the above color-palette website. The only API I found seemed limited.
Answers pointing to APIs or other palette-sharing sites are accepted as well; APIs are preferred.
Edit:
Found a promising API: http://www.colourlovers.com/api
Though, being a bit of a noob, I do not know exactly how I'm supposed to use it without an explicit JavaScript example :'(
Based on the structure of the page:
$("table.table tr td a").each((i,e) => console.log($(e).html()))
should output the list of five hex colors representing the palette. However, without knowing how you are using the information, obtaining the HTML from the page is still a mystery.
Have a look at examples using $.get() and $.parseHTML() in hopes of pulling the page's data and then manipulating the resultant DOM.
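If you do manage to pull the page's markup (subject to the site permitting cross-origin requests, or via a proxy on your own domain), a rough sketch of extracting the palette could look like this; the regex fallback avoids depending on the page's exact table structure:

```javascript
// Sketch: pull a palette's hex codes out of fetched page HTML.
// The jQuery selector in the answer above targets the palette table's links;
// as a structure-independent fallback, a regex over the raw markup also works.
function extractHexColors(html) {
  var matches = html.match(/#[0-9a-fA-F]{6}\b/g) || [];
  // De-duplicate while preserving order of first appearance.
  return matches.filter(function (c, i) { return matches.indexOf(c) === i; });
}

// Hypothetical usage with jQuery (assumes the request is allowed to succeed):
// $.get('http://www.color-hex.com/color-palette/35967', function (html) {
//   console.log(extractHexColors(html));
// });
```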
http://liveweave.com/xfOKga
I'm trying to figure out how to save code similar to Liveweave.
Basically, whatever you code, you click the save button and it generates a hash after the URL. When you go to this URL you can see the saved code. (I've been trying to learn this; I just keep having trouble finding the right sources. My search results end up with references completely unrelated to what I'm looking for.)
I've spent the past two days researching this and have gotten nowhere.
Can anyone can help direct me to a tutorial or article that explains this type of save event thoroughly?
To understand the functionality, it is best to try and identify everything that is happening. Dissect this feature according to the technology that would typically be used for each distinguishable component. That dissected overview will then make it easier to see how the underlying technologies work together. I suspect you may lack the experience or nomenclature to see at a glance how a site like liveweave works or how to search for the individual pieces, so I will break it down for you. It will be up to you to research the individual components that I will name. Knowing this, here are the keys you need to research:
Note that without being the actual developer of liveweave, knowing all the backend technology is not possible, but intelligent guesses will suffice. The practice is all the same. This is a cursory breakdown.
1) A marked up page, with HTML, CSS, and JavaScript. This is the user-facing part of the application, where content can be typed, and how the user interacts with the application.
2) JavaScript to asynchronously (AJAX) submit the page's form to the backend for processing.
3) A backend programming/scripting language to process the incoming form. In the case of liveweave, the form is POSTed. It is also using PHP to process the form.
4) A database table with a column for each language (liveweave has HTML, CSS, and JavaScript). This database will insert the current data from each textarea submitted in the form and processed by PHP as a new row. Each row will generate a new hash and store it alongside the data just inserted. A popular database is MySQL.
5) When the database insert is complete, the scripting language takes over again and sends its response back to the marked-up page (1). That page is waiting for a response from the backend, and JavaScript handles it. In the case of liveweave, the response is the latest hash to be used in the URL.
6) The URL magic happens with JavaScript. You want to look up JavaScript's latest History API, where methods like pushState will be used to update the URL in the browser without actually refreshing the page.
When a URL with a given hash is navigated to, the scripting language processes the request, grabs the hash, searches for the hash in the database table, finds a matching row, and populates the page's textareas with the data just found.
Throughout all this, there should be checks to avoid duplication and a multitude of exploits. This is also up to you to research.
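The front-end half of steps 2, 5, and 6 above could be sketched like this; the '/save' endpoint and the `{ hash: ... }` response shape are assumptions for illustration, not liveweave's actual API:

```javascript
// Build a liveweave-style URL where the hash sits directly after the
// domain, e.g. http://liveweave.com/xfOKga
function savedUrl(baseUrl, hash) {
  return baseUrl.replace(/\/+$/, '') + '/' + hash;
}

// In the browser (steps 2, 5, 6): POST the editor's content, receive the
// new hash, and rewrite the address bar without reloading the page.
// fetch('/save', { method: 'POST', body: new FormData(editorForm) })
//   .then(function (res) { return res.json(); })
//   .then(function (data) {
//     // History API: update the URL in place (step 6).
//     history.pushState({ hash: data.hash }, '', savedUrl(location.origin, data.hash));
//   });
```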
It should be noted that currently there are two comments for your question. Darren's link will indeed allow the URL to change, but it is a redirect, and not what you want. ksealey's answer is not wrong; that is one way of doing it, but it is not the most robust or scalable, and would not be the recommended approach for solving this.
I have a question about how to approach a certain scenario before I get halfway through it and figure out it was not the best option.
I work for a large company that has a team that creates tools for teammates to use that aren't official enterprise tools. We have no direct access to the database, just access to an internal server to store our files on, and we can access the main site with JavaScript etc. (same domain).
What I am working on is a tool that has a ton of options in it that allow you to select that I will call “data points” on a page.
There are things like “Account status, Balance, Name, Phone number, email etc” and have it save those to an excel sheet.
So you input account numbers, choose what you need, and then, using IE objects, it navigates to the page and scrapes the data you request.
My question is as follows..
I want to make the scraping part pretty dynamic in the way it works. I want to be able to add new data points on the fly.
My goal, or idea, is to store the regular expression needed to get a specific piece of data in the table with the "data point" option.
If I choose "Name", it knows the expression for name in the database to run against the DOM.
What would be the best way to go about creating that type of function in JavaScript/jQuery?
I need to pass a Regex to a function, have it run against the DOM and then return the result.
I have a feeling that there will be things that require more than 1 step to get the information etc.
I am just trying to think of the best way to approach it without having to hardcode 200+ expressions into the file as the page may get updated and need to be changed.
Any ideas?
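The lookup-table idea described above could be sketched like this; the data-point names and patterns are made up for illustration, and in practice the table would be populated from your stored expressions:

```javascript
// Sketch of the "data point -> expression" idea. This object stands in for
// rows fetched from storage; the patterns are illustrative only.
var dataPoints = {
  Name:    /Name:\s*([^<\n]+)/,
  Balance: /Balance:\s*\$?([\d,.]+)/
};

// Run one stored expression against a chunk of DOM text and return the capture.
function extractPoint(pointName, domText) {
  var re = dataPoints[pointName];
  if (!re) return null;            // unknown data point
  var m = domText.match(re);
  return m ? m[1].trim() : null;   // first capture group, or null if absent
}

// Multi-step data points could store an array of expressions instead of a
// single regex, feeding each match into the next expression in the chain.
```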
IRobotSoft scraper may be the tool you are looking for. Check this forum and see if questions are similar to what you are doing: http://irobotsoft.org/bb/YaBB.pl?board=newcomer. It is free.
What it uses is not regular expressions but a language called HTQL, which may be more suitable for extracting data from web pages. It also supports regular expressions, but not as the main language.
It organizes all your actions well with a visual interface, so you can dynamically compose actions or tasks for changing needs.
I would like to get the meaning of a selected word using the Wiktionary API.
The retrieved content should be the same as what is presented in "Word of the day": only the basic meaning, without etymology, synonyms, etc.
for example
"postiche n
Any item of false hair worn on the head or face, such as a false beard or wig."
I tried using the documentation but I can't find a similar example. Can anybody help with this problem?
Although MediaWiki has an API (api.php), it might be easiest for your purposes to just use the action=raw parameter to index.php if you just want to retrieve the source code of one revision (not wrapped in XML, JSON, etc., as opposed to the API).
For example, this is the raw word of the day page for November 14:
http://en.wiktionary.org/w/index.php?title=Wiktionary:Word_of_the_day/November_14&action=raw
What's unfortunate is that the format of wiki pages focuses on presentation (for the human reader) rather than on semantics (for the machine), so you should not be surprised that there is no "get word definition" API command. Instead, your script will have to make sense of the numerous text formatting templates that Wiktionary editors have created and used, as well as complex presentational formatting syntax, including headings, unordered lists, and others. For example, here is the source code for the page "overflow":
http://en.wiktionary.org/w/index.php?title=overflow&action=raw
There is a "generate XML parse tree" option in the API, but it doesn't break much of the presentational formatting into XML. Just see for yourself:
http://en.wiktionary.org/w/api.php?action=query&titles=overflow&prop=revisions&rvprop=content&rvgeneratexml=&format=jsonfm
In case you are wondering whether there exists a parser for MediaWiki-format pages other than MediaWiki, no, there isn't. At least not anything written in JavaScript that's currently maintained (see list of alternative parsers, and check the web sites of the two listed ones). And even then, supporting most/all of the common templates will be a big challenge. Good luck.
OK, I admit defeat.
There are some files relating to Wiktionary in Pywikipediabot, and looking at the code, it does look like you should be able to get it to parse meaning/definition fields for you.
However, the last half an hour has convinced me otherwise. The code is not well written and I wonder if it has ever worked.
So I defer to idealmachine's answer, but I thought I would post this to save anyone else from making the same mistakes. :)
As mentioned earlier, the content of Wiktionary pages is in a human-readable format, wikitext, so the MediaWiki API doesn't let you get a word's meaning, because the data is not structured.
However, each page follows a specific convention, so it's not that hard to extract the meanings from the wikitext. Also, there are some APIs, like Wordnik or Lingua Robot, that parse Wiktionary content and provide it in JSON format.
MediaWiki does have an API but it's low-level and has no support for anything specific to each wiki. For instance it has no encyclopedia support for Wikipedia and no dictionary support for Wiktionary. You can retrieve the raw wikitext markup of a page or a section using the API but you will have to parse it yourself.
The first caveat is that each Wiktionary has evolved its own format but I assume you are only interested in the English Wiktionary. One cheap trick many tools use is to get the first line which begins with the '#' character. This will usually be the text of the definition of the first sense of the first homonym.
Another caveat is that every Wiktionary uses many wiki templates so if you are looking at the raw text you will see plenty of these. The only way to reliably expand these templates is by calling the API with action=parse.
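The cheap trick mentioned above (take the first line beginning with '#') can be sketched on the raw wikitext; this is a rough heuristic, not a real wikitext parser:

```javascript
// Sketch: grab the first definition line from raw wikitext (action=raw).
// In English Wiktionary markup, definitions start with '#'; lines starting
// with '#:' or '#*' are usage examples/quotations, and '##' is a sub-sense.
function firstDefinition(wikitext) {
  var lines = wikitext.split('\n');
  for (var i = 0; i < lines.length; i++) {
    if (/^#[^#:*]/.test(lines[i])) {
      // Strip wiki link brackets for a rough plain-text result;
      // template markup ({{...}}) would still need expansion via action=parse.
      return lines[i].slice(1).replace(/\[\[|\]\]/g, '').trim();
    }
  }
  return null;
}

// Usage: fetch the raw wikitext (server-side, or through a proxy, since
// Wiktionary won't serve plain cross-domain XHR) from
//   http://en.wiktionary.org/w/index.php?title=overflow&action=raw
// and pass the response body to firstDefinition().
```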
What is YQL? Is it like jQuery? How can I use it?
Definition:
The Yahoo! Query Language is an expressive SQL-like language that lets you query, filter, and join data across Web services. With YQL, apps run faster with fewer lines of code and a smaller network footprint.
See more in the Yahoo! reference.
No, it doesn't have anything to do with jQuery. It's like SQL for web services, etc.
jQuery is used to manipulate (x)HTML, handle events, handle animations, smooth over cross-browser differences, etc.
EDIT
YQL example:
select * from flickr.photos.search where text="Cat" limit 10
Accesses the Flickr service and gets photo information.
jQuery example:
$(".search[text=Cat]").text();
Searches the current page for everything with class search that has the attribute text = Cat, and returns its text.
YQL (Yahoo! Query Language) lets you do select * from internet: it is a SQL-like query language that helps you collect/join/filter/extract data from the internet using simple SQL.
This is a very powerful tool, at least for me; you can easily do wonders with it.
select * from html where url='example.com' is a small example of a major thing.
You can try this video, which explains YQL to a new user:
http://blog.konarkmodi.com/tag/yql
Here is a nice example using YQL and jQuery to build a FAQ list:
http://tutorialzine.com/2010/08/dynamic-faq-jquery-yql-google-docs/
YQL is a simple SQL-like language, but a query expands into a REST URL that delivers XML or JSON data, so you can quickly create a mashup.
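That expansion into a REST URL can be sketched directly; the query is simply URL-encoded into the q parameter of Yahoo!'s public endpoint:

```javascript
// Sketch: a YQL statement expands into a REST call against Yahoo!'s public
// endpoint, with the statement URL-encoded in the "q" parameter.
function yqlUrl(query) {
  return 'https://query.yahooapis.com/v1/public/yql' +
         '?q=' + encodeURIComponent(query) +
         '&format=json';
}

var url = yqlUrl('select * from html where url="http://example.com"');
// The resulting URL returns JSON, so it can be consumed directly,
// e.g. with jQuery: $.getJSON(url, function (data) { console.log(data); });
```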
When it comes to YQL, always follow Christian Heilmann (http://www.twitter.com/codepo8). Here is his video: http://developer.yahoo.com/yui/theater/video.php?v=heilmann-yql
Read this Wikipedia article, which might give you a little quick start: http://en.wikipedia.org/wiki/YQL_Page_Scraping (this example uses YQL's html table).
Page Scraping can also be done using this chrome extension https://chrome.google.com/extensions/detail/bkmllkjbfbeephbldeflbnpclgfbjfmn
I am a huge fan of YQL. Here is something more I created using YQL (a pure JavaScript page; do a view-source): http://www.purplegene.com/static/twenital.html
Google and you might find some more good examples.
Here is another one from me:
http://www.purplegene.com/static/androidversions.html