Scrape dectech website using Python - javascript

I am looking for a way to scrape the data from this website: http://www.dectech.org/football/index.php preferably using Python. The difficulty that I seem to be having is that the data is not hard-coded into the HTML of the website, and appears to be wrapped in something called a mochi-kit ( http://mochi.github.com/mochikit/ ).
I've done some research and it seems that something like BeautifulSoup might be useful to me, but I think I may not be using it correctly. I've also tried using urllib to parse the website with no joy.
My ultimate goal is to have a program that monitors the dectech website and when new predictions are released, automatically picks out value bets using the Betfair API.

It looks like the data is being loaded by javascript from this url
http://www.dectech.org/cgi-bin/new_site/GetUpcomingGames.pl?divID=0
which returns
{
"games" : [
{
"apct" : 0.377838,
"dpct" : 0.263445,
"expGoalDiff" : -0.04086,
"awayID" : "6",
"homeID" : "17",
"date" : "20/10/2012",
"away" : "Chelsea",
"home" : "Tottenham",
"hpct" : 0.358717
},
{
"apct" : 0.237829,
"dpct" : 0.250146,
"expGoalDiff" : 0.594234,
"awayID" : "1",
"homeID" : "8",
"date" : "20/10/2012",
"away" : "Aston Villa",
"home" : "Fulham",
"hpct" : 0.512025
}, /* shortened for brevity */
So you're incredibly lucky: you don't need to scrape the data (which is tricky), you just need to retrieve it and parse it, like they're doing with MochiKit.
Python's simplejson module (or the json module in the standard library) would be able to parse it.

Related

Is it possible to move a particular JSON key to the top of the JSON using Javascript/Vuejs/Nodejs package?

I have a web application built using the Vuejs within that I am obtaining a JSON from the Java backend service which has a format something like this:
{
"a" : "Value-A",
"c" : "Value-C",
"b" : "Value-B",
"schema" : "2.0"
}
Before displaying the result to the user I would like to move the schema to the top so that it would be easy for the user to read and I want it to look something like this:
{
"schema" : "2.0",
"a" : "Value-A",
"c" : "Value-C",
"b" : "Value-B"
}
As we can see, only the position of schema has changed; the rest of the JSON stays as it is.
Please Note:
I am aware the JSON order does not matter but I am doing this for better readability purposes. If there is a way then it would be really useful for the reader to understand the JSON better.
I want to know if there is a direct way to do it rather than looping over the JSON, as my created JSON in the real application can be pretty large.
All I want to do is move the schema to the top of the JSON. The rest of the JSON can be as it is I do not want to make any modifications to it.
Is there a way to do this using vanilla Javascript or using some Nodejs library as I am using the Vuejs?
I would really appreciate it if there was a way to do it or is there any workaround for this.
A very simplistic approach could be to stringify a new object.
const myObject = {
"a" : "Value-A",
"c" : "Value-C",
"b" : "Value-B",
"schema" : "2.0"
};
console.log(
JSON.stringify({
schema: myObject.schema,
...myObject
}, null, 2)
);
My suggestion is almost the same as the accepted answer but without using JSON.stringify():
const myObject = {
"a" : "Value-A",
"c" : "Value-C",
"b" : "Value-B",
"schema" : "2.0"
};
const myReorderedObject = {
schema: '',
...myObject
};
console.log(myReorderedObject);
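Both answers rely on the same behaviour: string keys keep their insertion order, so declaring the key first and then spreading the original object leaves it at the top. A minimal sketch of that idea as a reusable function follows; the name moveKeyToTop is hypothetical and not from any library. One caveat: keys that look like non-negative integers (such as "1" or "42") are always listed before string keys, so this cannot move a key above those.

```javascript
// Hypothetical helper (not part of any library): returns a shallow copy
// of `obj` in which `key` is listed first.
function moveKeyToTop(obj, key) {
  if (!Object.prototype.hasOwnProperty.call(obj, key)) {
    return { ...obj }; // key is absent, so there is nothing to move
  }
  // Declaring `key` first fixes its position; the spread then adds the
  // remaining keys and re-assigns `key` without changing where it sits.
  return { [key]: obj[key], ...obj };
}

const original = { a: "Value-A", c: "Value-C", b: "Value-B", schema: "2.0" };
const reordered = moveKeyToTop(original, "schema");
console.log(JSON.stringify(reordered, null, 2));
```

The copy is shallow, so nested values are shared with the original object rather than cloned, which keeps the cost proportional to the number of top-level keys only.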

Meteor render data from embedded, nested collection

I've been trying to get data out of a nested collection without any luck, besides using the dot notation from the html nothing seems to work.
What I want, basically, is to fetch solely the data I need from a nested collection. I'm trying to build a file upload for images and audio files and then a simple way to use the files. I'm using the cfs:standard-packages and cfs:filesystem packages.
The code below shows a working example of what I don't want, i.e. fetching the whole file object and retrieving the data in the HTML. If I could somehow use the dot notation in the mongo command, that would be perfect. I could also settle for _.each, but I would prefer fetching just the data I need on each db call. As you can see, I'm passing an id for the whole file object here: Uploads.find({_id:Session.get('getpic')}); BTW, the actual file is stored in a folder on my local server.
The collection:
{
"_id" : "DXFkudDGCdvLpPALP",
"original" : {
"name" : "drink.png",
"updatedAt" : ISODate("2015-04-30T07:14:56.000Z"),
"size" : 127944,
"type" : "image/png"
},
"uploadedAt" : ISODate("2015-07-11T21:53:32.526Z"),
"copies" : {
"uploads" : {
"name" : "drink.png",
"type" : "image/png",
"size" : 127944,
"key" : "uploads-DXFkudDGCdvLpPALP-drink.png",
"updatedAt" : ISODate("2015-07-11T21:53:32.000Z"),
"createdAt" : ISODate("2015-07-11T21:53:32.000Z")
}
}
}
HTML
<template name="renderImages">
{{#each showpic}}
<img width="300" height="300" src="/projectuploads/{{copies.uploads.key}}"/>
{{/each}}
</template>
Javascript:
Template.renderImages.helpers({
showpic: function() {
return Uploads.find({_id:Session.get('getpic')});
}
});
Specify the returned fields in the find query, like so:
return Uploads.find({_id:Session.get('getpic')}, { fields: {'copies.uploads.key': 1} } );
A word on that, though: here you are querying minimongo (on the client), which lives in the browser cache, so it's basically free. Take care to publish only those fields to the client that you actually want there.

MongoDB/NodeJS query to get data from dictionary

Hi, in MongoDB I have a collection "games" like this:
{
"_id" : ObjectId("53c66f922e15c4e5ee2655af"),
"name" : "alien-kindergarden",
"title" : "Alien Kindergarden",
"description" : "Alien description",
"gameCategory_id" : "1",
"deviceOrientation_id" : "1",
"position" : "1"
}
and I have a few dictionaries (also simple collections in MongoDB) like "gameCategory" for example:
{
"_id" : 0,
"name" : "GAME_CATEGORY_NO_CATEGORY"
}
{
"_id" : 1,
"name" : "GAME_CATEGORY_POPULAR"
}
How to get data from collection "games" with fields from my dictionary like:
{
"_id" : ObjectId("53c66f922e15c4e5ee2655af"),
"name" : "alien-kindergarden",
"title" : "Alien Kindergarden",
"description" : "Alien description",
"gameCategory" : GAME_CATEGORY_NO_CATEGORY, <---------------
"deviceOrientation_id" : "1",
"position" : "1"
}
Thanks. I'm using Node server for it.
What you are essentially asking for is a "JOIN" as in SQL. Your first point of reading should be that MongoDB does not do joins. The general concept here is "embedding" where the related information is actually contained within the document.
A good reading of the Data Modelling section of the official documentation can cover various points such as this and alternate approaches. But in most cases the data you wish to reference should just be part of the original document:
{
"_id" : ObjectId("53c66f922e15c4e5ee2655af"),
"name" : "alien-kindergarden",
"title" : "Alien Kindergarden",
"description" : "Alien description",
"gameCategory" : "GAME_CATEGORY_POPULAR",
"deviceOrientation_id" : "1",
"position" : "1"
}
This is generally because the MongoDB concept of being "scalable" is that all operations only ever deal with one collection at a time and "JOINS" are not attempted.
There are options available under node.js such as Mongoose that allow you to .populate() items from another "related" collection. But the general problem here is that you cannot "query" on the "related" information. All this really implements is a "query behind the scenes" approach. So more than one query is actually executed. To find by "related" information the best approach is generally:
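// Look up the category's _id first, then query the games collection with it.
// In the question's data gameCategory_id is the string "1" while _id is the number 1, so the types must be made to match.
var catId = db.gameCategory.findOne({ "name": "GAME_CATEGORY_POPULAR" })._id;
db.games.find({ "gameCategory_id": catId })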
As nothing will let you query the "game" collection by a value held in a foreign collection.
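Since more than one query is executed either way, the merge itself can happen in application code once both result sets are in hand. The sketch below assumes the two collections have already been fetched into plain arrays (the arrays here are hand-written stand-ins using the question's data, and attachCategoryNames is a hypothetical helper, not a driver or Mongoose API). It compares ids as strings because the question stores gameCategory_id as "1" while the dictionary's _id is the number 1.

```javascript
// Stand-ins for the results of two separate queries (data from the question)
const games = [
  { name: "alien-kindergarden", title: "Alien Kindergarden", gameCategory_id: "1" }
];
const gameCategories = [
  { _id: 0, name: "GAME_CATEGORY_NO_CATEGORY" },
  { _id: 1, name: "GAME_CATEGORY_POPULAR" }
];

// Hypothetical helper: replaces gameCategory_id with the category's name.
function attachCategoryNames(gameDocs, categoryDocs) {
  // Index the dictionary once so each game is resolved with a single lookup
  const nameById = new Map(categoryDocs.map((c) => [String(c._id), c.name]));
  return gameDocs.map(({ gameCategory_id, ...rest }) => {
    const id = String(gameCategory_id);
    return {
      ...rest,
      // null marks an id that has no entry in the dictionary
      gameCategory: nameById.has(id) ? nameById.get(id) : null
    };
  });
}

console.log(attachCategoryNames(games, gameCategories));
```

Building the Map first keeps the merge linear in the number of documents instead of scanning the dictionary once per game.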
The idea of "embedding" and generally "duplicating" data might seem alien to those used to relational database concepts. But really your reason for applying a solution such as MongoDB should be that you realize certain "relational patterns" are not the "best fit" for your application.
If you have not looked at this in that way, then perhaps you should stick with the relational database approach and use those tools. At least until you "find" the actual shortcomings and realize why you need to "design" around that.
Unlearn what you have learned.

AngularJS $scope data across different pages

I'm getting to grips with AngularJS and I am working on an application just now where I have questions and answers.
The questions use an incrementing + and - button per item which updates the $scope
What I am wondering, though, since I will have to access the values from the question side of the app in the answers: what would be the simplest way to get this across? I had thought of storing $scope.questions in localStorage.
{
"uid" : 1,
"name": "Quiz",
"drinks" : [
{
"id" : 0,
"type" : "Footballs",
"image" : "http://placehold.it/280x300",
"amount" : ""
},
{
"id" : 1,
"type" : "Golf Balls",
"image" : "http://placehold.it/280x300",
"amount" : ""
}
]
}
The above is json which is fed into my page and then using ng-repeat it will display and then the amount keys get updated when the user has clicked either + or -.
I would like to somehow update this JSON and make it accessible throughout the site too, so that when the user/client has updated it they can view a separate page which shows them the answers.
I used localStorage to carry the data from the $scope around:
localStorage.setItem('data', angular.toJson($scope.data));
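The line above covers only the write; the answers page also has to read the value back and cope with it being missing or corrupted. A minimal sketch of both halves follows, under some assumptions: saveData, loadData and createMemoryStorage are hypothetical names, the storage object is passed in so the code also runs outside a browser, and plain JSON.stringify is used in place of angular.toJson (which additionally strips properties whose names start with $$).

```javascript
// Hypothetical helpers; `storage` is any object exposing the Web Storage
// getItem/setItem methods, e.g. window.localStorage in a browser.
function saveData(storage, key, data) {
  storage.setItem(key, JSON.stringify(data));
}

function loadData(storage, key, fallback) {
  const raw = storage.getItem(key);
  if (raw === null) return fallback; // nothing has been stored yet
  try {
    return JSON.parse(raw);
  } catch (err) {
    return fallback; // stored text was not valid JSON
  }
}

// Minimal in-memory stand-in so the sketch can run without a browser
function createMemoryStorage() {
  const items = new Map();
  return {
    getItem: (key) => (items.has(key) ? items.get(key) : null),
    setItem: (key, value) => { items.set(key, String(value)); }
  };
}

const storage = createMemoryStorage();
saveData(storage, "data", { uid: 1, name: "Quiz", drinks: [{ id: 0, amount: "3" }] });
console.log(loadData(storage, "data", null).drinks[0].amount);
```

In the question's app, the questions controller would call the save step whenever an amount changes and the answers controller would call the load step when it initialises, passing localStorage as the storage argument.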

Backbone relational lazy loading

I'm using Backbone with my RESTful JSON API in order to create an app that works with posts and their comments. I've been trying to get Backbone Relational to work, but I've run into problems with lazy loading.
I load up a list of posts, without the related comments. On click of a post in the list, I open up a view that fetches the full Post, comments included, and renders this.
I've got 2 Backbone.RelationalModels, Posts and Comments. The post relation to the comment is set up as follows:
relations: [{
type: Backbone.HasMany,
key: 'comments',
relatedModel: 'Comment',
includeInJSON: true, // true includes the comments in the exported JSON; use false to leave them out
collectionType: 'Comments'
}]
Now the problem I'm facing is that the relationships are initialized as soon as I retrieve my list, which does not contain the comments yet. I load up the full data later, by fetching the model by its URI. However, the relationships don't seem to reinitialise: calling Posts.get(1).get('comments') returns a Comments collection that is empty!
Does anyone know how I could best tackle this problem? The data is there, it just seems the collection of the comments doesn't get updated.
Edit: I can make the Post model bind its change:comments to itself, which updates the collection. However, I can't seem to find a solid way to get the original comments' JSON, since this.get('comments') returns the Comments collection.
Note: In my collection, I do parse the JSON to make it work with my API, with the following code:
parse: function(response) {
var response_array = [];
_.each(response, function(item) {
response_array.push(item);
});
return response_array;
}
This is because the JSON returned by my API returns an object with indexed keys (associative array) instead of a native JSON array.
{
"id" : "1",
"title" : "post title",
"comments" : {
"2" : {
"id" : "2",
"description": "this should solve it"
},
"6" : {
"id" : "6",
"description": "this should solve it"
}
}
}
Thanks a bunch! Please ask any questions, I'm sure I've been vague somewhere!
The Backbone Relational model doesn't parse collections other than arrays, so the JSON from my question didn't work. I changed the backend to return the comments in a proper array:
{
"id" : "1",
"title" : "post title",
"comments" : [
{
"id" : "2",
"description": "this should solve it"
},
{
"id" : "6",
"description": "this should solve it"
}]
}
The RelationalModel doesn't respect the parse function that Backbone provides to parse your JSON before it moves on. With the backend returning "proper" JSON, the lazy loading works without any extra code.
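If changing the backend is not an option, the same reshaping could in principle be done on the client before the response is handed to the model. The sketch below is plain JavaScript and makes no claim about where Backbone Relational would let you hook it in; normalizeComments is a hypothetical helper that turns the keyed comments object from the question into the array form shown above.

```javascript
// Hypothetical helper: returns a copy of the post whose `comments`
// property is an array, whatever shape the API sent.
function normalizeComments(post) {
  const comments = post.comments;
  // Leave the post untouched if comments is already an array or not an object
  if (Array.isArray(comments) || comments === null || typeof comments !== "object") {
    return post;
  }
  // Object.values drops the "2", "6" keys and keeps only the comment objects
  return { ...post, comments: Object.values(comments) };
}

const apiResponse = {
  id: "1",
  title: "post title",
  comments: {
    "2": { id: "2", description: "this should solve it" },
    "6": { id: "6", description: "this should solve it" }
  }
};

console.log(JSON.stringify(normalizeComments(apiResponse), null, 2));
```

Because each comment already carries its own id, nothing is lost when the outer keys are discarded.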
You can also use the initialize method on your comment model, to simulate the parse method and define attributes with custom values like this (CoffeeScript):
initialize: (obj) ->
@attributes = obj.customKey
