I have been struggling with a site I am scraping using Scrapy.
The site returns a series of JavaScript variables (arrays) with the product data.
Example:
datos[0] = ["12345","3M YELLOW CAT5E CABLE","6.81","1","A","N","N","N","N","N",0,0,0,0,0,"0","0","0","0","0","P","001-0030","12","40K8957","28396","250","Due: 30-12-1899",0.0000,1,"",\'\'];
datos[1] = ["12346","3M GREEN CAT5E CABLE","7.81","1","A","N","N","N","N","N",0,0,0,0,0,"0","0","0","0","0","P","001-0030","12","40K8957","28396","250","Due: 30-12-1899",0.0000,1,"",\'\'];
...
So on...
Fetching the array into a string with Scrapy was easy, since the site response prints the variables.
The problem is that I want to transform it into JSON so I can process it and store it in a database table.
Normally I would use JavaScript's JSON.stringify function to convert it to JSON and post it to PHP.
However, when using Python's json.loads, and even StringIO, I am unable to load the array as JSON.
It is probably a format error, but I am unable to identify it, since I am not an expert in JSON or Python.
EDIT:
I just realized that, since Scrapy is unable to execute JavaScript, the main issue is probably that the data is just a string. I should convert it into JSON format.
Any help is more than welcome.
Thank you.
If you want to take an array and create a JSON object, you could do something like this:
import json
values = ["12345","3M YELLOW CAT5E CABLE","6.81","1","A","N","N","N","N","N",0,0,0,0,0,"0","0","0","0","0","P","001-0030","12","40K8957","28396","250","Due: 30-12-1899",0.0000,1]
# Use each value's position in the array as its key
keys = list(range(len(values)))
d = dict(zip(keys, values))
# json.dumps turns the integer keys into strings ("0", "1", ...)
x = json.dumps(d)
The Scrapy documentation has a section describing various ways to parse JavaScript code. In your case, if you just need the data as an array, you can use a regular expression to extract it.
Since the question does not include the website you are scraping, I am assuming this is the most straightforward way to get it, but you can use whichever approach suits you.
Yesterday I started experimenting with MongoDB in Node, and when it came to retrieving data I ran into a practice that seemed odd.
You retrieve data from a collection within a database by calling:
data = client.db([dbname]).collection([collectionname]).find([searchcriteria])
This returns what seems to be an object, at least according to typeof.
The sample code then uses the following lines to log it to the console:
function iterate(x){
console.log(x)
}
data.forEach(iterate)
The output is as expected, in this case two objects with two key-value pairs each. Everything is fine so far.
I thought the iterate function was a bit unnecessary, so I changed that to just
console.log(data)
expecting the two objects in an array or nested in another object, but what I get is a huge object with all kinds of different things in it EXCEPT the two objects we saw before.
So now to my questions, and what I need a deeper explanation on:
Why can I actually use .forEach() on this object? I cannot recreate this on other objects.
Second, why does console.log(data) give me all this output that is hidden when I go through .forEach()?
Is there any other way to quickly retrieve data from Mongo within one or two lines of code?
This seems like a not very useful way of doing things.
And how does this .forEach() on objects work? I found an article here on Stack Overflow, but it was not very detailed and not very easy to understand.
The find function returns a cursor; this is the huge object that you are seeing. Check out the documentation for more details: https://docs.mongodb.com/manual/reference/method/db.collection.find/#db.collection.find
The reason you can call forEach on the returned object (the cursor) is that it is one of the cursor's methods. See https://docs.mongodb.com/manual/reference/method/cursor.forEach/#cursor.forEach
An overview of all cursor methods is here: https://docs.mongodb.com/manual/reference/method/js-cursor/
To get the array of data that you are looking for, use the toArray method. In the Node driver, toArray returns a Promise, so await it (inside an async function):
const data = await client.db([dbname]).collection([collectionname]).find([searchcriteria]).toArray()
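To illustrate why forEach works on the cursor but not on ordinary objects, here is a toy sketch. It is not the MongoDB driver's implementation: ToyCursor, fetchDocs and the options field are hypothetical names made up for this example. The point is only that forEach is available because the object's class defines it, and that logging the object shows its internal bookkeeping rather than the documents, which are fetched only on request.

```javascript
// Toy stand-in for a database cursor. NOT the MongoDB driver's code;
// ToyCursor, fetchDocs and options are hypothetical names.
class ToyCursor {
  constructor(fetchDocs) {
    this.fetchDocs = fetchDocs;          // called only when results are requested
    this.options = { batchSize: 1000 };  // internal bookkeeping that console.log would print
  }

  // forEach is available only because the class defines it as a method.
  forEach(callback) {
    this.fetchDocs().forEach(callback);
  }

  // Resolves to a plain array containing every document.
  toArray() {
    return Promise.resolve(this.fetchDocs());
  }
}

const cursor = new ToyCursor(() => [
  { name: "a", qty: 1 },
  { name: "b", qty: 2 },
]);

// Works, because ToyCursor defines forEach.
const seen = [];
cursor.forEach((doc) => seen.push(doc.name));

// An ordinary object has no forEach method, so the same call would fail there.
const plain = { name: "a" };
const plainHasForEach = typeof plain.forEach === "function";

// console.log(cursor) would print fetchDocs and options, not the documents.
cursor.toArray().then((docs) => console.log(docs));
```

The real cursor carries far more internal state, which is why logging it directly produces such a large object.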
This question already has answers here:
Parse JSON in JavaScript? [duplicate]
(16 answers)
Closed 5 years ago.
I have JSON data in the following format:
{"User":"aa2","Owner":"aa2_role","Status":"locked","Port":"5432","Description":"Transferred from CFS01 on Jun29","Project":"aa2","Server":"localhost"}
I can count on the key called "Project" always being the same.
What I need is to pull the value of the "Project" key, which in this case is "aa2", using a regex in JavaScript.
I have been trying different variations of /^("Project":)$/g but it is not working.
In my last attempt I tried /[^"Project":$]/g, but it gives me everything in the JSON object except "Project":
Does anyone know how I can capture the value of a specific Project key?
EDIT: Just for clarification, I agree with using the parsing mechanism. My problem is that I am debugging an older code base that uses a separate function to pass the parsed JSON as a string into the function where I now need to pull the data out of that string. After spending this much time trying to come up with a workaround, I am about to scrap the function and use JSON.parse instead. That's why I was trying to use a regex, but the more I think about it, the more I realize that it's just bad code. Still, now I am just curious.
There is no need to use a regular expression to extract a value from an object string. Just use JSON.parse(), which converts the string to an object. After parsing, you can use the key to access the value.
var obj = JSON.parse('{"User":"aa2","Owner":"aa2_role","Status":"locked","Port":"5432","Description":"Transferred from CFS01 on Jun29","Project":"aa2","Server":"localhost"}');
console.log(obj.Project);
Parse the JSON and then get the value.
Working example: https://jsfiddle.net/vineeshmp/2c8z7r6u/
var json =' {"User":"aa2","Owner":"aa2_role","Status":"locked","Port":"5432","Description":"Transferred from CFS01 on Jun29","Project":"aa2","Server":"localhost"}';
json = JSON.parse(json);
alert(json.Project);
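For the curiosity raised in the question's edit, here is a minimal sketch of a regex that captures the value. The attempts in the question fail for two reasons: ^ and $ anchor the pattern to the whole string, and square brackets with a leading ^ form a negated character class that matches any single character not listed. The sketch assumes the value is a simple string without escaped quotes and that the text "Project": does not also occur inside another value. JSON.parse remains the more robust choice.

```javascript
const text = '{"User":"aa2","Owner":"aa2_role","Status":"locked","Port":"5432","Description":"Transferred from CFS01 on Jun29","Project":"aa2","Server":"localhost"}';

// Match the key, optional whitespace around the colon, then capture
// everything up to the next double quote as the value.
const match = text.match(/"Project"\s*:\s*"([^"]*)"/);

// match[1] holds the first capture group; match is null when the key is absent.
const project = match ? match[1] : null;

console.log(project); // aa2
```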
This question already has answers here:
How to store objects in HTML5 localStorage/sessionStorage
(24 answers)
Closed 6 years ago.
I'm trying to build a chrome extension that would feed data in the localStorage of the form:
var a = 'YouTube';
var data = {
'title': 'twenty one pilots: Stressed Out [OFFICIAL VIDEO]',
'url': 'https://www.youtube.com/watch?v=pXRviuL6vMY'
};
localStorage.setItem(a, data);
But when I look into the resources in the dev tools, it doesn't show the data object in the value table. How can I make it appear there? What's wrong with the code?
The Image of the console and the localStorage.
I have tried commands like localStorage.YouTube and localStorage.getItem('YouTube'), but they always return [object Object].
What could be a possible workaround for making something like this possible?
localStorage in HTML5 can only store key-value pairs of strings. You can't store objects there directly.
But you can use JSON. To make a string from your object, use JSON.stringify(yourObj);
When you want to get the value back from localStorage, parse the stored string to create a new object that you can use again: JSON.parse(yourString);
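A minimal sketch of the full round trip, using the data from the question. So that it also runs outside a browser, it uses a small stand-in object (fakeStorage, a hypothetical name) that converts values to strings the way localStorage does. In the extension you would call localStorage.setItem and localStorage.getItem instead.

```javascript
// Stand-in for localStorage (hypothetical helper): like the real Storage
// API, it converts every stored value to a string.
const fakeStorage = {
  items: new Map(),
  setItem(key, value) {
    this.items.set(String(key), String(value));
  },
  getItem(key) {
    return this.items.has(key) ? this.items.get(key) : null;
  },
};

const a = "YouTube";
const data = {
  title: "twenty one pilots: Stressed Out [OFFICIAL VIDEO]",
  url: "https://www.youtube.com/watch?v=pXRviuL6vMY",
};

// Storing the object directly coerces it to the string "[object Object]".
fakeStorage.setItem(a, data);
const broken = fakeStorage.getItem(a);

// Serialize before storing, parse after reading.
fakeStorage.setItem(a, JSON.stringify(data));
const restored = JSON.parse(fakeStorage.getItem(a));

console.log(broken);         // [object Object]
console.log(restored.title); // twenty one pilots: Stressed Out [OFFICIAL VIDEO]
```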
This question already has answers here:
How can I access and process nested objects, arrays, or JSON?
(31 answers)
Closed 8 years ago.
I have a problem where I need to manipulate a JSON result in a way that
resembles SQL operations,
such as left joins, SUM and GROUP BY.
Has any of you encountered this lately?
I am currently exploring jsonsql for JavaScript for the time being.
Attached is the JSON file that I need to manipulate.
The new result should be like this:
So I am guessing I need to use left joins to end up with this kind of output.
My problem is collecting all or some parts of the rows and turning them into columns,
then putting it all together, something like that.
I hope that makes sense. Thanks for all your responses.
The language provides you with some nice features, so there's no need to wrap SQL over JS:
var data=[{"firstName":"Alice", "age":"16"},{"firstName":"Bob", "age":"18"} ... {"firstName":"Zacharias", "age":"37"}]
If you want to SELECT * FROM json WHERE age>16, you could do something equivalent in JS:
data.filter(function(x){ return x.age>16 })
If you want to SELECT count(*) FROM json you just write
data.length;
If you want to SELECT avg(age) FROM json you could write
data.reduce(function(o,n){ return o+(n.age/data.length) }, 0)
If you want to SELECT sum(age) FROM json you could write
data.reduce(function(o,n){ return o+n.age*1 },0)
So why not use what the language gives you?
Edit: I saw that you specified your needs. What exactly is the transformation you need? I think there should be a JS way to do what you want.
Edit2: For a tabular representation, you have to do a reduce over all the data. I simplified the example a little:
var aaData=[{"country":"USA", "month":"1", "earnings":"1000"}, {"country":"USA", "month":"2", "earnings":"1001"}, {"country":"USA", "month":"3", "earnings":"1002"}, {"country":"Germany", "month":"1", "earnings":"1000"}, {"country":"Germany", "month":"2", "earnings":"1001"}, {"country":"Germany", "month":"3", "earnings":"1002"}]
var result=aaData.reduce(function(o, n){
if (!o.hasOwnProperty(n.country)) o[n.country]={};
o[n.country][n.month]=n.earnings;
return o;
}, {})
Here is a JSFiddle to play with.
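Since the question also asks about GROUP BY, SUM and left joins, here is a rough sketch of how those might look with plain array methods. It reuses the earnings rows from the example above. The capitals array is hypothetical data added only to show the join, and it has no row for Germany on purpose.

```javascript
const aaData = [
  { country: "USA", month: "1", earnings: "1000" },
  { country: "USA", month: "2", earnings: "1001" },
  { country: "USA", month: "3", earnings: "1002" },
  { country: "Germany", month: "1", earnings: "1000" },
  { country: "Germany", month: "2", earnings: "1001" },
  { country: "Germany", month: "3", earnings: "1002" },
];

// SELECT country, sum(earnings) FROM aaData GROUP BY country
const totals = aaData.reduce((acc, row) => {
  // earnings are stored as strings, so convert before adding
  acc[row.country] = (acc[row.country] || 0) + Number(row.earnings);
  return acc;
}, {});

// Hypothetical second table for the join.
const capitals = [{ country: "USA", capital: "Washington, D.C." }];

// SELECT * FROM aaData LEFT JOIN capitals USING (country)
// Every left-hand row is kept; capital is null when nothing matches.
const joined = aaData.map((row) => {
  const match = capitals.find((c) => c.country === row.country);
  return { ...row, capital: match ? match.capital : null };
});

console.log(totals); // { USA: 3003, Germany: 3003 }
console.log(joined[3].capital); // null
```

Using find inside map is fine for small inputs. For larger data it may be worth building a lookup object keyed by country first.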