D3 making new, smaller CSV file - javascript

I'm stuck on a fairly simple problem and need help.
I have a big CSV file with 50 columns, which I absolutely can't modify.
Now I want to make a chart for which I only need 5-6 of those columns.
My idea was to make a new "data2" which contains only these 5-6 columns (with keys and everything) and work with this data2.
But I'm not able to create this data2.
To select the columns I need, I wanted to work with a regex. Something like this:
d3.keys(data[0]).filter(function(d) { return d.match(/.../); })
But how do I create the new data2 then? I'm sure I need to work with d3.map, but even with the API documentation I'm not able to understand how it works correctly.
Can someone help me out?

Firstly, your question's title is misleading: you're not asking about making a smaller CSV file, since the file itself is not changed. You're asking about changing the data array created by D3 when that CSV was parsed.
That brings us to the second point: you don't need to do that. Since you already lost some time/resources loading the CSV and parsing that CSV, the best idea is just keeping it the way it is, and using only those 5 columns you want. If you try to filter some columns out (which means deleting some properties from each object in the array) you will only add more unnecessary tasks for the browser to execute. A way better idea is changing the CSV itself.
However, if you really want to do this, you can use the columns property, which d3.csv adds to the array when it loads a CSV, and a for...in loop to delete some properties from each object.
For instance, here...
var myColumns = data.columns.slice(0, 4); // slice copies the names without modifying data.columns
... I'm getting the first 4 columns in the CSV. Then, I use this array to delete, in each object, the properties regarding all other columns:
var filteredData = data.map(function(d) {
  for (var key in d) {
    if (myColumns.indexOf(key) === -1) delete d[key];
  }
  return d;
});
Here is a demo. I'm using a <pre> element because I cannot use a real CSV in the Stack snippet. My "CSV" has 12 columns, but my filtered array keeps only the first 4:
var data = d3.csvParse(d3.select("#csv").text());
var myColumns = data.columns.slice(0, 4); // slice copies the names without modifying data.columns
var filteredData = data.map(function(d) {
  for (var key in d) {
    if (myColumns.indexOf(key) === -1) delete d[key];
  }
  return d;
});
console.log(filteredData);
pre {
display: none;
}
<script src="https://d3js.org/d3.v4.min.js"></script>
<pre id="csv">foo,bar,baz,foofoo,foobar,foobaz,barfoo,barbar,barbaz,bazfoo,bazbar,bazbaz
1,2,5,4,3,5,6,5,7,3,4,3
3,4,2,8,7,6,5,6,4,3,5,4
8,7,9,6,5,6,4,3,4,2,9,8</pre>
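If you would rather leave the parsed rows untouched and select columns with a regex, as the question suggests, a minimal sketch could look like the following. The sample rows and the pattern /^foo/ are assumptions made for illustration (the question does not show its real pattern), and the snippet uses plain JavaScript, so it does not depend on any particular D3 version.

```javascript
// Hypothetical rows standing in for the array that d3.csv / d3.csvParse returns.
const data = [
  { foo: "1", bar: "2", baz: "5", foofoo: "4" },
  { foo: "3", bar: "4", baz: "2", foofoo: "8" }
];

// Assumed pattern: keep every column whose name starts with "foo".
const pattern = /^foo/;
const wanted = Object.keys(data[0]).filter(function (key) {
  return pattern.test(key);
});

// Build new objects instead of deleting properties from the originals.
const data2 = data.map(function (row) {
  const picked = {};
  wanted.forEach(function (key) {
    picked[key] = row[key];
  });
  return picked;
});

console.log(data2);
```

Because data2 holds new objects, the original data array keeps all of its columns and can still be used elsewhere.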

Related

Javascript adding new column with primary key of the array

I have a simple CSV file like the one below. I want to read that file, look up the latitude and longitude for each row from a web service, and then write them as new columns into the objects in the array.
I can read and import the CSV, and I can get the lat and long values. But when I try to update the array, it does not work. Sometimes it updates only one row, sometimes two rows (which makes no sense to me), but I cannot update all of the rows, which is what I need.
I am working in a React environment. My web service call and code are below. I would be grateful if you could help me find my bug. Thanks.
orderid,postcode,weight
1,BB9 7BP,5000
2,E4 9JB,3500
3,WR1 2RX,10000
4,WR5 2AD,5000
function getCoordinatesForAddresses(event) {
  orders.map((e) => {
    let url = `https://nominatim.openstreetmap.org/search.php?q=${e.postcode}&polygon_geojson=1&format=jsonv2`;
    axios.get(url).then((res) => {
      orders.find((ord) => ord.orderid === e.orderid)["latitude"] =
        res.data[0].lat;
      orders.find((ord) => ord.orderid === e.orderid)["longitude"] =
        res.data[0].lon;
    });
  });
}
When I log the whole array, I can see that the new columns have been added.
But when I log the objects one by one, I cannot see the new columns, although when I expand an object I do see the values in it.
This is very puzzling to me. I tried with filter, find and nearly everything else.

How to organise/nest data for d3.js chart output

I'm looking for some advice on how to effectively use large amounts of data with d3.js. Let's say, for instance, I have this data set taken from a raw .csv file (converted from Excel):
EA
,Jan_2016,Feb_2016,Mar_2016
Netherlands,11.7999,15.0526,13.2411
Belgium,25.7713,24.1374
France,27.6033,23.6186,20.2142
EB
,Jan_2016,Feb_2016,Mar_2016
Netherlands,1.9024,2.9456,4.0728
Belgium,-,6.5699,7.8894
France,5.3284,4.8213,1.471
EC
,Jan_2016,Feb_2016,Mar_2016
Netherlands,3.1499,3.1139,3.3284
Belgium,3.0781,4.8349,5.1596
France,16.3458,12.6975,11.6196
Using CSV, I guess the best way to represent this data would be something like:
Org,Country,Month,Score
EA,Netherlands,Jan,11.7999
EA,Belgium,Jan,27.6033
EA,France,Jan,20.2142
EA,Netherlands,Feb,15.0526
EA,Belgium,Feb,25.9374
EA,France,Feb,23.6186
EA,Netherlands,Mar,13.2411
EA,Belgium,Mar,24.1374
EA,France,Mar,20.2142
This seems very long-winded to me, and would take up a lot of time. I was wondering if there is an easier way to do this?
From what I can tell, JSON may be the more logical choice?
For context on what kind of chart this data would go into: I would like to create a pie chart that updates depending on the country/month selected and compares the three organisations' scores each time.
(plnk to visualise)
http://plnkr.co/edit/P3loEGu4jMRpsvTOgCMM?p=preview
Thanks for any advice, I'm a bit lost here.
I would say the intermediary step you propose is a good one for keeping everything organized in memory. You don't have to create a second CSV file though: you can just load your original CSV file and turn it into an array of objects. Here is a parser:
d3.text("data.csv", function(error, dataTxt) { // import the data file as text first
  if (error) throw error;
  var dataCsv = d3.csv.parseRows(dataTxt); // parseRows gives a 2D array
  var group = "";  // the current group header ("organization")
  var times = [];  // the current month headers
  var data = [];   // the final data array, will be filled up progressively
  var country;     // the country name of the current row
  for (var i = 0; i < dataCsv.length; i++) {
    if (dataCsv[i].length == 1) { // group name
      if (dataCsv[i][0] == "")
        i++; // skip empty line
      group = dataCsv[i][0]; // get group name
      i++;
      times = dataCsv[i]; // get list of time headings for this group
      times.shift(); // (shift out first empty element)
    } else {
      country = dataCsv[i].shift(); // regular row: get country name
      dataCsv[i].forEach(function(x, j) { // enumerate values
        data.push({ // create new data item
          Org: group,
          Country: country,
          Month: times[j],
          Score: x
        });
      });
    }
  }
  // data is complete at this point and can be nested or drawn
});
This gives the following data array:
data= [{"Org":"EA","Country":"Netherlands","Month":"Jan_2016","Score":"11.7999"},
{"Org":"EA","Country":"Netherlands","Month":"Feb_2016","Score":"15.0526"}, ...]
This is IMO the most versatile structure you can have. Not the best for memory usage though.
A simple way to nest this is the following:
d3.nest()
.key(function(d) { return d.Month+"-"+d.Country; })
.map(data);
It will give a map with key-values such as:
"Jan_2016-Netherlands":[{"Org":"EA","Country":"Netherlands","Month":"Jan_2016","Score":"11.7999"},{"Org":"EB","Country":"Netherlands","Month":"Jan_2016","Score":"1.9024"},{"Org":"EC","Country":"Netherlands","Month":"Jan_2016","Score":"3.1499"}]
Use entries instead of map to get an array instead of a map, and use a rollup function if you want to simplify the data by keeping only the array of scores. At this point it is rather straightforward to plug it into any d3 drawing tool.
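To make the idea of grouping by month and country and keeping only the scores more concrete, here is a dependency-free sketch in plain JavaScript. It does not call d3.nest; the function name nestScores, the sample rows and the output property names key and scores are assumptions chosen for illustration.

```javascript
// Hypothetical sample in the long format described above.
const data = [
  { Org: "EA", Country: "Netherlands", Month: "Jan_2016", Score: "11.7999" },
  { Org: "EB", Country: "Netherlands", Month: "Jan_2016", Score: "1.9024" },
  { Org: "EC", Country: "Netherlands", Month: "Jan_2016", Score: "3.1499" },
  { Org: "EA", Country: "France", Month: "Jan_2016", Score: "27.6033" }
];

// Group rows by "Month-Country" and keep only the numeric scores,
// returning an array of { key, scores } entries in first-seen order.
function nestScores(rows) {
  const groups = {};
  rows.forEach(function (d) {
    const key = d.Month + "-" + d.Country;
    if (!groups[key]) groups[key] = [];
    groups[key].push(+d.Score); // unary plus converts the string to a number
  });
  return Object.keys(groups).map(function (key) {
    return { key: key, scores: groups[key] };
  });
}

const nested = nestScores(data);
console.log(JSON.stringify(nested));
```

Each entry then holds one score per organisation for a given month and country, which is the shape a pie layout would typically consume.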
PS: a Plunker with the running code of this script. Everything is shown in the console.

Use an array as input data in D3.js

I have built a custom D3 chart that used an old REST api, and I am looking for help on how to adapt the D3 input to the new one.
At http://codepen.io/anon/pen/bEgMdV I use the dataset 'dataset' and I have tried to use the 'data' which has 'genes' and 'dataset' in one.
So far, I have tried changing
var gs = svg.selectAll("g")
.data(d3.values(dataset)).enter().append("g");
gs.selectAll("path").data(function(d) {
return pie(d.data);
to
var gs = svg.selectAll("g")
.data(d3.values(data.values)).enter().append("g");
gs.selectAll("path").data(function(d) {
return pie(d[0]);
In order to get the numbers of the array in 'values'.
I feel I am somewhat close to getting this. Can someone help?
One way is to convert the new data into the old data structure, so that you don't need to change the chart code.
This forEach loop will convert your new data to the expected old data structure.
genes = [];
dataset = [{data: [], gene: "LPAR"}];
data.values.forEach(function(s) {
  genes.push(s[0]); // iterate and add the gene
  dataset[0].data.push(s[1]); // iterate and add the values
});
Working code here
Hope this helps!
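As a self-contained illustration of that conversion, the sketch below wraps the same loop in a function and runs it on made-up input. The shape of data.values (an array of [gene, value] pairs) is inferred from the loop above, and the gene names, numbers and the function name toOldStructure are hypothetical.

```javascript
// Hypothetical sample of the new API response: each entry is a [gene, value] pair.
const data = { values: [["GENE_A", 12], ["GENE_B", 30], ["GENE_C", 58]] };

// Convert the new structure into the old one expected by the chart code.
function toOldStructure(newData, label) {
  const genes = [];
  const dataset = [{ data: [], gene: label }];
  newData.values.forEach(function (pair) {
    genes.push(pair[0]);           // gene name
    dataset[0].data.push(pair[1]); // corresponding value
  });
  return { genes: genes, dataset: dataset };
}

const converted = toOldStructure(data, "LPAR");
console.log(converted.genes, converted.dataset[0].data);
```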

Retrieving values in crossfilter.dimension

Hi I'm a newbie in JS and Crossfilter. I'm using crossfilter with my data (.csv file) and retrieved distinct values in a column using
var scoreDim = ppr.dimension(function (d) {
return d.score;
});
Also I could get the counts for each value using
var scoreDimGroup = scoreDim.group().reduceCount();
I could use dc.js to plot the chart and the result looks correct. But how do I retrieve the values in scoreDim and scoreDimGroup so that I can use them for further processing in my code? When I look at the objects in a debugger, I can see a bunch of functions but not the actual values they contain.
scoreDim.top(Infinity)
will retrieve the records.
scoreDimGroup.top(Infinity)
will retrieve the groups (key-value pairs of the dimension value and the count).
Generally, this kind of thing is covered well in the Crossfilter API documentation.
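To show the shape of what the group call hands back, here is a dependency-free sketch that mimics a count group: an array of { key, value } objects with the largest count first. It does not use Crossfilter itself; the sample records and the helper name countGroups are assumptions for illustration.

```javascript
// Hypothetical records standing in for the parsed CSV rows.
const records = [
  { name: "a", score: 3 },
  { name: "b", score: 5 },
  { name: "c", score: 3 },
  { name: "d", score: 1 },
  { name: "e", score: 3 }
];

// Count records per key and return { key, value } pairs, largest count first.
function countGroups(rows, accessor) {
  const counts = new Map();
  rows.forEach(function (row) {
    const key = accessor(row);
    counts.set(key, (counts.get(key) || 0) + 1);
  });
  return Array.from(counts, function (entry) {
    return { key: entry[0], value: entry[1] };
  }).sort(function (a, b) {
    return b.value - a.value;
  });
}

const groups = countGroups(records, function (d) { return d.score; });

// The result is an ordinary array, so it can be processed like any other.
const distinctScores = groups.map(function (g) { return g.key; });
console.log(groups, distinctScores);
```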
You can use the top method of the group object:
var groupings = teamMemberGroup.top(Infinity);
This returns an array of groups, which will have the structure that you built in the reduce method. For example, to output the key and value you can do this:
groupings.forEach(function (x) {
console.log(x.key + x.value.projectCount);
});
You can access the dimension values in the same way:
var dimData = teamMemberDimension.top(Infinity);
dimData.forEach(function (x) {
console.log(JSON.stringify(x));
});
Here is a simple example of this: http://jsfiddle.net/djmartin_umich/T5v4N/
Rusty has a nice tutorial on how this works at http://blog.rusty.io/2012/09/17/crossfilter-tutorial/
If you are looking to view these values in the console then you can use this print_filter function that was mentioned in the tutorial!
(http://www.codeproject.com/Articles/693841/Making-Dashboards-with-Dc-js-Part-1-Using-Crossfil)
Basically you would include this bit of code in your javascript rendering of the crossfilter charts before you define your data source or your ndx variable:
function print_filter(filter) {
  var f = eval(filter);
  if (typeof(f.length) != "undefined") {} else {}
  if (typeof(f.top) != "undefined") { f = f.top(Infinity); } else {}
  if (typeof(f.dimension) != "undefined") { f = f.dimension(function(d) { return ""; }).top(Infinity); } else {}
  console.log(filter + "(" + f.length + ") = " + JSON.stringify(f).replace("[", "[\n\t").replace(/}\,/g, "},\n\t").replace("]", "\n]"));
}
Then you can simply run print_filter(scoreDim) in your console! It's that simple! You can use this to see all of the objects you create using crossfilter including groups, etc.
Hope this helps!

How do I scan JSON with jquery to determine the number of instances of a certain string?

I have some JSON which looks generally like this...
{"appJSON": [
{
"title":"Application Title",
"category":"Business",
"industry":"Retail",
"language":"English",
"tags":[
{"tags":"Sales"},{"tags":"Reporting"},{"tags":"Transportation"},{"tags":"Hospitality"}
],
},
{
"title":"Airline Quality Assurance",
...
...
...]}
I'm looping through JSON to get an array of all of the unique Tags in the data.
My question is, now that I have an array of the different unique Tags in the JSON, how do I best determine the number of times each Tag occurs?
So basically I'm looking to generate a list of all of the tags found in the JSON (which I already have) with the number of times each one occurs (which I don't already have).
Thanks a lot in advance!
I'm assuming that when you find a new tag, you check whether you already have it in your list, and add it if you don't. Why not count at the same time as you check? Something like:
var tags = [];     // unique tags found so far
var tagCount = []; // tagCount[i] is the number of times tags[i] occurs
function addTag(nextTag) { // call this with each tag read from the JSON list
  var newTag = true;
  for (var i = 0; i < tags.length; i++) {
    if (nextTag === tags[i]) {
      tagCount[i]++;
      newTag = false;
      break;
    }
  }
  if (newTag) {
    tags[tags.length] = nextTag;
    tagCount[tagCount.length] = 1;
  }
}
This uses two arrays, where tagCount[i] is the number of times the tag in tags[i] occurs. You could also use an object for this, or any other structure you prefer.
As an alternative, here's a function which fills a plain object used as an associative array: the keys are the tags and the values are the number of occurrences of each tag.
var tagCounts = {}; // Global variable here, but could be an object property or any object you like really
function countTags(tags, tagCounts) {
  $.each(tags, function(i, item) {
    var tag = item.tags; // This would change depending on the format of your JSON
    if (tagCounts[tag] === undefined) // If there's no entry for this tag yet
      tagCounts[tag] = 0;
    tagCounts[tag]++;
  });
}
So you can call this function on any number of arrays of tags, passing in your tagCounts (totals) object, and it will aggregate the totals.
var tags1 = [{"tags":"Sales"},{"tags":"Reporting"},{"tags":"Transportation"},{"tags":"Hospitality"}];
var tags2 = [{"tags":"Reporting"},{"tags":"Transportation"}];
var tags3 = [{"tags":"Reporting"},{"tags":"Hospitality"}];
countTags(tags1, tagCounts);
countTags(tags2, tagCounts);
countTags(tags3, tagCounts);
Then you can read them out like so:
for (var t in tagCounts) {
  // t is the tag, tagCounts[t] is the number of occurrences
  console.log(t + ": " + tagCounts[t]);
}
Working example here: http://jsfiddle.net/UVUrJ/1/
qw3n's answer is actually a more efficient way of doing things, as you're only looping through all the tags once, but unless you have a really huge JSON source the difference isn't going to be noticeable.
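For reference, the same counting can be done in a single pass over the whole appJSON array without jQuery, using a Map. This is only a sketch: the second application and its tags are made up, since the question elides them, and the function name countAllTags is hypothetical.

```javascript
// Sample in the shape shown in the question; the second entry is hypothetical.
const appJSON = [
  {
    title: "Application Title",
    tags: [{ tags: "Sales" }, { tags: "Reporting" }, { tags: "Transportation" }, { tags: "Hospitality" }]
  },
  {
    title: "Second application (made up)",
    tags: [{ tags: "Reporting" }, { tags: "Transportation" }]
  }
];

// Walk every application once and count each tag in a Map (tag -> occurrences).
function countAllTags(apps) {
  const counts = new Map();
  apps.forEach(function (app) {
    app.tags.forEach(function (item) {
      counts.set(item.tags, (counts.get(item.tags) || 0) + 1);
    });
  });
  return counts;
}

const counts = countAllTags(appJSON);
counts.forEach(function (count, tag) {
  console.log(tag + ": " + count);
});
```

A Map is used here rather than a plain object so that tag names which happen to match built-in property names cannot interfere with the counts.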
