Choropleth Alternative Method and more - javascript
In my previous posts (post1 and post2), I managed to fix the choropleth map/legend issue and the circle-drawing problems I had when drawing a map.
When I follow this must-do tutorial about choropleths, and when I search the internet, I always find the same logic:
d3.csv("my.csv", function(data) {
  d3.json(myjson, function(json) {
    for (var i = 0; i < data.length; i++) {
      // Grab state name
      var dataState = data[i].nom;
      // Grab data value, and convert from string to float
      var dataValue = parseFloat(data[i].population);
      // Find the corresponding state inside the GeoJSON
      for (var j = 0; j < json.features.length; j++) {
        var jsonState = json.features[j].properties.nom;
        if (dataState == jsonState) {
          // Copy the data value into the JSON
          json.features[j].properties.CA = dataValue;
          // Stop looking through the JSON
          break;
        }
      }
    }
  });
});
In my case, I have a map with 75 paths (1 path = 1 region) and my CSV file has 75 rows (1 row = 1 path).
Now I'm trying to do things a little differently.
My new CSV has N rows (N > 75, let's say 200), and each row is a store (properties + lat + lon) assigned to a path, so I can have, e.g., 5 stores per path.
Here are my questions:
1) How do I write my choropleth code differently? I'd like to scan the CSV file and, for each distinct path, return the sum of a specific property (here "income") so that I can write it into my JSON.
2) When I click on a specific region/path, I'd like to display in a new div (in my case #output) the JSON file corresponding to that region (basically I have 75 JSON files: "region1.json", "region2.json", and so on) with circles inside (one circle = one store, the "name" column in my CSV file). How do I retrieve this "on click" value and load the corresponding JSON file?
3) Finally, if I click on a specific circle displayed in the #output div, I'd like to show a chart in a third div. How do I write my third div so that it is displayed correctly (CSS, or something else)? The same could apply to #output too.
Thank you so much for reading this request, and for your availability and help.
Here's the plunker file (do not mind the sales.csv file; I just used it to try displaying something when I click on a path).
Thanks again
d3.nest() with rollup() is the solution:
d3.csv("source-data.csv", function(error, csv_data) {
  var data = d3.nest()
    .key(function(d) { return d.date; })
    .rollup(function(d) {
      return d3.sum(d, function(g) { return g.value; });
    })
    .entries(csv_data);
});
More info here
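To connect the rollup idea back to the choropleth question, here is a minimal sketch in plain JavaScript (no d3 required) that sums one column per region and copies the totals onto the GeoJSON features. The column names `region` and `income`, the sample rows, and the helper names `sumBy` and `attachTotals` are assumptions made for illustration; only the `nom` property comes from the question's code.

```javascript
// Hypothetical rows, shaped like what d3.csv would return (all values are strings).
const stores = [
  { name: "Store A", region: "Ain", income: "120.5" },
  { name: "Store B", region: "Ain", income: "79.5" },
  { name: "Store C", region: "Aisne", income: "300" }
];

// Hypothetical GeoJSON fragment; only `properties.nom` matters for the join.
const geojson = {
  type: "FeatureCollection",
  features: [
    { type: "Feature", properties: { nom: "Ain" }, geometry: null },
    { type: "Feature", properties: { nom: "Aisne" }, geometry: null },
    { type: "Feature", properties: { nom: "Allier" }, geometry: null }
  ]
};

// Sum one numeric column per key, returning a Map of key -> total.
function sumBy(rows, keyField, valueField) {
  const totals = new Map();
  for (const row of rows) {
    const value = parseFloat(row[valueField]) || 0;
    totals.set(row[keyField], (totals.get(row[keyField]) || 0) + value);
  }
  return totals;
}

// Copy each total onto the matching feature; regions without stores get 0.
function attachTotals(collection, totals, nameField, targetField) {
  for (const feature of collection.features) {
    const name = feature.properties[nameField];
    feature.properties[targetField] = totals.get(name) || 0;
  }
  return collection;
}

const incomeByRegion = sumBy(stores, "region", "income");
attachTotals(geojson, incomeByRegion, "nom", "income");
```

Because the totals live in a Map, each feature is matched with a single lookup instead of the nested loop from the tutorial code.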
Related
Parsing file - Improve load time
May I know if I could parse differently, so that performance and load time can be improved? I deal with files whose size is in MBs. My file with multiple objects is parsed as below. I display multiple graphs in a single page. The 'chart1' to 'chart5' arrays are used for each graph's Y-axis. The 'xAxisTime' array is used for the x-axis (common x-axis for all graphs). E.g., I get 50000 values in each array (chart1, chart2 ... xAxisTime).
data = fileWithMultipleObjects;
objects = data.split('\n');
for (var i = 0; i < objects.length - 2; i++) {
  var obj = JSON.parse(objects[i]);
  if (obj.type === "A") {
    chart1.push(obj.c1);
    chart2.push(obj.c2);
  } else {
    chart1.push(null);
    chart2.push(null);
  }
  if (obj.type === "B") {
    chart3.push(obj.c3);
    chart4.push(obj.c4);
    chart5.push(obj.c5);
  } else {
    chart3.push(null);
    chart4.push(null);
    chart5.push(null);
  }
  // common for all charts - xAxis data
  if (obj.date === undefined) {
    obj.date = null;
  }
  if (obj.date !== null) {
    var date = obj.date;
    xAxisTime.push(date);
  }
}
Reason for pushing null values when there is no data: I need to show all 5 graphs in line with the time the job was run. Also, this helps me use "this.point.y" (var index = this.point.y), which helps me display tooltips on the plot line. Fiddle showing the tooltip (when clicked on the green line) with "this.point.y": https://jsfiddle.net/xqb0cn5r/4/
This is how my graphs look. Fiddle: https://jsfiddle.net/xgpL1w3t/. Only charts that meet the if condition are rendered; chart1 and chart5 are rendered in this case. Thank you.
Edited: Below is the data structure of my file:
{"C1":55.77,"C2":11367.25,"type":"A","date":"10/24/2022 12:05:37.236"}
{"C3":55.77,"C4":11367.25,"type":"B","date":"10/24/2022 12:05:37.236","C5":445.21}
{"C1":55.77,"C2":11367.25,"type":"A","date":"10/24/2022 12:05:37.236"}
{"C1":55.77,"C2":11367.25,"type":"A","date":"10/24/2022 12:05:37.240"}
{"C3":55.77,"C4":11367.25,"type":"B","date":"10/24/2022 12:05:37.250","C5":445.25}
{"C3":55.77,"C4":11367.25,"type":"B","date":"10/24/2022 12:05:37.275","C5":445.26}
I display around 8 to 10 charts in the real scenario, with around 50000 points each and a common x-axis for all. Out of 50000, only 2000 to 4000 are real data; the remaining are all null values. I push null so that I can keep the graphs in line with the whole time (x-axis) the job was run. Right now, it takes time to load the charts, as there is so much data to parse and load. I believe performance can be improved if my data is parsed differently for the same scenario. I would appreciate any suggestions.
Performing a single JSON parse as below:
var convertStringToJsonFormat = "[" + data.split('\n').join(", ");
JsonFormat = convertStringToJsonFormat.slice(0, -2);
JsonFormat = JsonFormat + "]";
data = JSON.parse(JsonFormat);
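The single-parse idea above can be sketched so that it does not depend on how the file ends (the `slice(0, -2)` step assumes exactly one trailing newline). This is only an illustration under stated assumptions: the helper names `parseLines` and `buildSeries` are hypothetical, the sample text is made up from the structure shown above, and the field names use the uppercase `C1` ... `C5` keys from that sample. Whether it is actually faster than parsing line by line would need to be measured on the real files.

```javascript
// Hypothetical sample in the same newline-delimited shape as the file above.
const fileText = [
  '{"C1":55.77,"C2":11367.25,"type":"A","date":"10/24/2022 12:05:37.236"}',
  '{"C3":55.77,"C4":11367.25,"type":"B","date":"10/24/2022 12:05:37.236","C5":445.21}',
  ""
].join("\n");

// Parse newline-delimited JSON with a single JSON.parse call,
// skipping blank lines so trailing newlines do not matter.
function parseLines(text) {
  const lines = text.split("\n").filter(line => line.trim() !== "");
  return JSON.parse("[" + lines.join(",") + "]");
}

// Build one array per chart in a single pass, pushing null where a
// record does not belong to that chart so all arrays stay aligned.
function buildSeries(records) {
  const series = {
    chart1: [], chart2: [], chart3: [], chart4: [], chart5: [], xAxisTime: []
  };
  for (const rec of records) {
    const isA = rec.type === "A";
    const isB = rec.type === "B";
    series.chart1.push(isA ? rec.C1 : null);
    series.chart2.push(isA ? rec.C2 : null);
    series.chart3.push(isB ? rec.C3 : null);
    series.chart4.push(isB ? rec.C4 : null);
    series.chart5.push(isB ? rec.C5 : null);
    if (rec.date !== undefined && rec.date !== null) {
      series.xAxisTime.push(rec.date);
    }
  }
  return series;
}

const records = parseLines(fileText);
const series = buildSeries(records);
```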
HighChart with more than 50 data points breaks
I am trying to create a bubble chart using the JS HighChart in Angular2+. Whenever there are more than 50 data points (bubbles), the graph breaks. There is the correct number of bubbles in the correct positions (x, y plots), all with different colors, but the sizes are all the same even though the z-values are all different. (I am outputting the z-values in a tooltip and the z-values are accurate.)
This function is how I am passing data in to the HighChart configuration.
setSeries() {
  this.objData = []
  this.Data.forEach(element => {
    var x = element['xVal'];
    var y = element['yVal'];
    var z = element['zVal'].toFixed(0);
    var name = element['seriesName'].trim();
    var newData = [{
      x: x,
      y: y,
      z: +z,
    }]
    // SetSeriesData is how i am creating the obj to pass into series=[] in highchart configuration
    if (i < 50) // If I comment this condition, the graph breaks. Right now, the graph is working properly
      this.setSeriesData(sumData, name, this.objData)
    i++
  })
  this.options.series = this.objData;
  this.generateChart();
}
This is my setSeriesData function.
setSeriesData(graphData: any, dataName: any, objData: any) {
  var obj = {};
  obj['name'] = dataName;
  obj['data'] = graphData;
  obj['events'] = {
    click: function(e) {
      // takes me to another link
    }
  };
  objData.push(obj)
}
In the above function, I configured the chart so that when you click a bubble, it takes you to another page. When there are more than 50 data points, this click functionality does not work either. In addition, the fillOpacity is not correct.
Just a few things to point out:
1. I am using Angular 2+.
2. The discovered issues are fillOpacity, click, and size based on z-value.
3. It works perfectly when there are fewer than 50 data points.
How can I fix this?
dc.js Scatter Plot with multiple values for a single key
We have scatter plots working great in our dashboard, but we have been thrown a curve ball. We have a new dataset that provides multiple y values for a single key. We have other datasets where this occurs, and there we flattened the data first, but we do not want to flatten this dataset. The scatter plot should use the uid for the x-axis and each value in the inj field for the y-axis values. The inj field will always be an array of numbers, but each row could have 1 .. n values in the array.
var data = [
  {"uid":1, "actions": {"inj":[2,4,10], "img":[10,15,25], "res":[15,19,37]}},
  {"uid":2, "actions": {"inj":[5,8,15], "img":[5,8,12], "res":[33,45,57]}},
  {"uid":3, "actions": {"inj":[9], "img":[2], "res":[29]}}
];
We can define the dimension and group to plot the first value from the inj field.
var ndx = crossfilter(data);
var spDim = ndx.dimension(function(d){ return [d.uid, d.actions.inj[0]]; });
var spGrp = spDim.group();
But are there any suggestions on how to define the scatter plot to handle multiple y values for each x value? Here is a jsfiddle example showing how I can display the first element or the last element. But how can I show all elements of the array?
--- Additional Information ---
Above is just a simple example to demonstrate a requirement. We have developed a dynamic data explorer that is fully data driven. Currently the datasets being used are protected. We will be adding a public dataset soon to show off the various features. Below are a couple of images. I have hidden some legends. For the scatter plot we added a vertical-only brush that is enabled when pressing the "Selection" button. The notes section is populated on scatter plot chart initialization with the overall dataset statistics. Then, when any filter is performed, the notes section is updated with statistics of just the filtered data. The field selection tree displays the metadata for the selected dataset. The user can decide which fields to show as charts and in datatables (not shown).
Currently, for the dataset shown, we only have 89 available fields, but for another dataset there are 530 fields the user can mix and match. I have not shown the various tabs below the charts DIV that hold several datatables with the actual data. The metadata has several fields that are defined to help us dynamically build the explorer dashboard.
I warned you the code would not be pretty! You will probably be happier if you can flatten your data, but it's possible to make this work. We can first aggregate all the injs within each uid, by filtering by the rows in the data and aggregating by uid. In the reduction we count the instances of each inj value:
var uidDimension = ndx.dimension(function (d) {
      return +d.uid;
    }),
    uidGroup = uidDimension.group().reduce(
      function(p, v) { // add
        v.actions.inj.forEach(function(i) {
          p.inj[i] = (p.inj[i] || 0) + 1;
        });
        return p;
      },
      function(p, v) { // remove
        v.actions.inj.forEach(function(i) {
          p.inj[i] = p.inj[i] - 1;
          if (!p.inj[i])
            delete p.inj[i];
        });
        return p;
      },
      function() { // init
        return {inj: {}};
      }
    );
Here we assume that there might be rows of data with the same uid and different inj arrays. This is more general than needed for your sample data: you could probably do something simpler if there is indeed only one row of data for each uid.
To flatten out the resulting group, we can use a "fake group" to create one group-like {key, value} data item for each [uid, inj] pair:
function flatten_group(group, field) {
  return {
    all: function() {
      var ret = [];
      group.all().forEach(function(kv) {
        Object.keys(kv.value[field]).forEach(function(i) {
          ret.push({
            key: [kv.key, +i],
            value: kv.value[field][i]
          });
        })
      });
      return ret;
    }
  }
}
var uidinjGroup = flatten_group(uidGroup, 'inj');
Fork of your fiddle
In the fiddle, I've added a bar chart to demonstrate filtering by UID. Filtering on the bar chart works, but filtering on the scatter plot does not. If you need to filter on the scatter plot, that could probably be fixed, but it could only filter on the uid dimension because your data is too coarse to allow filtering by inj.
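To see what the "fake group" hands to the scatter plot, here is a small self-contained sketch that runs without crossfilter. The `mockUidGroup` object is a hypothetical stand-in for the reduced `uidGroup` (its counts are what the reduction would produce for the three sample rows), and `flattenGroup` follows the same logic as `flatten_group` in the answer.

```javascript
// Hypothetical stand-in for the reduced crossfilter group: all() returns
// {key, value} items where value.inj maps each inj value to its count.
const mockUidGroup = {
  all: function () {
    return [
      { key: 1, value: { inj: { 2: 1, 4: 1, 10: 1 } } },
      { key: 2, value: { inj: { 5: 1, 8: 1, 15: 1 } } },
      { key: 3, value: { inj: { 9: 1 } } }
    ];
  }
};

// Wrap a group so that all() yields one item per [uid, inj] pair.
function flattenGroup(group, field) {
  return {
    all: function () {
      const ret = [];
      group.all().forEach(function (kv) {
        Object.keys(kv.value[field]).forEach(function (i) {
          // Object keys are strings, so convert the inj value back to a number.
          ret.push({ key: [kv.key, +i], value: kv.value[field][i] });
        });
      });
      return ret;
    }
  };
}

const flat = flattenGroup(mockUidGroup, "inj").all();
```

Each entry in `flat` has a two-element key, which matches the [x, y] key shape produced by the `spDim` dimension in the question.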
Drawing parallel coordinates for random selection of data
I have 500 samples (rows) in my data which is stored as a csv file. You can see 5 rows of it as follow: path,Ktype,label,CGX,CGY,C_1,C_2,C_3,C_4,total1,total2,totalI3,total4,feature1,feature2,feature3,feature4,feature5,feature6,feature7,feature8,feature9,feature10,feature11,feature12,A,B,C,D,feature13,feature14,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, .\Mydata\Case1,k1,1,42,33,0,57.44534,0,52597,71,16,10,276,4038,3789.631,0.6173469,0.6499337,2.103316,0.6661285,1.065539,248.3694,0.630161,0.000192848,0.9999996,0.000642777,1,0,0,1,9.60E-05,3136.698,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, .\Mydata\Case2,k1,2,163,29,0,43.28862,0,49050,71,16,10,248,2944,2587.956,0.5726808,0.5681185,2.130387,0.601512,1.137578,356.0444,0.6335613,0.000327267,1.000029,0.001271235,1,0,0,1,0.00010854,2676.418,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, .\Mydata\Case3,k1,3,774,19,0,45.26291,0,53455,71,16,10,212,2366,1982.547,0.408179,0.4579566,1.994296,0.6615351,1.193415,383.4534,0.7153812,0.000264522,1.000031,0.001210507,1,1,0,0,9.54E-05,3221.289,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, .\Mydata\Case4,k1,4,1116,25,0,80.76469,0,57542,71,16,10,284,3908,3453.988,0.3549117,0.4811547,1.982244,0.6088744,1.131446,454.0122,0.6166388,0.000314288,0.9999836,0.00129846,0,1,1,0,0.000140592,2143.42,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, .\Mydata\Case5,k1,5,1364,59,1,52.96776,0,49670,71,16,10,228,2725,2642.675,0.4328255,0.475517,1.859871,0.6587288,1.031152,82.32471,0.5775694,0.000466264,0.9999803,0.001765345,0,1,1,0,0.00012014,2439.636,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, I am drawing parallel coordinates for my data. 
This is the part of the code which reads in the csv file and filters it:
d3.csv("Mydata.csv", function(raw_data) {
  // Convert quantitative scales to floats
  data = raw_data.map(function(d) {
    for (var k in d) {
      if (!_.isNaN(raw_data[0][k] - 0) && k != 'id' && k != 'cgX' && k != 'cgY') {
        d[k] = parseFloat(d[k]) || 0;
      }
    };
    return d;
  });
  // Extract the list of numerical dimensions and create a scale for each.
  xscale.domain(dimensions = d3.keys(data[0]).filter(function(k) {
    return (_.isNumber(data[0][k])) && (yscale[k] = d3.scale.linear()
      .domain(d3.extent(data, function(d) { return +d[k]; }))
      .range([h, 0]));
  }));
  // And the rest of the code for drawing parallel coordinates.
  // It is similar to the code in this link:
  // http://bl.ocks.org/syntagmatic/3150059
});
Now, I want to change it so that instead of drawing all 500 samples (500 polylines in parallel coordinates), it draws a random selection of 100 of them. How should I do that?
Do it in 2 passes. First read in all the data...
d3.csv("Mydata.csv", function(raw_data) {...}
Then pick 100 at random and feed them to the rendering function. The bonus is that you don't have to parse / transform the coords you don't want to render. Also, if you hit 100 before the full 500 is read, you can simply exit early and jump to the rendering algorithm.
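The "pick 100 at random" step could look something like the sketch below. It uses a partial Fisher-Yates shuffle, which is one common way to sample without replacement and is not necessarily what the answer had in mind. The function name `sampleRows` and the stand-in `allRows` array are hypothetical.

```javascript
// Return `count` distinct items chosen uniformly at random from `items`,
// using a partial Fisher-Yates shuffle on a copy so the input is untouched.
function sampleRows(items, count, random = Math.random) {
  const pool = items.slice();
  const n = Math.min(count, pool.length);
  for (let i = 0; i < n; i++) {
    // Pick an index from the not-yet-chosen tail [i, pool.length).
    const j = i + Math.floor(random() * (pool.length - i));
    const tmp = pool[i];
    pool[i] = pool[j];
    pool[j] = tmp;
  }
  return pool.slice(0, n);
}

// Hypothetical stand-in for the 500 parsed CSV rows.
const allRows = Array.from({ length: 500 }, (_, i) => ({ label: i + 1 }));
const subset = sampleRows(allRows, 100);
```

Inside the d3.csv callback, one option would be to call `sampleRows(raw_data, 100)` first and run the float conversion on the returned subset, so that only the rows being drawn are transformed.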
Extract specific row of data in d3
The data: A 'premiums.tsv' file, with two columns, 'BiddingDate' and 'CategoryA' What I'm trying to do: Using d3, display in a table or divs just the latest BiddingDate and the CategoryA value that corresponds to that latest BiddingDate. I've got a chart on the page, that works fine. I can also get a table of all the values. That works fine too. But I just can't figure out how to isolate the data corresponding to the latest date value and then display it. Would really appreciate any help. Thanks!
I solved it by using data.map to map the data into a new array, then simply calling the last item in the array. I only ever have 241 data points, but for datasets of variable size, you can just use .length to figure out the latest data point.
var allDates = [];
allDates = data.map(function(d) { return d.BiddingMonth; });
var latestDate = allDates[240];
var allCatA = [];
allCatA = data.map(function(d) { return d.CategoryAPremium; });
var latestCatA = allCatA[240];
Then all I had to do was to print latestDate and latestCatA wherever I wanted.
d3.select("#lateDate").text("$ " + latestDate);
d3.select("#lateA").text("$ " + latestCatA);
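For datasets of variable size, the `.length` idea mentioned above can be sketched as follows. The sample rows and their values are made up for illustration, the column names `BiddingDate` and `CategoryA` are taken from the question rather than from the answer's code, and `lastRow` and `latestRow` are hypothetical helper names. The second helper is an alternative for files that are not sorted by date.

```javascript
// Hypothetical rows, shaped like what d3.tsv would return for premiums.tsv.
const rows = [
  { BiddingDate: "2015-01-07", CategoryA: "65001" },
  { BiddingDate: "2015-02-04", CategoryA: "64900" },
  { BiddingDate: "2015-01-21", CategoryA: "66889" }
];

// Last row in file order; works for any dataset size.
function lastRow(data) {
  return data[data.length - 1];
}

// Row with the greatest date, regardless of file order.
// Assumes a non-empty array and dates that `new Date` can parse.
function latestRow(data, dateField) {
  return data.reduce((best, row) =>
    new Date(row[dateField]) > new Date(best[dateField]) ? row : best);
}

const byPosition = lastRow(rows);
const byDate = latestRow(rows, "BiddingDate");
```

If the file is always sorted by date, `lastRow` is enough; `latestRow` only matters when the order cannot be relied on.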