I'm having trouble creating a stacked bar chart; I cannot figure out how to properly apply the dimensions to a panel dataset.
My data looks like this:
Date name value
12/1/15 name1 5
12/1/15 name2 6
12/1/15 name3 2
13/1/15 name1 2
13/1/15 name2 7
13/1/15 name3 8
14/1/15 name1 2
14/1/15 name2 5
14/1/15 name3 10
It is stored in JSON format.
I would like to create stacked charts that plot the values associated with the different names over time.
As I understand dc.js, I need to provide the chart with a date dimension and then another dimension by the different names, which then groups the values; however, I am unsure how to proceed.
Does anyone know how I can do this?
Here's how to produce the group, from the FAQ:
var group = dimension.group().reduce(
function(p, v) { // add
p[v.type] = (p[v.type] || 0) + v.value;
return p;
},
function(p, v) { // remove
p[v.type] -= v.value;
return p;
},
function() { // initial
return {};
});
(In your case, v.name instead of v.type.)
Basically you create a group which reduces to objects containing the values for each stack.
Then use the name and accessor parameters of .group() and .stack(). Unfortunately you have to spell it out, as the first has to be .group and the rest .stack.
As so:
// each accessor receives a {key, value} entry; value is the reduced object
chart.group(group, 'name1', function(d) { return d.value.name1; })
.stack(group, 'name2', function(d) { return d.value.name2; })
.stack(group, 'name3', function(d) { return d.value.name3; })
...
(Somewhere on SO or the web is a loop form of this, but I can't find it ATM.)
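A loop form might look like the following minimal sketch. It assumes the accessor receives the group's {key, value} entries; stackByNames and pick are hypothetical helper names, not part of dc.js. The chart only needs group() and stack() methods with the three-argument form used above.

```javascript
// Sketch: build the .group()/.stack() chain from a list of stack names.
// `chart` is any object exposing dc.js-style group() and stack() methods.
function stackByNames(chart, group, names) {
  // Accessor factory: picks one stack's total out of the reduced object,
  // falling back to 0 when that name never occurred for the key.
  function pick(name) {
    return function (d) { return d.value[name] || 0; };
  }
  names.forEach(function (name, i) {
    if (i === 0) {
      chart.group(group, name, pick(name)); // the first layer must use .group()
    } else {
      chart.stack(group, name, pick(name)); // every later layer uses .stack()
    }
  });
  return chart;
}
```

Usage would be along the lines of stackByNames(chart, group, ['name1', 'name2', 'name3']).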
Related
I have some csv data of the following format, that I have loaded with d3.csv():
Name, Group, Amount
Bill, A, 25
Bill, B, 35
Bill, C, 45
Joe, A, 5
Joe, B, 8
But as I understand from various examples, I need the data like this to use it in a stacked bar chart:
Name, AmountA, AmountB, AmountC
Bill, 25, 35, 45
Joe, 5, 8, NA
How can I transform my data appropriately in the js script? There is also the issue of missing data, as you can see in my example.
Thanks for any help.
Yes, you are correct that in order to use d3.stack your data needs re-shaping. You could use d3.nest to group the data by name, then construct an object for each group - but your missing data will cause issues.
Instead, I'd do the following. Parse the data:
var data = `Name,Group,Amount
Bill,A,25
Bill,B,35
Bill,C,45
Joe,A,5
Joe,B,8`;
var parsed = d3.csvParse(data);
Obtain an array of names and an array of groups:
// obtain unique names
var names = d3.nest()
.key(function(d) { return d.Name; })
.entries(parsed)
.map(function(d) { return d.key; });
// obtain unique groups
var groups = d3.nest()
.key(function(d) { return d.Group; })
.entries(parsed)
.map(function(d) { return d.key; });
(Note, this is using d3.nest to create an array of unique values. Other utility libraries such as underscore have a simpler mechanism for achieving this).
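As one simpler alternative, plain JavaScript can collect unique values with a Set. This is only a sketch: unique is a hypothetical helper name, and the rows are written out by hand in the shape d3.csvParse would produce for the sample data.

```javascript
// Rows in the shape d3.csvParse returns: one object per line, string values.
var rows = [
  { Name: 'Bill', Group: 'A', Amount: '25' },
  { Name: 'Bill', Group: 'B', Amount: '35' },
  { Name: 'Bill', Group: 'C', Amount: '45' },
  { Name: 'Joe', Group: 'A', Amount: '5' },
  { Name: 'Joe', Group: 'B', Amount: '8' }
];

// A Set keeps only distinct values and preserves first-seen order.
function unique(rows, field) {
  return Array.from(new Set(rows.map(function (d) { return d[field]; })));
}

var uniqueNames = unique(rows, 'Name');   // ["Bill", "Joe"]
var uniqueGroups = unique(rows, 'Group'); // ["A", "B", "C"]
```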
Next, iterate over each unique name and add the group value, using zero for the missing data:
var grouped = names.map(function(d) {
var item = {
name: d
};
groups.forEach(function(e) {
var itemForGroup = parsed.filter(function(f) {
return f.Group === e && f.Name === d;
});
if (itemForGroup.length) {
item[e] = Number(itemForGroup[0].Amount);
} else {
item[e] = 0;
}
})
return item;
})
This gives the data in the correct form for use with d3.stack.
Here's a codepen with the complete example:
https://codepen.io/ColinEberhardt/pen/BQbBoX
It also makes use of d3fc in order to make it easier to render the stacked series.
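For larger inputs, the same reshaping can be done in a single pass instead of filtering the rows once per name and group. The following is a rough sketch under the same assumptions as above (rows shaped like the d3.csvParse output, zero for missing data); pivot is a hypothetical helper name.

```javascript
// Sketch: pivot long-format rows into one object per name,
// with one numeric property per group and 0 for missing combinations.
function pivot(rows, groups) {
  var byName = new Map();
  rows.forEach(function (row) {
    if (!byName.has(row.Name)) {
      var item = { name: row.Name };
      groups.forEach(function (g) { item[g] = 0; }); // default for missing data
      byName.set(row.Name, item);
    }
    byName.get(row.Name)[row.Group] = Number(row.Amount);
  });
  return Array.from(byName.values());
}

var sample = [
  { Name: 'Bill', Group: 'A', Amount: '25' },
  { Name: 'Bill', Group: 'B', Amount: '35' },
  { Name: 'Bill', Group: 'C', Amount: '45' },
  { Name: 'Joe', Group: 'A', Amount: '5' },
  { Name: 'Joe', Group: 'B', Amount: '8' }
];
var pivoted = pivot(sample, ['A', 'B', 'C']);
// pivoted[1] is { name: "Joe", A: 5, B: 8, C: 0 }
```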
I have some data where, in a given column of a csv, there are six possible values:
1,2,3,4,5,NaN.
I am currently trying to group the data using the d3.nest and rollup functions. My goal is to group the data but exclude "NaN" values in the final output.
This is my current code:
var nested = d3.nest()
.key(function(d){return d[question];
})
.rollup(function(leaves){
var total = data.length
var responses = leaves.length;
return {
'responses' : responses,
'percent' : responses/total
};
})
.entries(data)
As you can see, I would like to return both a count of each of the categories as well as the percentage of the total that they represent. After removing NaN, I would also like the removal of NaN represented in percentage values of all of the other categories so that they sum to 100%.
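The easiest way to do this is to remove the rows that contain NaN before passing the data to d3.nest():
var filtered = data.filter(function(d) { return d[question] !== 'NaN'; });
Then pass filtered to .entries() and use filtered.length as the total inside the rollup, so that the percentages of the remaining categories sum to 100%.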
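To illustrate the effect on the percentages, here is a minimal sketch of the same count-and-percent summary in plain JavaScript, without d3.nest. It assumes the column values are strings (as d3.csv produces) and that missing answers appear as the string 'NaN'; summarize is a hypothetical helper name.

```javascript
// Sketch: count responses per category, ignoring 'NaN' rows, and express
// each count as a share of the remaining (filtered) rows.
function summarize(data, question) {
  var filtered = data.filter(function (d) { return d[question] !== 'NaN'; });
  var counts = {};
  filtered.forEach(function (d) {
    counts[d[question]] = (counts[d[question]] || 0) + 1;
  });
  return Object.keys(counts).map(function (key) {
    return {
      key: key,
      responses: counts[key],
      percent: counts[key] / filtered.length // total excludes the NaN rows
    };
  });
}

var answers = [{ q1: '1' }, { q1: '2' }, { q1: '2' }, { q1: '2' }, { q1: 'NaN' }];
var summary = summarize(answers, 'q1');
// summary: key "1" has percent 0.25, key "2" has percent 0.75
```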
I am trying to read in data from csv file and want to visualise this data with a scatterChart in NVD3.
I would have linked to a JSfiddle or something similar but I don't know how to include a csv file in these online JavaScript IDEs. Is that possible?
The csv file has the following format:
country,y,x
Algeria,91.8,15.7
Bahrain,98.2,49.3
Jordan,99.1,55.0
Kuwait,98.6,57.4
Lebanon,98.7,58.6
My best guess for the code to read the csv file with is:
var scatterdata = [
{
key : "Group1",
values : []//{x:"",y:""}
}
];
d3.csv("literacyScatterCountrynames.csv", function (error, csv) {
if (error) return console.log("there was an error loading the csv: " + error);
console.log("there are " + csv.length + " elements in my csv set");
scatterdata[0].values["x"] = csv.map(function(d){return [+d["x"] ]; });
scatterdata[0].values["y"] = csv.map(function(d){return [+d["y"] ]; });
});
I see my data in the DOM and it looks about right but the chart is not shown and instead it says 'No Data Available.' in bold letters where the chart should be.
Neither here at Stack Overflow, nor in the NVD3 documentation on GitHub, nor in the helpful website on NVD3 charts by cmaurer on GitHub could I find more information on how to do this.
Turning your csv into JSON would work, but isn't necessary. You've just got your data formatting methods inside-out.
You seem to be expecting an object containing three arrays, one for each column in your table. The D3 methods create (and the NVD3 methods expect) an array of objects, one for each row.
When you do
scatterdata[0].values["y"] = csv.map(function(d){return [+d["y"] ]; });
you're creating named properties on the values array object (all JavaScript arrays are also objects), but not actually adding elements with array methods, so the length of that array is still zero. NVD3 sees it as an empty array and gives you the "no data" warning.
Instead of using the mapping function as you have it, you can use a single mapping function to do number formatting on the data array, and then set the result directly to be your values array.
Like so:
var scatterdata = [
{
key : "Group1",
values : []//{x:"",y:""}
}
];
d3.csv("literacyScatterCountrynames.csv", function (error, csv) {
if (error) return console.log("there was an error loading the csv: " + error);
console.log("there are " + csv.length + " elements in my csv set");
scatterdata[0].values = csv.map(function(d){
d.x = +d.x;
d.y = +d.y;
return d;
});
console.log("there are " + scatterdata[0].values.length + " elements in my data");
//this should now match the previous log statement
/* draw your graph using scatterdata */
});
The mapping function takes all the elements in the csv array -- each one of which represents a row from your csv file -- and passes them to the function, then takes the returned values from the function and creates a new array out of them. The function replaces the string-version of the x and y properties of the passed in object with their numerical version, and then returns the correctly formatted object. The resulting array of formatted objects becomes the values array directly.
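The difference can be seen without NVD3 or a csv file. The sketch below uses a few hand-written rows in the shape d3.csv would produce (all values as strings); the variable names are only for illustration.

```javascript
// Rows as d3.csv would deliver them: one object per line, string values.
var rows = [
  { country: 'Algeria', y: '91.8', x: '15.7' },
  { country: 'Bahrain', y: '98.2', x: '49.3' },
  { country: 'Jordan', y: '99.1', x: '55.0' }
];

// Inside-out: named properties on an array do not count as elements.
var broken = [];
broken['x'] = rows.map(function (d) { return [+d.x]; });
broken['y'] = rows.map(function (d) { return [+d.y]; });
// broken.length is still 0, so a chart would see no data

// Row-wise: one object per point, with numeric x and y.
var fixed = rows.map(function (d) {
  return { country: d.country, x: +d.x, y: +d.y };
});
// fixed.length is 3 and fixed[0].x is the number 15.7
```

Unlike the code above, this version returns a new object per row rather than modifying the parsed row in place; either approach gives NVD3 an array of point objects.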
Edit
The above method creates a single data series containing all the data points. As discussed in the comments, that can be a problem if you want a category name to show up in the tooltip: the NVD3 tooltip automatically shows the series name as the tooltip value. In the above code, that means every point would have the tooltip "Group1", which is not terribly informative.
To format the data to get useful tooltips, you need each point as its own data series. The easiest way to make that happen, and have the output in the form NVD3 expects, is with d3.nest. Each "nested" sub-array will only have one data point in it, but that's not a problem for NVD3.
The code to create each point as a separate series would be:
var scatterdata;
//Don't need to initialize nested array, d3.nest will create it.
d3.csv("literacyScatterCountrynames.csv", function (error, csv) {
if (error) return console.log("there was an error loading the csv: " + error);
console.log("there are " + csv.length + " elements in my csv set");
var nestFunction = d3.nest().key(function(d){return d.country;});
//create the function that will nest data by country name
scatterdata = nestFunction.entries(
csv.map(function(d){
d.x = +d.x;
d.y = +d.y;
return d;
})
); //pass the formatted data array into the nest function
console.log("there are " + scatterdata.length + " elements in my data");
//this should still match the previous log statement
//but each element in scatterdata will be a nested object containing
//one data point
/* draw your graph using scatterdata */
});
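For reference, the shape that the nesting step produces can be sketched in plain JavaScript. This is not d3.nest itself, only a hypothetical stand-in (nestByKey) that builds the same kind of [{key, values}] array from hand-written sample rows.

```javascript
// Sketch: group rows into [{ key, values: [...] }] entries, in first-seen order.
function nestByKey(rows, keyFn) {
  var index = new Map();
  rows.forEach(function (row) {
    var k = keyFn(row);
    if (!index.has(k)) {
      index.set(k, { key: k, values: [] });
    }
    index.get(k).values.push(row);
  });
  return Array.from(index.values());
}

var points = [
  { country: 'Algeria', y: 91.8, x: 15.7 },
  { country: 'Bahrain', y: 98.2, x: 49.3 },
  { country: 'Jordan', y: 99.1, x: 55.0 }
];
var series = nestByKey(points, function (d) { return d.country; });
// series has 3 entries; series[0] is { key: "Algeria", values: [ one point ] }
```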
You could place the data into a variable, as Mike describes here:
name value
Locke 4
Reyes 8
Ford 15
Jarrah 16
Shephard 23
Kwon 42
is represented this way:
var data = [
{name: "Locke", value: 4},
{name: "Reyes", value: 8},
{name: "Ford", value: 15},
{name: "Jarrah", value: 16},
{name: "Shephard", value: 23},
{name: "Kwon", value: 42}
];
I'm attempting to use the crossfilter javascript library (in conjunction with D3.js) to group and filter json data.
My json data has the following fields: week_date, category, team_name, title, subtitle
I've been able to successfully group on all records to produce YTD totals, using something like this:
var dimension = data.dimension(function(d) { return d[target]; });
var dimensionGrouped = dimension.group().all();
Where target is either category or team_name.
In addition to keeping the YTD totals, I also want to display totals for a given range. For example, a user-selected week (i.e. Oct 1st - 5th).
How do I create a filtered group which returns the totals for a given date range? Using my week_date field.
Well, after some superficial research, including skimming over crossfilter's issues list, I've concluded that crossfilter does not currently support grouping on multiple fields.
I was able to work around this by using a filtered copy of the data instead, as such:
// YTD rows
var crossfilterData = crossfilter(data);
var ytdDimension = crossfilterData.dimension(function(d) { return d[target]; });
var ytdDimensionGrouped = ytdDimension.group().all();
ytdDimensionGrouped.forEach(function (item) {
// ...
});
// Ranged rows
var filteredData = data.filter(function (d) {
return d.week_date >= filter[0] && d.week_date <= filter[1];
});
crossfilterData = crossfilter(filteredData);
var rangeDimension = crossfilterData.dimension(function(d) { return d[target]; });
var rangeDimensionGrouped = rangeDimension.group().all();
rangeDimensionGrouped.forEach(function (item) {
// ...
});
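The idea behind the workaround can be sketched without crossfilter. The helper below (countByField, a hypothetical name) counts rows per key and returns {key, value} pairs sorted by key, similar in shape to what group().all() returns. It assumes week_date is stored as an ISO date string so that plain string comparison orders correctly; the sample rows are invented for illustration.

```javascript
// Sketch: count rows per value of `field`, optionally restricted to an
// inclusive [from, to] range on week_date.
function countByField(rows, field, range) {
  var counts = {};
  rows.forEach(function (d) {
    if (range && (d.week_date < range[0] || d.week_date > range[1])) {
      return; // outside the selected range
    }
    counts[d[field]] = (counts[d[field]] || 0) + 1;
  });
  return Object.keys(counts).sort().map(function (key) {
    return { key: key, value: counts[key] };
  });
}

var records = [
  { week_date: '2012-09-24', category: 'news' },
  { week_date: '2012-10-01', category: 'news' },
  { week_date: '2012-10-03', category: 'sports' },
  { week_date: '2012-10-05', category: 'news' },
  { week_date: '2012-10-08', category: 'sports' }
];
var ytdTotals = countByField(records, 'category');
// [{ key: "news", value: 3 }, { key: "sports", value: 2 }]
var rangeTotals = countByField(records, 'category', ['2012-10-01', '2012-10-05']);
// [{ key: "news", value: 2 }, { key: "sports", value: 1 }]
```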
I have an array of data fetched from the server (ordered by date):
[ {date:"2012-8", name:"Tokyo"}, {date:"2012-3", name:"Beijing"}, {date:"2011-10", name:"New York"} ]
I'd like to:
1. get the name of the first element whose date is in a given year (for example, given 2012, I need Tokyo)
2. get the year of a given name
3. change the date of a name
Which data structure should I use to make this efficient?
Because the array could be large, I would prefer not to loop over the array to find something.
Since it appears that the data is probably already sorted by descending date you could use a binary search on that data to avoid performing a full linear scan.
To handle the unstated requirement that changing the date will then change the ordering, you would need to perform two searches, which as above could be binary searches. Having found the current index, and the index where it's supposed to be, you can use two calls to Array.splice() to move the element from one place in the array to another.
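A minimal sketch of both ideas follows, assuming the array is sorted by descending date and dates use the "YYYY-M" format from the question. All function names are hypothetical, and changeDate takes the element's current index as given rather than searching for it.

```javascript
// "2012-8" -> 2012
function yearOf(date) { return Number(date.split('-')[0]); }
// "2012-8" -> months since year 0, so dates compare as plain numbers
function monthIndex(date) {
  var p = date.split('-');
  return Number(p[0]) * 12 + Number(p[1]);
}

// Binary search for the first (latest) element in the given year.
function firstInYear(array, year) {
  var lo = 0, hi = array.length;
  while (lo < hi) {
    var mid = (lo + hi) >> 1;
    if (yearOf(array[mid].date) > year) lo = mid + 1; // still in a later year
    else hi = mid;
  }
  return lo < array.length && yearOf(array[lo].date) === year ? array[lo] : undefined;
}

// Change an element's date and move it to its new sorted position.
function changeDate(array, index, newDate) {
  var item = array.splice(index, 1)[0]; // first splice: remove
  item.date = newDate;
  var key = monthIndex(newDate), lo = 0, hi = array.length;
  while (lo < hi) {                     // binary search for the new slot
    var mid = (lo + hi) >> 1;
    if (monthIndex(array[mid].date) > key) lo = mid + 1;
    else hi = mid;
  }
  array.splice(lo, 0, item);            // second splice: insert
  return lo;
}

var places = [
  { date: '2012-8', name: 'Tokyo' },
  { date: '2012-3', name: 'Beijing' },
  { date: '2011-10', name: 'New York' }
];
var first2012 = firstInYear(places, 2012).name; // "Tokyo"
var newIndex = changeDate(places, 2, '2012-5'); // New York moves to index 1
```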
To handle searches by name, and assuming that each name is unique, you should create a secondary structure that maps from names to elements:
var map = {};
for (var i = 0, n = array.length; i < n; ++i) {
var name = array[i].name;
map[name] = array[i];
}
You can then use the map object to directly address requirements 2 and 3.
Because the map elements are actually just references to the array elements, changes to those elements will happen in both.
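The shared-reference behaviour can be checked with a short sketch. The names cityList and byName are only illustrative, and the example deliberately leaves the array order untouched.

```javascript
var cityList = [
  { date: '2012-8', name: 'Tokyo' },
  { date: '2012-3', name: 'Beijing' },
  { date: '2011-10', name: 'New York' }
];

// Secondary index: name -> the same object that sits in the array.
var byName = {};
cityList.forEach(function (el) { byName[el.name] = el; });

// Requirement 2: year of a given name, without scanning the array.
var tokyoYear = Number(byName['Tokyo'].date.split('-')[0]); // 2012

// Requirement 3: change the date through the map...
byName['Beijing'].date = '2012-5';
// ...and the array element sees the change, because it is the same object.
var seenInArray = cityList[1].date; // "2012-5"
```

Note that a date change like this can leave the array out of order, which is what the re-positioning step described above is for.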
Assuming you are using unique cities, I would use the city names as a map key:
var cities = {
Tokyo: {
date: "2012-8"
},
"New York": { // keys containing a space must be quoted
date: "2011-10"
}
};
To search by date:
function byDate(date) {
for (var el in cities) {
if (cities.hasOwnProperty(el) && cities[el].date === date)
return el;
}
}
Just for the record: without redesigning your data structure you could use sorting combined with the Array filter or map method:
function sortByDate(a, b) {
// compare numerically by year, then by month ("2012-8" -> ["2012", "8"])
var pa = a.date.split('-'), pb = b.date.split('-');
return (pa[0] - pb[0]) || (pa[1] - pb[1]);
}
var example = [ {date:"2012-8", name:"Tokyo"},
{date:"2012-3", name:"Beijing"},
{date:"2011-10", name:"New York"} ]
.sort(sortByDate);
//first city with year 2012 (and the lowest month of that year)
var b = example.filter(function(a){return +(a.date.substr(0,4)) === 2012})[0];
b.name; //=> Beijing
//year of a given city
var city = 'Tokyo';
var c = example.filter(function(a){return a.name === city;})[0];
+c.date.substr(0,4); //=> 2012
//change the date of 'New York', and re-sort the data
var city = 'New York', date = '2010-10';
example = example.map(
function(a){if (a.name === city) {a.date = date;} return a;}
).sort(sortByDate);