Losing data when rendering rowchart using dc.js - javascript

I lose data when creating a dc.js rowchart.
var ndx = crossfilter(data);
var emailDimemsion = ndx.dimension(function(d) {
return d.email;
});
var emailGroup = emailDimemsion.group().reduce(
function(p, d) {
++p.count;
p.totalWordCount += +d.word_count;
p.studentName = d.student_name;
return p;
},
function(p, d) {
--p.count;
p.totalWordCount -= +d.word_count;
p.studentName = d.student_name;
return p;
},
function() {
return {
count: 0,
totalWordCount: 0,
studentName: ""
};
});
leaderRowChart
.width(600)
.height(300)
.margins({
top: 0,
right: 10,
bottom: 20,
left: 5
})
.dimension(emailDimemsion)
.group(emailGroup)
.elasticX(true)
.valueAccessor(function(d) {
return +d.value.totalWordCount;
})
.rowsCap(15)
.othersGrouper(false)
.label(function(d) {
return (d.value.studentName + ": " + d.value.totalWordCount);
})
.ordering(function(d) {
return -d.value.totalWordCount
})
.xAxis()
.ticks(5);
dc.renderAll();
The fiddle is here, https://jsfiddle.net/santoshsewlal/6vt8t8rn/
My graph comes out like this:
but I'm expecting my results to be
Have I messed up the reduce functions somehow to omit data?
Thanks

Unfortunately there are two levels of ordering for dc.js charts using crossfilter.
First, dc.js pulls the top N using group.top(N), where N is the rowsCap value. Crossfilter sorts these items according to the group.order function.
Then it sorts the items again using the chart.ordering function.
In cases like this, the second sort can mask the fact that the first sort didn't work correctly. Crossfilter does not know how to sort objects unless you tell it which field to look at, so group.top(N) returns an effectively arbitrary set of items instead of the top N.
You can fix your chart by making the crossfilter group's order match the chart's ordering:
emailGroup.order(function(p) {
return p.totalWordCount;
})
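To see how the two sorts can disagree, here is a minimal sketch in plain JavaScript, without crossfilter or dc.js. The topN helper is a hypothetical stand-in for group.top(N), and the sample groups and the mismatched first-stage ordering (by count) are assumptions made for illustration only; in the real chart the first stage is whatever crossfilter makes of the unordered object values.

```javascript
// Hypothetical reduced groups, shaped like crossfilter's {key, value} items.
const groups = [
  { key: "a@x.org", value: { count: 9, totalWordCount: 120 } },
  { key: "b@x.org", value: { count: 1, totalWordCount: 900 } },
  { key: "c@x.org", value: { count: 5, totalWordCount: 300 } },
  { key: "d@x.org", value: { count: 7, totalWordCount: 50 } }
];

// Hypothetical stand-in for group.top(n): the n largest items by an ordering value.
function topN(items, n, orderValue) {
  return items
    .slice()
    .sort((a, b) => orderValue(b.value) - orderValue(a.value))
    .slice(0, n);
}

// Chart-level ordering: descending by totalWordCount.
const byWordCount = (a, b) => b.value.totalWordCount - a.value.totalWordCount;

// Stage 1 picks by the wrong field, stage 2 re-sorts what is left.
const mismatchedKeys = topN(groups, 2, v => v.count)
  .sort(byWordCount)
  .map(g => g.key);

// Both stages agree on totalWordCount.
const matchedKeys = topN(groups, 2, v => v.totalWordCount)
  .sort(byWordCount)
  .map(g => g.key);

console.log(mismatchedKeys, matchedKeys);
```

The mismatched result looks properly sorted, yet the two largest word counts are missing from it, which is the symptom described in the question.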
Fork of your fiddle: https://jsfiddle.net/gordonwoodhull/0kvatrb1/1/
It looks like there is one student with a much higher word count, but otherwise this is consistent with your spreadsheet.
We plan to stop using group.top in the future, because the current behavior is highly inconsistent.
https://github.com/dc-js/dc.js/issues/934
Update: If you're willing to use the unstable latest version, dc.js 2.1.4 and up do not use group.top - the capping is determined by chart.ordering, capMixin.cap, and capMixin.takeFront only.

Related

Highcharts JS- add third variable to tooltip for two series

I've already figured out how to make a chart using highcharts where there are three variables- one on the X axis, one on the Y axis, and one on the tooltip. The way to do this is to add the following to the tooltip:
tooltip: {
formatter () {
// this.point.x is the timestamp in my original chartData array
const pointData = chartData.find(row => row.timestamp === this.point.x)
return pointData.somethingElse
}
}
See this fiddle for the full code:
https://jsfiddle.net/m9e6thwn/
I would simply like to do the same, but with two series instead of one. I can't get it to work. I tried this:
tooltip: {
formatter () {
// this.point.x is the timestamp in my original chartData array
const pointData = chartData1.find(row => row.timestamp === this.point.x)
return pointData.somethingElse
const pointData2 = chartData2.find(row => row.timestamp === this.point.x)
return pointData2.somethingElse
}
}
Here is the fiddle of the above: https://jsfiddle.net/hdeg9x02/ As you can see, the third variable only appears on one of the two series. What am I getting wrong?
There are some issues with the way you are using the formatter now. For one, you cannot have two unconditional return statements in the same function: execution stops at the first return, so the second lookup never runs.
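A small sketch of that point in plain JavaScript, outside Highcharts. The sample rows, the function names, and the seriesIndex parameter are all hypothetical; they only illustrate why the second return is never reached and one way to branch instead.

```javascript
// Hypothetical rows standing in for chartData1 and chartData2.
const chartData1 = [{ timestamp: 1, somethingElse: "first series note" }];
const chartData2 = [{ timestamp: 1, somethingElse: "second series note" }];

// Mirrors the question: everything after the first return is unreachable.
function twoReturns(x) {
  const pointData = chartData1.find(row => row.timestamp === x);
  return pointData.somethingElse;
  const pointData2 = chartData2.find(row => row.timestamp === x); // never runs
  return pointData2.somethingElse; // never runs
}

// One possible fix: pick the data set first, then return once.
function oneReturnPerSeries(x, seriesIndex) {
  const source = seriesIndex === 0 ? chartData1 : chartData2;
  const pointData = source.find(row => row.timestamp === x);
  return pointData ? pointData.somethingElse : "";
}

console.log(twoReturns(1));
console.log(oneReturnPerSeries(1, 1));
```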
Anyway, here are some improvements I suggest you do for your code.
Add the extra information for each point to Highcharts itself. That makes it much easier to access through Highcharts, e.g. in a tooltip. You can set the data like this:
chartData1.map(function(row) {
return {
x: row.timestamp,
y: row.value,
somethingElse: row.somethingElse
}
})
If you do that, then returning the correct tooltip for each series is a simple matter of doing this:
tooltip: {
formatter () {
// this.point.x is the timestamp in my original chartData array
return this.point.somethingElse
}
}
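The following sketch puts the two pieces together without loading Highcharts. The sample rows and the toPoints helper are hypothetical, and the object passed to call() is a hand-built stand-in for the context that Highcharts supplies as `this` when it invokes the formatter.

```javascript
// Hypothetical rows standing in for the question's chartData arrays.
const chartData1 = [{ timestamp: 1, value: 10, somethingElse: "alpha" }];
const chartData2 = [{ timestamp: 1, value: 20, somethingElse: "beta" }];

// Hypothetical helper: copy the extra field onto each point object.
function toPoints(rows) {
  return rows.map(row => ({
    x: row.timestamp,
    y: row.value,
    somethingElse: row.somethingElse
  }));
}

const tooltip = {
  formatter() {
    // The extra field travels with the point, so no lookup by timestamp is needed.
    return this.point.somethingElse;
  }
};

// Simulate the formatter being invoked for one point of each series.
const fromSeries1 = tooltip.formatter.call({ point: toPoints(chartData1)[0] });
const fromSeries2 = tooltip.formatter.call({ point: toPoints(chartData2)[0] });

console.log(fromSeries1, fromSeries2);
```

Because each point carries its own somethingElse, the same formatter works for any number of series.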
Working JSFiddle example: https://jsfiddle.net/ewolden/dq7L64jg/6/
If you wanted more info in the tooltip you could then do:
tooltip: {
formatter () {
// this.point.x is the timestamp in my original chartData array
return this.point.somethingElse + ", time: " + String(this.x) + ", value: " + String(this.y)
}
}
Additionally, you need to ensure that the x-axis values, i.e. your timestamps, are sorted. This is a requirement for Highcharts to function properly. As it is, your example is reporting
Highcharts error #15: www.highcharts.com/errors/15
in console, because chartData2 is in reverse order. It looks okay for this example, but more complicated examples can lead to the chart not looking as you expect it to.
For this example using reverse is easy enough: data: chartData2.reverse().map(function(row) {return {x: row.timestamp, y: row.value, somethingElse: row.somethingElse}})
Working JSFiddle example: https://jsfiddle.net/ewolden/dq7L64jg/7/
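If the rows are not simply in reverse order, an explicit sort by timestamp is a more general option. This is a sketch with hypothetical sample rows; note that reverse() and sort() both modify the array in place, so the example copies it first with slice().

```javascript
// Hypothetical unsorted rows standing in for chartData2.
const chartData2 = [
  { timestamp: 3, value: 7, somethingElse: "c" },
  { timestamp: 1, value: 5, somethingElse: "a" },
  { timestamp: 2, value: 6, somethingElse: "b" }
];

// Copy, sort ascending by timestamp, then build the point objects.
const points = chartData2
  .slice()
  .sort((a, b) => a.timestamp - b.timestamp)
  .map(row => ({ x: row.timestamp, y: row.value, somethingElse: row.somethingElse }));

const xs = points.map(p => p.x);
console.log(xs);
```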

dc.js and crossfilter - not reading json at all

I have the problem of trying to import a geojson (this seems to work) but then passing it onto crossfilter - no data seems to be loaded into the crossfilter object.
I made a jsfiddle here: https://jsfiddle.net/Svarto/pLyhg9Lb/
When I try to console.log(ndx), i.e. the crossfilter, I only get the crossfilter object with nothing loaded (same when I try to console.log any sort of group):
I would have expected some sort of data when writing the crossfilter with loaded data to console. The problem gets evident when I try to draw a histogram with the data - only 2 bars that are not what I expected.
The code is this:
d3.json("https://dl.dropboxusercontent.com/s/7417jc3ld25i0a4/srealitky_geojson.json?dl=1", function(err,json){
var h = 300;
var w = 350;
var ndx = crossfilter();
console.log(json.features);
ndx.add(json.features);
console.log(ndx);
var all = ndx.groupAll();
var yieldDimension = ndx.dimension(function(d){
return d.properties.yields
});
var yieldGroup = yieldDimension.group().reduceCount();
console.log(yieldGroup);
var priceDimension = ndx.dimension(function(d){
return d.properties.price
});
var priceGroup = priceDimension.group().reduceCount();
var barChart = dc.barChart("#yieldChart");
barChart.width(350)
.height(300)
.x(d3.scale.linear().domain([0,30]))
.brushOn(false)
.dimension(yieldDimension)
.group(yieldGroup);
dc.renderAll();
});
As pointed out by @Nilo above, the problem is not reading the JSON data but the setup of your coordinates.
You probably want to bin your data by rounding to, say, a precision of 0.01:
var yieldGroup = yieldDimension.group(function(yields) {
return Math.floor(yields*100)/100;
}).reduceCount();
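To make the effect of the key function concrete, here is a sketch in plain JavaScript that bins a few values the same way and counts them per bin, roughly what reduceCount does per group key. The sample yields are hypothetical and no crossfilter is involved.

```javascript
// Hypothetical yield values; the real ones come from d.properties.yields.
const yields = [5.123, 5.127, 5.131, 7.456];

// Same binning as the group key function: round down to two decimals.
function bin(y) {
  return Math.floor(y * 100) / 100;
}

// Count how many values land in each bin.
const counts = new Map();
yields.forEach(function (y) {
  const key = bin(y);
  counts.set(key, (counts.get(key) || 0) + 1);
});

console.log(Array.from(counts.entries()));
```

Without binning, nearly every distinct floating-point value would become its own group, which is why the unbinned histogram looks wrong.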
Then clean up the margins, add elasticX and elasticY, and specify the xUnits to match, and we get a nice histogram (with a normal-ish distribution):
barChart
.margins({left: 50, top: 5, right: 0, bottom: 20})
.x(d3.scale.linear())
.elasticX(true).elasticY(true)
.xUnits(dc.units.fp.precision(0.01))
Fork of your fiddle.
With 0.001 precision.
See the documentation for coordinateGridMixin for more details.

dc.js Scatter Plot with multiple values for a single key

We have scatter plots working great in our dashboard, but we have been thrown a curve ball. We have a new dataset that provides multiple y values for a single key. We have other datasets where this occurs, but there we flattened the data first; we do not want to flatten this dataset.
The scatter plot should use the uid for the x-axis and each value in the inj field for the y-axis values. The inj field will always be an array of numbers, but each row could have 1 .. n values in the array.
var data = [
{"uid":1, "actions": {"inj":[2,4,10], "img":[10,15,25], "res":[15,19,37]}},
{"uid":2, "actions": {"inj":[5,8,15], "img":[5,8,12], "res":[33,45,57]}},
{"uid":3, "actions": {"inj":[9], "img":[2], "res":[29]}}
];
We can define the dimension and group to plot the first value from the inj field.
var ndx = crossfilter(data);
var spDim = ndx.dimension(function(d){ return [d.uid, d.actions.inj[0]];});
var spGrp = spDim.group();
But are there any suggestions on how to define the scatter plot to handle multiple y values for each x value?
Here is a jsfiddle example showing how I can display the first element or the last element. But how can I show all elements of the array?
--- Additional Information ---
Above is just a simple example to demonstrate a requirement. We have developed a dynamic data explorer that is fully data driven. Currently the datasets being used are protected. We will be adding a public dataset soon to show off the various features. Below are a couple of images.
I have hidden some legends. For the Scatter Plot we added a vertical only brush that is enabled when pressing the "Selection" button. The notes section is populated on scatter plot chart initialization with the overall dataset statistics. Then when any filter is performed the notes section is updated with statistics of just the filtered data.
The field selection tree displays the metadata for the selected dataset. The user can decide which fields to show as charts and in datatables (not shown). Currently for the dataset shown we only have 89 available fields, but for another dataset there are 530 fields the user can mix and match.
I have not shown the various tabs below the charts DIV that hold several datatables with the actual data.
The metadata has several fields that are defined to help us dynamically build the explorer dashboard.
I warned you the code would not be pretty! You will probably be happier if you can flatten your data, but it's possible to make this work.
We can first aggregate all the injs within each uid, by filtering by the rows in the data and aggregating by uid. In the reduction we count the instances of each inj value:
var uidDimension = ndx.dimension(function (d) {
return +d.uid;
}),
uidGroup = uidDimension.group().reduce(
function(p, v) { // add
v.actions.inj.forEach(function(i) {
p.inj[i] = (p.inj[i] || 0) + 1;
});
return p;
},
function(p, v) { // remove
v.actions.inj.forEach(function(i) {
p.inj[i] = p.inj[i] - 1;
if(!p.inj[i])
delete p.inj[i];
});
return p;
},
function() { // init
return {inj: {}};
}
);
Here we assume that there might be rows of data with the same uid and different inj arrays. This is more general than needed for your sample data: you could probably do something simpler if there is indeed only one row of data for each uid.
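As a sanity check on the reducers, the sketch below calls them by hand, without crossfilter. The reducer bodies are copied from the code above and given names (reduceAdd, reduceRemove, reduceInit) purely for this illustration; the two sample rows sharing uid 1 are hypothetical.

```javascript
// Reducers copied from the answer, named so they can be called directly.
function reduceAdd(p, v) {
  v.actions.inj.forEach(function (i) {
    p.inj[i] = (p.inj[i] || 0) + 1;
  });
  return p;
}

function reduceRemove(p, v) {
  v.actions.inj.forEach(function (i) {
    p.inj[i] = p.inj[i] - 1;
    if (!p.inj[i]) delete p.inj[i]; // drop values whose count reached zero
  });
  return p;
}

function reduceInit() {
  return { inj: {} };
}

// Two hypothetical rows that share the same uid.
const rowA = { uid: 1, actions: { inj: [2, 4] } };
const rowB = { uid: 1, actions: { inj: [4, 10] } };

let p = reduceInit();
p = reduceAdd(p, rowA);
p = reduceAdd(p, rowB);
const afterAdds = JSON.stringify(p.inj);

// Simulate rowB being filtered out.
p = reduceRemove(p, rowB);
const afterRemove = JSON.stringify(p.inj);

console.log(afterAdds, afterRemove);
```

After both adds the value 4 has a count of 2; removing rowB brings it back to 1 and deletes the entry for 10 entirely.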
To flatten out the resulting group, we can use a "fake group" to create one group-like {key, value} data item for each [uid, inj] pair:
function flatten_group(group, field) {
return {
all: function() {
var ret = [];
group.all().forEach(function(kv) {
Object.keys(kv.value[field]).forEach(function(i) {
ret.push({
key: [kv.key, +i],
value: kv.value[field][i]
});
})
});
return ret;
}
}
}
var uidinjGroup = flatten_group(uidGroup, 'inj');
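To show what the fake group produces, this sketch runs flatten_group against a hand-written stub instead of a real crossfilter group. The stub and its contents are hypothetical; it only mimics the all() shape of the reduced group.

```javascript
// Same helper as above, repeated so the example is self-contained.
function flatten_group(group, field) {
  return {
    all: function () {
      var ret = [];
      group.all().forEach(function (kv) {
        Object.keys(kv.value[field]).forEach(function (i) {
          ret.push({
            key: [kv.key, +i],
            value: kv.value[field][i]
          });
        });
      });
      return ret;
    }
  };
}

// Hypothetical stub with the same all() shape as the reduced uidGroup.
const stubGroup = {
  all: function () {
    return [
      { key: 1, value: { inj: { 2: 1, 4: 1, 10: 1 } } },
      { key: 3, value: { inj: { 9: 1 } } }
    ];
  }
};

const flat = flatten_group(stubGroup, "inj").all();
console.log(JSON.stringify(flat));
```

Each [uid, inj] pair becomes its own item, which is the two-element key shape the scatter plot in the question already uses.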
Fork of your fiddle
In the fiddle, I've added a bar chart to demonstrate filtering by UID. Filtering on the bar chart works, but filtering on the scatter plot does not. If you need to filter on the scatter plot, that could probably be fixed, but it could only filter on the uid dimension because your data is too coarse to allow filtering by inj.

dc.js sort ordinal line chart by y axis/value

I have a dc.js ordinal chart whose x-axis consists of things like 'Cosmetics' and the y-axis is the number of sales. I want to sort the chart by sales decreasing, however when I use .ordering(function(d){return -d.value.ty}) the path of the line chart is still ordered by the x-axis.
var departmentChart = dc.compositeChart('#mystore_department_chart'),
ndx = crossfilter(response.data),
dimension = ndx.dimension(function(d) {return d.name}),
group = dimension.group().reduce(function(p, v) {
p.ty += v.tyvalue;
p.ly += v.lyvalue;
return p;
}, function(p, v) {
p.ty -= v.tyvalue;
p.ly -= v.lyvalue;
return p;
}, function() {
return {
ty: 0,
ly: 0
}
});
departmentChart
.ordering(function(d){return -d.value.ty})
//dimensions
//.width(768)
.height(250)
.margins({top: 10, right: 50, bottom: 25, left: 50})
//x-axis
.x(d3.scale.ordinal())
.xUnits(dc.units.ordinal)
.xAxisLabel('Department')
//left y-axis
.yAxisLabel('Sales')
.elasticY(true)
.renderHorizontalGridLines(true)
//composition
.dimension(dimension)
.group(group)
.compose([
dc.barChart(departmentChart)
.centerBar(true)
.gap(5)
.dimension(dimension)
.group(group, 'This Year')
.valueAccessor(function(d) {return d.value.ty}),
dc.lineChart(departmentChart)
.renderArea(false)
.renderDataPoints(true)
.dimension(dimension)
.group(group, 'Last Year')
.valueAccessor(function(d) {return d.value.ly})
])
.brushOn(false)
.render();
This is the hack I ended up doing. Be aware that it could have performance issues on large data sets as all() is faster than top(Infinity). For some reason I couldn't get Gordon's answer to work, but in theory it should.
On my group I specified an order function
group.order(function(p) {
return p.myfield;
});
Then because all() doesn't use the order function I overrode the default all function
group.all = function() {
return group.top(Infinity);
}
And then on my chart I had to specify an order function
chart.ordering(function(d){
return -d.value.myfield
}); // order by myfield descending
No doubt this is a bug.
As a workaround, you could sort the data yourself instead of using the ordering function, as described in the FAQ.
I filed a bug report: https://github.com/dc-js/dc.js/issues/598
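The "sort the data yourself" workaround could look something like the sketch below: a wrapper object whose all() returns the group's items already sorted. The sort_group name, the stub group, and the sample departments are all hypothetical, and whether the chart then draws in this order may also depend on how its ordinal x domain is built, which this sketch does not cover.

```javascript
// Hypothetical wrapper: exposes all() sorted by a caller-supplied ordering value.
function sort_group(group, order) {
  return {
    all: function () {
      // Copy before sorting so the underlying group's array is not reordered.
      return group.all().slice().sort(function (a, b) {
        return order(a) - order(b);
      });
    }
  };
}

// Hypothetical stub with the same all() shape as the reduced group in the question.
const stubGroup = {
  all: function () {
    return [
      { key: "Cosmetics", value: { ty: 10, ly: 8 } },
      { key: "Toys", value: { ty: 30, ly: 25 } },
      { key: "Shoes", value: { ty: 20, ly: 22 } }
    ];
  }
};

// Descending by this year's sales, matching the ordering in the question.
const sortedKeys = sort_group(stubGroup, function (d) { return -d.value.ty; })
  .all()
  .map(function (d) { return d.key; });

console.log(sortedKeys);
```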
It does appear to be a bug. I too couldn't make it work as intended.
I implemented a workaround; I added an example in a Plunker, link is (also) in the dc.js issue at github.
It is based upon a recent snapshot of dc.js 2.0.0-dev.
(So for now I guess this could be considered an answer)
http://embed.plnkr.co/VItIQ4ZcW9abfzI13z64/

Logarithmic charts in crossfilter

I'm charting various lists of data into histograms using crossfilter. Some of the item's values are a lot higher than others and I'm keen to plot the histograms in a logarithmic fashion.
Is there a way I could convert all the summed item counts into logarithmic values after reduceSum has been called? I'm keen to add in something like
Math.log(d.count) / Math.LN10;
into the following:
var crossfiltered_data = crossfilter(data),
all = crossfiltered_data.groupAll(),
item_labels = crossfiltered_data.dimension(function(d) { return d.name; }),
items_group = item_labels.group().reduceSum(function(d) {
return d.count
}),
charts = [
barChart(on_range_change)
.dimension(item_labels)
.group(items_group)
.x(d3.scale.linear()
.domain([domain_start, domain_end])
.rangeRound([0, 240])),
],
chart = d3.selectAll(target)
.data(charts)
.each(function(chart) {
chart .on("brush", renderAll)
.on("brushend", renderAll);
}),
render = function(method) {
d3.select(this).call(method);
},
renderAll = function(event) {
chart.each(render);
};
Why not just use a log scale for output? See d3.scale.log. Assuming you're using the barChart on the crossfilter page, you'd just change this line to use a log scale instead of a linear one.
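To illustrate what a log scale does to the values, here is a hand-written stand-in in plain JavaScript. It is not d3's implementation; the logScale function and the sample domain and range are assumptions for illustration. Note that a log scale cannot include zero in its domain.

```javascript
// Hand-written stand-in for a log scale: maps [d0, d1] onto [r0, r1] logarithmically.
function logScale(domain, range) {
  const logMin = Math.log(domain[0]);
  const logSpan = Math.log(domain[1]) - logMin;
  return function (x) {
    const t = (Math.log(x) - logMin) / logSpan; // 0 at d0, 1 at d1
    return range[0] + t * (range[1] - range[0]);
  };
}

// Hypothetical domain of counts mapped onto a 240-pixel range.
const scale = logScale([1, 1000], [0, 240]);
const positions = [1, 10, 100, 1000].map(scale);

console.log(positions);
```

Each factor of ten in the input covers the same distance in the output, so a few very large counts no longer flatten the rest of the histogram.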
