EDIT: OK, got it: it cannot be done with a find. I worked around this in another way that is pretty unrelated to this question: I added a boolean field "last_inserted" and used Meteor hooks to make sure that only the last inserted element has that field set. Thank you all anyway.
I'm in a Meteor project and have a collection like this:
group_id, date, attribute1, attributeN
1, 2015-11-26 09:40:23.000Z, "foo", "bar"
1, 2015-11-23 14:23:53.000Z, "foo", "bar"
2, 2015-11-23 14:24:01.000Z, "foo", "bar"
2, 2015-11-23 14:25:44.000Z, "foo", "bar"
I have to filter this and get, for each group_id, the element with the latest date, giving a result like this:
group_id, date, attribute1, attributeN
1, 2015-11-26 09:40:23.000Z, "foo", "bar"
2, 2015-11-23 14:25:44.000Z, "foo", "bar"
I read in a related question that it can be done with the aggregate operator:
MongoDB - get documents with max attribute per group in a collection
but Meteor still doesn't implement it, so I have to do it with a find.
I'm no MongoDB expert and I have been reading the documentation for a long time, but I'm starting to fear that I can't do this without fetching all the elements and manually filtering them in JavaScript (I really don't like this kind of solution).
I read that there are some Meteor packages that add the aggregate command (like this), but I don't know how reliable they are, and this project needs to be rock-solid.
How can I do that?
This is a proposal in plain Javascript with a temporary object for the references to the result array.
var data = [{ group_id: 1, date: '2015-11-26 09:40:23.000Z', attribute1: 'foo', attributeN: 'bar' }, { group_id: 1, date: '2015-11-23 14:23:53.000Z', attribute1: 'foo', attributeN: 'bar' }, { group_id: 2, date: '2015-11-23 14:24:01.000Z', attribute1: 'foo', attributeN: 'bar' }, { group_id: 2, date: '2015-11-23 14:25:44.000Z', attribute1: 'foo', attributeN: 'bar' }],
result = function (data) {
var r = [], o = {};
data.forEach(function (a) {
if (!(a.group_id in o)) {
o[a.group_id] = r.push(a) - 1;
return;
}
if (a.date > r[o[a.group_id]].date) {
r[o[a.group_id]] = a;
}
});
return r;
}(data);
document.write('<pre>' + JSON.stringify(result, 0, 4) + '</pre>');
I've rewritten this into a simplified form to demonstrate. I have an array of pickers, each with an array of time entries. I'm using reduce to summarise time entries by type for each picker, and then a second reduce to show global entries across both pickers.
The first reduce per picker works as expected.
The second reduce on global time entries works as expected but somehow changes the entries for the first picker (Sam).
Sam & John pick the same amount.
Apples 2h, Peaches 2h, Lemons 1h
Is there a better way to write this? Is there a concept I've failed to understand?
function testBug() {
// Reducer Function
function entryReducer(summary, entry) {
// find an index if the types of fruit are the same
let index = summary.findIndex((item) => {
return item.type.id === entry.type.id;
});
if (index === -1) {
summary.push(entry);
} else {
summary[index].hours = summary[index].hours + entry.hours;
}
return summary;
}
let pickers = [
{
id: 1,
identifier: "Sam Smith",
timeEntries: [
{
type: {
id: 1,
name: "Apples",
},
hours: 1,
},
{
type: {
id: 2,
name: "Peaches",
},
hours: 1,
},
{
type: {
id: 3,
name: "Lemons",
},
hours: 1,
},
{
type: {
id: 1,
name: "Apples",
},
hours: 1,
},
{
type: {
id: 2,
name: "Peaches",
},
hours: 1,
},
],
},
{
id: 2,
identifier: "John Snow",
timeEntries: [
{
type: {
id: 1,
name: "Apples",
},
hours: 1,
},
{
type: {
id: 2,
name: "Peaches",
},
hours: 1,
},
{
type: {
id: 3,
name: "Lemons",
},
hours: 1,
},
{
type: {
id: 1,
name: "Apples",
},
hours: 1,
},
{
type: {
id: 2,
name: "Peaches",
},
hours: 1,
},
],
},
];
let pickersSummary = [];
let timeEntriesSummary = [];
for (const picker of pickers) {
if (picker.timeEntries.length > 0) {
// reduce time entries into an array of similar types
picker.timeEntries = picker.timeEntries.reduce(entryReducer, []);
// push to pickers summary arr
pickersSummary.push(picker);
// push time entries to a summary array for later reduce
picker.timeEntries.map((entry) => timeEntriesSummary.push(entry));
}
}
// Reduce time entries for all pickers
// Sam & John pick the same amount
// Apples 2h
// Peaches 2h
// Lemons 1h
// **** If I run this Sam's entries are overwritten with the global time entries ***
timeEntriesSummary = timeEntriesSummary.reduce(entryReducer, []);
const results = { pickersSummary, timeEntriesSummary };
console.log(results);
}
testBug();
module.exports = testBug;
Even though with each reducer you pass a new array [], the actual objects contained by these arrays could be shared. This means when you edit one of the objects in array "A", the objects could also change in array "B".
You know how some languages let you pass variables by value or by reference and how this fundamentally changes how values are handled? JavaScript technically uses call-by-sharing. I suggest reading this other answer: Is JavaScript a pass-by-reference or pass-by-value language?
once an element in an array is pushed into a different array it is separate in memory?
No, it isn't. In JavaScript you will always know when you have made an individual copy of an object (or at least wanted to), because that takes some effort; see What is the most efficient way to deep clone an object in JavaScript? or How do I correctly clone a JavaScript object?
So, just like a=b, push(a) stores a reference to the original object in the array. See this example, where a single object is accessible via two variables (x and y) and via both elements of array z. Modifying it as z[1] therefore affects all the others:
let x={a:5};
let y=x;
let z=[x];
z.push(y);
z[1].a=4;
console.log(x);
console.log(y);
console.log(z[0]);
console.log(z[1]);
As your objects are plain value-like ones and do not contain anything that JSON does not support (like member functions), JSON-based cloning can work on them:
function testBug() {
// Reducer Function
function entryReducer(summary, entry) {
// find an index if the types of fruit are the same
let index = summary.findIndex((item) => {
return item.type.id === entry.type.id;
});
if (index === -1) {
//summary.push(entry);
summary.push(JSON.parse(JSON.stringify(entry))); // <--- the only change
} else {
summary[index].hours = summary[index].hours + entry.hours;
}
return summary;
}
let pickers = [
{id: 1, identifier: "Sam Smith", timeEntries: [
{type: {id: 1, name: "Apples",}, hours: 1,},
{type: {id: 2, name: "Peaches",}, hours: 1,},
{type: {id: 3, name: "Lemons",}, hours: 1,},
{type: {id: 1, name: "Apples",}, hours: 1,},
{type: {id: 2, name: "Peaches",}, hours: 1,},],},
{id: 2, identifier: "John Snow", timeEntries: [
{type: {id: 1, name: "Apples",}, hours: 1,},
{type: {id: 2, name: "Peaches",}, hours: 1,},
{type: {id: 3, name: "Lemons",}, hours: 1,},
{type: {id: 1, name: "Apples",}, hours: 1,},
{type: {id: 2, name: "Peaches",}, hours: 1,},],},];
let pickersSummary = [];
let timeEntriesSummary = [];
for (const picker of pickers) {
if (picker.timeEntries.length > 0) {
// reduce time entries into an array of similar types
picker.timeEntries = picker.timeEntries.reduce(entryReducer, []);
// push to pickers summary arr
pickersSummary.push(picker);
// push time entries to a summary array for later reduce
picker.timeEntries.map((entry) => timeEntriesSummary.push(entry));
}
}
// Reduce time entries for all pickers
// Sam & John pick the same amount
// Apples 2h
// Peaches 2h
// Lemons 1h
// **** If I run this Sam's entries are overwritten with the global time entries ***
timeEntriesSummary = timeEntriesSummary.reduce(entryReducer, []);
const results = { pickersSummary, timeEntriesSummary };
console.log(results);
}
testBug();
Now it probably displays what you expected, but in the background it still alters the pickers themselves: the picker.timeEntries = ... line still runs, after all. It may be worth mentioning that const something = xy; means that you cannot write something = yz; later; something will stay bound to the given entity. But if that entity is an object, its internals can still be changed. That is what happens with picker.timeEntries above (while writing picker = 123; would fail).
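If altering the original pickers is not wanted either, one possible approach is to build every summary out of freshly created objects. The following is only a sketch under that assumption: summarize, sampleEntries and the shortened data set are names invented for this illustration, not part of the original code.

```javascript
// Sum hours per fruit type into brand-new objects.
// The entries passed in are only read, never modified or reused.
function summarize(entries) {
  const byType = new Map();
  for (const entry of entries) {
    const existing = byType.get(entry.type.id);
    if (existing) {
      existing.hours += entry.hours;
    } else {
      // Copy the fields instead of storing a reference to `entry`.
      byType.set(entry.type.id, {
        type: Object.assign({}, entry.type),
        hours: entry.hours,
      });
    }
  }
  return Array.from(byType.values());
}

// Hypothetical helper: returns fresh sample entries on each call.
function sampleEntries() {
  return [
    { type: { id: 1, name: "Apples" }, hours: 1 },
    { type: { id: 2, name: "Peaches" }, hours: 1 },
    { type: { id: 3, name: "Lemons" }, hours: 1 },
    { type: { id: 1, name: "Apples" }, hours: 1 },
    { type: { id: 2, name: "Peaches" }, hours: 1 },
  ];
}

const pickers = [
  { id: 1, identifier: "Sam Smith", timeEntries: sampleEntries() },
  { id: 2, identifier: "John Snow", timeEntries: sampleEntries() },
];

// Each summary picker is a new object; the originals keep their raw entries.
const pickersSummary = pickers.map((picker) =>
  Object.assign({}, picker, { timeEntries: summarize(picker.timeEntries) })
);

// The global summary is built from the per-picker summaries without touching them.
const allEntries = pickersSummary.reduce(
  (all, picker) => all.concat(picker.timeEntries),
  []
);
const timeEntriesSummary = summarize(allEntries);

console.log(JSON.stringify({ pickersSummary, timeEntriesSummary }, null, 2));
```

Because summarize never stores a reference to an incoming entry, the second pass over the combined entries cannot change what the first pass produced.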
There is an array of data that needs to be converted to a tree:
const array = [{
id: 5,
name: 'vueJS',
parentId: [3]
}, {
id: 6,
name: 'reactJS',
parentId: [3]
}, {
id: 3,
name: 'js',
parentId: [1]
}, {
id: 1,
name: 'development',
parentId: null
}, {
id: 4,
name: 'oracle',
parentId: [1,2]
}, {
id: 2,
name: 'data-analysis',
parentId: null
}];
Now it works using this function:
function arrayToTree(array, parent) {
var unflattenArray = [];
array.forEach(function(item) {
if(item.parentId === parent) {
var children = arrayToTree(array, item.id);
if(children.length) {
item.children = children
}
unflattenArray.push(item)
}
});
return unflattenArray;
}
console.log(arrayToTree(array, null));
I have two problems with this function:
The value of "parentId" should be an array of ids, for example:
"parentId": [2, 3]
How can I pass only one argument, "array", to the function?
https://codepen.io/pershay/pen/PgVJOO?editors=0010
I find this question confusing. It sounds like what you are really saying is the array represents the “definition of node types in the tree” and not the actual instances of those nodes that will be in the tree.
So your problem is that you need to copy the “definitions” from the array to new “instance” nodes in your tree. This would let “Oracle” show twice, as you’d create a new “oracle instance” node for each parent in its parent array. It wouldn’t technically need to be a deep copy depending on your use, so you could build a proof of concept with Object.assign, but each instance would point to the same parents array, and that may or may not cause problems for that or any future reference values you add to the definition.
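A rough proof of concept of that idea might look like the following. It assumes that parentId is either null (a root) or an array of parent ids, as in the question's data; the inner names childrenOf and build are invented for this sketch, and the copy is shallow, so every instance still shares the same parentId array.

```javascript
const array = [
  { id: 5, name: 'vueJS', parentId: [3] },
  { id: 6, name: 'reactJS', parentId: [3] },
  { id: 3, name: 'js', parentId: [1] },
  { id: 1, name: 'development', parentId: null },
  { id: 4, name: 'oracle', parentId: [1, 2] },
  { id: 2, name: 'data-analysis', parentId: null }
];

// Takes only the array; returns the root nodes with nested `children`.
function arrayToTree(definitions) {
  // Index the definitions by each of their parents; roots go under the key null.
  const childrenOf = new Map();
  definitions.forEach(function (def) {
    const parents = def.parentId === null ? [null] : def.parentId;
    parents.forEach(function (parentId) {
      if (!childrenOf.has(parentId)) {
        childrenOf.set(parentId, []);
      }
      childrenOf.get(parentId).push(def);
    });
  });

  // Create a new instance node every time a definition is placed under a parent.
  function build(parentId) {
    return (childrenOf.get(parentId) || []).map(function (def) {
      const node = Object.assign({}, def); // shallow copy of the definition
      const children = build(def.id);
      if (children.length) {
        node.children = children;
      }
      return node;
    });
  }

  return build(null);
}

const tree = arrayToTree(array);
console.log(JSON.stringify(tree, null, 2));
```

With this data, 'oracle' appears once under 'development' and once under 'data-analysis', as two separate objects. The recursion assumes the parent links contain no cycles.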
Finally, depending on the size of the tree and what you are really trying to do, you might want to convert to a tree represented by nodes/edges instead of parent/children. For really large datasets recursion can sometimes cause you problems.
Sorry I’m on my phone so some things are hard to see on the codepen.
So essentially I have JSON in the following format:
JSON= [
{username: foo, score: 12343},
{username: bar, score: 9432},
{username: foo-bar, score: 402, ...
]
I know that the JSON I am getting back is already ranked from highest to lowest, based on the scores.
How can I make an array that automatically inserts the rank position? For instance, I would like to see output like this:
[1, username: foo, score: 12343],
[2, username: bar, score: 9432],
[3, username: foo-bar, score: 402],...
JavaScript has you covered already. Arrays are always indexed, though the indexes start at zero. If you want the best, just use
console.log(JSON[0]);
and if you insist on 1 being 1, try this
function getByRank(index) {
return JSON[index - 1];
}
so you will get the best with
console.log(getByRank(1)); // {username: "foo", score: 12343}
I assume the JSON in your actual code is valid; note the quotes around foo in the output above.
Your syntax seems off, but assuming that the JSON is:
JSON = [
{username: 'foo', score: 12343},
{username: 'bar', score: 9432},
{username: 'foo-bar', score: 402}
]
Since arrays are 0-indexed and you have access to the current index in a .map callback, you can do something along the following lines:
JSON = JSON.map(function (record, index) {
record.rank = index + 1;
return record;
})
This should give you the following output:
[
{ rank: 1, username: 'foo', score: 12343 },
{ rank: 2, username: 'bar', score: 9432},
{ rank: 3, username: 'foo-bar', score: 402}
]
Hope this helps.
Strictly from the point of view of your question, you already got the answer. I just want to enumerate some ideas, maybe they will be useful to some of you in the future:
First of all, JSON (JavaScript Object Notation) is an open-standard format that uses human-readable text to serialize objects, arrays, numbers, strings, booleans, and null. Its syntax, as the name states, is derived from JavaScript, but JSON is actually a text, a string.
A JSON string is (almost always) syntactically correct JavaScript code.
Make sure you don't use "JSON" as the name of a variable, because JSON is actually an existing built-in object in JavaScript, and it is used to encode to JSON format (stringify) and decode from JSON format (parse).
Just try JSON.stringify({a: 2, b: 3}); and JSON.parse('{"a":2,"b":3}'); in a JavaScript console.
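As a small sketch of these points, assume the scores arrive as JSON text. The variable names text, records and ranked are made up for this illustration. The built-in JSON object is left untouched, and the rank is added to copies rather than to the parsed records.

```javascript
// JSON is text: property names and string values are double-quoted.
const text = '[{"username":"foo","score":12343},' +
  '{"username":"bar","score":9432},' +
  '{"username":"foo-bar","score":402}]';

// Decode the text into ordinary JavaScript objects.
const records = JSON.parse(text);

// The data is already sorted by score, so rank is simply index + 1.
const ranked = records.map(function (record, index) {
  return Object.assign({ rank: index + 1 }, record);
});

// Encode back to JSON text when it has to be sent or stored.
console.log(JSON.stringify(ranked));
```

Storing the parsed data in a variable such as records, instead of one called JSON, keeps JSON.parse and JSON.stringify available for the rest of the script.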
Assume I have a viewModel like below.
var data = {
a: { a1: "a1", a2: "a2" },
b: "b"
};
I would like to ignore a.a1 and b. So my expected JSON is
{"a":{"a2":"a2"}}
However, on doing this
var result = ko.mapping.toJSON(data, { ignore: ["a.a1", "b"] })
I am getting this result:
{"a":{"a1":"a1","a2":"a2"}}
Knockout mapping is not ignoring a.a1. Is this a bug in the plugin? It correctly ignored 'b' but why not 'a.a1'?
The names found in the ignore array should be the name of the property, regardless of what level it is in the object. You have to use:
{ ignore: [ "a1", "b" ] }
I had a similar ignore list where some "ids" had to be suppressed and others left as is. I wanted to expand on the answer so that people using fromJS can see that specific ignores do work:
var data1 = {
invoice: { id: 'a1', name: 'a2', type: 'a3'},
shipping: "b1",
id:"c1"
};
var resultvm = ko.mapping.fromJS(data1, { 'ignore': ["invoice.id", "shipping"] });
ko.applyBindings(resultvm);
Will give you an output as below. Notice that only the id for invoice has been ignored.
{"invoice":{"name":"a2","type":"a3"},"id":"c1"}
But toJSON gives
Code:
var result = ko.mapping.toJSON(data1, { ignore: ["invoice.id",
"shipping"] });
Result:
{"invoice":{"id":"a1","name":"a2","type":"a3"},"id":"c1"}
Here is my jsFiddle: https://jsfiddle.net/3b17r0ys/8/
I have 100 documents in my MongoDB collection, and each of them may be a duplicate of other document(s) under different conditions, such as firstName & lastName, email, or mobile phone.
I am trying to mapReduce these 100 documents into key-value pairs, like grouping.
Everything works fine until I have the 101st duplicate record in the DB.
The mapReduce output for the other documents that are duplicates of the 101st record is corrupted.
For example:
I am working on firstName & lastName now.
When the DB contains 100 documents, I can have the result containing
{
_id: {
firstName: "foo",
lastName: "bar",
},
value: {
count: 20,
duplicate: [{
id: ObjectId("/*an object id*/"),
fullName: "foo bar",
DOB: ISODate("2000-01-01T00:00:00.000Z")
},{
id: ObjectId("/*another object id*/"),
fullName: "foo bar",
DOB: ISODate("2000-01-02T00:00:00.000Z")
},...]
},
}
It is exactly what I want, but...
when the DB contains more than 100 possibly duplicated documents, the result becomes like this.
Let's say the 101st document is
{
firstName: "foo",
lastName: "bar",
email: "foo#bar.com",
mobile: "019894793"
}
With the DB containing 101 documents:
{
_id: {
firstName: "foo",
lastName: "bar",
},
value: {
count: 21,
duplicate: [{
id: undefined,
fullName: undefined,
DOB: undefined
},{
id: ObjectId("/*another object id*/"),
fullName: "foo bar",
DOB: ISODate("2000-01-02T00:00:00.000Z")
}]
},
}
With the DB containing 102 documents:
{
_id: {
firstName: "foo",
lastName: "bar",
},
value: {
count: 22,
duplicate: [{
id: undefined,
fullName: undefined,
DOB: undefined
},{
id: undefined,
fullName: undefined,
DOB: undefined
}]
},
}
I found another topic on Stack Overflow with a similar issue, but the answer does not work for me:
MapReduce results seem limited to 100?
Any ideas?
Edit:
Original source code:
var map = function () {
var value = {
count: 1,
userId: this._id
};
emit({lastName: this.lastName, firstName: this.firstName}, value);
};
var reduce = function (key, values) {
var reducedObj = {
count: 0,
userIds: []
};
values.forEach(function (value) {
reducedObj.count += value.count;
reducedObj.userIds.push(value.userId);
});
return reducedObj;
};
Source code now:
var map = function () {
var value = {
count: 1,
users: [this]
};
emit({lastName: this.lastName, firstName: this.firstName}, value);
};
var reduce = function (key, values) {
var reducedObj = {
count: 0,
users: []
};
values.forEach(function (value) {
reducedObj.count += value.count;
reducedObj.users = reducedObj.users.concat(values.users); // or using the forEach method
// value.users.forEach(function (user) {
// reducedObj.users.push(user);
// });
});
return reducedObj;
};
I don't understand why it would fail, as I was also pushing a value (userId) to reducedObj.userIds.
Is there a problem with the value that I emit in the map function?
Explaining the problem
This is a common mapReduce trap, but clearly part of the problem you have here is that the questions you are finding don't have answers that explain it clearly or even properly. So an answer is justified here.
The point that is often missed, or at least misunderstood, is this passage in the documentation:
MongoDB can invoke the reduce function more than once for the same key. In this case, the previous output from the reduce function for that key will become one of the input values to the next reduce function invocation for that key.
And adding to that just a little later down the page:
the type of the return object must be identical to the type of the value emitted by the map function.
What this means in the context of your question is that at a certain point there are "too many" values for the same key for the reduce stage to process in one single pass, as it can for a smaller number of documents. By design the reduce method is called multiple times, often taking the "output" from data that is already reduced as part of its "input" for yet another pass.
This is how mapReduce is designed to handle very large datasets: by processing everything in "chunks" until it finally "reduces" down to a single grouped result per key. This is why the second statement is important: what comes out of both emit and the reduce function needs to be structured exactly the same way in order for the reduce code to handle it correctly.
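The effect can be reproduced outside MongoDB with a small plain JavaScript simulation. This is only an illustration under stated assumptions: simulateMapReduce and its chunkSize parameter are invented stand-ins for the server reducing in batches, not MongoDB's actual algorithm, and everything is treated as a single key. The "broken" pair mirrors the original code in the question, where map emits userId but reduce returns userIds.

```javascript
// Hypothetical stand-in for chunked reducing: reduce each chunk of emitted
// values, then reduce the partial results again (the "re-reduce" pass).
function simulateMapReduce(docs, mapFn, reduceFn, chunkSize) {
  const emitted = docs.map(function (doc) { return mapFn(doc); });
  const partials = [];
  for (let i = 0; i < emitted.length; i += chunkSize) {
    partials.push(reduceFn(null, emitted.slice(i, i + chunkSize)));
  }
  return partials.length === 1 ? partials[0] : reduceFn(null, partials);
}

const docs = [{ _id: 'a' }, { _id: 'b' }, { _id: 'c' }, { _id: 'd' }];

// Mismatched shapes: emitted values have `userId`, reduced values have `userIds`.
function brokenMap(doc) { return { count: 1, userId: doc._id }; }
function brokenReduce(key, values) {
  const out = { count: 0, userIds: [] };
  values.forEach(function (value) {
    out.count += value.count;
    out.userIds.push(value.userId); // undefined when `value` is a reduced result
  });
  return out;
}

// Matching shapes: both emitted and reduced values carry a `userIds` array.
function fixedMap(doc) { return { count: 1, userIds: [doc._id] }; }
function fixedReduce(key, values) {
  const out = { count: 0, userIds: [] };
  values.forEach(function (value) {
    out.count += value.count;
    out.userIds = out.userIds.concat(value.userIds);
  });
  return out;
}

const broken = simulateMapReduce(docs, brokenMap, brokenReduce, 2);
const fixed = simulateMapReduce(docs, fixedMap, fixedReduce, 2);
console.log(broken); // count is still right, but the ids are lost
console.log(fixed);  // count and ids both survive the second pass
```

In the broken variant the count stays correct because count has the same meaning in both shapes, while the ids collected in the first pass are replaced by undefined entries, which resembles the symptom in the question.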
Solving the problem
You correct this by fixing both how you emit the data in the map function and how you process and return it in the reduce function:
db.collection.mapReduce(
function() {
emit(
{ "firstName": this.firstName, "lastName": this.lastName },
{ "count": 1, "duplicate": [this] } // Note [this]
)
},
function(key,values) {
var reduced = { "count": 0, "duplicate": [] };
values.forEach(function(value) {
reduced.count += value.count;
value.duplicate.forEach(function(duplicate) {
reduced.duplicate.push(duplicate);
});
});
return reduced;
},
{
"out": { "inline": 1 },
}
)
The key points can be seen in both the content to emit and the first line of the reduce function. Essentially these present a structure that is the same. In the case of the emit it does not matter that the array being produced only has a singular element, but you send it that way anyhow. Side by side:
{ "count": 1, "duplicate": [this] } // Note [this]
// Same as
var reduced = { "count": 0, "duplicate": [] };
That also means that the remainder of the reduce function will always assume that the "duplicate" content is in fact an array, because that is how it came as original input and is also how it will be returned:
values.forEach(function(value) {
reduced.count += value.count;
value.duplicate.forEach(function(duplicate) {
reduced.duplicate.push(duplicate);
});
});
return reduced;
Alternate Solution
The other reason for an answer is that, considering the output you are expecting, this would in fact be much better suited to the aggregation framework. It is going to do this a lot faster than mapReduce can, and is even far simpler to code up:
db.collection.aggregate([
{ "$group": {
"_id": { "firstName": "$firstName", "lastName": "$lastName" },
"duplicate": { "$push": "$$ROOT" },
"count": { "$sum": 1 }
}},
{ "$match": { "count": { "$gt": 1 } }}
])
That's all it is. You can write the output to a collection by adding an $out stage where required. But whether you use mapReduce or aggregate, you are still subject to the same 16MB restriction on document size when adding your "duplicate" items into an array.
Also note that you can simply do something here that mapReduce cannot, and just "omit" any items that are not in fact a "duplicate" from the results. The mapReduce method cannot do this without first producing output to a collection and then "filtering" the results in a separate query.
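To make concrete what the $group and $match stages compute, here is a rough in-memory equivalent in plain JavaScript. It is only an illustration on a made-up sample array; findDuplicates and people are hypothetical names, and the real pipeline runs on the server rather than in application code.

```javascript
// Group documents by (firstName, lastName), collect each group's members,
// and keep only the groups that contain more than one document.
function findDuplicates(docs) {
  const groups = new Map();
  docs.forEach(function (doc) {
    const key = JSON.stringify([doc.firstName, doc.lastName]);
    if (!groups.has(key)) {
      groups.set(key, {
        _id: { firstName: doc.firstName, lastName: doc.lastName },
        duplicate: [], // plays the role of { "$push": "$$ROOT" }
        count: 0       // plays the role of { "$sum": 1 }
      });
    }
    const group = groups.get(key);
    group.duplicate.push(doc);
    group.count += 1;
  });
  // Equivalent of { "$match": { "count": { "$gt": 1 } } }
  return Array.from(groups.values()).filter(function (group) {
    return group.count > 1;
  });
}

// Made-up sample data for the illustration.
const people = [
  { _id: 1, firstName: "foo", lastName: "bar" },
  { _id: 2, firstName: "foo", lastName: "bar" },
  { _id: 3, firstName: "baz", lastName: "qux" }
];

const duplicates = findDuplicates(people);
console.log(JSON.stringify(duplicates, null, 2));
```

Only the "foo bar" group is returned here, since the third document has no duplicate.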
The core MongoDB documentation itself notes:
NOTE
For most aggregation operations, the Aggregation Pipeline provides better performance and more coherent interface. However, map-reduce operations provide some flexibility that is not presently available in the aggregation pipeline.
So it's really a case of weighing up which is better suited to the problem at hand.