How to Map Reduce nested arrays with Underscore.js


Given:
appliedDoses = {
"_id": "MAIN",
"scheme": {
"_id": "MAIN_SCHEME",
"name": "ESQUEMA CLASICO",
"vaccines": [
{
"_id": "BCG",
"__v": 0,
"doses": [
{
"_id": "BCG_UNICA",
"frequencies": [
{
"_id": "BCG_UNICA_RECIEN_NACIDO",
"group_type": {
"_id": "RECIEN_NACIDO",
"name": "RECIEN NACIDO"
},
"__v": 0,
"status": true,
"number_applied": 10
}, ...
What I want is to filter by group_type._id == "RECIEN_NACIDO" and doses[]._id == "BCG_UNICA", then get the total sum of frequencies[].number_applied.
I tried:
async.each(appliedDose, function(scheme){
async.each(scheme.scheme.vaccines, function(vaccine){
async.each(vaccine.doses, function(dose){
if(dose._id == getDose) {
async.each(dose.frequencies, function(frequencie){
if(frequencie.group_type._id == getGroup) {
applied += frequencie.number_applied;
}
});
}
});
});
});
But my code isn't very efficient, and I was wondering whether MapReduce could be used to improve it. Can somebody give me a hint? Help is much appreciated!

I'm a little confused by the object you're dealing with, so I'll try to deal in abstractions. What you're trying to do is easier if you break it down into steps:
Get the collection that results from filtering by group_type._id == "RECIEN_NACIDO" && doses._id == "BCG_UNICA".
Get the array of number_applied values by plucking them out with _.pluck.
Reduce that array to a sum: _.reduce(numAppliedArray, function(sum, x) { return sum + x; }, 0);
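The steps above can be sketched roughly as follows. This is a minimal sketch, not the answerer's code: it uses native array methods (flatMap, filter, map, reduce) in place of the Underscore calls, and assumes a single document shaped like the one in the question. The second vaccine, the "ADULTO" group and all the numbers are made up for illustration.

```javascript
// Sample data shaped like the question's document; IDs other than
// BCG, BCG_UNICA and RECIEN_NACIDO are hypothetical, as are the counts.
const appliedDoses = {
  _id: "MAIN",
  scheme: {
    _id: "MAIN_SCHEME",
    vaccines: [
      {
        _id: "BCG",
        doses: [
          {
            _id: "BCG_UNICA",
            frequencies: [
              { group_type: { _id: "RECIEN_NACIDO" }, number_applied: 10 },
              { group_type: { _id: "ADULTO" }, number_applied: 7 },
            ],
          },
        ],
      },
      {
        _id: "OTHER",
        doses: [
          {
            _id: "BCG_UNICA",
            frequencies: [
              { group_type: { _id: "RECIEN_NACIDO" }, number_applied: 5 },
            ],
          },
          {
            _id: "OTHER_DOSE",
            frequencies: [
              { group_type: { _id: "RECIEN_NACIDO" }, number_applied: 100 },
            ],
          },
        ],
      },
    ],
  },
};

// Flatten vaccines -> doses -> frequencies, filtering at each level,
// then pluck number_applied and sum it.
function totalApplied(doc, doseId, groupId) {
  return doc.scheme.vaccines
    .flatMap((vaccine) => vaccine.doses)
    .filter((dose) => dose._id === doseId)
    .flatMap((dose) => dose.frequencies)
    .filter((frequency) => frequency.group_type._id === groupId)
    .map((frequency) => frequency.number_applied)
    .reduce((sum, x) => sum + x, 0);
}

const total = totalApplied(appliedDoses, "BCG_UNICA", "RECIEN_NACIDO");
console.log(total); // 15
```

The traversal is synchronous, so nothing here needs async.each; the work is still one pass over every frequency, just expressed as a pipeline.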

Related

Iterate through array of objects using javascript

I am trying to iterate through an array of objects but somehow can't get it right. Can someone please tell me where I am going wrong?
Here is the data
const response = {
"pass": 1,
"fail": 2,
"projects_all": [
{
"projects": [
{
"name": "Project1",
"current": null,
"previous": {
"environment": "qa4nc",
"status": "UNKNOWN",
}
}
]
},
{
"projects": [
{
"name": "Project2",
"current": null,
"previous": {
"environment": "qa4nc",
"status": "FAIL",
}
},
{
"name": "Project3",
"status": "LIVE",
"current": null,
"previous": {
"environment": "qa4nc",
"status": "UNKNOWN",
}
}
]
}
]
}
And here is the code I tried:
if(response) {
response?.projects_all?.forEach((projects) => {
projects.forEach(project) => {
if(project.previous !== null) {
//do something here
}
});
});
}
I am trying to iterate through this array of objects but it says projects is not iterable. Any help understanding where I am going wrong is appreciated.

You weren't iterating over the nested array properly. It helps to format the JSON object you plan to iterate over, so that you can see which parts are arrays, which are objects, and at what level of the hierarchy they sit.
if (response) {
  // Each element of projects_all is an object that holds a projects array.
  response?.projects_all?.forEach((group) => {
    group?.projects?.forEach((project) => {
      console.log(project?.name);
    });
  });
}

response?.projects_all?.forEach((projects) => {
This is exactly the right way to start the code. The problem comes next: you seem to misunderstand what projects refers to in this context.
You call projects.forEach(project) as if projects were an array. At this point projects is not an array; it is an object that looks like this:
{
"projects": [
{
"name": "Project1",
"current": null,
"previous": {
"environment": "qa4nc",
"status": "UNKNOWN",
}
}
]
}
So you actually want projects.projects.forEach(project => { ... }), or you could rename the projects variable so that the code reads more sensibly.
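As a rough illustration of that point, the nested arrays can also be flattened before filtering. This is a sketch on a trimmed copy of the question's data; Project3's previous is set to null here purely so that the filter has something to exclude, and the variable names group and withPrevious are made up.

```javascript
// Trimmed sample in the same shape as the question's response.
// Project3's `previous` is null here only for illustration.
const response = {
  projects_all: [
    { projects: [{ name: "Project1", previous: { status: "UNKNOWN" } }] },
    {
      projects: [
        { name: "Project2", previous: { status: "FAIL" } },
        { name: "Project3", previous: null },
      ],
    },
  ],
};

// Each element of projects_all is an object; its `projects` property is the array.
const withPrevious = (response.projects_all ?? [])
  .flatMap((group) => group.projects ?? [])
  .filter((project) => project.previous !== null);

console.log(withPrevious.map((project) => project.name)); // [ 'Project1', 'Project2' ]
```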

First, determine what shape your response object currently has.
By using the ?. operator you're essentially muting JavaScript's built-in error reporting.
From the context, I assume your response actually looks like this:
console.log(response);
{
data: {
projects_all: [ ... ]
}
}
Therefore your existing code using response?.projects_all doesn't actually hit the projects_all property inside your response.
Can you try the following:
response.data.projects_all.forEach((project) => {
console.info("Project: ", project);
project.projects.forEach((project) => {
console.log(project, project?.name);
});
});
Alternatively, if you don't have a data key inside your response object, you can omit it in the loop:
response.projects_all.forEach((group) => {
  console.info("Group: ", group);
  group.projects.forEach((project) => {
    console.log(project, project?.name);
  });
});

Wildcard Search in a collection with lodash

I have an array like the one shown below and I want to do a wildcard search and retrieve the matching value. The query below does not return any result. Can someone tell me whether there is a better way to do this? I'm using lodash utilities in my Node.js application.
var allmCar = [
{
"_id": ObjectId("5833527e25bf78ac0f4ca30e"),
"type": "mCar",
"value": "ABDC",
"__v": 0
},
{
"_id": ObjectId("5833527e25bf78ac0f4ca30e"),
"type": "mCar",
"value": "XYZ ABD",
"__v": 0
},
{
"_id": ObjectId("5833527e25bf78ac0f4ca30e"),
"type": "mCar",
"value": "FGHJ",
"__v": 0
}
]
_.find(allmCar, {
value: {
$regex: 'XYZ'
}
})
I finally ended up using _.includes as below
_.each(allmCar,function(car){
if(_.includes('XYZ', car.value)===true)
return car;
})

You can do the same with a function passed to _.find, like this
_.find(allmCar, function(mCar) {
return /XYZ/.test(mCar.value);
});
Or with arrow functions,
_.find(allmCar, (mCar) => /XYZ/.test(mCar.value));
This applies the function to the items of the collection in order and returns the first item for which the function returns true (or undefined if there is none).
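If every match is wanted rather than only the first, the same predicate can be given to a filter. The sketch below uses native find and filter instead of lodash; the helper names escapeRegExp and searchByValue and the fourth sample entry are assumptions made for the example. Escaping the search term keeps characters such as "." or "*" from being read as regex syntax.

```javascript
// Sample data based on the question; the last entry is hypothetical.
const allmCar = [
  { type: "mCar", value: "ABDC" },
  { type: "mCar", value: "XYZ ABD" },
  { type: "mCar", value: "FGHJ" },
  { type: "mCar", value: "QRS XYZ" },
];

// Escape regex metacharacters so the search term is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return every car whose value contains the term, ignoring case.
function searchByValue(cars, term) {
  const pattern = new RegExp(escapeRegExp(term), "i");
  return cars.filter((car) => pattern.test(car.value));
}

// find returns only the first match; the collection item comes first,
// the search string second.
const firstMatch = allmCar.find((car) => car.value.includes("XYZ"));
const allMatches = searchByValue(allmCar, "xyz");

console.log(firstMatch.value); // XYZ ABD
console.log(allMatches.length); // 2
```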

Remove all arrays that has power == 0

I have a use case where there comes a JSON response from backend in the form as follows:
[
{
"name": "cab",
"child": [
{
"name": "def",
"child": [
{
"name": "ghi",
"power": "0.00",
"isParent": false
}
],
"power": "1.23",
"isParent": true
}
],
"power": "1.1",
"isParent": true
},
{
"name": "hhi",
"child": [
{
"name": "hhi2",
"child": [
{
"name": "hhi3",
"power": "0.00",
"isParent": false
}
],
"power": "1.23",
"isParent": true
}
],
"power": "1.1",
"isParent": true
}
]
I need to remove all objects that have power == 0. It's easy to use filter on a simple collection, but any child can itself contain any number of children, nested to any depth.
Thanks in advance!

Just iterate over the arrays with a recursive function:
var json = ["JSON_HERE"];
// Returns the array without zero-power elements, recursing into each child array.
function deleteIterator(nodes) {
  return nodes.filter(function (node) {
    if (parseFloat(node.power) === 0) {
      return false; // drop this element
    }
    if (node.child) {
      node.child = deleteIterator(node.child);
    }
    return true;
  });
}
json = deleteIterator(json);
What this does is:
Iterate over the elements of the array.
Check whether the element's power is zero ("0.00").
If it is, remove it (filter it out).
Check whether the element has children.
If it does, run the same function over them (go to step 1).
Keep the element.

Recursively iterate through the object, looking for child each time, and filter on power === 0 or whatever your requirements are.
If you don't know how to use recursion, here is a tutorial to get you started. I really hope someone doesn't come along after me and hand you the exact solution to your problem, because this is something you should be able to solve yourself once you know how to use recursion. You could also use loops, but recursion is the better fit.
Edit: This problem has been solved before, in a different flavor, but it is the same problem. If your implementation ends up having bugs you can't figure out, please feel free to mention me in a new question and I'll try my best to help you.

You can iterate recursively using _.filter (Array#filter works the same way) with a named function expression:
var objArray = [{"name":"cab","child":[{"name":"def","child":[{"name":"ghi","power":"0.00","isParent":false}],"power":"1.23","isParent":true}],"power":"1.1","isParent":true},{"name":"hhi","child":[{"name":"hhi2","child":[{"name":"hhi3","power":"0.00","isParent":false}],"power":"1.23","isParent":true}],"power":"1.1","isParent":true}];
objArray = _.filter(objArray, function powerFilter(o) {
if (o.power == 0) return false;
if (o.isParent && o.child) {
o.child = _.filter(o.child, powerFilter); // recursive call
o.isParent = o.child.length > 0;
if (!o.isParent) delete o.child;
}
return true;
});
console.log(objArray);
<script src="https://cdn.jsdelivr.net/underscorejs/1.8.3/underscore-min.js"></script>
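The snippet above edits the input objects in place. Where the original response must stay untouched, a non-mutating variant along these lines may be preferable. This is only a sketch: the function name prune and the trimmed sample are made up, and it uses native array methods and object spread rather than Underscore.

```javascript
// Returns a pruned copy of the tree; the input array and its objects are not modified.
function prune(nodes) {
  return nodes
    .filter((node) => Number(node.power) !== 0)
    .map((node) => {
      if (!Array.isArray(node.child)) return { ...node };
      const child = prune(node.child);
      // Copy everything except the old child array.
      const { child: _oldChild, ...rest } = node;
      return child.length > 0
        ? { ...rest, child, isParent: true }
        : { ...rest, isParent: false };
    });
}

// Trimmed sample in the shape of the question's response.
const input = [
  {
    name: "cab",
    power: "1.1",
    isParent: true,
    child: [
      {
        name: "def",
        power: "1.23",
        isParent: true,
        child: [{ name: "ghi", power: "0.00", isParent: false }],
      },
    ],
  },
];

const pruned = prune(input);
console.log(JSON.stringify(pruned));
```

Here "def" loses its only child, so it is kept with isParent set to false and no child property, while the input array still contains "ghi".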

Mongodb find in hash by value

I have this MongoDB document format:
{
"_id": ObjectId("5406e4c49b324869198b456a"),
"phones": {
"12035508684": 1,
"13399874497": 0,
"15148399728": 1,
"18721839971": 1,
"98311321109": -1,
}
}
The phones field is a hash of phone numbers and how often each one is used.
I need to select all documents that have at least one frequency of zero or less.
I tried this:
db.my_collection.find({"phones": { $lte: 0} })
but no luck.
Thanks in advance for your advice.

You can't do that sort of query in MongoDB, at least not in a simple way, because what you have here is generally an "anti-pattern": part of your data is specified as "keys". A better way to model this is to make that "data" the value of a key, not the other way around:
{
"_id": ObjectId("5406e4c49b324869198b456a"),
"phones": [
{ "number": "12035508684", "value": 1 },
{ "number": "13399874497", "value": 0 },
{ "number": "15148399728", "value": 1 },
{ "number": "18721839971", "value": 1 },
{ "number": "98311321109", "value": -1 }
]
}
Then your query is quite simple:
db.collection.find({ "phones.value": { "$lte": 0 } })
Otherwise, MongoDB cannot "natively" traverse the "keys" of an object/hash; to do that you need JavaScript evaluation, which is not a great idea for performance. Basically, a $where query in short form:
db.collection.find(function() {
var phones = this.phones;
return Object.keys(phones).some(function(phone) {
return phones[phone] <= 0;
})
})
So the better option is to change the way you are modelling this and take advantage of the native operators. Otherwise, most queries require an "explicit" path to any "key" inside the object/hash.
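The remodelling itself is a small transformation that can be done in application code before the documents are written back. The sketch below shows it in plain JavaScript on an in-memory copy of the question's document; the helper name phonesToArray is made up, and how the reshaped document is saved is left out.

```javascript
// Turn { "<number>": <value>, ... } into [{ number, value }, ...].
function phonesToArray(phones) {
  return Object.entries(phones).map(([number, value]) => ({ number, value }));
}

// In-memory copy of the question's document (the _id is shown as a plain string).
const doc = {
  _id: "5406e4c49b324869198b456a",
  phones: {
    "12035508684": 1,
    "13399874497": 0,
    "15148399728": 1,
    "18721839971": 1,
    "98311321109": -1,
  },
};

const reshaped = { ...doc, phones: phonesToArray(doc.phones) };

// The same test that { "phones.value": { "$lte": 0 } } expresses as a query.
const hasNonPositive = reshaped.phones.some((phone) => phone.value <= 0);

console.log(reshaped.phones.length, hasNonPositive); // 5 true
```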

mapreduce with sort on inner document mongodb

I have a quick question on map-reduce with mongodb. I have this following document structure
{
"_id": "ffc74819-c844-4d61-8657-b6ab09617271",
"value": {
"mid_tag": {
"0": {
"0": "Prakash Javadekar",
"1": "Shastri Bhawan",
"2": "Prime Minister's Office (PMO)",
"3": "Narendra Modi"
},
"1": {
"0": "explosion",
"1": "GAIL",
"2": "Andhra Pradesh",
"3": "N Chandrababu Naidu"
},
"2": {
"0": "Prime Minister",
"1": "Narendra Modi",
"2": "Bharatiya Janata Party (BJP)",
"3": "Government"
}
},
"total": 3
}
}
When I run my map-reduce code on this collection of documents, I want to specify total as the sort field in this command:
db.ana_mid_big.mapReduce(map, reduce,
{
out: "analysis_result",
sort: {"value.total": -1}
}
);
But this does not seem to work. How can I specify a key which is nested for sorting? Please help.
----------------------- EDIT ---------------------------------
As per the comments, I am posting my whole problem here. I started with a collection of a little more than 3.5M documents (this is just an old snapshot of the live one, which has already passed 5.5M) that looks like this:
{
"_id": ObjectId("53b394d6f9c747e33d19234d"),
"autoUid": "ffc74819-c844-4d61-8657-b6ab09617271",
"createDate": ISODate("2014-07-02T05:12:54.171Z"),
"account_details": {
"tag_cloud": {
"0": "FIFA World Cup 2014",
"1": "Brazil",
"2": "Football",
"3": "Argentina",
"4": "Belgium"
}
}
}
So, there can be many documents with the same autoUid but with different (or partially same or even same) tag_cloud.
I have written the following map-reduce to generate an intermediate collection that looks like the one at the start of the question. So that is a collection holding all the tag_clouds that belong to one person in a single document. The MR code I used to achieve this looks like the following:
var map = function(){
final_val = {
tag_cloud: this.account_details.tag_cloud,
total: 1
};
emit(this.autoUid, final_val)
}
var reduce = function(key, values){
var fv = {
mid_tags: [],
total: 0
}
try{
for (i in values){
fv.mid_tags.push(values[i].tag_cloud);
fv.total = fv.total + 1;
}
}catch(e){
fv.mid_tags.push(values)
fv.total = fv.total + 1;
}
return fv;
}
db.my_orig_collection.mapReduce(map, reduce,
{
out: "analysis_mid",
sort: {createDate: -1}
}
);
Here comes problem number 1: when somebody has more than one record, the output follows the reduce function. But when somebody has only one record, the field keeps the name "tag_cloud" instead of "mid_tag". I understand that there is some problem with the reduce code but cannot find what it is.
Now I want to reach to a final result which looks like
{"_id": "ffc74819-c844-4d61-8657-b6ab09617271",
"value": {
"tags": {
"Prakash Javadekar": 1,
"Shastri Bhawan": 1,
"Prime Minister's Office (PMO)": 1,
"Narendra Modi": 2,
"explosion": 1,
"GAIL": 1,
"Andhra Pradesh": 1,
"N Chandrababu Naidu": 1,
"Prime Minister": 1,
"Bharatiya Janata Party (BJP)": 1,
"Government": 1
}
}
}
Which is finally one document for each person representing the tag density they have used. The MR code I am trying to use (not tested yet) looks like this---
var map = function(){
var val = {};
if ("mid_tags" in this.value){
for (i in this.value.mid_tags){
for (j in this.value.mid_tags[i]){
k = this.value.mid_tags[i][j].trim();
if (!(k in val)){
val[k] = 1;
}else{
val[k] = val[k] + 1;
}
}
}
var final_val = {
tag: val,
total: this.value.total
}
emit(this._id, final_val);
}else if("tag_cloud" in this.value){
for (i in this.value.tag_cloud){
k = this.value.tag_cloud[i].trim();
if (!(k in val)){
val[k] = 1;
}else{
val[k] = val[k] + 1;
}
}
var final_val = {
tag: val,
total: this.value.total
}
emit(this._id, final_val);
}
}
var reduce = function(key, values){
return values;
}
db.analysis_mid.mapReduce(map, reduce,
{
out: "analysis_result"
}
);
This last piece of code is not tested yet. That is all I want to do. Please help

Your PHP background appears to be showing. The data structures you show are not written as arrays in typical JSON notation, but the calls to "push" in your mapReduce code indicate that, at least in your "interim document", the values are actually arrays. You have "notated" both the same way, so it seems reasonable to presume the originals are arrays too.
Actual arrays are your best option for storage here, especially considering your desired outcome. So even if they are not arrays at present, your original documents should look like this, as they would be represented in the shell:
{
"_id": ObjectId("53b394d6f9c747e33d19234d"),
"autoUid": "ffc74819-c844-4d61-8657-b6ab09617271",
"createDate": ISODate("2014-07-02T05:12:54.171Z"),
"account_details": {
"tag_cloud": [
"FIFA World Cup 2014",
"Brazil",
"Football",
"Argentina",
"Belgium"
]
}
}
With documents like that, or if you change them to be like that, the right tool for the job is the aggregation framework. It runs in native code and does not require JavaScript interpretation, so it is much faster.
An aggregation statement to get to your final result is like this:
db.collection.aggregate([
// Unwind the array to "de-normalize"
{ "$unwind": "$account_details.tag_cloud" },
// Group by "autoUid" and "tag", summing totals
{ "$group": {
"_id": {
"autoUid": "$autoUid",
"tag": "$account_details.tag_cloud"
},
"total": { "$sum": 1 }
}},
// Sort the results to largest count per user
{ "$sort": { "_id.autoUid": 1, "total": -1 } },
// Group to a single user with an array of "tags" if you must
{ "$group": {
"_id": "$_id.autoUid",
"tags": {
"$push": {
"tag": "$_id.tag",
"total": "$total"
}
}
}}
])
Slightly different output, but much simpler to process and much faster:
{
"_id": "ffc74819-c844-4d61-8657-b6ab09617271",
"tags": [
{ "tag": "Narendra Modi", "total": 2 },
{ "tag": "Prakash Javadekar", "total": 1 },
{ "tag": "Shastri Bhawan", "total": 1 },
{ "tag": "Prime Minister's Office (PMO)", "total": 1 },
{ "tag": "explosion", "total": 1 },
{ "tag": "GAIL", "total": 1 },
{ "tag": "Andhra Pradesh", "total": 1 },
{ "tag": "N Chandrababu Naidu", "total": 1 },
{ "tag": "Prime Minister", "total": 1 },
{ "tag": "Bharatiya Janata Party (BJP)", "total": 1 },
{ "tag": "Government", "total": 1 }
]
}
Also sorted by "tag relevance score" for the user for good measure, but you can look at dropping that or even both of the last stages as is appropriate to your actual case.
Still, this is by far the best option, so do learn how to use the aggregation framework. If your "output" will still be "big" (over 16MB), then look at moving to MongoDB 2.6 or greater. Aggregate statements can produce a "cursor", which can be iterated rather than pulling all results at once. There is also the $out operator, which can create a collection just as mapReduce does.
If your data is actually in the "hash"-like format of sub-documents that your notation indicates (which follows a PHP "dump" convention for arrays), then you need to use mapReduce, as the aggregation framework cannot traverse "hash keys" the way these are represented. It is not the best structure, and you should change it if this is the case.
Still, there are several corrections to make to your approach, and this does in fact become a single-step operation to the final result. Again, though, the final output will contain an "array" of "tags", since it really is not good practice to use your "data" as "key" names:
db.collection.mapReduce(
function() {
var tag_cloud = this.account_details.tag_cloud;
var obj = {};
for ( var k in tag_cloud ) {
obj[tag_cloud[k]] = 1;
}
emit( this.autoUid, obj );
},
function(key,values) {
var reduced = {};
// Combine keys and totals
values.forEach(function(value) {
for ( var k in value ) {
if (!reduced.hasOwnProperty(k))
reduced[k] = 0;
reduced[k] += value[k];
}
});
return reduced;
},
{
"out": { "inline": 1 },
"finalize": function(key,value) {
var output = [];
// Mapped to array for output
for ( var k in value ) {
output.push({
"tag": k,
"total": value[k]
});
}
// Sorted by total, largest first, like the aggregation output
return output.sort(function(a,b) {
return b.total - a.total;
});
}
}
)
Or if it actually is an "array" of "tags" in your original document but your end output will be too big and you cannot move up to a recent release, then the initial array processing is just a little different:
db.collection.mapReduce(
function() {
var tag_cloud = this.account_details.tag_cloud;
var obj = {};
tag_cloud.forEach(function(tag) {
obj[tag] = 1;
});
emit( this.autoUid, obj );
},
function(key,values) {
var reduced = {};
// Combine keys and totals
values.forEach(function(value) {
for ( var k in value ) {
if (!reduced.hasOwnProperty(k))
reduced[k] = 0;
reduced[k] += value[k];
}
});
return reduced;
},
{
"out": { "replace": "newcollection" },
"finalize": function(key,value) {
var output = [];
// Mapped to array for output
for ( var k in value ) {
output.push({
"tag": k,
"total": value[k]
});
}
// Sorted by total, largest first, like the aggregation output
return output.sort(function(a,b) {
return b.total - a.total;
});
}
}
)
Everything essentially follows the same principles to get to the end result:
De-normalize to a "user" and "tag" combination, with "user" as the grouping key.
Combine the results per user with a total on "tag" values.
In the mapReduce approach here, apart from being cleaner than what you seemed to be trying, the other main point to consider here is that the reducer needs to "output" exactly the same sort of "input" that comes from the mapper. The reason is actually well documented, as the "reducer" can in fact get called several times, basically "reducing again" output that has already been through reduce processing.
This is generally how mapReduce deals with "large inputs", where there are lots of values for a given "key" and the "reducer" only processes so many of them at one time. For example a reducer may actually only take 30 or so documents emitted with the same key, reduce two sets of those 30 down to 2 documents and then finally reduce to a single output for a single key.
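The re-reduce requirement can be checked outside the database with a small simulation. The sketch below runs a reducer of the same form in plain JavaScript, once over all emitted values and once in two passes, and compares the results. It also includes a reducer that counts one per incoming value, in the style of the question's code, to show how that breaks when partial results are reduced again. The names reduceTags and countingReduce and the sample values are made up. The same property explains the question's "problem number 1": MongoDB does not call reduce for a key that has only a single emitted value, so the map output passes through unchanged.

```javascript
// Reducer whose output has the same shape as its input: { tag: count, ... }.
function reduceTags(key, values) {
  const reduced = {};
  values.forEach((value) => {
    for (const k in value) {
      if (!Object.prototype.hasOwnProperty.call(reduced, k)) reduced[k] = 0;
      reduced[k] += value[k];
    }
  });
  return reduced;
}

// Reducer that adds 1 per incoming value and ignores each value's own total.
function countingReduce(key, values) {
  return { total: values.length };
}

// Hypothetical values emitted by the mapper for one user.
const emitted = [
  { "Narendra Modi": 1, "Shastri Bhawan": 1 },
  { GAIL: 1, "Andhra Pradesh": 1 },
  { "Narendra Modi": 1, Government: 1 },
];

// One pass over everything versus reducing a partial result again.
const onePass = reduceTags("user", emitted);
const partial = reduceTags("user", emitted.slice(0, 2));
const twoPass = reduceTags("user", [partial, emitted[2]]);

// The counting reducer gives different answers depending on how calls are split.
const counts = emitted.map(() => ({ total: 1 }));
const countOnePass = countingReduce("user", counts);
const countTwoPass = countingReduce("user", [
  countingReduce("user", counts.slice(0, 2)),
  counts[2],
]);

console.log(onePass["Narendra Modi"], twoPass["Narendra Modi"]); // 2 2
console.log(countOnePass.total, countTwoPass.total); // 3 2
```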
The end result here is the same as the other output shown above, with the mapReduce difference that everything is under a "value" key as that is just how it works.
So a couple of ways to do it depending on your data. Do try to stick with the aggregation framework where possible as it is much faster and modern versions can consume and output just as much data as you can throw at mapReduce.
