Group by Day and Item Total, but Output Item Names as Keys - javascript

I've been trying these examples: https://docs.mongodb.com/manual/reference/operator/aggregation/push/ and
https://docs.mongodb.com/manual/reference/operator/aggregation/addToSet/
Sample documents:
{ "_id" : 1, "item" : "abc", "price" : 10, "quantity" : 2, "date" : ISODate("2014-01-01T08:00:00Z") }
{ "_id" : 2, "item" : "jkl", "price" : 20, "quantity" : 1, "date" : ISODate("2014-02-03T09:00:00Z") }
{ "_id" : 3, "item" : "xyz", "price" : 5, "quantity" : 5, "date" : ISODate("2014-02-03T09:05:00Z") }
{ "_id" : 4, "item" : "abc", "price" : 10, "quantity" : 10, "date" : ISODate("2014-02-15T08:00:00Z") }
{ "_id" : 5, "item" : "xyz", "price" : 5, "quantity" : 10, "date" : ISODate("2014-02-15T09:05:00Z") }
{ "_id" : 6, "item" : "xyz", "price" : 5, "quantity" : 5, "date" : ISODate("2014-02-15T12:05:10Z") }
{ "_id" : 7, "item" : "xyz", "price" : 5, "quantity" : 10, "date" : ISODate("2014-02-15T14:12:12Z") }
But what I need is a mix of the two. In the $push example, the results look like:
{
"_id" : { "day" : 46, "year" : 2014 },
"itemsSold" : [
{ "item" : "abc", "quantity" : 10 },
{ "item" : "xyz", "quantity" : 10 },
{ "item" : "xyz", "quantity" : 5 },
{ "item" : "xyz", "quantity" : 10 }
]
}
{
"_id" : { "day" : 34, "year" : 2014 },
"itemsSold" : [
{ "item" : "jkl", "quantity" : 1 },
{ "item" : "xyz", "quantity" : 5 }
]
}
{
"_id" : { "day" : 1, "year" : 2014 },
"itemsSold" : [ { "item" : "abc", "quantity" : 2 } ]
}
And in the $addToSet example, the results look like:
{ "_id" : { "day" : 46, "year" : 2014 }, "itemsSold" : [ "xyz", "abc" ] }
{ "_id" : { "day" : 34, "year" : 2014 }, "itemsSold" : [ "xyz", "jkl" ] }
{ "_id" : { "day" : 1, "year" : 2014 }, "itemsSold" : [ "abc" ] }
What I want would look like this:
{ "_id" : { "day" : 46, "year" : 2014 }, "itemsSold" : { "xyz": 25, "abc": 10 } }
{ "_id" : { "day" : 34, "year" : 2014 }, "itemsSold" : { "xyz": 5, "jkl": 1 } }
{ "_id" : { "day" : 1, "year" : 2014 }, "itemsSold" : { "abc": 2 } }
Is this possible? If so, any guidance or direction would be helpful.

Based on your data, you want two $group stages: the first totals the quantity per "item" for each day, and the second collects those item totals into an array per day.
How you process the rest depends on the MongoDB version you have available. For MongoDB 3.6 (or from 3.4.7) you can use $arrayToObject to reshape the data:
db.collection.aggregate([
  { "$group": {
    "_id": {
      "year": { "$year": "$date" },
      "dayOfYear": { "$dayOfYear": "$date" },
      "item": "$item"
    },
    "total": { "$sum": "$quantity" }
  }},
  { "$group": {
    "_id": {
      "year": "$_id.year",
      "dayOfYear": "$_id.dayOfYear"
    },
    "itemsSold": { "$push": { "k": "$_id.item", "v": "$total" } }
  }},
  { "$sort": { "_id": -1 } },
  { "$addFields": {
    "itemsSold": { "$arrayToObject": "$itemsSold" }
  }}
])
Or, with earlier versions, you can simply post-process the results. All the "aggregation" work is done before the last stage anyway:
db.collection.aggregate([
  { "$group": {
    "_id": {
      "year": { "$year": "$date" },
      "dayOfYear": { "$dayOfYear": "$date" },
      "item": "$item"
    },
    "total": { "$sum": "$quantity" }
  }},
  { "$group": {
    "_id": {
      "year": "$_id.year",
      "dayOfYear": "$_id.dayOfYear"
    },
    "itemsSold": { "$push": { "k": "$_id.item", "v": "$total" } }
  }},
  { "$sort": { "_id": -1 } },
  /*
  { "$addFields": {
    "itemsSold": { "$arrayToObject": "$itemsSold" }
  }}
  */
]).map( d => Object.assign( d,
  {
    itemsSold: d.itemsSold.reduce((acc,curr) =>
      Object.assign(acc, { [curr.k]: curr.v }),
      {}
    )
  }
))
Either way produces the same desired result:
{
"_id" : {
"year" : 2014,
"dayOfYear" : 46
},
"itemsSold" : {
"xyz" : 25,
"abc" : 10
}
}
{
"_id" : {
"year" : 2014,
"dayOfYear" : 34
},
"itemsSold" : {
"jkl" : 1,
"xyz" : 5
}
}
{
"_id" : {
"year" : 2014,
"dayOfYear" : 1
},
"itemsSold" : {
"abc" : 2
}
}
So you can do this with the newer aggregation features, but the final step is really just "reshaping", which is usually best left to client-side processing instead.
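As a rough illustration of that client-side reshaping, here is a minimal sketch that uses Object.fromEntries instead of reduce. The grouped array is hand-written to mirror what the two $group stages and the $sort return for the sample data, and reshapeItemsSold is a hypothetical helper name, not part of any driver API.

```javascript
// Hand-written stand-in for the pipeline output before the final reshaping step.
const grouped = [
  { _id: { year: 2014, dayOfYear: 46 }, itemsSold: [{ k: "xyz", v: 25 }, { k: "abc", v: 10 }] },
  { _id: { year: 2014, dayOfYear: 34 }, itemsSold: [{ k: "jkl", v: 1 }, { k: "xyz", v: 5 }] },
  { _id: { year: 2014, dayOfYear: 1 }, itemsSold: [{ k: "abc", v: 2 }] }
];

// Hypothetical helper: turn the [{ k, v }] array into an object keyed by item name.
// Returns a new document and leaves the input untouched.
function reshapeItemsSold(doc) {
  return {
    ...doc,
    itemsSold: Object.fromEntries(doc.itemsSold.map(({ k, v }) => [k, v]))
  };
}

const reshaped = grouped.map(reshapeItemsSold);
console.log(JSON.stringify(reshaped[0]));
// {"_id":{"year":2014,"dayOfYear":46},"itemsSold":{"xyz":25,"abc":10}}
```

Object.fromEntries needs a reasonably recent JavaScript runtime; the reduce version shown earlier does the same job where it is not available.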

Related

How to filter with data taken from $lookup

I am currently trying to aggregate a list of documents, filtering them with data obtained via $lookup
Product.aggregate([
{
$lookup: {
from: "categories",
localField: "category",
foreignField: "_id",
as: "category",
},
},
{ $unwind: "$category" }])
I was hoping that adding { $match: { "category.left": { $gte: 3 } } } would return all of the products whose category has a left property greater than or equal to the specified value, but so far I get nothing. What would be the solution for this?
category documents
{ "_id" : ObjectId("570557d4094a4514fc1291d6"), "left" : 1, "right" : "2" }
{ "_id" : ObjectId("570557d4094a4514fc1291d7"), "left" : 3, "right" : "8"}
{ "_id" : ObjectId("570557d4094a4514fc1291d8"), "left" : 4, "right" : "5"}
{ "_id" : ObjectId("570557d4094a4514fc1291d9"), "left" : 6, "right" : "7" }
product documents
{ "_id" : ObjectId("570557d4094a4514fc129120"), "category": ObjectId("570557d4094a4514fc1291d6") }
{ "_id" : ObjectId("570557d4094a4514fc129121"), "category": ObjectId("570557d4094a4514fc1291d7")}
{ "_id" : ObjectId("570557d4094a4514fc129122"), "category": ObjectId("570557d4094a4514fc1291d8")}
{ "_id" : ObjectId("570557d4094a4514fc129123"), "category": ObjectId("570557d4094a4514fc1291d9") }
I was expecting to get
{ "_id" : ObjectId("570557d4094a4514fc129121"), "category": ObjectId("570557d4094a4514fc1291d7")}
{ "_id" : ObjectId("570557d4094a4514fc129122"), "category": ObjectId("570557d4094a4514fc1291d8")}
{ "_id" : ObjectId("570557d4094a4514fc129123"), "category": ObjectId("570557d4094a4514fc1291d9") }
for my response
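To make the intended result concrete, here is a minimal in-memory sketch of the join-then-filter logic in plain JavaScript. It does not diagnose why the original pipeline returned nothing; it only shows what $lookup, $unwind and the $match on "category.left" are expected to produce for the sample documents. The short string ids stand in for the ObjectId values, and the function name is hypothetical.

```javascript
// Short string ids replace the ObjectId values from the question (an assumption for brevity).
const categories = [
  { _id: "d6", left: 1, right: "2" },
  { _id: "d7", left: 3, right: "8" },
  { _id: "d8", left: 4, right: "5" },
  { _id: "d9", left: 6, right: "7" }
];
const products = [
  { _id: "120", category: "d6" },
  { _id: "121", category: "d7" },
  { _id: "122", category: "d8" },
  { _id: "123", category: "d9" }
];

// Hypothetical helper mirroring $lookup + $unwind + $match.
function productsWithCategoryLeftAtLeast(productDocs, categoryDocs, minLeft) {
  const categoryById = new Map(categoryDocs.map(c => [c._id, c]));
  return productDocs
    // $lookup + $unwind: replace the reference with the joined category document
    .map(p => ({ ...p, category: categoryById.get(p.category) }))
    // $unwind drops products without a match; $match keeps left >= minLeft
    .filter(p => p.category !== undefined && p.category.left >= minLeft);
}

const matched = productsWithCategoryLeftAtLeast(products, categories, 3);
console.log(matched.map(p => p._id)); // [ '121', '122', '123' ]
```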

Elasticsearch only one record based on userid?

In the post index, postid is the primary key and userid is a foreign key.
I want all posts, but only one post per userid, so that each user has exactly one post in the results, sorted by postdate (optionally latest first).
//Actual Result
[
{
userid: "u1",
postid: "p1"
},
{
userid: "u1",
postid: "p2"
},
{
userid: "u2",
postid: "p3"
},
{
userid: "u3",
postid: "p4"
},
{
userid: "u3",
postid: "p5"
},
{
userid: "u3",
postid: "p6"
}
]
What I need is below:
//Expected Result
[
{
userid: "u1",
postid: "p1"
},
{
userid: "u2",
postid: "p3"
},
{
userid: "u3",
postid: "p4"
}
]
I think you can use a top_hits aggregation for this. Here is a sample:
DELETE my-index-000001
PUT my-index-000001
{
"mappings": {
"properties": {
"userid": {
"type": "keyword"
},
"postid": {
"type": "keyword"
},
"postdate": {
"type": "date"
}
}
}
}
PUT my-index-000001/_doc/1
{"userid": "u1", "postid": "p1", "postdate": "2021-03-01"}
PUT my-index-000001/_doc/2
{"userid": "u1", "postid": "p2", "postdate": "2021-03-02"}
PUT my-index-000001/_doc/3
{"userid": "u2", "postid": "p3", "postdate": "2021-03-03"}
PUT my-index-000001/_doc/4
{"userid": "u3", "postid": "p4", "postdate": "2021-03-04"}
PUT my-index-000001/_doc/5
{"userid": "u3", "postid": "p5", "postdate": "2021-03-05"}
PUT my-index-000001/_doc/6
{"userid": "u3", "postid": "p6", "postdate": "2021-03-06"}
Those are the steps to create the sample index. And here is the query:
GET my-index-000001/_search
{
"size": 0,
"aggs": {
"top_users": {
"terms": {
"field": "userid",
"size": 100
},
"aggs": {
"top": {
"top_hits": {
"sort": [
{
"postdate": {
"order": "desc"
}
}
],
"_source": {
"includes": [ "postdate", "postid" ]
},
"size": 1
}
}
}
}
}
}
In the result set, you can see the top post for each user inside the aggregations:
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"top_users" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "u3",
"doc_count" : 3,
"top" : {
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "my-index-000001",
"_type" : "_doc",
"_id" : "6",
"_score" : null,
"_source" : {
"postdate" : "2021-03-06",
"postid" : "p6"
},
"sort" : [
1614988800000
]
}
]
}
}
},
{
"key" : "u1",
"doc_count" : 2,
"top" : {
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "my-index-000001",
"_type" : "_doc",
"_id" : "2",
"_score" : null,
"_source" : {
"postdate" : "2021-03-02",
"postid" : "p2"
},
"sort" : [
1614643200000
]
}
]
}
}
},
{
"key" : "u2",
"doc_count" : 1,
"top" : {
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "my-index-000001",
"_type" : "_doc",
"_id" : "3",
"_score" : null,
"_source" : {
"postdate" : "2021-03-03",
"postid" : "p3"
},
"sort" : [
1614729600000
]
}
]
}
}
}
]
}
}
}
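To turn that response into the flat list from the question, something like the following sketch could work. The response object here is a trimmed-down copy of the output above (only the fields the code reads), and the variable names are my own.

```javascript
// Trimmed-down copy of the search response shown above.
const response = {
  aggregations: {
    top_users: {
      buckets: [
        { key: "u3", top: { hits: { hits: [{ _source: { postdate: "2021-03-06", postid: "p6" } }] } } },
        { key: "u1", top: { hits: { hits: [{ _source: { postdate: "2021-03-02", postid: "p2" } }] } } },
        { key: "u2", top: { hits: { hits: [{ _source: { postdate: "2021-03-03", postid: "p3" } }] } } }
      ]
    }
  }
};

// One entry per user: the bucket key is the userid, the single top hit is the latest post.
const latestPosts = response.aggregations.top_users.buckets.map(bucket => {
  const hit = bucket.top.hits.hits[0]; // undefined if the bucket has no hits
  return {
    userid: bucket.key,
    postid: hit ? hit._source.postid : null,
    postdate: hit ? hit._source.postdate : null
  };
});

console.log(latestPosts);
```

Note that the buckets come back ordered by doc_count, not by userid, so sort the array afterwards if the order matters.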
Assuming an index mapping of the form:
PUT user_posts
{
"mappings": {
"properties": {
"userid": {
"type": "keyword"
},
"postid": {
"type": "keyword"
},
"postdate": {
"type": "date"
}
}
}
}
You could:
aggregate on the userid and order the IDs alphabetically,
sub-aggregate on the postid and sort the posts by their latest postdate, descending, via a max aggregation,
filter the response through the filter_path option to retrieve only what you need:
POST user_posts/_search?filter_path=aggregations.*.buckets.key,aggregations.*.buckets.*.buckets.key
{
"size": 0,
"aggs": {
"by_userid": {
"terms": {
"field": "userid",
"order": {
"_key": "asc"
},
"size": 100
},
"aggs": {
"by_latest_postid": {
"terms": {
"field": "postid",
"size": 1,
"order": {
"latest_posttime": "desc"
}
},
"aggs": {
"latest_posttime": {
"max": {
"field": "postdate"
}
}
}
}
}
}
}
}
Yielding:
{
"aggregations" : {
"by_userid" : {
"buckets" : [
{
"key" : "u1",
"by_latest_postid" : {
"buckets" : [
{
"key" : "p1"
}
]
}
},
{
"key" : "u2",
"by_latest_postid" : {
"buckets" : [
{
"key" : "p3"
}
]
}
},
{
"key" : "u3",
"by_latest_postid" : {
"buckets" : [
{
"key" : "p4"
}
]
}
}
]
}
}
}
which you can then post-process as you normally would:
...
const response = await ...; // transform the above request for use in the ES JS lib of your choice
const result = response.aggregations.by_userid.buckets.map(b => {
  return {
    userid: b.key,
    // optional chaining guards against a missing or empty buckets array
    postid: b.by_latest_postid.buckets?.[0]?.key
  }
})
You can use the top_hits sub-aggregation: first do a terms aggregation by userid, then use top_hits with a sort on postdate to get the latest post for each user.
I should add that if you have many user IDs and you want the top hit for each one, you should probably use a composite aggregation as your top-level aggregation rather than terms.
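A rough sketch of what the composite variant might look like, written as a function that builds the request body for one page. The function name, the aggregation names (by_userid, latest_post) and the default page size are assumptions of mine; the field names come from the mapping above.

```javascript
// Hypothetical helper: build the search body for one page of a composite aggregation.
// afterKey is the after_key value from the previous page's response (omit it for the first page).
function buildLatestPostPerUserRequest(afterKey, pageSize = 100) {
  const composite = {
    size: pageSize,
    sources: [{ userid: { terms: { field: "userid" } } }]
  };
  if (afterKey) {
    composite.after = afterKey; // resume after the last bucket of the previous page
  }
  return {
    size: 0,
    aggs: {
      by_userid: {
        composite,
        aggs: {
          latest_post: {
            top_hits: {
              size: 1,
              sort: [{ postdate: { order: "desc" } }],
              _source: { includes: ["postid", "postdate"] }
            }
          }
        }
      }
    }
  };
}

console.log(JSON.stringify(buildLatestPostPerUserRequest(), null, 2));
```

The idea is to send the first body, read aggregations.by_userid.after_key from the response, and pass it back in for the next page until no buckets are returned.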

How to return the variant values of each product if that product is a variant?

I have a database in MongoDB like this
{"productId" : 1,
"isVariant": 1,
"variantId" : 1,
"attributeSet" : [
{
"name" : "Capacity",
"value" : "500 GB",
"id" : 3
},
{
"name" : "Form Factor",
"value" : "5 inch",
"id" : 4
},
{
"id" : 5,
"name" : "Memory Components",
"value" : "3D NAND"
}
]
},
{"productId" : 2,
"isVariant": 1,
"variantId" : 1,
"attributeSet" : [
{
"name" : "Capacity",
"value" : "1 TB",
"id" : 3
},
{
"name" : "Form Factor",
"value" : "5 inch",
"id" : 4
},
{
"id" : 5,
"name" : "Memory Components",
"value" : "3D NAND"
}
]
},
{"productId" : 3,
"isVariant": 1,
"variantId" : 1,
"attributeSet" : [
{
"name" : "Capacity",
"value" : "500 GB",
"id" : 3
},
{
"name" : "Form Factor",
"value" : "2.5 inch",
"id" : 4
},
{
"id" : 5,
"name" : "Memory Components",
"value" : "3D NAND"
}
]
},
{"productId" : 4,
"isVariant": 1,
"variantId" : 1,
"attributeSet" : [
{
"name" : "Capacity",
"value" : "1 TB",
"id" : 3
},
{
"name" : "Form Factor",
"value" : "2.5 inch",
"id" : 4
},
{
"id" : 5,
"name" : "Memory Components",
"value" : "3D NAND"
}
]
}
Now I want to return, for each attribute value, the products that have it; for example, 500 GB appears in productId 1 and 3.
The response should look like this:
variantValues : [{
attributeValue : "500 GB",
data : [
{productId : 1},
{productId : 3}
]},
{
attributeValue : "1 TB",
data : [
{productId : 2},
{productId : 4}
]},
{
attributeValue : "2.5 inch",
data : [
{productId : 3},
{productId : 4}
]},
{
attributeValue : "5 inch",
data : [
{productId : 1},
{productId : 2}
]}]
I store the possible variant values in another collection. The values that I am storing look like this:
"VariantValues" : {
"3" : [
"500 GB",
"1 TB"
],
"4" : [
"2.5 inch",
"5 inch"
]
},
I want to return the variant values of each product, in the above format, if that product is a variant. Can anyone help me with this?
You should be able to achieve this using $unwind and $group in your aggregation pipeline. $unwind first flattens each attribute into its own document; you can then group those documents by the attribute value.
Finally, you can use $project to get the desired name for attributeValue:
db.collection.aggregate([
{
$unwind: "$attributeSet"
},
{
$group: {
_id: "$attributeSet.value",
data: {
"$addToSet": {
productId: "$productId"
}
}
}
},
{
"$project": {
_id: 0,
data: 1,
attributeValue: "$_id"
}
}
])
See this simplified example on mongoplayground: https://mongoplayground.net/p/VASadZnDedc
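If it helps to see the same logic outside MongoDB, here is a minimal plain-JavaScript sketch of the unwind-and-group idea, run against the four products from the question. The helper name is hypothetical. Like the pipeline, it groups on every attribute value, so "3D NAND" shows up as a group as well.

```javascript
// Builds the three-attribute set used by every product in the question.
const attrs = (capacity, formFactor) => [
  { id: 3, name: "Capacity", value: capacity },
  { id: 4, name: "Form Factor", value: formFactor },
  { id: 5, name: "Memory Components", value: "3D NAND" }
];
const products = [
  { productId: 1, attributeSet: attrs("500 GB", "5 inch") },
  { productId: 2, attributeSet: attrs("1 TB", "5 inch") },
  { productId: 3, attributeSet: attrs("500 GB", "2.5 inch") },
  { productId: 4, attributeSet: attrs("1 TB", "2.5 inch") }
];

// Hypothetical helper mirroring $unwind + $group with $addToSet + $project.
function groupByAttributeValue(docs) {
  const groups = new Map();
  for (const doc of docs) {
    for (const attribute of doc.attributeSet) { // $unwind: one step per attribute
      if (!groups.has(attribute.value)) {
        groups.set(attribute.value, new Set());
      }
      groups.get(attribute.value).add(doc.productId); // $addToSet: no duplicates
    }
  }
  // $project: rename the group key to attributeValue
  return [...groups].map(([attributeValue, ids]) => ({
    attributeValue,
    data: [...ids].map(productId => ({ productId }))
  }));
}

const variantValues = groupByAttributeValue(products);
console.log(JSON.stringify(variantValues.find(g => g.attributeValue === "500 GB")));
// {"attributeValue":"500 GB","data":[{"productId":1},{"productId":3}]}
```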

unable to group date and field in mongodb

I am trying to group by the date and name fields and then find the count for each day. I am not able to separate the date from the time, so my approach is to group on the full date and time in ISO format.
My approach:-
db.getCollection('blog').aggregate([
{ "$group": {
"_id": {"name":"$name","date":"$date"},
"Count": {
"$sum": {
"$sum": "$Count"
}
}
}},
{ "$project": {
"name": "$_id",
"Count": "$Count",
"_id": 0
}}
]).toArray()
input:-
{ "_id" : ObjectId("1"), "Count" : 4 , "name" : Ram, "date" : ISODate("2017-02-01T00:00:00Z") }
{ "_id" : ObjectId("2"), "Count" : 4, "name" : Arjun, "date" : ISODate("2017-01-08T00:00:00Z") }
{ "_id" : ObjectId("3"), "Count" :2 , "name" : Ram, "date" : ISODate("2017-02-01T00:00:00Z")}
{ "_id" : ObjectId("4"), "Count" : 2, "name" : Arjun, "date" : ISODate("2017-01-08T00:00:00Z") }
{ "_id" : ObjectId("5"), "Count" : 6, "name" : Arjun, "date" : ISODate("2017-01-08T00:00:00Z") }
{ "_id" : ObjectId("6"), "Count" : 6, "name" : Shyam, "date" : ISODate("2017-02-09T00:00:00Z")}
{ "_id" : ObjectId("7"), "count" : 1, "name" : Shyam, "date" : ISODate("2017-02-03T00:00:00Z") }
{ "_id" : ObjectId("8"), "loginID" : 2, "name" : Arjun, "date" : ISODate("2017-02-08T00:00:00Z") }
Expected output:-
{
name:Ram,
Count:6,
date:2017-02-01
}
{
name: Arjun,
Count:12,
date:2017-02-08
}
{
name: Arjun,
Count:2,
date:2017-01-08
}
....Something like that
You are using $sum twice in your $group stage; you only need it once. The $project stage also needs to read name and date from the compound _id:
db.getCollection('blog').aggregate([
{
$group: {
_id: {
name: "$name",
date: "$date"
},
Count: {
$sum: "$Count"
}
}
},
{
$project: {
name: "$_id.name",
date: "$_id.date",
Count: 1,
_id: 0
}
}
])
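For comparison, here is a small in-memory sketch of the same grouping in plain JavaScript, using the input documents from the question (ids omitted, names quoted). It truncates each date to its calendar day, which is roughly what $dateToString with the "%Y-%m-%d" format would do on the server, and it treats a missing Count as 0, similar to how $sum ignores non-numeric values. The function name is hypothetical.

```javascript
const docs = [
  { Count: 4, name: "Ram", date: new Date("2017-02-01T00:00:00Z") },
  { Count: 4, name: "Arjun", date: new Date("2017-01-08T00:00:00Z") },
  { Count: 2, name: "Ram", date: new Date("2017-02-01T00:00:00Z") },
  { Count: 2, name: "Arjun", date: new Date("2017-01-08T00:00:00Z") },
  { Count: 6, name: "Arjun", date: new Date("2017-01-08T00:00:00Z") },
  { Count: 6, name: "Shyam", date: new Date("2017-02-09T00:00:00Z") },
  { count: 1, name: "Shyam", date: new Date("2017-02-03T00:00:00Z") },   // lowercase "count", as in the question
  { loginID: 2, name: "Arjun", date: new Date("2017-02-08T00:00:00Z") }  // no Count field at all
];

// Hypothetical helper: sum Count per (name, calendar day in UTC).
function sumCountByNameAndDay(rows) {
  const totals = new Map();
  for (const row of rows) {
    const day = row.date.toISOString().slice(0, 10); // "YYYY-MM-DD", time part dropped
    const key = `${row.name}|${day}`;
    const entry = totals.get(key) || { name: row.name, date: day, Count: 0 };
    entry.Count += typeof row.Count === "number" ? row.Count : 0; // missing Count adds nothing
    totals.set(key, entry);
  }
  return [...totals.values()];
}

const totalsByDay = sumCountByNameAndDay(docs);
console.log(totalsByDay);
```

With this input, Ram on 2017-02-01 totals 6 and Arjun on 2017-01-08 totals 12, while the two documents without a Count field end up with a total of 0.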

Modify the query to get the expected result

I am trying to modify a query to get the expected output. I am able to write the query, but I am not getting the output in the shape I need to bind it in the front end.
Actual output:-
{
"_id" : null,
"first" : 3571.0,
"second" : 24.0
}
Expected output:-
{ "_id" : null,
"operation" : "edit",
"count" : 3571.0
}
{ "_id" : null,
"operation" : "read",
"count" : 24
}
{ "_id" : null,
"operation" : "update",
"count" : 9000
}
My query:-
db.getCollection('blog').aggregate([
{ "$group": {
"_id": null,
"first": {
"$sum": {
"$cond": [{ "$in": ["$Operation", ["edit1", "edit2"]] }, 1, 0]
}
},
"second": {
"$sum": {
"$cond": [{ "$in": ["$Operation", ["read1", "read2"]] }, 1, 0]
}
}
},
},
])
If you have a collection like the one below:
[
{
"_id" : 1,
"operation" : "edit1" // plus some extra fields
},
{
"_id" : 2,
"operation" : "read1"
},
{
"_id" : 3,
"operation" : "update1"
}
]
By using $project and $cond you can map "read1" and "read2" to read, the update variants to update, and the edit variants to edit. Then, by grouping on the new operation field, you can get the count of each operation.
You can use this query:
db.getCollection('blog').aggregate([
  {
    "$project": {
      "new_operation": {
        "$cond": [
          // field names are case-sensitive: use the name stored in your documents ("operation" in the sample above)
          { "$in": ["$operation", ["edit1", "edit2"]] },
          "edit",
          {
            "$cond": [
              { "$in": ["$operation", ["read1", "read2"]] },
              "read",
              "update"
            ]
          }
        ]
      }
    }
  },
  {
    "$group": {
      "_id": "$new_operation",
      "count": { "$sum": 1 }
    }
  }
])
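The same rename-then-count logic can be sketched in plain JavaScript to check the expected shape of the output. This runs against the three sample documents above; the function names are hypothetical.

```javascript
const docs = [
  { _id: 1, operation: "edit1" },
  { _id: 2, operation: "read1" },
  { _id: 3, operation: "update1" }
];

// Mirrors the nested $cond: edit variants -> "edit", read variants -> "read", everything else -> "update".
function normalizeOperation(operation) {
  if (["edit1", "edit2"].includes(operation)) return "edit";
  if (["read1", "read2"].includes(operation)) return "read";
  return "update";
}

// Mirrors the $group stage: one output document per normalized operation, with a count.
function countByOperation(rows) {
  const counts = new Map();
  for (const row of rows) {
    const name = normalizeOperation(row.operation);
    counts.set(name, (counts.get(name) || 0) + 1);
  }
  return [...counts].map(([operation, count]) => ({ _id: operation, count }));
}

const operationCounts = countByOperation(docs);
console.log(operationCounts);
// [ { _id: 'edit', count: 1 }, { _id: 'read', count: 1 }, { _id: 'update', count: 1 } ]
```

As in the pipeline, anything that is not an edit or read variant falls through to "update", so unexpected operation names would be counted there too.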
