I have a series of documents in MongoDB that look like this:
{
"_id" : ObjectId("63ceb466db8c0f5500ea0aaa"),
"Partner_ID" : "662347848",
"EarningsData" : [
{
"From_Date" : ISODate("2022-01-10T18:30:00.000Z"),
"Scheme_Name" : "CUSTOMERWINBACKJCA01",
"Net_Amount" : 256,
},
{
"From_Date" : ISODate("2022-02-10T18:30:00.000Z"),
"Scheme_Name" : "CUSTOMERWINBACKJCA01",
"Net_Amount" : 285,
}
],
"createdAt" : ISODate("2023-01-23T16:23:02.440Z")
}
Now, what I need to do is to get the sum of Net_Amount per Scheme_Name per month of From_Date for the specific Partner_ID.
For the above document, the output will look something like this:
[
{
"Month" : 1,
"Scheme_Name" : 'CUSTOMERWINBACKJCA01'
"Net_Amount": 256
},
{
"Month" : 2,
"Scheme_Name" : 'CUSTOMERWINBACKJCA01'
"Net_Amount": 285
}
]
I have tried to implement the aggregation pipeline and was able to get the sum of Net_Amount per Scheme_Name, but I am not able to figure out how to integrate the per-month From_Date logic.
Below is the query sample:
var projectQry = [
{
"$unwind": {
path : '$EarningsData',
preserveNullAndEmptyArrays: true
}
},
{
$match: {
"Partner_ID": userId
}
},
{
$group : {
_id: "$EarningsData.Scheme_Name",
Net_Amount: {
$sum: "$EarningsData.Net_Amount"
}
}
},
{
$project: {
_id: 0,
Scheme_Name: "$_id",
Net_Amount: 1
}
}
];
There are a few issues to fix:
$match: move this stage first for better performance; it can then use an index, if you have created one
$unwind: the preserveNullAndEmptyArrays option isn't needed here, since it would keep documents with null or empty arrays
$group: group by Scheme_Name and the month of From_Date, and sum Net_Amount with the $sum operator
$project: show the required fields
db.collection.aggregate([
{ $match: { "Partner_ID": "662347848" } },
{ $unwind: "$EarningsData" },
{
$group: {
_id: {
Scheme_Name: "$EarningsData.Scheme_Name",
Month: {
$month: "$EarningsData.From_Date"
}
},
Net_Amount: {
$sum: "$EarningsData.Net_Amount"
}
}
},
{
$project: {
_id: 0,
Net_Amount: 1,
Scheme_Name: "$_id.Scheme_Name",
Month: "$_id.Month"
}
}
])
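If the output should also be ordered, you could append a $sort stage after the $project (a minimal optional sketch, not part of the original pipeline):
// Optional final stage: order results by month, then by scheme name
{ $sort: { Month: 1, Scheme_Name: 1 } }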
I want to group by name, then find the percentage of filled documents out of the total documents. The data is given below (fill: 0 means not filled):
{"name":"Raj","fill":0}
{"name":"Raj","fill":23}
{"name":"Raj","fill":0}
{"name":"Raj","fill":43}
{"name":"Rahul","fill":0}
{"name":"Rahul","fill":23}
{"name":"Rahul","fill":0}
{"name":"Rahul","fill":43}
{"name":"Rahul","fill":43}
{"name":"Rahul","fill":43}
Result :-
{
"name":"Raj",
fillcount:2,
fillpercentagetototaldocument: 50% // 2 (fill count, excluding 0 values) divided by 4 (total documents for Raj)
}
{
"name":"Rahul",
fillcount:4,
fillpercentagetototaldocument: 66% // 4 (fill count, excluding 0 values) divided by 6 (total documents for Rahul)
}
You want to use $group combined with a conditional count, like so:
db.collection.aggregate([
{
$group: {
_id: "$name",
total: {
$sum: 1
},
fillcount: {
$sum: {
$cond: [
{
$ne: [
"$fill",
0
]
},
1,
0
]
}
}
}
},
{
$project: {
_id: 0,
name: "$_id",
fillcount: 1,
fillpercentagetototaldocument: {
"$multiply": [
{
"$divide": [
"$fillcount",
"$total"
]
},
100
]
}
}
}
])
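On the sample data above, this pipeline should return something like the following (Rahul's percentage rounded for readability; document order may vary):
[
  { "fillcount": 2, "name": "Raj", "fillpercentagetototaldocument": 50 },
  { "fillcount": 4, "name": "Rahul", "fillpercentagetototaldocument": 66.67 }
]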
You can use MongoDB's aggregation framework to achieve that.
Example:
db.getCollection('CollectionName').aggregate([
{
$group: {
_id: { name: '$name'},
fillpercentagetototaldocument: { $sum: '$fill' },
fillCount:{$sum:1}
},
},
{ $sort: { fillpercentagetototaldocument: -1 } },
]);
The result will look like this afterwards:
[
{
"_id" : {
"name" : "Rahul"
},
"fillpercentagetototaldocument" : 152,
"fillCount" : 6.0
},
{
"_id" : {
"name" : "Raj"
},
"fillpercentagetototaldocument" : 66,
"fillCount" : 4.0
}
]
I have a set of documents (messages) in a MongoDB collection, as below. I want to preserve only the latest 500 records for each individual user pair. Users are identified by sentBy and sentTo.
/* 1 */
{
"_id" : ObjectId("5f1c1b00c62e9b9aafbe1d6c"),
"sentAt" : ISODate("2020-07-25T11:44:00.004Z"),
"readAt" : ISODate("1970-01-01T00:00:00.000Z"),
"msgBody" : "dummy text",
"msgType" : "text",
"sentBy" : ObjectId("54d6732319f899c704b21ef7"),
"sentTo" : ObjectId("54d6732319f899c704b21ef5"),
}
/* 2 */
{
"_id" : ObjectId("5f1c1b3cc62e9b9aafbe1d6d"),
"sentAt" : ISODate("2020-07-25T11:45:00.003Z"),
"readAt" : ISODate("1970-01-01T00:00:00.000Z"),
"msgBody" : "dummy text",
"msgType" : "text",
"sentBy" : ObjectId("54d6732319f899c704b21ef9"),
"sentTo" : ObjectId("54d6732319f899c704b21ef8"),
}
/* 3 */
{
"_id" : ObjectId("5f1c1b78c62e9b9aafbe1d6e"),
"sentAt" : ISODate("2020-07-25T11:46:00.003Z"),
"readAt" : ISODate("1970-01-01T00:00:00.000Z"),
"msgBody" : "dummy text",
"msgType" : "text",
"sentBy" : ObjectId("54d6732319f899c704b21ef6"),
"sentTo" : ObjectId("54d6732319f899c704b21ef8"),
}
/* 4 */
{
"_id" : ObjectId("5f1c1c2e1449dd9bbef28575"),
"sentAt" : ISODate("2020-07-25T11:49:02.012Z"),
"readAt" : ISODate("1970-01-01T00:00:00.000Z"),
"msgBody" : "dummy text",
"msgType" : "text",
"sentBy" : ObjectId("54cfcf93e2b8994c25077924"),
"sentTo" : ObjectId("54d6732319f899c704b21ef5"),
}
/* and so on... assume there are 10k+ documents */
The algorithm that came to my mind is:
Grouping first based on the OR operator
Sorting the records in descending order by time
Limit it to 500
Get the array of _id values that should be preserved
Pass those IDs to a new .deleteMany() query with a $nin condition
Please help; I have struggled a lot with this and have not had any success. Many thanks :)
Depending on scale, I would do one of the two following:
Assuming the scale is somewhat low and you can actually group the entire collection in a reasonable time, I would do something similar to what you suggested:
db.collection.aggregate([
{
$sort: {
sentAt: 1
}
},
{
$group: {
_id: {
$cond: [
{$gt: ["$sentBy", "$sentTo"]},
["$sendBy", "$sentTo"],
["$sentTo", "$sendBy"],
]
},
roots: {$push: "$$ROOT"}
}
},
{
$project: {
roots: {$slice: ["$roots", -500]}
}
},
{
$unwind: "$roots"
},
{
$replaceRoot: {
newRoot: "$roots"
}
},
{
$out: "this_collection"
}
])
The sort stage has to come first, as you can't sort an inner array after the group. The $cond in the group stage builds an order-independent pair key, standing in for the $or logic, which can't be used there. Finally, instead of retrieving the result and then using deleteMany with $nin, you can just use $out to rewrite the current collection.
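To make the order-independent key concrete, here is a minimal projection-only sketch against the same collection, showing that an A→B message and a B→A message produce the same group key:
db.collection.aggregate([
  {
    $project: {
      // the larger of the two ids always comes first, so the key is the same
      // no matter which side sent the message
      pairKey: {
        $cond: [
          { $gt: ["$sentBy", "$sentTo"] },
          ["$sentBy", "$sentTo"],
          ["$sentTo", "$sentBy"]
        ]
      }
    }
  }
])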
If the scale is too big to support this, then you should just iterate user by user and do what you suggested at first. Here is a quick example:
let userIds = await db.collection.distinct("sentBy");
let done = [1];
for (let i = 0; i < userIds.length; i++) {
let matches = await db.collection.aggregate([
{
$match: {
$and: [
{
$or: [
{
"sentTo": userIds[i]
},
{
"sendBy": userIds[i]
}
]
},
{ // this is not necessary, it's just to avoid processing the same pair twice (Z-Y and Y-Z)
$or: [
{
sentTo: {$nin: done}
},
{
sentBy: {$nin: done}
}
]
}
]
}
},
{
$sort: {
sentAt: 1
}
},
{
$group: {
_id: {
$cond: [
{$eq: ["$sentBy", userIds[i]]},
"$sendTo",
"$sentBy"
]
},
roots: {$push: "$$ROOT"}
}
},
{
$project: {
roots: {$slice: ["$roots", -500]}
}
},
{
$unwind: "$roots"
},
{
$group: {
_id: null,
keepers: {$push: "$roots._id"}
}
}
]).toArray();
if (matches.length) {
await db.collection.deleteMany(
{
$and: [
{
$or: [
{
"sentTo": userIds[i]
},
{
"sendBy": userIds[i]
}
]
},
{ // this is only necessary if you used it above.
$or: [
{
sentTo: {$nin: done}
},
{
sentBy: {$nin: done}
}
]
},
{
_id: {$nin: matches[0].keepers}
}
]
}
)
}
done.push(userIds[i])
}
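In either approach, the repeated matching on sentBy/sentTo is much cheaper with supporting indexes. A sketch, assuming the collection is called messages (each branch of the $or can then be resolved with its own index):
// Indexes on the sender and receiver fields, with the timestamp as a suffix
db.messages.createIndex({ sentBy: 1, sentAt: 1 })
db.messages.createIndex({ sentTo: 1, sentAt: 1 })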
I have this structure in a collection.
I'd like to group the documents by month based on the date_created field with this code.
db.activities.aggregate(
[
{ $match: { type: "Expense", "user_hash": "xxx" } },
{
$group: {
_id: { month: { $month: "$date_created" } },
total: { $sum: "$amount" }
}
},
]
);
I got this exception when running it:
2020-07-19T23:13:01.652+0700 E QUERY [js] uncaught exception: Error: command failed: {
"operationTime" : Timestamp(1595175178, 1),
"ok" : 0,
"errmsg" : "can't convert from BSON type int to Date",
"code" : 16006,
"codeName" : "Location16006",
"$clusterTime" : {
"clusterTime" : Timestamp(1595175178, 1),
"signature" : {
"hash" : BinData(0,"xxx"),
"keyId" : NumberLong("xxx")
}
}
} : aggregate failed :
_getErrorWithCode#src/mongo/shell/utils.js:25:13
doassert#src/mongo/shell/assert.js:18:14
_assertCommandWorked#src/mongo/shell/assert.js:583:17
assert.commandWorked#src/mongo/shell/assert.js:673:16
DB.prototype._runAggregate#src/mongo/shell/db.js:266:5
DBCollection.prototype.aggregate#src/mongo/shell/collection.js:1012:12
#(shell):1:1
Any solution? Thank you in advance.
You'll have to convert the data from type int to Date. You can do that using the $toDate operator (available from MongoDB v4.0).
Apparently you are storing the timestamps in seconds, so you'll have to multiply by 1000 to obtain timestamps in milliseconds before passing them to $toDate:
db.activities.aggregate(
[
{ $match: { type: "Expense", "user_hash": "xxx" } },
{
$group: {
_id: {
month: {
$month: { $toDate: { $multiply: ["$date_created", 1000] } }
}
},
total: { $sum: "$amount" }
}
},
]
);
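If you are on a MongoDB version older than 4.0, where $toDate is not available, the same conversion can be done by adding the millisecond value to the Unix epoch, since $add applied to a Date and a number returns a Date. A minimal sketch of just the month expression:
// Equivalent to $toDate: new Date(0) is the Unix epoch, and $add with a Date
// and a number of milliseconds yields a Date
month: {
  $month: { $add: [new Date(0), { $multiply: ["$date_created", 1000] }] }
}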
My MongoDB data is like this; I want to filter memoryLine.
{
"_id" : ObjectId("5e36950f65fae21293937594"),
"userId" : "5e33ee0b4a3895a6d246f3ee",
"notes" : [
{
"noteId" : ObjectId("5e36953665fae212939375a0"),
"time" : ISODate("2020-02-02T17:24:06.460Z"),
"memoryLine" : [
{
"_id" : ObjectId("5e36953665fae212939375ab"),
"memoryTime" : ISODate("2020-02-03T17:54:06.460Z")
},
{
"_id" : ObjectId("5e36953665fae212939375aa"),
"memoryTime" : ISODate("2020-02-03T05:24:06.460Z")
}
]
}
]}
I want to get the items whose memoryTime is greater than now, with expected output like this:
"userId" : "5e33ee0b4a3895a6d246f3ee",
"notes" : [
{
"noteId" : ObjectId("5e36953665fae212939375a0"),
"time" : ISODate("2020-02-02T17:24:06.460Z"),
"memoryLine" : [
{
"_id" : ObjectId("5e36953665fae212939375ab"),
"memoryTime" : ISODate("2020-02-03T17:54:06.460Z")
},
{
"_id" : ObjectId("5e36953665fae212939375aa"),
"memoryTime" : ISODate("2020-02-03T05:24:06.460Z")
}
]
}]
So I use the code below, with a $filter on memoryLine to get the right items.
aggregate([{
$match: {
"$and": [
{ userId: "5e33ee0b4a3895a6d246f3ee"},
]
}
}, {
$project: {
userId: 1,
notes: {
noteId: 1,
time: 1,
memoryLine: {
$filter: {
input: "$memoryLine",
as: "mLine",
cond: { $gt: ["$$mLine.memoryTime", new Date(new Date().getTime() + 8 * 1000 * 3600)] }
}
}
}
}
}]).then(doc => {
res.json({
code: 200,
message: 'success',
result: doc
})
});
But I got this; memoryLine is null. Why? I tried changing $gt to $lt, but also got null.
"userId" : "5e33ee0b4a3895a6d246f3ee",
"notes" : [
{
"noteId" : ObjectId("5e36953665fae212939375a0"),
"time" : ISODate("2020-02-02T17:24:06.460Z"),
"memoryLine" : null <<<------------- here is not right
}]
You can use $addFields to replace the existing field, $map for the outer notes array and $filter for the inner memoryLine array:
db.collection.aggregate([
{
$addFields: {
notes: {
$map: {
input: "$notes",
in: {
$mergeObjects: [
"$$this",
{
memoryLine: {
$filter: {
input: "$$this.memoryLine",
as: "ml",
cond: {
$gt: [ "$$ml.memoryTime", new Date() ]
}
}
}
}
]
}
}
}
}
}
])
$mergeObjects is used to avoid repeating the other fields from the source note object.
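Without $mergeObjects, every field of the note would have to be restated by hand inside the $map. A sketch of what the in: value would look like instead, assuming the notes only contain noteId, time and memoryLine:
// The existing fields are copied manually, and memoryLine is replaced
in: {
  noteId: "$$this.noteId",
  time: "$$this.time",
  memoryLine: {
    $filter: {
      input: "$$this.memoryLine",
      as: "ml",
      cond: { $gt: ["$$ml.memoryTime", new Date()] }
    }
  }
}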
I have the following documents in my collection:
{
"_id": ObjectId("5b8fed64b77d7829788ebdc8"),
"valueId": "6e01c881-c15e-b754-43bd-0fe7381cc02a",
"value": 14,
"date": "2018-09-05T14:51:11.427Z"
}
I want to group the "date" by a certain interval, get for all "valueId" a sum of the "value", which is inside the date interval. My current aggregation looks like this:
myCollection.aggregate([
{
$match: {
date: {
$gte: start,
$lte: end,
},
},
},
{
$group: {
_id: {
$toDate: {
$subtract: [{ $toLong: '$date' }, { $mod: [{ $toLong: '$date' }, interval] }],
},
},
valueId: { $addToSet: '$valueId' },
},
},
{
$project: {
_id: 1,
valueId: 1,
},
},
]);
This gives out something like this:
{
_id: 2018-09-04T15:45:00.000Z,
valueId:[
'cb255343-9c16-f495-9c29-3697d6c7d6cb',
'97e729aa-7b0f-c107-d591-01188b768a7a'
]
}
How can I get it to something like this (simplified with one value):
{
_id: 2018-09-04T15:45:00.000Z,
valueId: [[
'cb255343-9c16-f495-9c29-3697d6c7d6cb',
<sum of value>
]]
}
EDIT:
Final solution:
myCollection.aggregate([
{"$match":{"date":{"$gte":start,"$lte":end}}},
{"$group":{
"_id":{
"interval":{"$toDate":{"$subtract":[{"$toLong":"$date"},{"$mod":[{"$toLong":"$date"},interval]}]}},
"valueId":"$valueId"
},
"value":{"$sum":"$value"}
}},
{ $group: {
_id: "$_id.interval",
values: {
$addToSet: { id: "$_id.valueId", sum: "$value" },}
}}])
You can use multiple $group stages: one to sum value for each valueId and interval combination, and a second to push all the documents for each interval.
myCollection.aggregate([
{"$match":{"date":{"$gte":start,"$lte":end}}},
{"$group":{
"_id":{
"interval":{"$toDate":{"$subtract":[{"$toLong":"$date"},{"$mod":[{"$toLong":"$date"},interval]}]}},
"valueId":"$valueId"
},
"value":{"$sum":"$value"}
}},
{"$group":{
"_id":"$_id.interval",
"valueId":{"$push":{"id":"$_id.valueId","sum":"$value"}}
}}
])
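With the interval and valueIds from the question's intermediate output, the second $group produces one document per interval holding the per-valueId sums (using the question's <sum of value> placeholder):
{
  "_id": ISODate("2018-09-04T15:45:00.000Z"),
  "valueId": [
    { "id": "cb255343-9c16-f495-9c29-3697d6c7d6cb", "sum": <sum of value> },
    { "id": "97e729aa-7b0f-c107-d591-01188b768a7a", "sum": <sum of value> }
  ]
}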