I'm trying to $sum values in an aggregation over documents whose dates fall between two bounds; the dates are stored as UTC strings (yyyy-mm-dd-hh). It takes 5+ seconds to get the results, and my collection has 5 million+ docs.
{
$match: {
start: {
$gte: '2020-08-01-00',
$lte: '2021-08-01-00'
}
}
},
{
$group: {
_id: {
symbol: '$symbol'
},
unverifiedCount: {
$sum: {
$cond: {
if: { $eq: ['$isVerified', false] }, then: '$count', else: 0
}
}
},
verifiedCount: {
$sum: {
$cond: {
if: { $eq: ['$isVerified', true]}, then: '$count', else: 0
}
}
}
}
}, {
$sort: {
unverifiedCount: -1
}
}
I tried using $toDateString, but performance remained the same.
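One detail worth checking before tuning anything: the $gte/$lte bounds compare strings, and that only matches chronological order because every part of yyyy-mm-dd-hh is zero-padded. Here is a minimal sketch of that property in plain JavaScript. toHourKey and inRange are hypothetical helpers that are not part of the question, and the index mentioned in the closing comment is an assumption about the collection.

```javascript
// Hypothetical helper: format a Date as the 'yyyy-mm-dd-hh' UTC key used in `start`.
function toHourKey(date) {
  // toISOString() yields e.g. '2020-08-01T00:00:00.000Z'; keep everything up to the hour.
  return date.toISOString().slice(0, 13).replace('T', '-');
}

// True when `key` lies in the inclusive [from, to] range, using the same kind of
// plain string comparison that $gte / $lte apply to string values.
function inRange(key, from, to) {
  return key >= from && key <= to;
}

const from = toHourKey(new Date(Date.UTC(2020, 7, 1, 0))); // months are 0-based
const to = toHourKey(new Date(Date.UTC(2021, 7, 1, 0)));
const sample = toHourKey(new Date(Date.UTC(2020, 11, 31, 23)));

console.log(from, to, sample, inRange(sample, from, to));

// Assumption: if no index covers `start`, the $match has to scan the collection.
// An index such as db.collection.createIndex({ start: 1 }) would let the range
// filter use it; how much that helps depends on how many documents fall inside
// the range, which explain() can show.
```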
How do I add a count inside a MongoDB aggregation? A document in my collection looks like this:
{
_id,
Sat,
calories,
carbs,
category,
fat,
fiber,
food,
grams,
measure,
protein,
sat.fat
}
I want to write a query that returns all documents where "food" matches a search string, projecting only the fields the user selects in the frontend (for example, only "protein" and "carbs"). I also want to add a count field that holds the number of documents found. So the output should look something like this:
{
nutrientList: [
// here come all found projected documents
],
count: numberOfFoundDocs
}
I tried the following, but it doesn't work:
const { macros, foodName, skip } = req.body;
// search word
var keywords = [
foodName,
foodName.toLocaleLowerCase(),
foodName.toLocaleUpperCase(),
foodName.toLowerCase(),
foodName.toUpperCase(),
],
regex = keywords.join("|");
Nutrient.aggregate([
{ $match: { food: { $regex: regex } } },
{
$project: {
protein: {
$cond: {
if: { $eq: [false, macros.protein] },
then: "$$REMOVE",
else: "$protein",
},
},
carbs: {
$cond: {
if: { $eq: [false, macros.carbohydrate] },
then: "$$REMOVE",
else: "$carbs",
},
},
fat: {
$cond: {
if: { $eq: [false, macros.fat] },
then: "$$REMOVE",
else: "$fat",
},
},
"sat.fat": {
$cond: {
if: { $eq: [false, macros.satFat] },
then: "$$REMOVE",
else: "$sat.fat",
},
},
fiber: {
$cond: {
if: { $eq: [false, macros.fiber] },
then: "$$REMOVE",
else: "$fiber",
},
},
calories: 1,
food: 1,
grams: 1,
measure: 1,
category: 1
},
},
{
$addFields: {
'totalCount': {$count: {}}
}
}
])
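One possible way to get the projected documents and their count in a single result is to end the pipeline with a $group on _id: null that pushes every document into an array and adds 1 per document. The sketch below only builds the pipeline, so it runs without a database. buildNutrientPipeline and MACRO_FIELDS are hypothetical names, $options: 'i' stands in for the list of upper/lower-case keyword variants, and the dotted "sat.fat" field is left out on purpose because dotted field names need separate handling.

```javascript
// Hypothetical mapping from the request's macro flags to document field names.
const MACRO_FIELDS = { protein: 'protein', carbohydrate: 'carbs', fat: 'fat', fiber: 'fiber' };

function buildNutrientPipeline(foodName, macros) {
  // Fields that are always returned, as in the original $project.
  const projection = { food: 1, calories: 1, grams: 1, measure: 1, category: 1 };
  // Add only the macros the user switched on.
  for (const [flag, field] of Object.entries(MACRO_FIELDS)) {
    if (macros[flag]) projection[field] = 1;
  }
  return [
    { $match: { food: { $regex: foodName, $options: 'i' } } },
    { $project: projection },
    // Collapse everything into one document: the list plus its length.
    { $group: { _id: null, nutrientList: { $push: '$$ROOT' }, count: { $sum: 1 } } },
    { $project: { _id: 0, nutrientList: 1, count: 1 } },
  ];
}

const pipeline = buildNutrientPipeline('rice', { protein: true, carbohydrate: true, fat: false });
console.log(JSON.stringify(pipeline, null, 2));
// Usage (assumption: Nutrient is the model from the question):
//   const [result] = await Nutrient.aggregate(pipeline);
```

Two caveats with this shape: the aggregation returns an empty array when nothing matches, so result can be undefined, and the single grouped document has to stay under the 16 MB document limit.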
Using the following code, I do get totalAccount and totalBalance, but no other fields or data show up. How can I also get all the data from my collection that matches my query (brcode)?
const test = await db.collection('alldeposit').aggregate([
{
$match: {
brcode: brcode
}
},
{
$group: {
_id: null,
totalAccount: {
$sum: 1
},
totalBalance: {
$sum: "$acbal"
}
}
}
]).toArray()
You have to specify in the $group stage which fields you want to see.
For example:
await db.collection('alldeposit').aggregate([
{
$match: {
brcode: brcode
}
},
{
$group: {
_id : null,
name : { $first: '$name' },
age : { $first: '$age' },
sex : { $first: '$sex' },
province : { $first: '$province' },
city : { $first: '$city' },
area : { $first: '$area' },
address : { $first: '$address' },
totalAccount: {
$sum: 1
},
totalBalance: {
$sum: "$acbal"
}
}
}]);
Edit:
Regarding our chat in the comments: unfortunately, I don't know a way to do the operation you asked for in a single aggregation.
But with two steps, you can do it:
First step:
db.collection.aggregate([
{
$match: {
brcode: brcode
}
},
{
"$group": {
"_id": null,
totalAccount: {
$sum: 1
},
totalBalance: {
$sum: "$acbal"
}
}
}
])
And second step:
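// totalAccount and totalBalance come from the result of the first step.
// updateMany writes them to every document with this brcode
// (update without multi: true would change only the first match).
db.collection.updateMany(
{ brcode: brcode },
{ $set: {
"totalAccount": totalAccount,
"totalBalance": totalBalance
}}
)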
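The two steps can be wired together in application code roughly as follows. This is a sketch under assumptions: applyTotals is a hypothetical helper, and collection is assumed to be a Node.js driver collection handle that exposes aggregate(...).toArray() and updateMany(...).

```javascript
// Sketch: `collection` is assumed to be a Node.js driver collection handle.
async function applyTotals(collection, brcode) {
  // Step 1: compute the totals for this brcode.
  const [totals] = await collection.aggregate([
    { $match: { brcode: brcode } },
    { $group: { _id: null, totalAccount: { $sum: 1 }, totalBalance: { $sum: '$acbal' } } },
  ]).toArray();

  // No matching documents: $group emits nothing, so there is nothing to write.
  if (!totals) return null;

  // Step 2: store the totals on every document with this brcode.
  await collection.updateMany(
    { brcode: brcode },
    { $set: { totalAccount: totals.totalAccount, totalBalance: totals.totalBalance } }
  );
  return totals;
}
```

Keep in mind that the stored totals go stale as soon as documents for that brcode are added or changed, so they would need to be recomputed after such writes.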
I'm using the aggregation framework to query a collection and build an array of active players (everything up to the last $lookup). After that, I'm trying to use $lookup with a pipeline to select all the players from another collection (users) that are not present in the activeUsers array.
Is there any way of doing this with my current setup?
Game.aggregate([{
$match: {
date: {
$gte: ISODate('2021-04-10T00:00:00.355Z')
},
gameStatus: 'played'
}
}, {
$unwind: {
path: '$players',
preserveNullAndEmptyArrays: false
}
}, {
$group: {
_id: '$players'
}
}, {
$group: {
_id: null,
activeUsers: {
$push: '$_id'
}
}
}, {
$project: {
activeUsers: true,
_id: false
}
}, {
$lookup: {
from: 'users',
'let': {
active: '$activeUsers'
},
pipeline: [{
$match: {
deactivated: false,
// The rest of the query works fine but here I would like to
// select only the elements that *aren't* inside
// the array (instead of the ones that *are* inside)
// but if I use '$nin' here mongoDB throws
// an 'unrecognized' error
$expr: {
$in: [
'$_id',
'$$active'
]
}
}
},
{
$project: {
_id: 1
}
}
],
as: 'users'
}
}])
Thanks
$nin is a query operator and is not available as an aggregation expression inside $expr. For a negated condition, wrap the $in expression in $not:
{ $expr: { $not: { $in: ['$_id', '$$active'] } } }
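To make the semantics concrete, here is a plain JavaScript model of what the negated sub-pipeline selects; it runs without a database. inactiveUserIds and the sample data are hypothetical, and the ids are plain strings here, whereas real ObjectId values would need to be compared by value.

```javascript
// Plain-JS model of:
//   { deactivated: false, $expr: { $not: { $in: ['$_id', '$$active'] } } }
function inactiveUserIds(users, activeIds) {
  const active = new Set(activeIds);
  return users
    .filter((u) => u.deactivated === false && !active.has(u._id))
    .map((u) => u._id); // mirrors { $project: { _id: 1 } }
}

// Hypothetical sample data.
const users = [
  { _id: 'u1', deactivated: false },
  { _id: 'u2', deactivated: false },
  { _id: 'u3', deactivated: true },
];
console.log(inactiveUserIds(users, ['u1'])); // [ 'u2' ]
```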
Is there a way to sort in MongoDB by $indexOfCP without adding it as a field?
This is what I'm currently doing:
Food.aggregate([
{ $match: { name: { $regex: search } } },
{ $addFields: { score: { $indexOfCP: ['$name', search] } } },
{ $sort: { score: 1 } }
])
How can I do that 👆 without $addFields?
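As far as I know, $sort only accepts field names (plus $meta), so the computed value has to exist as a field while the sort runs. One workaround, sketched here, keeps the temporary field but drops it again right after the sort with an exclusion $project, so it never shows up in the output. buildSearchPipeline is a hypothetical helper that only builds the pipeline.

```javascript
function buildSearchPipeline(search) {
  return [
    { $match: { name: { $regex: search } } },
    // Temporary sort key: position of the first occurrence of `search` in `name`.
    { $addFields: { score: { $indexOfCP: ['$name', search] } } },
    { $sort: { score: 1 } },
    // Remove the temporary field so it never reaches the caller.
    { $project: { score: 0 } },
  ];
}

console.log(JSON.stringify(buildSearchPipeline('app')));
// Usage (assumption: Food is the model from the question):
//   const foods = await Food.aggregate(buildSearchPipeline(search));
```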
I have generated a histogram by the following command:
db.mydb.aggregate([{ $bucketAuto: { groupBy: "$userId", buckets: 1e9 } }])
Assuming I have fewer than 1 billion unique users (and sufficient memory), this gives me the count of documents for each user.
User   Docs
=====  ====
userA  3
userB  1
userC  5
userD  1
I want to take the result of this histogram and pivot to count the number of users for each document count.
The result would look like:
Docs  Users
====  =====
1     2
2     0
3     1
4     0
5     1
Is there a simple, functional way of doing this in MongoDB?
You can start with a simple $group stage:
db.col.aggregate([
{
$group: {
_id: "$docs",
count: { $sum: 1 }
}
},
{
$project: {
_id: 0,
docs: "$_id",
users: "$count"
}
},
{
$sort: { docs: 1 }
}
])
This will give you the following result:
{ "docs" : 1, "users" : 2 }
{ "docs" : 3, "users" : 1 }
{ "docs" : 5, "users" : 1 }
What is still missing are the document counts that have no users. You can add them either in your application or in MongoDB (shown below):
db.col.aggregate([
{
$group: {
_id: "$docs",
count: { $sum: 1 }
}
},
{
$group: {
_id: null,
histogram: { $push: "$$ROOT" }
}
},
{
$project: {
values: {
$map: {
input: { $range: [ { $min: "$histogram._id" }, { $add: [ { $max: "$histogram._id" }, 1 ] } ] },
in: {
docs: "$$this",
users: {
$let: {
vars: {
current: { $arrayElemAt: [ { $filter: { input: "$histogram", as: "h", cond: { $eq: [ "$$h._id", "$$this" ] } } }, 0 ] }
},
in: {
$ifNull: [ "$$current.count", 0 ]
}
}
}
}
}
}
}
},
{
$unwind: "$values"
},
{
$replaceRoot: {
newRoot: "$values"
}
}
])
The idea here is that grouping by null produces a single document containing all the documents from the previous stage. Knowing the $min and $max values, we can generate a $range of numbers and $map that range to either the existing count or a default value of 0. Then we can use $unwind and $replaceRoot to get a single histogram point per document. Output:
{ "docs" : 1, "users" : 2 }
{ "docs" : 2, "users" : 0 }
{ "docs" : 3, "users" : 1 }
{ "docs" : 4, "users" : 0 }
{ "docs" : 5, "users" : 1 }
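If you would rather fill the gaps in the application, the same $range / $map / $ifNull idea can be sketched in plain JavaScript. fillHistogram is a hypothetical helper that takes the output of the first $group stage.

```javascript
// Plain-JS model of the $range / $map / $ifNull stages.
// Input: the output of the first $group, e.g. [{ _id: 1, count: 2 }, ...].
function fillHistogram(histogram) {
  if (histogram.length === 0) return [];
  const counts = new Map(histogram.map((h) => [h._id, h.count]));
  const keys = histogram.map((h) => h._id);
  const min = Math.min(...keys);
  const max = Math.max(...keys);
  const result = [];
  for (let docs = min; docs <= max; docs++) {
    // Existing count, or 0 for a gap (the $ifNull default).
    result.push({ docs, users: counts.has(docs) ? counts.get(docs) : 0 });
  }
  return result;
}

console.log(fillHistogram([{ _id: 1, count: 2 }, { _id: 3, count: 1 }, { _id: 5, count: 1 }]));
```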
mickl's answer definitely got me moving in the right direction. In particular, using $group is a nice improvement over $bucketAuto for this use-case. The trick to layering the histogram was just to use a $group stage more than once within the same aggregate. I guess it's obvious in hindsight.
The complete solution is here:
const h2 = db.mydb.aggregate([
{ $group: { _id: "$userId", count: { $sum: 1 } } },
{ $group: { _id: "$count", count: { $sum: 1 } } },
{ $project: { docs: "$_id", users: "$count" } },
{ $sort: { docs: +1 } }
])
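For anyone who wants to check the logic without a database, the two stacked $group stages can be modeled in plain JavaScript. groupCount and usersPerDocCount are hypothetical helpers, and the sample data mirrors the userA to userD histogram from the question.

```javascript
// Count how often each key occurs; plays the role of
// { $group: { _id: <key>, count: { $sum: 1 } } }.
function groupCount(items, key) {
  const counts = new Map();
  for (const item of items) {
    const k = key(item);
    counts.set(k, (counts.get(k) || 0) + 1);
  }
  return [...counts].map(([_id, count]) => ({ _id, count }));
}

function usersPerDocCount(docs) {
  const perUser = groupCount(docs, (d) => d.userId);    // first $group
  const perCount = groupCount(perUser, (u) => u.count); // second $group
  return perCount
    .map((g) => ({ docs: g._id, users: g.count }))      // $project
    .sort((a, b) => a.docs - b.docs);                   // $sort
}

// Hypothetical sample: userA has 3 docs, userB 1, userC 5, userD 1.
const sampleDocs = [
  ...Array(3).fill({ userId: 'userA' }),
  { userId: 'userB' },
  ...Array(5).fill({ userId: 'userC' }),
  { userId: 'userD' },
];
console.log(usersPerDocCount(sampleDocs));
```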