I'm looking to write a query to aggregate the quantity based on dates.
I have documents that look like this:
{
"_id" : 1234,
"itemNumber" : "item 1",
"date" : ISODate("2021-10-26T21:00:00Z"),
"quantity" : 1,
"__v" : 0
}
And a query like this:
// mongoose
myCollection.aggregate().group({
_id: '$itemNumber',
ninetyDays: {
$sum: {
$and: {
$gte: ["date", dayjs().subtract(90, 'd').toDate()],
$lt: ["date", dayjs().toDate()]
}
}
}
})
In the query above ninetyDays is always 0.
I'm basically looking to get the sum of the quantity given a date range.
Help is much appreciated.
Thank you
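You can use $cond to add 1 or 0 depending on whether your condition matches.
Assuming your date expression is correct this should work, but don't forget the $ in "$date". To sum the quantity rather than count the matching documents, use "$quantity" in place of 1.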
db.collection.aggregate([
{
"$group": {
"_id": "$itemNumber",
"ninetyDays": {
"$sum": {
"$cond": {
"if": {
"$and": [
{ "$gte": ["$date", dayjs().subtract(90, 'd').toDate()] },
{ "$lt": ["$date", dayjs().toDate()] }
]
},
"then": 1,
"else": 0
}
}
}
}
}
])
In the linked example I used Mongo's $$NOW, but if your date expression works it is easier to keep your own code.
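Since the question asks for the summed quantity rather than a count, here is a minimal sketch of the same $group stage that adds "$quantity" for in-range documents. The helper name ninetyDayGroupStage and the use of plain Date arithmetic in place of dayjs are assumptions made for illustration; the function only builds the stage object and does not talk to a database.

```javascript
// Hypothetical helper: builds the $group stage for a rolling window of days.
function ninetyDayGroupStage(now = new Date(), days = 90) {
  // Start of the window, computed with plain millisecond arithmetic.
  const start = new Date(now.getTime() - days * 24 * 60 * 60 * 1000);
  return {
    $group: {
      _id: "$itemNumber",
      ninetyDays: {
        $sum: {
          // Array form of $cond: [if, then, else].
          $cond: [
            { $and: [{ $gte: ["$date", start] }, { $lt: ["$date", now] }] },
            "$quantity", // add the quantity when the date is in range
            0, // otherwise add nothing
          ],
        },
      },
    },
  };
}

const stage = ninetyDayGroupStage(new Date("2021-11-01T00:00:00Z"));
console.log(JSON.stringify(stage, null, 2));
```

The resulting object can be passed as one element of the array given to aggregate().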
I have collections that have dates in an array like:
datesArray: [{
start_date: Date,
end_date: Date
}]
I want only those documents in which every element of datesArray satisfies the date range.
I am using it in aggregation $match operator like:
Model.aggregate([
{
$match: {
'datesArray.start_date': { $gte: new Date('11-01-21') },
'datesArray.end_date': { $lte: new Date('11-30-21') }
}
}
])
I tried with $elemMatch but it matches at least one array element.
I also tried $all with $elemMatch but had no success.
Thank you
$map your datesArray to a boolean array by your date range matching criteria. Perform $allElementsTrue on the result to get your desired result.
inputDate1 and inputDate2 are your inputs. Feel free to update them.
db.collection.aggregate([
{
"$addFields": {
"inputDate1": ISODate("2021-01-01"),
"inputDate2": ISODate("2021-12-31")
}
},
{
"$match": {
$expr: {
"$allElementsTrue": [
{
"$map": {
"input": "$datesArray",
"as": "d",
"in": {
$and: [
{
$gte: [
"$$d.start_date",
"$inputDate1"
]
},
{
$lte: [
"$$d.end_date",
"$inputDate2"
]
}
]
}
}
}
]
}
}
}
])
Here is the Mongo playground for your reference.
You could simply invert each of the criteria and use $nor:
db.collection.aggregate([
{$match: {
$nor: [
{"datesArray.start_date": {$lt: ISODate("2021-11-01")}},
{"datesArray.end_date": {$gt: ISODate("2021-11-30")}}
]
}}
])
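Both answers rest on the same logical identity: every element is inside the range exactly when no element is outside it. A small plain-JavaScript sketch of that identity may make the $nor version easier to trust. The sample documents and the names allInRange and noneOutOfRange are hypothetical; in this JavaScript model an empty datesArray passes both checks.

```javascript
// Hypothetical sample data; field names follow the question.
const docs = [
  { _id: 1, datesArray: [
    { start_date: new Date("2021-11-05"), end_date: new Date("2021-11-10") },
    { start_date: new Date("2021-11-12"), end_date: new Date("2021-11-20") },
  ]},
  { _id: 2, datesArray: [
    { start_date: new Date("2021-11-05"), end_date: new Date("2021-11-10") },
    { start_date: new Date("2021-10-15"), end_date: new Date("2021-11-02") },
  ]},
];

const from = new Date("2021-11-01");
const to = new Date("2021-11-30");

// "$map + $allElementsTrue" style: every element must be in range.
const allInRange = (doc) =>
  doc.datesArray.every((d) => d.start_date >= from && d.end_date <= to);

// "$nor" style: no element may start too early, and none may end too late.
const noneOutOfRange = (doc) =>
  !doc.datesArray.some((d) => d.start_date < from) &&
  !doc.datesArray.some((d) => d.end_date > to);

const matchedA = docs.filter(allInRange).map((d) => d._id);
const matchedB = docs.filter(noneOutOfRange).map((d) => d._id);
console.log(matchedA, matchedB);
```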
I want to find all the documents which are present and have array size greater than 1
My MongoDB collection looks like
{
"_id" : ObjectId("5eaaeedd00101108e1123452"),
"type" : ["admin","teacher","student"]
}
{
"_id" : ObjectId("5eaaeedd00101108e1123453"),
"type" : ["student"],
}
How do I find the documents which have more than 1 type?
You can do something like this. It works on MongoDB 3.6 and later, where $expr is available.
db.collection.find({
$expr: {
$gt: [
{
$size: "$type"
},
1
]
}
})
Working Mongo playground
If you are on an older version, you can instead check that a second array element exists:
db.collection.find({
"type.1": { $exists: true }
})
db.collection.find({type: {$gt: 1}})
Just change the name of the collection.
gt means "greater than"; you can see more about it here.
You can use $gt
db.collectionName.find({"type": {$gt: 1} });
You can use $where
db.collectionName.find( { $where: "this.type.length > 1" } );
A solution that works for this problem is as follows:
db.collection.find({
$expr: {
$gt: [
{
$size: "$arrayfield"
},
1
]
}
})
The plain $gt variants do not work, because they compare the array's elements with 1 rather than its length. Tried it.
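One caveat with the $expr approach: the aggregation $size operator raises an error when its argument is not an array, so documents where the field is missing would make the query fail. The sketch below builds the same filter with an $isArray guard. The helper name arrayLongerThan is hypothetical, and the function only constructs the query object; passing it to find() is left to you.

```javascript
// Hypothetical helper: builds a filter for "array field has more than minLength elements".
function arrayLongerThan(field, minLength) {
  const ref = "$" + field;
  return {
    $expr: {
      $gt: [
        // Fall back to an empty array when the field is missing or not an array.
        { $size: { $cond: [{ $isArray: [ref] }, ref, []] } },
        minLength,
      ],
    },
  };
}

const query = arrayLongerThan("type", 1);
console.log(JSON.stringify(query));
```

Usage would look like db.collection.find(arrayLongerThan("type", 1)).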
I have my data in the following format:
{
"_id" : ObjectId("534fd4662d22a05415000000"),
"product_id" : "50862224",
"ean" : "8808992479390",
"brand" : "LG",
"model" : "37LH3000",
"features" : [
{
"key" : "Screen Format",
"value" : "16:9"
}, {
"key" : "DVD Player / Recorder",
"value" : "No"
}, {
"key" : "Weight in kg",
"value" : "12.6"
}
... so on
]
}
I need to compare the features of one product with others and divide the results into separate categories (100% match, 50-99% match) based on the percentage of features that match.
My initial thought was to prepare a dynamic query with an "or" condition for each feature and do the percentage calculation in PHP, but that means MongoDB will return even those products which have only 1 feature matching. And I think nearly all products of a category might have some feature in common, so I fear I might be processing a lot of products in PHP.
I have two questions basically.
Are there any alternative ways?
And is the data structure I am using good enough to support the functionality I am looking for, or should I consider changing it?
Well, your solution really should be MongoDB specific, otherwise you will end up doing your calculations and possibly the matching on the client side, and that is not going to be good for performance.
So of course what you really want is a way to have that processing happen on the server side:
db.products.aggregate([
// Match the documents that meet your conditions
{ "$match": {
"$or": [
{
"features": {
"$elemMatch": {
"key": "Screen Format",
"value": "16:9"
}
}
},
{
"features": {
"$elemMatch": {
"key" : "Weight in kg",
"value" : { "$gt": "5", "$lt": "8" }
}
}
},
]
}},
// Keep the document and a copy of the features array
{ "$project": {
"_id": {
"_id": "$_id",
"product_id": "$product_id",
"ean": "$ean",
"brand": "$brand",
"model": "$model",
"features": "$features"
},
"features": 1
}},
// Unwind the array
{ "$unwind": "$features" },
// Find the actual elements that match the conditions
{ "$match": {
"$or": [
{
"features.key": "Screen Format",
"features.value": "16:9"
},
{
"features.key" : "Weight in kg",
"features.value" : { "$gt": "5", "$lt": "8" }
},
]
}},
// Count those matched elements
{ "$group": {
"_id": "$_id",
"count": { "$sum": 1 }
}},
// Restore the document and divide the matched elements by the
// number of elements in the "or" condition
{ "$project": {
"_id": "$_id._id",
"product_id": "$_id.product_id",
"ean": "$_id.ean",
"brand": "$_id.brand",
"model": "$_id.model",
"features": "$_id.features",
"matched": { "$divide": [ "$count", 2 ] }
}},
// Sort by the matched percentage
{ "$sort": { "matched": -1 } }
])
Since you know the "length" of the $or condition being applied, you simply need to find out how many of the elements in the "features" array match those conditions. That is what the second $match in the pipeline is all about.
Once you have that count, you simply divide by the number of conditions that were passed in as your $or. The beauty here is that now you can do something useful with this, like sort by that relevance and then even "page" the results server side.
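Because the divisor has to stay in step with the number of $or conditions, it may help to generate the pipeline from a single list. The following is a minimal sketch under that assumption. The function name buildFeatureMatchPipeline is hypothetical, and for brevity it carries only product_id through the $group (via $first) instead of the compound _id used in the pipeline above.

```javascript
// Hypothetical helper: conditions is an array of { key, value } objects.
function buildFeatureMatchPipeline(conditions) {
  if (conditions.length === 0) {
    throw new Error("at least one condition is required");
  }
  return [
    // Keep documents that match at least one condition.
    { $match: { $or: conditions.map((c) => ({ features: { $elemMatch: c } })) } },
    { $unwind: "$features" },
    // Keep only the array elements that actually matched.
    { $match: { $or: conditions.map((c) => ({
      "features.key": c.key,
      "features.value": c.value,
    })) } },
    // Count the matched elements per document.
    { $group: { _id: "$_id", product_id: { $first: "$product_id" }, count: { $sum: 1 } } },
    // The divisor is always the number of conditions supplied.
    { $project: { product_id: 1, matched: { $divide: ["$count", conditions.length] } } },
    { $sort: { matched: -1 } },
  ];
}

const pipeline = buildFeatureMatchPipeline([
  { key: "Screen Format", value: "16:9" },
  { key: "DVD Player / Recorder", value: "No" },
  { key: "Weight in kg", value: "12.6" },
]);
console.log(JSON.stringify(pipeline, null, 2));
```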
Of course if you want some additional "categorization" of this, all you would need to do is add another $project stage to the end of the pipeline:
{ "$project": {
"product_id": 1,
"ean": 1,
"brand": 1,
"model": 1,
"features": 1,
"matched": 1,
"category": { "$cond": [
{ "$eq": [ "$matched", 1 ] },
"100",
{ "$cond": [
{ "$gte": [ "$matched", .7 ] },
"70-99",
{ "$cond": [
{ "$gte": [ "$matched", .4 ] },
"40-69",
"under 40"
]}
]}
]}
}}
Or something similar. The $cond operator is what helps you here.
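For checking the thresholds, the nested $cond reads like an if / else-if chain. A plain-JavaScript mirror of the same logic, with a hypothetical function name matchCategory, could look like this:

```javascript
// Mirrors the nested $cond: the first condition that holds decides the category.
function matchCategory(matched) {
  if (matched === 1) return "100";
  if (matched >= 0.7) return "70-99";
  if (matched >= 0.4) return "40-69";
  return "under 40";
}

console.log([1, 0.75, 0.5, 0.25].map(matchCategory));
```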
The architecture should be fine as you have it as you can have a compound index on the "key" and "value" for the entries in your features array and this should scale well for queries.
Of course if you actually need something more than that, such as faceted searching and results, you can look at solutions like Solr or Elasticsearch. But the full implementation of that would be a bit lengthy for here.
I'm assuming that you'd like to compare the rest of the collection to a given product, which is a textbook example of aggregation:
lookingat = db.products.findOne({product_id:'50862224'})
matches = db.products.aggregate([
{ $unwind: '$features' },
{ $match: { features: { $in: lookingat.features }}},
{ $group: { _id: '$product_id', matchedfeatures: { $sum:1 }}},
{ $sort: { matchedfeatures: -1 }},
{ $limit: 5 },
{ $project: { _id:0, product_id: '$_id',
pctmatch: { $multiply: [ '$matchedfeatures',
100/lookingat.features.length ]}
}}
])
Walking through this briefly from the perspective of a product in the collection that has 6 features, and comparing it to the target product ('lookingat') which has 4 features, 3 of which match:
$unwind turns 1 document with 6 features into 6 otherwise-identical documents with 1 feature each
$match looks for that feature in the target's feature array (be aware that two documents are "equal" only if they have the same field names and values, in the same order), discards the 3 that don't match, and passes along the 3 that do
$group consumes those 3 matching documents and produces a new one that tells you there were 3 documents that matched that product_id
$sort and $limit give you the most relevant results and leave behind all those 1-feature matches you were concerned about
$project lets you rename the _id from the $group step back to product_id and also convert the number of matching features into a percentage (we avoided a $divide operation by recognizing that 2 of the 3 terms in our calculation are constants and can be divided in JS)
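The walk-through above can be imitated on in-memory arrays to see where the numbers come from. This is only a plain-JavaScript sketch with made-up data (a target with 4 features and one candidate with 6, of which 3 match); it compares features by key and value, which is looser than the server's field-order-sensitive document equality.

```javascript
// Hypothetical data: the target has 4 features, the candidate has 6, 3 in common.
const lookingat = { product_id: "T", features: [
  { key: "a", value: "1" }, { key: "b", value: "2" },
  { key: "c", value: "3" }, { key: "d", value: "4" },
]};
const products = [
  { product_id: "P", features: [
    { key: "a", value: "1" }, { key: "b", value: "2" }, { key: "c", value: "3" },
    { key: "x", value: "7" }, { key: "y", value: "8" }, { key: "z", value: "9" },
  ]},
];

const sameFeature = (f, g) => f.key === g.key && f.value === g.value;

// $unwind: one record per (product, feature) pair.
const unwound = products.flatMap((p) =>
  p.features.map((f) => ({ product_id: p.product_id, features: f })));

// $match with $in: keep only features present on the target.
const matched = unwound.filter((r) =>
  lookingat.features.some((t) => sameFeature(t, r.features)));

// $group: count the surviving records per product_id.
const counts = new Map();
for (const r of matched) {
  counts.set(r.product_id, (counts.get(r.product_id) || 0) + 1);
}

// $sort, $limit, $project: order by count and convert to a percentage.
const results = [...counts.entries()]
  .sort((a, b) => b[1] - a[1])
  .slice(0, 5)
  .map(([product_id, n]) => ({
    product_id,
    pctmatch: n * (100 / lookingat.features.length),
  }));
console.log(results);
```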
What I am trying to achieve is, within a given date range, to group users first by time and then by userId.
I tried the query below to group by multiple fields:
ReactiveAggregate(this, Questionaire,
[
{
"$match": {
"time": {$gte: fromDate, $lte: toDate},
"userId": {'$regex' : regex}
}
},
{
$group : {
"_id": {
"userId": "$userId",
"date": { $dateToString: { format: "%Y-%m-%d", date: "$time" } }
},
"total": { "$sum": 1 }
}
}
], { clientCollection: "Questionaire" }
);
But when I execute it on the server side, it shows me the error below:
Exception from sub Questionaire id kndfrx9EuZ5EejKmE
Error: Meteor does not currently support objects other than ObjectID as ids
The message actually says it all, since the "compound" _id value that you are generating via the $group is not actually supported in the clientCollection output which will be published.
The simple solution of course is to not use the resulting _id value from $group as the "final" _id value in the generated output. So just as the example on the project README demonstrates, simply add a $project that removes the _id and renames the present "compound grouping key" as a different property name:
ReactiveAggregate(this, Questionaire,
[
{
"$match": {
"time": {$gte: fromDate, $lte: toDate},
"userId": {'$regex' : regex}
}
},
{
$group : {
"_id": {
"userId": "$userId",
"date": { $dateToString: { format: "%Y-%m-%d", date: "$time" } }
},
"total": { "$sum": 1 }
}
},
// Add the reshaping to the end of the pipeline
{
"$project": {
"_id": 0, // remove the _id, this will be automatically filled
"userDate": "$_id", // the renamed compound key
"total": 1
}
}
], { clientCollection: "Questionaire" }
);
The field order will be different because MongoDB keeps the existing fields (i.e. "total" in this example) and then adds any new fields to the document. You can counter that by using different field names in the $group and $project stages rather than the 1 inclusive syntax.
Without such a plugin, this sort of reshaping is something that has been regularly done in the past, by again renaming the output _id and supplying a new _id value compatible with what meteor client collections expect to be present in this property.
On closer inspection of how the code is implemented, it is probably best to actually supply an _id value in the results because the plugin actually makes no effort to create an _id value.
So simply extracting one of the existing document _id values in the grouping should be sufficient. So I would add a $max to do this, and then replace the _id in the $project:
ReactiveAggregate(this, Questionaire,
[
{
"$match": {
"time": {$gte: fromDate, $lte: toDate},
"userId": {'$regex' : regex}
}
},
{
$group : {
"_id": {
"userId": "$userId",
"date": { $dateToString: { format: "%Y-%m-%d", date: "$time" } }
},
"maxId": { "$max": "$_id" },
"total": { "$sum": 1 }
}
},
// Add the reshaping to the end of the pipeline
{
"$project": {
"_id": "$maxId", // replaced _id
"userDate": "$_id", // the renamed compound key
"total": 1
}
}
], { clientCollection: "Questionaire" }
);
This could be easily patched in the plugin by replacing the lines
if (!sub._ids[doc._id]) {
sub.added(options.clientCollection, doc._id, doc);
} else {
sub.changed(options.clientCollection, doc._id, doc);
}
with lines that fall back to Random.id() when a document output from the pipeline does not already have an _id value present:
if (!sub._ids[doc._id]) {
sub.added(options.clientCollection, doc._id || Random.id(), doc);
} else {
sub.changed(options.clientCollection, doc._id || Random.id(), doc);
}
But that might be a note to the author to consider updating the package.
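To picture what the final $project stage with "maxId" does to each grouped document, here is a plain-JavaScript sketch of the same reshaping. The function name reshapeGroupResult and the sample values are made up for illustration; the real work is done by the pipeline on the server.

```javascript
// Mirrors the $project: promote maxId to _id and keep the compound key as userDate.
function reshapeGroupResult(doc) {
  return { _id: doc.maxId, userDate: doc._id, total: doc.total };
}

// Hypothetical output of the $group stage.
const grouped = {
  _id: { userId: "user-1", date: "2017-06-01" },
  maxId: "abc123",
  total: 3,
};
console.log(reshapeGroupResult(grouped));
```

The point of the reshaping is that _id ends up as a simple value that the client collection accepts, while the compound grouping key survives under another name.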
I currently have a MongoDB collection that looks like so:
{
"_id": ObjectId,
"user_id": Number,
"updates": [
{
"_id": ObjectId,
"mode": Number,
"score": Number
},
{
"_id": ObjectId,
"mode": Number,
"score": Number
},
{
"_id": ObjectId,
"mode": Number,
"score": Number
}
]
}
I am looking to find a way to find the users with the largest number of updates per mode. For instance, if I specify mode 0, I want it to load the users in order of greatest number of updates with mode: 0.
Is this possible in MongoDB? It does not need to be a fast algorithm, as it will be cached for quite a while, and it will run asynchronously.
The fastest way would be to store a count for each "mode" within the document as another field, then you could just sort on that:
var update = {
"$push": { "updates": updateDoc },
};
var countDoc = {};
countDoc["counts." + updateDoc.mode] = 1;
update["$inc"] = countDoc;
Model.update(
{ "_id": id },
update,
function(err,numAffected) {
}
);
Which would use $inc to increment a "counts" field for each "mode" value as a key for each "mode" pushed to the "updates" array. All the calculation happens on update, so it's fast and so is the query that can be applied with a sort on that value:
Model.find({ "updates.mode": 0 }).sort({ "counts.0": -1 }).exec(function(err,users) {
});
If you don't want to or cannot store such a field then the other option is to calculate at query time with .aggregate():
Model.aggregate(
[
{ "$match": { "updates.mode": 0 } },
{ "$project": {
"user_id": 1,
"updates": 1,
"count": {
"$size": {
"$setDifference": [
{ "$map": {
"input": "$updates",
"as": "el",
"in": {
"$cond": [
{ "$eq": [ "$$el.mode", 0 ] },
"$$el",
false
]
}
}},
[false]
]
}
}
}},
{ "$sort": { "count": -1 } }
],
function(err,results) {
}
);
Which isn't bad, since filtering the array and getting the $size is fairly efficient, but it's not as fast as just using a stored value.
The $map operator allows inline processing of the array elements, each of which is tested by $cond to return either the matching element or false. Then $setDifference removes any false values. This is a much better way to filter array content than using $unwind, which can slow things down significantly and should not be used unless your intent is to aggregate array content across documents.
But the better approach is to store the value for the count instead, since this does not require runtime calculation and can even use an index.
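In plain JavaScript terms, the $map / $setDifference / $size combination computes the per-user count sketched below. The sample users and the function name rankByMode are hypothetical, and the sketch assumes every update element is distinct (as the _id on each element guarantees), since $setDifference would collapse identical elements.

```javascript
// Hypothetical sample data shaped like the collection in the question.
const users = [
  { user_id: 1, updates: [{ mode: 0, score: 10 }, { mode: 1, score: 5 }] },
  { user_id: 2, updates: [{ mode: 0, score: 7 }, { mode: 0, score: 9 }, { mode: 2, score: 1 }] },
  { user_id: 3, updates: [{ mode: 1, score: 4 }] },
];

function rankByMode(allUsers, mode) {
  return allUsers
    // $match: keep users with at least one update in this mode.
    .filter((u) => u.updates.some((el) => el.mode === mode))
    // $map + $setDifference + $size: count the elements with this mode.
    .map((u) => ({
      user_id: u.user_id,
      count: u.updates.filter((el) => el.mode === mode).length,
    }))
    // $sort: largest count first.
    .sort((a, b) => b.count - a.count);
}

console.log(rankByMode(users, 0));
```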
I think this is a duplicate of this question:
Mongo find query for longest arrays inside object
The accepted answer seems to be doing exactly what you ask for.
db.collection.aggregate( [
{ $unwind : "$l" },
{ $group : { _id : "$_id", len : { $sum : 1 } } },
{ $sort : { len : -1 } },
{ $limit : 25 }
] )
just replace "$l" with "$updates".
[edit:] and you probably do not want the result limited to 25, so you should also get rid of the { $limit : 25 }
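Note that the pipeline above counts all updates regardless of mode. A sketch of how it might be adapted to count a single mode follows; the function name updatesPerModePipeline is hypothetical, and it only builds the pipeline array that you would pass to aggregate().

```javascript
// Hypothetical helper: counts, per document, the updates with the given mode.
function updatesPerModePipeline(mode) {
  return [
    // Skip documents that have no update with this mode at all.
    { $match: { "updates.mode": mode } },
    { $unwind: "$updates" },
    // After unwinding, keep only the elements with this mode.
    { $match: { "updates.mode": mode } },
    { $group: { _id: "$_id", len: { $sum: 1 } } },
    { $sort: { len: -1 } },
  ];
}

console.log(JSON.stringify(updatesPerModePipeline(0)));
```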