Mongoose date $gte operator not working as expected - javascript

I am trying to write a query for the last week, but it is not working as expected in MongoDB.
[{
$lookup: {
from: 'reviews',
localField: 'groupReviews',
foreignField: '_id',
as: 'groupReviews'
}
}, {
$match: {
$and: [{
_id: {
$eq: ObjectId('5f247eea8ad8eb53883f4a9b')
}
},
{
"groupReviews.reviewCreated": {
$gte: ISODate('2020-06-20T10:24:51.303Z')
}
}
]
}
}, {
$project: {
count: {
$size: "$groupReviews",
},
groupReviews: {
$slice: ["$groupReviews", 0, 20],
}
}
}, {
$sort: {
"groupReviews.reviewCreated": -1
}
}]
Actual result: the above code returns reviews that are older than 2020-06-20.
Expected result: it should not return anything older than 2020-06-20.
I am attaching an image for more reference.

The $match stage matches entire documents, not individual array elements. If the array contains at least one element that satisfies the $gte condition, the whole document is matched and passed along the pipeline with its array unchanged.
If you want to remove the individual array elements that are older than the given date, you could either:
$unwind the array before matching, then $group to rebuild it with only the matching entries, or
use $filter in your $project stage to eliminate the unwanted elements prior to slicing.
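The $filter option can be sketched as follows. This is a minimal illustration rather than the asker's exact pipeline: the helper name buildRecentReviewsStages and its cutoff parameter are made up here, and a plain Date stands in for the shell's ISODate. The keepRecent function only mimics in plain JavaScript what $filter does to each document's array.

```javascript
// Hypothetical helper: returns stages that trim groupReviews to entries
// created on or after `cutoff`, then count and slice the trimmed array.
function buildRecentReviewsStages(cutoff) {
  return [
    {
      $addFields: {
        groupReviews: {
          $filter: {
            input: '$groupReviews',
            as: 'review',
            cond: { $gte: ['$$review.reviewCreated', cutoff] },
          },
        },
      },
    },
    {
      $project: {
        count: { $size: '$groupReviews' },
        groupReviews: { $slice: ['$groupReviews', 0, 20] },
      },
    },
  ];
}

// Plain-JavaScript analogue of the $filter expression, for illustration only.
function keepRecent(reviews, cutoff) {
  return reviews.filter((review) => review.reviewCreated >= cutoff);
}

const cutoff = new Date('2020-06-20T10:24:51.303Z');
const sampleReviews = [
  { reviewCreated: new Date('2020-05-01T00:00:00.000Z') },
  { reviewCreated: new Date('2020-07-01T00:00:00.000Z') },
];

// $match semantics: the document passes if ANY element satisfies the condition.
const documentMatches = sampleReviews.some((r) => r.reviewCreated >= cutoff);
// $filter semantics: only the satisfying elements are kept.
const trimmed = keepRecent(sampleReviews, cutoff);
const stages = buildRecentReviewsStages(cutoff);
```

In the original pipeline these two stages would take the place of the existing $project stage, after $lookup and $match, so that count and the slice are computed on the trimmed array.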

MongoDB Aggregation - match documents with array of objects, by another array of objects filter

I have documents that consist of an array of objects, and each object in this array consists of another array of objects.
For simplicity, irrelevant fields of the documents were omitted.
It looks like this (2 documents):
{
title: 'abc',
parts: [
{
part: "verse",
progressions: [
{
progression: "62a4a87da7fdbdabf787e47f",
key: "Ab",
_id: "62b5aaa0c9e9fe8a7d7240d3"
},
{
progression: "62adf477ed11cbbe156d5769",
key: "C",
_id: "62b5aaa0c9e9fe8a7d7240d3"
},
],
_id: "62b5aaa0c9e9fe8a7d7240d2"
},
{
part: "chorus",
progressions: [
{
progression: "62a4a51b4693c43dce9be09c",
key: "E",
_id: "62b5aaa0c9e9fe8a7d7240d9"
}
],
_id: "62b5aaa0c9e9fe8a7d7240d8"
}
],
}
{
title: 'def',
parts: [
{
part: "verse",
progressions: [
{
progression: "33a4a87da7fopvvbf787erwe",
key: "E",
_id: "62b5aaa0c9e9fe8a7d7240d3"
},
{
progression: "98opf477ewfscbbe156d5442",
key: "Bb",
_id: "62b5aaa0c9e9fe8a7d7240d3"
},
],
_id: "12r3aaa0c4r5me8a7d72oi8u"
},
{
part: "bridge",
progressions: [
{
progression: "62a4a51b4693c43dce9be09c",
key: "C#",
_id: "62b5aaa0c9e9fe8a7d7240d9"
}
],
_id: "62b5aaa0rwfvse8a7d7240d8"
}
],
}
The parameters that the client sends with a request are an array of objects:
[
{ part: 'verse', progressions: ['62a4a87da7fdbdabf787e47f', '62a4a51b4693c43dce9be09c'] },
{ part: 'chorus', progressions: ['62adf477ed11cbbe156d5769'] }
]
Using a MongoDB aggregation, I want to retrieve the documents that match at least one of the objects in the input array above.
In this example, that means documents whose parts array contains an object with part equal to 'verse' and with at least one entry in its progressions array whose progression is one of ['62a4a87da7fdbdabf787e47f', '62a4a51b4693c43dce9be09c'], or documents whose parts array contains an object with part equal to 'chorus' and with at least one entry in its progressions array whose progression is '62adf477ed11cbbe156d5769'.
In this example, the matching document is the first one (with the title 'abc'), but in actual use, there might be many matching documents.
I tried to create an aggregation pipeline myself (using the mongoose 'aggregate' method):
// parsedProgressions = [
// { part: 'verse', progressions: ['62a4a87da7fdbdabf787e47f', '62a4a51b4693c43dce9be09c'] },
// { part: 'chorus', progressions: ['62adf477ed11cbbe156d5769'] }
// ]
songs.aggregate([
{
$addFields: {
"tempMapResults": {
$map: {
input: parsedProgressions,
as: "parsedProgression",
in: {
$cond: {
if: { parts: { $elemMatch: { part: "$$parsedProgression.part", "progressions.progression": mongoose.Types.ObjectId("$$parsedProgression.progression") } } },
then: true, else: false
}
}
}
}
}
},
{
$addFields: {
"isMatched": { $anyElementTrue: ["$tempMapResults"] }
}
},
{ $match: { isMatched: true } },
{ $project: { title: 1, "parts.part": 1, "parts.progressions.progression": 1 } }
]);
But it didn't work. As I understand it, that is because $elemMatch can be used only in the $match stage.
Anyway, I guess I overcomplicated the aggregation pipeline, so I would be glad if you could fix my pipeline or offer a better working one.
This is not a simple case, because the arrays are nested and we need to match both the part and the progressions, which are not on the same level.
One option looks a bit complicated, but keeps your data small:
First, to make things easier, $set a new array field called matchCond in which each element holds an array called progs containing that part's progressions. Into each sub-object of progs, insert the matching entry of the input array under input. We need to be careful here and handle the case where no input entry matches the part, as happens for the "bridge" part of the second document; that is why an empty progressions array is merged in first as a default.
Next, we just need to check whether, for any of these progs items, the progression field matches one of the options in its input array. This is done with $filter, using $reduce to add up the number of matches.
Finally, match only the documents that have at least one match, and format the answer:
db.collection.aggregate([
{
$set: {
matchCond: {
$map: {
input: "$parts",
as: "parts",
in: {progs: {
$map: {
input: "$$parts.progressions",
in: {$mergeObjects: [
"$$this",
{input: {progressions: []}},
{input: {$first: {
$filter: {
input: inputData,
as: "inputPart",
cond: {$eq: ["$$inputPart.part", "$$parts.part"]}
}
}}}
]}
}
}}
}
}
}
},
{$set: {
matchCond: {
$reduce: {
input: "$matchCond",
initialValue: 0,
in: {$add: [
"$$value",
{$size: {
$filter: {
input: "$$this.progs",
as: "part",
cond: {$in: ["$$part.progression", "$$part.input.progressions"]}
}
}
}
]
}
}
}
}
},
{$match: {matchCond: {$gt: 0}}},
{$project: {title: 1, parts: 1}}
])
See how it works on the playground example
Another option is to use $unwind, which looks simpler but duplicates your data and is therefore likely to be slower:
db.collection.aggregate([
{$addFields: {inputData: inputData, cond: "$parts"}},
{$unwind: "$cond"},
{$unwind: "$cond.progressions"},
{$unwind: "$inputData"},
{$match: {
$expr: {
$and: [
{$eq: ["$cond.part", "$inputData.part"]},
{$in: ["$cond.progressions.progression", "$inputData.progressions"]}
]
}
}
},
{$project: {title: 1, parts: 1}}
])
See how it works on the playground example - unwind
There are several options between these two...
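To sanity-check either pipeline, the matching rule from the question can be restated in plain JavaScript. This is only a reference sketch, not a MongoDB query: the function name songMatches is made up, the sample documents are trimmed copies of the two from the question, and ids are compared as plain strings (with real ObjectId values the comparison would need adjusting).

```javascript
// A song matches when some part has the same `part` name as an input entry
// AND at least one of its progression ids appears in that entry's list.
function songMatches(song, inputData) {
  return song.parts.some((part) =>
    inputData.some(
      (wanted) =>
        wanted.part === part.part &&
        part.progressions.some((p) => wanted.progressions.includes(p.progression))
    )
  );
}

const inputData = [
  { part: 'verse', progressions: ['62a4a87da7fdbdabf787e47f', '62a4a51b4693c43dce9be09c'] },
  { part: 'chorus', progressions: ['62adf477ed11cbbe156d5769'] },
];

// Trimmed copies of the two sample documents from the question.
const songs = [
  {
    title: 'abc',
    parts: [
      {
        part: 'verse',
        progressions: [
          { progression: '62a4a87da7fdbdabf787e47f', key: 'Ab' },
          { progression: '62adf477ed11cbbe156d5769', key: 'C' },
        ],
      },
      { part: 'chorus', progressions: [{ progression: '62a4a51b4693c43dce9be09c', key: 'E' }] },
    ],
  },
  {
    title: 'def',
    parts: [
      {
        part: 'verse',
        progressions: [
          { progression: '33a4a87da7fopvvbf787erwe', key: 'E' },
          { progression: '98opf477ewfscbbe156d5442', key: 'Bb' },
        ],
      },
      { part: 'bridge', progressions: [{ progression: '62a4a51b4693c43dce9be09c', key: 'C#' }] },
    ],
  },
];

const matchedTitles = songs.filter((s) => songMatches(s, inputData)).map((s) => s.title);
```

Only 'abc' matches: its verse contains one of the requested verse progressions, while 'def' has no matching verse progression and its bridge has no input entry at all.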

How to properly get distinct values with Mongoose in large dataset with date filters with timezone?

I have a large MongoDB dataset of around 34 GB, and I am using Fastify and Mongoose for the API. I want to retrieve the list of all unique userUuid values within a date range. I tried the distinct method from Mongoose.
These are my filters:
let filters = {
applicationUuid: opts.applicationUuid,
impressions: {
$gte: opts.impressions
},
date: {
$gte: moment(opts.startDate).tz('America/Chicago').format(),
$lt: moment(opts.endDate).tz('America/Chicago').format()
}
}
This is my distinct Mongoose function:
return await Model.distinct("userUuid", filters)
This method returns an array of unique userUuid values based on the filters.
This works fine for a small dataset, but it hits a 16 MB cap when it comes to a huge dataset.
Therefore, I tried the aggregate method to achieve similar results, having read that it is better optimized. However, the same filters object does not work inside the $match stage, because aggregate does not accept the date strings produced by moment; only JavaScript Date objects are accepted. But a JavaScript Date disregards all time zones, since it is Unix-timestamp based.
This is my aggregate function to get distinct values based on filters.
return await Model.aggregate(
[
{
$match: filters
},
{
$group: {
_id: {userUuid: "$userUuid" }
}
}
]
).allowDiskUse(true);
As I said, $match does not work with moment strings, only with new Date(opts.startDate), but JavaScript's new Date disregards moment's time zone and has no proper native time-zone support. Any thoughts on how to get this array of unique ids based on the filters with Mongoose?
This is the solution I came up with, and it performs well. Use it for large datasets:
let filters = {
applicationUuid: opts.applicationUuid,
impressions: { $gte: opts.impressions },
$expr: {
$and: [
{
$gte: [
'$date',
{
$dateFromString: {
dateString: opts.startDate,
timezone: 'America/Chicago',
},
},
],
},
{
$lt: [
'$date',
{
$dateFromString: {
dateString: opts.endDate,
timezone: 'America/Chicago',
},
},
],
},
],
},
}
return Model.aggregate([
{ $match: filters },
{
$group: {
_id: '$userUuid',
},
},
{
$project: {
_id: 0,
userUuid: '$_id',
},
},
])
.allowDiskUse(true)
This returns a list of unique ids, e.g.:
[
{ userUuid: "someId" },
{ userUuid: "someOtherId" }
]
For small datasets, the following method is more convenient:
let filters = {
applicationUuid: opts.applicationUuid,
impressions: {
$gte: opts.impressions
},
date: {
$gte: opts.startDate,
$lte: opts.endDate
}
}
return Model.distinct("userUuid", filters)
Which will return the following result:
[ "someId", "someOtherId" ]
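One detail related to the question's premise: a JavaScript Date stores an absolute instant, so an ISO 8601 string that carries an explicit UTC offset is parsed to the correct instant without any time-zone library. The sketch below uses made-up sample values (the offset -05:00 corresponds to Chicago during daylight saving time), and buildDateRange is a hypothetical helper. Whether this applies depends on how opts.startDate and opts.endDate are formatted, which the post does not show.

```javascript
// Sample value (an assumption): midnight in Chicago during daylight time.
const startWithOffset = '2020-06-20T00:00:00-05:00';

// Date keeps the instant; the offset is applied while parsing.
const startDate = new Date(startWithOffset);
const startIso = startDate.toISOString();

// Hypothetical helper: a filter fragment using real Date objects
// instead of strings, suitable for a $match stage.
function buildDateRange(startString, endString) {
  return {
    date: {
      $gte: new Date(startString),
      $lt: new Date(endString),
    },
  };
}

const range = buildDateRange(startWithOffset, '2020-06-27T00:00:00-05:00');
```

Here startIso is '2020-06-20T05:00:00.000Z', that is, Chicago midnight expressed in UTC. Strings without an offset are a different case, which is where $dateFromString with its timezone option, as used above, comes in.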

Get some elements from an array mongoDB

In the MongoDB shell (version v4.4.6), the following code works perfectly:
db['pri-msgs'].findOne({tag:'aaa&%qqq'},{msgs:{$slice:-2}})
But with the MongoDB Node.js driver, the following code doesn't work:
db.collection('pri-msgs').findOne({
tag: 'aaa&%qqq'
}, {
msgs: {
slice: -2
}
})
My document:
{"_id":{"$oid":"60c4730fadf6891850db90f9"},"tag":"aaa&%qqq","msgs":[{"msg":"abc","sender":0,"mID":"ctYAR5FDa","time":1},{"msg":"bcd","sender":0,"mID":"gCjgPf85z","time":2},{"msg":"def","sender":0,"mID":"lAhc4yLr6","time":3},{"msg":"efg","sender":0,"mID":"XcBLC2rGf","time":4,"edited":true},{"msg":"fgh","sender":0,"mID":"9RWVcEOlD","time":5},{"msg":"hij","sender":0,"mID":"TJXVTuWrR","time":6},{"msg":"jkl","sender":0,"mID":"HxUuzwrYN","time":7},{"msg":"klm","sender":0,"mID":"jXEOhARC2","time":8},{"msg":"mno","sender":0,"mID":"B8sVt4kCy","time":9}]}
What I'm actually trying to do is get the last 2 items from the msgs array where time is greater than n, where n is a number.
You can use an aggregation pipeline to get the results you are looking for. The steps are the following:
Match the documents you want by tag.
Unwind the msgs array.
Sort descending by msgs.time.
Limit to the first 2 elements.
Match the time you are looking for using a range query.
Group the documents back by _id.
Your query should look something like this:
db['pri-msgs'].aggregate([
{ $match: { tag: 'aaa&%qqq' } },
{ $unwind: '$msgs' },
{
$sort: {
'msgs.time': -1 //DESC
}
},
{ $limit: 2 },
{
$match: {
'msgs.time': {
$gt: 2 //n
}
}
},
{
$group: {
_id: '$_id',
tag: { $first: '$tag' },
msgs: {
$push: { msg: '$msgs.msg', sender: '$msgs.sender', mID: '$msgs.mID', time: '$msgs.time' }
}
}
}
]);
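An alternative that avoids $unwind and $group can be sketched with $filter and $slice, assuming msgs is stored in ascending time order (as in the sample document) and count is a positive number. Note that this sketch filters by time first and then takes the last two survivors, which is one reading of the question. The names buildLastMessagesPipeline and lastAfter are invented for this example, and lastAfter is only a plain-JavaScript analogue used to show the intended result.

```javascript
// Hypothetical helper: builds a pipeline that keeps messages with time > n
// and then takes the last `count` of them.
function buildLastMessagesPipeline(tag, n, count) {
  return [
    { $match: { tag } },
    {
      $project: {
        tag: 1,
        msgs: {
          $slice: [
            { $filter: { input: '$msgs', as: 'm', cond: { $gt: ['$$m.time', n] } } },
            -count, // a negative value takes elements from the end of the array
          ],
        },
      },
    },
  ];
}

// Plain-JavaScript analogue of the $filter + $slice expression.
function lastAfter(msgs, n, count) {
  return msgs.filter((m) => m.time > n).slice(-count);
}

// Nine messages with time 1..9, like the sample document.
const sampleMsgs = Array.from({ length: 9 }, (_, i) => ({ time: i + 1 }));

const pipeline = buildLastMessagesPipeline('aaa&%qqq', 2, 2);
const lastTwo = lastAfter(sampleMsgs, 2, 2);
const onlyOne = lastAfter(sampleMsgs, 8, 2);
```

With the sample data, lastTwo holds the messages with time 8 and 9, and onlyOne holds just the message with time 9, because only one message is newer than 8.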

Query for documents to iterate over Array and take sum on a particular property of JSON in mongodb using $cond

I have an array of JSON objects like this:
let x = [{"Data":"Chocolate","Company":"FiveStar"},{"Data":"Biscuit","Company":"Parle"},{"Data":"Chocolate","Company":"DairyMilk"}]
This is a sample array. How can I use MongoDB's $cond to count all the elements whose "Data" field equals "Chocolate"?
If you wanted to stick to $cond then you could run the query like this:
db.collection.aggregate([
{
$match: {
Data: "Chocolate"
}
},
{
$group: {
_id: "$Data",
count: {
$sum: {
$cond: [
{
$eq: [
"$Data",
"Chocolate"
]
},
1,
0
]
}
}
}
}
])
You can see the results of this query here.
You can see that firstly I get all elements where Data is equal to "Chocolate" with $match.
Later, we can use $sum to get a count of the elements that match the conditions with $cond and return 1 if it is equal and 0 if not.
In this case, we only are left with the Data that is equal to "Chocolate" anyway.
Of course, it would be faster and easier to simply run:
db.collection.aggregate([
{
$match: {
Data: "Chocolate"
}
},
{
$count: "count"
}
])
An example of this query is here.
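For comparison, here is a sketch where $cond carries the counting on its own: without a leading $match, every document reaches $group, and $cond decides whether each one adds 1 or 0. Grouping with _id: null and the helper name buildCondCountPipeline are choices made for this example. The reduce call is a plain-JavaScript analogue run on the question's sample array.

```javascript
// Hypothetical helper: counts documents whose Data field equals `value`
// using only $group, $sum and $cond (no $match stage).
function buildCondCountPipeline(value) {
  return [
    {
      $group: {
        _id: null, // a single bucket for the whole collection
        count: { $sum: { $cond: [{ $eq: ['$Data', value] }, 1, 0] } },
      },
    },
  ];
}

// Sample array from the question.
const x = [
  { Data: 'Chocolate', Company: 'FiveStar' },
  { Data: 'Biscuit', Company: 'Parle' },
  { Data: 'Chocolate', Company: 'DairyMilk' },
];

// Plain-JavaScript analogue: add 1 for a match, 0 otherwise.
const chocolateCount = x.reduce(
  (sum, item) => sum + (item.Data === 'Chocolate' ? 1 : 0),
  0
);

const condPipeline = buildCondCountPipeline('Chocolate');
```

On the sample array chocolateCount is 2. This form scans every document, so the $match plus $count version above remains the better choice when only one value is being counted.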

Lookup using an ID array with pipeline

I was trying to write a lookup that takes an array containing the object ids and timestamps of object y. This worked flawlessly with localField and foreignField, but I cannot reproduce the same result using pipeline.
(Names like y are made up to keep it general)
Working version:
$lookup: {
from: 'y',
localField: 'ys.object_id',
foreignField: '_id',
as: 'docs',
},
ys is an array of objects structured like this:
{
object_id: ObjectID(),
timestamp: Date(),
}
I would like to rewrite this expression to use pipeline, because I want to filter out some of the looked-up objects using their timestamp attribute.
What I have tried:
$lookup: {
from: 'y',
let: { ys: '$ys' },
pipeline: [
{
$match: { $expr: { $eq: ['$_id', '$$ys.object_id'] } },
},
],
as: 'docs',
},
Database size: 20.4GB
Full Query:
const query = [
{
$match: { 'ys.timestamp': { $lte: date, $gt: previousMonth } }, // I have shorten this part a little (It's not the same but the logic was flawed anyway)
},
{
$limit: 100,
},
{
$lookup: {
from: 'y',
let: { ys: '$ys' },
pipeline: [
{
$match: { $expr: { $in: ['$_id', '$$ys.object_id'] } },
},
{
$sort: { timestamp: -1 },
},
{
$limit: 1,
},
],
as: 'doc',
},
},
];
The above solution doesn't work: it seems to get stuck and never returns anything (it times out after some time).
Is there a proper way of rewriting the working solution to a pipeline solution?
IMPORTANT:
I have changed the query to look for one specific element by ID and then perform the lookup. This action did work but took about 20 seconds. I am pretty certain this is why my query times out when I run it with my usual query. Can anyone explain why there is a performance difference between the 2 approaches and if I can somehow bypass that?
Very close - use $in instead of $eq:
$lookup: {
from: 'y',
let: { ys: '$ys' },
pipeline: [
{
$match: { $expr: { $in: ['$_id', '$$ys.object_id'] } },
},
],
as: 'docs',
},
If you use $eq you're looking for a value that is equal to that array. Using $in means you're looking for a value that is contained within that array (like includes).
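The difference can be illustrated in plain JavaScript. In this sketch the ids are made-up strings standing in for ObjectId values, and the variable names are chosen for the example only.

```javascript
// Stand-ins for '$_id' and '$$ys.object_id' (an array of ids).
const lookedUpId = 'id-2';
const objectIds = ['id-1', 'id-2', 'id-3'];

// $eq-like check: is the single id equal to the whole array? Never true.
const equalsWholeArray = lookedUpId === objectIds;

// $in-like check: is the single id one of the array's elements?
const containedInArray = objectIds.includes(lookedUpId);
```

Here equalsWholeArray is false and containedInArray is true, which mirrors why the $eq version of the lookup returns nothing while the $in version finds the documents.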
