How can I retrieve data with a custom sort in Mongoose?
There is a job starting date that needs to be sorted by year and then by month, but currently this script only sorts by month, from December to January.
router.get('/', (req, res) => {
Job.find()
.sort({ from: -1 })
.then(jobs => res.json(jobs))
.catch(err => res.status(404).json(err));
});
The problem is in the sort: the values of from look like 12.2018, 06.2019, 03.2020, 11.2009 and so on.
I want to sort these results first from the year (which is after the dot) and then sort from the months. I cannot currently change how the data is set and it's stored as a String in the model Schema.
You have to use the aggregation framework to first transform your string into a valid date by
splitting it with $split,
converting the parts from string to int with $convert,
and combining them with $dateFromParts;
then you sort and finally remove the temporary field.
Here's the query :
db.collection.aggregate([
  {
    $addFields: {
      date: {
        $dateFromParts: {
          year: {
            $convert: {
              input: {
                $arrayElemAt: [
                  {
                    $split: [
                      "$from",
                      "."
                    ]
                  },
                  1
                ]
              },
              to: "int"
            }
          },
          month: {
            $convert: {
              input: {
                $arrayElemAt: [
                  {
                    $split: [
                      "$from",
                      "."
                    ]
                  },
                  0
                ]
              },
              to: "int"
            }
          },
        }
      }
    }
  },
  {
    $sort: {
      date: -1
    }
  },
  {
    $project: {
      date: 0
    }
  }
])
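For small result sets, the same ordering can also be produced in application code after a plain Job.find(). The following is a minimal sketch of that idea, not the aggregation approach above; monthYearKey and sortByFromDesc are hypothetical helper names, and it assumes every from value has the form MM.YYYY.

```javascript
// Turn "MM.YYYY" into a number that orders by year first, then month,
// e.g. "06.2019" -> 201906. Assumes the string is well-formed.
function monthYearKey(from) {
  const [month, year] = from.split('.').map(Number);
  return year * 100 + month;
}

// Return a new array sorted newest-first by the `from` field.
function sortByFromDesc(jobs) {
  return [...jobs].sort((a, b) => monthYearKey(b.from) - monthYearKey(a.from));
}

// Sample values taken from the question.
const sample = [
  { from: '12.2018' },
  { from: '06.2019' },
  { from: '03.2020' },
  { from: '11.2009' },
];
console.log(sortByFromDesc(sample).map(job => job.from));
// [ '03.2020', '06.2019', '12.2018', '11.2009' ]
```

This loads every document into memory, so the aggregation pipeline remains the better fit when the collection is large or the result has to be paginated on the server.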
My model is the following :
const scheduleTaskSchema = new Schema({
activity: { type: Object, required: true },
date: { type: Date, required: true },
crew: Object,
vehicle: Object,
pickups: Array,
details: String,
});
pickups is an array of objects with the following structure :
{
time : //iso date,
...rest //irrelevant
}
I want to return my data grouped by date, and every group also has to be sorted by pickups[0].time, so that it finally returns something like this example:
{
"2022-08-28T00:00:00.000+00:00": [
{
date: "2022-08-28T00:00:00.000+00:00",
pickups: [
{
time: "2022-08-28T07:30:00.000Z",
...rest //irrelevant
}
],
...rest //irrelevant
},
{
date: "2022-08-28T00:00:00.000+00:00",
pickups: [
{
time: "2022-08-28T09:30:00.000Z",
...rest //irrelevant
}
],
...rest //irrelevant
}
]
,
"2022-08-29T00:00:00.000+00:00": [
{
date: "2022-08-29T00:00:00.000+00:00",
pickups: [
{
time: "2022-08-29T10:00:00.000Z",
...rest //irrelevant
}
],
...rest //irrelevant
},
{
date: "2022-08-29T00:00:00.000+00:00",
pickups: [
{
time: "2022-08-29T11:30:00.000Z",
...rest //irrelevant
}
],
...rest //irrelevant
}
]
}
You need to use the aggregation framework for this.
Assuming the dates are exactly the same (down to the millisecond), your code would look something like this.
const scheduledTasksGroups = await ScheduledTask.aggregate([{
$group: {
_id: '$date',
scheduledTasks: { $push: '$$ROOT' }
}
}]);
The output will be something like:
[
{ _id: "2022-08-29T10:00:00.000Z", scheduledTasks: [...] },
{ _id: "2022-08-28T10:00:00.000Z", scheduledTasks: [...] }
]
If you want to group by day instead of by millisecond, your pipeline would look like this:
const scheduledTasksGroups = await ScheduledTask.aggregate([{
$group: {
// format date first to `YYYY-MM-DD`, then group by the new format
_id: { $dateToString: { format: '%Y-%m-%d', date: '$date' } },
scheduledTasks: { $push: '$$ROOT' }
}
}]);
For what it's worth, this is a MongoDB feature: the grouping happens on the MongoDB server. Mongoose doesn't do anything special here; it just sends the command to the server, which groups the data and returns the result.
Also, keep in mind that mongoose does not cast aggregation pipelines by default, but this plugin makes mongoose cast automatically whenever possible.
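Neither pipeline above sorts the tasks inside each group or returns an object keyed by date, as the question asks. One way to close that gap is a small post-processing step in application code. This is a sketch under assumptions: toSortedGroups is a hypothetical helper, the input has the { _id, scheduledTasks } shape shown above, and every task has at least one pickup with a parseable time.

```javascript
// Convert [{ _id, scheduledTasks }] into
// { [isoDate]: tasks sorted ascending by pickups[0].time }.
function toSortedGroups(groups) {
  const result = {};
  // Order the groups by date so the keys are inserted in ascending order.
  const ordered = [...groups].sort((a, b) => new Date(a._id) - new Date(b._id));
  for (const group of ordered) {
    const key = new Date(group._id).toISOString();
    result[key] = [...group.scheduledTasks].sort(
      (a, b) => new Date(a.pickups[0].time) - new Date(b.pickups[0].time)
    );
  }
  return result;
}

// Example input shaped like the day-grouped aggregation output.
const grouped = toSortedGroups([
  {
    _id: '2022-08-29',
    scheduledTasks: [
      { pickups: [{ time: '2022-08-29T11:30:00.000Z' }] },
      { pickups: [{ time: '2022-08-29T10:00:00.000Z' }] },
    ],
  },
  {
    _id: '2022-08-28',
    scheduledTasks: [
      { pickups: [{ time: '2022-08-28T09:30:00.000Z' }] },
      { pickups: [{ time: '2022-08-28T07:30:00.000Z' }] },
    ],
  },
]);
console.log(Object.keys(grouped));
// [ '2022-08-28T00:00:00.000Z', '2022-08-29T00:00:00.000Z' ]
```

The sort could also be pushed into the pipeline with a $sort stage before $group; doing it in JavaScript keeps the example independent of how the server orders pushed elements.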
I want to filter a collection to retrieve X records for a specific month, filtering on the "createdAt" field by month.
const aggregatorOpts = [
{
$group: {
_id: "$userId",
count: { $sum: 1 }
}
}, {
$addFields: {
"month": {
$month: '$createdAt'
}
}
}, {
$match: {
month: 1 //want to pass the month where
}
}]
const result = await Something.aggregate(aggregatorOpts).sort({ count: -1 }).exec()
The code above is what I have already tried.
I'm also grouping by user, counting each user's occurrences in the collection, and sorting by that count.
Is it possible to filter by month the way I want in Mongoose?
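One possible approach, sketched below, is to filter on the month before grouping, because $group discards createdAt and a later $addFields can no longer read it. The helper name buildMonthlyUserCounts is hypothetical; $expr and $month are standard aggregation operators, and the sort is moved into the pipeline as a $sort stage.

```javascript
// Build a pipeline that counts documents per user for one calendar month (1-12).
function buildMonthlyUserCounts(month) {
  return [
    // Keep only documents whose createdAt falls in the requested month (of any year).
    { $match: { $expr: { $eq: [{ $month: '$createdAt' }, month] } } },
    // Count the remaining documents per user.
    { $group: { _id: '$userId', count: { $sum: 1 } } },
    // Users with the most documents first.
    { $sort: { count: -1 } },
  ];
}

const pipeline = buildMonthlyUserCounts(1);
console.log(JSON.stringify(pipeline, null, 2));
// Assumed usage with the model from the question:
// const result = await Something.aggregate(pipeline).exec();
```

Note that $month evaluates the date in UTC unless a timezone is supplied, and that this match covers the given month in every year; add a year condition if only one year is wanted.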
I have a large MongoDB dataset of around 34 GB and I am using Fastify and Mongoose for the API. I want to retrieve the list of all unique userUuid values within a date range. I tried the distinct method from Mongoose:
These are my filters:
let filters = {
applicationUuid: opts.applicationUuid,
impressions: {
$gte: opts.impressions
},
date: {
$gte: moment(opts.startDate).tz('America/Chicago').format(),
$lt: moment(opts.endDate).tz('America/Chicago').format()
}
}
This is my distinct Mongoose function:
return await Model.distinct("userUuid", filters)
This method returns an array of unique userUuid values based on the filters.
This works fine for small datasets, but it hits a 16MB cap on the result size when the dataset is huge.
Therefore, I tried the aggregate method to achieve similar results, having read that it is better optimized. However, the same filters object does not work inside the $match stage, because aggregate does not accept the date strings that moment produces; only a JavaScript Date is accepted. A JavaScript Date, in turn, disregards time zones since it is based on Unix time.
This is my aggregate function to get distinct values based on filters.
return await Model.aggregate(
[
{
$match: filters
},
{
$group: {
_id: {userUuid: "$userUuid" }
}
}
]
).allowDiskUse(true);
As I said, $match does not work with moment, only with new Date(opts.startDate); however, JavaScript's new Date disregards moment's timezone, and it has no proper native timezone support either. Any thoughts on how to get this array of unique ids based on the filters with Mongoose?
This is the solution I came up with, and it performs well. Use this solution for large datasets:
let filters = {
applicationUuid: opts.applicationUuid,
impressions: { $gte: opts.impressions },
$expr: {
$and: [
{
$gte: [
'$date',
{
$dateFromString: {
dateString: opts.startDate,
timezone: 'America/Chicago',
},
},
],
},
{
$lt: [
'$date',
{
$dateFromString: {
dateString: opts.endDate,
timezone: 'America/Chicago',
},
},
],
},
],
},
}
return Model.aggregate([
{ $match: filters },
{
$group: {
_id: '$userUuid',
},
},
{
$project: {
_id: 0,
userUuid: '$_id',
},
},
])
.allowDiskUse(true)
Which will return a list of unique ids, e.g.
[
{ userUuid: "someId" },
{ userUuid: "someOtherId" }
]
For small datasets, the following method is more convenient:
let filters = {
applicationUuid: opts.applicationUuid,
impressions: {
$gte: opts.impressions
},
date: {
$gte: opts.startDate,
$lte: opts.endDate
}
}
return Model.distinct("userUuid", filters)
Which will return the following result:
[ "someId", "someOtherId" ]
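The two approaches return different shapes. If callers expect the flat array that distinct produces, the aggregation result can be mapped down to it. A minimal sketch, where toUuidList is a hypothetical helper:

```javascript
// Flatten [{ userUuid: 'a' }, { userUuid: 'b' }] into ['a', 'b'].
function toUuidList(rows) {
  return rows.map(row => row.userUuid);
}

console.log(toUuidList([{ userUuid: 'someId' }, { userUuid: 'someOtherId' }]));
// [ 'someId', 'someOtherId' ]
```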
Suppose we have the query :
EightWeekGamePlan.aggregate(
[
{
$group: {
_id: {
LeadId: "$LeadId",
Week: "$Week",
InsertDate: "$InsertDate" , // I want to group by the date part
Status: "$Status"
},
count: { $sum: 1 }
}
},
{
$lookup: {
from: "leads",
localField: "_id.LeadId",
foreignField: "LeadId",
as: "Joined"
}
},
{ $unwind: "$Joined" },
{ $replaceRoot: { newRoot: { $mergeObjects: ["$Joined", "$$ROOT"] } } },
{ $sort: { count: -1 } }
],
function(err, results) {
if (err) {
console.log(err);
}
// ... do some manipulations ...
console.log(_filtered);
return res.json(_filtered);
}
);
I am grouping by multiple fields, and I want to use only the date part of InsertDate and disregard the time.
How can we do that?
I believe your question is addressed in the MongoDB documentation under "Group by Day of the Year":
https://docs.mongodb.com/manual/reference/operator/aggregation/group/
You have to convert the date into a date-formatted string using $dateToString and add it to the $group _id:
_id : {$dateToString: { format: "%Y-%m-%d", date: "$InsertDate" }}
I hope this helps!
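Applied to the compound key from the question, the $dateToString expression replaces only the InsertDate part of _id. The following is a sketch of the $group stage alone, with the count written as $sum: 1; the variable name groupByDay is only for illustration.

```javascript
// $group stage that keys on LeadId, Week, Status and the calendar day of InsertDate.
const groupByDay = {
  $group: {
    _id: {
      LeadId: '$LeadId',
      Week: '$Week',
      // Keep only the calendar day (in UTC) of InsertDate, dropping the time.
      InsertDate: { $dateToString: { format: '%Y-%m-%d', date: '$InsertDate' } },
      Status: '$Status',
    },
    // Number of documents in each group.
    count: { $sum: 1 },
  },
};
console.log(JSON.stringify(groupByDay, null, 2));
```

The stage can take the place of the first element in the pipeline array passed to EightWeekGamePlan.aggregate.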
I've written a MongoDB aggregation query that uses a number of stages. At the end, I'd like the query to return my data in the following format:
{
data: // Array of the matching documents here
count: // The total count of all the documents, including those that are skipped and limited.
}
I'm going to use the skip and limit features to eventually pare down the results. However, I'd like to know the count of the number of documents returned before I skip and limit them. Presumably, the pipeline stage would have to occur somewhere after the $match stage but before the $skip and $limit stages.
Here's the query I've currently written (it's in an express.js route, which is why I'm using so many variables):
const {
minDate,
maxDate,
filter, // Text to search
filterTarget, // Row to search for text
sortBy, // Row to sort by
sortOrder, // 1 or -1
skip, // rowsPerPage * pageNumber
rowsPerPage, // Limit value
} = req.query;
db[source].aggregate([
{
$match: {
date: {
$gt: minDate, // Filter out by time frame...
$lt: maxDate
}
}
},
{
$match: {
[filterTarget]: searchTerm // Match search query....
}
},
{
$sort: {
[sortBy]: sortOrder // Sort by date...
}
},
{
$skip: skip // Skip the first X number of documents...
},
{
$limit: rowsPerPage
},
]);
Thanks for your help!
We can use $facet to run parallel pipelines on the same data and then merge the output of each pipeline.
The following is the updated query:
db[source].aggregate([
{
$match: {
date: {
$gt: minDate, // Filter out by time frame...
$lt: maxDate
}
}
},
{
$match: {
[filterTarget]: searchTerm // Match search query....
}
},
{
$set: {
[filterTarget]: { $toLower: `$${filterTarget}` } // Necessary to ensure that sort works properly...
}
},
{
$sort: {
[sortBy]: sortOrder // Sort by date...
}
},
{
$facet:{
"data":[
{
$skip: skip
},
{
$limit:rowsPerPage
}
],
"info":[
{
$count:"count"
}
]
}
},
{
$project:{
"_id":0,
"data":1,
"count":{
$let:{
"vars":{
"elem":{
$arrayElemAt:["$info",0]
}
},
"in":{
$trunc:"$$elem.count"
}
}
}
}
}
]).pretty()
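When nothing matches, the info array produced by $facet is empty, so count does not come back as a number. A small wrapper can normalize that on the application side. This is a sketch; unwrapPage is a hypothetical helper that takes the array of documents returned by the aggregation.

```javascript
// Normalize the aggregation result into { data, count } with a numeric count.
function unwrapPage(results) {
  // The pipeline yields at most one document; fall back to an empty page.
  const [page = {}] = results;
  return {
    data: page.data ?? [],
    count: typeof page.count === 'number' ? page.count : 0,
  };
}

console.log(unwrapPage([{ data: [{ a: 1 }], count: 12 }]));
// { data: [ { a: 1 } ], count: 12 }
console.log(unwrapPage([{ data: [], count: null }]));
// { data: [], count: 0 }
```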
I think I figured it out. But if someone knows that this answer is slow, or faulty in some way, please let me know!
The idea is to add a $group stage with _id set to null, push each document ($$ROOT) into the data array, and increment count by 1 per document with the $sum operator.
Then, in the following $project stage, I simply remove the _id property and slice the array down to the requested page.
db[source].aggregate([
{
$match: {
date: {
$gt: minDate, // Filter out by time frame...
$lt: maxDate
}
}
},
{
$match: {
[filterTarget]: searchTerm // Match search query....
}
},
{
$set: {
[filterTarget]: { $toLower: `$${filterTarget}` } // Necessary to ensure that sort works properly...
}
},
{
$sort: {
[sortBy]: sortOrder // Sort by date...
}
},
{
$group: {
_id: null,
data: { $push: "$$ROOT" }, // Push each document into the data array.
count: { $sum: 1 }
}
},
{
$project: {
_id: 0,
count: 1,
data: {
$slice: ["$data", skip, rowsPerPage]
},
}
}
]).pretty()
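The pipelines above take skip, rowsPerPage, and sortOrder straight from req.query, where Express delivers values as strings by default, while $skip, $limit, and $sort expect numbers. A minimal sketch of a conversion step follows; parsePaging is a hypothetical helper, and the fallback values are assumptions.

```javascript
// Convert string query parameters into the numbers the pipeline stages need.
function parsePaging(query) {
  const skip = Number.parseInt(query.skip, 10);
  const rowsPerPage = Number.parseInt(query.rowsPerPage, 10);
  return {
    // Fall back to the first page when the value is missing or invalid.
    skip: Number.isInteger(skip) && skip >= 0 ? skip : 0,
    // Assumed default page size of 10.
    rowsPerPage: Number.isInteger(rowsPerPage) && rowsPerPage > 0 ? rowsPerPage : 10,
    // Anything other than "-1" sorts ascending.
    sortOrder: String(query.sortOrder) === '-1' ? -1 : 1,
  };
}

console.log(parsePaging({ skip: '20', rowsPerPage: '10', sortOrder: '-1' }));
// { skip: 20, rowsPerPage: 10, sortOrder: -1 }
```

The same consideration applies to minDate and maxDate, which would need to be turned into Date objects if the date field is stored as a Date.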