I want to retrieve a user's chats with corresponding users from a different collection in NodeJS and MongoDB.
The nature of NodeJS gives me a bad feeling that running the following code will block or decrease performance of my app. I can duplicate some data but I want to learn more about NodeJS.
Please let me know whether my code is ok and will not decrease performance.
Here I fetch 20 chats. I also need their corresponding users.
Then I get the userIds and perform another query against the User collection.
Now I have both but I should merge them using Array.map.
I don't use $lookup because my collections are sharded.
$lookup
Performs a left outer join to an unsharded collection in the same database to filter in documents from the "joined" collection for processing. To each input document, the $lookup stage adds a new array field whose elements are the matching documents from the "joined" collection. The $lookup stage passes these reshaped documents to the next stage.
https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/#mongodb-pipeline-pipe.-lookup
let chats = await Chat.find({ active: true }).limit(20);
/*
[
{_id: ..., userId: 1, title: 'Chat A'},
...
]
*/
const userIds = chats.map(item => item.userId);
/*
[1, ...]
*/
const users = await User.find({ _id: { $in: userIds }});
/*
[
{_id: 1, fullName: 'Jack'},
...
]
*/
chats = chats.map(item => {
item.user = users.find(user => user._id === item.userId);
return item;
});
/*
[
{
_id: ...,
userId: 1,
user: {_id: 1, fullName: 'Jack'}, // <-------- added
title: 'Chat A'
},
...
]
*/
This is NOT how you should do it. MongoDB has something called Aggregation Framework and $lookup pipeline that will do that for you automatically with only 1 MongoDB query.
But since you are using Mongoose, this query becomes even simpler, because you can use Mongoose's populate() method. Your whole code can be replaced with one line like this:
const chats = await Chat.find({ active: true }).populate('userId').limit(20);
console.log(chats)
Note: If your collections are sharded, in my opinion you have already implemented the logic in the best possible way.
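For the sharded case, the merge step itself can be made cheaper. Below is a minimal sketch, assuming plain objects shaped like the sample data in the question; the helper name attachUsers is hypothetical. It indexes the users in a Map keyed by the stringified _id, so each chat is matched in constant time instead of scanning the users array, and ObjectId values compare correctly.

```javascript
// Hypothetical helper: attach each chat's user without a nested Array.find.
function attachUsers(chats, users) {
  // Key by String(_id) so ObjectId instances and primitive ids both match.
  const usersById = new Map(users.map((user) => [String(user._id), user]));
  return chats.map((chat) => ({
    ...chat,
    // null when no matching user was returned by the second query
    user: usersById.get(String(chat.userId)) ?? null,
  }));
}

// Sample data mirroring the comments in the question.
const chats = [
  { _id: 'a', userId: 1, title: 'Chat A' },
  { _id: 'b', userId: 2, title: 'Chat B' },
];
const users = [{ _id: 1, fullName: 'Jack' }];

const merged = attachUsers(chats, users);
console.log(merged);
```

With Mongoose, this assumes both queries use .lean() so the results are plain objects that can be spread safely.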
You are using async/await, so your code waits for a response at every await:
// Wait to finish here
let chats = await Chat.find({ active: true }).limit(20);
/*
[
{_id: ..., userId: 1, title: 'Chat A'},
...
]
*/
// Wait to finish here too
const users = await User.find({ _id: { $in: userIds }});
/*
[
{_id: 1, fullName: 'Jack'},
...
]
*/
So if you have a lot of data and no index on the collection, those queries will take a long time to finish.
In that case, you should create a ref in your Chat collection to the User collection, with chat.userId = user._id.
Then, when you query chats, you populate the userId field, so you no longer need the mapping steps const userIds = chats.map(item => item.userId); and chats = chats.map...
Sample for chat schema
const { Schema, model } = require("mongoose");
const chatSchema = new Schema({
active: Boolean,
userId: {
type: "ObjectId",
ref: "User",
},
title: String,
message: String
// another property
});
const userSchema = new Schema({
username: String,
email: String
// another property
})
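// Register both models so that populate('userId') can resolve ref: "User"
const userModel = model('User', userSchema);
const chatModel = model('Chat', chatSchema);
// query for chat
let chats = await chatModel.find({ active: true }).populate('userId').limit(20);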
/*
[
{
_id: ...,
userId: {_id: 1, fullName: 'Jack'}, // <-------- already have
title: 'Chat A'
},
...
]
*/
Related
I have the following Schema with a array of ObjectIds:
const userSchema = new Schema({
...
article: [{
type: mongoose.Schema.Types.ObjectId,
}],
...
},
I want to count the array elements; in the example above the result should be 10.
I have tried the following, but it didn't work for me. req.query.id is the user's _id and is used to select the specific user whose article array I want to count.
const userData = User.aggregate(
[
{
$match: {_id: id}
},
{
$project: {article: {$size: '$article'}}
},
]
)
console.log(res.json(userData));
console.log(article.length) currently gives me 0. How can I do this? Is the aggregate function the right choice, or is there a better way to count the elements of an array?
Not sure why to use aggregate when array of ids is already with user object.
Define the article field as a reference:
const { Schema } = mongoose;
const {Types} = Schema;
const userSchema = new Schema({
...
article: {
type: [Types.ObjectId],
ref: 'Article',
index: true,
},
...
});
// add virtual if You want
userSchema.virtual('articleCount').get(function () {
return this.article.length;
});
and get them using populate:
const user = await User.findById(req.query.id).populate('article');
console.log(user.article.length);
or simply have array of ids:
const user = await User.findById(req.query.id);
console.log(user.article.length);
make use of virtual field:
const user = await User.findById(req.query.id);
console.log(user.articleCount);
P.S. I use aggregate when I need complex post-filter logic, which is what aggregation really is. Think of it this way: you have a result set, but you want to process it on the database side to get more specific information, because running queries against the database inside a loop would be inefficient. For example, getting the users who added a specific article on a specific day and partitioning them by hour.
I'm new to Aggregation in MongoDB and I'm trying to understand the concepts of it by making examples.
I'm trying to paginate my subdocuments using aggregation but the returned document is always the overall values of all document's specific field.
I want to paginate my following field which contains an array of Object IDs.
I have this User Schema:
const UserSchema = new mongoose.Schema({
username: {
type: String,
unique: true,
required: true
},
firstname: String,
lastname: String,
following: [{
type: mongoose.Schema.Types.ObjectId,
ref: 'User'
}],
...
}, { timestamps: true, toJSON: { virtuals: true }, toObject: { getters: true, virtuals: true } });
Without aggregation, I am able to paginate following,
I have this route which gets the user's post by their username
router.get(
'/v1/:username/following',
isAuthenticated,
async (req, res, next) => {
try {
const { username } = req.params;
const { offset: off } = req.query;
let offset = 0;
if (typeof off !== 'undefined' && !isNaN(off)) offset = parseInt(off, 10);
const limit = 2;
const skip = offset * limit;
const user = await User
.findOne({ username })
.populate({
path: 'following',
select: 'profilePicture username fullname',
options: {
skip,
limit,
}
})
res.status(200).send(user.following);
} catch (e) {
console.log(e);
res.status(500).send(e)
}
}
);
And my pagination version using aggregate:
const following = await User.aggregate([
{
$match: { username }
},
{
$lookup: {
'from': User.collection.name,
'let': { 'following': '$following' },
'pipeline': [
{
$project: {
'fullname': 1,
'username': 1,
'profilePicture': 1
}
}
],
'as': 'following'
},
}, {
$project: {
'_id': 0,
'following': {
$slice: ['$following', skip, limit]
}
}
}
]);
Suppose I have this documents:
[
{
_id: '5fdgffdgfdgdsfsdfsf',
username: 'gagi',
following: []
},
{
_id: '5fgjhkljvlkdsjfsldkf',
username: 'kuku',
following: []
},
{
_id: '76jghkdfhasjhfsdkf',
username: 'john',
following: ['5fdgffdgfdgdsfsdfsf', '5fgjhkljvlkdsjfsldkf']
},
]
When I test my route for user john (/john/following), everything is fine. But when I test a different user who doesn't follow anyone (/gagi/following), the result is the same as john's following; the aggregate doesn't seem to match the user by username.
/john/following | following: 2
/kuku/following | following: 0
Aggregate result:
[
{
_id: '5fdgffdgfdgdsfsdfsf',
username: 'kuku',
...
},
{
_id: '5fgjhkljvlkdsjfsldkf',
username: 'gagi',
...
}
]
I expect /kuku/following to return an empty array [] but the result is same as john's. Actually, all username I test return the same result.
I'm thinking that there must be something wrong with my implementation, since I've only just started exploring aggregation.
Mongoose stores each reference as a plain ObjectId and uses the ref option in the schema to populate the field after the document has been retrieved.
That resolution happens only on the client side; the MongoDB aggregation pipeline knows nothing about the schema's ref.
The reason the aggregation pipeline returns all of the users is that the lookup's pipeline does not have a match stage, so every document in the collection is selected and included in the lookup.
The sample documents show an array of strings instead of ObjectIds, which wouldn't work with populate.
Essentially, you must decide whether you want to use aggregation or populate to handle the join.
For populate, use the ref as shown in that sample schema.
For aggregate, store an array of ObjectId so you can use lookup to link with the _id field.
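To make the missing match stage concrete, here is a minimal sketch of how the pipeline could look once the inner pipeline filters on the outer document's following array. The helper name buildFollowingPipeline and the default collection name 'users' are assumptions (the question uses User.collection.name), and the $sort stage is my addition so that pages stay stable; it is not part of the original question.

```javascript
// Hypothetical helper that only builds the stages; pass the result to User.aggregate().
function buildFollowingPipeline(username, skip, limit, collection = 'users') {
  return [
    { $match: { username } },
    {
      $lookup: {
        from: collection, // assumed collection name
        let: { following: '$following' },
        pipeline: [
          // Keep only users whose _id appears in the outer document's array.
          { $match: { $expr: { $in: ['$_id', '$$following'] } } },
          // Assumed ordering so skip/limit return consistent pages.
          { $sort: { _id: 1 } },
          { $skip: skip },
          { $limit: limit },
          { $project: { fullname: 1, username: 1, profilePicture: 1 } },
        ],
        as: 'following',
      },
    },
    { $project: { _id: 0, following: 1 } },
  ];
}

const pipeline = buildFollowingPipeline('kuku', 0, 2);
console.log(JSON.stringify(pipeline, null, 2));
```

This relies on following being stored as an array of ObjectIds, as recommended above. Because paging happens inside the lookup, the outer $slice from the question is no longer needed.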
I'm trying to delete a mongodb object and then once deleted, I want to delete everything associated with that mongodb object. Including nested mongodb objects from my mongo database.
var parentObjectSchema = new mongoose.Schema({
name: String,
split: Number,
parts: [
{
type: mongoose.Schema.Types.ObjectId,
ref: "ChildObjectSchema"
}
],
});
var childObjectSchema = new mongoose.Schema({
name: String,
number: Number,
things: [
{
type: mongoose.Schema.Types.ObjectId,
ref: "Things"
}
],
});
So I am trying to delete the parentObject, and childObjects that come along with it. Not sure how I would go about doing that. I am successful in deleting the parentObject but that childObject is still in the mongodb, taking up space. Any ideas?
MongoDB doesn't provide the notion of foreign keys like other databases do. Mongoose has convenience methods in the client library that populate your documents with other documents by running multiple queries and joining the results:
https://mongoosejs.com/docs/populate.html
If you want to do a cascading deletion then you'll need to grab the object ids of the children in the parent documents you want to delete, and then execute a delete against those children documents.
Here's a simplified example:
// Parent, Child and Thing are assumed to be the Mongoose models compiled from
// the schemas above (query methods live on models, not on schemas).
const deleteThing = (thingId) => Thing.deleteOne({ _id: thingId });
const deleteChild = async (childId) => {
const child = await Child.findById(childId).select('things').lean();
if (!child) return;
// Remove the grandchildren first, then the child itself.
await Promise.all(child.things.map((thingId) => deleteThing(thingId)));
await Child.deleteOne({ _id: childId });
};
const deleteParent = async (parentId) => {
const parent = await Parent.findById(parentId).select('parts').lean();
if (!parent) return;
// Remove every child (and its things) before removing the parent.
await Promise.all(parent.parts.map((childId) => deleteChild(childId)));
await Parent.deleteOne({ _id: parentId });
};
// note: not actually tested
A user has project IDs but I also want to store some additional project info:
const userSchema = new Schema({
...
projects: [{
_id: {
type: Schema.Types.ObjectId,
ref: 'Project',
unique: true, // needed?
},
selectedLanguage: String,
}]
});
And I want to populate with the project name so I'm doing:
const user = await User
.findById(req.user.id, 'projects')
.populate('projects._id', 'name')
.exec();
However user.projects gives me this undesirable output:
[
{
selectedLanguage: 'en',
_id: { name: 'ProjectName', _id: 5a50ccde03c2d1f5a07e0ff3 }
}
]
What I wanted was:
[
{ name: 'ProjectName', _id: 5a50ccde03c2d1f5a07e0ff3, selectedLanguage: 'en' }
]
I can transform the data but I'm hoping that Mongoose can achieve this out the box as it seems a common scenario? Thanks.
Seems like there are two options here:
1) Name the _id field something more semantic so it's:
{
selectedLanguage: 'en',
somethingSemantic: { _id: x, name: 'ProjectName' },
}
2) Flatten the data which can be done generically with modern JS:
const user = await User
.findById(req.user.id, 'projects')
.populate('projects._id', 'name')
.lean() // Important to use .lean() or you get mongoose props spread in
.exec();
const projects = user.projects.map(({ _id, ...other }) => ({
..._id,
...other,
}));
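As a quick illustration of what the flattening in option 2 produces, here is a small self-contained sketch. The user object is made-up sample data standing in for a populated, lean() query result; no database is involved.

```javascript
// Made-up sample of a populated, lean() result (shape taken from the question).
const user = {
  projects: [
    {
      selectedLanguage: 'en',
      _id: { _id: '5a50ccde03c2d1f5a07e0ff3', name: 'ProjectName' },
    },
  ],
};

// Spread the populated sub-object first, then the remaining fields.
const projects = user.projects.map(({ _id, ...other }) => ({
  ..._id,
  ...other,
}));

console.log(projects);
```

Each resulting entry carries _id, name and selectedLanguage at the top level, which is the shape the question asked for.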
try something like this
populate({path:'projects', select:'name selectedLanguage'})
I have two mongo collections.
Enrollment:
{UserID: String, CourseID: String, EducatorFlag: boolean}
Courses
{_id: String, courseName: String}
I'm attempting to generate a list of courseNames when given a UserID. This requires me to find all courses that a User is enrolled in. The following function returns just the CourseID of each course a user is in.
var currentCourses = Enrollment.find(
{ UserId: Meteor.userId(), EducatorFlag: false },
{ fields: { CourseID: 1 } });
I'm unsure of how to take this cursor, and use each item in it to run another query and build a list from the output. Basically for each CourseID in currentCourses I need to do
var result = []
result += Courses.find({_id: CourseID}, {fields: {_id: 0, courseName: 1}});
The goal is simply to print all the courses that a user is enrolled in.
You have several options:
Use the cursor directly with a .forEach()
Use .fetch() to transform the cursor into an array of objects and then manipulate that.
Get an array of _ids of enrollments with .map() and directly search the courses with mongo's $in
Let's just use the first one for now since it's pretty simple:
let courseNames = [];
Enrollment.find(
{ UserId: Meteor.userId(), EducatorFlag: false },
{ fields: { CourseID: 1 } }).forEach((e)=>{
let course = Courses.findOne(e.CourseID, { fields: { courseName: 1 }})
courseNames.push(course.courseName);
});
Note: when selecting fields in a query you can't mix exclusions and inclusions.
Getting an array of _ids and using that with $in is also pretty straightforward:
let courseIdArray = Enrollment.find(
{ UserId: Meteor.userId(), EducatorFlag: false },
{ fields: { CourseID: 1 } }).map((e)=>{ return e.CourseID });
let courseNames = Courses.find(
{ _id: { $in: courseIdArray }}).map((c)=>{ return c.courseName });