Aggregation Query Optimization Mongodb - javascript

User Schema
I have been building a social media application and I have to write a query that returns the followings of a user. The user schema is shown below.
const userSchema = Schema(
{
email: {
type: String,
unique: true,
required: [true, "Email is required"],
index: true,
},
active: {
type: Boolean,
default: true,
},
phone: {
type: String,
unique: true,
required: [true, "Phone is required"],
index: true,
},
name: {
type: String,
required: [true, "Name is required"],
},
bio: {
type: String,
},
is_admin: {
type: Boolean,
index: true,
default: false,
},
is_merchant: {
type: Boolean,
index: true,
default: false,
},
password: {
type: String,
required: [true, "Password is required"],
},
profile_picture: {
type: String,
},
followers: [
// meaning who has followed me
{
type: Types.ObjectId,
ref: "user",
required: false,
},
],
followings: [
// meaning all of them who I followed
{
type: Types.ObjectId,
ref: "user",
required: false,
},
],
},
{
timestamps: { createdAt: "created_at", updatedAt: "updated_at" },
toObject: {
transform: function (doc, user) {
delete user.password;
},
},
toJSON: {
transform: function (doc, user) {
delete user.password;
},
},
}
);
Follow/following implementation
I have implemented follow/following using the logic described below. Each time a user follows another user, two queries are performed: one updates the follower's document using findOneAndUpdate({push:followee._id}), and a second updates the followee's document.
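The two updates described above can be sketched as plain data. This is only an illustration: the helper name `buildFollowUpdates` is hypothetical, and `$addToSet` is swapped in for `$push` (an assumption, not the original code) so that following the same user twice does not duplicate ids.

```javascript
// Sketch only: returns the two update operations as plain objects.
// $addToSet (an assumption, instead of $push) keeps the arrays free of duplicates.
function buildFollowUpdates(followerId, followeeId) {
  return [
    // 1) the follower now has the followee in "followings"
    { filter: { _id: followerId }, update: { $addToSet: { followings: followeeId } } },
    // 2) the followee now has the follower in "followers"
    { filter: { _id: followeeId }, update: { $addToSet: { followers: followerId } } },
  ];
}

// Hypothetical usage with the model from the question:
// for (const { filter, update } of buildFollowUpdates(viewerId, userId)) {
//   await userModel.findOneAndUpdate(filter, update);
// }
const exampleUpdates = buildFollowUpdates("followerId", "followeeId");
console.log(JSON.stringify(exampleUpdates, null, 2));
```

Because the two writes are separate, one can succeed while the other fails, which leaves the two arrays inconsistent.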
Query Response Pattern
I have written a query that should return the followings of a user, with the following flags appended to each returned user:
{
doesViewerFollowsUser: boolean // whether we (the viewer) follow this user
doesUserFollowsViewer: boolean // whether this user follows us (the viewer)
}
The actual query
The query looks like this:
userModel
.aggregate([
{
$match: {
_id: {
$in: [new Types.ObjectId(userId), new Types.ObjectId(viewerId)],
},
},
},
{
$addFields: {
order: {
$cond: [
{
$eq: ["$_id", new Types.ObjectId(viewerId)], // testing for viewer
},
2,
1,
],
},
},
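},
{
// sort so that the subject (order 1) is picked by $first and the viewer (order 2) by $last
$sort: { order: 1 },
},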
{
$group: {
_id: 0,
subjectFollowings: {
$first: "$followings",
},
viewerFollowings: {
$last: "$followings",
},
viewerFollowers: {
$last: "$followers",
},
},
},
{
$lookup: {
from: "users",
localField: "subjectFollowings",
foreignField: "_id",
as: "subjectFollowings",
},
},
{
$project: {
subjectFollowings: {
$map: {
input: "$subjectFollowings",
as: "user",
in: {
$mergeObjects: [
"$$user",
{
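doesViewerFollowsUser: {
$cond: [
{
// the viewer follows this user if the user is in the viewer's followings
$in: ["$$user._id", "$viewerFollowings"],
},
true,
false,
],
},
},
{
doesUserFollowsViewer: {
$cond: [
{
// this user follows the viewer if the user is in the viewer's followers
$in: ["$$user._id", "$viewerFollowers"],
},
true,
false,
],
},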
},
],
},
},
},
},
},
{
$project: {
"subjectFollowings.followings": 0,
"subjectFollowings.followers": 0,
"subjectFollowings.bio": 0,
"subjectFollowings.password": 0,
"subjectFollowings.is_admin": 0,
"subjectFollowings.is_merchant": 0,
"subjectFollowings.email": 0,
"subjectFollowings.phone": 0,
"subjectFollowings.created_at": 0,
"subjectFollowings.updated_at": 0,
"subjectFollowings.__v": 0,
},
},
])
The problem
I don't think the current query scales well. Its worst-case complexity reaches roughly O(n^2). Please help me optimize this query.

The problem is with your data modeling. You should not store followers/followings in an array because:
MongoDB has a 16 MB hard limit for every document, which means you can store only a limited amount of data in a single document.
Array lookups take linear time; the larger the array, the longer it takes to query.
What you can do instead is have a separate collection for user relationships, like so:
follower: user id
followee: user id
You can then create a compound index on follower-followee and query efficiently to check who follows whom. You can also enable timestamps here.
To get all followers of a user, just create an index on the followee key; that query will also resolve quickly.
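A minimal sketch of how the separate-collection idea could replace the original pipeline. The collection name `follows`, the helper name `buildFollowingsPipeline`, and the index list are assumptions for illustration, not part of the original schema. The pipeline is built as plain data so it can be passed to an aggregate call on a model for that collection.

```javascript
// Assumed edge collection "follows": one document per relationship { follower, followee }.
// Index specs along the lines of the answer (names and shape are illustrative):
const followIndexes = [
  { keys: { follower: 1, followee: 1 }, options: { unique: true } }, // who follows whom
  { keys: { followee: 1 }, options: {} }, // all followers of a user
];

// Builds a pipeline listing the users that `subjectId` follows, each with both flags.
// Ids must already be ObjectId values when run against a real database.
function buildFollowingsPipeline(subjectId, viewerId) {
  // Sub-lookup that checks whether a single edge (from -> to) exists.
  const edgeExists = (from, to, as) => ({
    $lookup: {
      from: "follows",
      let: { other: "$followee" },
      pipeline: [
        { $match: { $expr: { $and: [{ $eq: ["$follower", from] }, { $eq: ["$followee", to] }] } } },
        { $limit: 1 },
      ],
      as,
    },
  });
  return [
    { $match: { follower: subjectId } },
    { $lookup: { from: "users", localField: "followee", foreignField: "_id", as: "user" } },
    { $unwind: "$user" },
    edgeExists(viewerId, "$$other", "viewerToUser"), // edge viewer -> user
    edgeExists("$$other", viewerId, "userToViewer"), // edge user -> viewer
    {
      $project: {
        _id: "$user._id",
        name: "$user.name",
        profile_picture: "$user.profile_picture",
        doesViewerFollowsUser: { $gt: [{ $size: "$viewerToUser" }, 0] },
        doesUserFollowsViewer: { $gt: [{ $size: "$userToViewer" }, 0] },
      },
    },
  ];
}

console.log(JSON.stringify(buildFollowingsPipeline("subjectId", "viewerId"), null, 2));
```

Each flag becomes an existence check on one edge, which the compound index can serve, rather than a scan of a large embedded array. Adding `$skip`/`$limit` after the first `$match` would page through long followings lists.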

Related

How to populate a nested path using aggregate?

I have been trying to compute averageSum and averageRating, but I cannot get it done because I do not know how to populate using aggregate, or whether there is a workaround. I have heard of $lookup, but I am not sure how to use it; I also saw a message saying that my Atlas tier does not support it. Is there another way around this? Can I populate and then aggregate, or can I compute averageSum and averageRating at the end using another method? Please help me.
here is how my schema looks:
const favoriteSchema = new mongoose.Schema({
user: {
type: mongoose.Schema.Types.ObjectId,
ref: "User",
unique: true,
},
favoriteSellers: [
//create array of object id, make sure they are unique for user not to add multiple sellers
{
type: mongoose.Schema.Types.ObjectId,
ref: "Seller",
unique: true,
},
],
});
and here is my Seller schema:
const sellerSchema = new mongoose.Schema({
user: {
type: mongoose.Schema.Types.ObjectId,
ref: "User",
unique: true,
},
business: businessSchema,
sellerType: [String],
reviews: [
{
by: {
type: mongoose.Schema.Types.ObjectId,
ref: "User",
unique: true,
},
title: {
type: String,
},
message: {
type: String,
},
rating: Number,
imagesUri: [String],
timestamp: {
type: Date,
default: Date.now,
},
},
],
...
});
So I have an array of favorite sellers, I want to populate the sellers, then populate the reviews.by and user paths, and then do the calculation for the average sum and do the average rating. If possible please help me. What are my options here? Just do it outside on the expressjs route logic?
Here is my aggregate:
aggregatePipeline.push({
$match: { user: req.user._id },
});
//****** Here is where I want to populate before start the rest **********
Then it continues with the following code. Because the fields (paths) are not populated, averageSum is always 0.
aggregatePipeline.push({
$addFields: {
ratingSum: {
$reduce: {
initialValue: 0,
input: "$favoriteSellers.reviews",
in: { $sum: ["$$value", "$$this.rating"] },
},
},
},
});
//get average of rating ex. seller1 has a 4.5 averageRating field
aggregatePipeline.push({
$addFields: {
averageRating: {
$cond: [
{ $eq: [{ $size: "$favoriteSellers.reviews" }, 0] }, // if there are no reviews, we just send 0
0, // set it to 0
{
$divide: ["$ratingSum", { $size: "$favoriteSellers.reviews" }], // otherwise divide to get the average rating
},
],
},
},
});
let favList = await Favorite.aggregate(aggregatePipeline).exec();
When I run this, the resulting array looks like:
[
{
_id: new ObjectId("62a7ce9550094eafc7a61233"),
user: new ObjectId("6287e4e61df773752aadc286"),
favoriteSellers: [ new ObjectId("6293210asdce81d9f2ae1685") ],
}
]
Here is a sample on how I want it to look:
(so each seller should have an averageRating field and an averageSum field)
_id: 'favorite_id.....'
user: 'my id',
favoriteSellers:[
{
_id: 'kjskjhajkhsjk',
averageRating: 4.6
reviews:[.....],
...
},
{
_id: 'id______hsjk',
averageRating: 2.6
reviews:[.....],
...
},
{
_id: 'kjid______khsjk....',
averageRating: 3.6
reviews:[.....],
...
}
]
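One possible way to get the shape above inside the aggregation itself, as a hedged sketch: a $lookup stage plays the role of populate, and $avg is applied per seller. The collection name `sellers` assumes Mongoose's default pluralization of the `Seller` model, and `buildFavoritesPipeline` is a hypothetical helper. Populating reviews.by would need a further $lookup, which is omitted here.

```javascript
// Builds the pipeline as plain data; pass the result to Favorite.aggregate(...).
// userId must already be an ObjectId (for example req.user._id).
function buildFavoritesPipeline(userId) {
  return [
    { $match: { user: userId } },
    // Replace the array of seller ids with the matching seller documents.
    // "sellers" is an assumed collection name.
    { $lookup: { from: "sellers", localField: "favoriteSellers", foreignField: "_id", as: "favoriteSellers" } },
    {
      $addFields: {
        favoriteSellers: {
          $map: {
            input: "$favoriteSellers",
            as: "seller",
            in: {
              $mergeObjects: [
                "$$seller",
                {
                  // Sum of all review ratings of this seller (0 when there are none).
                  ratingSum: { $sum: "$$seller.reviews.rating" },
                  // $avg yields null for an empty list, so fall back to 0.
                  averageRating: { $ifNull: [{ $avg: "$$seller.reviews.rating" }, 0] },
                },
              ],
            },
          },
        },
      },
    },
  ];
}

console.log(JSON.stringify(buildFavoritesPipeline("someUserId"), null, 2));
```

Computing the averages in the Express route after a regular populate is also an option; the pipeline version simply keeps the arithmetic on the database side.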

Mongoose text index on array of subdocuments always returns nothing

I have a collection of documents for users to track companies that looks like this:
const mongoose = require("mongoose");
const TrackingListSchema = new mongoose.Schema({
user: {
type: mongoose.Schema.Types.ObjectId,
ref: "users",
required: true,
index: true
},
email: {
type: String,
required: true,
},
companies: [{
country: {
type: String,
required: true
},
countryCode: {
type: String,
required: true
},
name: {
type: String,
required: true
}
}]
});
TrackingListSchema.index({"companies.name": "text"});
module.exports = mongoose.model("trackinglist", TrackingListSchema);
As you can see, I am creating a text index on the "companies.name" field.
However, when I run a query as such:
TrackingList.aggregate([
{$match: {
"companies.countryCode": "us",
$text: {
$search: "Some Company",
$caseSensitive: false
}
}},
{$project: {
_id: 0,
email: 1,
}}
])
Nothing is returned even though I am sure that at least one tracking list has a company with that name. What am I doing wrong? Am I creating the text index wrong because it is inside an array? I thought I had this working but now I just can't get it to work.

Mongoose populate returns an empty array | multiple levels of embedded documents

I am trying to populate my ChatRoom model with the User reference. However, it returns a ChatRoom object with only _ids where I expected usernames, as if I never applied populate on it.
Here is an extract of my ChatRoom model :
const ChatRoom = mongoose.model("ChatRoom", {
sender: {
type: mongoose.Schema.Types.ObjectId,
ref: "User",
},
roomname: { type: String, default: "new room" },
messages: [
{
messages: {
type: mongoose.Schema.Types.ObjectId,
ref: "Message",
},
meta: [
{
user: {
type: mongoose.Schema.Types.ObjectId,
ref: "User",
},
delivered: Boolean,
read: Boolean,
},
],
},
],
participants: [
{
user: { type: mongoose.Schema.Types.ObjectId, ref: "User" },
},
],
isPrivate: { type: Boolean, default: false },
});
My User model :
const User = mongoose.model("User", {
username: { required: true, unique: true, type: String },
avatar: Object,
token: String,
hash: String,
salt: String,
chatroom: {
type: mongoose.Schema.Types.ObjectId,
ref: "ChatRoom",
},
});
As this seems to be a recurrent issue, I tested several StackOverflow answers for my populate code :
Using populate('participants.user') and 'model: user' or just populate('participants.user'), same solution here:
const chatroom = await ChatRoom.findById(req.params.id)
.populate([
{
path: "participants.user",
model: "User",
},
])
.exec((err, user) => {
if (err) {
console.log("error", err);
} else {
console.log("Populated User " + user);
}
});
The console.log returns :
Populated User { _id: new ObjectId("62262b342e28298eb438d9eb"),
sender: new ObjectId("6225d86c9340237fe2a3f067"), roomname:
'Hosmeade', participants: [ { _id: new
ObjectId("6225d86c9340237fe2a3f067") } ], isPrivate: false,
messages: [], __v: 0 }
As if no populate method was ever applied. On the client side, I get an empty string.
Checking that my documents aren't empty (this link mentions that Mongoose has problems detecting the referenced model across multiple files, but the solution doesn't work for me):
_id:6225d86c9340237fe2a3f067 username:"Berlioz" token:"rTCLAiU7jq3Smi3B"
hash:"wvJdqq25jYSaJjfiHAV4YRn/Yub+s1KHXzGrkDpaPus="
salt:"hZdiqIQQXGM1ryYK" avatar:Object
__v:0
If I remove the .exec(...) part, I get the info on the client side, but participants is still filled with only ids:
chatroom response : Object { _id: "62262bb14e66d86fb8a041e8",
sender: "6225d86c9340237fe2a3f067", roomname: "Very secret room",
participants: (1) […], isPrivate: false, messages: [], __v: 0 }
I also tried with select: 'username' and get the same result as above :
const chatroom = await ChatRoom.findById(req.params.id).populate({
path: "participants.user",
select: "username",
});
Populating it "as an array"
Changing type of participants.user in my ChatRoom model into an Object (nothing changes)
If needed hereafter are my repos:
Client side and Backend
I have run out of ideas on how to debug my code. Any help would be great!

Populating in Mongodb Aggregating

I just asked a related question here:
Mongoose/Mongodb Aggregate - group and average multiple fields
I'm trying to use Model.aggregate() to find the average rating of all posts by date and then by some author's subdocument like country.name or gender. Having trouble with this though. I know for the first stage I just need to use $match for the date and I think I need to use $lookup to "populate" the author field but not sure how to implement this.
This works for finding an average rating for all posts by date:
Post.aggregate([
{ $group: { _id: "$date", avgRating: { $avg: '$rating' }}}
]).
then(function (res) {
console.log(res);
})
And this is basically what I want to do but it doesn't work:
Post.aggregate([
{$match: {"date": today}},
{$group: {_id: {"country": "$author.country.name"}, avgRating: {$avg: "$rating"}}}
]).then(function(res) {
console.log(res)
})
User model:
const userSchema = new Schema({
email: {
type: String,
required: true,
unique: true
},
birthday: {
type: Date,
required: true,
},
gender:{
type: String,
required: true
},
country:{
name: {
type: String,
required: true
},
flag: {
type: String,
// default: "/images/flags/US.png"
}
},
avatar: AvatarSchema,
displayName: String,
bio: String,
coverColor: {
type: String,
default: "#343a40"
},
posts: [
{
type: Schema.Types.ObjectId,
ref: "Post"
}
],
comments: [
{
type: Schema.Types.ObjectId,
ref: "Comment"
}
],
postedToday: {
type: Boolean,
default: false
},
todaysPost: {
type: String
}
})
You can populate the result of an aggregation after you have fetched the data from MongoDB. Your query will look a bit like this:
modelName.aggregate([
// { $unwind: "$someArrayField" }, // only if needed
{
$group: {
_id: { country: "$author.country.name" },
avgRating: { $avg: "$rating" },
},
},
])
.exec(function (err, results) {
// handle err here before using results
// Model.populate(docs, options, callback) populates the plain aggregation results
modelName.populate(results, { path: "_id" }, function (err, populatedResults) {
// populatedResults holds the aggregation results with the given path populated
});
});
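As an alternative to populating afterwards, the join could be done inside the pipeline with $lookup before grouping. This is a sketch under assumptions: the Post schema is not shown in the question, so the `author` field (an ObjectId referencing a user) and the `users` collection name are assumed, and `buildCountryRatingPipeline` is a hypothetical helper.

```javascript
// Builds the pipeline as plain data; pass the result to Post.aggregate(...).
// `today` is whatever value the posts store in their "date" field.
function buildCountryRatingPipeline(today) {
  return [
    { $match: { date: today } },
    // Join each post with its author document ("users" and "author" are assumptions).
    { $lookup: { from: "users", localField: "author", foreignField: "_id", as: "author" } },
    // $lookup always produces an array; unwind it to a single embedded author.
    { $unwind: "$author" },
    { $group: { _id: { country: "$author.country.name" }, avgRating: { $avg: "$rating" } } },
  ];
}

// Hypothetical usage:
// Post.aggregate(buildCountryRatingPipeline(today)).then(console.log);
console.log(JSON.stringify(buildCountryRatingPipeline("2020-01-01"), null, 2));
```

Grouping by gender instead would only change the `_id` expression to `"$author.gender"`.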

Wrong implementation of $lookup from MongoDB in NodeJs

I have an Entity model and a Review model; they are related by the entityId field, which is part of the Review model.
I am trying to find all the reviews from a specific entity and then calculate the average of all the rating of all reviews. (rating is another field of Review model, given below)
This is how Entity model looks:
const entitySchema = new Schema({
name: {
type: String,
required: true,
trim: true,
unique: true,
}
});
and this is Review model:
const reviewSchema = new Schema({
rating: {
type: Number,
min: 0,
max: 5,
required: true,
},
comment: {
type: String,
trim: true,
},
public: {
type: Boolean,
required: true,
default: false,
},
entityId: {
type: mongoose.Schema.Types.ObjectId,
ref: 'Entity',
required: true,
}
}, {
timestamps: true,
});
I want to use the $lookup stage here, and this is what I have tried so far:
router.get('/entities/reviews/average', async (req, res) => {
try {
const entity = await Entity.find();
const entityId = [];
Object.keys(entity).forEach((key) => {
entityId.push(entity[key]._id);
});
Object.keys(entityId).forEach((key) => {
const reviews = Review.aggregate([
{ $match: { entityId: ObjectId(entityId[key]) } },
{
$lookup: {
from: 'entity',
localField: '_id',
foriegnField: 'entityId',
as: 'rating',
},
},
{
$group: {
_id: null,
avg: { $avg: '$rating' },
},
},
]);
res.send(reviews);
});
} catch (e) {
res.status(500).send();
}
});
But this doesn't work; it gives this response back:
{
"_pipeline": [
{
"$match": {
"entityId": "5eb658d7"
}
},
{
"$lookup": {
"from": "entity",
"localField": "_id",
"foriegnField": "entityId",
"as": "rating"
}
},
{
"$group": {
"_id": null,
"avg": {
"$avg": "$rating"
}
}
}
],
"options": {}
}
How to do this? What am I doing wrong?
I am not sure why you are getting the query itself back in the response.
If I am not wrong, you are computing the average rating for each entity; my suggestion is to combine everything and do it in a single query:
$lookup to join the reviews collection
$addFields to compute the average: build an array of ratings using $map, then average it using $avg
router.get('/entities/reviews/average', async (req, res) => {
try {
let reviews = await Entity.aggregate([
{
$lookup: {
from: "Review",
localField: "_id",
foreignField: "entityId",
as: "avgRating"
}
},
{
$addFields: {
avgRating: {
$avg: {
$map: {
input: "$avgRating",
in: "$$this.rating"
}
}
}
}
}
])
res.send(reviews);
} catch (e) {
res.status(500).send();
}
});
Playground
https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/
$lookup does a SQL-style join, so the two fields you want to join on have to match. I couldn't get your query working in the mongo shell, but I did get the following to work.
Reviews.aggregate([
{
$group: {
_id: { entityId: "5f56460d567f27054739c3bb" },
averageRating: { $avg: "$rating" },
},
},
])
It runs in the mongo shell as well.
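Note that a constant _id in $group puts every review into a single group, so the snippet above averages all reviews rather than those of one entity. A hedged sketch of two variants follows: filtering with $match first, or grouping by the entityId field to get one average per entity. The helper names are hypothetical, and the id passed to $match has to be an ObjectId rather than a string, because aggregation pipelines are not cast against the schema. Separately, the response shown in the question looks like a serialized Aggregate object, which is what gets sent when the aggregate call is not awaited.

```javascript
// Average rating for one entity. entityId must already be an ObjectId.
function buildEntityAveragePipeline(entityId) {
  return [
    { $match: { entityId } },
    { $group: { _id: "$entityId", averageRating: { $avg: "$rating" } } },
  ];
}

// Average rating for every entity in a single pass, one output document per entity.
function buildAllEntityAveragesPipeline() {
  return [{ $group: { _id: "$entityId", averageRating: { $avg: "$rating" } } }];
}

// Hypothetical usage inside the route handler (note the await):
// const averages = await Review.aggregate(buildAllEntityAveragesPipeline());
// res.send(averages);
console.log(JSON.stringify(buildAllEntityAveragesPipeline()));
```

The second variant also removes the need to load all entities first and loop over them in the route.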
