Wrong implementation of $lookup from MongoDB in NodeJs - javascript

I have an Entity model and a Review model; they are related by the entityId field, which is part of the Review model.
I am trying to find all the reviews for a specific entity and then calculate the average rating across those reviews (rating is another field of the Review model, shown below).
This is how the Entity model looks:
const entitySchema = new Schema({
name: {
type: String,
required: true,
trim: true,
unique: true,
}
});
And this is the Review model:
const reviewSchema = new Schema({
rating: {
type: Number,
min: 0,
max: 5,
required: true,
},
comment: {
type: String,
trim: true,
},
public: {
type: Boolean,
required: true,
default: false,
},
entityId: {
type: mongoose.Schema.Types.ObjectId,
ref: 'Entity',
required: true,
}
}, {
timestamps: true,
});
I want to use the $lookup stage here, and this is what I have tried so far:
router.get('/entities/reviews/average', async (req, res) => {
try {
const entity = await Entity.find();
const entityId = [];
Object.keys(entity).forEach((key) => {
entityId.push(entity[key]._id);
});
Object.keys(entityId).forEach((key) => {
const reviews = Review.aggregate([
{ $match: { entityId: ObjectId(entityId[key]) } },
{
$lookup: {
from: 'entity',
localField: '_id',
foriegnField: 'entityId',
as: 'rating',
},
},
{
$group: {
_id: null,
avg: { $avg: '$rating' },
},
},
]);
res.send(reviews);
});
} catch (e) {
res.status(500).send();
}
});
But this doesn't work; it sends back this response:
{
"_pipeline": [
{
"$match": {
"entityId": "5eb658d7"
}
},
{
"$lookup": {
"from": "entity",
"localField": "_id",
"foriegnField": "entityId",
"as": "rating"
}
},
{
"$group": {
"_id": null,
"avg": {
"$avg": "$rating"
}
}
}
],
"options": {}
}
How to do this? What am I doing wrong?

I'm not sure why you are getting the query itself back.
If I understand correctly, you want the average rating for each entity. My suggestion is to combine everything into a single query:
$lookup to join the reviews collection
$addFields to compute the average: build an array of ratings with $map, then average it with $avg
router.get('/entities/reviews/average', async (req, res) => {
try {
let reviews = await Entity.aggregate([
{
$lookup: {
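from: "Review", // must be the actual collection name; if the model is registered as "Review", Mongoose's default collection is "reviews"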
localField: "_id",
foreignField: "entityId",
as: "avgRating"
}
},
{
$addFields: {
avgRating: {
$avg: {
$map: {
input: "$avgRating",
in: "$$this.rating"
}
}
}
}
}
])
res.send(reviews);
} catch (e) {
res.status(500).send();
}
});
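To make explicit what the single query above computes, here is a minimal in-memory sketch in plain JavaScript, with no database involved. The function name averageRatingPerEntity and the sample data are hypothetical and only for illustration; the grouping step plays the role of $lookup, and the mean plays the role of $avg (which yields null when there is nothing to average).

```javascript
// In-memory analogue of: $lookup reviews by entityId, then $avg of their ratings.
// averageRatingPerEntity is a hypothetical helper, not part of Mongoose.
function averageRatingPerEntity(entities, reviews) {
  // Group ratings by entityId (the join key used by $lookup).
  const ratingsByEntity = new Map();
  for (const review of reviews) {
    const key = String(review.entityId);
    if (!ratingsByEntity.has(key)) ratingsByEntity.set(key, []);
    ratingsByEntity.get(key).push(review.rating);
  }
  // Attach the mean rating to each entity; null when it has no reviews.
  return entities.map((entity) => {
    const ratings = ratingsByEntity.get(String(entity._id)) || [];
    const avgRating = ratings.length
      ? ratings.reduce((sum, r) => sum + r, 0) / ratings.length
      : null;
    return { ...entity, avgRating };
  });
}

// Hypothetical sample data.
const sampleEntities = [
  { _id: "e1", name: "Cafe" },
  { _id: "e2", name: "Gym" },
];
const sampleReviews = [
  { entityId: "e1", rating: 4 },
  { entityId: "e1", rating: 5 },
];
const averages = averageRatingPerEntity(sampleEntities, sampleReviews);
console.log(averages);
```

Note also that the answer's code awaits the aggregate call. In Mongoose, aggregate() returns an Aggregate object that only runs when it is awaited or executed, which is consistent with the question's response showing _pipeline rather than results.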

https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/
$lookup does a SQL-style join, so the two fields you join on have to match. I couldn't get your query working in the mongo shell, but I did get the following to work.
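Reviews.aggregate([
// keep only the reviews of one entity; entityId is stored as an ObjectId
{ $match: { entityId: ObjectId("5f56460d567f27054739c3bb") } },
{
$group: {
_id: "$entityId",
averageRating: { $avg: "$rating" },
},
},
])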
It also runs in the mongo shell.

Related

Searching by multiple fields across multiple collections in a single query in mongodb. [using aggregate & populate functions]

Here we have two MongoDB collections, Ad and Company. A company can have multiple ads.
I need an API endpoint that returns the ads matching the keyword entered, searching across the company name, primary text, headline, and description in a single query using the aggregate and populate functions.
These are the schemas for both of them:
const CompanySchema = new mongoose.Schema({
name: {
type: String,
required: true,
},
url: {
type: String,
required: true,
},
});
const Company = mongoose.model("company", CompanySchema);
const AdSchema = new mongoose.Schema({
primaryText: {
type: String,
required: true,
},
companyId: {
type: mongoose.Schema.Types.ObjectId,
ref: "company",
required: true,
},
headline: {
type: String,
required: true,
},
description: {
type: String,
required: false,
default: "",
},
CTA: {
type: String,
required: true,
},
imageUrl: {
type: String,
required: true,
},
});
AdSchema.index({ primaryText: 'text', headline: 'text', description: 'text' });
const Ad = mongoose.model("ad", AdSchema);
This is the approach I've taken, but with it I'm not able to search by company name. Can anyone please help: what is wrong with the approach below, and what is the correct approach?
router.get("/search", async function (req, res, next) {
const query = req.query.query;
const ads = await Ad.aggregate([
{
$match: {
$text: {
$search: query,
},
},
},
{
$lookup: {
from: "companies",
localField: "companyId",
foreignField: "_id",
as: "company",
},
},
{
$unwind: "$company",
},
]);
res.json(ads);
});
Here is your code, modified:
router.get("/search", async function (req, res, next) {
const query = req.query.query;
const ads = await Ad.aggregate([
{
$match: {text:query},
},
{
$lookup: {
from: "companies",
localField: "companyId",
foreignField: "_id",
as: "company",
},
},
{
$unwind: "$company",
},
]);
res.json(ads);
});
Or, to find your query anywhere within the text:
router.get("/search", async function (req, res, next) {
const query = req.query.query;
const ads = await Ad.aggregate([
{
$match: {text:{
$regex: `.*${query}.*`,
$options: 'i'
}
},
},
{
$lookup: {
from: "companies",
localField: "companyId",
foreignField: "_id",
as: "company",
},
},
{
$unwind: "$company",
},
]);
res.json(ads);
});
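Neither version above filters on the company name, which only exists after the $lookup. Since a $match that uses $text has to be the first stage of a pipeline, one possible way to cover all four fields is to join first and then match with a case-insensitive $regex under $or. The sketch below only builds the pipeline array, so it runs without a database; buildAdSearchPipeline and escapeRegex are hypothetical helper names, and the field names are taken from the schemas in the question. A regex match like this will not use the text index.

```javascript
// Escape regex metacharacters so the user's keyword is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: returns a pipeline that could be passed to Ad.aggregate(...).
function buildAdSearchPipeline(keyword) {
  const pattern = { $regex: escapeRegex(keyword), $options: "i" };
  return [
    {
      // Join each ad with its company first, so company.name can be filtered on.
      $lookup: {
        from: "companies",
        localField: "companyId",
        foreignField: "_id",
        as: "company",
      },
    },
    { $unwind: "$company" },
    {
      // Keep ads where any of the four fields contains the keyword.
      $match: {
        $or: [
          { "company.name": pattern },
          { primaryText: pattern },
          { headline: pattern },
          { description: pattern },
        ],
      },
    },
  ];
}

const searchPipeline = buildAdSearchPipeline("a.b");
console.log(JSON.stringify(searchPipeline, null, 2));
```

In the route, this would be used as `const ads = await Ad.aggregate(buildAdSearchPipeline(query));`, assuming the Ad model from the question.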

Aggregation Query Optimization Mongodb

User Schema
I have been building a social media application, and I have to write a query that returns the users that a given user follows. The user schema is shown below.
const userSchema = Schema(
{
email: {
type: String,
unique: true,
required: [true, "Email is required"],
index: true,
},
active: {
type: Boolean,
default: true,
},
phone: {
type: String,
unique: true,
required: [true, "Phone is required"],
index: true,
},
name: {
type: String,
required: [true, "Name is required"],
},
bio: {
type: String,
},
is_admin: {
type: Boolean,
index: true,
default: false,
},
is_merchant: {
type: Boolean,
index: true,
default: false,
},
password: {
type: String,
required: [true, "Password is required"],
},
profile_picture: {
type: String,
},
followers: [
// meaning who has followed me
{
type: Types.ObjectId,
ref: "user",
required: false,
},
],
followings: [
// meaning all of them who I followed
{
type: Types.ObjectId,
ref: "user",
required: false,
},
],
},
{
timestamps: { createdAt: "created_at", updatedAt: "updated_at" },
toObject: {
transform: function (doc, user) {
delete user.password;
},
},
toJSON: {
transform: function (doc, user) {
delete user.password;
},
},
}
);
Follow/following implementation
I have implemented follow/following using the logic below. Each time a user follows another user, two queries run: one updates the follower's followings array using findOneAndUpdate({push:followee._id}), and a second updates the followee's followers array.
Query Response Pattern
I have written a query that should return the subject's followings, with the following fields appended to each user:
{
doesViewerFollowsUser: boolean // whether the viewer follows this user
doesUserFollowsViewer: boolean // whether this user follows the viewer
}
The actual query
The query currently looks like this:
userModel
.aggregate([
{
$match: {
_id: {
$in: [new Types.ObjectId(userId), new Types.ObjectId(viewerId)],
},
},
},
{
$addFields: {
order: {
$cond: [
{
$eq: ["$_id", new Types.ObjectId(viewerId)], // testing for viewer
},
2,
1,
],
},
},
},
{
// sort so the subject (order 1) comes first and the viewer (order 2) last;
// $first/$last in the $group stage are only deterministic on sorted input
$sort: { order: 1 },
},
{
$group: {
_id: 0,
subjectFollowings: {
$first: "$followings",
},
viewerFollowings: {
$last: "$followings",
},
viewerFollowers: {
$last: "$followers",
},
},
},
{
$lookup: {
from: "users",
localField: "subjectFollowings",
foreignField: "_id",
as: "subjectFollowings",
},
},
{
$project: {
subjectFollowings: {
$map: {
input: "$subjectFollowings",
as: "user",
in: {
$mergeObjects: [
"$$user",
{
doesViewerFollowsUser: {
$cond: [
{
$in: ["$$user._id", "$viewerFollowings"], // the viewer follows this user
},
true,
false,
],
},
},
{
doesUserFollowsViewer: {
$cond: [
{
$in: ["$$user._id", "$viewerFollowers"], // this user follows the viewer
},
true,
false,
],
},
},
],
},
},
},
},
},
{
$project: {
"subjectFollowings.followings": 0,
"subjectFollowings.followers": 0,
"subjectFollowings.bio": 0,
"subjectFollowings.password": 0,
"subjectFollowings.is_admin": 0,
"subjectFollowings.is_merchant": 0,
"subjectFollowings.email": 0,
"subjectFollowings.phone": 0,
"subjectFollowings.created_at": 0,
"subjectFollowings.updated_at": 0,
"subjectFollowings.__v": 0,
},
},
])
The problem
I don't think the current query scales well. Its worst-case complexity is approximately O(n^2). Please help me optimize this query.
The problem is with your data modeling. You should not store followers/followings in an array because:
MongoDB has a hard 16 MB limit on every document, so a single document can only hold a limited amount of data.
Array lookups take linear time; the larger the array, the longer it takes to query.
What you can do instead is have a separate collection for user relationships, like so:
follower: user id
followee: user id
You can then create a compound index on follower and followee and query efficiently to check who follows whom. You can also enable timestamps here.
To get all followers of a user, create an index on the followee key; that query will also resolve quickly.
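As a rough illustration of why one record per relationship scales better than arrays, here is a small in-memory sketch in plain JavaScript. It is an analogy only, not MongoDB code: FollowStore is a hypothetical class, its Set stands in for a unique compound index on (follower, followee), and its Map stands in for an index on followee.

```javascript
// In-memory stand-in for a relationships collection of { follower, followee } records.
class FollowStore {
  constructor() {
    this.pairs = new Set(); // acts like a unique compound index on (follower, followee)
    this.followersByFollowee = new Map(); // acts like an index on followee
  }

  // Records that `follower` follows `followee`; returns false if already recorded.
  follow(follower, followee) {
    const key = `${follower}->${followee}`;
    if (this.pairs.has(key)) return false;
    this.pairs.add(key);
    if (!this.followersByFollowee.has(followee)) {
      this.followersByFollowee.set(followee, new Set());
    }
    this.followersByFollowee.get(followee).add(follower);
    return true;
  }

  // Keyed lookup instead of scanning an array embedded in a user document.
  doesFollow(follower, followee) {
    return this.pairs.has(`${follower}->${followee}`);
  }

  followersOf(followee) {
    return [...(this.followersByFollowee.get(followee) || [])];
  }
}

const store = new FollowStore();
store.follow("alice", "bob");
store.follow("carol", "bob");
console.log(store.doesFollow("alice", "bob")); // true
console.log(store.doesFollow("bob", "alice")); // false
console.log(store.followersOf("bob")); // [ 'alice', 'carol' ]
```

With this shape, the two flags from the question reduce to two keyed lookups per listed user: doesFollow(viewer, user) and doesFollow(user, viewer).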

How can I get the total sum ($sum) of an array from a nested field?

I need the total sum of all the elements in an array that is nested in my schema.
This is the schema:
const mongoose = require('mongoose');
let historySchema = new mongoose.Schema({
time: {
type:String
}
})
//users schema
let userSchema = new mongoose.Schema({
name:{
type:String,
},
dob:{
type:String,
},
email:{
type:String,
},
noOfpeopleClimbing:{
type: Number,
default:0
},
details:{
type:String,
},
status:{
type: Boolean,
default: false
},
timeIn:{
type: Number,
default: 0
},
timeOut:{
type: Number,
default: 0
},
timeFinal:{
type: Number,
default: 0
},
history:[{
time:{
type: Number
},
date:{
type:Date,
default:Date.now // pass the function itself so it is evaluated for each new document
},
climbers:{
type: Number
},
names:{
type: String
}
}]
})
let User = module.exports = mongoose.model("User", userSchema);
The nested field in question is:
history:[{
time:{
type: Number
}
}]
And the find method is:
app.get('/user/:id', function(req,res){
Users.findById(req.params.id, function(err, users){
res.render("user",
{
title:users.name,
users:users,
});
})
})
Can I attach an aggregate with $sum to my find route so that I can send the data, together with the sum, to my render view?
For example, totalTimeHistory: the $sum aggregate result.
Use the following snippet:
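// await the call (inside an async function) so that result holds the documents
const result = await Users.aggregate([
{
$match: {
//Your find block here
}
},
{
$unwind: "$history"
},
{
// after $unwind each document holds a single history entry,
// so the per-user total has to be accumulated with $group
$group: {
_id: "$_id",
totalTimeHistory: { $sum: "$history.time"}
}
}
])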
Try this query:
db.collection.aggregate([
{
"$match": {
"name": "name" //or whatever you want
}
},
{
"$project": {
"total": {
"$sum": "$history.time"
}
}
}
])
In this way you don't need $unwind
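For comparison, this is roughly what that $sum over "$history.time" computes, written as a minimal plain JavaScript sketch. The function name totalTimeHistory mirrors the field name from the question, the sample user is made up, and non-numeric or missing time values are skipped, as $sum does.

```javascript
// In-memory analogue of { $sum: "$history.time" }: add up the numeric `time` values.
function totalTimeHistory(user) {
  return (user.history || [])
    .map((entry) => entry.time)
    .filter((time) => typeof time === "number")
    .reduce((sum, time) => sum + time, 0);
}

// Hypothetical sample document.
const sampleUser = {
  name: "Ana",
  history: [{ time: 30 }, { time: 45 }, { names: "entry without a time" }],
};
console.log(totalTimeHistory(sampleUser)); // 75
```

If you keep the findById route from the question, a helper like this could also be called on the loaded document and its result passed to res.render alongside users, instead of switching the route to aggregate.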
db.getCollection('users').aggregate([
{
$match: {
<your find query goes here>
}
},
{
$unwind: '$history'
},
{
$group: {
_id: <your user object id (or) null>,
history: { $push: "$$ROOT.history" }
}
},
{
$addFields: {
totalSumOfHistoryTypes: { $sum: "$history.time" } // the history entries in the question have a time field, not type
}
},
])
Explanation:
$match: finds the matching documents in the collection
$unwind: unwinds the history array so that its values can be grouped
$group: creates an array called history and pushes each history object ($$ROOT.history) into it
$addFields: adds a new field that is not present in the schema
Hope this helps, cheers.

Mongoose Model find using an attribute from another Schema

Basically I have two schemas, User and Post.
User has an array that contains the _ids of posts.
Post has an attribute, is_active, that tells whether the post is active.
I want to filter the users that have at least one active post.
UserSchema
const UserSchema = new Schema(
{
name: {
type: String,
trim: true,
required: true
},
posts: [
{
type: Schema.Types.ObjectId,
ref: 'Post'
}
],
created_at: {
type: Date,
required: true,
default: Date.now // pass the function itself so it is evaluated for each new document
}
}
)
export default mongoose.model<User>('User', UserSchema)
Post Schema
const postSchema = new Schema(
{
name: String,
is_active: Boolean
}
)
As an alternative to @Tunmee's answer:
The pipeline form of $lookup is available from v3.6 and, as of v4.2, still has some performance issues, so you could also use the "regular" $lookup, available from v3.2.
db.Users.aggregate([
{
$lookup: {
from: "Posts",
localField: "posts",
foreignField: "_id",
as: "posts"
}
},
{
$match: {
"posts.is_active": true
}
}
])
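The final $match works because a condition on "posts.is_active" is satisfied when at least one element of the joined posts array has is_active set to true. A minimal in-memory sketch of the same filter in plain JavaScript, with a hypothetical function name and made-up sample data:

```javascript
// In-memory analogue of: $lookup posts by _id, then $match { "posts.is_active": true }.
function usersWithActivePost(users, posts) {
  // Collect the ids of active posts once, for fast membership checks.
  const activePostIds = new Set(
    posts.filter((post) => post.is_active === true).map((post) => String(post._id))
  );
  // Keep users that reference at least one active post.
  return users.filter((user) =>
    (user.posts || []).some((postId) => activePostIds.has(String(postId)))
  );
}

// Hypothetical sample data.
const samplePosts = [
  { _id: "p1", name: "Hello", is_active: false },
  { _id: "p2", name: "World", is_active: true },
];
const sampleUsers = [
  { name: "Ana", posts: ["p1"] },
  { name: "Ben", posts: ["p1", "p2"] },
  { name: "Cy", posts: [] },
];
const activeUsers = usersWithActivePost(sampleUsers, samplePosts);
console.log(activeUsers.map((user) => user.name)); // [ 'Ben' ]
```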
You can try this:
Users.aggregate([
{
$lookup: {
from: "Posts",
let: { postIds: "$posts", },
pipeline: [
{
$match: {
$expr: {
$and: [
{
$in: [ "$_id", "$$postIds" ]
},
{
$eq: [ "$is_active", true ]
},
]
}
},
},
// You can remove the projection below
// if you need the actual posts data in the final result
{
$project: { _id: 1 }
}
],
as: "posts"
}
},
{
$match: {
$expr: {
$gt: [ { $size: "$posts" }, 0 ]
}
}
}
])
I'm not sure about your application's query requirement but you can add a compound index on _id and is_active properties in Posts collection to make the query faster.

Implement feed with retweets in MongoDB

I want to implement a retweet feature in my app. I use Mongoose and have User and Message models, and I store retweets as an array of objects of type {userId, createdAt}, where createdAt is the time when the retweet occurred. The Message model has its own createdAt field.
I need to create a feed of original and retweeted messages merged together based on their createdAt fields. I am stuck on the merging: should I do it in a single query, or run separate queries and merge in JavaScript? Can I do it all in Mongoose with a single query? If not, how do I find the merge insertion points and the index of the last message?
So far I only fetch the original messages.
My Message model:
const messageSchema = new mongoose.Schema(
{
fileId: {
type: mongoose.Schema.Types.ObjectId,
ref: 'File',
required: true,
},
userId: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User',
required: true,
},
likesIds: [{ type: mongoose.Schema.Types.ObjectId, ref: 'User' }],
reposts: [
{
reposterId: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User',
},
createdAt: { type: Date, default: Date.now },
},
],
},
{
timestamps: true,
},
);
Edit: Now I have this, but pagination is broken. I am trying to use the newCreatedAt field as the cursor, but that doesn't seem to work: the second call, when newCreatedAt is passed from the frontend, returns an empty array.
messages: async (
parent,
{ cursor, limit = 100, username },
{ models },
) => {
const user = username
? await models.User.findOne({
username,
})
: null;
const options = {
...(cursor && {
newCreatedAt: {
$lt: new Date(fromCursorHash(cursor)),
},
}),
...(username && {
userId: mongoose.Types.ObjectId(user.id),
}),
};
console.log(options);
const aMessages = await models.Message.aggregate([
{
$addFields: {
newReposts: {
$concatArrays: [
[{ createdAt: '$createdAt', original: true }],
'$reposts',
],
},
},
},
{
$unwind: '$newReposts',
},
{
$addFields: {
newCreatedAt: '$newReposts.createdAt',
original: '$newReposts.original',
},
},
{ $match: options },
{
$sort: {
newCreatedAt: -1,
},
},
{
$limit: limit + 1,
},
]);
const messages = aMessages.map(m => {
m.id = m._id.toString();
return m;
});
//console.log(messages);
const hasNextPage = messages.length > limit;
const edges = hasNextPage ? messages.slice(0, -1) : messages;
return {
edges,
pageInfo: {
hasNextPage,
endCursor: toCursorHash(
edges[edges.length - 1].newCreatedAt.toString(),
),
},
};
},
Here are the queries. The working one:
Mongoose: messages.aggregate([{
'$match': {
createdAt: {
'$lt': 2020-02-02T19:48:54.000Z
}
}
}, {
'$sort': {
createdAt: -1
}
}, {
'$limit': 3
}], {})
And the non working one:
Mongoose: messages.aggregate([{
'$match': {
newCreatedAt: {
'$lt': 2020-02-02T19:51:39.000Z
}
}
}, {
'$addFields': {
newReposts: {
'$concatArrays': [
[{
createdAt: '$createdAt',
original: true
}], '$reposts'
]
}
}
}, {
'$unwind': '$newReposts'
}, {
'$addFields': {
newCreatedAt: '$newReposts.createdAt',
original: '$newReposts.original'
}
}, {
'$sort': {
newCreatedAt: -1
}
}, {
'$limit': 3
}], {})
This can be done in one query, although it's a little hack-ish:
db.collection.aggregate([
{
$addFields: {
reposts: {
$concatArrays: [[{createdAt: "$createdAt", original: true}],"$reposts"]
}
}
},
{
$unwind: "$reposts"
},
{
$addFields: {
createdAt: "$reposts.createdAt",
original: "$reposts.original"
}
},
{
$sort: {
createdAt: -1
}
}
]);
You can add any other logic you want to the query using the original field: documents with original: true are the original posts, while the others are retweets.
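To show the merge and the cursor step in isolation, here is a minimal in-memory sketch in plain JavaScript that mirrors the pipeline: one feed entry for the original message, one per repost, sorted newest first, then cut into pages. buildFeed, its before and limit options, and the sample messages are hypothetical; the field names follow the Message model in the question. The cursor is a strict less-than on the timestamp, so entries that share the exact timestamp of the cursor would be skipped; a real implementation may need a tie-breaker.

```javascript
// Expand each message into feed entries: the original plus one entry per repost.
function buildFeed(messages, { before = null, limit = 100 } = {}) {
  const entries = [];
  for (const message of messages) {
    entries.push({ messageId: message._id, createdAt: message.createdAt, original: true });
    for (const repost of message.reposts || []) {
      entries.push({
        messageId: message._id,
        createdAt: repost.createdAt,
        original: false,
        reposterId: repost.reposterId,
      });
    }
  }
  // Apply the cursor on the merged timestamp, then sort newest first.
  const filtered = before ? entries.filter((entry) => entry.createdAt < before) : entries;
  filtered.sort((a, b) => b.createdAt - a.createdAt);
  const edges = filtered.slice(0, limit);
  return {
    edges,
    hasNextPage: filtered.length > limit,
    endCursor: edges.length ? edges[edges.length - 1].createdAt : null,
  };
}

// Hypothetical sample data: m1 was reposted after m2 was created.
const sampleMessages = [
  {
    _id: "m1",
    createdAt: new Date("2020-02-01T10:00:00Z"),
    reposts: [{ reposterId: "u2", createdAt: new Date("2020-02-02T12:00:00Z") }],
  },
  { _id: "m2", createdAt: new Date("2020-02-02T09:00:00Z"), reposts: [] },
];
const firstPage = buildFeed(sampleMessages, { limit: 2 });
const secondPage = buildFeed(sampleMessages, { before: firstPage.endCursor, limit: 2 });
console.log(firstPage.edges.map((e) => `${e.messageId}:${e.original}`)); // [ 'm1:false', 'm2:true' ]
console.log(secondPage.edges.map((e) => `${e.messageId}:${e.original}`)); // [ 'm1:true' ]
```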
