Mongo find by sum of subdoc array - javascript

I'm trying to find stocks in the Stock collection where the sum of all owners' shares is less than 100. Here is my schema.
const stockSchema = new mongoose.Schema({
owners: [
{
owner: {
type: Schema.Types.ObjectId,
ref: "Owner"
},
shares: {
type: Number,
min: 0,
max: 100
}
}
]
});
const Stock = mongoose.model("Stock", stockSchema);
I've tried to use aggregate but it returns a single object computed over all stocks in the collection, as opposed to multiple objects with the sum of each stock's shares.
stockSchema.statics.getUnderfundedStocks = async () => {
const result = await Stock.aggregate([
{ $unwind: "$owners" },
{ $group: { _id: null, shares: { $sum: "$owners.shares" } } },
{ $match: { shares: { $lt: 100 } } }
]);
return result;
};
So, rather than getting:
[ { _id: null, shares: 150 } ] from getUnderfundedStocks, I'm looking to get:
[ { _id: null, shares: 90 }, { _id: null, shares: 60 } ].
I've come across $expr, which looks useful, but documentation is scarce and I'm not sure if that's the appropriate path to take.
Edit: Some document examples:
/* 1 */
{
"_id" : ObjectId("5ea699fb201db57b8e4e2e8a"),
"owners" : [
{
"owner" : ObjectId("5ea62a94ccb1b974d40a2c72"),
"shares" : 85
}
]
}
/* 2 */
{
"_id" : ObjectId("5ea699fb201db57b8e4e2e1e"),
"owners" : [
{
"owner" : ObjectId("5ea62a94ccb1b974d40a2c72"),
"shares" : 20
},
{
"owner" : ObjectId("5ea62a94ccb1b974d40a2c73"),
"shares" : 50
},
{
"owner" : ObjectId("5ea62a94ccb1b974d40a2c74"),
"shares" : 30
}
]
}
I'd like to return an array that just includes document #1.

You do not need to use $group here. Simply use $project with the $sum operator, which totals the array within each document.
db.collection.aggregate([
{ "$project": {
"shares": { "$sum": "$owners.shares" }
}},
{ "$match": { "shares": { "$lt": 100 } } }
])
You do not even need aggregation here; find with $expr (available since MongoDB 3.6) does the same:
db.collection.find({
"$expr": { "$lt": [{ "$sum": "$owners.shares" }, 100] }
})
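To see why summing per document avoids the single grand total, here is a minimal in-memory sketch in plain JavaScript (no database needed). The helper names totalShares and underfundedFilter are made up for illustration, and the sample data mirrors the two documents in the question with shortened ids.

```javascript
// Sample documents shaped like the ones in the question (ids shortened).
const stocks = [
  { _id: "stock1", owners: [{ owner: "a", shares: 85 }] },
  { _id: "stock2", owners: [
    { owner: "a", shares: 20 },
    { owner: "b", shares: 50 },
    { owner: "c", shares: 30 },
  ] },
];

// In-memory analogue of { $sum: "$owners.shares" }: one total per document.
function totalShares(stock) {
  return (stock.owners || []).reduce((sum, o) => sum + (o.shares || 0), 0);
}

// Hypothetical helper that builds the $expr filter for a given limit.
function underfundedFilter(limit) {
  return { $expr: { $lt: [{ $sum: "$owners.shares" }, limit] } };
}

// stock1 totals 85 and passes; stock2 totals exactly 100 and does not.
const underfunded = stocks.filter((s) => totalShares(s) < 100);
console.log(underfunded.map((s) => s._id));
console.log(JSON.stringify(underfundedFilter(100)));
```

With mongoose, the filter object could presumably be passed straight to Stock.find(underfundedFilter(100)) inside the static method from the question.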

Related

how to catch mongo change streams when one key in object gets updated

I am using MongoDB change streams to update some of my data in the cache, following the official documentation.
I want to know when one of the keys in my object gets updated.
Schema :- attributes: { position: { type: Number } },
How will I come to know when the position will get updated?
This is what my updateDescription object looks like after an update:
updateDescription = {
updatedFields: { 'attributes.position': 1, randomKey: '1234'},
removedFields: []
}
I tried this, and it's not working:
Collection.watch(
[
{
$match: {
$and: [
{ operationType: { $in: ['update'] } },
{
$or: [
{ 'updateDescription.updatedFields[attributes.position]': { $exists: true } },
],
},
],
},
},
],
{ fullDocument: 'updateLookup' },
);
I was able to fix it like this:
const streamUpdates = <CollectionName>.watch(
[
{
$project:
{
'updateDescription.updatedFields': { $objectToArray: '$updateDescription.updatedFields' },
fullDocument: 1,
operationType: 1,
},
},
{
$match: {
$and: [
{ operationType: 'update' },
{
'updateDescription.updatedFields.k': {
$in: [
'attributes.position',
],
},
},
],
},
},
],
{ fullDocument: 'updateLookup' },
);
Then do something with the stream:
streamUpdates.on('change', async (next) => {
//your logic
});
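If reshaping the event in the pipeline feels heavy, the same check can be done inside the change handler. The sketch below is an alternative to the pipeline above, not something the answer proposes. touchesPath is a hypothetical helper, and it assumes the event has the updateDescription.updatedFields shape shown in the question.

```javascript
// Returns true when an update event changed `path` itself, a field under it,
// or a parent object that contains it (e.g. "attributes" replaced as a whole).
function touchesPath(change, path) {
  if (!change || change.operationType !== "update") return false;
  const updated =
    (change.updateDescription && change.updateDescription.updatedFields) || {};
  return Object.keys(updated).some(
    (key) =>
      key === path ||
      key.startsWith(path + ".") ||
      path.startsWith(key + ".")
  );
}

// Sample event modelled on the updateDescription from the question.
const sampleChange = {
  operationType: "update",
  updateDescription: {
    updatedFields: { "attributes.position": 1, randomKey: "1234" },
    removedFields: [],
  },
};

console.log(touchesPath(sampleChange, "attributes.position")); // true
console.log(touchesPath(sampleChange, "attributes.color"));    // false
```

Inside the 'change' handler this would read as if (touchesPath(next, 'attributes.position')) { ... }. The trade-off is that every update event still travels to the client, whereas the $match stage filters on the server.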
Read more in the MongoDB change streams documentation.

Implement feed with retweets in MongoDB

I want to implement a retweet feature in my app. I use Mongoose and have User and Message models, and I store retweets as an array of objects of type {userId, createdAt}, where createdAt is the time when the retweet occurred. The Message model has its own createdAt field.
I need to create a feed of original and retweeted messages merged together based on their createdAt fields. I am stuck on the merging: should I do it in a single query, or run separate queries and merge in JavaScript? Can I do it all in Mongoose with a single query? If not, how do I find the merge insertion points and the index of the last message?
So far I just have fetching of original messages.
My Message model:
const messageSchema = new mongoose.Schema(
{
fileId: {
type: mongoose.Schema.Types.ObjectId,
ref: 'File',
required: true,
},
userId: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User',
required: true,
},
likesIds: [{ type: mongoose.Schema.Types.ObjectId, ref: 'User' }],
reposts: [
{
reposterId: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User',
},
createdAt: { type: Date, default: Date.now },
},
],
},
{
timestamps: true,
},
);
Edit: Now I have this, but pagination is broken. I am trying to use the newCreatedAt field as the cursor, but that doesn't seem to work. The second call returns an empty array when newCreatedAt is passed from the frontend.
messages: async (
parent,
{ cursor, limit = 100, username },
{ models },
) => {
const user = username
? await models.User.findOne({
username,
})
: null;
const options = {
...(cursor && {
newCreatedAt: {
$lt: new Date(fromCursorHash(cursor)),
},
}),
...(username && {
userId: mongoose.Types.ObjectId(user.id),
}),
};
console.log(options);
const aMessages = await models.Message.aggregate([
{
$addFields: {
newReposts: {
$concatArrays: [
[{ createdAt: '$createdAt', original: true }],
'$reposts',
],
},
},
},
{
$unwind: '$newReposts',
},
{
$addFields: {
newCreatedAt: '$newReposts.createdAt',
original: '$newReposts.original',
},
},
{ $match: options },
{
$sort: {
newCreatedAt: -1,
},
},
{
$limit: limit + 1,
},
]);
const messages = aMessages.map(m => {
m.id = m._id.toString();
return m;
});
//console.log(messages);
const hasNextPage = messages.length > limit;
const edges = hasNextPage ? messages.slice(0, -1) : messages;
return {
edges,
pageInfo: {
hasNextPage,
endCursor: toCursorHash(
edges[edges.length - 1].newCreatedAt.toString(),
),
},
};
},
Here are the queries. The working one:
Mongoose: messages.aggregate([{
'$match': {
createdAt: {
'$lt': 2020-02-02T19:48:54.000Z
}
}
}, {
'$sort': {
createdAt: -1
}
}, {
'$limit': 3
}], {})
And the non working one:
Mongoose: messages.aggregate([{
'$match': {
newCreatedAt: {
'$lt': 2020-02-02T19:51:39.000Z
}
}
}, {
'$addFields': {
newReposts: {
'$concatArrays': [
[{
createdAt: '$createdAt',
original: true
}], '$reposts'
]
}
}
}, {
'$unwind': '$newReposts'
}, {
'$addFields': {
newCreatedAt: '$newReposts.createdAt',
original: '$newReposts.original'
}
}, {
'$sort': {
newCreatedAt: -1
}
}, {
'$limit': 3
}], {})
This can be done in one query, although it's a little hack-ish:
db.collection.aggregate([
{
$addFields: {
reposts: {
$concatArrays: [[{createdAt: "$createdAt", original: true}], "$reposts"]
}
}
},
{
$unwind: "$reposts"
},
{
$addFields: {
createdAt: "$reposts.createdAt",
original: "$reposts.original"
}
},
{
$sort: {
createdAt: -1
}
}
]);
You can add any other logic you want to the query using the original field: documents with original: true are the original posts, while the others are retweets.
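The same expansion can be sketched in memory, which also makes the pagination problem from the question easier to see: the cursor is compared against a computed timestamp, so the filter can only run after that timestamp exists. This is an illustrative sketch, not the answer's code. expandFeed, pageFeed and feedAt are hypothetical names, and the sample data is invented.

```javascript
// Expand each message into feed entries: one for the original post and one
// per repost, mirroring $concatArrays + $unwind in the pipeline above.
function expandFeed(messages) {
  return messages.flatMap((m) => [
    { messageId: m._id, feedAt: m.createdAt, original: true },
    ...(m.reposts || []).map((r) => ({
      messageId: m._id,
      feedAt: r.createdAt,
      original: false,
      reposterId: r.reposterId,
    })),
  ]);
}

// Newest-first page of entries strictly older than `cursor` (a Date or null).
// The filter uses the computed feedAt, so it runs after the expansion.
function pageFeed(messages, cursor, limit) {
  const entries = expandFeed(messages)
    .filter((e) => !cursor || e.feedAt < cursor)
    .sort((a, b) => b.feedAt - a.feedAt);
  const edges = entries.slice(0, limit);
  return {
    edges,
    hasNextPage: entries.length > limit,
    endCursor: edges.length ? edges[edges.length - 1].feedAt : null,
  };
}

const messages = [
  { _id: "m1", createdAt: new Date("2020-02-01T10:00:00Z"),
    reposts: [{ reposterId: "u2", createdAt: new Date("2020-02-03T10:00:00Z") }] },
  { _id: "m2", createdAt: new Date("2020-02-02T10:00:00Z"), reposts: [] },
];

const first = pageFeed(messages, null, 2);
const second = pageFeed(messages, first.endCursor, 2);
console.log(first.edges.map((e) => e.messageId));  // repost of m1, then m2
console.log(second.edges.map((e) => e.messageId)); // original m1
```

Returning a null endCursor for an empty page also sidesteps reading a property of an undefined last edge.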

histogram the result of a histogram

I have generated a histogram by the following command:
db.mydb.aggregate([{ $bucketAuto: { groupBy: "$userId", buckets: 1e9 } }])
Assuming I have fewer than 1 billion unique users (and sufficient memory), this gives me the count of documents for each user.
User Docs
===== ====
userA 3
userB 1
userC 5
userD 1
I want to take the result of this histogram and pivot to count the number of users for each document count.
The result would look like:
Docs Users
==== =====
1 2
2 0
3 1
4 0
5 1
Is there a simple, functional way of doing this in MongoDB?
One thing you can start with is a simple $group stage:
db.col.aggregate([
{
$group: {
_id: "$docs",
count: { $sum: 1 }
}
},
{
$project: {
_id: 0,
docs: "$_id",
users: "$count"
}
},
{
$sort: { docs: 1 }
}
])
This gives you the result below:
{ "docs" : 1, "users" : 2 }
{ "docs" : 3, "users" : 1 }
{ "docs" : 5, "users" : 1 }
The document counts that have no users are still missing. You can add them either from your application or from MongoDB (shown below):
db.col.aggregate([
{
$group: {
_id: "$docs",
count: { $sum: 1 }
}
},
{
$group: {
_id: null,
histogram: { $push: "$$ROOT" }
}
},
{
$project: {
values: {
$map: {
input: { $range: [ { $min: "$histogram._id" }, { $add: [ { $max: "$histogram._id" }, 1 ] } ] },
in: {
docs: "$$this",
users: {
$let: {
vars: {
current: { $arrayElemAt: [ { $filter: { input: "$histogram", as: "h", cond: { $eq: [ "$$h._id", "$$this" ] } } }, 0 ] }
},
in: {
$ifNull: [ "$$current.count", 0 ]
}
}
}
}
}
}
}
},
{
$unwind: "$values"
},
{
$replaceRoot: {
newRoot: "$values"
}
}
])
The idea here is that we can $group by null, which produces a single document containing all documents from the previous stage. Knowing the $min and $max values, we can generate a $range of numbers and $map that range onto either the existing counts or a default value of 0. Then we can use $unwind and $replaceRoot to get a single histogram point per document. Output:
{ "docs" : 1, "users" : 2 }
{ "docs" : 2, "users" : 0 }
{ "docs" : 3, "users" : 1 }
{ "docs" : 4, "users" : 0 }
{ "docs" : 5, "users" : 1 }
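For the "from your application" route, the gap filling can be sketched in a few lines of plain JavaScript. This is only an in-memory analogue of the $range, $map and $ifNull steps. fillGaps is a hypothetical helper, and the input is the sparse result of the first $group stage.

```javascript
// Sparse histogram as produced by the first $group stage: _id is the document
// count, count is how many users have that many documents.
const sparse = [
  { _id: 1, count: 2 },
  { _id: 3, count: 1 },
  { _id: 5, count: 1 },
];

// Walk every integer from the smallest to the largest bucket and default
// missing buckets to 0.
function fillGaps(histogram) {
  if (histogram.length === 0) return [];
  const byDocs = new Map(histogram.map((h) => [h._id, h.count]));
  const keys = histogram.map((h) => h._id);
  const min = Math.min(...keys);
  const max = Math.max(...keys);
  const filled = [];
  for (let docs = min; docs <= max; docs++) {
    filled.push({ docs, users: byDocs.get(docs) || 0 });
  }
  return filled;
}

console.log(fillGaps(sparse));
```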
mickl's answer definitely got me moving in the right direction. In particular, using $group is a nice improvement over $bucketAuto for this use-case. The trick to layering the histogram was just to use a $group stage more than once within the same aggregate. I guess it's obvious in hindsight.
The complete solution is here:
const h2 = db.mydb.aggregate([
{ $group: { _id: "$userId", count: { $sum: 1 } } },
{ $group: { _id: "$count", count: { $sum: 1 } } },
{ $project: { docs: "$_id", users: "$count" } },
{ $sort: { docs: +1 } }
])
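The two stacked $group stages amount to counting twice: first documents per user, then users per document count. A rough in-memory equivalent, with countBy as a hypothetical helper and sample data built to match the table in the question, might look like this:

```javascript
// Count occurrences of each key; an in-memory stand-in for
// { $group: { _id: <key>, count: { $sum: 1 } } }.
function countBy(items, keyOf) {
  const counts = new Map();
  for (const item of items) {
    const key = keyOf(item);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Sample data matching the question: userA has 3 docs, userB 1, userC 5, userD 1.
const docs = [];
for (const [userId, n] of [["userA", 3], ["userB", 1], ["userC", 5], ["userD", 1]]) {
  for (let i = 0; i < n; i++) docs.push({ userId });
}

const docsPerUser = countBy(docs, (d) => d.userId);            // first $group
const usersPerCount = countBy(docsPerUser.values(), (n) => n); // second $group

// Equivalent of the $project and $sort stages.
const histogram = [...usersPerCount]
  .map(([docCount, users]) => ({ docs: docCount, users }))
  .sort((a, b) => a.docs - b.docs);
console.log(histogram);
```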

Value added to array when using $project

I am writing an aggregation pipeline to return a win ratio. When I use $sum the value is output from $facet $project within an array. This has me confused. To solve the issue I simply run $sum on the arrays when I calculate the winRatio, which works fine. How do I use $project without it adding values into an array?
Round.aggregate([
{
$match: {
$and: query,
},
},
{
$facet: {
wins: [
{
$match: {
winner: user,
},
},
{
$group: {
_id: { user: '$scores.player', game: '$game' },
value: { $sum: 1 }, // value *not* within array
},
},
],
rounds: [
{
$unwind: '$scores',
},
{
$match: {
'scores.player': user,
},
},
{
$group: {
_id: { user: '$scores.player', game: '$game' },
value: { $sum: 1 }, // value *not* within array
},
},
],
},
},
{
$project: {
_id: '$rounds._id',
rounds: '$rounds.value', // value within an array
wins: '$wins.value', // value within an array
winRatio: { ... },
},
},
]);
Schema:
const schema = new mongoose.Schema(
{
game: { type: mongoose.Schema.ObjectId, required: true },
scores: [
{
player: { type: mongoose.Schema.ObjectId, ref: 'User', required: true },
playerName: { type: String }, // denormalise
score: { type: Number, required: true },
},
],
winner: { type: mongoose.Schema.ObjectId, required: true },
datePlayed: { type: Date },
},
{ timestamps: true },
);
You're asking why $sum 'works' and $project doesn't.
Let's start by understanding the output of the $facet stage.
{
"wins" : [
{
"_id" : {
"user" : [
"player1",
"player2"
],
"game" : 1.0
},
"value" : 2.0
}
],
"rounds" : [
{
"_id" : {
"user" : "player1",
"game" : 1.0
},
"value" : 3.0
}
]
}
As we can see, each facet's result is an array, even though you grouped at the end. Think of each facet as its own aggregation whose return value is always an array (empty or not, depending on the results).
So when you $project on $rounds.value, you're telling MongoDB to keep the value field of every element in the array. In our case there is only one element, but the result is still an array.
$sum, on the other hand, is an accumulator operator. From the docs:
With a single expression as its operand, if the expression resolves to an array, $sum traverses into the array to operate on the numerical elements of the array to return a single value.
A quick fix for your 'issue' is to add $sum while projecting:
{
$project: {
_id: '$rounds._id',
rounds: {$sum: '$rounds.value'},
wins: {$sum: '$wins.value'},
winRatio: { ... },
},
},
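The difference between the two projections can be mimicked in plain JavaScript. The sketch below is only an analogy for how a path like '$rounds.value' behaves on an array field. The data is the $facet output shown above, and sum is a hypothetical helper standing in for the $sum expression.

```javascript
// Shape of the $facet output from the answer: each facet is always an array.
const facetOutput = {
  wins: [{ _id: { user: ["player1", "player2"], game: 1 }, value: 2 }],
  rounds: [{ _id: { user: "player1", game: 1 }, value: 3 }],
};

// '$rounds.value' on an array field behaves like a map over its elements,
// so the projected result is still an array.
const roundsValues = facetOutput.rounds.map((r) => r.value);
const winsValues = facetOutput.wins.map((w) => w.value);

// { $sum: '$rounds.value' } collapses that array into a single number.
const sum = (values) => values.reduce((total, v) => total + v, 0);
const rounds = sum(roundsValues);
const wins = sum(winsValues);

console.log(roundsValues, winsValues); // arrays with one element each
console.log(rounds, wins);             // plain numbers
```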

(mongoDB) find with empty or null values

How do I run a MongoDB collection find (db.collection.find) with empty values?
Currently I have:
function test(s) {
if (!s) {
return null;
} else {
return s;
}
}
var content = {
date: {
from: '2017-10-15',
to: '2017-11-15'
},
'name': 'some text', //this can be null or empty
'text': 'some other text' //this can be null or empty
}
col.find({
"date": {
$gte: new Date(content.date.from),
$lte: new Date(content.date.to)
},
"name": {
$ne: {
$type: null
},
$eq: test(content.name)
},
"text": {
$ne: {
$type: null
},
$eq: test(content.text)
},
}).toArray((err, items) => {
console.log(items)
});
But it returns an empty array, because "name" or "text" is null or an empty string.
I want it to query only on the values that are actually specified and ignore the rest (for example, content.name may contain something or be empty).
How do I do this? I already searched but didn't find anything.
Thanks!
(I already tested "mongoDB : multi value field search ignoring null fields".)
Versions:
Node: 8.9.0
(npm) mongodb: 2.2.33
mongodb: 3.4
Try using the $and and $or operators, something like:
col.find({
$and:[
{"date": {$gte: new Date(content.date.from),$lte: new Date(content.date.to)}},
{"$or":[{"name": {$ne: {$type: null}}},{"name":test(content.name)}]},
{"$or":[{"text": {$ne: {$type: null}}},{"text":test(content.text)}]}
]
}).toArray((err, items) => {
console.log(items)
});
col.find({
$and: [
{
$and: [
{
"date": {
$gte: new Date(content.date.from)
}
},
{
"date": {
$lte: new Date(content.date.to)
}
},
]
},
{
$or: [
{ 'name': null },
{ 'name': content.name }
]
},
{
$or: [
{ 'text': null },
{ 'text': content.text }
]
}
]
})
Edited:
col.find({
$and: [
{
$and: [
{
"date": {
$gte: new Date(content.date.from)
}
},
{
"date": {
$lte: new Date(content.date.to)
}
},
]
},
{
$or: [
{ 'name': null },
{ 'name': '' },
{ 'name': content.name }
]
},
{
$or: [
{ 'text': null },
{ 'text': '' },
{ 'text': content.text }
]
}
]
})
null and the empty string are different values, so you need to add one more condition for the empty string in the query.
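The queries above match documents whose stored field is null or empty. If the intent is instead to skip a criterion whenever the search value itself is empty (one reading of "ignore it" in the question), the filter can be built conditionally before calling find. This is a sketch under that assumption; isBlank and buildFilter are hypothetical helpers.

```javascript
// Treat null, undefined and whitespace-only strings as "not specified".
function isBlank(value) {
  return value === null || value === undefined || String(value).trim() === "";
}

// Build a find() filter that always constrains the date range and only adds
// name/text conditions when the caller actually supplied a value.
function buildFilter(content) {
  const filter = {
    date: {
      $gte: new Date(content.date.from),
      $lte: new Date(content.date.to),
    },
  };
  for (const field of ["name", "text"]) {
    if (!isBlank(content[field])) filter[field] = content[field];
  }
  return filter;
}

const withName = buildFilter({
  date: { from: "2017-10-15", to: "2017-11-15" },
  name: "some text",
  text: "",
});
console.log(Object.keys(withName)); // date and name only; text is skipped
```

The result would then be passed to the driver as col.find(buildFilter(content)).toArray(...), so no null-handling operators are needed in the query itself.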
Logical Query Operators
