How to weight documents to create sort criteria? - javascript

I'm trying to aggregate a collection in which there are documents that look like this:
[
{
"title" : 1984,
"tags" : ['dystopia', 'apocalypse', 'future',....]
},
....
]
And I have a criteria array of keywords, for instance:
var keywords = ['future', 'google', 'cat',....]
What I would like to achieve is to aggregate the collection and $group it by a "convenience" score, so that documents are sorted by how many of the keywords appear in their tags field, highest first.
This means that a document whose tags contain 'future', 'google', 'cat' is sorted before one that has 'future', 'cat', 'apple'.
So far, I have tried something like this:
db.books.aggregate(
{ $group : { _id : {title:"$title"} , convenience: { $sum: { $cond: [ {tags: {$in: keywords}}, 1, 0 ] } } } },
{ $sort : {'convenience': -1}})
But the $in operator is not a boolean so it does not work. I've looked around and didn't find any operator that could help me with this.

As you said, you need a logical operator for $cond to evaluate. It's a bit terse, but here is an implementation using $or:
db.books.aggregate([
{$unwind: "$tags" },
{$group: {
_id: "$title",
weight: {
$sum: {$cond: [
// Test *equality* of the `tags` value against any of the list
{$or: [
{$eq: ["$tags", "future"]},
{$eq: ["$tags", "google"]},
{$eq: ["$tags", "cat"]},
]},
1, 0 ]}
}
}}
])
I'll leave the rest of the implementation up to you, but this should show the basic construction to the point of the matching you want to do.
Addition
From your comments, there also seems to be a programming issue you are struggling with: how to perform an aggregation like this when you have an array of items to query, in the form you gave above:
var keywords = ['future', 'google', 'cat',....]
Since this structure cannot be used directly in the pipeline condition, you need to transform it into the form the pipeline expects. Each language has its own approach, but here is a JavaScript version:
var keywords = ['future', 'google', 'cat'];
var orCondition = [];
keywords.forEach(function(value) {
var doc = {$eq: [ "$tags", value ]};
orCondition.push(doc);
});
And then just define the aggregation query with the orCondition variable in place:
db.books.aggregate([
{$unwind: "$tags" },
{$group: {
_id: "$title",
weight: {
$sum: {$cond: [
{$or: orCondition },
1, 0 ]}
}
}}
])
Or for that matter, any of the parts you need to construct. This is generally how it is done in the real world, where we would almost never hard-code a data structure like this.
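To tie the pieces together, here is a minimal sketch that builds the whole pipeline from the keyword array and adds the $sort stage the question asks for. The helper name buildWeightPipeline is made up for illustration, and the sketch only constructs the pipeline object; running it still requires passing it to db.books.aggregate in the shell or a driver.

```javascript
// Hypothetical helper (not part of MongoDB): builds the aggregation
// pipeline for an arbitrary list of keywords.
function buildWeightPipeline(keywords) {
  // One {$eq: ["$tags", keyword]} test per keyword, combined with $or.
  const orCondition = keywords.map(function (keyword) {
    return { $eq: ["$tags", keyword] };
  });
  return [
    { $unwind: "$tags" },
    { $group: {
      _id: "$title",
      weight: { $sum: { $cond: [{ $or: orCondition }, 1, 0] } }
    } },
    // Highest weight first, as requested in the question.
    { $sort: { weight: -1 } }
  ];
}

const pipeline = buildWeightPipeline(["future", "google", "cat"]);
// In the mongo shell this would be run as: db.books.aggregate(pipeline)
```

Building the stages in a function keeps the keyword list as plain data, so the same code serves any set of keywords.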

Related

$all not working in mongodb

I have two documents in my collection:
{ participants: [ '5ab8fcf6d8bfca2cc0aebb37', '5ab8fd15d8bfca2cc0aebb38' ],
_id: 5ab9a5a0cb274a2064b65d1b,
__v: 0
},
{ participants: [ '5ab8fcf6d8bfca2cc0aebb37', '5ab8fcf6d8bfca2cc0aebb37' ],
_id: 5ab9a5a7cb274a2064b65d1c,
__v: 0
}
and I have an array of persons like
persons = [ '5ab8fcf6d8bfca2cc0aebb37', '5ab8fcf6d8bfca2cc0aebb37' ]
Now I am trying to find the document whose participants field matches the persons array, using this query.
Participant.find({participants: {$all: persons}}).exec()
.then(connected => {
console.log(connected);
// perform some stuff
});
It returns both documents.
I don't know what the problem is.
Thanks in advance.
I think what you want is the $setEquals operator.
db.collection.find({ $expr: { $setEquals: [ persons, "$participants" ] } } )
You can use the $setEquals operator with the $redact operator if the $expr operator is not available in the mongod version you're running.
You can use the $eq operator as well, which matches only arrays equal to persons element for element, in the same order:
Participant.find( { participants: { $eq: persons } })
.then(connected => {
console.log(connected);
// perform some stuff
});
For more, see https://docs.mongodb.com/manual/reference/operator/query/eq/
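The difference between the three operators can be illustrated in plain JavaScript. The functions below are simplified stand-ins written for this example, not MongoDB calls: matchesAll mimics $all (every listed value must be present), setEquals mimics $setEquals (same distinct values, order and duplicates ignored), and arrayEquals mimics $eq on a whole array (same elements in the same order).

```javascript
// Simplified stand-ins for the operator semantics (illustration only).
function matchesAll(field, wanted) {
  // Like $all: every wanted value occurs somewhere in the field.
  return wanted.every(function (w) { return field.includes(w); });
}
function setEquals(a, b) {
  // Like $setEquals: same distinct values, duplicates and order ignored.
  const sa = new Set(a);
  const sb = new Set(b);
  return sa.size === sb.size && Array.from(sa).every(function (x) { return sb.has(x); });
}
function arrayEquals(a, b) {
  // Like $eq on an array: same length, same elements, same order.
  return a.length === b.length && a.every(function (x, i) { return x === b[i]; });
}

const first = ['5ab8fcf6d8bfca2cc0aebb37', '5ab8fd15d8bfca2cc0aebb38'];
const second = ['5ab8fcf6d8bfca2cc0aebb37', '5ab8fcf6d8bfca2cc0aebb37'];
const persons = ['5ab8fcf6d8bfca2cc0aebb37', '5ab8fcf6d8bfca2cc0aebb37'];

const allResults = [matchesAll(first, persons), matchesAll(second, persons)];
const setResults = [setEquals(persons, first), setEquals(persons, second)];
const eqResults = [arrayEquals(first, persons), arrayEquals(second, persons)];
```

Because persons holds the same id twice, the $all-style check is satisfied by any document that contains that id at all, which is why both documents come back.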

Mongoose: Sorting

What's the best way to sort the following documents in a collection:
{"topic":"11.Topic","text":"a.Text"}
{"topic":"2.Topic","text":"a.Text"}
{"topic":"1.Topic","text":"a.Text"}
I am using the following:
find({topic: req.body.topic}).sort({topic: 1})
but it is not working (the fields are strings, not numbers), so I get:
{"topic":"1.Topic","text":"a.Text"},
{"topic":"11.Topic","text":"a.Text"},
{"topic":"2.Topic","text":"a.Text"}
but I'd like to get:
{"topic":"1.Topic","text":"a.Text"},
{"topic":"2.Topic","text":"a.Text"},
{"topic":"11.Topic","text":"a.Text"}
I read another post here that this will require complex sorting which mongoose doesn't have. So perhaps there is no real solution with this architecture?
Your help is greatly appreciated
I suggest you make your topic field of type Number and create another field, topic_text.
Your Schema would look like:
var documentSchema = new mongoose.Schema({
topic : Number,
topic_text : String,
text : String
});
A normal document would then look something like this:
{document1:[{"topic":11,"topic_text" : "Topic" ,"text":"a.Text"},
{"topic":2,"topic_text" : "Topic","text":"a.Text"},
{"topic":1,"topic_text" : "Topic","text":"a.Text"}]}
Thus, you will be able to use .sort({topic : 1}) and get the result you want.
When using the topic value, append topic_text to it:
find({topic: req.body.topic}).sort({topic: 1}).exec(function(err, result)
{
var topic = result[0].topic + result[0].topic_text;//use index i to extract the value from result array.
})
If you do not want to (or cannot) change the shape of your documents to include a numeric field for the topic number, you can achieve the desired sorting with the aggregation framework.
The following pipeline essentially splits the topic strings like '11.Topic' by the dot '.' and then prefixes the first part of the resulting array with a fixed number of leading zeros so that sorting by those strings will result in 'emulated' numeric sorting.
Note, however, that this pipeline uses the $split and $strLenBytes operators, which are fairly new, so you may have to update your MongoDB instance. I used version 3.3.10.
db.getCollection('yourCollection').aggregate([
{
$project: {
topic: 1,
text: 1,
tmp: {
$let: {
vars: {
numStr: { $arrayElemAt: [{ $split: ["$topic", "."] }, 0] }
},
in: {
topicNumStr: "$$numStr",
topicNumStrLen: { $strLenBytes: "$$numStr" }
}
}
}
}
},
{
$project: {
topic: 1,
text: 1,
topicNumber: { $substr: [{ $concat: ["_0000", "$tmp.topicNumStr"] }, "$tmp.topicNumStrLen", 5] },
}
},
{
$sort: { topicNumber: 1 }
},
{
$project: {
topic: 1,
text: 1
}
}
])
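The zero-padding idea behind the pipeline above can be tried out in plain JavaScript. This is only a client-side sketch of the same trick, not a replacement for the aggregation; the helper name paddedTopicNumber is made up for the example, and the width of 5 digits is an assumption that mirrors the pipeline.

```javascript
const docs = [
  { topic: "11.Topic", text: "a.Text" },
  { topic: "2.Topic", text: "a.Text" },
  { topic: "1.Topic", text: "a.Text" }
];

// Hypothetical helper: takes the part before the first dot and left-pads
// it with zeros to 5 characters, e.g. "11.Topic" becomes "00011".
function paddedTopicNumber(topic) {
  return topic.split(".")[0].padStart(5, "0");
}

// Comparing the padded strings gives the same order as comparing numbers.
const sorted = docs.slice().sort(function (a, b) {
  const ka = paddedTopicNumber(a.topic);
  const kb = paddedTopicNumber(b.topic);
  return ka < kb ? -1 : ka > kb ? 1 : 0;
});

const sortedTopics = sorted.map(function (d) { return d.topic; });
```

As with the pipeline, the approach only holds while no topic number has more digits than the padding width.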

mongodb find with opposite of $elemMatch

I have a collection like this:
posts = {
title: 'Hey',
content: '...',
authors: [{
id: 'id',
name: 'John'
}, {
id: 'id2',
name: 'david'
}]
};
I want to make a query. I have found $elemMatch, but I would like the opposite.
I have also found $nin, but I don't know if it works for my case.
Here is what I want to do:
db.posts.find({
authors: {
$notElemMatch: {
id: 'id'
}
}
});
I want to find every post except those written by a given author.
You don't even need $elemMatch since you only have a single field condition. Just use $ne instead:
db.posts.find({ "authors.id": { "$ne": 'id' } });
There is a $not operator, but it does not need to be applied here.
While the original question does not require the use of $elemMatch (as answered by Blakes), here is one way to "invert" the $elemMatch operator.
Use $filter + $and as a substitute for $elemMatch and check that the result has 0 length.
$expr allows the use of aggregation expressions in simple "find" queries.
All conditions in $elemMatch will be translated to items in the array supplied as an argument to $and.
Tested to work with MongoDB server version 5.0.9
{
"$expr": {
"$eq": [
{
"$size": {
"$filter": {
"input": "$authors",
"cond": {
"$and": [
{
"$eq": ["$$this.id", "id"]
}
]
}
}
}
},
0
]
}
}
Here is a shorter way to "invert" the $elemMatch operator with an aggregation expression.
Use $expr with $eq and $in.
$in checks whether the item is in the array. Since we want the opposite, we check that the result of the $in operation is $eq (equal) to false.
{
"$expr": {
"$eq": [{ "$in": ["id", "$authors.id"] }, false]
}
}
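The logic shared by both inverted queries, keeping a post only when no element of authors satisfies the condition, looks like this in plain JavaScript. It is an illustration of the semantics rather than a query; the second post and the helper name noAuthorMatches are made up for the example.

```javascript
const posts = [
  { title: 'Hey', authors: [{ id: 'id', name: 'John' }, { id: 'id2', name: 'david' }] },
  // Hypothetical second post, added so that one document survives the filter.
  { title: 'Other', authors: [{ id: 'id2', name: 'david' }] }
];

// Mirrors $filter + $size === 0: true when no author satisfies the predicate.
function noAuthorMatches(post, predicate) {
  return post.authors.filter(predicate).length === 0;
}

const remaining = posts.filter(function (post) {
  return noAuthorMatches(post, function (author) { return author.id === 'id'; });
});

const remainingTitles = remaining.map(function (post) { return post.title; });
```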

Array of values

For example, I have n documents in a MongoDB collection:
Films = new Mongo.Collection('films');
Films.insert({name: 'name n', actor: 'John'}); // *n
And I want to get an array with only the name values:
var names = ['name 1', 'name 2',..,'name n'];
Any idea how to do it?
And guys, please suggest a better title for my question in the comments, to help others find it, thx :)
You didn't provide any criteria for grouping the names into an array.
You can use following query to get all names:
db.collection.distinct("name")
Or you can use MongoDB's aggregation framework to get all names, grouping them by some condition if required. The query would look like the following:
db.collection.aggregate([{
$group: {
_id: null, // add a grouping condition if required
name: {
$push: "$name"
}
}
}, {
$project: {
name: 1,
_id: 0
}
}])
If you want only distinct names, replace $push with $addToSet.
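The difference between $push and $addToSet can be pictured with a plain JavaScript analogue. The sample documents below are invented for the example, including the actor 'Jane'; the code works on an in-memory array and does not touch a database.

```javascript
// Hypothetical sample data standing in for the films collection.
const films = [
  { name: 'name 1', actor: 'John' },
  { name: 'name 2', actor: 'John' },
  { name: 'name 1', actor: 'Jane' }
];

// Like $push: every name is kept, duplicates included.
const allNames = films.map(function (film) { return film.name; });

// Like $addToSet or distinct("name"): each name appears once.
// A JavaScript Set keeps insertion order; MongoDB makes no such promise
// for $addToSet.
const distinctNames = Array.from(new Set(allNames));
```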

Why is it not possible to find documents with '$all' modifier?

I have the following documents:
{
_id: 1,
title: "oneItem"
},
{
_id: 2,
title: "twoItem"
}
When I try to find these documents by using the following command:
db.collection.documents.find({_id: {$in: [1, 2]}});
I get these two documents but when I try to find these documents by using the following query:
db.collection.documents.find({_id: {$all: [1, 2]}});
I get nothing. Can you explain what's the problem? Basically I need to find all documents with _id 1 and 2 and if none exist then fail.
The reasoning is actually quite simple in that $in and $all have two completely different functions. The documentation links are there, but to explain:
Consider these documents:
{
_id: 1,
items: [ "a", "b" ]
},
{
_id: 2,
items: [ "a", "b", "c" ]
}
$in - provides a list of arguments that can possibly match the value in the field being tested. This would match as follows:
db.collection.find({ items: {$in: [ "a", "b", "c" ] }})
{
_id: 1,
items: [ "a", "b" ]
},
{
_id: 2,
items: [ "a", "b", "c" ]
}
$all - provides a list where the field being matched is expected to be an array and all of the listed elements are present in that array. E.g
db.collection.find({ items: {$all: [ "a", "b", "c" ] }})
{
_id: 2,
items: [ "a", "b", "c" ]
}
That is why your query does not return a result: the _id field is not an array, so it cannot contain both elements.
The MongoDB operator reference is something you should really read through.
As for your statement, ["I need to find all documents with _id 1 and 2 and if someone from them does not exists then fail."], matching various criteria is easy, as you see in the usage of $in. Your problem is that you want the whole set to match or otherwise return nothing ("fail"). I have already explained this to you at some length in a previous question, but to reiterate:
db.collection.aggregate([
// Match on what you want to find
{$match: { _id: {$in: [1,2]} }},
// Add the found results to a distinct *set*
{$group: { _id: null, list: {$addToSet: "$_id"} }},
// Only return when *all* the expected items are in the *set*
{$match: { list: {$all: [1,2]} }}
])
So after that manipulation, this will only return a result if all of the items were found. That is how we use $all to match in this way.
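The three pipeline stages map onto a short plain JavaScript routine, which may make the "all or nothing" logic easier to follow. This is a sketch over an in-memory array, with a made-up function name findAllOrNothing; unlike the pipeline, which returns the grouped list of ids, it returns the matching documents themselves.

```javascript
const documents = [
  { _id: 1, title: "oneItem" },
  { _id: 2, title: "twoItem" }
];

// Hypothetical helper mirroring the pipeline stage by stage.
function findAllOrNothing(docs, ids) {
  // Stage 1, like $match with $in: keep documents whose _id is in the list.
  const found = docs.filter(function (doc) { return ids.includes(doc._id); });
  // Stage 2, like $group with $addToSet: collect the distinct ids found.
  const foundIds = new Set(found.map(function (doc) { return doc._id; }));
  // Stage 3, like $match with $all: succeed only if every id was found.
  const complete = ids.every(function (id) { return foundIds.has(id); });
  return complete ? found : [];
}

const bothPresent = findAllOrNothing(documents, [1, 2]);
const oneMissing = findAllOrNothing(documents, [1, 3]);
```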
