I am working on a MERN project. I have created a collection in MongoDB that holds documents of different types. Is it an accepted practice to keep documents with different structures in a single collection? Secondly, I need to fetch a single document from the collection using only its key name. My documents are:
[{
"_id": {
"$oid": "6333f72822dc0acc4bea17bd"
},
"designation": [
{
"name": "Chairman",
"level": 17
},
{
"name": "Director",
"level": 13
},
{
"name": "Secretary ",
"level": 13
},
{
"name": "Account Officer",
"level": 9
},
{
"name": "Data Entry Operator-GR B",
"level": 5
}
]
},
{
"_id": {
"$oid": "6334313b22dc0acc4bea17c2"
},
"storeRole": ["manager", "approver", "accepter", "firstsignatory"]
},
{
"_id": {
"$oid": "63369d2083a7cc2e818990dd"
},
"designationSuffix": ["I","II", "III"]
}]
How do I get any of the three documents if I only know the key name, i.e. designation, storeRole, or designationSuffix? I don't want to use the ID value.
Welcome to SO.
First, yes, it is an accepted practice, and indeed a powerful feature of MongoDB, to have different shapes of data in a single collection.
There are two important things to remember when querying for data:
Matching on fields that don't even exist in a document is OK; the document will simply be skipped. This permits you, for example, to query for storeRole and ignore the other documents with designation, etc. -- unless of course you wish to look for those too using an $or expression.
Matching (using $match) for elements in an array will return the whole array, not just the elements that match.
To illustrate this point, let's expand your input data slightly:
{"designation": [
{"name": "Chairman","level": 17},
{"name": "Director", "level": 13}
]
},
{"designation": [
{"name": "Secretary","level": 13}
]
},
We will use dot notation to reach into the structures in the designation array to find those docs where at least one of the name fields is Chairman:
db.foo.aggregate([
{$match: {"designation.name": "Chairman"}}
]);
{
"_id" : 0,
"designation" : [
{
"name" : "Chairman",
"level" : 17
},
{
"name" : "Director",
"level" : 13
}
]
}
The query eliminated the document with name = Secretary as expected but properly returned the whole document (and the whole array) where name = Chairman. Very often the goal is to fetch only the matching items in the array; this is accomplished with the $filter operator:
db.foo.aggregate([
{$match: {"designation.name": "Chairman"}},
{$project: {
// Assigning the output of $filter to the same name as input:
designation: {$filter: {
input: "$designation",
as: "zz",
cond: {$eq: ['$$zz.name','Chairman']}
}}
}}
]);
{
"_id" : 0,
"designation" : [
{
"name" : "Chairman",
"level" : 17
}
]
}
An alternative approach, useful when query conditions yield null or empty arrays instead of eliminating the document altogether, is to $filter first and then match only on results where the array has a length > 0. We must use the $ifNull function to protect $size from being passed a null by turning it into an empty (but not null) array:
db.foo.aggregate([
{$project: {
// Assigning the output of $filter to the same name as input:
designation: {$filter: {
input: "$designation",
as: "zz",
cond: {$eq: ['$$zz.name','Chairman']}
}}
}},
{$match: {$expr: {$gt:[{$size: {$ifNull:["$designation",[] ]}}, 0]}} }
]);
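The need for the $ifNull guard can be traced in plain JavaScript. The sketch below is an in-memory imitation, not MongoDB code: filterByName and sizeOrZero are hypothetical helpers that mimic how $filter and the guarded $size treat a matching array, an array with no matches, and a missing field.

```javascript
// In-memory imitation of {$filter: ...} on the designation array.
// Like $filter, it yields null when the input field is missing or null.
function filterByName(doc, name) {
  if (doc.designation == null) return null;
  return doc.designation.filter(d => d.name === name);
}

// Imitates {$size: {$ifNull: [value, []]}}: a null becomes an empty array.
function sizeOrZero(value) {
  return (value ?? []).length;
}

const docs = [
  { _id: 0, designation: [{ name: "Chairman", level: 17 }, { name: "Director", level: 13 }] },
  { _id: 1, designation: [{ name: "Secretary", level: 13 }] },
  { _id: 2, storeRole: ["manager", "approver"] }
];

// The $project stage: every document survives, with a filtered (or null) array.
const filtered = docs.map(d => ({ _id: d._id, designation: filterByName(d, "Chairman") }));

// The final $match: keep only documents whose filtered array is non-empty.
const kept = filtered.filter(d => sizeOrZero(d.designation) > 0);

console.log(JSON.stringify(filtered));
console.log(kept.map(d => d._id));
```

Document 1 ends up with an empty array and document 2 with null, which is why an unguarded length check would not be safe for both.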
Try commenting out the $match to see what $filter returns when a document has the target array field but no matches vs. when the document does not have the field.
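For the original question of fetching a document when only the key name is known, one option is to match on the existence of that field with the $exists query operator. This is a minimal sketch, assuming the collection is named foo as in the examples above. byKeyName and findOneByKey are hypothetical helpers, and the second one only imitates the lookup in memory so the idea can be checked without a database.

```javascript
// Builds a filter that matches documents containing the given top-level key.
// With a live connection it could be passed to db.foo.findOne(...) (assumed usage).
function byKeyName(key) {
  return { [key]: { $exists: true } };
}

const docs = [
  { _id: 1, designation: [{ name: "Chairman", level: 17 }] },
  { _id: 2, storeRole: ["manager", "approver", "accepter", "firstsignatory"] },
  { _id: 3, designationSuffix: ["I", "II", "III"] }
];

// In-memory imitation of findOne(byKeyName(key)): first document that has the key.
function findOneByKey(collection, key) {
  return collection.find(doc => Object.prototype.hasOwnProperty.call(doc, key)) ?? null;
}

console.log(byKeyName("storeRole"));
console.log(findOneByKey(docs, "storeRole"));
```

If several documents could carry the same key, find() rather than findOne() would return all of them.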
I have a model Book with a field "tags" which is of type array of String / GraphQLString.
Currently, I'm able to query the tags for each book.
{
books {
id
tags
}
}
and I get the result:
{
"data": {
"books": [
{
"id": "631664448cb20310bc25c89d",
"tags": [
"database",
"middle-layer"
]
},
{
"id": "6316945f8995f05ac71d3b22",
"tags": [
"relational",
"database"
]
},
]
}
}
I want to write a RootQuery where I can fetch all unique tags across all books. This is how far I am (which is not too much):
tags: {
type: new GraphQLList(GraphQLString),
resolve(parent, args) {
Book.find({}) // CAN'T FIGURE OUT WHAT TO DO HERE
return [];
}
}
Basically, I'm trying to fetch all books and then potentially merge all tags fields on each book.
I expect that if I query:
{
tags
}
I would get
["relational", "database", "middle-layer"]
I am just starting with Mongoose, MongoDB, as well as GraphQL, so not 100% sure what keywords to exactly look fo or even what the title of this question should be.
Appreciate the help.
You want to $unwind the arrays so they're flat; at that point we can just use $group to get unique values, like so:
db.collection.aggregate([
{
"$unwind": "$data.books"
},
{
"$unwind": "$data.books.tags"
},
{
$group: {
_id: "$data.books.tags"
}
}
])
Mongo Playground
MongoDb + JavaScript Solution
// inside an async function (for example the resolver)
const tags = await Book.aggregate([
{
$project: {
tags: 1,
_id: 0,
}
},
])
This returns an array of objects that contain only the tags value. The $project stage selects which keys are passed along the aggregation pipeline: 1 includes a key and 0 excludes it. _id is included by default, so it needs to be explicitly excluded.
Then take the tags array that looks like this:
[
{
"tags": [
"database",
"middle-layer"
]
},
{
"tags": [
"relational",
"database"
]
}
]
And reduce it to one unified array, then make it into a JavaScript Set, which excludes duplicates by default. I convert it back to an Array at the end, in case you need to perform array methods on it or write it back to the DB.
let allTags = tags.reduce((total, curr) => [...total, ...curr.tags], [])
allTags = Array.from(new Set(allTags))
const tags = [
{
"tags": [
"database",
"middle-layer"
]
},
{
"tags": [
"relational",
"database"
]
}
]
let allTags = tags.reduce((total, curr) => [...total, ...curr.tags], [])
allTags = Array.from(new Set(allTags))
console.log(allTags)
Pure MongoDB Solution
Book.aggregate([
{
$unwind: "$tags"
},
{
$group: {
_id: "_id",
tags: {
"$addToSet": "$tags"
}
}
},
{
$project: {
tags: 1,
_id: 0,
}
}
])
Steps in Aggregation Pipeline
$unwind
Creates a new Mongo Document for each tag in tags
$group
Merges the individual tags into a set called tags
Sets are required to have unique values and will exclude duplicates by default
_id is a required field
_id will be excluded from the final aggregation so it doesn't matter what it is
$project
Chooses which fields to pull from the previous step in the pipeline
Using it here to exclude _id from the results
Output
[
{
"tags": [
"database",
"middle-layer",
"relational"
]
}
]
Mongo Playground Demo
While this solution gets the result with purely Mongo queries, the resulting output is nested and still requires traversal to get to desired fields. I do not know of a way to replace the root with a list of string values in an aggregation pipeline. So at the end of the day, JavaScript is still required.
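That last JavaScript step is small. This is a minimal sketch, assuming the aggregation returns either one document shaped like the output above or an empty array when the collection has no books. unwrapTags is a hypothetical helper name.

```javascript
// Pulls the flat list of tags out of the aggregation result.
// An empty result (no books at all) becomes an empty list.
function unwrapTags(result) {
  return result.length > 0 ? result[0].tags : [];
}

// Shape taken from the output shown above.
const sample = [{ tags: ["database", "middle-layer", "relational"] }];

console.log(unwrapTags(sample));
console.log(unwrapTags([]));

// In the resolver this could be used roughly as (assumed, not run here):
//   resolve: async () => unwrapTags(await Book.aggregate(pipeline))
```

Mongoose also provides Model.distinct(), which may return the flat list of tags directly; it is worth checking against your own version and data before relying on it.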
hey I am quite new to mongoose and can't get my head around search.
models
User->resumes[]->employments[]
UserSchema
{
resumes: [ResumeSchema],
...
}
ResumeSchema
{
employments: [EmploymentSchema],
...
}
EmploymentSchema
{
jobTitle: {
type: String,
required: [true, "Job title is required."]
},
...
}
Background
User has to enter job title and needs suggestions from the existing data of the already present resumes and their employment's job title
I have tried the following code.
let q = req.query.q; // Software
User.find({ "resumes.employments.jobTitle": new RegExp(req.query.q, 'ig') }, {
"resumes.employments.$": 1
}, (err, docs) => {
res.json(docs);
})
Output
[
{
_id: '...',
resumes:[
{
employments: [
{
jobTitle: 'Software Developer',
...
},
...
]
},
...
]
},
...
]
Expected Output
["Software Developer", "Software Engineer", "Software Manager"]
Problem
1:) The Data returned is too much as I only need jobTitle
2:) All employments are being returned whereas the query matched one of them
3:) Is there any better way to do it ? via index or via $search ? I did not find much of information in mongoose documentation to create search index (and I also don't really know how to create a compound index to make it work)
I know there might be a lot of answers but none of them helped or I was not able to make them work ... I am really new to mongodb I have been working with relational databases via SQL or through ORM so my mongodb concepts and knowledge is limited.
So please let me know if there is a better solution to do it. or something to make the current one working.
You can use one of the aggregation queries below to get this result:
[
{
"jobTitle": [
"Software Engineer",
"Software Manager",
"Software Developer"
]
}
]
The query works as follows:
First, $unwind is used twice to deconstruct the arrays and reach the values.
Then $match filters by the values you want using $regex.
Then $group gathers all values together (using _id: null, and $addToSet so duplicates are not added).
And finally $project shows only the field you want.
// The stages are passed as a single array; Mongoose 5 and later no longer accept them as separate arguments.
User.aggregate([{
"$unwind": "$resumes"
},
{
"$unwind": "$resumes.employments"
},
{
"$match": {
"resumes.employments.jobTitle": {
"$regex": "software",
"$options": "i"
}
}
},
{
"$group": {
"_id": null,
"jobTitle": {
"$addToSet": "$resumes.employments.jobTitle"
}
}
},
{
"$project": {
"_id": 0
}
}])
Example here
Another option is to use $filter in a $project stage:
This is similar to the query above, but uses $filter instead of a second $unwind on the employments array.
// The stages are passed as a single array; Mongoose 5 and later no longer accept them as separate arguments.
User.aggregate([{
"$unwind": "$resumes"
},
{
"$project": {
"jobs": {
"$filter": {
"input": "$resumes.employments",
"as": "e",
"cond": {
"$regexMatch": {
"input": "$$e.jobTitle",
"regex": "Software",
"options": "i"
}
}
}
}
}
},
{
"$unwind": "$jobs"
},
{
"$group": {
"_id": null,
"jobTitle": {
"$addToSet": "$jobs.jobTitle"
}
}
},
{
"$project": {
"_id": 0
}
}])
Example here
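Since the search term in the question comes from req.query.q, it may help to build the pipeline from a variable and to escape regex metacharacters first. This is a sketch under assumptions: escapeRegex, jobTitlePipeline and titlesFrom are hypothetical helper names, and the stages mirror the first query above.

```javascript
// Escapes characters that have a special meaning in a regular expression,
// so user input such as "c++" is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Builds the first pipeline above for an arbitrary search term.
function jobTitlePipeline(term) {
  return [
    { $unwind: "$resumes" },
    { $unwind: "$resumes.employments" },
    { $match: { "resumes.employments.jobTitle": { $regex: escapeRegex(term), $options: "i" } } },
    { $group: { _id: null, jobTitle: { $addToSet: "$resumes.employments.jobTitle" } } },
    { $project: { _id: 0 } }
  ];
}

// The aggregation returns at most one document; unwrap it to a flat list.
function titlesFrom(result) {
  return result.length > 0 ? result[0].jobTitle : [];
}

console.log(JSON.stringify(jobTitlePipeline("c++")[2]));
console.log(titlesFrom([{ jobTitle: ["Software Engineer", "Software Developer"] }]));

// Assumed usage inside an async route handler (not run here):
//   res.json(titlesFrom(await User.aggregate(jobTitlePipeline(req.query.q))));
```

titlesFrom turns the single grouped document into the plain array of strings that the question lists as the expected output.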
I am working on versioning. We have documents identified by UUIDs and jobUuids; the jobUuid documents are the ones associated with the user who is currently working. I have some aggregate queries on these collections which I need to update based on the job UUIDs.
The results fetched by the aggregate query should be such that:
if the current user's jobUuid document does not exist, then the master document with jobUuid: "default" is returned (the document without any specific jobUuid),
if a document with the job UUID exists, then only that document is returned.
I have a $match used to get these documents based on certain conditions. From those documents I need to filter based on the conditions above; an example is shown below.
The data looks like this:
[
{
"uuid": "5cdb5a10-4f9b-4886-98c1-31d9889dd943",
"name": "adam",
"jobUuid": "default",
},
{
"uuid": "5cdb5a10-4f9b-4886-98c1-31d9889dd943",
"jobUuid": "d275781f-ed7f-4ce4-8f7e-a82e0e9c8f12",
"name": "adam"
},
{
"uuid": "b745baff-312b-4d53-9438-ae28358539dc",
"name": "eve",
"jobUuid": "default",
},
{
"uuid": "b745baff-312b-4d53-9438-ae28358539dc",
"jobUuid": "d275781f-ed7f-4ce4-8f7e-a82e0e9c8f12",
"name": "eve"
},
{
"uuid": "26cba689-7eb6-4a9e-a04e-24ede0309e50",
"name": "john",
"jobUuid": "default",
}
]
Results for "jobUuid": "d275781f-ed7f-4ce4-8f7e-a82e0e9c8f12" should be:
[
{
"uuid": "5cdb5a10-4f9b-4886-98c1-31d9889dd943",
"jobUuid": "d275781f-ed7f-4ce4-8f7e-a82e0e9c8f12",
"name": "adam"
},
{
"uuid": "b745baff-312b-4d53-9438-ae28358539dc",
"jobUuid": "d275781f-ed7f-4ce4-8f7e-a82e0e9c8f12",
"name": "eve"
},
{
"uuid": "26cba689-7eb6-4a9e-a04e-24ede0309e50",
"name": "john",
"jobUuid": "default",
}
]
Based on the conditions mentioned above, is it possible to filter the document within the aggregate query to extract the document of a specific job uuid?
Edit 1: I got the following solution, which is working fine, I want a better solution, eliminating all those nested stages.
Edit 2: Updated the data with actual UUIDs and I just included only the name as another field, we do have n number of fields which are not relevant to include here but needed at the end (mentioning this for those who want to use the projection over all the fields).
Update based on comment:
but the UUIDs are alphanumeric strings, as shown above, does it have
an effect on these sorting, and since we are not using conditions to
get the results, I am worried it will cause issues.
You could use an additional field to make the sort order match the order of the values in the $in expression. Make sure "default" is the last value in the list.
[
{"$match":{"jobUuid":{"$in":["d275781f-ed7f-4ce4-8f7e-a82e0e9c8f12","default"]}}},
{"$addFields":{ "order":{"$indexOfArray":[["d275781f-ed7f-4ce4-8f7e-a82e0e9c8f12","default"], "$jobUuid"]}}},
{"$sort":{"uuid":1, "order":1}},
{
"$group": {
"_id": "$uuid",
"doc":{"$first":"$$ROOT"}
}
},
{"$project":{"doc.order":0}},
{"$replaceRoot":{"newRoot":"$doc"}}
]
example here - https://mongoplayground.net/p/wXiE9i18qxf
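The logic of that pipeline (rank by position in the list, sort, keep the first document per uuid) can be traced in plain JavaScript. This is an in-memory sketch using the sample data from the question, not MongoDB code; the comments name the stage each step imitates.

```javascript
const wanted = "d275781f-ed7f-4ce4-8f7e-a82e0e9c8f12";
const priority = [wanted, "default"]; // same order as in the $in expression

const docs = [
  { uuid: "5cdb5a10-4f9b-4886-98c1-31d9889dd943", name: "adam", jobUuid: "default" },
  { uuid: "5cdb5a10-4f9b-4886-98c1-31d9889dd943", name: "adam", jobUuid: wanted },
  { uuid: "b745baff-312b-4d53-9438-ae28358539dc", name: "eve", jobUuid: "default" },
  { uuid: "b745baff-312b-4d53-9438-ae28358539dc", name: "eve", jobUuid: wanted },
  { uuid: "26cba689-7eb6-4a9e-a04e-24ede0309e50", name: "john", jobUuid: "default" }
];

const picked = new Map();
docs
  .filter(d => priority.includes(d.jobUuid))                  // $match with $in
  .map(d => ({ doc: d, order: priority.indexOf(d.jobUuid) })) // $addFields + $indexOfArray
  .sort((a, b) => a.doc.uuid.localeCompare(b.doc.uuid) || a.order - b.order) // $sort
  .forEach(({ doc }) => {
    if (!picked.has(doc.uuid)) picked.set(doc.uuid, doc);     // $group with $first
  });

const result = [...picked.values()];                          // $replaceRoot
console.log(result.map(d => `${d.name}: ${d.jobUuid}`));
```

Because the rank comes from the position in the list rather than from comparing the UUID strings themselves, alphanumeric job UUIDs do not affect which document wins.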
Original
You could use the query below. It picks the non-default document for a uuid if one exists, or else the default document as the only one.
[
{"$match":{"jobUuid":{"$in":[1,"default"]}}},
{"$sort":{"uuid":1, "jobUuid":1}},
{
"$group": {
"_id": "$uuid",
"doc":{"$first":"$$ROOT"}
}
},
{"$replaceRoot":{"newRoot":"$doc"}}
]
example here - https://mongoplayground.net/p/KrL-1s8WCpw
Here is what I would do:
A $match stage with $in rather than $or (for readability).
A $group stage with _id on $uuid, just as you did, but instead of pushing all the data into an array, be more selective. _id already stores $uuid, so there is no reason to capture it again. name must always be the same for each $uuid, so take only the first instance. Based on the match there are only two possibilities for jobUuid, but this assumes it will be either "default" or something else, and that there can be more than one occurrence of the non-"default" jobUuid. Use $addToSet instead of pushing to an array in case there are multiple occurrences of the same jobUuid for a user. Before adding to the set, use a conditional to add only non-"default" jobUuids, with $$REMOVE to avoid inserting a null when the jobUuid is "default".
Finally, a $project stage to clean things up. If element 0 of the jobUuids array does not exist (is null), the only possibility for this user is that the jobUuid is "default", so use $ifNull to test for that and set "default" as appropriate. There could be more than one jobUuid here, depending on whether that is allowed in your db/application; it is up to you to decide how to handle that (take the highest, take the lowest, etc.).
Tested at: https://mongoplayground.net/p/e76cVJf0F3o
[{
"$match": {
"jobUuid": {
"$in": [
"1",
"default"
]
}
}
},
{
"$group": {
"_id": "$uuid",
"name": {
"$first": "$name"
},
"jobUuids": {
"$addToSet": {
"$cond": {
"if": {
"$ne": [
"$jobUuid",
"default"
]
},
"then": "$jobUuid",
"else": "$$REMOVE"
}
}
}
}
},
{
"$project": {
"_id": 0,
"uuid": "$_id",
"name": 1,
"jobUuid": {
"$ifNull": [{
"$arrayElemAt": [
"$jobUuids",
0
]
},
"default"
]
}
}
}]
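The grouping above can be followed step by step in plain JavaScript. This is an in-memory sketch, not MongoDB code; the uuid values "u1" and "u2" are shortened placeholders rather than the real UUIDs from the question.

```javascript
const wanted = "d275781f-ed7f-4ce4-8f7e-a82e0e9c8f12";

const docs = [
  { uuid: "u1", name: "adam", jobUuid: "default" },
  { uuid: "u1", name: "adam", jobUuid: wanted },
  { uuid: "u2", name: "john", jobUuid: "default" }
];

// $match + $group: one entry per uuid, collecting only non-"default" jobUuids in a Set.
const groups = new Map();
for (const d of docs.filter(x => [wanted, "default"].includes(x.jobUuid))) {
  if (!groups.has(d.uuid)) groups.set(d.uuid, { name: d.name, jobUuids: new Set() });
  // Skipping "default" plays the role of $$REMOVE inside $addToSet.
  if (d.jobUuid !== "default") groups.get(d.uuid).jobUuids.add(d.jobUuid);
}

// $project: first collected jobUuid, or "default" when none exists ($ifNull).
const result = [...groups].map(([uuid, g]) => ({
  uuid,
  name: g.name,
  jobUuid: [...g.jobUuids][0] ?? "default"
}));

console.log(result);
```

Note that this shape carries only uuid, name and jobUuid; any other fields needed at the end would have to be captured in the $group stage as well.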
I was able to solve this problem with the following aggregate query.
First, the $match stage extracts only the results matching the jobUuid provided by the user or "default".
Then the results are grouped by uuid in a $group stage, which also counts them.
Using the conditions in $replaceRoot, we first check the count of each group of documents:
If the count is greater than or equal to 2, we take the document whose jobUuid is not "default", i.e. the one matching the provided jobUuid.
If it is less than or equal to 1, we check whether the document matches the default jobUuid and return it.
The query is below:
[
{
$match: {
$or: [{ jobUuid:1 },{ jobUuid: 'default'}]
}
},
{
$group: {
_id: '$uuid',
count: {
$sum: 1
},
docs: {
$push: '$$ROOT'
}
}
},
{
$replaceRoot: {
newRoot: {
$cond: {
if: {
$gte: [
'$count',
2
]
},
then: {
$arrayElemAt: [
{
$filter: {
input: '$docs',
as: 'item',
cond: {
$ne: [
'$$item.jobUuid',
'default'
]
}
}
},
0
]
},
else: {
$arrayElemAt: [
{
$filter: {
input: '$docs',
as: 'item',
cond: {
$eq: [
'$$item.jobUuid',
'default'
]
}
}
},
0
]
}
}
}
}
}
]
I have this schema
module.exports = function(conn, mongoose) {
// var autoIncrement = require('mongoose-auto-increment');
var UsersSchema = new mongoose.Schema({
first_name: String,
last_name:String,
sex: String,
fk_hobbies: []
}
, {
timestamps: true
}, {collection: 'wt_users'});
return conn.model('wt_users', UsersSchema);
};
And for example I have these users in data base
{
"_id" : ObjectId("5aca2ac25c1d8adeb4a2dab0"),
first_name:"Pierro",
last_name:"pierre",
sex:"H",
fk_hobbies: [
{
"_id" : ObjectId("5ac9f84d5c1f8adeb4a2da97"),
"name" : "Art"
},
{
"_id" : ObjectId("5ac9f84d5c8d8adeb4a2da97"),
"name" : "Sport"
},
{
"_id" : ObjectId("5ac9f84d9c1d8adeb4a2da97"),
"name" : "Fete"
},
{
"_id" : ObjectId("5acaf84d5c1d8adeb4a2da97"),
"name" : "Série"
},
{
"_id" : ObjectId("6ac9f84d5c1d8adeb4a2da97"),
"name" : "Jeux vidéo"
}
]
},
{
"_id" : ObjectId("5ac9fa075c1d8adeb4a2da99"),
first_name:"jean",
last_name:"mark",
sex:"H",
fk_hobbies: [
{
"_id" : ObjectId("5ac7f84d5c1d8adeb4a2da97"),
"name" : "Musique"
},
{
"_id" : ObjectId("5ac9f24d5c1d8adeb4a2da97"),
"name" : "Chiller"
},
{
"_id" : ObjectId("5ac9f84c5c1d8adeb4a2da97"),
"name" : "Papoter"
},
{
"_id" : ObjectId("5ac9f84d2c1d8adeb4a2da97"),
"name" : "Manger"
},
{
"_id" : ObjectId("5ac9f84d5c1d8adeb4a2da97"),
"name" : "Film"
}
]
},
{
"_id" : ObjectId("5aca0a635c1d8adeb4a2da9d"),
first_name:"michael",
last_name:"ferrari",
sex:"H",
fk_hobbies: [
{
"_id" : ObjectId("5ac9f84d5c1d8adeb4a2ea97"),
"name" : "fashion"
},
{
"_id" : ObjectId("5ac9f84d5c1e8adeb4a2da97"),
"name" : "Voyage"
},
{
"_id" : ObjectId("5ac9f84c5c1d8adeb4a2da97"),
"name" : "Papoter"
},
{
"_id" : ObjectId("5ac9f84d2c1d8adeb4a2da97"),
"name" : "Manger"
},
{
"_id" : ObjectId("5ac9f84d5c1d8adeb4a2da97"),
"name" : "Film"
}
]
},
{
"_id" : ObjectId("5ac9fa074c1d8adeb4a2da99"),
first_name:"Philip",
last_name:"roi",
sex:"H",
fk_hobbies:
[
{
"_id" : ObjectId("5ac7f84d5c1d8adeb4a2da97"),
"name" : "Musique"
},
{
"_id" : ObjectId("5ac9f24d5c1d8adeb4a2da97"),
"name" : "Chiller"
},
{
"_id" : ObjectId("5ac9f84c5c1d8adeb4a2da97"),
"name" : "Papoter"
},
{
"_id" : ObjectId("5ac9f84d2c1d8adeb4a2da97"),
"name" : "Manger"
},
{
"_id" : ObjectId("5ac9f84d5c1d8adeb4a2da97"),
"name" : "Film"
}
]
}
I want to create a mongoose query that matches a user, fetched by id, against the other users in the database as follows:
the query returns first the users who share the maximum number of hobbies (that is, 5), then the users who share 4 hobbies, and so on.
I have written a solution entirely in JavaScript / Node.js. Is there a way to do this with a MongoDB query?
This is my solution:
//var user : the current user that search other similar users : jean mark : 5ac9fa075c1d8adeb4a2da99
//var users : all other users
var tab = []
async.each(users, function(item, next1){
var j = 0;
var hobbies = item["fk_hobbies"]
for(var i = 0; i < 5; i++)
{
var index = hobbies.findIndex(x => x["_id"] == user[0]["fk_hobbies"][i]["_id"].toString());
if(index != -1)
j++
}
if(j != 0)
tab.push({nbHob:j, user:item})
next1()
}, function ()
{
var tab2 = tab.sort(compare)
res.json({success:true, data:tab2})
})
function compare(a,b) {
if (a.nbHob > b.nbHob)
return -1;
if (a.nbHob < b.nbHob)
return 1;
return 0;
}
the displayed result is like this
nbHob : represents the number of similar hobbies
{"success":true,"data":[{"nbHob":5,"user":{"_id":"5ac9fa074c1d8adeb4a2da99","u_first_name":"Akram","u_last_name":"Cherif","u_email":"","u_login":"","u_password":"","u_user_type":0,"u_date_of_birth":"","u_civility":0,"u_sex":"H","u_phone_number":"","u_facebook_id":"","u_google_id":"","u_twitter_id":"","u_profile_image":"","u_about":"","u_profession":"","u_fk_additional_infos":[null],"u_budget":0,"u_address":{"country":"France","state":"Paris","city":"TM","zip":76001},"u_fk_hobbies":[{"name":"Musique","_id":"5ac7f84d5c1d8adeb4a2da97"},{"name":"Chiller","_id":"5ac9f24d5c1d8adeb4a2da97"},{"name":"Papoter","_id":"5ac9f84c5c1d8adeb4a2da97"},{"name":"Manger","_id":"5ac9f84d2c1d8adeb4a2da97"},{"name":"Film","_id":"5ac9f84d5c1d8adeb4a2da97"}]}},{"nbHob":3,"user":{"_id":"5aca0a635c1d8adeb4a2da9d","u_first_name":"Chawki","u_last_name":"Gasmi","u_email":"","u_login":"","u_password":"","u_user_type":0,"u_date_of_birth":"","u_civility":0,"u_sex":"H","u_phone_number":"","u_facebook_id":"","u_google_id":"","u_twitter_id":"","u_profile_image":"","u_about":"","u_profession":"","u_fk_additional_infos":[null],"u_budget":{"min":500,"max":850},"u_address":{"country":"","state":"","city":"","zip":0},"u_fk_hobbies":[{"name":"fashion","_id":"5ac9f84d5c1d8adeb4a2ea97"},{"name":"Voyage","_id":"5ac9f84d5c1e8adeb4a2da97"},{"name":"Papoter","_id":"5ac9f84c5c1d8adeb4a2da97"},{"name":"Manger","_id":"5ac9f84d2c1d8adeb4a2da97"},{"name":"Film","_id":"5ac9f84d5c1d8adeb4a2da97"}]}}]}
Your question data seems a bit messed up, probably due to far too liberal copy/paste, since every hobby has the same ObjectId value. But I can correct that with a full self-contained example:
const mongoose = require('mongoose');
const { Schema } = mongoose;
const uri = 'mongodb://localhost/people';
mongoose.Promise = global.Promise;
mongoose.set('debug', true);
const hobbySchema = new Schema({
name: String
});
const userSchema = new Schema({
first_name: String,
last_name: String,
sex: String,
fk_hobbies: [hobbySchema]
});
const Hobby = mongoose.model('Hobby', hobbySchema)
const User = mongoose.model('User', userSchema);
const userData = [
{
"first_name" : "Pierro",
"last_name" : "pierre",
"sex" : "H",
"fk_hobbies" : [
"Art", "Sport", "Fete", "Série", "Jeux vidéo"
]
},
{
"first_name": "jean",
"last_name" : "mark",
"sex" : "H",
"fk_hobbies" : [
"Musique", "Chiller", "Papoter", "Manger", "Film"
]
},
{
"first_name" : "michael",
"last_name" : "ferrari",
"sex" : "H",
"fk_hobbies" : [
"fashion", "Voyage", "Papoter", "Manger", "Film"
]
},
{
"first_name" : "Philip",
"last_name" : "roi",
"sex" : "H",
"fk_hobbies" : [
"Musique", "Chiller", "Papoter", "Manger", "Film"
]
}
];
const log = data => console.log(JSON.stringify(data, undefined, 2));
(async function() {
try {
const conn = await mongoose.connect(uri);
await Promise.all(
Object.entries(conn.models).map(([k,m]) => m.deleteMany({}))
);
const hobbies = await Hobby.insertMany(
[
...userData
.reduce((o, u) => [ ...o, ...u.fk_hobbies ], [])
.reduce((o, u) => o.set(u,1) , new Map())
]
.map(([name,v]) => ({ name }))
);
const users = await User.insertMany(userData.map(u =>
({
...u,
fk_hobbies: u.fk_hobbies.map(f => hobbies.find(h => f === h.name))
})
));
let user = await User.findOne({
"first_name" : "Philip",
"last_name" : "roi"
});
let user_hobbies = user.fk_hobbies.map(h => h._id );
let result = await User.aggregate([
{ "$match": {
"_id": { "$ne": user._id },
"fk_hobbies._id": { "$in": user_hobbies }
}},
{ "$addFields": {
"numHobbies": {
"$size": {
"$setIntersection": [
"$fk_hobbies._id",
user_hobbies
]
}
},
"fk_hobbies": {
"$map": {
"input": "$fk_hobbies",
"in": {
"$mergeObjects": [
"$$this",
{
"shared": {
"$cond": {
"if": { "$in": [ "$$this._id", user_hobbies ] },
"then": true,
"else": "$$REMOVE"
}
}
}
]
}
}
}
}},
{ "$sort": { "numHobbies": -1 } }
]);
log(result);
mongoose.disconnect();
} catch(e) {
console.error(e);
} finally {
process.exit();
}
})()
Most of that is just "setup" to re-create the data set, but simply put we're just adding the users and their hobbies and keeping a "unique" identifier for each "unique hobby" by name. This is probably what you actually meant in the question, and it's the sort of model you should be following.
The interesting part is all in the .aggregate() statement, which is how we "query" then "count" the matching hobbies and enable the "server" to sort the results before returning to the client.
Given a current user ( and the last one in the list you included has the most interesting matches ), we then focus on this section of the code:
// Simulates getting the current user to compare against
let user = await User.findOne({
"first_name" : "Philip",
"last_name" : "roi"
});
// Just get the list of _id values from the current user for reference
let user_hobbies = user.fk_hobbies.map(h => h._id );
let result = await User.aggregate([
// Find all users not the current user with at least one of the hobbies
{ "$match": {
"_id": { "$ne": user._id },
"fk_hobbies._id": { "$in": user_hobbies }
}},
// Add the count of matches, "optionally" we are marking the matched
// hobbies in the array as well.
{ "$addFields": {
"numHobbies": {
"$size": {
"$setIntersection": [
"$fk_hobbies._id",
user_hobbies
]
}
},
"fk_hobbies": {
"$map": {
"input": "$fk_hobbies",
"in": {
"$mergeObjects": [
"$$this",
{
"shared": {
"$cond": {
"if": { "$in": [ "$$this._id", user_hobbies ] },
"then": true,
"else": "$$REMOVE"
}
}
}
]
}
}
}
}},
// Sort the results by the "most" hobbies, which is "descending" order
{ "$sort": { "numHobbies": -1 } }
]);
I've commented those steps for you but let's expand on that.
Firstly we presume you have the current user already returned from the database by whatever means you have already used. For the purposes of the rest of the operations, all you really need from that user is the _id of the "User" itself and of course the _id values from each of that user's chosen hobbies. We can do a quick .map() operation as shown here, keeping a copy for ease of reference so that it is not repeated through the remaining code.
Then we get to the actual aggregate statement. The first condition there is the $match, this works like a standard query expression with all the same operators. We want two things from these query conditions:
Get all users except the current user for consideration;
AND where those users contain at least one match on the same hobbies, by _id value.
So the condition for "everyone else" is essentially to supply the $ne "not equal to" operator as the argument to the _id value, comparing of course to the current user _id. The second condition, to get only those with the same hobbies, uses the $in operator against the _id field of the fk_hobbies array. In MongoDB query parlance we denote this as "fk_hobbies._id" in order to match against the "inner" _id property values.
The $in operator itself takes a "list" as its argument and compares each value in the list supplied to the property the condition is assigned to. MongoDB itself does not care whether fk_hobbies is an array or a single value, and will simply look for a match for anything in the provided list. Think of $in as a short way of writing $or, except you don't need to explicitly include the same property name on every condition.
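To make the comparison with $or concrete, the two filter shapes can be written side by side. This is a small sketch: inAsOr is a hypothetical helper, and the ids are plain strings standing in for ObjectId values.

```javascript
// Expands {field: {$in: values}} into the equivalent, more verbose $or form.
function inAsOr(field, values) {
  return { $or: values.map(value => ({ [field]: value })) };
}

// Plain strings used as stand-ins for ObjectId values.
const ids = ["5ac9f84c5c1d8adeb4a2da97", "5ac9f84d2c1d8adeb4a2da97"];

const withIn = { "fk_hobbies._id": { $in: ids } };
const withOr = inAsOr("fk_hobbies._id", ids);

console.log(JSON.stringify(withIn));
console.log(JSON.stringify(withOr));
```

Both filters select the same documents; the $in form simply avoids repeating the property name for every value.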
Now you have the correct documents selected and have discarded any users who do not share any of the same hobbies we can move on to the next stage. Note also that the whole $match considers it logical that you only want those "matching" users. If you actually wanted to see "all users" including those with "no matches", then you can simply omit the whole $match pipeline stage. Your code is discarding anything that was not counted, so this code simply doesn't bother to count anything which "must" have a 0 count.
The $addFields pipeline stage is a quick way to "add new fields" to the document returned in results. The main output you want here is the "numHobbies" in addition to the other user details, so this pipeline stage operator is the optimal way to do this, but if your MongoDB server is a bit older then you can simply specify "all" fields you want to include in addition to any new ones using $project instead.
In order to "count" the number of hobbies in common we essentially use two aggregation operators, which are $setIntersection and $size. Both of these should be available in any MongoDB version you really should be using in production.
In respective order the $setIntersection operator "compares sets" which is in this case the list of _id values within fk_hobbies, both from the current selected user we stored earlier and from the present document being considered in the expression. The result from this operator is the list of values which are the "same" between both lists.
Naturally the $size operator looks at the returned list ( or set ) from $setIntersection and returns the number of entries in that list. This of course is the "matched count".
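The same counting can be imitated in plain JavaScript to see what the two operators produce together. This is an in-memory sketch only: sharedCount is a hypothetical helper, and hobby names stand in for the _id values.

```javascript
// Imitates {$size: {$setIntersection: [a, b]}} for arrays of id strings.
function sharedCount(a, b) {
  const inB = new Set(b);
  // A Set removes duplicates, as $setIntersection does.
  return new Set(a.filter(id => inB.has(id))).size;
}

// Hobby names used as stand-ins for the hobby _id values.
const philip = ["Musique", "Chiller", "Papoter", "Manger", "Film"];
const michael = ["fashion", "Voyage", "Papoter", "Manger", "Film"];
const pierro = ["Art", "Sport", "Fete", "Série", "Jeux vidéo"];

console.log(sharedCount(michael, philip)); // 3
console.log(sharedCount(pierro, philip));  // 0
```

A user like Pierro, with a count of 0, never reaches this step in the pipeline because the earlier $match has already discarded him.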
The next part involves projecting a "re-written" form of the fk_hobbies array. This is totally optional and by my own design for demonstration purposes. "If" you wanted to do what I am doing here as well, then what this bit of code does is add an additional property to the objects of the fk_hobbies array to indicate where that particular hobby was one of those which matched the list.
I'm saying this is "optional" because I'm actually demonstrating two features available only from MongoDB 3.6 onwards. These involve the usage of $mergeObjects on the inner array elements and the usage of Conditionally Excluding Fields.
Stepping through that, since fk_hobbies is an array we need to use the $map operator in order to "reshape" the objects inside it. This operator allows us to process each array member and return a new value based on the transformations we include as its argument. Its usage is much the same as .map() for JavaScript or any other language which implements a similar operation.
Therefore for each object in the array ( $$this ) we apply the $mergeObjects operator which will "merge" the result of its arguments. These are provided as the $$this for the current object as it already is, and the second argument in the expression which is doing something new and interesting.
Here we use the $cond operator, which is a "ternary" operator ( or if..then..else expression ) which considers a condition if and then returns either the then argument where that expression was true, or the else expression where it was false. The expression here is another form of $in used as an aggregation expression. In this form the first argument is a singular value $$this._id which will be compared to a list expression in the second argument. That second argument is of course the list of the current user's hobby ids we kept earlier, and are using again for comparison.
That usage of $in alone would return either true or false depending on whether it was a match. But the extra demonstrated action here is that within the $cond expression, our else condition for false returns the new and special $$REMOVE value. What this means is that with the "shared" property we are adding to each object in the array, rather than assigning it a value of false where there was no match, we actually don't include that property in the output document at all.
That "optional" part is really just there as a "nice touch" to indicate which "hobbies" were matched in the conditions, rather than simply returning the count. If you like it then use it, and if you don't have MongoDB 3.6 with those features you can simply do that same alteration in the returned documents from the aggregation output anyway:
let result = await User.aggregate([
{ "$match": {
"_id": { "$ne": user._id },
"fk_hobbies._id": { "$in": user_hobbies }
}},
{ "$addFields": {
"numHobbies": {
"$size": {
"$setIntersection": [
"$fk_hobbies._id",
user_hobbies
]
}
}
}},
{ "$sort": { "numHobbies": -1 } }
]);
// map each result after return
result = result.map(r =>
({
...r,
fk_hobbies: r.fk_hobbies.map(h =>
({
...h,
...(( user_hobbies.map(i => i.toString() ).indexOf( h._id.toString() ) != -1 )
? { "shared": true } : {} )
})
)
})
)
Either way, the main thing you wanted out of any $addFields or $project statement was the actual "numHobbies" value indicating the count. And the main reason we did that on the server was so that we can also $sort on the server, which would in turn allow you to add things like $limit and $skip to larger result sets for purposes of paging where it simply would not be practical to get all the results from the collection, even if they were filtered in the initial match or regular query.
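As a rough illustration of the paging idea, the pipeline can be built by a function that appends $skip and $limit. This is a sketch under assumptions: similarUsersPage and its page and pageSize parameters are hypothetical, and the _id tie-breaker in $sort is an addition that is not part of the original pipeline.

```javascript
// Builds the matching pipeline with $skip and $limit appended for paging.
// page is 1-based; pageSize is the number of users per page.
function similarUsersPage(userId, hobbyIds, page, pageSize) {
  return [
    { $match: { _id: { $ne: userId }, "fk_hobbies._id": { $in: hobbyIds } } },
    { $addFields: {
      numHobbies: { $size: { $setIntersection: ["$fk_hobbies._id", hobbyIds] } }
    } },
    // _id as a tie-breaker keeps the order stable between pages (an added assumption).
    { $sort: { numHobbies: -1, _id: 1 } },
    { $skip: (page - 1) * pageSize },
    { $limit: pageSize }
  ];
}

// Placeholder ids; real code would pass user._id and user_hobbies.
const pipeline = similarUsersPage("currentUserId", ["h1", "h2"], 3, 20);
console.log(JSON.stringify(pipeline.slice(-2)));

// Assumed usage (not run here): await User.aggregate(similarUsersPage(user._id, user_hobbies, 1, 20))
```

Only the requested page of users then travels from the server to the client.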
Anyhow, from the small sample of documents in the question as also generated in the sample listing, we get a result like this:
[
{
"_id": "5ad6bbe63365bc3428feed8a",
"first_name": "jean",
"last_name": "mark",
"sex": "H",
"fk_hobbies": [
{
"_id": "5ad6bbe63365bc3428feed7d",
"name": "Musique",
"__v": 0,
"shared": true
},
{
"_id": "5ad6bbe63365bc3428feed7e",
"name": "Chiller",
"__v": 0,
"shared": true
},
{
"_id": "5ad6bbe63365bc3428feed7f",
"name": "Papoter",
"__v": 0,
"shared": true
},
{
"_id": "5ad6bbe63365bc3428feed80",
"name": "Manger",
"__v": 0,
"shared": true
},
{
"_id": "5ad6bbe63365bc3428feed81",
"name": "Film",
"__v": 0,
"shared": true
}
],
"__v": 0,
"numHobbies": 5
},
{
"_id": "5ad6bbe63365bc3428feed90",
"first_name": "michael",
"last_name": "ferrari",
"sex": "H",
"fk_hobbies": [
{
"_id": "5ad6bbe63365bc3428feed82",
"name": "fashion",
"__v": 0
},
{
"_id": "5ad6bbe63365bc3428feed83",
"name": "Voyage",
"__v": 0
},
{
"_id": "5ad6bbe63365bc3428feed7f",
"name": "Papoter",
"__v": 0,
"shared": true
},
{
"_id": "5ad6bbe63365bc3428feed80",
"name": "Manger",
"__v": 0,
"shared": true
},
{
"_id": "5ad6bbe63365bc3428feed81",
"name": "Film",
"__v": 0,
"shared": true
}
],
"__v": 0,
"numHobbies": 3
}
]
So there are two users that were returned and we counted the matching hobbies as 5 and 3 respectively and returned the one with the most matched first. You can also see the addition of the "shared" property on each of the matched hobbies to indicate which of the hobbies in each of the returned users lists were also shared with the original user they were compared with.
NOTE: You were probably just "trying things" but your usage of async.each() in your question was not really necessary since none of the inner code is actually "async" itself. Even in the listing here, the only thing you actually need to "await" as an async call after you have the current user to compare is the .aggregate() response itself.
So if at any part of this you were presuming you would be "awaiting requests within a loop", then you were mistaken. Simply ask the database for the results and await their return.
One request to the database is all that is required.
N.B It's also 2018, so you really should start to understand Promises and usage of async/await with them. The code is much cleaner that way and surely any newly developed application should be running in an environment with this support. So "callback helper" libraries like "node async", are a little "old hat" and outmoded in a modern context.
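For comparison, the in-memory counting from the question needs no async helper at all. This is a minimal sketch, assuming both the current user and the other users are already loaded as plain objects; rankBySharedHobbies is a hypothetical name and the single-letter ids are placeholders.

```javascript
// Counts shared hobbies for every other user and sorts descending.
// No async helper is needed because nothing here waits on I/O.
function rankBySharedHobbies(current, others) {
  const mine = new Set(current.fk_hobbies.map(h => String(h._id)));
  return others
    .map(user => ({
      nbHob: user.fk_hobbies.filter(h => mine.has(String(h._id))).length,
      user
    }))
    .filter(entry => entry.nbHob > 0)
    .sort((a, b) => b.nbHob - a.nbHob);
}

const current = { first_name: "jean", fk_hobbies: [{ _id: "a" }, { _id: "b" }, { _id: "c" }] };
const others = [
  { first_name: "michael", fk_hobbies: [{ _id: "b" }, { _id: "x" }] },
  { first_name: "Philip", fk_hobbies: [{ _id: "a" }, { _id: "b" }, { _id: "c" }] },
  { first_name: "Pierro", fk_hobbies: [{ _id: "y" }] }
];

const ranked = rankBySharedHobbies(current, others);
console.log(ranked.map(r => `${r.user.first_name}: ${r.nbHob}`));
```

This still loads every user into the application, which is the cost the .aggregate() approach avoids.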