I have a deeply nested document in MongoDB and I would like to fetch individual sub-objects.
Example:
{
"schoolName": "Cool School",
"principal": "Joe Banks",
"rooms": [
{
"number": 100,
"teacher": "Alvin Melvin",
"students": [
{
"name": "Bort",
"currentGrade": "A"
},
// ... many more students
]
},
// ... many more rooms
]
}
MongoDB recently added support for retrieving sub-objects one level deep using an $elemMatch projection:
var projection = { _id: 0, rooms: { $elemMatch: { number: 100 } } };
db.schools.find({"schoolName": "Cool School"}, projection);
// returns { "rooms": [ /* array containing only the matching room */ ] }
But when I try to fetch a student (2 levels deep) in this same fashion, I get an error:
var projection = { _id: 0, "rooms.students": { $elemMatch: { name: "Bort" } } };
db.schools.find({"schoolName": "Cool School"}, projection);
// "$err": "Cannot use $elemMatch projection on a nested field (currently unsupported).", "code": 16344
Is there a way to retrieve arbitrarily deep sub-objects in a MongoDB document?
I am using MongoDB 2.2.1.
I recently asked a similar question and can provide a suitably general answer (see Using MongoDB's positional operator $ in a deeply nested document query)
This solution is only supported in MongoDB 2.6+, but from that version on you can use the aggregation framework's $redact stage.
Here is an example query which should return just your student Bort.
db.schools.aggregate({
$match: { schoolName: 'Cool School' }
}, {
$project: {
_id: 0,
'schoolName': 1,
'rooms.number': 1,
'rooms.students': 1
}
}, {
$redact: {
$cond: {
"if": {
$or: [{
$gte: ['$schoolName', '']
}, {
$eq: ['$number', 100]
}]
},
"then": "$$DESCEND",
"else": {
$cond: {
"if": {
$eq: ['$name', 'Bort']
},
"then": "$$KEEP",
"else": "$$PRUNE"
}
}
}
}
});
$redact can be used to make sub-queries by matching or pruning sub-documents recursively in the matched documents.
You can read about $redact here to understand more about what's going on but the design pattern I've identified has the following requirements:
The redact condition is applied at each sub-document level, so you need a unique field at each level. For example, you can't have number as a key on both rooms and students.
It only works on data fields, not array indices, so if you want to know the position of a returned nested document (for example, to update it) you need to store that position in your documents and maintain it yourself.
Each part of the $or statement in $redact should match the documents you want at a specific level
Therefore each part of the $or statement needs to include a match to the unique field of the document at that level. For example, $eq: ['$number', 100] matches the room with number 100
If you aren't specifying a query at a level, you still need to include the unique field. For example, if it is a string you can match it with $gte: ['$uniqueField', '']
The last document level goes in the second if expression so that all of that document is kept.
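To make the descend/keep/prune behaviour described above more concrete, here is a rough in-memory imitation in plain JavaScript. It is only a sketch: redact and decide are hypothetical helpers, not MongoDB APIs, the sample data is made up apart from Bort, and the real stage has more rules than this.

```javascript
// Simplified in-memory imitation of $redact (hypothetical helper, not a MongoDB API).
// decide(doc) returns 'DESCEND', 'KEEP' or 'PRUNE' for each (sub-)document.
function redact(doc, decide) {
  const action = decide(doc);
  if (action === 'PRUNE') return undefined; // drop this level entirely
  if (action === 'KEEP') return doc;        // keep this level and everything below it
  // DESCEND: keep scalar fields, then re-evaluate embedded documents one level down
  const out = {};
  for (const [key, value] of Object.entries(doc)) {
    if (Array.isArray(value)) {
      out[key] = value
        .map(v => (v !== null && typeof v === 'object' ? redact(v, decide) : v))
        .filter(v => v !== undefined);
    } else if (value !== null && typeof value === 'object') {
      const sub = redact(value, decide);
      if (sub !== undefined) out[key] = sub;
    } else {
      out[key] = value;
    }
  }
  return out;
}

// Mirrors the $cond from the pipeline: descend through the school and room 100,
// keep the student named Bort, prune everything else.
function decide(doc) {
  if (typeof doc.schoolName === 'string' || doc.number === 100) return 'DESCEND';
  return doc.name === 'Bort' ? 'KEEP' : 'PRUNE';
}

// Made-up sample, already reduced to the fields the $project stage keeps.
const school = {
  schoolName: 'Cool School',
  rooms: [
    { number: 100, students: [{ name: 'Bort', currentGrade: 'A' }, { name: 'Lisa', currentGrade: 'B' }] },
    { number: 101, students: [{ name: 'Milhouse', currentGrade: 'C' }] }
  ]
};

const redacted = redact(school, decide);
console.log(JSON.stringify(redacted, null, 2));
```

Room 101 is pruned because it matches neither branch of the $or, which is why each level needs its own distinguishing field.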
I don't have MongoDB 2.2 handy at the moment, so I can't test this, but have you tried the following?
var projection = { _id: 0, rooms: { $elemMatch: { "students.name": "Bort" } } };
db.schools.find({"schoolName": "Cool School"}, projection);
I am stuck on a problem: I have a field that is sometimes a string and sometimes an array. How can I handle that in an $addFields stage?
Here is my Mongo query:
db.ledger_scheme_logs.aggregate([
{
$match:{
"type":{ $in: ["add","edit"]},
}
},
{
"$addFields": {
"trail_beginning": {
$substr: [ "$metadata.schemes._trail", 0, 36 ]
}
}
},
{
$group: {
"_id": {
"trail_beginning":"$trail_beginning"
},
"count": { $sum: 1 },
"items": { $push: "$$ROOT" },
}
},
{
"$sort": {
count: -1
}
}
])
In this query, schemes in "$metadata.schemes._trail" is an array of objects in some documents, and because of that I get the Mongo error "can't convert from BSON type array to String". How can I solve this kind of problem? Any help with an example would be appreciated.
Thanks in advance!
The bigger and trickier question here is about what behavior you would like the system to have rather than how to actually make the database do it. There's a closely related topic around (consistent) schema design that naturally follows.
To directly answer your question, you can use the $cond operator to conditionally calculate the new trail_beginning field based on the data type of the source document currently being processed. An example would be something like:
{
"$addFields": {
"trail_beginning": {
"$cond": {
"if": {
$eq: [
{
$type: "$metadata.schemes"
},
"array"
]
},
"then": {
"$map": {
"input": "$metadata.schemes._trail",
"in": {
$substr: [
"$$this",
0,
3
]
}
}
},
"else": {
$substr: [
"$metadata.schemes._trail",
0,
3
]
}
}
}
}
}
Using two sample documents with different schemas yields the following as demonstrated in this playground example:
[
{
"_id": 1,
"metadata": {
"schemes": {
"_trail": "ABCDEFG"
}
},
"trail_beginning": "ABC"
},
{
"_id": 2,
"metadata": {
"schemes": [
{
"_trail": "HIJKLMN"
},
{
"_trail": "OPQRSTU"
}
]
},
"trail_beginning": [
"HIJ",
"OPQ"
]
}
]
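For readers who find the nested $cond hard to follow, the same branching can be sketched in plain JavaScript. This is only an illustration of the logic, not something that runs in the database; trailBeginning is a hypothetical helper name.

```javascript
// Plain-JavaScript analogue of the $cond / $type / $map expression (illustrative only).
function trailBeginning(doc, length) {
  const schemes = doc.metadata.schemes;
  if (Array.isArray(schemes)) {
    // "then" branch: one substring per array element, like $map
    return schemes.map(s => s._trail.substring(0, length));
  }
  // "else" branch: a single embedded document
  return schemes._trail.substring(0, length);
}

const sampleDocs = [
  { _id: 1, metadata: { schemes: { _trail: 'ABCDEFG' } } },
  { _id: 2, metadata: { schemes: [{ _trail: 'HIJKLMN' }, { _trail: 'OPQRSTU' }] } }
];

const beginnings = sampleDocs.map(d => trailBeginning(d, 3));
console.log(beginnings);
```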
Taking a glance at the rest of your pipeline though, I suspect (but can't say for sure) that this isn't actually what you want to do. This is because the subsequent $group will use the entire array of values to do the grouping, but I'm (again) guessing that you want to group based on individual values.
If my assumptions are correct, then logically what you really want to do is $unwind the array first before you do the substring transformation. This will correct the subsequent grouping logic and, as a side effect, it will also eliminate your problem of having different possible input types during the $addFields stage. Your full pipeline would look something like this:
db.ledger_scheme_logs.aggregate([
{
$match:{
"type":{ $in: ["add","edit"]},
}
},
{
$unwind: "$metadata.schemes"
},
{
"$addFields": {
"trail_beginning": {
$substr: [ "$metadata.schemes._trail", 0, 36 ]
}
}
},
{
$group: {
"_id": {
"trail_beginning":"$trail_beginning"
},
"count": { $sum: 1 },
"items": { $push: "$$ROOT" },
}
},
{
"$sort": {
count: -1
}
}
])
Playground demonstration (using a shorter substring) here.
This works because $unwind will treat non-array field paths as a single element array. However, having a discrepancy in the schema may frequently result in you having to put in special conditional logic to account for the difference in various places in the application. Consider simplifying development by making the schema consistent (converting the non-arrays to arrays with single values).
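The effect of unwinding before grouping can be imitated in memory as follows. This is a sketch under assumptions: unwindSchemes and countByTrailBeginning are hypothetical helpers, the third document is made up, and a prefix length of 3 is used instead of 36 to keep the sample short.

```javascript
// Imitates $unwind on metadata.schemes: a non-array value is treated as a
// single-element array, while a missing/null value or an empty array yields no documents.
function unwindSchemes(doc) {
  const schemes = doc.metadata.schemes;
  if (schemes === undefined || schemes === null) return [];
  const list = Array.isArray(schemes) ? schemes : [schemes];
  return list.map(s => ({ ...doc, metadata: { ...doc.metadata, schemes: s } }));
}

// Imitates $addFields + $group + $sort: count unwound documents per trail prefix.
function countByTrailBeginning(docs, length) {
  const counts = new Map();
  for (const doc of docs.flatMap(unwindSchemes)) {
    const key = doc.metadata.schemes._trail.substring(0, length);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts]
    .map(([trail_beginning, count]) => ({ trail_beginning, count }))
    .sort((a, b) => b.count - a.count);
}

const logs = [
  { _id: 1, metadata: { schemes: { _trail: 'ABCDEFG' } } },
  { _id: 2, metadata: { schemes: [{ _trail: 'HIJKLMN' }, { _trail: 'OPQRSTU' }] } },
  { _id: 3, metadata: { schemes: [{ _trail: 'ABCXYZ' }] } } // made-up extra document
];

const grouped = countByTrailBeginning(logs, 3);
console.log(grouped);
```

Documents 1 and 3 land in the same group even though one stores schemes as an object and the other as an array, which is the behaviour the grouping presumably needs.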
I have a model Book with a field "tags" which is of type array of String / GraphQLString.
Currently, I'm able to query the tags for each book.
{
books {
id
tags
}
}
and I get the result:
{
"data": {
"books": [
{
"id": "631664448cb20310bc25c89d",
"tags": [
"database",
"middle-layer"
]
},
{
"id": "6316945f8995f05ac71d3b22",
"tags": [
"relational",
"database"
]
},
]
}
}
I want to write a RootQuery where I can fetch all unique tags across all books. This is how far I am (which is not too much):
tags: {
type: new GraphQLList(GraphQLString),
resolve(parent, args) {
Book.find({}) // CAN'T FIGURE OUT WHAT TO DO HERE
return [];
}
}
Basically, I'm trying to fetch all books and then potentially merge all tags fields on each book.
I expect that if I query:
{
tags
}
I would get
["relational", "database", "middle-layer"]
I am just starting with Mongoose, MongoDB, and GraphQL, so I am not 100% sure what keywords to look for or even what the title of this question should be.
Appreciate the help.
You want to $unwind the arrays so they're flat; at that point you can just use $group to get the unique values, like so:
db.collection.aggregate([
{
"$unwind": "$data.books"
},
{
"$unwind": "$data.books.tags"
},
{
$group: {
_id: "$data.books.tags"
}
}
])
Mongo Playground
MongoDb + JavaScript Solution
const tags = await Book.aggregate([
{
$project: {
tags: 1,
_id: 0,
}
},
])
This returns an array of objects that contain only the tags value. $project is an aggregation pipeline stage that selects which keys to include or exclude, denoted by 1 or 0. _id is included by default, so it needs to be explicitly excluded.
Then take the tags array that looks like this:
[
{
"tags": [
"database",
"middle-layer"
]
},
{
"tags": [
"relational",
"database"
]
}
]
And reduce it to one unified array, then turn it into a JavaScript Set, which excludes duplicates by default. I convert it back to an Array at the end, in case you need to perform array methods on it or write it back to the DB.
let allTags = tags.reduce((total, curr) => [...total, ...curr.tags], [])
allTags = Array.from(new Set(allTags))
const tags = [
{
"tags": [
"database",
"middle-layer"
]
},
{
"tags": [
"relational",
"database"
]
}
]
let allTags = tags.reduce((total, curr) => [...total, ...curr.tags], [])
allTags = Array.from(new Set(allTags))
console.log(allTags)
Pure MongoDB Solution
Book.aggregate([
{
$unwind: "$tags"
},
{
$group: {
_id: "_id",
tags: {
"$addToSet": "$tags"
}
}
},
{
$project: {
tags: 1,
_id: 0,
}
}
])
Steps in Aggregation Pipeline
$unwind
Creates a new Mongo Document for each tag in tags
$group
Merges the individual tags into a set called tags
Sets are required to have unique values and will exclude duplicates by default
_id is a required field
_id will be excluded from the final aggregation so it doesn't matter what it is
$project
Chooses which fields to pull from the previous step in the pipeline
Using it here to exclude _id from the results
Output
[
{
"tags": [
"database",
"middle-layer",
"relational"
]
}
]
Mongo Playground Demo
While this solution gets the result with purely Mongo queries, the resulting output is nested and still requires traversal to get to desired fields. I do not know of a way to replace the root with a list of string values in an aggregation pipeline. So at the end of the day, JavaScript is still required.
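Plugging the pure MongoDB pipeline into the resolver from the question could look roughly like this. It is a sketch under assumptions: Book is taken to be the Mongoose model from the question, makeTagsResolver and extractTags are hypothetical helper names, and _id: null is used as the constant group key in place of the "_id" string.

```javascript
// Pipeline from the "Pure MongoDB Solution", with _id: null as the constant group key.
const uniqueTagsPipeline = [
  { $unwind: '$tags' },
  { $group: { _id: null, tags: { $addToSet: '$tags' } } },
  { $project: { _id: 0, tags: 1 } }
];

// The aggregation yields at most one document; an empty collection yields none.
function extractTags(aggregateResult) {
  return aggregateResult.length > 0 ? aggregateResult[0].tags : [];
}

// Hypothetical factory: Book is assumed to be the Mongoose model from the question.
// The returned function has the shape GraphQL expects for a field's resolve().
function makeTagsResolver(Book) {
  return async function resolve(parent, args) {
    const result = await Book.aggregate(uniqueTagsPipeline);
    return extractTags(result);
  };
}
```

The only JavaScript left is unwrapping the single result document. Note that $addToSet does not guarantee any particular order of the tags.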
I am working on versioning. We have documents identified by UUIDs and jobUuids; the jobUuid marks the documents associated with the currently working user. I have some aggregate queries on these collections which I need to update based on the job UUIDs.
The results fetched by the aggregate query should be such that:
if the current user's jobUuid document does not exist, then the master document with jobUuid: "default" is returned (the document without any jobUuid),
if a document with the job UUID exists, then only that document is returned.
I have a $match used to get these documents based on certain conditions; from those documents I need to filter out documents based on the above conditions. An example is shown below.
The data looks like this:
[
{
"uuid": "5cdb5a10-4f9b-4886-98c1-31d9889dd943",
"name": "adam",
"jobUuid": "default",
},
{
"uuid": "5cdb5a10-4f9b-4886-98c1-31d9889dd943",
"jobUuid": "d275781f-ed7f-4ce4-8f7e-a82e0e9c8f12",
"name": "adam"
},
{
"uuid": "b745baff-312b-4d53-9438-ae28358539dc",
"name": "eve",
"jobUuid": "default",
},
{
"uuid": "b745baff-312b-4d53-9438-ae28358539dc",
"jobUuid": "d275781f-ed7f-4ce4-8f7e-a82e0e9c8f12",
"name": "eve"
},
{
"uuid": "26cba689-7eb6-4a9e-a04e-24ede0309e50",
"name": "john",
"jobUuid": "default",
}
]
Results for "jobUuid": "d275781f-ed7f-4ce4-8f7e-a82e0e9c8f12" should be:
[
{
"uuid": "5cdb5a10-4f9b-4886-98c1-31d9889dd943",
"jobUuid": "d275781f-ed7f-4ce4-8f7e-a82e0e9c8f12",
"name": "adam"
},
{
"uuid": "b745baff-312b-4d53-9438-ae28358539dc",
"jobUuid": "d275781f-ed7f-4ce4-8f7e-a82e0e9c8f12",
"name": "eve"
},
{
"uuid": "26cba689-7eb6-4a9e-a04e-24ede0309e50",
"name": "john",
"jobUuid": "default",
}
]
Based on the conditions mentioned above, is it possible to filter the document within the aggregate query to extract the document of a specific job uuid?
Edit 1: I got the following solution, which is working fine, I want a better solution, eliminating all those nested stages.
Edit 2: Updated the data with actual UUIDs and I just included only the name as another field, we do have n number of fields which are not relevant to include here but needed at the end (mentioning this for those who want to use the projection over all the fields).
Update based on comment:
but the UUIDs are alphanumeric strings, as shown above, does it have
an effect on these sorting, and since we are not using conditions to
get the results, I am worried it will cause issues.
You could add an extra field so that the sort order matches the order of the values in the $in expression. Make sure "default" is the last value in the list.
[
{"$match":{"jobUuid":{"$in":["d275781f-ed7f-4ce4-8f7e-a82e0e9c8f12","default"]}}},
{"$addFields":{ "order":{"$indexOfArray":[["d275781f-ed7f-4ce4-8f7e-a82e0e9c8f12","default"], "$jobUuid"]}}},
{"$sort":{"uuid":1, "order":1}},
{
"$group": {
"_id": "$uuid",
"doc":{"$first":"$$ROOT"}
}
},
{"$project":{"doc.order":0}},
{"$replaceRoot":{"newRoot":"$doc"}}
]
example here - https://mongoplayground.net/p/wXiE9i18qxf
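The idea behind the $indexOfArray ordering (lower index wins, "default" comes last) can be checked in plain JavaScript. This is just an in-memory sketch: pickPerUuid is a hypothetical helper, and the short IDs stand in for the real UUIDs.

```javascript
// For each uuid, keep the document whose jobUuid appears earliest in the
// preference list, mimicking $indexOfArray + $sort + $group with $first.
function pickPerUuid(docs, jobUuid) {
  const preference = [jobUuid, 'default']; // same order as the $in list
  const best = new Map();
  for (const doc of docs) {
    const order = preference.indexOf(doc.jobUuid);
    if (order === -1) continue; // would have been filtered out by $match
    const current = best.get(doc.uuid);
    if (current === undefined || order < current.order) {
      best.set(doc.uuid, { order, doc });
    }
  }
  return [...best.values()].map(entry => entry.doc);
}

// Shortened, made-up identifiers in place of the real UUIDs.
const people = [
  { uuid: 'u-adam', name: 'adam', jobUuid: 'default' },
  { uuid: 'u-adam', name: 'adam', jobUuid: 'job-1' },
  { uuid: 'u-eve', name: 'eve', jobUuid: 'default' },
  { uuid: 'u-eve', name: 'eve', jobUuid: 'job-1' },
  { uuid: 'u-john', name: 'john', jobUuid: 'default' }
];

const picked = pickPerUuid(people, 'job-1');
console.log(picked);
```

Because the comparison uses the position in the preference list rather than the string value of jobUuid, alphanumeric UUIDs do not affect which document wins.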
Original
You could use the query below. It picks the non-default document if one exists for the uuid, or else picks the default as the only document.
[
{"$match":{"jobUuid":{"$in":[1,"default"]}}},
{"$sort":{"uuid":1, "jobUuid":1}},
{
"$group": {
"_id": "$uuid",
"doc":{"$first":"$$ROOT"}
}
},
{"$replaceRoot":{"newRoot":"$doc"}}
]
example here - https://mongoplayground.net/p/KrL-1s8WCpw
Here is what I would do:
match stage with $in rather than an $or (for readability)
group stage with _id on $uuid, just as you did, but instead of pushing all the data into an array, be more selective. _id is already storing $uuid, so no reason to capture it again. name must always be the same for each $uuid, so take only the first instance. Based on the match, there are only two possibilities for jobUuid, but this will assume it will be either "default" or something else, and that there can be more than one occurrence of the non-"default" jobUuid. Using "$addToSet" instead of pushing to an array in case there are multiple occurrences of the same jobUuid for a user, also, before adding to the set, use a conditional to only add non-"default" jobUuids, using $$REMOVE to avoid inserting a null when the jobUuid is "default".
Finally, "$project" to clean things up. If element 0 of the jobUuids array does not exist (is null), there is no other possibility for this user than for the jobUuid to be "default", so use "$ifNull" to test and set "default" as appropriate. There could be more than 1 jobUuid here, depending if that is allowed in your db/application, up to you to decide how to handle that (take the highest, take the lowest, etc).
Tested at: https://mongoplayground.net/p/e76cVJf0F3o
[{
"$match": {
"jobUuid": {
"$in": [
"1",
"default"
]
}
}
},
{
"$group": {
"_id": "$uuid",
"name": {
"$first": "$name"
},
"jobUuids": {
"$addToSet": {
"$cond": {
"if": {
"$ne": [
"$jobUuid",
"default"
]
},
"then": "$jobUuid",
"else": "$$REMOVE"
}
}
}
}
},
{
"$project": {
"_id": 0,
"uuid": "$_id",
"name": 1,
"jobUuid": {
"$ifNull": [{
"$arrayElemAt": [
"$jobUuids",
0
]
},
"default"
]
}
}
}]
I was able to solve this problem with the following aggregate query.
First, the match stage extracts only the documents whose jobUuid is either the one provided by the user or "default".
Then the results are grouped by uuid using a group stage, which also counts the documents in each group.
The conditions in replaceRoot first check the size of each group:
If the group contains 2 or more documents, we keep the document whose jobUuid is not "default", i.e. the one matching the provided jobUuid.
If it contains 1 document or fewer, we check that it matches the default jobUuid and return it.
The Query is below:
[
{
$match: {
$or: [{ jobUuid:1 },{ jobUuid: 'default'}]
}
},
{
$group: {
_id: '$uuid',
count: {
$sum: 1
},
docs: {
$push: '$$ROOT'
}
}
},
{
$replaceRoot: {
newRoot: {
$cond: {
if: {
$gte: [
'$count',
2
]
},
then: {
$arrayElemAt: [
{
$filter: {
input: '$docs',
as: 'item',
cond: {
$ne: [
'$$item.jobUuid',
'default'
]
}
}
},
0
]
},
else: {
$arrayElemAt: [
{
$filter: {
input: '$docs',
as: 'item',
cond: {
$eq: [
'$$item.jobUuid',
'default'
]
}
}
},
0
]
}
}
}
}
}
]
I have a Meteor application with a Mongo collection that has a tags field.
[{name: "ABC", tags: {"#Movie", "#free", "!R"}},
{name: "DEF", tags: {"#Movie", "!PG"}},
{name: "GHI", tags: {"#Sports", "#free"}}]
On my UI, there are three groups of checkboxes that are populated on the fly, based on the first letter of the tag name.
filter group 1: [ ]Movie [ ] Sports
filter group 2: [ ]free
filter group 3: [ ]PG [ ]R
The filter logic is the following:
If filter group is empty then do not filter by that filter group
If any checkbox from a filter group is checked, then apply that filter
$and should be applied between filter groups (if Movie and R are checked, then only documents that have both the "#Movie" and "!R" tags should be selected)
I am struggling to build a mongo criteria parameters that follows the above logic. My code currently looks like spaghetti with lots of nested ifs (in pseudo code)
if (filter_group1 is empty) then if (filter_group2 is empty) then mongo_criteria= {_id: $in: $("input:checked", ".filtergroup1").map(function() {return this.value})}
What would be the right way of doing this?
Firstly, I'm sure you mean that "tags" is actually an array since otherwise the structure would be invalid:
{ "name": "ABC", "tags": ["#Movie", "#free", "!R"]},
{ "name": "DEF", "tags": ["#Movie", "!PG"]},
{ "name": "GHI", "tags": ["#Sports", "#free"]}
It's a novel idea to store "tags" data this way, but it does seem that your program logic to construct a query needs to be aware that there are at least "three" possible conditions that need to be considered in an $and combination.
In the simplest form, where you only allow one selection per filter group, you could get away with using the $all operator. Just in simple MongoDB shell notation for brevity:
db.collection.find({ "tags": { "$all": [ "#Movie", "!R" ] } })
The problem there is that if you wanted multiple selections on a group, say the rating for example, then this would fail to get a result:
db.collection.find({ "tags": { "$all": [ "#Movie", "!R", "!PG" ] } })
No item in fact contains both those rating values so this would not be valid. So you would rather do this:
db.collection.find({ "$and": [
{ "tags": { "$in": [ "#Movie" ] } },
{ "tags": { "$in": [ "!R", "!PG" ] } }
]})
That would correctly match all Movies with a rating tag of either "R" or "PG". Extending this for another group is basically pushing another array item to the $and expression:
db.collection.find({ "$and": [
{ "tags": { "$in": [ "#Movie" ] } },
{ "tags": { "$in": [ "!R", "!PG" ] } },
{ "tags": { "$in": [ "#free" ] } }
]})
This returns only the document that has a matching value for each of those "types" of filters: the "PG" movie is not free, and "Sports" is filtered out because it was not added to the selection.
The basics of constructing the query is working with an array of selection options for $in in each filter group. Of course then you only append to the $and array when there is a selection present in your filter group.
So start with a base $and like this:
var query = { "$and":[{}] };
And then add each of the checked options in a filter group to that group's own $in:
var inner = { "tags": { "$in": [] } };
inner.tags["$in"].push( item );
And then append to the base query:
query["$and"].push( inner );
Rinse and repeat for each item. And this is perfectly valid since the base query will just select everything unfiltered, and this is also valid without constructing additional logic:
db.collection.find({ "$and": [
{ },
{ "tags": { "$in": [ "#Movie" ] } },
{ "tags": { "$in": [ "!R", "!PG" ] } },
{ "tags": { "$in": [ "#free" ] } }
]})
So it really comes down to construction of the query as MongoDB understands it. This is really just simple JavaScript array manipulation in building the data structure, which is all MongoDB queries really are.
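Putting the steps above together, the query construction could be wrapped in one function. This is a minimal sketch: buildTagQuery is a hypothetical name, it assumes each filter group arrives as an array of the checked tag strings, and matches is only a tiny in-memory stand-in used to check the result against the sample documents.

```javascript
// Builds the $and query: one $in clause per filter group that has a selection.
// Empty groups are skipped, so they do not filter anything.
function buildTagQuery(filterGroups) {
  const query = { $and: [{}] }; // the empty condition matches everything
  for (const checked of filterGroups) {
    if (checked.length === 0) continue;
    query.$and.push({ tags: { $in: [...checked] } });
  }
  return query;
}

// Tiny in-memory matcher that only understands the shape produced above
// ($and of { field: { $in: [...] } } conditions on array fields).
function matches(doc, query) {
  return query.$and.every(cond =>
    Object.entries(cond).every(([field, spec]) =>
      doc[field].some(value => spec.$in.includes(value))));
}

const items = [
  { name: 'ABC', tags: ['#Movie', '#free', '!R'] },
  { name: 'DEF', tags: ['#Movie', '!PG'] },
  { name: 'GHI', tags: ['#Sports', '#free'] }
];

// Movie checked, nothing checked in the "free" group, both ratings checked.
const movieQuery = buildTagQuery([['#Movie'], [], ['!R', '!PG']]);
console.log(JSON.stringify(movieQuery));
console.log(items.filter(item => matches(item, movieQuery)).map(item => item.name));
```

In the Meteor client, each inner array would presumably come from the checked inputs of one filter group, and the resulting object can be passed straight to find().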
I have a BSON object like this saved in MongoDB:
{
"title": "Chemistry",
"_id": "532d665f89ae4ae703b29730",
"__v": 0,
"sections": [
{
"week": 1,
"_id": "532d665f89ae4ae703b29731",
"assignments": [
{
"created_date": "2014-03-22T10:30:55.621Z",
"_id": "532d665f89ae4ae703b29733",
"questions": []
},
{
"created_date": "2014-03-22T10:30:55.621Z",
"_id": "532d665f89ae4ae703b29732",
"questions": []
}
],
"materials": []
}
],
"instructor_ids": [],
"student_ids": []
}
What I wish to do is to retrieve the 'assignment' with _id 532d665f89ae4ae703b29733. It is an element in the assignments array, which, in turn, is an element in the sections array.
I am able to retrieve the entire document with the query
{ 'sections.assignments._id' : assignmentId }
However, what I want is just the assignment subdocument
{
"created_date": "2014-03-22T10:30:55.621Z",
"_id": "532d665f89ae4ae703b29733",
"questions": []
}
Is there a way to accomplish such a query? Should I resort to keeping assignments in a different collection?
As of Mongoose version 6.x, the accepted answer is no longer valid because $elemMatch can no longer be used on nested documents; use aggregate instead.
If you want to use an _id to find the document, you should convert the _id you receive as an argument to the native MongoDB ObjectId format; otherwise it is treated as a string and the $match will not find the document.
const native_id = new mongoose.Types.ObjectId(id);
const assignment = await <your_model_here>.aggregate([
{ $unwind: "$sections" },
{ $unwind: "$sections.assignments" },
{ $match: { "sections.assignments._id": native_id } },
{ $project: { _id: true, sections: "$sections.assignments" } }
]
)
console.log(assignment) // you have what you want
You can do an aggregate query like this:
db.collection.aggregate(
{$unwind: "$sections"},
{$unwind: "$sections.assignments"},
{$match: {"sections.assignments._id": "532d665f89ae4ae703b29733"}},
{$project: {_id: false, assignments: "$sections.assignments"}}
)
However, I recommend that you think about creating more collections, like you said.
More collections seem to me a better solution than this query.
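What the two $unwind stages followed by $match do can be pictured as flattening the nested arrays and searching the result. The sketch below does this in memory on a trimmed copy of the sample document; findAssignment is a hypothetical helper, and the IDs are compared as plain strings here, whereas in the database they would normally be ObjectIds.

```javascript
// Flattens sections -> assignments (like the two $unwind stages)
// and returns the first assignment with the given _id (like $match).
function findAssignment(course, assignmentId) {
  return course.sections
    .flatMap(section => section.assignments)
    .find(assignment => assignment._id === assignmentId); // undefined if absent
}

// Trimmed copy of the sample document from the question.
const course = {
  title: 'Chemistry',
  _id: '532d665f89ae4ae703b29730',
  sections: [
    {
      week: 1,
      _id: '532d665f89ae4ae703b29731',
      assignments: [
        { created_date: '2014-03-22T10:30:55.621Z', _id: '532d665f89ae4ae703b29733', questions: [] },
        { created_date: '2014-03-22T10:30:55.621Z', _id: '532d665f89ae4ae703b29732', questions: [] }
      ],
      materials: []
    }
  ]
};

const found = findAssignment(course, '532d665f89ae4ae703b29733');
console.log(found);
```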
To retrieve a subset of the elements of an array, you'll need to use the $elemMatch projection operator.
db.collection.find(
{"sections.assignments._id" : assignmentId},
{"sections.assignments":{$elemMatch:{"_id":assignmentId}}}
)
Note:
If multiple elements match the $elemMatch condition, the operator returns the first matching element in the array.
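The "first matching element" rule from the note can be illustrated in memory as follows. This is only a sketch: projectElemMatch is a hypothetical helper, the students are made up, and _id handling is ignored for brevity.

```javascript
// Imitates an $elemMatch projection on one array field:
// only the first matching element is returned; with no match the field is omitted.
function projectElemMatch(doc, field, predicate) {
  const first = doc[field].find(predicate);
  return first === undefined ? {} : { [field]: [first] };
}

// Made-up sample with two students sharing the same name.
const room = {
  number: 100,
  students: [
    { name: 'Bort', currentGrade: 'A' },
    { name: 'Lisa', currentGrade: 'B' },
    { name: 'Bort', currentGrade: 'C' }
  ]
};

const projected = projectElemMatch(room, 'students', s => s.name === 'Bort');
console.log(JSON.stringify(projected));
```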