Is it possible to do this transformation in MongoDB (with a $group, I think)?
Or should it be done with JavaScript on the client side?
[
{
id: 1,
lib: 'x'
},
{
id: 2,
lib: 'a'
},
{
id: 1,
lib: 'b'
},
{
id: 1,
lib: 'v'
}
]
to
[
{
id: 1,
lib_1: 'x',
lib_2: 'b',
lib_3: 'v'
},
{
id: 2,
lib_1: 'a'
}
]
Query
With $group you can easily collect the values into an array, and I think it is best to stop there.
If you want exactly the expected output, it is more complicated, because you need to convert the array to an object and add numbers to the keys. In general, MongoDB field names are not meant to carry data, because queries become harder; field names are for the schema.
Playmongo
aggregate(
[{"$group": {"_id": "$id", "libs": {"$push": "$lib"}}},
{"$set":
{"libs":
{"$map":
{"input": {"$range": [0, {"$size": "$libs"}]},
"in":
{"k": {"$concat": ["lib_", {"$toString": {"$add": ["$$this", 1]}}]},
"v": {"$arrayElemAt": ["$libs", "$$this"]}}}}}},
{"$set": {"libs": {"$arrayToObject": ["$libs"]}}},
{"$replaceRoot": {"newRoot": {"$mergeObjects": ["$libs", "$$ROOT"]}}},
{"$project": {"libs": 0}}])
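For the client-side option the question mentions, a minimal sketch in plain JavaScript could look like the following. The function name pivotLibs is made up for illustration, and it assumes the documents arrive in the order in which the lib_N keys should be numbered.

```javascript
// Group documents by id and number each lib value in arrival order.
function pivotLibs(docs) {
  const byId = new Map();
  for (const { id, lib } of docs) {
    if (!byId.has(id)) byId.set(id, { id });
    const row = byId.get(id);
    // The row holds "id" plus lib_1..lib_n, so the key count is the next index.
    row[`lib_${Object.keys(row).length}`] = lib;
  }
  return [...byId.values()];
}

const libInput = [
  { id: 1, lib: "x" },
  { id: 2, lib: "a" },
  { id: 1, lib: "b" },
  { id: 1, lib: "v" },
];
console.log(pivotLibs(libInput));
```

This keeps the database query simple (a plain find or the single $group stage) and moves the key numbering into application code.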
My MongooseSchema (simplified):
_id: (ObjectId)
storage: [
{
location: String
storedFood:[
{
"name": String
"code": String
"weight": Number
}
]
}
]
I want to decrement the weight, but not below 0. There is a Stack Overflow answer that does this (the second answer, from #rpatel). Great! The problem is that it uses an update pipeline WITHOUT nested documents. I didn't find a source where I could learn about MongoDB pipelines and nested objects (if you have any, please let me know; I really want to learn complex pipelines).
Could someone adapt the following code to decrement weight where location equals, for example, "Dubai" and code equals, for example, "38102371982"?
Code from #rpatel:
Mongo-Playground
Example-Document:
{
"key": 1,
"value": 30
}
db.collection.update({},
[
{
$set: {
"value": {
$max: [
0,
{
$subtract: [
"$value",
20
]
}
]
}
}
}
])
A ready playground.
One option is to use a double $map with $cond to reach the item inside the doubly nested array:
db.collection.update(
{storage: {$elemMatch: {location: wantedLoc}}},
[{$set: {
storage: {
$map: {
input: "$storage",
as: "st",
in: {$cond: [
{$eq: ["$$st.location", wantedLoc]},
{location: "$$st.location",
storedFood: {$map: {
input: "$$st.storedFood",
in: {$cond: [
{$eq: ["$$this.code", wantedCode]},
{$mergeObjects: [
"$$this",
{weight: {$max: [0, {$subtract: ["$$this.weight", reduceBy]}]}}
]},
"$$this"
]}
}
}
},
"$$st"
]}
}
}
}}]
)
See how it works on the playground example
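To check what the pipeline above is expected to produce, its logic can be mirrored in plain JavaScript. This is only an in-memory sketch: the function decrementWeight and the sample document are hypothetical, and the parameters stand in for the wantedLoc, wantedCode and reduceBy variables that the update leaves undefined.

```javascript
// In-memory mirror of the update pipeline; returns a new document, input is not mutated.
function decrementWeight(doc, wantedLoc, wantedCode, reduceBy) {
  return {
    ...doc,
    storage: doc.storage.map((st) =>
      st.location !== wantedLoc
        ? st // outer $cond "else" branch: other locations stay untouched
        : {
            location: st.location,
            storedFood: st.storedFood.map((food) =>
              food.code !== wantedCode
                ? food // inner $cond "else" branch
                : // $mergeObjects with a $max-clamped weight
                  { ...food, weight: Math.max(0, food.weight - reduceBy) }
            ),
          }
    ),
  };
}

// Hypothetical sample document following the simplified schema.
const storageSample = {
  storage: [
    {
      location: "Dubai",
      storedFood: [
        { name: "rice", code: "38102371982", weight: 15 },
        { name: "flour", code: "111", weight: 40 },
      ],
    },
    { location: "Oslo", storedFood: [{ name: "rice", code: "38102371982", weight: 15 }] },
  ],
};

const storageUpdated = decrementWeight(storageSample, "Dubai", "38102371982", 20);
console.log(storageUpdated.storage[0].storedFood[0].weight); // 0, clamped instead of -5
```

Like the pipeline, the matching storage element is rebuilt from location and storedFood only, so any other fields on that element would have to be carried over explicitly.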
I am working on a MERN project. I have created a collection in MongoDB that contains different types of documents. Is it an accepted practice to have documents with different structures in a single collection? Secondly, I need to fetch a single document from the collection using only its key name. My documents are:
[{
"_id": {
"$oid": "6333f72822dc0acc4bea17bd"
},
"designation": [
{
"name": "Chairman",
"level": 17
},
{
"name": "Director",
"level": 13
},
{
"name": "Secretary ",
"level": 13
},
{
"name": "Account Officer",
"level": 9
},
{
"name": "Data Entry Operator-GR B",
"level": 5
}
]
},
{
"_id": {
"$oid": "6334313b22dc0acc4bea17c2"
},
"storeRole": ["manager", "approver", "accepter", "firstsignatory"]
},
{
"_id": {
"$oid": "63369d2083a7cc2e818990dd"
},
"designationSuffix": ["I","II", "III"]
}]
How do I get any of the three documents if I only know the key name, i.e. designation, storeRole, or designationSuffix? I don't want to use the ID value.
Welcome to SO.
First, yes it is an accepted practice and indeed, a powerful feature of MongoDB to have different shapes of data in a single collection.
There are two important things to remember when querying for data:
Matching on fields that don't even exist in a document is OK; the document will simply be skipped. This permits you, for example, to query for storeRole and ignore the other documents with designation, etc. -- unless of course you wish to look for those too using an $or expression.
Matching (using $match) for elements in an array will return the whole array, not just the elements that match.
To illustrate this point, let's expand your input data slightly:
{"designation": [
{"name": "Chairman","level": 17},
{"name": "Director", "level": 13}
]
},
{"designation": [
{"name": "Secretary","level": 13}
]
},
We will use dot notation to reach into the structures in the designation array to find those docs where at least one of the name fields is Chairman:
db.foo.aggregate([
{$match: {"designation.name": "Chairman"}}
]);
{
"_id" : 0,
"designation" : [
{
"name" : "Chairman",
"level" : 17
},
{
"name" : "Director",
"level" : 13
}
]
}
The query eliminated the document with name = Secretary as expected but properly returned the whole document (and the whole array) where name = Chairman. Very often the goal is to fetch only the matching items in the array; this is accomplished with the $filter operator:
db.foo.aggregate([
{$match: {"designation.name": "Chairman"}},
{$project: {
// Assigning the output of $filter to the same name as input:
designation: {$filter: {
input: "$designation",
as: "zz",
cond: {$eq: ['$$zz.name','Chairman']}
}}
}}
]);
{
"_id" : 0,
"designation" : [
{
"name" : "Chairman",
"level" : 17
}
]
}
An alternative approach, which is useful when query conditions yield null or empty arrays instead of eliminating the document altogether, is to $filter first, then match only on results where the array has a length > 0. We must use the $ifNull operator to protect $size from being passed a null by turning it into an empty (but not null) array:
db.foo.aggregate([
{$project: {
// Assigning the output of $filter to the same name as input:
designation: {$filter: {
input: "$designation",
as: "zz",
cond: {$eq: ['$$zz.name','Chairman']}
}}
}},
{$match: {$expr: {$gt:[{$size: {$ifNull:["$designation",[] ]}}, 0]}} }
]);
Try commenting out the $match to see what $filter returns when a document has the target array field but no matches vs. when the document does not have the field.
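For the narrower case in the question, where only the key name is known, a filter on field existence may be enough. The sketch below uses the standard $exists query operator; the helper name hasKey is made up for illustration, and the collection name foo is taken from the examples above.

```javascript
// Build a filter that matches documents containing the given top-level key.
function hasKey(keyName) {
  return { [keyName]: { $exists: true } };
}

// Possible mongosh usage:
//   db.foo.findOne(hasKey("storeRole"))
//   db.foo.findOne(hasKey("designation"), { _id: 0, designation: 1 })
console.log(JSON.stringify(hasKey("storeRole")));
```

Documents that lack the key are simply skipped, as described in the first point above.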
I am building a project using sequelize.js that includes a Tags table and a Stories table. They have a many-to-many relationship, which I created in sequelize with a through table of StoryTag. This all works perfectly so far, but I want to get a list of the most popular tags, that is, how many stories each tag is associated with in the StoryTag table, ordered by the number of stories that use the tag.
This is the MySQL syntax of what I am trying to do. This works perfectly in MySQL Workbench:
SELECT tagName, COUNT(StoryTag.TagId)
FROM Tags
LEFT JOIN StoryTag on Tags.id = StoryTag.TagId
GROUP BY Tags.tagName ORDER BY COUNT(StoryTag.TagId) DESC;
This is what works in sequelize.js. It's a raw query, which is not ideal, but since this doesn't handle any sensitive information, it's not a huge worry, just very inelegant.
//DIRECT QUERY METHOD (TEST)
app.get("/api/directags", function (req, res) {
db.sequelize.query(`select tags.id, tags.TagName, COUNT(stories.id) as num_stories
from tags left join storytag on storytag.TagId = tags.id
left join stories on storytag.StoryId = stories.id
group by tags.id order by num_stories desc;`, {
type: db.Sequelize.QueryTypes.SELECT
}).then(function(result) {
res.send(result);
});
});
This outputs
[
{
"id": 3,
"TagName": "fiction",
"num_stories": 3
},
{
"id": 5,
"TagName": "Nursery Rhyme",
"num_stories": 2
},
...
{
"id": 4,
"TagName": "nonfiction",
"num_stories": 0
}
]
As it should. What doesn't quite work is:
//Sequelize count tags
//Known issues: will not order by the count
//Includes a random 'storytag' many-to-many table row for some reason
app.get("/api/sequelizetags", function (req, res) {
db.Tag.findAll({
attributes: ["id","TagName"],
include: [{
model: db.Story,
attributes: [[db.sequelize.fn("COUNT", "stories.id"), "Count_Of_Stories"]],
duplicating: false
}],
group: ["id"]
}).then(function (dbExamples) {
res.send(dbExamples);
});
});
Which outputs:
[
{
"id": 1,
"TagName": "horror",
"Stories": [
{
"Count_Of_Stories": 1,
"StoryTag": {
"createdAt": "2018-11-29T21:09:46.000Z",
"updatedAt": "2018-11-29T21:09:46.000Z",
"StoryId": 1,
"TagId": 1
}
}
]
},
{
"id": 2,
"TagName": "comedy",
"Stories": []
},
{
"id": 3,
"TagName": "fiction",
"Stories": [
{
"Count_Of_Stories": 3,
"StoryTag": {
"createdAt": "2018-11-29T21:10:04.000Z",
"updatedAt": "2018-11-29T21:10:04.000Z",
"StoryId": 1,
"TagId": 3
}
}
]
},
{
"id": 4,
"TagName": "nonfiction",
"Stories": []
},
...
{
"id": 8,
"TagName": "Drama",
"Stories": [
{
"Count_Of_Stories": 1,
"StoryTag": {
"createdAt": "2018-11-30T01:13:56.000Z",
"updatedAt": "2018-11-30T01:13:56.000Z",
"StoryId": 3,
"TagId": 8
}
}
]
},
{
"id": 9,
"TagName": "Tragedy",
"Stories": []
}
]
This is not in order, and the count of stories is buried. This seems like the sort of thing that would be a common and frequent request from a database, but I am at a loss of how to do this correctly with sequelize.js.
Resources that have failed me:
Sequelize where on many-to-many join
Sequelize Many to Many Query Issue
How to query many-to-many relationship data in Sequelize
Select from many-to-many relationship sequelize
The official documentation for sequelize: http://docs.sequelizejs.com/manual/tutorial/
Some less official and more readable documentation for sequelize: https://sequelize.readthedocs.io/en/v3/docs/querying/
Here's what finally worked, in case anyone else has this question. We also added a where to the include Story, but that's optional.
This resource is easier to understand than the official sequelize docs: https://sequelize-guides.netlify.com/querying/
I also learned that being familiar with promises is really helpful when working with sequelize.
db.Tag.findAll({
group: ["Tag.id"],
includeIgnoreAttributes:false,
include: [{
model: db.Story,
where: {
isPublic: true
}
}],
attributes: [
"id",
"TagName",
[db.sequelize.fn("COUNT", db.sequelize.col("stories.id")), "num_stories"],
],
order: [[db.sequelize.fn("COUNT", db.sequelize.col("stories.id")), "DESC"]]
}).then(function(result){
return result;
});
Please use the same name when you mean the same thing (num_stories vs. Count_Of_Stories, etc.).
For ordering, use the order option.
Include the count in the top-level attributes to get it at the top level of the instance.
I can't find the include[].duplicating option in the docs.
Your case:
db.Tag.findAll({
attributes: [
"id",
"TagName",
[db.sequelize.fn("COUNT", db.sequelize.col("stories.id")), "Count_Of_Stories"]
],
include: [{
model: db.Story,
attributes: [],
duplicating: false
}],
group: ["id"],
order: [
[db.sequelize.literal("`Count_Of_Stories`"), "DESC"]
]
});
To drop the StoryTag join-table row from the output, use through: {attributes: []} in the include options.
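If the nested result from the question's findAll is already in hand, it can also be flattened and sorted in plain JavaScript as a fallback. This is a sketch, not a Sequelize feature: flattenTagCounts is a hypothetical helper, and it assumes the rows are plain objects shaped like the JSON output shown in the question (id, TagName, and a Stories array whose first entry carries Count_Of_Stories).

```javascript
// Flatten rows shaped like the nested output in the question and sort by story count.
function flattenTagCounts(rows) {
  return rows
    .map(({ id, TagName, Stories }) => ({
      id,
      TagName,
      // Tags without stories come back with an empty Stories array.
      num_stories: Stories.length > 0 ? Stories[0].Count_Of_Stories : 0,
    }))
    .sort((a, b) => b.num_stories - a.num_stories);
}

const nestedTags = [
  { id: 1, TagName: "horror", Stories: [{ Count_Of_Stories: 1 }] },
  { id: 2, TagName: "comedy", Stories: [] },
  { id: 3, TagName: "fiction", Stories: [{ Count_Of_Stories: 3 }] },
];
console.log(flattenTagCounts(nestedTags));
```

Doing the count and ordering in the query itself, as in the answers above, is still preferable for large tables, since it avoids shipping the join rows to the application.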
I'm trying to use mongoDb aggregate on this data:
"_id": ObjectId("598dbd301ab6476e5b15e05e"),
"updated_at": ISODate("2017-08-11T14:20:32.865Z"),
"created_at": ISODate("2017-08-11T14:20:32.865Z"),
"action": ObjectId("59760749a398cb323cf1c051"),
"subAction": ObjectId("5980c3807a8cb300110d87d3"),
"person": ObjectId("598dbd2f1ab6476e5b15e05b"),
"session": ObjectId("598dbd2f1ab6476e5b15e05c"),
"dateAccomplish": ISODate("2017-08-11T14:20:32Z"),
"createdBy": ObjectId("595f8426645bf5f47366fb29"),
"updatedBy": ObjectId("595f8426645bf5f47366fb29"),
I need to retrieve two levels of groups: the data has to be grouped by action and then by subAction.
The output data expected looks like this:
movactions: [
{
_id: $created_at,
count: ?,
data:[
{
_id: "$action",
count: 3,
data: [
{
_id: "$subaction",
count: 2
}
]
}
]
},
]
There are many subActions per action; I want to aggregate each action with its child subActions listed.
To group by multiple fields, $group should be applied multiple times with a compound _id, gradually reducing it in each subsequent group:
db.collection.aggregate([
{$group:{
_id:{created_at:"$created_at", action:"$action", subAction:"$subAction"},
count: {$sum:1}
}},
{$group:{
_id:{created_at:"$_id.created_at", action:"$_id.action"},
count: {$sum:1},
data: {$push:{_id: "$_id.subAction", count:"$count"}}
}},
{$group:{
_id:"$_id.created_at",
count: {$sum:1},
data: {$push:{_id: "$_id.action", count:"$count", data:"$data"}}
}},
{$project: {
_id:0,
created_at:"$_id",
count:1,
data:1
}}
]);
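To see what shape the three $group stages produce, the same steps can be mirrored on a small in-memory array. This is only an illustrative sketch: groupBy and nestCounts are hypothetical helpers, and the sample values ("d1", "A", "a1", ...) are placeholders for the real dates and ObjectIds. Note that, as in the pipeline, count at the two outer levels is the number of distinct children, because each stage uses {$sum: 1}; summing "$count" instead would give document totals.

```javascript
// Bucket items by a string key (insertion order preserved), then build one row per bucket.
function groupBy(items, keyOf, build) {
  const buckets = new Map();
  for (const item of items) {
    const key = keyOf(item);
    if (!buckets.has(key)) buckets.set(key, []);
    buckets.get(key).push(item);
  }
  return [...buckets.values()].map(build);
}

function nestCounts(docs) {
  // Stage 1: one row per (created_at, action, subAction) with its document count.
  const stage1 = groupBy(
    docs,
    (d) => JSON.stringify([d.created_at, d.action, d.subAction]),
    (g) => ({ created_at: g[0].created_at, action: g[0].action, subAction: g[0].subAction, count: g.length })
  );
  // Stage 2: one row per (created_at, action); count = number of distinct subActions.
  const stage2 = groupBy(
    stage1,
    (r) => JSON.stringify([r.created_at, r.action]),
    (g) => ({
      created_at: g[0].created_at,
      action: g[0].action,
      count: g.length,
      data: g.map((r) => ({ _id: r.subAction, count: r.count })),
    })
  );
  // Stage 3 plus the final $project: one row per created_at; count = number of actions.
  return groupBy(
    stage2,
    (r) => JSON.stringify(r.created_at),
    (g) => ({
      created_at: g[0].created_at,
      count: g.length,
      data: g.map((r) => ({ _id: r.action, count: r.count, data: r.data })),
    })
  );
}

const actionDocs = [
  { created_at: "d1", action: "A", subAction: "a1" },
  { created_at: "d1", action: "A", subAction: "a1" },
  { created_at: "d1", action: "A", subAction: "a2" },
  { created_at: "d1", action: "B", subAction: "b1" },
];
console.log(JSON.stringify(nestCounts(actionDocs), null, 2));
```

Unlike this sketch, MongoDB does not guarantee the order of groups or of pushed elements unless a $sort stage is added.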
I haven't been very good at Googling for this answer.
I have around 115 different fields that might be in each record. The collection is the output of a mapReduce on an amazingly large dataset.
Looks like this:
{_id:'number1', value:{'a':1, 'b':2, 'f':5}},
{_id:'number2', value:{'e':2, 'f':114, 'h':12}},
{_id:'number3', value:{'i':2, 'j':22, 'z':12, 'za':111, 'zb':114}}
Any ideas of how I might find records with 5 fields populated?
It's still not a nice query to run, but there is a slightly more modern way to do it via $objectToArray and $redact
db.collection.aggregate([
{ "$redact": {
"$cond": {
"if": {
"$eq": [
{ "$size": { "$objectToArray": "$value" } },
3
]
},
"then": "$$KEEP",
"else": "$$PRUNE"
}
}}
])
Where $objectToArray basically coerces the object into an array form, much like a combination of Object.keys() and .map() would in JavaScript.
It's still not a fantastic idea since it does require scanning the whole collection, but at least the aggregation framework operations use "native code" as opposed to JavaScript interpretation as is the case using $where.
So it's still generally advisable to change data structure and use a natural array as well as stored "size" properties where possible in order to make the most effective query operations.
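On MongoDB 3.6 or later, the same size test can probably be written more compactly with $match and $expr instead of $redact. The sketch below only builds the pipeline; fieldCountPipeline is a hypothetical helper name, and it assumes every document has an object in value (a missing or null value would make $size fail).

```javascript
// Build a pipeline that keeps documents whose "value" sub-document has exactly n fields.
function fieldCountPipeline(n) {
  return [
    { $match: { $expr: { $eq: [{ $size: { $objectToArray: "$value" } }, n] } } },
  ];
}

// Possible mongosh usage: db.collection.aggregate(fieldCountPipeline(5))
console.log(JSON.stringify(fieldCountPipeline(5)));
```

This still scans the whole collection, so the advice about storing a size property applies equally.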
Yes it is possible to do but not in the nicest way. The reason for this is that you are essentially using a $where operator query which uses JavaScript evaluation to match the contents. Not the most efficient way as this can never use an index and needs to test all the documents:
db.collection.find({ "$where": "return Object.keys(this.value).length == 3" })
This looks for documents with exactly "three" elements, so only two of your listed documents would be returned:
{ "_id" : "number1", "value" : { "a" : 1, "b" : 2, "f" : 5 } }
{ "_id" : "number2", "value" : { "e" : 2, "f" : 114, "h" : 12 } }
Or for "five" fields or more you can do much the same:
db.collection.find({ "$where": "return Object.keys(this.value).length >= 5" })
So the arguments to that operator are effectively JavaScript statements that are evaluated on the server, and a document is returned where the result is true.
A more efficient way is to store the "count" of the elements in the document itself. In this way you can "index" this field and the queries are much more efficient as each document in the collection selected by other conditions does not need to be scanned to determine the length:
{_id:'number1', value:{'a':1, 'b':2, 'f':5}, count: 3},
{_id:'number2', value:{'e':2, 'f':114, 'h':12}, count: 3},
{_id:'number3', value:{'i':2, 'j':22, 'z':12, 'za':111, 'zb':114}, count: 5}
Then to get the documents with "five" elements you only need the simple query:
db.collection.find({ "count": 5 })
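One way to keep such a count in step with the data is to compute it in application code just before the document is written. This is a sketch under that assumption; withCount is a hypothetical helper, not part of any driver.

```javascript
// Return a copy of the document with "count" set to the number of keys in "value".
function withCount(doc) {
  return { ...doc, count: Object.keys(doc.value).length };
}

// Possible mongosh usage:
//   db.collection.insertOne(withCount({ _id: "number1", value: { a: 1, b: 2, f: 5 } }))
//   db.collection.createIndex({ count: 1 })
console.log(withCount({ _id: "number1", value: { a: 1, b: 2, f: 5 } }));
```

Any later update that adds or removes keys in value would need to adjust count as well, otherwise the stored number drifts from the real one.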
That is generally the optimal form. Another point, though, is that the general "Object" structure you might be happy with from general practice is not something that MongoDB "plays well" with. The problem is "traversal" of the elements in the object, and MongoDB is much happier when you use an "array", as in this form:
{
'_id': 'number1',
'values':[
{ 'key': 'a', 'value': 1 },
{ 'key': 'b', 'value': 2 },
{ 'key': 'f', 'value': 5 }
],
},
{
'_id': 'number2',
'values':[
{ 'key': 'e', 'value': 2 },
{ 'key': 'f', 'value': 114 },
{ 'key': 'h', 'value': 12 }
],
},
{
'_id':'number3',
'values': [
{ 'key': 'i', 'value': 2 },
{ 'key': 'j', 'value': 22 },
{ 'key': 'z', 'value': 12 },
{ 'key': 'za', 'value': 111 },
{ 'key': 'zb', 'value': 114 }
]
}
So if you actually switch to an "array" format like that then you can do an exact length of an array with one version of the $size operator:
db.collection.find({ "values": { "$size": 5 } })
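Switching existing documents to that array format could be done with a small conversion step in application code. The following is a minimal sketch; toArrayForm is a hypothetical helper, and it assumes each document has an object under value as in the question.

```javascript
// Convert { value: { a: 1, ... } } into { values: [{ key: "a", value: 1 }, ...] }.
function toArrayForm(doc) {
  const { value, ...rest } = doc;
  return {
    ...rest,
    values: Object.entries(value).map(([key, v]) => ({ key, value: v })),
  };
}

console.log(JSON.stringify(toArrayForm({ _id: "number1", value: { a: 1, b: 2, f: 5 } })));
```

The converted document can then be matched with the $size query above, because values is a real array.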
That operator matches an exact array length, as that is the basic provision of what it can do. What you cannot do, as is documented, is an "inequality" match. For that you need the "aggregation framework" for MongoDB, which is a better alternative to JavaScript and mapReduce operations:
db.collection.aggregate([
// Project a size of the array
{ "$project": {
"values": 1,
"size": { "$size": "$values" }
}},
// Match on that size
{ "$match": { "size": { "$gte": 5 } } },
// Project just the same fields
{ "$project": {
"values": 1
}}
])
So those are the alternatives. There is a "native" method available to aggregation and an array type. But it is fairly arguable that the JavaScript evaluation is also "native" to MongoDB, just not implemented in native code.
Since MongoDB version 3.6 you can also use $jsonSchema for this (here's the documentation):
db.getCollection('YOURCOLLECTION').find({
"$jsonSchema":{
"properties":{
"value":{"minProperties": 5}
}
}
})
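Note that minProperties: 5 on its own matches five or more fields. For an exact count, minProperties and maxProperties can be combined, as in this sketch; exactPropertyCountFilter is a hypothetical helper name, and value is the field name from the question.

```javascript
// Build a $jsonSchema filter matching documents whose "value" object has exactly n properties.
function exactPropertyCountFilter(n) {
  return {
    $jsonSchema: {
      properties: {
        value: { minProperties: n, maxProperties: n },
      },
    },
  };
}

// Possible mongosh usage: db.getCollection('YOURCOLLECTION').find(exactPropertyCountFilter(5))
console.log(JSON.stringify(exactPropertyCountFilter(5)));
```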