Can someone please help me update a collection based on another? I have a pickups collection like so.
{
"_id": {
"$oid": "53a46be700b94521574b6f75"
},
"created": {
"$date": 1403236800000
},
"receivers": [
{
"model": "somemodel1",
"serial": "someserial1",
"access": "someaccess1"
},
{
"model": "somemodel2",
"serial": "someserial2",
"access": "someaccess2"
},
{
"model": "somemodel3",
"serial": "someserial3",
"access": "someaccess3"
}
],
"__v": 0
}
I would like to iterate through the receivers array, look up each access value in another collection, and, if it is found, add the activityNumber of the work order it was found in.
Here is the workorders collection I want to search in.
{
"_id": {
"$oid": "53af72481b2aeade0b46d025"
},
"activityNumber": "someactivity",
"date": "06/28/2014",
"lines": [
{
"Line #": "1",
"Access Card #": "someaccess1"
},
{
"Line #": "2",
"Access Card #": "someaccess2"
},
{
"Line #": "3",
"Access Card #": "someaccess3"
}
]
}
And this is what I would like to end up with.
{
"_id": {
"$oid": "53a46be700b94521574b6f75"
},
"created": {
"$date": 1403236800000
},
"receivers": [
{
"model": "somemodel1",
"serial": "someserial1",
"access": "someaccess1",
"activityNumber": "someactivity"
},
{
"model": "somemodel2",
"serial": "someserial2",
"access": "someaccess2",
"activityNumber": "someactivity"
},
{
"model": "somemodel3",
"serial": "someserial3",
"access": "someaccess3",
"activityNumber": "someactivity"
}
],
"__v": 0
}
I have created an array containing all the access values from pickups.
var prodValues = db.pickups.aggregate([
{ "$unwind":"$receivers" },
{ "$group": {
"_id": null,
"products": { "$addToSet": "$receivers.access"}
}}
])
I can easily iterate through the array, search the workorders collection, and return the activity each value is used in. But I'm not sure how to perform the find and then update the pickups collection when a match is found.
db.workorders.find({ "lines.Access Card #": { "$in": prodValues.result[0].products }},{activityNumber:1})
Thank you for your help.
Really I would loop this in the completely opposite order as that should be more efficient:
var result = db.workorders.aggregate([
{ "$project": {
"activityNumber": 1,
"access": "$lines.Access Card #",
}}
]).result;
result.forEach(function(res) {
res.access.forEach(function(acc) {
db.pickups.update(
{ "receivers.access": acc },
{ "$set": { "receivers.$.activityNumber": res.activityNumber } }
);
});
});
With MongoDB 2.6 you can clean this up a bit with a cursor on the aggregate output and the use of the bulk operations API:
var batch = db.pickups.initializeOrderedBulkOp();
var counter = 0;
db.workorders.aggregate([
{ "$project": {
"activityNumber": 1,
"access": "$lines.Access Card #",
}}
]).forEach(function(res) {
res.access.forEach(function(acc) {
batch.find({ "receivers.access": acc }).updateOne(
{ "$set": { "receivers.$.activityNumber": res.activityNumber } }
);
counter++; // count every queued update
});
// Send the queued updates once enough have accumulated
if ( counter >= 500 ) {
batch.execute();
batch = db.pickups.initializeOrderedBulkOp();
counter = 0;
}
});
// Flush whatever is still queued
if ( counter > 0 )
batch.execute();
Either way, you are matching both the document and the position within its array on the "access" value taken from the current line of each document returned by the aggregation query. This allows the related information to be updated at the matched position.
The MongoDB 2.6 improvement is that you are not pulling all the results from the "workorders" collection into memory as an array; each document is pulled from the cursor only as it is needed.
The Bulk operations API stores the "updates" in manageable blocks that should fall under the 16MB BSON limit, and sends them to the server in those blocks instead of as individual updates. The driver implementation should handle most of this, but some "self management" is added just to be safe.
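To make the matching step concrete, here is a minimal in-memory sketch in plain JavaScript with no MongoDB connection. The helper name applyActivityNumbers is hypothetical and not part of any API; it only imitates, under simplified assumptions, how a non-multi update stops at the first matching document and how the positional $ targets the first matching array element.

```javascript
// In-memory illustration only; on a real deployment the server does this work.
function applyActivityNumbers(pickups, workorders) {
  for (const order of workorders) {
    for (const line of order.lines) {
      const access = line["Access Card #"];
      // Like a non-multi update: stop at the first matching document.
      for (const pickup of pickups) {
        // Like the positional $: the first array element that matches.
        const receiver = pickup.receivers.find((r) => r.access === access);
        if (receiver) {
          receiver.activityNumber = order.activityNumber;
          break;
        }
      }
    }
  }
  return pickups;
}

// Sample data shaped like the documents in the question (trimmed).
const pickups = [{ receivers: [
  { model: "somemodel1", access: "someaccess1" },
  { model: "somemodel2", access: "someaccess2" },
] }];
const workorders = [{ activityNumber: "someactivity", lines: [
  { "Line #": "1", "Access Card #": "someaccess1" },
  { "Line #": "2", "Access Card #": "nomatch" },
] }];
const updated = applyActivityNumbers(pickups, workorders);
```

Only the receiver whose access value appears in a work order line gains an activityNumber; the other receiver is left untouched.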
Related
I have been trying to insert a piece of data into my db collection. My schema looks like this:
{
"_id": {
"$oid": "62d30157607575f6be6ce948"
},
"semester": "sem-5",
"subjectData": {
"subjectName": "eco",
"questionBank": [{
"question": "who are",
"answer": {
"introduction": "Hello my name is Izaan",
"features": ["tall", "handsome"],
"kinds": ["very", "kind"],
"conclusion": "Mighty man",
"_id": {
"$oid": "62d30157607575f6be6ce94b"
}
},
"_id": {
"$oid": "62d30157607575f6be6ce94a"
}
}],
"_id": {
"$oid": "62d30157607575f6be6ce949"
}
},
"__v": {
"$numberInt": "0"
}
}
I would like to insert questions and answers depending on semester and subject: first query for the semester, then the subject, and then insert the data into the questionBank array without deleting the previous data.
Here is how I am doing this. It gives all kinds of errors, and I have still not managed to get the desired result.
// IF DATA EXISTS UPDATE QUESTIONSBANK WITHOUT DELETING ANY ITEMS
try {
const result = await semesterData.update({
semester: {
$eq: semesterRecieved
},
subjectName: {
$eq: subject
},
}, {
$push: {
question: questionRecieved,
answer: answerRecieved2,
},
}, );
res.send(result);
} catch (error) {
console.log(error);
res.status(404).send(error.message);
}
}
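One possible direction, offered as a sketch and not as a confirmed fix: in the sample document subjectName and questionBank both sit inside subjectData, so the filter and the $push target would need dot notation. The helper name buildQuestionPush below is hypothetical; it only builds the two plain objects and does not talk to the database.

```javascript
// Hypothetical helper: builds the filter and update arguments for an update call.
function buildQuestionPush(semester, subject, question, answer) {
  const filter = {
    semester: semester,
    // subjectName is nested inside subjectData, so dot notation is used
    "subjectData.subjectName": subject,
  };
  const update = {
    // $push appends one element and leaves the existing array entries in place
    $push: {
      "subjectData.questionBank": { question: question, answer: answer },
    },
  };
  return { filter: filter, update: update };
}

// Example values are made up for illustration.
const built = buildQuestionPush("sem-5", "eco", "what is demand", {
  introduction: "intro text",
  features: ["a", "b"],
  kinds: ["c"],
  conclusion: "closing text",
});
```

With the Mongoose model from the question, the two objects would presumably be passed as semesterData.updateOne(built.filter, built.update).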
I have a page like this, and the data comes from my MongoDB server.
At first I created a User collection and stored the user data in it. I then added a Counter collection because I want to show each user's view count.
Each user has multiple pages, and the view count is the sum of that user's Counter data.
Documents in the User collection look like this; I only need the login and id (unique) fields.
Users Collection
{
"_id": {
"$oid": "5f7e92b88dc8f64cb4e6c2c0"
},
"login": "mojombo",
"id": 1,
"avatar_url": "https://avatars0.githubusercontent.com/u/1?v=4",
"url": "https://api.github.com/users/mojombo",
"html_url": "https://github.com/mojombo",
"type": "User",
"site_admin": "false",
"name": "Tom Preston-Werner",
"company": null,
"blog": "http://tom.preston-werner.com",
"location": "San Francisco",
"email": "tom#mojombo.com",
"hireable": null,
"bio": null,
"created_at": {
"$date": "2007-10-20T05:24:19.000Z"
},
"updated_at": {
"$date": "2020-09-22T15:50:44.000Z"
},
"registerDate": {
"$date": "2020-10-08T04:16:56.459Z"
},
"__v": 0
}
Counters collection
{
"_id": {
"$oid": "5f8e5bde9054ba2477dc2c57"
},
"repoName": "ale",
"repoNumber": 171780764,
"userName": "technicalpickles",
"userNumber": 159,
"viewDate": "2020-10-20",
"count": 1
}
From the Counters collection I only need the userName and count fields to calculate the counter data.
I tried a for loop with the aggregate method, but there are so many users that I have to send too many queries to the server (with 10000 users I have to send 10000 requests). So I must not use the for loop approach.
router.get(`/sitemap/0`, async (req, res, next) => {
let name;
let dataArray = [];
let pageNumber = (Number(req.params.page) * 10000); // Current Page Number
let nextPage = (Number(req.params.page) + 1) * 10000; // Next Page Number
let pageResult; // Page Result
try {
let users = await User.find({}, 'login id').limit(1000)
for (let i = 0; i < 1000; i++) {
name = users[i].login
let counters = await Counter.aggregate([{
$match: {
id: users.id,
userName: users[i].login
}
},
{
$group: {
_id: `${users[i].login}`,
count: {
$sum: "$count"
}
}
}
])
dataArray.push(counters)
console.log(counters)
}
console.log(dataArray)
} catch (e) {
// throw an error
throw e;
}
res.render("sitemap", {
dataArray
})
});
Codes result
So I want to send a single query, and I think it should use the 'aggregate' method, but without a 'for loop'.
I want to join the Users collection and the Counters collection, but I heard that MongoDB and Mongoose have no join function.
I just want the result to look like this. Is there any way to do that?
{
"_id": {
"$oid": "5f7e92b88dc8f64cb4e6c2c0"
},
"login": "mojombo", --------------Match login with counter collection 'userName' value
"id": 1, ------------------------Match id with counter collection 'id' value
"count": [Sum of counters count data and it should be Number] ------- Only this field is added
"avatar_url": "https://avatars0.githubusercontent.com/u/1?v=4",
"url": "https://api.github.com/users/mojombo",
"html_url": "https://github.com/mojombo",
"type": "User",
"site_admin": "false",
"name": "Tom Preston-Werner",
"company": null,
"blog": "http://tom.preston-werner.com",
"location": "San Francisco",
"email": "tom#mojombo.com",
"hireable": null,
"bio": null,
"created_at": {
"$date": "2007-10-20T05:24:19.000Z"
},
"updated_at": {
"$date": "2020-09-22T15:50:44.000Z"
},
"registerDate": {
"$date": "2020-10-08T04:16:56.459Z"
},
"__v": 0
}
$lookup with counters collection using pipeline stage, let to pass 2 fields id, login to match with counters collection
$match both fields conditions
$addFields to count sum of count field from counters array, using $reduce to iterate loop and $add to sum values of count field
$skip for pagination
$limit pass your limit of documents
let page = 1; // start pagination from first
let limit = 1000;
let skip = (page - 1) * limit;
let users = await User.aggregate([
{
$lookup: {
from: "counters",
let: {
id: "$id",
login: "$login"
},
pipeline: [
{
$match: {
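$expr: {
// Both conditions must hold; two $eq keys in one object would overwrite each other
$and: [
{ $eq: ["$userName", "$$login"] },
{ $eq: ["$userNumber", "$$id"] }
]
}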
}
}
],
as: "count"
}
},
{
$addFields: {
count: {
$reduce: {
input: "$count",
initialValue: 0,
in: { $add: ["$$value", "$$this.count"] }
}
}
}
},
{ $skip: skip },
{ $limit: limit }
])
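As a rough picture of what the $lookup and $reduce stages compute, here is an in-memory equivalent in plain JavaScript. The function name addViewCounts is hypothetical and the sample values are trimmed from the documents in the question; this is an illustration of the logic, not a replacement for the pipeline.

```javascript
// For every user, sum the count field of the counters that match on both
// userName === login and userNumber === id (the two $lookup conditions).
function addViewCounts(users, counters) {
  return users.map((user) => {
    const count = counters
      .filter((c) => c.userName === user.login && c.userNumber === user.id)
      .reduce((value, c) => value + c.count, 0); // same idea as $reduce with $add
    return { ...user, count: count };
  });
}

const users = [
  { login: "mojombo", id: 1 },
  { login: "technicalpickles", id: 159 },
];
const counters = [
  { repoName: "ale", userName: "technicalpickles", userNumber: 159, count: 1 },
  { repoName: "other", userName: "technicalpickles", userNumber: 159, count: 4 },
];
const withCounts = addViewCounts(users, counters);
```

A user with no matching counters ends up with a count of 0, which mirrors the initialValue of 0 in the $reduce stage.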
Is it possible to have $facet return an object instead of an array? It seems a bit counterintuitive to need to access result[0].total instead of just result.total
code (using mongoose):
Model
.aggregate()
.match({
"name": { "$regex": name },
"user_id": ObjectId(req.session.user.id),
"_id": { "$nin": except }
})
.facet({
"results": [
{ "$skip": start },
{ "$limit": finish },
{
"$project": {
"map_levels": 0,
"template": 0
}
}
],
"total": [
{ "$count": "total" },
]
})
.exec()
Each field you get from $facet represents a separate aggregation pipeline, which is why you always get an array. You can use $addFields to overwrite the existing total with a single element, and $arrayElemAt to get that first item.
Model
.aggregate()
.match({
"name": { "$regex": name },
"user_id": ObjectId(req.session.user.id),
"_id": { "$nin": except }
})
.facet({
"results": [
{ "$skip": start },
{ "$limit": finish },
{
"$project": {
"map_levels": 0,
"template": 0
}
}
],
"total": [
{ "$count": "total" },
]
})
.addFields({
"total": {
$arrayElemAt: [ "$total", 0 ]
}
})
.exec()
You can try this as well
Model
.aggregate()
.match({
"name": { "$regex": name },
"user_id": ObjectId(req.session.user.id),
"_id": { "$nin": except }
})
.facet({
"results": [
{ "$skip": start },
{ "$limit": finish },
{
"$project": {
"map_levels": 0,
"template": 0
}
}
],
"total": [
{ "$count": "total" },
]
})
.addFields({
"total": {
"$ifNull": [{ "$arrayElemAt": [ "$total.total", 0 ] }, 0]
}
})
.exec()
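If changing the pipeline is not convenient, the same unwrapping can be sketched on the application side. The helper name unwrapFacet is hypothetical; it assumes the aggregate output has the shape produced by the $facet stage above, where total is an empty array when nothing matched.

```javascript
// Takes the array returned by aggregate() and flattens it to { results, total }.
function unwrapFacet(aggregateOutput) {
  // Defensive default in case the output array is empty.
  const [facet = { results: [], total: [] }] = aggregateOutput;
  // $count emits no document for empty input, so total may be [].
  const total = facet.total.length > 0 ? facet.total[0].total : 0;
  return { results: facet.results, total: total };
}

const page = unwrapFacet([{ results: [{ name: "a" }], total: [{ total: 7 }] }]);
const emptyPage = unwrapFacet([{ results: [], total: [] }]);
```

After this, page.total can be read directly as a number instead of going through result[0].total.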
Imagine that you want to pass the result of $facet to the next stage, say $match. $match takes an array of documents as input and returns an array of the documents that matched an expression. If the output of $facet were a single element, we could not pass it to $match, because the output type of $facet would not be the same as the input type of $match ($match is just an example). In my opinion it is better to keep the output of $facet as an array to avoid having to handle those situations.
PS: nothing I said here is official.
I have the following schema:
{ "_id": {
"$oid": "58c0204d9f10810115f13e5d"
},"OrgName": "A",
"modules": [
{
"name": "test",
"fullName": "john smith",
"_id": {
"$oid": "58c0204d9f10810115f13e5e"
},
"TimeSavedPlanning": 520,
"TimeSavedWorking": 1000,
"costSaved": 0
},
{
"name": "test1",
"fullName": "john smith",
"_id": {
"$oid": "58c020f85437c22215be92cc"
},
"TimeSavedPlanning": 0,
"TimeSavedWorking": 1000,
"costSaved": 500
}
]
}
I want to aggregate the data within the "modules" array for all documents where OrgName = A and output the following totals.
TimeSavedPlanning = 520 (because 520 + 0 = 520)
TimeSavedWorking = 2000 (because 1000 + 1000 = 2000)
costSaved = 500 (because 0 + 500 = 500)
Just supply each field for the $group accumulators. And use the "double barreled" $sum to "sum" both from arrays, and from documents:
Model.aggregate([
{ "$match": { "OrgName": "A" } },
{ "$group": {
"_id": null,
"TimeSavedPlanning": { "$sum": { "$sum": "$modules.TimeSavedPlanning" } },
"TimeSavedWorking": { "$sum": { "$sum": "$modules.TimeSavedWorking" } },
"costSaved": { "$sum": { "$sum": "$modules.costSaved" } }
}}
])
You have been allowed to use $sum like that since MongoDB 3.2. Since that release it has "two" functions:
Takes an "array" of values and "sums" them together.
Acts as an "accumulator" within $group to "sum" values provided from documents.
So here you use "both" functions by "reducing" the arrays down to numeric values per document, and then "accumulating" via the $group.
Of course the $match does the "selection" right at the beginning of the operation chain. It belongs there because it determines which data is selected, and because as the "first" stage it can use an "index".
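The two levels of summing can be pictured with a small in-memory sketch in plain JavaScript. The function name totalsForOrg is hypothetical, and the sample documents are trimmed copies of the schema in the question.

```javascript
function totalsForOrg(docs, orgName) {
  const fields = ["TimeSavedPlanning", "TimeSavedWorking", "costSaved"];
  const totals = Object.fromEntries(fields.map((f) => [f, 0]));
  // $match: keep only the documents for the requested organisation
  for (const doc of docs.filter((d) => d.OrgName === orgName)) {
    for (const field of fields) {
      // inner $sum: reduce the modules array to one number per document
      const perDocument = doc.modules.reduce((sum, m) => sum + m[field], 0);
      // outer $sum: accumulate that number across documents in the group
      totals[field] += perDocument;
    }
  }
  return totals;
}

const docs = [
  { OrgName: "A", modules: [
    { name: "test", TimeSavedPlanning: 520, TimeSavedWorking: 1000, costSaved: 0 },
    { name: "test1", TimeSavedPlanning: 0, TimeSavedWorking: 1000, costSaved: 500 },
  ] },
  { OrgName: "B", modules: [
    { name: "x", TimeSavedPlanning: 1, TimeSavedWorking: 1, costSaved: 1 },
  ] },
];
const totals = totalsForOrg(docs, "A");
```

For the sample document this yields the totals asked for in the question, and the document for organisation B is ignored.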
My data looks like this:
{
"foo_list": [
{
"id": "98aa4987-d812-4aba-ac20-92d1079f87b2",
"name": "Foo 1",
"slug": "foo-1"
},
{
"id": "98aa4987-d812-4aba-ac20-92d1079f87b2",
"name": "Foo 1",
"slug": "foo-1"
},
{
"id": "157569ec-abab-4bfb-b732-55e9c8f4a57d",
"name": "Foo 3",
"slug": "foo-3"
}
]
}
Where foo_list is a field in a model called Bar. Notice that the first and second objects in the array are complete duplicates.
Aside from the obvious solution of switching to PostgreSQL, what MongoDB query can I run to remove duplicate entries from foo_list?
Similar answers that do not quite cut it:
https://stackoverflow.com/a/16907596/432
https://stackoverflow.com/a/18804460/432
Those answers would work if the array had bare strings in it. However, in my situation the array is filled with objects.
I hope it is clear that I am not interested in just querying the database; I want the duplicates to be gone from the database forever.
Purely from an aggregation framework point of view there are a few approaches to this.
You can either just apply $setUnion in modern releases:
db.collection.aggregate([
{ "$project": {
"foo_list": { "$setUnion": [ "$foo_list", "$foo_list" ] }
}}
])
Or more traditionally with $unwind and $addToSet:
db.collection.aggregate([
{ "$unwind": "$foo_list" },
{ "$group": {
"_id": "$_id",
"foo_list": { "$addToSet": "$foo_list" }
}}
])
Or if you were just interested in the duplicates only then by general grouping:
db.collection.aggregate([
{ "$unwind": "$foo_list" },
{ "$group": {
"_id": {
"_id": "$_id",
"foo_list": "$foo_list"
},
"count": { "$sum": 1 }
}},
{ "$match": { "count": { "$ne": 1 } } },
{ "$group": {
"_id": "$_id._id",
"foo_list": { "$push": "$_id.foo_list" }
}}
])
The last form could be useful to you if you actually want to "remove" the duplicates from your data with another update statement as it identifies the elements which are duplicates.
So in that last form the returned result from your sample data identifies the duplicate:
{
"_id" : ObjectId("53f5f7314ffa9b02cf01c076"),
"foo_list" : [
{
"id" : "98aa4987-d812-4aba-ac20-92d1079f87b2",
"name" : "Foo 1",
"slug" : "foo-1"
}
]
}
Results are returned per document that contains duplicate entries in the array, along with which entries are duplicated. This is the information you need for the update, so you loop over the results and use them to specify what to remove.
This is actually done with two update statements per document, as a simple $pull operation would remove "both" items, which is not what you want:
var cursor = db.collection.aggregate([
{ "$unwind": "$foo_list" },
{ "$group": {
"_id": {
"_id": "$_id",
"foo_list": "$foo_list"
},
"count": { "$sum": 1 }
}},
{ "$match": { "count": { "$ne": 1 } } },
{ "$group": {
"_id": "$_id._id",
"foo_list": { "$push": "$_id.foo_list" }
}}
])
var batch = db.collection.initializeOrderedBulkOp();
var count = 0;
cursor.forEach(function(doc) {
doc.foo_list.forEach(function(dup) {
batch.find({ "_id": doc._id, "foo_list": { "$elemMatch": dup } }).updateOne({
"$unset": { "foo_list.$": "" }
});
batch.find({ "_id": doc._id }).updateOne({
"$pull": { "foo_list": null }
});
});
count++;
if ( count % 500 == 0 ) {
batch.execute();
batch = db.collection.initializeOrderedBulkOp();
}
});
if ( count % 500 != 0 ) {
batch.execute();
}
That's the modern MongoDB 2.6 and above way to do it, with a cursor result from aggregation and Bulk operations for updates. But the principles remain the same:
Identify the duplicates in documents
Loop the results to issue the updates to the affected documents
Use $unset with the positional $ operator to set the "first" matched array element to null
Use $pull to remove the null entry from the array
So after processing the above operations your sample now looks like this:
{
"_id" : ObjectId("53f5f7314ffa9b02cf01c076"),
"foo_list" : [
{
"id" : "98aa4987-d812-4aba-ac20-92d1079f87b2",
"name" : "Foo 1",
"slug" : "foo-1"
},
{
"id" : "157569ec-abab-4bfb-b732-55e9c8f4a57d",
"name" : "Foo 3",
"slug" : "foo-3"
}
]
}
The duplicate is removed with one copy of the "duplicated" item still intact. That is how you identify and remove the duplicate data from your collection.
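The $unset-then-$pull pair can be pictured with an in-memory sketch in plain JavaScript. The helper name removeOneCopy is hypothetical, and comparing with JSON.stringify is an assumption that stands in for the server-side match; it only works here because the duplicated objects have the same fields in the same order.

```javascript
function removeOneCopy(fooList, dup) {
  const same = (a, b) => JSON.stringify(a) === JSON.stringify(b);
  const list = fooList.slice(); // work on a copy
  // Step 1, like $unset with the positional $: null out the first match only.
  const index = list.findIndex((item) => same(item, dup));
  if (index !== -1) list[index] = null;
  // Step 2, like $pull with null: drop the placeholder.
  return list.filter((item) => item !== null);
}

const fooList = [
  { id: "98aa4987-d812-4aba-ac20-92d1079f87b2", name: "Foo 1", slug: "foo-1" },
  { id: "98aa4987-d812-4aba-ac20-92d1079f87b2", name: "Foo 1", slug: "foo-1" },
  { id: "157569ec-abab-4bfb-b732-55e9c8f4a57d", name: "Foo 3", slug: "foo-3" },
];
const cleaned = removeOneCopy(fooList, fooList[0]);
```

Removing the first match and leaving the second is what keeps one copy intact, whereas filtering out every match at once would behave like the plain $pull that removes both items.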