Is there a MongoDB equivalent of a LEFT JOIN query that returns only the rows with no match in the right collection?
SQL:
SELECT * FROM TableA as A LEFT JOIN TableB as B ON A.id = B.id
WHERE B.Id IS NULL
MongoDB: ???
P.S.: My initial sketch:
db.getCollection('collA').aggregate([
{
$lookup:
{
from: "collB",
localField: "_id",
foreignField: "_id",
as: "collB"
}
}
//, {$match : collB is empty}
])
Well your edit basically has the answer. Simply $match where the array is empty:
db.getCollection('collA').aggregate([
{ "$lookup": {
"from": "collB",
"localField": "_id",
"foreignField": "_id",
"as": "collB"
}},
{ "$match": { "collB.0": { "$exists": false } } }
])
The $exists test on the array index of 0 is the most efficient way to ask in a query "is this an array with items in it".
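To make the logic concrete, here is a small in-memory sketch of what the two stages do. It is plain JavaScript over arrays, not a MongoDB call, and the helper name leftAntiJoin and the sample data are made up for illustration:

```javascript
// Emulates $lookup on _id followed by { "collB.0": { $exists: false } }.
function leftAntiJoin(collA, collB) {
  return collA
    // $lookup: attach every collB document with the same _id as an array
    .map(a => ({ ...a, collB: collB.filter(b => b._id === a._id) }))
    // $match: keep only documents whose array has no element at index 0
    .filter(doc => !(0 in doc.collB));
}

const sampleA = [{ _id: 1 }, { _id: 2 }, { _id: 3 }];
const sampleB = [{ _id: 2 }];
const unmatched = leftAntiJoin(sampleA, sampleB);
console.log(unmatched.map(d => d._id)); // ids present in A but not in B
```

Checking for index 0 avoids computing the array length, which is the same idea as the "collB.0" test in the pipeline.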
Neil Lunn's solution works, but I took another approach, because
the $lookup stage does not support a sharded collection in "from".
So I use a simple JavaScript loop instead, as follows. It is simple and easy to modify, but for performance you need proper indexes!
var mycursor = db.collA.find( {}, {_id: 0, myId:1} )
mycursor.forEach( function (x){
var out = db.collB.count( { yourId : x.myId } )
if ( out > 0) {
print('The id exists! ' + x.myId); //debugging only
//put your other query in here....
}
} )
Hi, I have a Mongo schema called "payments" with two optional keys:
userId and representativeId (if userId exists, representativeId does not, and vice versa).
I start from the cheque schema and use $match to filter the cheques based on my data. The first $lookup brings in the payments that match my query; the second and third lookups are meant to bring in the user data or the representative data.
The result may be a mix of the two, and that is fine: if one of them does not exist, the other must exist.
I want the final array to include the user or the representative, or a mix of them, in the same array.
I am using aggregate to implement this.
The problem is that I get back an empty array when both lookups (user and representative) are present.
When I comment out either the user lookup or the representative lookup, leaving the payment lookup followed by just one of them, it works as I want, but only for users or only for representatives.
I want the array to contain both.
const userAndRepData = await ChequeDB.aggregate<{[key: string]: any}>([
{
$match: {
$and: [
{
'chequeNumber': {
$in: chequeData.map(c => c.chequeNumber)
},
'accountNumber': {
$in: chequeData.map(c => c.accountNumber)
}
}
]
}
},
{
$lookup: {
from: 'payments',
localField: 'paymentId',
foreignField: '_id',
as: 'payment'
}
},
{
$unwind:'$payment'
},
{
$lookup: {
from: 'representatives',
localField: 'payment.representativeId',
foreignField: '_id',
as: 'representative'
}
},
{
$unwind: '$representative'
},
{
$lookup: {
from: 'users',
localField: 'payment.userId',
foreignField: '_id',
as: 'user'
}
},
{
$unwind: '$user'
},
{
$project: {
user: 1,
_id: 0,
representative: 1
}
}
]);
Unfortunately, you can't use data from one $lookup in the next one since the data from $lookup still hasn't been loaded at the moment query is executing, i.e. payment.representativeId will be null.
You can write resource intensive queries that would work around this, but the easiest and best (performance-wise) way to execute this query would be to have representativeId and userId stored on the Cheque collection.
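A related point that may be worth checking in the question's pipeline: a plain $unwind removes any document whose array is empty, so a cheque with a user but no representative is dropped at $unwind: '$representative' (and the other way round). A minimal sketch that keeps such documents, assuming the collection and field names from the question; the helper optionalLookup is hypothetical:

```javascript
// Builds a $lookup plus an $unwind that tolerates "no match".
// Collection and field names are taken from the question; adjust as needed.
function optionalLookup(from, localField, as) {
  return [
    { $lookup: { from, localField, foreignField: '_id', as } },
    // Keep the document even when the lookup returned an empty array.
    { $unwind: { path: `$${as}`, preserveNullAndEmptyArrays: true } },
  ];
}

const tailStages = [
  ...optionalLookup('representatives', 'payment.representativeId', 'representative'),
  ...optionalLookup('users', 'payment.userId', 'user'),
  { $project: { _id: 0, user: 1, representative: 1 } },
];
```

These stages would take the place of the last five stages of the original pipeline, after $unwind: '$payment'. Each result then carries whichever of user or representative was found.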
How to filter products by deep nested populated fields. catalogProduct is an ObjectId (ref to catalog product). category is an ObjectId inside catalogProduct (ref to categories). Categories is an array of category ids.
products = await StorageModel
.find({"catalogProduct.category": {$in: categories }})
.skip((page-1)*8)
.limit(8)
.populate({path: "catalogProduct", populate: {path: "category", select: "name"}})
.select('-__v')
.sort({_id: -1});
You'll need to do a $lookup on the catalogProduct collection so that you can access the catalogProduct data in the query.
Unfortunately that's only available when using Mongo Aggregation; however, aggregation is very powerful and is perfect for this sort of thing. You could do something like this:
const products = await StorageModel.aggregate([
{ $lookup: { // Replace the Catalog Product ID with the Catalog Product
from: "catalogProduct", // collection name is a guess; use your own
localField: "catalogProduct",
foreignField: "_id",
as: "catalogProduct"
} },
{ $lookup: { // Replace the Category ID with the Category
from: "categories",
localField: "catalogProduct.category",
foreignField: "_id",
as: "catalogProduct.category"
} },
{ $addFields: { // Replace the Category with its name
"catalogProduct.category": "$catalogProduct.category.name"
} },
{ $match: {
"catalogProduct.category": { $in: categories }
} },
{ $sort: { _id: -1 } },
{ $skip: (page - 1) * 8 },
{ $limit: 8 }
]);
Ideally you wouldn't do the $lookup until you've paginated the results (using $skip and $limit), but in this case it makes sense to do the $lookup first. Make sure you've got an index on catalogProduct._id and categories._id to optimize the query.
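One possible arrangement, sketched under assumptions: filter on the category ids right after the first lookup, paginate, and only then look up the category documents for the page that is returned. The collection names 'catalogproducts' and 'categories' and the helper name are guesses, not taken from the question. Note that Mongoose does not cast values inside aggregate(), so the ids passed in would need to be ObjectId instances already.

```javascript
// Hypothetical helper; collection names are assumptions.
function buildProductPipeline(categoryIds, page, pageSize = 8) {
  return [
    { $lookup: { from: 'catalogproducts', localField: 'catalogProduct',
                 foreignField: '_id', as: 'catalogProduct' } },
    // One catalog product per storage document, so flatten the array.
    { $unwind: '$catalogProduct' },
    // Filter while category still holds the id, before it is replaced.
    { $match: { 'catalogProduct.category': { $in: categoryIds } } },
    { $sort: { _id: -1 } },
    { $skip: (page - 1) * pageSize },
    { $limit: pageSize },
    // Resolve category documents only for the page being returned.
    { $lookup: { from: 'categories', localField: 'catalogProduct.category',
                 foreignField: '_id', as: 'catalogProduct.category' } },
  ];
}
```

Usage would be along the lines of StorageModel.aggregate(buildProductPipeline(categories, page)).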
For more info on $lookup, look at this article. For more info on Mongo Aggregation, look at this article.
I've two collections called user and subscription, every subscription has user_id which is _id of user collection. How can I join these two collections by where condition with is_account_active = 1.
Please check the below code which I'm using:
const users = await User.find({ is_account_active: 1 });
This will get me all users which have is_account_active flag as 1 but at the same time, I want subscription details also with respective user ids.
You can use the query below.
const users = await User.aggregate([
{
$match: {
your_condition
}
},
{
$lookup: {
from: 'subscriptions', // secondary db
localField: '_id',
foreignField: 'user_id',
as: 'subscription' // output to be stored
}
}
]);
But instead of joining on _id, it may be better to add a new
field such as user_id to the primary collection and populate it with an auto-incremented value, so that each new document gets a new unique id. You can then create an index on it for faster query execution.
You can use for example aggregate function.
If you store user_id as a string and you are on MongoDB 4.0 or later, you can convert _id to a string (because _id is an ObjectId):
const users = await User.aggregate([
{
$match: {
is_account_active: 1
}
},
{
$project: {
"_id": {
"$toString": "$_id"
}
}
},
{
$lookup: {
from: 'subscriptions', //collection name
localField: '_id',
foreignKey: 'user_id',
as: 'subscription'. //alias
}
}
]);
But it is a better idea to store user_id in Subscription schema as Object id
user_id: {
type: mongoose.Schema.Types.ObjectId,
ref:'User'
}
so then
const users = await User.aggregate([
{
$match: {
is_account_active: 1
}
},
{
$lookup: {
from: 'subscriptions', //collection name
localField: '_id',
foreignKey: 'user_id',
as: 'subscription'. //alias
}
}
]);
More about ObjectId
More about Aggregate function
I'm using MongoDB right now with Mathon's excellent answer. I don't have the reputation points to state this in the comments: I believe there is a stray period after the 'as' value, and the argument foreignKey should be foreignField. At least MongoDB 6.0.3 with Node.js raises an error otherwise. It works for me with those changes, as shown below.
const users = await User.aggregate([
{
$match: {
is_account_active: 1
}
},
{
$project: {
"_id": {
"$toString": "$_id"
}
}
},
{
$lookup: {
from: 'subscriptions', //collection name
localField: '_id',
foreignField: 'user_id',
as: 'subscription' //alias
}
}
]);
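One caveat with the $project stage above: projecting only _id drops every other user field from the output. A sketch that keeps the full user document by writing the string form of _id to a separate field instead; the field name idAsString is made up for this example:

```javascript
// Same join, but $addFields keeps the rest of the user document.
// `idAsString` is a hypothetical helper field, not part of the schema.
const usersWithSubscriptions = [
  { $match: { is_account_active: 1 } },
  { $addFields: { idAsString: { $toString: '$_id' } } },
  {
    $lookup: {
      from: 'subscriptions',      // collection name
      localField: 'idAsString',   // string form of the user's _id
      foreignField: 'user_id',    // stored as a string in subscriptions
      as: 'subscription',
    },
  },
];
```

It would be passed to User.aggregate(usersWithSubscriptions) in the same way as the pipeline above.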
In MongoDB, is it possible to update the value of a field using the value from another field? The equivalent SQL would be something like:
UPDATE Person SET Name = FirstName + ' ' + LastName
And the MongoDB pseudo-code would be:
db.person.update( {}, { $set : { name : firstName + ' ' + lastName } } );
The best way to do this is in version 4.2+, which allows using an aggregation pipeline in the update document with the updateOne, updateMany, or update (deprecated in most if not all language drivers) collection methods.
MongoDB 4.2+
Version 4.2 also introduced the $set pipeline stage operator, which is an alias for $addFields. I will use $set here as it maps with what we are trying to achieve.
db.collection.<update method>(
{},
[
{"$set": {"name": { "$concat": ["$firstName", " ", "$lastName"]}}}
]
)
Note that square brackets in the second argument to the method specify an aggregation pipeline instead of a plain update document because using a simple document will not work correctly.
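One detail to watch: $concat returns null when any of its arguments is null or refers to a missing field, so documents lacking firstName or lastName would end up with name: null. A hedged sketch that substitutes an empty string in that case; the helper name setFullName is hypothetical, and collection is assumed to be any object exposing updateMany (for example a Node.js driver collection):

```javascript
// Sketch only: `collection` is assumed to expose updateMany(filter, pipeline).
function setFullName(collection) {
  // Replace a null or missing field with '' so $concat does not yield null.
  const orEmpty = field => ({ $ifNull: [field, ''] });
  const pipeline = [
    { $set: { name: { $concat: [orEmpty('$firstName'), ' ', orEmpty('$lastName')] } } },
  ];
  return collection.updateMany({}, pipeline);
}
```

In Node.js this could be called as await setFullName(db.collection('person')).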
MongoDB 3.4+
In 3.4+, you can use $addFields and the $out aggregation pipeline operators.
db.collection.aggregate(
[
{ "$addFields": {
"name": { "$concat": [ "$firstName", " ", "$lastName" ] }
}},
{ "$out": <output collection name> }
]
)
Note that this does not update your collection but instead replaces the existing collection or creates a new one. Also, for update operations that require "typecasting", you will need client-side processing, and depending on the operation, you may need to use the find() method instead of the .aggregate() method.
MongoDB 3.2 and 3.0
The way we do this is by $projecting our documents and using the $concat string aggregation operator to return the concatenated string.
You then iterate the cursor and use the $set update operator to add the new field to your documents using bulk operations for maximum efficiency.
Aggregation query:
var cursor = db.collection.aggregate([
{ "$project": {
"name": { "$concat": [ "$firstName", " ", "$lastName" ] }
}}
])
MongoDB 3.2 or newer
You need to use the bulkWrite method.
var requests = [];
cursor.forEach(document => {
requests.push( {
'updateOne': {
'filter': { '_id': document._id },
'update': { '$set': { 'name': document.name } }
}
});
if (requests.length === 500) {
//Execute per 500 operations and re-init
db.collection.bulkWrite(requests);
requests = [];
}
});
if(requests.length > 0) {
db.collection.bulkWrite(requests);
}
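The batching above can also be factored into a small reusable helper. This is a sketch with hypothetical names (toUpdateOps, inBatches, bulkUpdateNames), written in the synchronous mongo shell style of this answer; with the Node.js driver, bulkWrite returns a promise and would need to be awaited.

```javascript
// Turn aggregated documents into updateOne operations.
function toUpdateOps(docs) {
  return docs.map(doc => ({
    updateOne: { filter: { _id: doc._id }, update: { $set: { name: doc.name } } },
  }));
}

// Yield consecutive slices of at most `size` items.
function* inBatches(items, size) {
  for (let i = 0; i < items.length; i += size) {
    yield items.slice(i, i + size);
  }
}

// Send the operations in batches; returns how many operations were sent.
function bulkUpdateNames(collection, docs, batchSize = 500) {
  let sent = 0;
  for (const batch of inBatches(toUpdateOps(docs), batchSize)) {
    collection.bulkWrite(batch, { ordered: false });
    sent += batch.length;
  }
  return sent;
}
```

Because the generator never yields an empty slice, no final "flush the remainder" check is needed.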
MongoDB 2.6 and 3.0
From this version, you need to use the now deprecated Bulk API and its associated methods.
var bulk = db.collection.initializeUnorderedBulkOp();
var count = 0;
cursor.snapshot().forEach(function(document) {
bulk.find({ '_id': document._id }).updateOne( {
'$set': { 'name': document.name }
});
count++;
if(count%500 === 0) {
// Execute per 500 operations and re-init
bulk.execute();
bulk = db.collection.initializeUnorderedBulkOp();
}
})
// clean up queues
if(count > 0) {
bulk.execute();
}
MongoDB 2.4
cursor["result"].forEach(function(document) {
db.collection.update(
{ "_id": document._id },
{ "$set": { "name": document.name } }
);
})
You should iterate through. For your specific case:
db.person.find().snapshot().forEach(
function (elem) {
db.person.update(
{
_id: elem._id
},
{
$set: {
name: elem.firstname + ' ' + elem.lastname
}
}
);
}
);
Apparently there is a way to do this efficiently since MongoDB 3.4, see styvane's answer.
Obsolete answer below
You cannot refer to the document itself in an update (yet). You'll need to iterate through the documents and update each document using a function. See this answer for an example, or this one for server-side eval().
For a database with high activity, you may run into issues where your updates affect actively changing records and for this reason I recommend using snapshot()
db.person.find().snapshot().forEach( function (hombre) {
hombre.name = hombre.firstName + ' ' + hombre.lastName;
db.person.save(hombre);
});
http://docs.mongodb.org/manual/reference/method/cursor.snapshot/
Starting Mongo 4.2, db.collection.update() can accept an aggregation pipeline, finally allowing the update/creation of a field based on another field:
// { firstName: "Hello", lastName: "World" }
db.collection.updateMany(
{},
[{ $set: { name: { $concat: [ "$firstName", " ", "$lastName" ] } } }]
)
// { "firstName" : "Hello", "lastName" : "World", "name" : "Hello World" }
The first part {} is the match query, filtering which documents to update (in our case all documents).
The second part [{ $set: { name: { ... } } }] is the update aggregation pipeline (note the square brackets signifying the use of an aggregation pipeline). $set is a new aggregation operator and an alias of $addFields.
Regarding this answer, the snapshot function is deprecated in version 3.6, according to this update. So, on version 3.6 and above, it is possible to perform the operation this way:
db.person.find().forEach(
function (elem) {
db.person.update(
{
_id: elem._id
},
{
$set: {
name: elem.firstname + ' ' + elem.lastname
}
}
);
}
);
I tried the above solution but I found it unsuitable for large amounts of data. I then discovered the stream feature:
MongoClient.connect("...", function(err, db){
var c = db.collection('yourCollection');
var s = c.find({/* your query */}).stream();
s.on('data', function(doc){
c.update({_id: doc._id}, {$set: {name : doc.firstName + ' ' + doc.lastName}}, function(err, result) { /* result == true? */ });
});
s.on('end', function(){
// stream can end before all your updates do if you have a lot
})
})
The update() method can take an aggregation pipeline as its second argument, like this:
db.collection_name.update(
{
// Query
},
[
// Aggregation pipeline
{ "$set": { "id": "$_id" } }
],
{
// Options
"multi": true // false when a single doc has to be updated
}
)
The field can be set or unset with existing values using the aggregation pipeline.
Note: use $ with field name to specify the field which has to be read.
Here's what we came up with for copying one field to another for ~150_000 records. It took about 6 minutes, but is still significantly less resource intensive than it would have been to instantiate and iterate over the same number of ruby objects.
js_query = %({
$or : [
{ 'settings.mobile_notifications' : { $exists : false } },
{ 'settings.mobile_admin_notifications' : { $exists : false } }
]
})
js_for_each = %(function(user) {
if (!user.settings.hasOwnProperty('mobile_notifications')) {
user.settings.mobile_notifications = user.settings.email_notifications;
}
if (!user.settings.hasOwnProperty('mobile_admin_notifications')) {
user.settings.mobile_admin_notifications = user.settings.email_admin_notifications;
}
db.users.save(user);
})
js = "db.users.find(#{js_query}).forEach(#{js_for_each});"
Mongoid::Sessions.default.command('$eval' => js)
With MongoDB version 4.2+, updates are more flexible because update, updateOne and updateMany accept an aggregation pipeline. You can now transform your documents using the aggregation operators and then update them without needing to explicitly state the $set command (instead we use $replaceRoot: {newRoot: "$$ROOT"}).
Here we use an aggregation pipeline to extract the timestamp from MongoDB's ObjectId "_id" field and update the documents. (I am not an expert in SQL, but I think SQL does not provide an auto-generated ID with a timestamp embedded in it; you would have to create that date yourself.)
var collection = "person"
agg_query = [
{
"$addFields" : {
"_last_updated" : {
"$toDate" : "$_id"
}
}
},
{
$replaceRoot: {
newRoot: "$$ROOT"
}
}
]
db.getCollection(collection).updateMany({}, agg_query, {upsert: true})
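For context on why $toDate works on "_id": the first four bytes of an ObjectId are a big-endian timestamp in seconds since the Unix epoch. The same value can be recovered client-side from the 24-character hex form. A minimal sketch; the function name objectIdToDate and the sample id are made up for illustration:

```javascript
// Recover the creation time embedded in an ObjectId's hex string.
function objectIdToDate(hexId) {
  if (!/^[0-9a-fA-F]{24}$/.test(hexId)) {
    throw new TypeError('expected a 24-character hex string');
  }
  // The first 8 hex digits (4 bytes) hold seconds since the Unix epoch.
  const seconds = parseInt(hexId.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const created = objectIdToDate('5f5e10000000000000000000');
console.log(created.toISOString());
```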
(I would have posted this as a comment, but couldn't)
For anyone who lands here trying to update one field using another in the document with the c# driver...
I could not figure out how to use any of the UpdateXXX methods and their associated overloads since they take an UpdateDefinition as an argument.
// we want to set Prop1 to Prop2
class Foo { public string Prop1 { get; set; } public string Prop2 { get; set;} }
void Test()
{
var update = new UpdateDefinitionBuilder<Foo>();
update.Set(x => x.Prop1, <new value; no way to get a hold of the object that I can find>)
}
As a workaround, I found that you can use the RunCommand method on an IMongoDatabase (https://docs.mongodb.com/manual/reference/command/update/#dbcmd.update).
var command = new BsonDocument
{
{ "update", "CollectionToUpdate" },
{ "updates", new BsonArray
{
new BsonDocument
{
// Any filter; here the check is if Prop1 does not exist
{ "q", new BsonDocument{ ["Prop1"] = new BsonDocument("$exists", false) }},
// set it to the value of Prop2
{ "u", new BsonArray { new BsonDocument { ["$set"] = new BsonDocument("Prop1", "$Prop2") }}},
{ "multi", true }
}
}
}
};
database.RunCommand<BsonDocument>(command);
MongoDB 4.2+ Golang
result, err := collection.UpdateMany(ctx, bson.M{},
mongo.Pipeline{
bson.D{{"$set",
bson.M{"name": bson.M{"$concat": []string{"$lastName", " ", "$firstName"}}},
}},
},
)
I'm trying to add a new field in all documents that contain the sum of an array of numbers.
Here is the Schema (removed irrelevant fields for brevity):
var PollSchema = new Schema(
{
votes: [Number]
}
);
I establish the model:
PollModel = mongoose.model('Poll', PollSchema);
And I use aggregation to create a new field that contains the sum of the votes array.
PollModel.aggregate([
{
$project: {
totalVotes: { $sum: "$votes"}
}
}
]);
When I startup my server, I get no errors; however, the totalVotes field hasn't been created. I used this documentation to help me. It similarly uses the $sum operator and I did it exactly like the documentation illustrates, but no results.
MongoDb aggregation doesn't save its result into database. You just get the result of aggregation inline within a callback.
So after aggregation you would need to do multi update to your database:
PollModel.aggregate([
{
$project: { totalVotes: { $sum: "$votes"} }
}]).exec( function(err, docs){
// bulk is used for updating all records within a single query
var bulk = PollModel.collection.initializeUnorderedBulkOp();
// add all update operations to bulk
docs.forEach(function(doc){
bulk.find({_id: doc._id}).update({$set: {totalVotes: doc.totalVotes}});
});
// execute all bulk operations
bulk.execute(function(err) {
});
});
Unfortunately this does not work as you think it does, firstly because "votes" is actually an array of values, and secondly because, in MongoDB versions before 3.2, $sum is an accumulator operator usable only in the $group pipeline stage.
So in order for you to get the total of the array as another property, first you must $unwind the array and then $group together on the document key to $sum the total of the elements:
PollModel.aggregate(
[
{ "$unwind": "$votes" },
{ "$group": {
"_id": "$_id",
"anotherField": { "$first": "$anotherField" },
"totalVotes": { "$sum": "$votes" }
}}
],
function(err,results) {
}
);
Also noting here another accumulator in $first would be necessary for each additional field you want in results as $group and $project only return the fields you ask for.
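To illustrate what $unwind followed by $group computes, here is an in-memory sketch in plain JavaScript (no database involved; the function name and sample data are hypothetical). It also shows a side effect worth knowing: a document whose votes array is empty produces no rows at the $unwind step, so it does not appear in the grouped output at all.

```javascript
// Emulates { $unwind: "$votes" } then { $group: { _id, totalVotes: { $sum } } }.
function unwindAndSum(polls) {
  const totals = new Map();
  for (const poll of polls) {
    // $unwind: one "row" per array element; an empty array yields none
    for (const vote of poll.votes) {
      // $group with $sum: accumulate per document _id
      totals.set(poll._id, (totals.get(poll._id) || 0) + vote);
    }
  }
  return [...totals].map(([_id, totalVotes]) => ({ _id, totalVotes }));
}

const samplePolls = [
  { _id: 'a', votes: [1, 2, 3] },
  { _id: 'b', votes: [] },
];
const grouped = unwindAndSum(samplePolls);
console.log(grouped);
```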
Generally though this is better to keep as a property within each document for performance reasons, as it's faster than using aggregate. So to do this just increment a total each time you $push to an array by also using $inc:
PollModel.update(
{ "_id": id },
{
"$push": { "votes": 5 },
"$inc": { "totalVotes": 5 }
},
function(err,numAffected) {
}
);
In that way the "totalVotes" field is always ready to use without the overhead of needing to deconstruct the array and sum the values for each document.
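A small sketch of how the paired operators might be wrapped so the two values cannot drift apart; the helper name voteUpdate is made up for this example:

```javascript
// Builds one update document that appends a vote and bumps the running total.
function voteUpdate(value) {
  if (!Number.isFinite(value)) {
    throw new TypeError('vote must be a finite number');
  }
  return {
    $push: { votes: value },     // append to the array
    $inc: { totalVotes: value }, // keep the stored total in step
  };
}
```

It could be used as PollModel.update({ _id: id }, voteUpdate(5), callback). Both modifications are applied in a single update of one document, so readers never see the array and the total out of sync.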
You don't have totalVotes in your schema. Just try the below code.
var PollSchema = new Schema(
{
votes: [Number],
totalVotes: Number
}
);
PollModel.aggregate([
{
$project: {
totalVotes: { $sum: "$votes"}
}
}
]);
or
resultData.toJSON();
@Blakes Seven and @Volodymyr Synytskyi helped me arrive at my solution! I also found this documentation particularly helpful.
PollModel.aggregate(
[
{ '$unwind': '$votes' },
{ '$group': {
'_id': '$_id',
'totalVotes': { '$sum': '$votes' }
}}
],
function(err,results) {
// console.log(results);
results.forEach(function(result){
var conditions = { _id: result._id },
update = { totalVotes: result.totalVotes },
options = { multi: true };
PollModel.update(conditions, update, options, callback);
function callback (err, numAffected) {
if(err) {
console.error(err);
return;
} else {
// console.log(numAffected);
}
}
});
}
);
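On MongoDB 4.2 or later, the same result can probably be reached in a single server-side statement, since updates accept an aggregation pipeline and $sum accepts an array expression outside $group (from 3.2 on). A hedged sketch: the helper name storeTotalVotes is hypothetical, and collection is assumed to be the underlying driver collection (for example PollModel.collection), which bypasses Mongoose schema handling.

```javascript
// Sketch only: `collection` is assumed to expose updateMany(filter, pipeline).
function storeTotalVotes(collection) {
  // Pipeline-style update: each document's totalVotes is computed
  // from its own votes array on the server.
  return collection.updateMany({}, [
    { $set: { totalVotes: { $sum: '$votes' } } },
  ]);
}
```

Usage would look like await storeTotalVotes(PollModel.collection). Unlike the $unwind approach, documents with an empty votes array are also updated, with a total of 0.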