I would like to find a doc in a collection, and add items to a sub collection (which might not exist yet):
projects (collection)
project (doc)
cluster (collection) // might not exist
node1 (doc) // might not exist
statTypeA (collection) // might not exist
I was hoping for something like this:
// Know the doc:
db.ref(`projects/${projectId}/cluster/node1/${statType}`).add()
// Or filter and ref:
db.collection('projects').where(..).limit(1).ref(`cluster/node1/${statType}`).add()
I ended up solving it like this but it's ugly, verbose and slow as it has to come back with a number of read ops first. Am I doing this right?
const projectRefs = await db.collection('projects')
.where('licenseKey', '==', licenseKey)
.limit(1)
.get();
if (projectRefs.empty) {
// handle 404
}
const projectRef = projectRefs.docs[0].ref;
const cluster = await projectRef.collection('cluster')
.doc('node1').get();
await cluster.ref.collection(statType).add({ something: 'hi' });
Edit:
The way I ended up handling this in a better way is a combination of flattening to other collections and also using arrays for stats. Feels much better:
// projects
{
projectId1
}
// instances (to-many-relationship) (filter based on projectId)
{
projectId
statTypeA: []
statTypeB: []
}
Your "nasty thing" is much closer to the way things work.
In your first attempt, you're trying to combine a query and a document creation in one operation. The SDK doesn't work like that at all. You are either reading or writing with any given bit of code, never both at once. You should do the query first, find the document, then use that to create more documents.
get() returns a promise that you need to use to wait on the results of the query. The results are not available immediately, as your code is currently assuming.
The documentation shows example code of how to handle the results of an asynchronous query. Since your code uses async/await, you can convert it as needed. Note that you have to iterate the QuerySnapshot obtained from the returned promise to see if a document is found.
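As a rough illustration of the read-then-write flow described above, here is a minimal sketch. It assumes `db` is a Firestore instance (for example from firebase-admin); `addStat` is a hypothetical helper name, not part of the SDK. It tests the snapshot's `empty` flag and then writes straight to the nested path. Firestore does not require intermediate documents or collections to exist before writing beneath them, so no extra read of cluster/node1 should be needed.

```javascript
// Sketch only: `db` is assumed to be a Firestore instance and
// `addStat` is a hypothetical helper name.
async function addStat(db, licenseKey, statType, data) {
  // Read step: find the project document by its license key.
  const snapshot = await db.collection('projects')
    .where('licenseKey', '==', licenseKey)
    .limit(1)
    .get();

  // A QuerySnapshot always carries a `docs` array, so test `empty` instead.
  if (snapshot.empty) {
    return null; // the caller decides how to report "not found"
  }

  // Write step: build the nested path from the found document's reference.
  // cluster/node1 does not have to exist before writing beneath it.
  return snapshot.docs[0].ref
    .collection('cluster')
    .doc('node1')
    .collection(statType)
    .add(data);
}
```

This still costs one query, but it skips the second read of the cluster document from the question.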
Related
New to MongoDB, very new to Atlas. I'm trying to set up a trigger such that it reads all the data from a collection named Config. This is my attempt:
exports = function(changeEvent) {
const mongodb = context.services.get("Cluster0");
const db = mongodb.db("TestDB");
var collection = db.collection("Config");
config_docs = collection.find().toArray();
console.log(JSON.stringify(config_docs));
}
the function is part of an automatically created realm application called Triggers_RealmApp, which has Cluster0 as a named linked data source. When I go into Collections in Cluster0, TestDB.Config is one of the collections.
Some notes:
It's not throwing an error, but simply returning {}.
When I change context.services.get("Cluster0") to something else, it throws an error.
When I change "TestDB" to a db that doesn't exist, or "Config" to a collection which doesn't exist, I get the same output: {}.
I've tried creating new Realm apps, manually creating services, creating new databases and new collections, etc. I keep bumping into the same issue.
The Mongo docs reference promises and awaits, which I haven't seen in any examples. I tried experimenting with that a bit and got nowhere. From what I can tell, what I've already done is the typical way of doing it.
Images (not reproduced): Collection; Linked Data Source.
I ended up taking it up with MongoDB directly. .find() is asynchronous and I was handling it incorrectly. Here is the reply, straight from the horse's mouth:
As I understand it, you are not getting your expected results from the query you posted above. I know it can be confusing when you are just starting out with a new technology and can't get something to work!
The issue is that the collection.find() function is an asynchronous function. That means it sends out the request but does not wait for the reply before continuing. Instead, it returns a Promise, which is an object that describes the current status of the operation. Since a Promise really isn't an array, your statement collection.find().toArray() is returning an empty object. You write this empty object to the console.log and end your function, probably before the asynchronous call even returns with your data.
There are a couple of ways to deal with this. The first is to make your function an async function and use the await operator to tell your function to wait for the collection.find() function to return before continuing.
exports = async function(changeEvent) {
const mongodb = context.services.get("Cluster0");
const db = mongodb.db("TestDB");
var collection = db.collection("Config");
config_docs = await collection.find().toArray();
console.log(JSON.stringify(config_docs));
};
Notice the async keyword on the first line, and the await keyword on the second to last line.
The second method is to use the .then function to process the results when they return:
exports = function(changeEvent) {
const mongodb = context.services.get("Cluster0");
const db = mongodb.db("TestDB");
var collection = db.collection("Config");
collection.find().toArray().then(config_docs => {
console.log(JSON.stringify(config_docs));
});
};
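The {} output described in the reply can be reproduced outside Atlas. The sketch below uses `fakeToArray`, a hypothetical stand-in for collection.find().toArray() that resolves after a short delay, and compares serializing the pending Promise with serializing the awaited result.

```javascript
// Hypothetical stand-in for collection.find().toArray(): resolves asynchronously.
function fakeToArray() {
  return new Promise((resolve) => setTimeout(() => resolve([{ key: 'a' }]), 10));
}

async function demo() {
  const pending = fakeToArray();                   // a Promise, not an array
  const withoutAwait = JSON.stringify(pending);    // a Promise has no enumerable properties
  const withAwait = JSON.stringify(await pending); // the resolved array of documents
  return { withoutAwait, withAwait };
}

demo().then((result) => console.log(result));
// → { withoutAwait: '{}', withAwait: '[{"key":"a"}]' }
```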
The connection has to be to the primary of the replica set, and the login credentials must belong to an admin-level user (one that has the cluster admin permission).
I am working on versioning changes for an application. I use a Mongoose pre-hook to alter queries before they run, according to the versioning requirements. I came across a situation where I need to run a separate query to check whether another document exists, and if it does, I don't need to execute the current query, as shown below:
schema.pre('find', { document: false, query: true }, async function (next) {
const query = this.getQuery();
const doc = await model.find(query).exec();
if (!doc) {
const liveVersion = { ...query, version: "default" };
this.setQuery(liveVersion);
} else {
return doc;
}
});
In the above find pre-hook, I am trying to:
check whether the required doc exists in the DB using the find query, and return it if it does; and
if the document does not exist, execute the query after switching it to the default-version query.
The problem here is that Mongoose will execute the set query no matter what, and the result it returns is the one for the query set via this.setQuery, not the result of the other DB query (doc).
Is there a way to stop the default query execution in mongoose pre-hook?
Any help will be appreciated.
The only way to stop the execution of the subsequent action would be to throw an error, so you can throw a specific error in else, with your data in the property of the error object, something like:
else {
let err = new Error();
err.message = "not_an_error";
err.data = doc;
throw err;
}
But that would mean wrapping all your find calls in a try/catch and, in the catch, handling this specific error by extracting your data, or rethrowing if it's an actual error. In the end you'll have very ugly code and logic.
This is specifically for the way you ask it, but normally you can just define another method, like findWithCheck(), and do your checks of the pre hook above in this custom method.
Of course, you could also try overriding the actual find(), but that would be overkill, and in this case it pretty much means breaking the whole thing, which suits test purposes more than development.
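A custom method along the lines of findWithCheck() might look like the sketch below. It is written as a plain function that takes the model as an argument, so the same body could also be attached through schema.statics. The name `findWithCheck` and the "default" version fallback come from the discussion above; everything else is an assumption about how the schema is shaped.

```javascript
// Sketch: `findWithCheck` is a hypothetical helper; `model` is assumed to be
// a Mongoose model whose documents carry a `version` field.
async function findWithCheck(model, query) {
  // find() resolves to an array, so test its length rather than its truthiness.
  const docs = await model.find(query).exec();
  if (docs.length > 0) {
    return docs;
  }
  // Nothing matched: fall back to the "default" version of the same query.
  return model.find({ ...query, version: 'default' }).exec();
}
```

Because the check lives outside the pre-hook, the ordinary find() keeps its normal behavior and no sentinel error is needed.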
The Firestore API has me a little mixed up in trying to have a repeatable pattern for find-or-create style functions. I'd like the canonical version to look like this:
// returns a promise resolving to a DocumentSnapshot (I think??)
function findOrCreateMyObject(myObject) {
return findMyObject(myObject.identifier).then(documentSnapshot => {
return (documentSnapshot)? documentSnapshot : createMyObject(myObject);
});
};
I'm not sure if DocumentSnapshot is the appropriate return from this, but I figure the caller may want to inspect or update the result, like this:
return findOrCreateMyObject({ identifier:'foo' }).then(documentSnapshot => {
console.log(documentSnapshot.data().someProperty);
return documentSnapshot.ref.update({ someProperty:'bar' });
});
Assuming I am right about that (please tell me if not), it means that both the find and create functions must return a DocumentSnapshot. This is easy enough for the find...
function findMyObject(identifier) {
let query = db.collection('my-object-collection').where('identifier', '==', identifier);
return query.get().then(querySnapshot => {
return (querySnapshot.docs.length)? querySnapshot.docs[0] : null;
});
}
...but rather awkward for the create, and that is the gist of my problem. I'd want to write create like this....
function createMyObject(myObject) {
// get db from admin
let collectionRef = db.collection('my-object-collection');
return db.collection('my-object-collection').doc().set(myObject);
}
But I cannot because DocumentReference set() resolves to a "non-null promise containing void". Void? I must read back the object I just wrote in order to get a reference to it? In other words, my idealized create needs to be rewritten to be slower and clunkier...
function createMyObject(myObject) {
// get db from admin
let collectionRef = db.collection('my-object-collection');
return db.collection('my-object-collection').doc().set(myObject).then(() => {
// must we query the thing just written?
return findMyObject(myObject.identifier); // frowny face
});
}
This makes my generic create longer (and unnecessarily so when doing just a create). Please tell me:
is DocumentSnapshot the right "currency" for these functions to traffic in?
Am I stuck with a set() and then another query when creating the new object?
Thanks!
EDIT As an example of where I would apply this, say I have customers, uniquely identified by email, and they have a status: 'gold', 'silver' or 'bronze'. My CRM system decides that someone identifying himself as doug@stevenson.com deserves 'silver' status. We don't know at this point whether Mr. Stevenson is a current customer, so we do a find-or-create...
findOrCreateCustomer({ email:'doug@stevenson.com' }).then(customer => {
customer.update({ status:'silver' });
});
I wouldn't just create, because the customer might exist. I wouldn't just update, because the customer might not exist, or might not meet some other criterion for the update.
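One possible way around the second query, sketched under the assumption that `db` is a Firestore instance: doc() with no argument returns a DocumentReference with a generated ID, so the reference is already in hand before set() runs. Fetching through that reference is a single document read, not a query. The function and collection names follow the question; passing `db` as a parameter is a choice made here to keep the sketch self-contained.

```javascript
// Sketch only: `db` is assumed to be a Firestore instance.
const COLLECTION = 'my-object-collection';

function findMyObject(db, identifier) {
  return db.collection(COLLECTION)
    .where('identifier', '==', identifier)
    .limit(1)
    .get()
    .then((querySnapshot) => (querySnapshot.empty ? null : querySnapshot.docs[0]));
}

function createMyObject(db, myObject) {
  // The reference exists before the write, so nothing has to be looked up again.
  const ref = db.collection(COLLECTION).doc();
  // Read back through the reference to hand the caller a DocumentSnapshot.
  return ref.set(myObject).then(() => ref.get());
}

function findOrCreateMyObject(db, myObject) {
  return findMyObject(db, myObject.identifier)
    .then((snapshot) => snapshot || createMyObject(db, myObject));
}
```

If callers only ever update the result, returning `ref` from createMyObject would avoid the read entirely. Note that this find-then-create sequence is not atomic: two concurrent callers could both miss and both create, so a transaction or a deterministic document ID may be worth considering where uniqueness matters.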
I'm working on a Node.js module/utility which will allow me to scaffold some directories/files. Long story short, right now I have main function which looks something like this:
util.scaffold("FileName")
This "scaffold" method returns an EventEmitter instance, so, when using this method I can do something like this:
util.scaffold("Name")
.on("done", paths => console.log(paths))
In other words, when all the files are created, the event "done" will be emitted with all the paths of the scaffolded files.
Everything good so far.
Right now, I'm trying to do some tests and benchmarks with this method, and I'm trying to find a way to perform some operations (assertions, logs, etc) after this "scaffold" method has been called multiple times with a different "name" argument. For example:
const names = ["Name1", "Name2", "Name3"]
const emitters = names.map(name => {
return util.scaffold(name)
})
If I was returning a Promise instead of an EventEmitter, I know that I could do something like this:
Promise.all(promises).then(()=> {
//perform assertions, logs, etc
})
However, I'm not sure how can I do the equivalent using EventEmitters. In other words, I need to wait until all these emitters have emitted this same event (i.e. "done") and then perform another operation.
Any ideas/suggestions how to accomplish this?
Thanks in advance.
With Promise.all you have a single piece of information about when "everything" is done.
Of course, that is when all the Promises inside are fulfilled, or as soon as one of them is rejected.
If you have an EventEmitter, the information about when "everything" is done cannot be stored inside your EventEmitter logic, because it doesn't know where or how often the event is emitted.
So the first solution would be to manage an external "everything-done" state, and when this changes to true you perform the other operation.
So, like Promise.all, you have to wrap around it.
The second approach I could imagine is a factory that builds your EventEmitters and keeps track of the instances. This factory could then report whether all instances have fired. But this approach could fail on many levels: one instance -> many calls; one instance -> no call; ...
Just my 5 cents, and I would be happy to see another solution.
The simplest approach, as mentioned by others, is to return promises instead of EventEmitter instances. However, pursuant to your question, you can write your callback for the done event as follows:
const names = ['Name1', 'Name2', 'Name3']
let count = 0
names.forEach((name) => {
util.scaffold(name).on('done', (paths) => {
count += 1
if (count < names.length) {
// There is unfinished scaffolding
} else {
// All scaffolding complete
}
})
})
I ended up doing what @theGleep suggested and wrapping each of those emitters inside a Promise, like this:
const names = ["Name1", "Name2", "Name3"]
const promises = names.map(name => {
return new Promise((resolve) => {
util.scaffold(name).on("done", paths => {
resolve(paths)})
})
})
// and then
Promise.all(promises).then(result => {
// more operations
})
It seems to be doing what I need so far, so I'll just use this for now. Thanks everyone for your feedback :)
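For comparison, Node's events module exposes a once() helper that returns a Promise for the next occurrence of an event, which can replace the manual new Promise wrapper. The sketch below is one way this might look; `scaffold` here is a hypothetical stand-in for util.scaffold() that emits "done" asynchronously, and `scaffoldAll` is a name invented for the example.

```javascript
const { EventEmitter, once } = require('events');

// Hypothetical stand-in for util.scaffold(): emits "done" asynchronously.
function scaffold(name) {
  const emitter = new EventEmitter();
  setImmediate(() => emitter.emit('done', [`${name}/index.js`]));
  return emitter;
}

function scaffoldAll(names) {
  // once() resolves with the array of arguments passed to emit(),
  // so destructure the first argument to get the paths.
  return Promise.all(
    names.map((name) => once(scaffold(name), 'done').then(([paths]) => paths))
  );
}

scaffoldAll(['Name1', 'Name2', 'Name3']).then((allPaths) => console.log(allPaths));
```

once() also rejects if the emitter emits "error" while waiting, which the hand-written wrapper above does not handle. As with that wrapper, the listener has to be attached before the event fires, so this only works if the emitter emits "done" asynchronously.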
I need to create several deployment scripts like data migration and fixtures for a MongoDB database and I couldn't find enough information about how to drop indexes using Mongoose API. This is pretty straight-forward when using the official MongoDB API:
To delete all indexes on the specified collection:
db.collection.dropIndexes();
However, I would like to use Mongoose for this and I tried to use executeDbCommand adapted from this post, but with no success:
mongoose.connection.db.executeDbCommand({ dropIndexes: collectionName, index: '*' },
function(err, result) { /* ... */ });
Should I use the official MongoDB API for Node.js or I just missed something in this approach?
To do this via the Mongoose model for the collection, you can call dropAllIndexes of the native collection:
MyModel.collection.dropAllIndexes(function (err, results) {
// Handle errors
});
Update
dropAllIndexes is deprecated in the 2.x version of the native driver, so dropIndexes should be used instead:
MyModel.collection.dropIndexes(function (err, results) {
// Handle errors
});
If you want to maintain your indexes in your schema definitions with Mongoose (you probably do if you're using Mongoose), you can easily drop the ones no longer in use and create the ones that don't exist yet. Just run a one-off await YourModel.syncIndexes() on any models that you need to sync. It creates missing indexes in the background with .ensureIndexes and drops any that no longer exist in your schema definition. You can look at the full docs here:
https://mongoosejs.com/docs/api.html#model_Model.syncIndexes
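A one-off sync across several models could be sketched as follows. `syncAllIndexes` is a hypothetical helper, and `models` is assumed to be an array of Mongoose models (for example Object.values(mongoose.models)). It runs the syncs one at a time and keeps whatever each syncIndexes() call resolves to, keyed by model name.

```javascript
// Sketch: `syncAllIndexes` is a hypothetical helper; `models` is assumed to be
// an array of Mongoose models.
async function syncAllIndexes(models) {
  const results = {};
  for (const model of models) {
    // Run sequentially so index builds on different collections do not overlap.
    results[model.modelName] = await model.syncIndexes();
  }
  return results;
}
```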
It looks like you're attempting to drop all of the indexes on a given collection.
According to the MongoDB Docs, this is the correct command.
... I tried to use executeDbCommand adapted from this post, but with no success:
To really help here, we need more details:
What failed? How did you measure "no success"?
Can you confirm 100% that the command ran? Did you output to the logs in the callback? Did you check the err variable?
Where are you creating indexes? Can you confirm that you're not re-creating them after dropping?
Have you tried the command while listing specific index names? Honestly, you should not be using "*". You should be deleting and creating very specific indexes.
This might not be the best place to post this, but I think it's worth posting anyway.
I call model.syncIndexes() every time a model is defined/created against the db connection. This ensures the indexes are current and up to date with the schema. However, as has been highlighted online (example), this can create issues in distributed architectures, where multiple servers attempt the same operation at the same time. This is particularly relevant when using something like the cluster library to spawn master/slave instances on multiple cores of the same machine, since they often boot up in close proximity to each other when the whole server is started.
In reference to the above 'codebarbarian' article, the issue is highlighted clearly when they state:
Mongoose does not call syncIndexes() for you, you're responsible for
calling syncIndexes() on your own. There are several reasons for this,
most notably that syncIndexes() doesn't do any sort of distributed
locking. If you have multiple servers that call syncIndexes() when
they start, you might get errors due to trying to drop an index that
no longer exists.
So what I do is create a function that uses Redis and Redlock to gain a lease for some nominal period of time, preventing multiple workers (and indeed multiple workers on multiple servers) from attempting the same sync operation at the same time.
It also bypasses the whole thing unless it is the 'master' that is trying to perform the operation; I don't see any real point in delegating this job to any of the workers.
const cluster = require('cluster');
const {logger} = require("$/src/logger");
const {
redlock,
LockError
} = require("$/src/services/redis");
const mongoose = require('mongoose');
// Check is mongoose model,
// ref: https://stackoverflow.com/a/56815793/1834057
const isMongoModel = (obj) => {
return obj.hasOwnProperty('schema') && obj.schema instanceof mongoose.Schema;
}
const syncIndexesWithRedlock = (model,duration=60000) => new Promise(resolve => {
// Ensure the cluster is master
if(!cluster.isMaster)
return resolve(false)
// Now attempt to gain redlock and sync indexes
try {
// Typecheck
if(!model || !isMongoModel(model))
throw new Error('model argument is required and must be a mongoose model');
if(isNaN(duration) || duration <= 0)
throw new Error('duration argument is required, and must be positive numeric')
// Extract name
let name = model.collection.collectionName;
// Define the redlock resource
let resource = `syncIndexes/${name}`;
// Coerce Duration to Integer
// Not sure if this is strictly required, but wtf.
// Will ensure the duration is at least 1ms, given that duration <= 0 throws error above
let redlockLeaseDuration = Math.ceil(duration);
// Attempt to gain lock and sync indexes
redlock.lock(resource,redlockLeaseDuration)
.then(() => {
// Sync Indexes, and wait for completion so failures reach the catch below
return model.syncIndexes();
})
.then(() => {
// Success
resolve(true);
})
.catch(err => {
// Report Lock Error
if(err instanceof LockError){
logger.error(`Redlock LockError -- ${err.message}`);
// Report Other Errors
}else{
logger.error(err.message);
}
// Fail, Either LockError error or some other error
return resolve(false);
})
// General Fail for whatever reason
}catch(err){
logger.error(err.message);
return resolve(false);
}
});
I won't go into setting up the Redis connection, as that is the subject of some other thread, but the point of the above code is to show how you can use syncIndexes() reliably and prevent issues with one thread dropping an index while another tries to drop the same index, or other distributed issues with attempting to modify indexes concurrently.
To drop a particular index, you could use:
db.users.dropIndex("your_index_name_here")