What is the recommended way to drop indexes using Mongoose? - javascript

I need to create several deployment scripts like data migration and fixtures for a MongoDB database and I couldn't find enough information about how to drop indexes using Mongoose API. This is pretty straight-forward when using the official MongoDB API:
To delete all indexes on the specified collection:
db.collection.dropIndexes();
However, I would like to use Mongoose for this and I tried to use executeDbCommand adapted from this post, but with no success:
mongoose.connection.db.executeDbCommand({ dropIndexes: collectionName, index: '*' },
function(err, result) { /* ... */ });
Should I use the official MongoDB API for Node.js or I just missed something in this approach?

To do this via the Mongoose model for the collection, you can call dropAllIndexes of the native collection:
MyModel.collection.dropAllIndexes(function (err, results) {
// Handle errors
});
Update
dropAllIndexes is deprecated in the 2.x version of the native driver, so dropIndexes should be used instead:
MyModel.collection.dropIndexes(function (err, results) {
// Handle errors
});

If you want to maintain your indexes in your schema definitions with mongoose (you probably do if you're using mongoose), you can easily drop ones not in use anymore and create indexes that don't exist yet. You can just run a one off await YourModel.syncIndexes() on any models that you need to sync. It will create ones in the background with .ensureIndexes and drop any that no longer exist in your schema definition. You can look at the full docs here:
https://mongoosejs.com/docs/api.html#model_Model.syncIndexes

It looks like you're attempting to drop all of the indexes on a given collection.
According to the MongoDB Docs, this is the correct command.
... I tried to use executeDbCommand adapted from this post, but with no success:
To really help here, we need more details:
What failed? How did you measure "no success"?
Can you confirm 100% that the command ran? Did you output to the logs in the callback? Did you check the err variable?
Where are you creating indexes? Can you confirm that you're not re-creating them after dropping?
Have you tried the command while listing specific index names? Honestly, you should not be using "*". You should be deleting and creating very specific indexes.

This might not be the best place to post this, but I think its worth posting anyway.
I call model.syncIndexes() every time a model is defined/created against the db connection, this ensures the indexes are current and up-to-date with the schema, however as it has been highlighted online (example), this can create issues in distributed architectures, where multiple servers are attempting the same operation at the same time. This is particularly relevant if using something like the cluster library to spawn master/slave instances on multiple cores on the same machine, since they often boot up in close proximity to each other when the whole server is started.
In reference to the above 'codebarbarian' article, the issue is highlighted clearly when they state:
Mongoose does not call syncIndexes() for you, you're responsible for
calling syncIndexes() on your own. There are several reasons for this,
most notably that syncIndexes() doesn't do any sort of distributed
locking. If you have multiple servers that call syncIndexes() when
they start, you might get errors due to trying to drop an index that
no longer exists.
So What I do is create a function which uses redis and redis redlock to gain a lease for some nominal period of time to prevent multiple workers (and indeed multiple workers in multiple servers) from attempting the same sync operation at the same time.
It also bypasses the whole thing unless it is the 'master' that is trying to perform the operation, I don't see any real point in delegating this job to any of the workers.
const cluster = require('cluster');
const {logger} = require("$/src/logger");
const {
redlock,
LockError
} = require("$/src/services/redis");
const mongoose = require('mongoose');
// Check is mongoose model,
// ref: https://stackoverflow.com/a/56815793/1834057
const isMongoModel = (obj) => {
return obj.hasOwnProperty('schema') && obj.schema instanceof mongoose.Schema;
}
const syncIndexesWithRedlock = (model,duration=60000) => new Promise(resolve => {
// Ensure the cluster is master
if(!cluster.isMaster)
return resolve(false)
// Now attempt to gain redlock and sync indexes
try {
// Typecheck
if(!model || !isMongoModel(model))
throw new Error('model argument is required and must be a mongoose model');
if(isNaN(duration) || duration <= 0)
throw new Error('duration argument is required, and must be positive numeric')
// Extract name
let name = model.collection.collectionName;
// Define the redlock resource
let resource = `syncIndexes/${name}`;
// Coerce Duration to Integer
// Not sure if this is strictly required, but wtf.
// Will ensure the duration is at least 1ms, given that duration <= 0 throws error above
let redlockLeaseDuration = Math.ceil(duration);
// Attempt to gain lock and sync indexes
redlock.lock(resource,redlockLeaseDuration)
.then(() => {
// Sync Indexes
model.syncIndexes();
// Success
resolve(true);
})
.catch(err => {
// Report Lock Error
if(err instanceof LockError){
logger.error(`Redlock LockError -- ${err.message}`);
// Report Other Errors
}else{
logger.error(err.message);
}
// Fail, Either LockError error or some other error
return resolve(false);
})
// General Fail for whatever reason
}catch(err){
logger.error(err.message);
return resolve(false);
}
});
I wont go into setting up Redis connection, that is the subject of some other thread, but the point of this above code is to show how you can use syncIndexes() reliably and prevent issues with one thread dropping an index and another trying to drop the same index, or other distributed issues with attempting to modify indexes concurrently.

to drop a particular index you could use
db.users.dropIndex("your_index_name_here")

Related

Cant read data from collection in MongoDB Atlas Trigger

New to MongoDB, very new to Atlas. I'm trying to set up a trigger such that it reads all the data from a collection named Config. This is my attempt:
exports = function(changeEvent) {
const mongodb = context.services.get("Cluster0");
const db = mongodb.db("TestDB");
var collection = db.collection("Config");
config_docs = collection.find().toArray();
console.log(JSON.stringify(config_docs));
}
the function is part of an automatically created realm application called Triggers_RealmApp, which has Cluster0 as a named linked data source. When I go into Collections in Cluster0, TestDB.Config is one of the collections.
Some notes:
it's not throwing an error, but simply returning {}.
When I change context.services.get("Cluster0"); to something else, it throws an error
When I change "TestDB" to a db that doesnt exist, or "Config" to a collection which doesn't exist, I get the same output; {}
I've tried creating new Realm apps, manually creating services, creating new databases and new collections, etc. I keep bumping into the same issue.
The mongo docs reference promises and awaits, which I haven't seen in any examples (link). I tried experimenting with that a bit and got nowhere. From what I can tell, what I've already done is the typical way of doing it.
Images:
Collection:
Linked Data Source:
I ended up taking it up with MongoDB directly, .find() is asynchronous and I was handling it incorrectly. Here is the reply straight from the horses mouth:
As I understand it, you are not getting your expected results from the query you posted above. I know it can be confusing when you are just starting out with a new technology and can't get something to work!
The issue is that the collection.find() function is an asynchronous function. That means it sends out the request but does not wait for the reply before continuing. Instead, it returns a Promise, which is an object that describes the current status of the operation. Since a Promise really isn't an array, your statment collection.find().toArray() is returning an empty object. You write this empty object to the console.log and end your function, probably before the asynchronous call even returns with your data.
There are a couple of ways to deal with this. The first is to make your function an async function and use the await operator to tell your function to wait for the collection.find() function to return before continuing.
exports = async function(changeEvent) {
const mongodb = context.services.get("Cluster0");
const db = mongodb.db("TestDB");
var collection = db.collection("Config");
config_docs = await collection.find().toArray();
console.log(JSON.stringify(config_docs));
};
Notice the async keyword on the first line, and the await keyword on the second to last line.
The second method is to use the .then function to process the results when they return:
exports = function(changeEvent) {
const mongodb = context.services.get("Cluster0");
const db = mongodb.db("TestDB");
var collection = db.collection("Config");
collection.find().toArray().then(config_docs => {
console.log(JSON.stringify(config_docs));
});
};
The connection has to be a connection to the primary replica set and the user log in credentials are of a admin level user (needs to have a permission of cluster admin)

Mongoose Pre and Post hook: stopping a query from execution

I am working on versioning changes for an application. I am making use of the mongoose pre-hook to alter the queries before processing according to the versioning requirements, I came across a situation where I need to do a separate query to check whether the other document exists and if it is I don't have to execute the current query as shown below,
schema.pre('find', { document: false, query: true }, async function (next) {
const query = this.getQuery();
const doc = await model.find(query).exec();
if (!doc) {
const liveVersion = { ...query, version: "default" };
this.setQuery(liveVersion);
} else {
return doc;
}
});
In the above find pre-hook, I am trying to
check the required doc exists in the DB using the find query and return if does exist and
if the document does not exist, I am executing the query by setting the default version based query.
The problem here is mongoose will execute the set query no matter what and the result its returning is also the one which I got for the this.setQuery, not the other DB query result(doc).
Is there a way to stop the default query execution in mongoose pre-hook?
Any help will be appreciated.
The only way to stop the execution of the subsequent action would be to throw an error, so you can throw a specific error in else, with your data in the property of the error object, something like:
else {
let err = new Error();
err.message = "not_an_error";
err.data = doc;
}
but that would mean wrapping all your find calls with a try/catch, and in the catch deal with this specific error in the way of extracting your data, or throw for the main error checking if it's an actual error. In the end you'll be having a very ugly code and logic.
This is specifically for the way you ask it, but normally you can just define another method, like findWithCheck(), and do your checks of the pre hook above in this custom method.
Of course you could try also overriding the actual find(), but that would be overkill, and in this case it means pretty much breaking the whole thing more for test purposes rather than development.

Firebase transaction is not repeating on failure

UPDATE:
This is the console.error(err) message i get:
Error: 6 ALREADY_EXISTS: Document already exists: projects/projectOne-e3999/databases/(default)/documents/storage/AAAAAAAAAA1111111111
I do get why I get the error. Because a transaction running in parallel was faster executing the else Part of the transaction. So the transaction that is returning the error does not have to create but to update the document. It would be enough if the transaction woul re-run. But it does not altough it states Transactions are committed once 'updateFunction' resolves and attempted up to five times on failure.
And that is the whole point of the question. How can i prevent the transaction from failing when the document already exists, because it was created in parallel or how to re-run the transaction?
Original Question:
I have a cloud function that updates another document whenever a document in another collection gets created. As there can be multiple documents created in parallel which all access the same document this will run as transaction, idempotance is insured.
The structure of the function looks like the following. The problem is the very first write (so the creation of the storage doc). As i've said two documents can get created in parallel. But now the transaction reads a document which is not there and tries to create it, but as the other transaction (parallel running one) already created the document the transaction fails. (Which i do not understand in the first place, as I thought the transaction will block all access while executing)
This is not a problem at all, but it does not retry (although it states it will retry up to 5 times automatically). I think this has to do something with my try & catch block for the async function.
If it would retry it would detect that the document exists and automatically update the existing one.
Let's put it simple. This transaction must never fail. It would leave the database inconsistent without a possible automatic way to recover.
.onCreate(async (snapshot, context) => {
//Handle idempotence
const eventId = context.eventId;
try {
const process = await shouldProcess(eventId)
if (!process) {
return null
}
const storageDoc = admin.firestore().doc(`storage/${snapshot.id}`)
await admin.firestore().runTransaction(async t => {
const storageDbDoc = await storageDoc.get()
const dataDb = storageDbDoc.data()
if (dataDb) {
//For sake of testing just rewrite it
t.update(storageDoc, dataDb )
} else {
//Create new
const storage = createStorageDoc()
t.create(storageDoc, storage )
}
})
return markProcessed(eventId)
} catch (err) {
console.error(`Execution of ${eventId} failed: ${err}`)
throw new Error(`Transaction failed.`);
}
Your transaction insists on being able to create a new document:
t.create(storageDoc, storage)
Since create() fails on the condition of the document already existing, the entire transaction simply fails, without the ability to recover.
If your transaction must be able to recover, you should check if that document exists prior to trying to write it, and decide what you want to do in that case.

Firestore how to add to a sub collection

I would like to find a doc in a collection, and add items to a sub collection (which might not exist yet):
projects (collection)
project (doc)
cluster (collection) // might not exist
node1 (doc) // might not exist
statTypeA (collection) // might not exist
I was hoping for something like this:
// Know the doc:
db.ref(`projects/${projectId}/cluster/node1/${statType}`).add()
// Or filter and ref:
db.collection('projects').where(..).limit(1).ref(`cluster/node1/${statType}`).add()
I ended up solving it like this but it's ugly, verbose and slow as it has to come back with a number of read ops first. Am I doing this right?
const projectRefs = await db.collection('projects')
.where('licenseKey', '==', licenseKey)
.limit(1)
.get();
if (!projectRefs.docs) {
// handle 404
}
const projectRef = projectRefs.docs[0].ref;
const cluster = await projectRef.collection('cluster')
.doc('node1').get();
await cluster.ref.collection(statType).add({ something: 'hi' });
Edit:
The way I ended up handling this in a better way is a combination of flattening to other collections and also using arrays for stats. Feels much better:
// projects
{
projectId1
}
// instances (to-many-relationship) (filter based on projectId)
{
projectId
statTypeA: []
statTypeB: []
}
Your "nasty thing" is much closer to the way things work.
In your first attempt, you're trying to combine a query and a document creation in one operation. The SDK doesn't work like that at all. You are either reading or writing with any given bit of code, never both at once. You should do the query first, find the document, then use that to create more documents.
get() returns a promise that you need to use to wait on the results of the query. The results are not available immediately, as your code is currently assuming.
The documentation shows example code of how to handle the results of an asynchronous query. Since your code uses async/await, you can convert it as needed. Note that you have to iterate the QuerySnapshot obtained from the returned promise to see if a document is found.

In meteor, can pub/sub be used for arbitrary in-memory objects (not mongo collection)

I want to establish a two-way (bidirectional) communication within my meteor app. But I need to do it without using mongo collections.
So can pub/sub be used for arbitrary in-memory objects?
Is there a better, faster, or lower-level way? Performance is my top concern.
Thanks.
Yes, pub/sub can be used for arbitrary objects. Meteor’s docs even provide an example:
// server: publish the current size of a collection
Meteor.publish("counts-by-room", function (roomId) {
var self = this;
check(roomId, String);
var count = 0;
var initializing = true;
// observeChanges only returns after the initial `added` callbacks
// have run. Until then, we don't want to send a lot of
// `self.changed()` messages - hence tracking the
// `initializing` state.
var handle = Messages.find({roomId: roomId}).observeChanges({
added: function (id) {
count++;
if (!initializing)
self.changed("counts", roomId, {count: count});
},
removed: function (id) {
count--;
self.changed("counts", roomId, {count: count});
}
// don't care about changed
});
// Instead, we'll send one `self.added()` message right after
// observeChanges has returned, and mark the subscription as
// ready.
initializing = false;
self.added("counts", roomId, {count: count});
self.ready();
// Stop observing the cursor when client unsubs.
// Stopping a subscription automatically takes
// care of sending the client any removed messages.
self.onStop(function () {
handle.stop();
});
});
// client: declare collection to hold count object
Counts = new Mongo.Collection("counts");
// client: subscribe to the count for the current room
Tracker.autorun(function () {
Meteor.subscribe("counts-by-room", Session.get("roomId"));
});
// client: use the new collection
console.log("Current room has " +
Counts.findOne(Session.get("roomId")).count +
" messages.");
In this example, counts-by-room is publishing an arbitrary object created from data returned from Messages.find(), but you could just as easily get your source data elsewhere and publish it in the same way. You just need to provide the same added and removed callbacks like the example here.
You’ll notice that on the client there’s a collection called counts, but this is purely in-memory on the client; it’s not saved in MongoDB. I think this is necessary to use pub/sub.
If you want to avoid even an in-memory-only collection, you should look at Meteor.call. You could create a Meteor.method like getCountsByRoom(roomId) and call it from the client like Meteor.call('getCountsByRoom', 123) and the method will execute on the server and return its response. This is more the traditional Ajax way of doing things, and you lose all of Meteor’s reactivity.
Just to add another easy solution. You can pass connection: null to your Collection instantiation on your server. Even though this is not well-documented, but I heard from the meteor folks that this makes the collection in-memory.
Here's an example code posted by Emily Stark a year ago:
if (Meteor.isClient) {
Test = new Meteor.Collection("test");
Meteor.subscribe("testsub");
}
if (Meteor.isServer) {
Test = new Meteor.Collection("test", { connection: null });
Meteor.publish("testsub", function () {
return Test.find();
});
Test.insert({ foo: "bar" });
Test.insert({ foo: "baz" });
}
Edit
This should go under comment but I found it could be too long for it so I post as an answer. Or perhaps I misunderstood your question?
I wonder why you are against mongo. I somehow find it a good match with Meteor.
Anyway, everyone's use case can be different and your idea is doable but not with some serious hacks.
if you look at Meteor source code, you can find tools/run-mongo.js, it's where Meteor talks to mongo, you may tweak or implement your adaptor to work with your in-memory objects.
Another approach I can think of, will be to wrap your in-memory objects and write a database logic/layer to intercept existing mongo database communications (default port on 27017), you have to take care of all system environment variables like MONGO_URL etc. to make it work properly.
Final approach is wait until Meteor officially supports other databases like Redis.
Hope this helps.

Categories

Resources