Avoid mongodb bulk insert duplicate key error - javascript

How can I execute a bulk insert and continue in case of duplicate key error?
I have a collection with a unique index on the id field (not _id) and some data in it. Then I get more data, and I want to add only the documents that are not already present to the collection.
I have the following code:
let opts = {
continueOnError: true, // Neither
ContinueOnError: true, // of
keepGoing: true, // this
KeepGoing: true, // works
};
let bulk = collection.initializeUnorderedBulkOp( opts );
bulk.insert( d1 );
bulk.insert( d2 );
bulk.insert( d3 );
...
bulk.insert( dN );
let result = yield bulk.execute( opts ); // this keeps throwing duplicate key error
And I just want to ignore the errors and let the bulk finish with all the queued operations.
I searched the npm module API and the MongoDB API for Bulk and initializeUnorderedBulkOp, as well as the docs for Bulk write, with no luck.
Also in the docs for Unordered Operations they say:
Error Handling
If an error occurs during the processing of one of the write operations, MongoDB will continue to process remaining write operations in the list.
Which is not true (at least in my case)

You can use db.collection.insertMany() (new in version 3.2) with:
ordered: false
With ordered set to false, in case of a duplicate key error the insert operation will continue with any remaining documents.
Here is link to documentation:
https://docs.mongodb.com/v3.2/reference/method/db.collection.insertMany/
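For example, with the Node.js driver, a minimal sketch in the same yield style as the question (assuming collection and the d1..dN documents from above; the exact error shape varies between driver versions):

try {
  let result = yield collection.insertMany([ d1, d2, d3, /* ... */ dN ], { ordered: false });
  console.log( result.insertedCount + ' documents inserted' );
} catch (err) {
  if (err.code === 11000) {
    // duplicate key error(s): the non-duplicate documents were still inserted
    console.log('duplicates skipped, the rest were inserted');
  } else {
    throw err; // some other failure, rethrow it
  }
}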

Ordered Insert in MongoDB
db.hobbies.insertMany([{_id: "yoga", name: "Yoga"}, {_id: "cooking", name: "Cooking"}, {_id: "hiking", name: "Hiking"}], {ordered: true})
{ordered: true} is the default behaviour of insert statements
Unordered Insert in MongoDB
If you want MongoDB to keep trying to insert the remaining documents even after one or more of them fail for any reason, you must set ordered to false. See the example below:
db.hobbies.insertMany([{_id: "yoga", name: "Yoga"}, {_id: "cooking", name: "Cooking"}, {_id: "hiking", name: "Hiking"}], {ordered: false})
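Note that even with {ordered: false} the shell still raises a BulkWriteError once every document has been attempted; the non-duplicate documents are inserted despite the error. A sketch of swallowing it (error properties differ slightly between the legacy mongo shell and mongosh):

try {
  db.hobbies.insertMany([{_id: "yoga", name: "Yoga"}, {_id: "cooking", name: "Cooking"}], {ordered: false})
} catch (e) {
  // the duplicates failed, everything else went in
  print("some inserts failed: " + e.message)
}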

Related

Supabase - Upsert & multiple onConflict constraints

I cannot figure out how to proceed with an Upsert & "multiple" onConflict constraints. I want to push a batch of data into a Supabase table.
My data array would be structured as follows:
items = [
{ date: "2023-01-26", url: "https://wwww.hello.com"},
{ date: "2023-01-26", url: "https://wwww.goodbye.com"},
...]
I would like to use the Upsert method to push this new batch into my Supabase table, unless a row already exists. To check whether it already exists, I would like to use the date and the url as onConflict criteria, if I understood correctly.
When I run this method
const { error } = await supabase
.from('items')
.upsert(items, { onConflict: ['date','url'] })
.select();
I'm having the following error:
{
code: '42P10',
details: null,
hint: null,
message: 'there is no unique or exclusion constraint matching the ON CONFLICT specification'
}
What am I missing? Where am I wrong?
You can pass more than one column to onConflict by listing the columns in a single comma-separated string (instead of using an array):
const { data, error } = await supabase
.from('items')
.upsert(items, { onConflict: 'date, url'} )
Postgres performs unique index inference as mentioned in https://www.postgresql.org/docs/current/sql-insert.html#SQL-ON-CONFLICT
It is necessary to have a unique index or exclusion constraint covering those columns for this to work, as you can read in the documentation above:
INSERT into tables that lack unique indexes will not be blocked by
concurrent activity. Tables with unique indexes might block if
concurrent sessions perform actions that lock or modify rows matching
the unique index values being inserted; the details are covered in
Section 64.5. ON CONFLICT can be used to specify an alternative action
to raising a unique constraint or exclusion constraint violation
error.
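In other words, Supabase/Postgres only accepts onConflict: 'date, url' if a matching unique constraint (or unique index) already exists on those two columns. For example (assuming the table is named items; the constraint name is arbitrary), it could be created with:

alter table items add constraint items_date_url_key unique (date, url);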

How to filter an array of subdocuments by two fields in each subdocument

I am attempting to add a help request system which allows the requestor to make only one request for help on each topic from an expert. If the expert lists multiple topics which they can help, I want to limit each requestor to one help request per topic per expert.
I am using node.js and mongoose.js with a self-hosted mongodb instance
I have tried using the $and operator to find the ._id of the expert as long as they don't already have an existing request from the same requestor on the same topic. It works for the first update, but once the expert's document contains a subdocument with either the topic_id or the requestor_id, the filter excludes that expert and no document is returned.
// Schema
ExpertSchema = new mongoose.Schema({
expert_id: String,
helpRequests: [{
requestor_id: String,
topic_id: String
}]
});
//query
const query = {
$and:[
{expert_id: req.body.expert_id},
{'helpRequests.requestor_id': {$ne: req.body.requestor_id}},
{'helpRequests.topic_id': {$ne: req.body.topic_id}}
]
};
// desired update
const update = {
$push: {
helpRequests: {
requestor_id: req.body.requestor_id,
topic_id: req.body.topic_id
}
}
};
Expert.findOneAndUpdate(query, update, {new: true}, (err, expert) =>{
// handle return or error...
});
The reason you are not getting any expert is the condition inside your query.
Results are always returned based on the condition of your query; if the condition is satisfied, you get your result, as simple as that.
Your query
{'helpRequests.requestor_id': {$ne: req.body.requestor_id}},
{'helpRequests.topic_id': {$ne: req.body.topic_id}}
You will get your expert only if the requestor_id and the topic_id do not exist anywhere inside the helpRequests array; that is what you are querying for.
Solution
As per your schema, if helpRequests contains only requestor_id and topic_id, then you can achieve what you want with the query below.
Expert.findOneAndUpdate(
{
expert_id: req.body.expert_id,
}, {
$addToSet: {
helpRequests: {
requestor_id: req.body.requestor_id,
topic_id: req.body.topic_id
}
}
}, { new: true }); // Mongoose uses "new: true" rather than the shell's "returnNewDocument"
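Note that $addToSet only prevents an exact duplicate subdocument from being pushed; it does not make the query itself reject the expert. If you want the original behaviour of matching the expert only when no request exists for that exact requestor/topic pair, one common pattern (a sketch, untested against this schema) is $not with $elemMatch, which applies both conditions to the same array element:

const query = {
  expert_id: req.body.expert_id,
  helpRequests: {
    $not: {
      $elemMatch: {
        requestor_id: req.body.requestor_id,
        topic_id: req.body.topic_id
      }
    }
  }
};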

MongoDB Bulk Save Equivalent?

I am a mongodb noob and am running into some difficulty trying to create an equivalent to bulk save (as I can't find a bulk save operation) using the MongoDB bulk operations. Briefly, given an array of documents:
[{ _id:1, name:"a" ... }, { _id:1, name:"b" ... } ... ]
I want to bulk upsert the documents in the array, using the _id attribute as the comparison field to determine which incoming records are equivalent to records already in mongodb. In pseudo-code I want mongodb to bulk upsert as follows:
if(incomingDocument._id == existingDocument._id){
update(incoming) // overwrite existing document with entire incoming document
} else {
insert(incoming)
}
Ideally, I would like to pass mongo an array and a comparator rather than queuing up an individual bulk operation for each document.
How/can I do this with Bulk.find().upsert().update(<update>); or similar ?
(Alternately, is there an undocumented bulk save() operation?)
Thank you!
Bulk.find.upsert
With the upsert option set to true, if no matching documents exist for
the Bulk.find() condition, then the update or the replacement
operation performs an insert. If a matching document does exist, then
the update or replacement operation performs the specified update or
replacement.
But you will need to loop over your collection:
var bulk = db.items.initializeUnorderedBulkOp();
myDocuments.forEach(function(doc) {
bulk.find({_id: doc._id}).upsert().replaceOne(doc);
});
bulk.execute({w: 1, j: true}, function (err, result) {
if (result.isOk()) {
...
}
});
More or less; I am sorry that I am not able to test it at the moment. I am also not able to say how it will behave on large amounts of documents.
UPDATE
I modified the code, as suggested by Colin.
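On current drivers the same loop can be expressed in one call with collection.bulkWrite(); a minimal sketch in the same callback style, assuming the myDocuments array from above and that each document carries its _id:

var ops = myDocuments.map(function(doc) {
  return {
    replaceOne: {
      filter: { _id: doc._id },   // match on _id
      replacement: doc,           // overwrite the whole existing document
      upsert: true                // insert when no match exists
    }
  };
});
db.items.bulkWrite(ops, { ordered: false }, function (err, result) {
  // result.upsertedCount / result.modifiedCount tell you what happened
});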

Inconsistent mongo results with unique field

Not sure when this issue cropped up but I am not able to fetch items from mongo consistently. I have 4000+ items in the db. Here's the schema.
var Order = new Schema({
code: {
type: String,
unique: true
},
...
});
Now run some queries:
Order.find().exec(function(err, orders) {
console.log(orders.length); // always 101
})
Order.find().limit(100000).exec(function(err, orders) {
console.log(orders.length); // varies, sometimes 1150, 1790, 2046 - never more
})
Now if I remove the 'unique: true' from schema it will always return the total amount:
Order.find().exec(function(err, orders) {
console.log(orders.length); // always 4213 (correct total)
})
Any idea as to why this behavior occurs? AFAIK the codes are all unique (orders from a merchant). This was tested on Mongoose 3.8.6 and 3.8.8.
OK, the issue was indeed the unique index being missing/corrupted. I am guilty of adding the unique index later in the game, and I probably had some dups already, which prevented Mongo from creating the index.
I removed the duplicates and then in the mongo shell did this:
db.orders.ensureIndex({code: 1}, {unique: true, dropDups: true});
I would have thought the above would remove the dups, but it would just die because of them. I am sure there is a shell way to do this, but I just did it with some JS code, then ran the above to recreate the index, which can be verified with:
db.orders.getIndexes()
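For reference, a sketch of one way to locate the offending duplicates in the shell before recreating the index (assuming the orders collection from the question):

db.orders.aggregate([
  { $group: { _id: "$code", count: { $sum: 1 }, ids: { $push: "$_id" } } },
  { $match: { count: { $gt: 1 } } }
])

Each result lists the _ids sharing one code; keep one and remove the rest, then create the unique index again.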

Mongoose Not Creating Indexes

Trying to create MongoDB indexes. Using the Mongoose ODM, and in my schema definition below I have the username field set to a unique index. The collection and documents all get created properly; it's just the indexes that aren't working. All the documentation says that the ensureIndex command should be run at startup to create any indexes, but none are being made. I'm using MongoLab for hosting, if that matters. I have also repeatedly dropped the collection. What is wrong?
var schemaUser = new mongoose.Schema({
username: {type: String, index: { unique: true }, required: true},
hash: String,
created: {type: Date, default: Date.now}
}, { collection:'Users' });
var User = mongoose.model('Users', schemaUser);
var newUser = new User({username:'wintzer'});
newUser.save(function(err) {
if (err) console.log(err);
});
Hook the 'index' event on the model to see if any errors are occurring when asynchronously creating the index:
User.on('index', function(err) {
if (err) {
console.error('User index error: %s', err);
} else {
console.info('User indexing complete');
}
});
Also, enable Mongoose's debug logging by calling:
mongoose.set('debug', true);
The debug logging will show you the ensureIndex call it's making for you to create the index.
In my case I had to explicitly specify autoIndex: true:
const User = new mongoose.Schema({
name: String,
email: String,
password: String
},
{autoIndex: true}
)
Mongoose declares "if the index already exists on the db, it will not be replaced" (credit).
For example, if you had previously defined the index {unique: true} but you want to change it to {unique: true, sparse: true}, then unfortunately Mongoose simply won't do it, because an index already exists for that field in the DB.
In such situations, you can drop your existing index and then mongoose will create a new index from fresh:
$ mongo
> use MyDB
> db.myCollection.dropIndexes();
> exit
$ restart node app
Beware that this is a heavy operation so be cautious on production systems!
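On Mongoose 5.2 and later you can let Mongoose do this for you with Model.syncIndexes(), which drops indexes that are no longer declared in the schema and then builds the missing ones (the same production caveat applies); a sketch, assuming the User model from this thread:

// drops undeclared indexes, then creates the ones the schema defines
await User.syncIndexes();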
In a different situation, my indexes were not being created, so I used the error reporting technique recommended by JohnnyHK. When I did that I got the following response:
E11000 duplicate key error collection
This was because my new index was adding the constraint unique: true but there were existing documents in the collection which were not unique, so Mongo could not create the index.
In this situation, I either need to fix or remove the documents with duplicate fields, before trying again to create the index.
When I hooked the index event on the model that wasn't working, I started getting an error on the console that indicated "The field 'retryWrites' is not valid for an index specification." The only place in my app that referenced 'retryWrites' was at the end of my connection string. I removed this, restarted the app, and the index rebuild was successful. I put retryWrites back in place, restarted the app, and the errors were gone. My Users collection (which had been giving me problems) was empty, so when I used Postman to make a new record, I saw (with Mongo Compass Community) the new record created and the indexes now appeared. I don't know what retryWrites does - and today was the first day I used it - but it seemed to be at the root of my issues.
Oh, and why did I use it? It was tacked onto a connection string I pulled from Mongo's Atlas Cloud site. It looked important. Hmm.
This might solve your problem:
var schema = mongoose.Schema({
speed: Number,
watchDate: Number,
meterReading: Number,
status: Number,
openTrack: Boolean,
});
schema.index({ openTrack: 1 });
As you can see in the Mongoose documentation, https://mongoosejs.com/docs/guide.html#indexes, we need to call schema.index to create our indexes. Take a look at the code below to test it.
Note: after updating your schema, restart the server to check.
const schemaUser = new mongoose.Schema(
{
username: {
type: String,
required: true,
index: true,
unique: true,
dropDups: true,
},
hash: String,
created: {
type: Date,
default: Date.now,
},
},
{
autoCreate: true, // auto create collection
autoIndex: true, // auto create indexes
}
)
// define indexes to be create
schemaUser.index({ username: 1 })
const User = mongoose.model('Users', schemaUser)
const newUser = new User({ username: 'wintzer' })
newUser.save(function (err) {
if (err) console.log(err)
})
Connections that set "readPreference" to "secondary" or "secondaryPreferred" may not opt-in to the following connection options: autoCreate, autoIndex.
Check your readPreference option in the mongoose connection.
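So if your indexes silently fail to build, it is worth confirming that reads can go to the primary; a sketch (the connection string is a placeholder):

mongoose.connect('mongodb://localhost/MyDB', {
  readPreference: 'primary', // secondary/secondaryPreferred disables autoIndex
  autoIndex: true
});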
I had this problem when writing a data import command-line utility. After the script finished, some indexes were created and some were not.
The solution was to call await model.ensureIndexes() before terminating the script. It worked regardless of the autoIndex option value.
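A sketch of the shape of such a script (the uri, importData, and MyModel names are placeholders):

await mongoose.connect(uri);
await importData();            // whatever writes your documents
await MyModel.ensureIndexes(); // wait for index builds to finish
await mongoose.disconnect();   // only now is it safe to exit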
Check that the mongoose connect options do not specify:
autoIndex: false
