PouchDB delete document upon successful replication to CouchDB - javascript

I am trying to implement a process as described below:
Create a sale_transaction document in the device.
Put the sale_transaction document in Pouch.
Since there's a live replication between Pouch & Couch, let the sale_transaction document flow to Couch.
Upon successful replication of the sale_transaction document to Couch, delete the document in Pouch.
Don't let the deleted sale_transaction document in Pouch flow back to Couch.
Currently, I have implemented a two-way sync from both databases, where I'm filtering each document that is coming from Couch to Pouch, and vice versa.
For the replication from Couch to Pouch, I didn't want to let sale_transaction documents go through, since I could just get these documents from Couch.
PouchDb.replicate(remoteDb, localDb, {
  // Replicate from Couch to Pouch
  live: true,
  retry: true,
  filter: (doc) => {
    return doc.doc_type !== "sale_transaction";
  }
})
For the replication from Pouch to Couch, on the other hand, I put in a filter so that deleted sale_transaction documents don't go through.
PouchDb.replicate(localDb, remoteDb, {
  // Replicate from Pouch to Couch
  live: true,
  retry: true,
  filter: (doc) => {
    if (doc.doc_type === "sale_transaction" && doc._deleted) {
      // These are deleted transactions which I don't want to replicate to Couch
      return false;
    }
    return true;
  }
}).on("change", (change) => {
  // Handle change
  replicateOutChangeHandler(change);
});
I also implemented a change handler to delete the sale_transaction documents in Pouch after they have been written to Couch.
function replicateOutChangeHandler(change) {
  for (let doc of change.docs) {
    if (doc.doc_type === "sale_transaction" && !doc._deleted) {
      localDb.upsert(doc._id, function(prevDoc) {
        if (!prevDoc._deleted) {
          prevDoc._deleted = true;
        }
        return prevDoc;
      }).then((res) => {
        console.log("Deleted Document After Replication", res);
      }).catch((err) => {
        console.error("Deleted Document After Replication (ERROR): ", err);
      });
    }
  }
}
The flow of the data seems to work at first. But when I fetch the sale_transaction document from Couch and edit it, I have to repeat the whole process: write the document to Pouch, let it flow to Couch, then delete it in Pouch. After a few rounds of editing the same document, though, the document in Couch ends up deleted as well.
I am fairly new to Pouch & Couch, and to NoSQL in general, and I'm wondering if I'm doing something wrong in the process.

For a situation like the one you've described above, I'd suggest tweaking your approach as follows:
Create a PouchDB database as a replication target from CouchDB, but treat this database as a read-only mirror of the CouchDB database, applying whatever transforms you need in order to strip certain document types from the local store. For the sake of this example, let's call this database mirror. The mirror database only gets updated one-way, from the canonical CouchDB database via transform replication.
Create a separate PouchDB database to store all your sales transactions. For the sake of this example, let's call this database user-data.
When the user creates a new sale transaction, this document is written to user-data. Listen for changes on user-data, and when a document is created, use the change handler to create and write the document directly to CouchDB.
At this point, CouchDB is receiving sales transactions from user-data, but your transform replication is preventing them from polluting mirror. You could leave it at that, in which case user-data will have local copies of all sales transactions. On logout, you can just delete the user-data database. Alternatively, you could add some more complex logic in the change handler to delete the document once CouchDB has received it.
If you really wanted to get fancy, you could do something even more elaborate. Leave the sales transactions in user-data after they are written to CouchDB, and in your transform replication from CouchDB to mirror, look for these newly-created sales transaction documents. Instead of removing them, just strip them of everything but their _id and _rev fields, and use these as 'receipts'. When one of these IDs matches an ID in user-data, that document can be safely deleted.
Whichever method you choose, I suggest you think about your local PouchDB's _changes feed as a worker queue, instead of putting all of this elaborate logic in replication filters. The methods above should all survive offline cases without introducing conflicts, and recover nicely when connectivity is restored. I'd recommend the last solution, though it might be a bit more work than the others. Hope this helps.
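For instance, here is a minimal sketch of the simpler variant (delete the local copy once CouchDB confirms the write), treating the user-data changes feed as the worker queue. The database names and the doc_type field are assumptions carried over from the question, not a definitive implementation:
// Sketch only. Assumes: userData = new PouchDB('user-data'),
//                       remoteDb = new PouchDB('https://couch.example.com/mydb')
userData.changes({
  live: true,
  since: 'now',
  include_docs: true
}).on('change', function (change) {
  var doc = change.doc;
  // Only push sale transactions that still exist locally.
  if (doc._deleted || doc.doc_type !== 'sale_transaction') return;

  // Strip the local revision; CouchDB will assign its own.
  var remoteCopy = Object.assign({}, doc);
  delete remoteCopy._rev;

  remoteDb.put(remoteCopy).then(function () {
    // Only remove the local copy once CouchDB has acknowledged the write.
    return userData.remove(doc);
  }).catch(function (err) {
    // Offline or conflict: leave the doc in user-data so it can be retried later.
    console.error('Failed to push sale_transaction', err);
  });
});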

Maybe add an additional field to mark the record for deletion, then have a periodic routine running on both Pouch and Couch that scans for records marked for deletion and deletes them.
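For example, a minimal sketch of such a periodic routine on the Pouch side, assuming a hypothetical pendingDelete flag on the documents (the flag name and the interval are assumptions, not from the question):
// Sketch only: periodically scan for documents flagged for deletion and delete them.
function purgeMarkedForDeletion(db) {
  return db.allDocs({ include_docs: true }).then(function (result) {
    var toDelete = result.rows
      .map(function (row) { return row.doc; })
      .filter(function (doc) { return doc && doc.pendingDelete === true; })
      .map(function (doc) {
        // Setting _deleted via bulkDocs removes them all in one round trip.
        return { _id: doc._id, _rev: doc._rev, _deleted: true };
      });
    return db.bulkDocs(toDelete);
  });
}

// e.g. run every five minutes against the local database
setInterval(function () {
  purgeMarkedForDeletion(localDb).catch(console.error);
}, 5 * 60 * 1000);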

Related

Performing Data Cleanup In Mongodb

My application tracks the movements of data throughout the system. When a movement is recorded it is placed in a separate collection that determines whether the document is enroute, available or out of service. I used $addToSet to place the _id, and $pullAll to make sure that when a doc is moved from enroute to available, it is not duplicated. But when the _id is moved to a new location entirely, I need to remove the old data from the old location and insert it into the new location. The insertion works but I cannot figure out how to properly remove the data from the old location. These are all done within Meteor methods and MongoDB.
if last.status is "Enroute"
  LastLocation.update locationId: last.locationId, partId: last.partId,
    $addToSet:
      enroutePurchaseIds: lastPurchaseId
    $pullAll:
      availiblePurchaseIds: lastPurchaseId
      outOfServicePurchaseIds: lastPurchaseId
Update
You can use the $merge stage from the upcoming 4.4 version, which allows updating the same collection the aggregation is running on. Pass in the old location and the new location as variables.
db.collection.aggregate([
  {"$match":{"location":{"$in":[oldLocation,newLocation]}}},
  {"$addFields":{"sortOrder":{"$indexOfArray":[[oldLocation,newLocation],"$location"]}}},
  {"$sort":{"sortOrder":1}},
  {"$group":{
    "_id":null,
    "oldLocationDoc":{"$first":"$$ROOT"},
    "newLocationDoc":{"$last":"$$ROOT"}
  }},
  {"$addFields":{
    "oldLocationDoc.old":{
      "$filter":{
        "input":"$oldLocationDoc.old",
        "cond":{"$ne":["$$this",oldLocation]}
      }
    },
    "newLocationDoc.new":{"$concatArrays":["$newLocationDoc.new",[newLocation]]}
  }},
  {"$project":{"locations":["$oldLocationDoc","$newLocationDoc"]}},
  {"$unwind":"$locations"},
  {"$replaceRoot":{"newRoot":"$locations"}},
  {"$merge":{
    "into":{"db":"db","coll":"collection"},
    "on":"_id",
    "whenMatched":"merge",
    "whenNotMatched":"fail"
  }}
])
Original
It is not possible to move an array/field value from one document to another document in a single update operation.
You would want to use transactions to perform multi-document updates in an atomic way. This requires a replica set.
var session = db.getMongo().startSession();
var collection = session.getDatabase('test').getCollection('collection');
session.startTransaction({readConcern: {level:'snapshot'},writeConcern: {w:'majority'}});
collection.update({location:oldLocation},{$pull:{availiblePurchaseIds:lastPurchaseId}});
collection.update({location:newLocation},{$push:{enroutePurchaseIds:lastPurchaseId}});
session.commitTransaction()
session.endSession()
Another option would be to perform bulk updates, in the case of a standalone mongod instance.
var bulk = db.getCollection('collection').initializeUnorderedBulkOp();
bulk.find({location:oldLocation}).updateOne({$pull:{availiblePurchaseIds:lastPurchaseId}});
bulk.find({location:newLocation}).updateOne({$push:{enroutePurchaseIds:lastPurchaseId}});
bulk.execute();
Are you moving the entire document from one collection to another, or just moving the document's id? I can't help much with CoffeeScript, but if you're looking to move entire documents you might find the following thread helpful.
mongodb move documents from one collection to another collection

Redux - Remove object from store on http response error

Consider the following flow:
I have a page with a list of "products" and a modal to create a single "product". I open the modal, fill the form and submit the form.
At this point, I dispatch an action CREATING_PRODUCT, add the product to the store and send the http request to the server.
I close the modal and display the list of results with the new product.
Let's suppose I receive an error response from the server.
Desired behavior:
I would like to display an error, remove the project from the list, re-open the modal and display the form already filled.
Question
How can I find that project and remove it from the list? I don't have an id (or a combination of unique properties) to find that project in the store. I don't see a clean way to link a request/response to that "product" object in the store.
Possible solution
The client adds a "requestId" into the project before adding it to the store. On response error, I dispatch a generic "CREATE_ERROR" and I remove the project with that requestId from the store.
Extra
Same problem with edit and delete. For example during a delete should I keep a reference to the deleted project with the requestId in the store, until the http request is successful?
I bet it is a problem with a common solution, but I can't find examples.
Thanks!
In general, your Redux store should be modeled somewhat like a relational database, in that every time you have a list of data models, each element of the list should have its own identifier. This helps a lot when dealing with more complex data schemes.
You should probably store your projects as an object, something like:
{
  // ...other store properties
  projects: {
    "_0": { /* ...project properties */ },
    "_1": { /* ...project properties */ },
    // ...more projects...
  },
}
This way, whenever you need to mess with an existing project, you can just reference its id and use projects[id] to access that project. This would also solve the edit and delete cases, as you could just pass the IDs around as handles.
I very much like this short piece on why your Redux store should be mostly flat and why data should always have identifiers. It also talks about using selectors to "hide" your IDs away, which may or may not be useful for you.
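As a quick illustration of the selector idea (the getProjectList name is mine, just for this sketch), components never need to know how projects are keyed:
// Sketch: a selector that hides the ID-keyed shape from components.
const getProjectList = (state) =>
  Object.keys(state.projects).map((id) => ({ id, ...state.projects[id] }));

// Usage, e.g. in mapStateToProps:
// const mapStateToProps = (state) => ({ projects: getProjectList(state) });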
In your case, as you are getting IDs from a server, you could have an ID prefix which indicates unsaved values. So your projects object would become something like:
projects: {
  "_0": { /* ... */ },
  "_1": { /* ... */ },
  "UNSAVED_2": { /* ... */ }
}
This way, you could easily identify unsaved values and handle them when an error occurs, still get the benefits of generating temp IDs on the client-side in order to revert changes on error, and also warn your user if they try to leave your app while their data still hasn't been synchronized - just check if there are any "UNSAVED" IDs :)
When you get a response from the server, you could then swap the "UNSAVED_"-prefixed ID for the actual ID.
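A minimal reducer sketch of that swap, assuming hypothetical CREATE_PRODUCT / CREATE_SUCCESS / CREATE_ERROR actions that carry the temporary ID (these action shapes are assumptions, not from the question):
// Sketch only; action shapes are assumed for illustration.
function projectsReducer(state = {}, action) {
  switch (action.type) {
    case "CREATE_PRODUCT": {
      // Optimistically add the entry under a temporary "UNSAVED_" ID.
      return { ...state, [action.tempId]: action.project };
    }
    case "CREATE_SUCCESS": {
      // Re-key the entry from the temporary ID to the server-assigned ID.
      const { [action.tempId]: saved, ...rest } = state;
      return { ...rest, [action.serverId]: saved };
    }
    case "CREATE_ERROR": {
      // Drop the optimistic entry so the form can be re-opened and re-submitted.
      const { [action.tempId]: removed, ...rest } = state;
      return rest;
    }
    default:
      return state;
  }
}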

Efficient DB design with PouchDB/CouchDB

So I have been reading a lot about how to store and fetch data in an efficient way. Basically my application is about time management/capturing for projects. I would be very happy about any opinion on which strategy I should use, or even suggestions for other strategies. The main concern is the limited amount of local storage available in the different browsers.
This is the main data I have to store:
db_projects: This is a database where the projects itself are stored.
db_timestamps: Here go the timestamps per project whenever a project is running.
I came up with the following strategies:
1: Storing the status of the project in the timestamps
When a project is started, a timestamp is added to db_timestamps like so:
db_timestamps.put({
  _id: String(Date.now()),
  title: projectID,
  status: status // could be: 1=active / 2=inactive / 3=paused
})...
This follows the strategy of only adding data to the db and never modifying any entries. The problem I see here is that if I want to get, for example, all active projects, I would need to query the whole db_timestamps database, which can contain thousands of entries. Since I cannot use the ID to search for all active projects, this could result in quite a heavy DB query.
2: Storing the status of the project in db_projects
Each time a project changes its status, there is an update to the project itself. So the "get all active projects" query would be much more resource-friendly, since there are a lot fewer projects than timestamps. But this would also mean that each time a status change happens, the project entry gets revisioned and therefore produces "a lot" of overhead. I'm also not sure if the compaction feature would do a good job, since not all revision data is deleted (the document bodies are, but the leaf revisions are not). This means that for a state change we keep at least the _rev information, which is still a string of 34 chars, for changing only the status (1 char). Or can I delete the leaf revisions after conflict resolution?
3: Storing the status in a separate DB like db_status
This leads to the same problem as in #2, since status changes lead to revisions on this DB. Or, if the states were added in "only add data" mode (like in #1), it would just quickly fill up with entries.
The general problem is that you have a limited amount of space that you can put into IndexedDB. On the other hand, the principle of CouchDB is that storage space is cheap (which is indeed true when you store on the server side only). Here is an interesting discussion about that.
So this is the solution that I use for now. I am using a mix between solution 1 and solution 2 from above with the following additions:
Storing only the timestamps in a synced database (db_timestamps), following the "only add data" principle.
Storing the projects and their states in a separate local (not synced) database (db_projects). I still use PouchDB for this, since it has a much simpler API than IndexedDB.
Storing the new/changed project status in each timestamp as well (so db_projects could be rebuilt out of db_timestamps if needed).
Deleting db_projects every so often and repopulating it, so the revision data (the overhead for this db in my case) is eliminated and the size stays acceptable.
I use the following code to rebuild my DB:
//--------------------------------------------------------------------
function rebuild_db_project(){
  db_project.allDocs({
    include_docs: true,
    //attachments: true
  }).then(function (result) {
    // do stuff
    console.log('I have read the DB and delete it now...');
    deleteDB('db_project', '_pouch_DB_Projekte');
    return result;
  }).then(function (result) {
    console.log('Creating the new DB...' + result);
    db_project = new PouchDB('DB_Projekte');
    var dbContentArray = [];
    for (var row in result.rows) {
      delete result.rows[row].doc._rev; // delete the revision of the doc, else it would raise an error on the bulkDocs() operation
      dbContentArray.push(result.rows[row].doc);
    }
    return db_project.bulkDocs(dbContentArray);
  }).then(function (response) {
    console.log('I have successfully populated the DB with: ' + JSON.stringify(response));
  }).catch(function (err) {
    console.log(err);
  });
}
//--------------------------------------------------------------------
function deleteDB(PouchDB_Name, IndexedDB_Name){
  console.log('DELETE');
  new PouchDB(PouchDB_Name).destroy().then(function () {
    // database destroyed
    console.log("pouchDB destroyed.");
  }).catch(function (err) {
    // error occurred
    console.error(err);
  });
  var DBDeleteRequest = window.indexedDB.deleteDatabase(IndexedDB_Name);
  DBDeleteRequest.onerror = function(event) {
    console.log("Error deleting database.");
  };
  DBDeleteRequest.onsuccess = function(event) {
    console.log("IndexedDB deleted successfully");
    console.log(DBDeleteRequest.result); // should be undefined
  };
}
So I not only use the pouchDB.destroy() command but also the indexedDB.deleteDatabase() command to get the storage freed nearly completely (there is still some 4kB that are not freed, but this is insignificant to me.)
The timing is not really proper, but it works for me. I'd be happy if someone has an idea to make the timing work properly (the problem for me is that indexedDB does not support promises).
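One way to get the timing under control, sketched on the assumption that deleteDB can be restructured to return a promise (an IDBRequest can be wrapped in one manually):
// Sketch: wrap indexedDB.deleteDatabase in a Promise so the rebuild can wait for it.
function deleteDBAsync(pouchDBName, indexedDBName) {
  return new PouchDB(pouchDBName).destroy().then(function () {
    return new Promise(function (resolve, reject) {
      var req = window.indexedDB.deleteDatabase(indexedDBName);
      req.onsuccess = function () { resolve(); };
      req.onerror = function (event) { reject(event.target.error); };
      req.onblocked = function () {
        // Another open connection is blocking the delete; close it first.
        console.warn('indexedDB delete blocked');
      };
    });
  });
}

// Then the new database is only created after the old one is really gone:
// deleteDBAsync('db_project', '_pouch_DB_Projekte').then(function () {
//   db_project = new PouchDB('DB_Projekte');
//   // ...repopulate with bulkDocs as above
// });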

MEAN / AngularJS app check if object already posted

I have this AngularJS frontend, and I use Express, Node and Mongo on the backend.
My situation looks like:
// my data to push to the server
$scope.things = [
  { title: "title", /* other properties */ },
  { title: "title", /* other properties */ },
  { title: "title", /* other properties */ }
];

$scope.update = function() {
  $scope.things.forEach(function(t) {
    Thing.create({
      title: t.title,
      // other values here
    }, function() {
      console.log('Thing added');
    });
  });
};
// where Thing.create is just an $http.post factory
The HTML part looks like:
//html part
<button ng-click="update()">Update Thing</button>
Then on the same page the user has the ability to change $scope.things, and my problem is that when I call update() again all the things are posted twice, because that's literally what I'm doing.
Can someone explain how to check whether a 'thing' has already been posted to the server, so I can just update its values ($http.put), and use $http.post if it hasn't been posted yet?
Or maybe there is another way to do this?
I see a few decisions to be made:
1) Should you send the request after the user clicks the "Update" button (like you're currently doing)? Or should you send the request when the user changes the Thing (using ngChange)?
2) If going with the button approach for (1), should you send a request for each Thing (like you're currently doing), or should you first check to see if the Thing has been updated/newly created on the front end.
3) How can you deal with the fact that some Thing's are newly created and others are simply updated? Multiple routes? If so, then how do you know which route to send the request to? Same route? How?
1
To me, the upside of using the "Update" button seems to be that it's clear to the user how it works. By clicking "Update" (and maybe seeing a flash message afterwards), the user knows (and gets visual feedback) that the Thing's have been updated.
The cost to using the "Update" button is that there might be unnecessary requests being made. Network communication is slow, so if you have a lot of Thing's, having a request being made for each Thing could be notably slow.
Ultimately, this seems to be a UX vs. speed decision to me. It depends on the situation and goals, but personally I'd lean towards the "Update" button.
2
The trade-off here seems to be between code simplicity and performance. The simpler solution would just be to make a request for each Thing regardless of whether it has been updated/newly created (for the Thing's that previously existed and haven't changed, no harm will be done - they simply won't get changed).
The more complex but more performant approach would be to keep track of whether or not a Thing has been updated/newly created. You could add a flag called dirty to Thing's to keep track of this.
When a user clicks to create a new Thing, the new Thing would be given a flag of dirty: true.
When you query to get all things from the database, they all should have dirty: false (whether or not you want to store the dirty property on the database or simply append it on the server/front end is up to you).
When a user changes an existing Thing, the dirty property would be set to true.
Then, using the dirty property you could only make requests for the Thing's that are dirty:
$scope.things.forEach(function(thing) {
  if (thing.dirty) {
    // make request
  }
});
The right solution depends on the specifics of your situation, but I tend to err on the side of code simplicity over performance.
3
If you're using Mongoose, the default behavior is to add an _id field to created documents (it's also the default behavior of MongoDB itself). So if you haven't overridden this default behavior, and if you aren't explicitly preventing this _id field from being sent back to the client, it should exist for Thing's that have been previously created, thus allowing you to distinguish them from newly created Thing's (because newly created Thing's won't have the _id field).
With this, you can conditionally call create or update like so:
$scope.things.forEach(function(thing) {
  if (thing._id) {
    Thing.update(thing._id, thing);
  } else {
    Thing.create(thing);
  }
});
Alternatively, you could use a single route that performs "create or update" for you. You can do this by setting { upsert: true } in your update call.
In general, upsert will check to see if a document matches the query criteria... if there's a match, it updates it, if not, it creates it. In your situation, you could probably use upsert in the context of Mongoose's findByIdAndUpdate like so:
Thing.findByIdAndUpdate(id, newThing, { upsert: true }, function(err, doc) {
...
});
See this SO post.
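For completeness, here is a rough sketch of what the corresponding server-side route might look like, assuming an Express app with a Mongoose Thing model and a client-supplied _id (the route path, model, and error handling are assumptions, not from the question):
// Sketch only: a single "create or update" endpoint using upsert.
var express = require('express');
var router = express.Router();
var Thing = require('../models/thing'); // hypothetical Mongoose model

router.put('/things/:id', function (req, res) {
  Thing.findByIdAndUpdate(
    req.params.id,
    req.body,
    { upsert: true, new: true }, // create the document if no match is found
    function (err, doc) {
      if (err) return res.status(500).send(err);
      res.json(doc); // return the saved document (with its _id) to the client
    }
  );
});

module.exports = router;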
@Adam Zemer neatly addressed concerns I raised in a comment; however, I disagree on some points.
Firstly, to answer the question of having an update button or not, you have to ask yourself: is there any reason why the user would want to discard his changes and not save the work he did? If the answer is no, then it is clear to me that the update button should not be there, and here is why.
To keep your user from losing his work, you would need to add confirmations if he attempts to change the page, close his browser, etc. On the other hand, if everything is continuously saved, he has the peace of mind that his work is always saved, and you don't have to implement anything to prevent him from losing it.
You reduce his workload: one less click for a task may seem insignificant, but he might click it many times just to be sure his work is saved. Also, if it's a recurrent task, it will definitely improve his experience.
Performance-wise and code-readability-wise, you make small requests and do not have to implement any complicated logic to do so. A simple ng-change on the inputs is enough.
To make it clear to him that his work is continuously saved, you can simply display "All your changes are saved" somewhere and change it to "Saving changes..." while a request is in flight. For example uses, look at Office Online or Google Docs.
Then all you would have to do is use the upsert option on your MongoDB query to be able to create and update your things with a single request. Here is how your controller would look:
$scope.update = function(changedThing) { // Using ng-change you send the thing itself as a parameter
  $scope.saving = true; // To display the "Saving changes..." message
  Thing.update({ // This service calls your method that updates with upsert
    title: changedThing.title,
    // other values here
  }).then( // If you made an http request, I suppose it returns a promise.
    function success() {
      $scope.saving = false;
      console.log('Thing added');
    },
    function error() {
      // handle errors
    });
};
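And as a sketch of the markup side (the field names and the saving flag carry over from the controller above; adjust to your model):
<!-- Each input saves continuously via ng-change; no update button needed. -->
<div ng-repeat="thing in things">
  <input type="text" ng-model="thing.title" ng-change="update(thing)">
</div>
<span ng-if="saving">Saving changes...</span>
<span ng-if="!saving">All your changes are saved</span>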

How to undo "Meteor.publish", and undo "new Meteor.Collection"

I see that when publishing, the collection._connection.publish_handlers is populated, and so is the collection._connection.method_handlers, and probably other areas.
I basically want to clean up the memory by removing the references to that collection and its publication entirely.
Basically each user of the app has a list of collections for that user. There is a publish function that looks like this for the user to get their list of collections:
Meteor.publish('users_collections', function() {
  var self = this;
  var handle = UsersCollections.find({ownerId: self.userId}).observeChanges({
    added: function(id, collectionInfo) {
      UsersCollectionManager.addUsersCollection(self.userId, collectionInfo.name);
    }
  });
});
That publishes that user's list of collections (and any user that connects gets their list).
Once the user gets their list, each of those collections is made reactive with new Meteor.Collection and then published.
UsersCollectionManager.addUsersCollection = function(userId, collectionName) {
  if (self.collections[userId].collections[collectionName] === undefined) {
    self.collections[userId].collections[collectionName] = new Meteor.Collection(collectionName);
    Meteor.publish(collectionName, function() {
      return self.collections[userId].collections[collectionName].find();
    });
  }
};
Once the user disconnects, I have a function that gets run. If that user doesn't have any connections open (e.g. they had multiple windows open and all of them are now closed), it starts a 30s timeout to clean up all these publish calls and new Meteor.Collection calls to save memory, since the other users of the app won't need this user's collections.
I'm not sure how to actually cleanup those from memory.
I don't see an "unpublish" or "Collection.stop" type of method in the Meteor API.
How would I perform cleanup?
You need to do this in two steps. First, stop and delete the publications, then delete the collection from Meteor.
The first step is fairly easy. It requires you to store the handles of each subscription:
var handles = [];
Meteor.publish('some data', function() {
  // Do stuff, send cursors, this.ready(), ...
  handles.push(this);
});
And later in time, stop them:
var handle;
while (handle = handles.shift()) {
  handle.stop();
}
All your publications are stopped. Now to delete the publication handlers. It's less standard:
delete Meteor.default_server.publish_handlers['some data'];
delete Meteor.server.publish_handlers['some data'];
You basically have to burn down the reference to the handler. That's why it's less standard.
For the collection you first need to remove all documents then you have to delete the reference. Fortunately deleting all documents is very easy:
someCollection.remove({}, {multi : true}); //kaboom
Deleting the reference to the collection is trickier. If it is a collection declared without a name (new Mongo.Collection(null);), then you can just delete the variable (delete someCollection), it is not stored anywhere else as far as I know.
If it is a named collection, then it exists in the Mongo database, which means you have to tell Mongo to drop it. I have no idea how to do that at the moment. Maybe the Mongo driver can do it, or it will need some takeover on the DDP connection.
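One possibility (an assumption on my part, not something I've verified in this setup) is to go through the underlying Node MongoDB driver, which Meteor exposes on the server via rawCollection(), and drop the collection there:
// Sketch only, server-side: drop a named collection via the raw Node driver.
// rawCollection() returns the driver's Collection object, which has drop().
someCollection.rawCollection().drop(function (err) {
  if (err) {
    // e.g. the collection may not exist in Mongo yet
    console.error('Failed to drop collection', err);
    return;
  }
  console.log('Collection dropped from Mongo');
});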
You can call subscriptionHandle.stop(); on the client. If the user has disconnected, the publications will have stopped anyway.
