atomic 'read-modify-write' in javascript - javascript

I'm developing an online store app, and using Parse as the back-end. The count of each item in my store is limited. Here is a high-level description of what my processOrder function does:
1- find the items users want to buy from database
2- check whether the remaining count of each item is enough
3- if step 2 succeeds, update remaining count
4- check if remaining count becomes negative, if it is, revert remaining count to the old value
Ideally, the above steps should be executed exclusively. I learned that JavaScript is single-threaded and event-based, so here are my questions:
no way in Javascript to put the above steps in a critical section, right?
assume only 3 items are left, and two users try to order 2 of them respectively. The remaining count will end up as -1 for one of the users, so remaining count needs to be reverted to 1 in this case. Imagine another user tries to order 1 item when the remaining count is -1, he will fail although he should be allowed to order. How do I solve this problem?
Following is my code:
Parse.Cloud.define("processOrder", function(request, response) {
  Parse.Cloud.useMasterKey();
  var orderDetails = {'apple':2, 'pear':3};
  var query = new Parse.Query("Product");
  query.containedIn("name", ['apple', 'pear']);
  query.find().then(function(results) {
    // check if any dish is out of stock or not
    _.each(results, function(item) {
      var remaining = item.get("remaining");
      var required = orderDetails[item.get("name")];
      if (remaining < required)
        return Parse.Promise.error(name + " is out of stock");
    });
    return results;
  }).then(function(results) {
    // make sure the remaining count does not become negative
    var promises = [];
    _.each(results, function(item) {
      item.increment("remaining", -orderDetails[item.get("name")]);
      var single_promise = item.save().then(function(savedItem) {
        if (savedItem.get("remaining") < 0) {
          savedItem.increment("remaining", orderDetails[savedItem.get("name")]);
          return savedItem.save().then(function(revertedItem) {
            return Parse.Promise.error(savedItem.get("name") + " is out of stock");
          }, function(error){
            return Parse.Promise.error("Failed to revert order");
          });
        }
      }, function(error) {
        return Parse.Promise.error("Failed to update database");
      });
      promises.push(single_promise);
    });
    return Parse.Promise.when(promises);
  }).then(function() {
    // order placed successfully
    response.success();
  }, function(error) {
    response.error(error);
  });
});

no way in Javascript to put the above steps in a critical section, right?
See, here is the amazing part: in JavaScript everything runs in a critical section. There is no preemption, and multitasking is cooperative. Once your code has started running, there is simply no way any other code can run before yours completes.
That is, unless your code is done executing.
The problem is that you're doing IO, and IO in JavaScript yields back to the event loop before actually happening, kind of like in blocking code. So when you create and run a query, your code doesn't continue running right away; the rest of it runs later, which is what your callback/promise code is about.
Ideally, the above steps should be executed exclusively.
Sadly, that's not a JavaScript problem; it's a host environment problem, in this case Parse. You have to explicitly yield control to other code whenever you use their APIs (through callbacks and promises), so it is up to them to solve it.
Lucky for you, Parse has atomic counters. From the API docs:
To help with storing counter-type data, Parse provides methods that atomically increment (or decrement) any number field. So, the same update can be rewritten as:
gameScore.increment("score");
gameScore.save();
There are also atomic array operations which you can use here. Since you can do step 3 atomically, you can guarantee that the counter represents the actual inventory.
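To illustrate the difference, here is a minimal sketch that does not use the Parse SDK at all. `createStore` and `reserve` are hypothetical helpers, and the in-memory `increment` only stands in for a server-side atomic increment. It contrasts a lost update (two clients read before either writes) with the decrement-then-compensate pattern from the question.

```javascript
// In-memory stand-in for a backend with an atomic increment
// (hypothetical helper, not part of the Parse SDK).
function createStore(initial) {
  const counts = Object.assign({}, initial);
  return {
    read: function (name) { return counts[name]; },
    write: function (name, value) { counts[name] = value; },
    // Applies the delta in one step and returns the new value,
    // the way an atomic server-side increment would.
    increment: function (name, delta) {
      counts[name] += delta;
      return counts[name];
    }
  };
}

// Reserve `quantity` units; undo the decrement if stock went negative.
function reserve(store, name, quantity) {
  const remaining = store.increment(name, -quantity);
  if (remaining < 0) {
    store.increment(name, quantity); // compensate: give the units back
    return false;
  }
  return true;
}

// Lost update: both clients read 3 before either of them writes.
const racy = createStore({ apple: 3 });
const seenByA = racy.read("apple");
const seenByB = racy.read("apple");
racy.write("apple", seenByA - 2);
racy.write("apple", seenByB - 2);
const racyResult = racy.read("apple"); // the second write overwrote the first

// Atomic version: the second order is rejected and the count is restored.
const store = createStore({ apple: 3 });
const firstOrder = reserve(store, "apple", 2);
const secondOrder = reserve(store, "apple", 2);
const thirdOrder = reserve(store, "apple", 1);
```

Because this sketch runs synchronously, it never shows the short window in which the count is negative before the compensation lands. Against a real backend, an order arriving in that window could still be rejected spuriously, so the client may want to retry once.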

Related

Bulk Upsert Javascript stored procedure always exceeds execution cap of 5 seconds and results in a timeout

I'm currently running a script with the Python SDK which programmatically bulk upserts 1.5 million documents into a collection in Azure Cosmos DB. I've been using the bulk import sproc from the samples provided in the GitHub repo: https://github.com/Azure/azure-cosmosdb-js-server/tree/master/samples/stored-procedures, the only change being that I've swapped collection.createDocument with collection.upsertDocument. I'll include my sproc in full below.
The stored procedure does run successfully: it upserts documents consistently and relatively quickly, but only up to around 30% progress, at which point this error is thrown:
CosmosHttpResponseError: (RequestTimeout) Message: {"Errors":["The requested operation exceeded maximum alloted time. Learn more: https://aka.ms/cosmosdb-tsg-service-request-timeout"]}
ActivityId: 9f2357c6-918c-4b67-ba20-569034bfde6f, Request URI: /apps/4a997bdb-7123-485a-9808-f952db2b7e52/services/a7c137c6-96b8-4b53-a20c-b9577981b353/partitions/305a8287-11d1-43f8-be1f-983bd4c4a63d/replicas/132488328092882514p/, RequestStats:
RequestStartTime: 2020-11-03T23:43:59.9158203Z, RequestEndTime: 2020-11-03T23:44:05.3858559Z, Number of regions attempted:1
ResponseTime: 2020-11-03T23:44:05.3858559Z, StoreResult: StorePhysicalAddress: rntbd://cdb-ms-prod-centralus1-fd22.documents.azure.com:14354/apps/4a997bdb-7123-485a-9808-f952db2b7e52/services/a7c137c6-96b8-4b53-a20c-b9577981b353/partitions/305a8287-11d1-43f8-be1f-983bd4c4a63d/replicas/132488328092882514p/, LSN: -1, GlobalCommittedLsn: -1, PartitionKeyRangeId: , IsValid: False, StatusCode: 408, SubStatusCode: 0, RequestCharge: 0, ItemLSN: -1, SessionToken: , UsingLocalLSN: False, TransportException: null, ResourceType: StoredProcedure, OperationType: ExecuteJavaScript, SDK: Microsoft.Azure.Documents.Common/2.11.0
Is there a way to add some retry logic or to extend the timeout period for bulk upserts? I believe the section of code in the sproc below if (!isAccepted) getContext().getResponse().setBody(count); is supposed to help with this scenario but it doesn't seem to work in my case.
Bulk upsert stored procedure in Javascript:
function bulkUpsert(docs) {
  var collection = getContext().getCollection();
  var collectionLink = collection.getSelfLink();
  // The count of imported docs, also used as current doc index.
  var count = 0;
  // Validate input.
  if (!docs) throw new Error("The array is undefined or null.");
  var docsLength = docs.length;
  if (docsLength == 0) {
    getContext().getResponse().setBody(0);
    return;
  }
  // Call the CRUD API to create a document.
  tryCreate(docs[count], callback);
  // Note that there are 2 exit conditions:
  // 1) The upsertDocument request was not accepted.
  //    In this case the callback will not be called, we just call setBody and we are done.
  // 2) The callback was called docs.length times.
  //    In this case all documents were created and we don't need to call tryCreate anymore. Just call setBody and we are done.
  function tryCreate(doc, callback) {
    var isAccepted = collection.upsertDocument(collectionLink, doc, callback);
    // If the request was accepted, callback will be called.
    // Otherwise report current count back to the client,
    // which will call the script again with remaining set of docs.
    // This condition will happen when this stored procedure has been running too long
    // and is about to get cancelled by the server. This will allow the calling client
    // to resume this batch from the point we got to before isAccepted was set to false
    if (!isAccepted) {
      getContext().getResponse().setBody(count);
    }
  }
  // This is called when collection.upsertDocument is done and the document has been persisted.
  function callback(err, doc, options) {
    if (err) throw err;
    // One more document has been inserted, increment the count.
    count++;
    if (count >= docsLength) {
      // If we have created all documents, we are done. Just set the response.
      getContext().getResponse().setBody(count);
    } else {
      // Create next document.
      tryCreate(docs[count], callback);
    }
  }
}
I think that the problem may lie in the stored procedure rather than the Python script; if this isn't the case, though, I can provide my Python script. Any help on this would be massively appreciated, it's been a head scratcher for me for days now!
Extra Info:
Throughput = 10,000, partition upsert size ~ 1.9MB consistently.
If anyone else has this problem, the workaround I've used is to increase the throughput to 100,000 instead of 10,000 temporarily whilst the bulk upsert operation is underway. The error doesn't occur if you use that bulk upsert stored procedure in conjunction with a sufficiently high throughput. I think the timeout was happening frequently once the bulk upsert operation had upserted around 30% of the 1.5 million records, likely because the throughput wasn't divided sufficiently between partitions and it was causing a bottleneck. I may have to again assign a greater throughput to my container once it is used in practice or maybe I'll be able to reduce it to save costs. Either way the code to do this is quite simple with just the method below:
new_throughput = 10000; container.replace_throughput(new_throughput)
Stored procedures have a bounded execution time of 5 seconds. However you can write your stored procedure to handle bounded execution by checking a boolean return value and then use the count of items inserted in each invocation of the stored procedure to track and resume progress across batches. There is an example here.
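The resume pattern described above can be sketched on the client side as follows. This is only an illustration under stated assumptions: `makeFakeSproc` is a hypothetical stand-in for one execution of the stored procedure (a real client would call its SDK there), and `upsertAll` is a hypothetical helper name. The only behaviour taken from the sproc is that each call returns the count of documents it managed to process.

```javascript
// Stand-in for one execution of the stored procedure (hypothetical).
// It accepts at most `budget` documents per call and returns how many
// it processed, mirroring the sproc's setBody(count).
function makeFakeSproc(budget, sink) {
  return function executeBulkUpsert(docs) {
    const accepted = Math.min(budget, docs.length);
    for (let i = 0; i < accepted; i++) {
      sink.push(docs[i]);
    }
    return accepted;
  };
}

// Keep calling the procedure with the unprocessed tail until all docs are in.
function upsertAll(docs, executeBulkUpsert) {
  let done = 0;
  let calls = 0;
  while (done < docs.length) {
    const count = executeBulkUpsert(docs.slice(done));
    if (count === 0) {
      // Guard against looping forever if a call makes no progress.
      throw new Error("No progress; aborting to avoid an endless loop");
    }
    done += count;
    calls++;
  }
  return { done: done, calls: calls };
}

const stored = [];
const docs = Array.from({ length: 10 }, function (_, i) {
  return { id: String(i) };
});
const summary = upsertAll(docs, makeFakeSproc(4, stored));
```

Sending smaller batches per call is a related option: the less work each execution attempts, the less likely it is to run into the execution limit before reporting its count.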

Socketio Get number of clients in room

I would like to ask for your help. I'm having a hard time with this function. It's supposed to check if the room has 0 or 1 clients inside, and then gives information back about whether another client can join the room or not (with a max of 2 users per room).
I'm out of ideas about getting the number of clients in the room. I've checked the site and there were quite a few answers about this topic, working with earlier versions of socket.io. Now I've come to this function:
io.in(room).clients((err, clients) => {
  console.log(clients.length);
});
It works and logs the right number of clients inside the room, but I have no idea how I can return that value to the outer function.
The var user consists of a whole JSON and I've been wondering if there is a quicker way to return the length of the array without digging into JSON.
There's the outer function:
function isRoomFree(room) {
  var user = io.in(room).clients((err, clients) => {
    console.log(clients.length);
  });
  //console.log(user);
  if(user < 2)
    return true;
  else
    return false;
}
Is there any way to do that? I'm kinda new to JS, socket.io and Node.js.
Your function isRoomFree(room) is essentially synchronous, meaning that you call it and you wait for the result, however io.in(room).clients is asynchronous, meaning that you don't know when the result will arrive.
Mixing the 2 of them presents a challenge.
What you need to do is change your function to become async. I suggest you become familiar with the concept.
function isRoomFree(room, callback) {
  io.in(room).clients((err, clients) => {
    // clients is an array of socket ids, so compare its length
    if(clients.length < 2)
      callback(true);
    else
      callback(false);
  });
}
Use it like this:
isRoomFree(room, function(status) {
  if (status)
    console.log("free");
  else
    console.log("not free");
  //continue your program logic inside the callback
});
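The same idea can also be expressed with a promise instead of a callback parameter. The sketch below is self-contained and does not use socket.io: `getClients` and the `rooms` object are hypothetical stand-ins for io.in(room).clients(callback), which is assumed to call back with (err, clients) as in the question.

```javascript
// Hypothetical stand-in for io.in(room).clients(callback):
// calls back asynchronously with the socket ids in the room.
const rooms = { lobby: ["a"], full: ["a", "b"] };
function getClients(room, callback) {
  setTimeout(function () {
    callback(null, rooms[room] || []);
  }, 0);
}

// Wrap the callback API in a promise that resolves to true
// when fewer than 2 clients are in the room.
function isRoomFree(room) {
  return new Promise(function (resolve, reject) {
    getClients(room, function (err, clients) {
      if (err) {
        reject(err);
        return;
      }
      resolve(clients.length < 2);
    });
  });
}

isRoomFree("lobby").then(function (free) {
  console.log(free ? "free" : "not free");
});
```

The caller still has to wait for the result, either with .then as shown or with await inside an async function; the value cannot be returned synchronously.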

Chaining multiple firebase actions and rollbacks

I'm using firebase and I want to chain some actions. Here is the scenario:
I want to add an item to the array, and because I don't want to use push IDs I update a 'Last_Id' variable in firebase every time an item is added. I also update a 'Counter' variable to count the number of records (so I don't end up using numChildren() which can be slow).
The count and last_id variable are in the same tree like this:
Count:
---------->last_id
---------->Counter
I did this so that they can both be updated at the same time in a single transaction
So when I add an item I want 3 things to happen in order:
1- last_id is retrieved
2- Item is added
3- last_id and Counter are both updated
This is my code which makes use of promises.
add:function(ref,obj){
  //get last_id
  return baseRef.child('Count').child("Last_Id").once("value")
    .then(function(snapshot){
      return (snapshot.val()+1);
    })
    //add new data
    .then(function(key){
      return baseRef.child(ref).child(key).set(obj,function(error){
        if (error)
          console.log(error.code)
      })
    })
    //update Count and last key
    .then(this.updateCountAndKey(ref,1))
},
updateCountAndKey:function(ref,i){
  return baseRef.child('Count').transaction(function(currentValue) {
    if (currentValue!==null)
      return {
        Counter:(currentValue.Counter||0) +i,
        Last_Id:(currentValue.Last_Id||0)+1
      }
  },function(err,commited,snap) {
    if( commited )
      console.log("updated counter to "+ snap.val());
    else {
      console.log("oh no"+err);
    }
  },false)
}
Since I'm new to JavaScript, and promises in particular, I want to know if this is a robust way of doing things. I also want to know how to do rollbacks if something goes wrong, so that if one thing fails then everything else fails (e.g. if the update to Last_Id and Counter fails then the item is not added).
Any help is much appreciated.
As the Firebase documentation specifies, transactions can only "atomically modify the data at this location", hence you can't use transactions to update other nodes in Firebase.
It is recommended to use push IDs (generated by Firebase in a safe way). This will remove the need to use a transaction for this part of your process. You will still need to use a transaction if you need to maintain the count. This should be done on success of #2 (adding an item).
Now your process will look like this:
1- push an item (auto generated ID)
2- on success, use a transaction to increment the count
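A minimal sketch of the update function for step 2 might look like the following. The function name `incrementCount` is hypothetical, and the usage in the trailing comment assumes the `baseRef` from the question, a counter stored at Count/Counter, and a Firebase client version whose push() and transaction() return promises. One detail worth noting: the update function can be called with null when no value is cached locally yet, and returning undefined (as the question's version does for null) aborts the transaction.

```javascript
// Update function for a counter node. It may be called with null
// (no cached value yet), so null is treated as 0 instead of aborting.
function incrementCount(current) {
  return (current || 0) + 1;
}

// Intended use (assumption, not executed here):
//   baseRef.child(ref).push(obj).then(function () {
//     return baseRef.child("Count/Counter").transaction(incrementCount);
//   });
```

Keeping the update function pure makes it safe for the client to call it more than once, which can happen when the transaction is retried after a conflicting write.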

Parse Cloud Code Ending Prematurely?

I'm writing a job that I want to run every hour in the background on Parse. My database has two tables. The first contains a list of Questions, while the second lists all of the user\question agreement pairs (QuestionAgreements). Originally my plan was just to have the client count the QuestionAgreements itself, but I'm finding that this results in a lot of requests that really could be done away with, so I want this background job to run the count, and then update a field directly on Question with it.
Here's my attempt:
Parse.Cloud.job("updateQuestionAgreementCounts", function(request, status) {
  Parse.Cloud.useMasterKey();
  var query = new Parse.Query("Question");
  query.each(function(question) {
    var agreementQuery = new Parse.Query("QuestionAgreement");
    agreementQuery.equalTo("question", question);
    agreementQuery.count({
      success: function(count) {
        question.set("agreementCount", count);
        question.save(null, null);
      }
    });
  }).then(function() {
    status.success("Finished updating Question Agreement Counts.");
  }, function(error) {
    status.error("Failed to update Question Agreement Counts.")
  });
});
The problem is, this only seems to be running on a few of the Questions, and then it stops, appearing in the Job Status section of the Parse Dashboard as "succeeded". I suspect the problem is that it's returning prematurely. Here are my questions:
1 - How can I keep this from returning prematurely? (Assuming this is, in fact, my problem.)
2 - What is the best way of debugging cloud code? Since this isn't client side, I don't have any way to set breakpoints or anything, do I?
status.success is called before the asynchronous success calls of count are finished. To prevent this, you can use promises here. Check the docs for Parse.Query.each.
Iterates over each result of a query, calling a callback for each one. If the callback returns a promise, the iteration will not continue until that promise has been fulfilled.
So, you can chain the count promise:
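// Return the promise from the each() callback so the iteration waits for it.
return agreementQuery.count().then(function (count) {
  question.set("agreementCount", count);
  // Return the save promise too, so the chain only resolves after the save.
  return question.save();
});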
You can also use parallel promises to make it more efficient.
There are no breakpoints in cloud code, which makes Parse really hard to debug. The only way is to log your variables with console.log.
I was able to utilize promises, as suggested by knshn, to make it so that my code would complete before running success.
Parse.Cloud.job("updateQuestionAgreementCounts", function(request, status) {
  Parse.Cloud.useMasterKey();
  var promises = []; // Set up a list that will hold the promises being waited on.
  var query = new Parse.Query("Question");
  query.each(function(question) {
    var agreementQuery = new Parse.Query("QuestionAgreement");
    agreementQuery.equalTo("question", question);
    agreementQuery.equalTo("agreement", 1);
    // Make sure that the count finishes running first!
    promises.push(agreementQuery.count().then(function(count) {
      question.set("agreementCount", count);
      // Make sure that the object is actually saved first!
      promises.push(question.save(null, null));
    }));
  }).then(function() {
    // Before exiting, make sure all the promises have been fulfilled!
    Parse.Promise.when(promises).then(function() {
      status.success("Finished updating Question Agreement Counts.");
    });
  });
});

Self-triggered perpetually running Firebase process using NodeJS

I have a set of records that I would like to update sequentially in perpetuity. Basically:
1- Get least recently updated record
2- Update record
3- Set date of record to now (aka. send it to the back of the list)
4- Back to step 1
Here is what I was thinking using Firebase:
// update record function
var updateRecord = function() {
  // get least recently updated record
  firebaseOOO.limit(1).once('value', function(snapshot) {
    key = _.keys(snapshot.val())[0];
    /*
     * do 1-5 seconds of non-Firebase processing here
     */
    snapshot.ref().child(key).transaction(
      // update record
      function(data) {
        return updatedData;
      },
      // update priority after commit (would like to do it in transaction)
      function(error, committed, snap2) {
        snap2.ref().setPriority(snap2.dateUpdated);
      }
    );
  });
};
// listen whenever priority changes (aka. new item needs processing)
firebaseOOO.on('child_moved', function(snapshot) {
  updateRecord();
});
// kick off the whole thing
updateRecord();
Is this a reasonable thing to do?
In general, this type of daemon is precisely what was envisioned for use with the Firebase NodeJS client. So, the approach looks good.
However, in the on() call it looks like you're dropping the snapshot that's being passed in on the floor. This might be application specific to what you're doing, but it would be more efficient to consume that snapshot in relation to the once() that happens in the updateRecord().
