I'm building a UI for a data importer using angularjs. The angular app is going to be crunching through the input data source (a spreadsheet, or whatever) and sending GETs/POSTs to an API to create/update records on the server and retrieve changes, etc.
If a user is importing thousands of records, I probably don't want to be opening up thousands of ajax calls at once (not that Angular would be able to get all of the requests sent out before the first finished). My thought was to add some sort of connection pool so that it could be throttled to just 10 or 50 or so ajax calls at once.
Does angular already have a built-in means of throttling ajax calls? I know I could build one without too much trouble, but I don't want to re-invent the wheel if there's already something slick out there. Can anyone recommend any tools/plugins for that? I know there are a few for jquery, but I'm hoping to avoid jquery, as much as possible, for this project.
In investigating this further, I found that, according to this question and this page, browsers automatically throttle http requests to the same server to somewhere between 6 and 13 concurrent requests, and they queue up additional requests until other requests complete. So, apparently, no action is needed, and I can just let my code fly.
I looked at angular-http-throttler (suggested by nathancahill), but (1) it appears to duplicate what browsers are already doing by themselves, and (2) it doesn't (currently) appear to have a mechanism for handling error cases, so if the server doesn't respond, or if it was a bad request, it doesn't decrement the request count, thus permanently clogging up its queue.
Letting the browser queue up too many requests at once can still cause memory/performance issues if dealing with very large amounts of data. I looked at various techniques for creating a fixed-length queue in javascript, where it can fire off a callback when a queue spot becomes available, but the non-blocking nature of javascript makes such a queue tricky & fragile... Hopefully, I won't need to go that direction (I'm not dealing with THAT much data), and the browser's throttling will be sufficient for me.
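For reference, in case I do end up needing it, a minimal sketch of the kind of fixed-size pool I had in mind might look like this (MAX_CONCURRENT and the usage line are placeholders; it assumes $http-style promise-returning calls):

// Minimal concurrency limiter: keep at most MAX_CONCURRENT requests in flight
// and start the next one as each request settles (success or failure).
var MAX_CONCURRENT = 10;   // placeholder limit
var active = 0;
var pending = [];          // functions that start a request and return a promise

function enqueue(startRequest) {
  pending.push(startRequest);
  drain();
}

function drain() {
  while (active < MAX_CONCURRENT && pending.length > 0) {
    var start = pending.shift();
    active++;
    start().finally(function () {   // decrement even on error, so the queue never clogs
      active--;
      drain();
    });
  }
}

// usage: enqueue(function () { return $http.post('/api/records', record); });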
Related
I was wondering if there is any good practice for handling long-running requests in Node.js?
I am using Express.js and have been wondering if there is any way to, I don't know, move one of the Express requests to a different instance of Node or something?
The reason I am wondering is that this request uses the CPU heavily, blocking the Node event loop while it runs. The main problem is that I first fetch some data, then calculate and insert the missing data into the DB, then fetch other data that depends on the result, and then calculate again and update other records in the DB.
I hadn't noticed this problem before, as I was the only one using this request so far. But now I am testing it with other users, and if a few of them hit this request, the whole application freezes for a few seconds. And on my monitoring tools I see huge CPU usage (100%-200%, lol).
So I have tried clustering, but it does not seem to work. I used pm2 and ran the app on all cores of my CPU. Because of the complexity of my algorithm, I tried to move several functions to worker threads, but it looks like these threads would be used very, very often, and I am afraid of crashing the whole Node instance.
I have no clue which solution would be best, and I don't know whether there are any good tools for Express that would help me offload these requests. I have not dived into partitioning yet, but it looks like it may work. Has anyone used it for larger projects and knows whether it could be the solution for spreading one request across several ticks?
What about a job queue like Bull or Kue? Would it be a good idea to push these tasks into a queue? I have never used such tools, and I'm asking because I have no idea whether this would make any sense.
Cheers
If you already have the data, you can send it to the user in the response to the incoming HTTP request.
If you don't find any data, you can send an empty response.
The calculating and inserting of the missing data should then be done through a queue service; bee-queue is a nice one.
Use Redis with bee-queue to make the background jobs faster.
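A rough sketch of how that could fit together, assuming a local Redis instance (the queue name, job payload, fetchExistingData, and calculateAndInsert are placeholders of mine, not anything from the question):

// Producer side: respond right away, then push the heavy work onto a queue.
const Queue = require('bee-queue');
const importQueue = new Queue('missing-data', {
  redis: { host: '127.0.0.1', port: 6379 }
});

app.get('/data', async (req, res) => {
  const existing = await fetchExistingData(req.query); // placeholder lookup
  res.json(existing || {});                            // empty response if nothing is found
  importQueue.createJob({ query: req.query }).save();  // heavy work happens later
});

// Worker side (ideally a separate process): the CPU-heavy part runs here.
importQueue.process(async (job) => {
  return calculateAndInsert(job.data.query);           // placeholder implementation
});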
You probably came here to chide me but this is a real use case.
In the world of online education, there are SCORM courses. I have to make old SCORM courses work on a site. SCORM courses are "web based" and run in a browser, but they expect to run in an iframe and they expect the parent to supply a GetValue and a SetValue method.
So these SCORM courses are doing things like parent.SetValue("score", "90") and moving on. That function is supposed to return "false" if there was any issue.
SCORM comes from the 90's, and in the modern web we know we have to do callbacks/promises and that HTTP fails "often". You might think the solution is a SetValue that writes to local data and then tries and retries until it gets through, but the SCORM course typically is set up to only move to the next screen if the SetValue worked, so you shouldn't be letting the user advance unless the SetValue actually was saved on the server.
TL;DR
Assuming a synchronous request is a requirement, what is the right way to do it?
So far I know of $.ajax({async:false ... but now browsers warn about that and sound like they're going to just ignore your request to be synchronous. I am thinking maybe using websockets or web workers or something is the right way to do a synchronous request in modern programming. But I don't know how to make a request like that. And I am not allowed to change the code of the SCORM courses (they are generated with various course-making tools).
To clarify, I have full control over the implementation of the SetValue function.
Will $.ajax({async:false ... work long term? (5-10 years)
NOTE: it is entirely acceptable in this use case to completely freeze the UI until the request either succeeds or fails. That's what the courses assume.
So far I know of $.ajax({async:false… but now browsers warn about that
This is the right way (if you're using jQuery), it sends a synchronous XMLHttpRequest. Just ignore the warning. It's a warning that you are using outdated technology, which you already know.
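For what it's worth, a bare-bones sketch of a SetValue built on a synchronous XMLHttpRequest might look like this (the /scorm/setvalue endpoint is just a placeholder; with jQuery you'd pass async: false to $.ajax instead):

// Synchronous SetValue: blocks until the server answers, as the SCORM course expects.
function SetValue(key, value) {
  var xhr = new XMLHttpRequest();
  xhr.open('POST', '/scorm/setvalue', false);    // third argument false = synchronous
  xhr.setRequestHeader('Content-Type', 'application/json');
  try {
    xhr.send(JSON.stringify({ key: key, value: value }));
  } catch (e) {
    return "false";                              // network failure
  }
  return xhr.status === 200 ? "true" : "false";  // SCORM expects string booleans
}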
and sound like they're going to just ignore your request to be synchronous.
That's unlikely.
I am thinking maybe using websockets or web workers or something is the right way to do a synchronous request in modern programming.
No, websockets and web workers are always asynchronous, you can't use them to make your asynchronous request look synchronous (in fact there's nothing that lets you do this).
Will $.ajax({async:false… work long term? (5-10 years)
We cannot know (and SO is not a crystal ball). It might, especially in older browsers, or it might not. Browser vendors are reluctant to break compatibility of features that run the web, and synchronous requests are still needed from time to time. At some point, too few (important) web pages will use it (<1%, <1‰, whatever threshold they decide on) and browsers will finally be confident enough to remove it. By that point, your business will hopefully have realised it needs to move on from these outdated course-making tools.
Based on my experience with learning management systems, the answer is: fake it.
You wrote:
it is entirely acceptable in this use case to completely freeze the UI until the request either succeeds or fails. That's what the courses assume.
Perhaps your courses assume this, but this is not the case in any learning management system I've used over the past decade.
From what I've seen, learning management systems don't use synchronous requests because they block other scripts, which gives the impression the page/course is locked up or broken. The workaround is to use async calls via an abstraction layer (which includes the SCORM API), and return 'true' to the course even if you have no way of verifying that the AJAX call was in fact successful.
High-level view of how LMSs typically handle SCORM data (a rough code sketch follows these points):
When a course is launched, the LMS gets ALL of the course's existing SCORM data from the database, then puts it into a JavaScript object on the client side (accessible via the SCORM API). When you fetch data via SCORM, you are typically fetching data that is in this pre-loaded JS object -- you are NOT getting a real-time response directly from the database. Therefore AJAX is not needed when using SCORM's API.GetValue.
When you attempt to API.SetValue, you're initially storing the key/value pair in the JS object, not the SCORM database. Therefore the client-side JS object needs to synchronously indicate whether it successfully stored the data ('true') or not ('false'). The database -- and AJAX -- doesn't come into play until you try to persist the data to the database using API.Commit().
When you try to get a success value from API.Commit(), which is invoking AJAX, most LMSs will fake it. They will do an asynchronous request for the sake of ensuring the course doesn't feel broken, so the value returned from Commit() will almost always be 'true'. It's not reliable.
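To make that concrete, here is a very rough sketch of the pattern described above (not any particular LMS's implementation; the cachedData preload and the /scorm/commit endpoint are assumptions):

// cachedData is assumed to be pre-loaded with the learner's existing SCORM data
// when the course is launched.
var cachedData = {};

var API = {
  GetValue: function (key) {
    // Read from the pre-loaded object; no AJAX involved.
    return cachedData.hasOwnProperty(key) ? cachedData[key] : "";
  },
  SetValue: function (key, value) {
    // Store client-side only; this can answer synchronously and honestly.
    cachedData[key] = value;
    return "true";
  },
  Commit: function () {
    // Persist asynchronously, but "fake" success so the course never feels broken.
    var xhr = new XMLHttpRequest();
    xhr.open('POST', '/scorm/commit', true);   // placeholder endpoint
    xhr.setRequestHeader('Content-Type', 'application/json');
    xhr.send(JSON.stringify(cachedData));
    return "true";                             // not a real confirmation
  }
};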
I'm porting my Firefox Extension to Chrome and the lack of a synchronous preferences service is making life fun.
I have an options page for my users and I'm using the approach from here: https://developer.chrome.com/extensions/options. I'll be running a content script to get data from the page and using sendMessage to send/receive callbacks to a background script.
The content script needs access to my Extension Options. It needs these before it does its processing. Of course, the Storage API is asynchronous. I've tried cheating with Stratify.js to force the Storage API to behave synchronously, but that's ugly as heck.
That leaves me writing code like this:
chrome.storage.sync.get(defaultPrefs, function(myPrefs) {
    // Do all my webpage processing here,
    // basically writing my entire Extension inside
    // this call to chrome.storage.sync.get()
});
I've seen this question asked before with a few solutions, but they mostly use localStorage, which won't work since I want to use my preferences from here.
This just feels wrong, but if I'm going to use the Storage Sync API for my Extension preferences then I'm kinda stuck. The localStorage solutions I've used mostly involve calls to sendMessage and leave me stuck in the same sort of callback pattern. Am I missing something?
You are not missing anything. The Google API is callback-driven, as it follows the JavaScript philosophy, and you should accept that to become a good JavaScript developer.
One reason for the asynchronous mode of Sync Storage is network latency and the possibly long time needed to send/receive the data from the synced storage. The JavaScript VM is single-threaded, so if the call to the storage were synchronous and took some time, the user interface would freeze while waiting for the response. That is not acceptable for the user experience. The only way to avoid this behavior is to use callbacks: you provide the function that you want executed when the request is finished.
It's not the best pattern ever devised, but it does the job. It has a limitation, though: callback hell. You can manage it with Promises and by defining simple, short functions that each do one atomic action. A functional-programming approach can help with this.
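For example, a minimal sketch of wrapping chrome.storage.sync.get in a Promise (checking chrome.runtime.lastError in the callback) could look like this:

// Wrap the callback-based storage API in a Promise so the content script
// can be written as a flat then-chain instead of nested callbacks.
function getPrefs(defaults) {
  return new Promise(function (resolve, reject) {
    chrome.storage.sync.get(defaults, function (items) {
      if (chrome.runtime.lastError) {
        reject(chrome.runtime.lastError);
      } else {
        resolve(items);
      }
    });
  });
}

getPrefs(defaultPrefs).then(function (myPrefs) {
  // do the webpage processing here
});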
Another way to avoid it is to create an object that automatically synchronizes itself with the storage. That lets you use it fully synchronously, but it makes possible errors more difficult to handle. I have made one here. It lacks error handling and could be improved a lot, but you can get the idea.
I will try to improve it later, but I lack the time...
I'm developing a website that uses a notification system (like Facebook's).
For this purpose I think I'll write a jQuery polling function that looks for new notifications on the server side, using AJAX.
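Something along these lines, roughly (the /notifications endpoint, the 10-second interval, and renderNotifications are placeholders of mine, nothing is settled yet):

// Rough polling loop: ask the server for anything newer than what we've seen,
// then schedule the next poll once the current request has settled.
var lastSeenId = 0;

function pollNotifications() {
  $.get('/notifications', { since: lastSeenId })
    .done(function (data) {
      renderNotifications(data.items);      // placeholder rendering helper
      lastSeenId = data.lastId;
    })
    .always(function () {
      setTimeout(pollNotifications, 10000); // chained setTimeout avoids overlapping polls
    });
}

pollNotifications();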
My question is, is this a good idea?
It will be fine on the client side; it's more likely to be a server-side issue. http://swerl.tudelft.nl/twiki/pub/Main/TechnicalReports/TUD-SERG-2007-016.pdf
In conclusion, if you want high data coherence and network performance, you should use push rather than pull. But push consumes more CPU and thus has scalability issues.
The efficiency of pulling depends on the pull interval versus the publish interval. If they are equal, everything is perfect, but realistically they will never match.
Some additional personal opinion: if your server script is blocking in nature (like Ruby on Rails with a single-threaded server), you'd better rethink the solution.
How can I save the application state for a Node.js application that consists mostly of HTTP requests?
I have a script in Node.js that works with a RESTful API to import a large number (10,000+) of products into an e-commerce application. The API has a limit on the number of requests that can be made, and we are starting to brush up against that limit. On a previous run the script exited with an Error: connect ETIMEDOUT, probably due to exceeding API limits. I would like to be able to try connecting 5 times and, if that fails, resume after an hour when the limit has been restored.
It would also be beneficial to save the progress throughout in case of a crash (power goes down, network crashes etc). And be able to resume the script from the point it left off.
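For illustration, the retry-then-wait behaviour I have in mind, plus a crude checkpoint, could look roughly like this (importProduct, the retry count, and the checkpoint file name are all placeholders, and it assumes each import call returns a promise):

// Sketch: retry each request up to 5 times, wait an hour if it keeps failing,
// and checkpoint progress to disk so a crash can resume where it left off.
const fs = require('fs');

const MAX_RETRIES = 5;
const ONE_HOUR = 60 * 60 * 1000;
const CHECKPOINT_FILE = 'import-progress.json';   // placeholder path

function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function importWithRetry(product) {
  let attempt = 0;
  while (true) {
    try {
      return await importProduct(product);        // placeholder API call
    } catch (err) {
      attempt++;
      if (attempt >= MAX_RETRIES) {
        await sleep(ONE_HOUR);                    // wait for the rate limit to reset
        attempt = 0;                              // then start counting retries again
      }
    }
  }
}

async function run(products) {
  const saved = fs.existsSync(CHECKPOINT_FILE)
    ? JSON.parse(fs.readFileSync(CHECKPOINT_FILE, 'utf8'))
    : { index: 0 };
  for (let i = saved.index; i < products.length; i++) {
    await importWithRetry(products[i]);
    fs.writeFileSync(CHECKPOINT_FILE, JSON.stringify({ index: i + 1 }));
  }
}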
I know that Node.js operates as a giant event-queue, all http requests and their callbacks get put into that queue (together with some other events). This makes it a prime target for saving the state of current execution. Another pleasant feature (not totally necessary for this project) would be being able to distribute the work among several machines on different networks to increase throughput.
So is there an existing way to do it? A framework perhaps? Or do I need to implement this myself, in that case, any useful resources on how this can be done would be appreciated.
I'm not sure what you mean when you say
I know that Node.js operates as a giant event-queue, all http requests and their callbacks get put into that queue (together with some other events). This makes it a prime target for saving the state of current execution
Please feel free to comment or expound on this if you find it relevant to the answer.
That said, if you're simply looking for a persistence mechanism for this particular task, I might recommend Redis, for a few reasons:
It allows atomic operations on many data types; for example, if you had an entry in Redis called num_requests_made that represented the number of requests made, you could increment this number easily in Redis using INCR num_requests_made, and it's guaranteed to be atomic, making it easier to scale to multiple workers (see the short sketch after these points).
It has several data types that could prove useful for your needs; for example, a simple string could represent the number of API requests made during a certain period of time (as in the previous bullet point); you might store details on failed API requests that need to be resubmitted in a list; etc.
It provides pub/sub mechanisms which would allow you to communicate easily between multiple instances of the program.
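For instance, the counter and the failed-request list from the points above might be handled roughly like this with the node_redis client (the key names and the limit are placeholders):

// Sketch using node_redis: an atomic request counter plus a list of failed
// requests that a worker can resubmit later. Key names and limit are placeholders.
const redis = require('redis');
const client = redis.createClient();

const HOURLY_LIMIT = 1000;   // placeholder API limit

function recordRequest(callback) {
  client.incr('num_requests_made', function (err, count) {
    if (err) return callback(err);
    callback(null, count <= HOURLY_LIMIT);   // false means "back off for now"
  });
}

function recordFailure(requestDetails) {
  client.rpush('failed_requests', JSON.stringify(requestDetails));
}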
If this sounds interesting or useful and you're not already familiar with Redis, I highly recommend trying out the interactive tutorial, which introduces you to a few data types and commands for them. Another good piece of reading material is A fifteen minute introduction to Redis data types.