I was wondering whether, for a large number of updates (>1000), saveAll or individual saves in a loop should be used. For example, I am making a batch update to multiple objects in a class, and both options work. What I am wondering is whether, for a large number of updates, there are limitations on how many objects saveAll can save accurately. I know there was some limitation when this was running on a free account years ago, but I would think that limitation should no longer exist since that doesn't exist anymore.
There is a limit of 50 objects per call (updates/saves). Performance depends a lot on the afterSave/beforeSave cloud code. There is no time limit or request limit. If your deployment is well orchestrated, you can push lots of items very fast, for example with pm2 running multiple processes on a multicore CPU. You also have to keep the MongoDB load in mind. Normally it should not be an issue, since saving is not as expensive as finding.
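A minimal sketch of saving in batches of 50, assuming the per-call limit mentioned above. The helper names `chunk` and `saveInBatches` and the `saveBatch` callback are hypothetical; with the Parse JS SDK the callback could wrap `Parse.Object.saveAll(batch)`.

```javascript
// Split an array into consecutive chunks of at most `size` items.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Save objects batch by batch. `saveBatch` is a hypothetical async callback
// (assumption: something like (batch) => Parse.Object.saveAll(batch)).
async function saveInBatches(objects, saveBatch, batchSize = 50) {
  let saved = 0;
  for (const batch of chunk(objects, batchSize)) {
    await saveBatch(batch); // one request per batch, sent sequentially
    saved += batch.length;
  }
  return saved;
}
```

Sending the batches sequentially keeps the load on the database predictable; whether running them in parallel is faster depends on the cloud code and the deployment.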


Node.js: can I save an array in my backend and use it when I need it?

I have a small website; when you visit it, it shows you a quote.
Until today, whenever a user visited my website, a random quote was fetched directly from the database (by directly I mean a connection was made to the database and a quote was returned from it), but sometimes that took 1 or 2 seconds. Today I did something different: when my Node.js application starts, I grab every quote in the database and store them in an array. When someone visits my website, I randomly choose a quote from the array. It is much faster than the first approach, and I made some changes so that the array is automatically updated when I add a new quote to the database.
So here is my question, is it bad to store data inside an array and serve users with it?
There are a few different answers depending on your intentions. First of all, if the dataset of quotes is large, I assure you it is a very bad idea, but if you are talking about a few items, it's acceptable. However, if you are building a scalable application, it's not recommended, because you will keep a full copy of the dataset on each node.
If you want very fast quote storage, I would recommend Redis (an in-memory key-value store). It shares state across nodes: all of your nodes connect to Redis and the quotes are kept there, so you do not need to keep copies on each node, and it stays fast. Also, if you enable the disk persistence option, you can use Redis as your primary quote storage. After all, you won't update these quotes often and they won't be searched with complex queries.
Alternatively, if your database is MySQL, PostgreSQL or MongoDB, you can enable an in-memory storage option so that you don't need to keep the data in your array and can take it directly from the database, which is much faster and also queryable.
There's the old joke: The two hard things in software engineering are naming things, caching things, and off-by-one errors.
You're caching something: your array of strings. Then you select one at random from the array each time you need one.
What is right? You get your text string from memory, and eliminate the time-delay involved in getting it from the database. It's a good optimization.
What can go wrong?
Somebody can add or remove strings from your database, which makes your cache stale.
You can have so many text strings you blow out your nodejs RAM. This seems unlikely; it's hard to imagine a list of quotes that big. The Hebrew Bible, the New Testament, and the Qur'an together comprise less than a million words. You probably won't have more text in your quotable-quotes than that. 10-20 megabytes of RAM is nothing these days.
So, what about your stale cache in RAM? What to do?
You could ignore the problem. Who cares if the cache is stale?
You could reread the cache every so often.
Your use of RAM for this is a good optimization. But, it adds a cache to your application. A cache adds complexity, and the potential for a bug. Is the optimization worth the trouble? Only you can guess the answer to that question.
And, it's MUCH MUCH better than doing SELECT ... ORDER BY RAND() LIMIT 1; every time you need something random. That is a notorious query-performance antipattern.
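The "reread the cache every so often" option could be sketched roughly as follows. The names `createQuoteCache` and `loadQuotes` are hypothetical; `loadQuotes` stands in for whatever async function reads all quotes from your database, and the five-minute default interval is an arbitrary assumption.

```javascript
// Minimal in-memory cache with periodic refresh.
// `loadQuotes` is a hypothetical async function returning an array of strings.
function createQuoteCache(loadQuotes, refreshMs = 5 * 60 * 1000) {
  let quotes = [];

  async function refresh() {
    quotes = await loadQuotes(); // swap in the fresh list in one assignment
  }

  // Re-read in the background; unref() keeps the timer from holding the process open.
  const timer = setInterval(() => refresh().catch(console.error), refreshMs);
  timer.unref();

  return {
    refresh,
    randomQuote() {
      if (quotes.length === 0) return null; // nothing loaded yet
      return quotes[Math.floor(Math.random() * quotes.length)];
    },
    stop() {
      clearInterval(timer);
    },
  };
}
```

Call `refresh()` once at startup before serving requests. Between refreshes the cache may be stale by up to `refreshMs`, which is the trade-off described above.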

In node.js, should a common set of data be in the database or a class?

In my database I have a table with data of cities. It includes the city name, translation of the name (it's a bi-lingual website), and latitude/longitude. This data will not change and every user will need to load this data. There are about 300 rows.
I'm trying to keep the pressure put on the server as low as possible (at least to a reasonable extent), but I'd also prefer to keep these in the database. Would it be best to have this data inside a class that is loaded in my main app.js file? It should be kept in memory and global to all users, correct? Or would it be better on the server to keep it in the database and select the data whenever a user needs it? The sign in screen has the listing of cities, so it would be loaded often.
I've just seen that unlike PHP, many of the Node.js servers don't have tons of memory, even the ones that aren't exactly cheap, so I'm worried about putting unnecessary things into memory.
I decided to give this a try. Using an example data set consisting of 300 rows (each containing 24 string characters and two doubles and property names), a small node.js script indicated an additional memory usage of 80 to 100 KB.
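A measurement along these lines might look like the sketch below. This is not the original script; the row shape (`name`, `translation`, `lat`, `lng`) and the generated values are assumptions made for illustration.

```javascript
// Build hypothetical city rows: two short strings and two doubles each.
function buildCities(count = 300) {
  const cities = [];
  for (let i = 0; i < count; i++) {
    cities.push({
      name: `City number ${i}`,
      translation: `Ville ${i}`,
      lat: 40 + i / 1000,
      lng: -70 - i / 1000,
    });
  }
  return cities;
}

const before = process.memoryUsage().heapUsed;
const cities = buildCities();
const after = process.memoryUsage().heapUsed;

// Heap figures are noisy (the GC can run at any time), so treat this as a rough estimate.
const deltaKb = ((after - before) / 1024).toFixed(1);
console.log(`rows: ${cities.length}, heap delta: ${deltaKb} KB`);
```

Running it several times and comparing the numbers gives a better picture than a single run.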
You should ask yourself:
How often will the data be used? How much of the data does a user need?
If the whole dataset will be used on a regular basis (let's say multiple times a second), you should consider keeping the data in memory. If, however, your users will need a part of the data and only once from time to time, you might consider loading the data from a database.
Can I guarantee efficient loading from the database?
An important fact is that loading parts of the data from a database might even require more memory, because the V8 garbage collector might delay the collection of the loaded data, so the same data (or multiple parts of the data) might be in memory at the same time. There is also a guaranteed overhead due to database / connection data and so on.
Is my approach sustainable?
Finally, consider the possibility of data growth. If you expect the dataset to grow by a non-trivial amount, think about the above points again and decide whether a growth is likely enough to justify database usage.

saving project as incremental json diffs?

I've been building a web paint program wherein the state of a user's artwork is saved as a JSON object. Every time I add to the client's undo stack (just an array of JSON objects describing the state of the project), I want to save the state to the server too.
I am wondering if there is an elegant way to [1] only send up the diffs and then [2] be able to download the project later and recreate the current state of the project? I fear this could get messy and am leaning towards just uploading the complete JSON project state at every undo step. Any suggestions or pointers to projects which tackle this sort of problem gracefully?
Interesting - and pretty large - question.
A lot of implementations / patterns / solutions apply to this problem and they vary depending on the type of "document" you're keeping track of updates of.
Anyway, a simple approach to avoid going mad is, instead of saving "states", to save the "commands which produced those states".
If your application is completely deterministic (which I assume it is, since it's a painting program), you can be sure that for every command at given time & position, the result will be the same at every execution.
So, I would instead note down an "alphabet" representing the commands available in your program:
Draw[x,y,size, color]
and so on. You can take inspiration from SVG implementation. Then push/pull strings of commands to/from the server:
timestamp: MOVE[0,15]DRAW[15,20,4,#000000]ERASE[4,8,10]DRAW[15,20,4,#ff0000]
This is obviously only a general, pseudocoded idea. Hope you can get some inspiration.
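To make the idea a bit more concrete, here is one possible sketch of replaying such a command string to rebuild the project state. The function names, the state shape (a cursor plus a list of strokes), and the reading of the arguments as `x, y, size, color` are all assumptions for illustration.

```javascript
// Parse a command string such as "MOVE[0,15]DRAW[15,20,4,#000000]"
// into a list of { name, args } objects.
function parseCommands(log) {
  const commands = [];
  const pattern = /([A-Z]+)\[([^\]]*)\]/g;
  for (const match of log.matchAll(pattern)) {
    const args = match[2].split(',').map((arg) => arg.trim());
    commands.push({ name: match[1], args });
  }
  return commands;
}

// Rebuild the state by replaying every command in order.
// The state shape used here is an illustrative assumption.
function replay(log) {
  const state = { cursor: { x: 0, y: 0 }, strokes: [] };
  for (const { name, args } of parseCommands(log)) {
    if (name === 'MOVE') {
      state.cursor = { x: Number(args[0]), y: Number(args[1]) };
    } else if (name === 'DRAW' || name === 'ERASE') {
      const [x, y, size, color] = args;
      const to = { x: Number(x), y: Number(y) };
      state.strokes.push({
        type: name.toLowerCase(),
        from: state.cursor,
        to,
        size: Number(size),
        color: color ?? null, // ERASE carries no color argument
      });
      state.cursor = to;
    }
  }
  return state;
}
```

With this approach the client only uploads the commands added since the last save, the server appends them to the stored log, and loading a project means replaying the whole log.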

Processing a large (12K+ rows) array in JavaScript

The project requirements are odd for this one, but I'm looking to get some insight...
I have a CSV file with about 12,000 rows of data and approximately 12-15 columns. I'm converting that to a JSON array and loading it via JSONP (it has to run client-side). It takes many seconds to do any kind of querying on the data set to return a smaller, filtered data set. I'm currently using JLINQ to do the filtering, but I'm essentially just looping through the array and returning a smaller set based on conditions.
Would WebDB or IndexedDB allow me to do this filtering significantly faster? Are there any tutorials/articles out there that you know of that tackle this particular type of issue?
(no longer maintained, see for a newer fork.)
Crossfilter is a JavaScript library for exploring large multivariate datasets in the browser. Crossfilter supports extremely fast (<30ms) interaction with coordinated views, even with datasets containing a million or more records...
This reminds me of an article John Resig wrote about dictionary lookups (a real dictionary, not a programming construct).
He starts with server side implementations, and then works on a client side solution. It should give you some ideas for ways to improve what you are doing right now:
Local Storage
Memory Considerations
If you require loading an entire data object into memory before you apply some transform on it, I would leave IndexedDB and WebSQL out of the mix as they typically both add to complexity and reduce the performance of apps.
For this type of filtering, a library like Crossfilter will go a long way.
Where IndexedDB and WebSQL can come into play in terms of filtering is when you don't need to load, or don't want to load, an entire dataset into memory. These databases are best utilized for their ability to index rows (WebSQL) and attributes (IndexedDB).
With in-browser databases, you can stream data into a database one record at a time and then cursor through it, one record at a time. The benefit here for filtering is that this means you can leave your data on "disk" (a .leveldb in Chrome and a .sqlite database in FF) and filter out unnecessary records either as a pre-filter step or as the filter itself.
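To illustrate why indexing an attribute helps with filtering, here is a plain in-memory sketch. It is neither the IndexedDB nor the Crossfilter API, just a `Map` built once and then used for lookups; the `region` column and the generated rows are hypothetical.

```javascript
// Build a Map from each distinct value of `key` to the rows that have it.
// This is done once, up front; lookups then avoid scanning all rows.
function buildIndex(rows, key) {
  const index = new Map();
  for (const row of rows) {
    const bucket = index.get(row[key]);
    if (bucket) bucket.push(row);
    else index.set(row[key], [row]);
  }
  return index;
}

// Hypothetical data: 12,000 rows with a `region` column.
const rows = Array.from({ length: 12000 }, (_, i) => ({
  id: i,
  region: ['north', 'south', 'east', 'west'][i % 4],
}));

const byRegion = buildIndex(rows, 'region');
const north = byRegion.get('north') ?? []; // direct lookup, no full scan
console.log(north.length);
```

The index costs one pass over the data and some extra memory, so it pays off when the same column is filtered repeatedly.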

How to improve performance of jQuery autocomplete

I was planning to use jQuery autocomplete for a site and have implemented a test version. I'm now using an Ajax call to retrieve a new list of strings for every character typed. The problem is that it gets rather slow: 1.5 s before the new list is populated. What is the best way to make autocomplete fast? I'm using CakePHP and just doing a find with a limit of 10 items.
This article about how Flickr does autocomplete is a very good read. I had a few "wow" experiences reading it.
"This widget downloads a list of all of your contacts, in JavaScript, in under 200ms (this is true even for members with 10,000+ contacts). In order to get this level of performance, we had to completely rethink how we send data from the server to the client."
Try preloading your list object instead of doing the query on the fly.
Also the autocomplete has a 300 ms delay by default.
Perhaps remove the delay
$( ".selector" ).autocomplete({ delay: 0 });
A 1.5-second response time is far too slow for an autocomplete service.
First, optimize your query and DB connections. Try keeping your DB connection alive with memory caching.
Use result caching if your service is heavily used, to avoid re-fetches.
Use a client cache (a JS list) to keep earlier requests on the client. If the user backspaces and retypes, it will be useful: results come from the frontend cache instead of the backend.
Regex filtering on the client side won't be costly; you may give it a chance.
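The client cache suggestion could look something like the sketch below. `createCachedSuggest` and `fetchSuggestions` are hypothetical names; `fetchSuggestions(term)` stands in for whatever function performs your Ajax request and returns a promise of results.

```javascript
// Wrap a hypothetical async `fetchSuggestions(term)` with a client-side cache,
// so a term that was already requested is served without a new request.
function createCachedSuggest(fetchSuggestions) {
  const cache = new Map();
  return function suggest(term) {
    const key = term.trim().toLowerCase();
    if (!cache.has(key)) {
      // Store the promise so concurrent calls for the same term share one request.
      const pending = fetchSuggestions(key);
      cache.set(key, pending);
      // Drop failed requests so the term can be retried later.
      pending.catch(() => cache.delete(key));
    }
    return cache.get(key);
  };
}
```

With jQuery UI, the `source` option accepts a function of `(request, response)`, so something like `source: (request, response) => suggest(request.term).then(response)` should plug this in, assuming `suggest` was created once outside the widget setup.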
Before optimizing anything, you should first analyze where the bottleneck is. Try to find out how long each step (input → request → db query → response → display) takes. Maybe the CakePHP implementation has a delay so that it does not send a request for every character entered.
Server side on PHP/SQL is slow.
Don't use PHP/SQL. My autocomplete is written in C++ and uses hash tables for lookups. See performance here.
This is a Celeron-300 computer running FreeBSD and Apache/FastCGI.
As you can see, it runs quickly on huge dictionaries; 10,000,000 records isn't a problem.
It also supports priorities, dynamic translations, and other features.
The real issue for speed in this case, I believe, is the time it takes to run the query on the database. If there is no way to improve the speed of your query, then maybe extend your search to return more items, including some highly ranked results; that way you can perform one search every other character and filter through 20-30 results on the client side.
This may improve the appearance of performance, but at 1.5 seconds, I would first try to improve the query speed.
Other than that, if you can give us some more information I may be able to give you a more specific answer.
Autocomplete itself is not slow, although your implementation certainly could be. The first thing I would check is the value of your delay option (see jQuery docs). Next, I would check your query: you might only be bringing back 10 records but are you doing a huge table scan to get those 10 records? Are you bringing back a ton of records from the database into a collection and then taking 10 items from the collection instead of doing server-side paging on the database? A simple index might help, but you are going to have to do some testing to be sure.

