I'm building a Node.js app that needs access to some static data, and I'm not sure what the best way to store it is: JSON, MongoDB, or a SQL database, considering the performance of read operations.
The app will never update, insert, or delete any of the data, which is why I call it static. In total there will be at most a few hundred small objects.
What is your opinion on that, considering only the maximum performance of read operations?
Since it is 'static' data, and only a few hundred small objects at that, I'd recommend going with JSON. SQL should be preferred when operations such as data manipulation or concurrent sessions are involved.
This is not opinion based.
The answer is a flat file.
Reasoning: A database has defined use cases: triggers, inserts, deletes, updates, and so on, all managed through the database language of your choosing.
If you are not leveraging any key aspects of a database, then why take on its overhead?
The best way to approach this situation is to consolidate access in a class you create, called StaticService or whatever fits your fancy. In this class you read in the data and store it in memory as a property. Then add various methods to that service which return the data you request.
Even with a database, you would still implement this kind of service class, but here you don't have the database overhead. You can also optimize it as you see fit, but it sounds like you may be looking to display lists or specific values, which are generally O(1) access if the JSON is designed correctly.
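A minimal sketch of such a service, assuming the flat file holds a JSON array of objects that each carry a unique `id` field. The file layout, the `id` field, and the method names are assumptions made for illustration; the demo writes a temporary file so it can run on its own.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical service: reads the JSON file once and keeps it in memory.
class StaticService {
  constructor(filePath) {
    const items = JSON.parse(fs.readFileSync(filePath, 'utf8'));
    // Freeze the array so callers cannot add or remove entries.
    this.items = Object.freeze(items);
    // Index by id so single lookups do not scan the array.
    this.byId = new Map(items.map((item) => [item.id, item]));
  }

  getById(id) {
    return this.byId.get(id);
  }

  list() {
    return this.items;
  }
}

// Demo data written to a temporary file (assumed shape).
const demoPath = path.join(os.tmpdir(), `static-data-${process.pid}.json`);
fs.writeFileSync(
  demoPath,
  JSON.stringify([
    { id: 'a', name: 'Alpha' },
    { id: 'b', name: 'Beta' },
  ])
);

const service = new StaticService(demoPath);
console.log(service.getById('b').name); // prints "Beta"
fs.unlinkSync(demoPath);
```

The file is parsed once at construction, so every later read is an in-memory Map lookup or array access.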
So I have an app that needs to JSON.stringify its data to put into localStorage, but as the data gets larger, this operation gets outrageously expensive.
So, I tried moving this onto a web worker so it's off the main thread, but I'm now learning that posting an object to a web worker is even more expensive than stringifying it.
So I guess I'm asking, is there any way whatsoever to get JSON.stringify off the main thread, or at least make it less expensive?
I'm familiar with fast-json-stringify, but I don't think I can feasibly provide a complete schema every time...
You have correctly observed that passing an object to a web worker costs as much as serializing it. This is because web workers also need to receive serialized data, not native JS objects, since object instances are bound to the JS thread they were created in.
The generic solution applies to many programming problems: choose the right data structures when working with large datasets. When data gets larger, it's better to sacrifice simplicity of access for performance. Thus do any of:
Store data in IndexedDB
If your large object contains lists of the same kind of entry, use IndexedDB for reading and writing and you don't need to worry about serialization at all. This will require refactoring your code, but it is the correct solution for large datasets.
Store data in ArrayBuffer
If your data is mostly fixed-size values, use an ArrayBuffer. An ArrayBuffer can be copied or moved to a web worker pretty much instantly, and if your entries are all the same size, serialization can be done in parallel. For access, you may write simple wrapper classes that translate your binary data into something more readable.
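A minimal sketch of the ArrayBuffer approach, assuming each entry is two 64-bit floats. The `CoordinateStore` wrapper and its field layout are hypothetical. The demo uses `structuredClone` with a transfer list (available in browsers and in Node.js 17+) to show the buffer being moved rather than copied; with a real worker, the equivalent call would be `worker.postMessage(buffer, [buffer])`.

```javascript
// Hypothetical fixed-size record layout: two float64 values per entry.
const FIELDS_PER_RECORD = 2;

class CoordinateStore {
  // Accepts either a record capacity (number) or an existing ArrayBuffer.
  constructor(capacityOrBuffer) {
    this.values =
      typeof capacityOrBuffer === 'number'
        ? new Float64Array(capacityOrBuffer * FIELDS_PER_RECORD)
        : new Float64Array(capacityOrBuffer);
  }

  get length() {
    return this.values.length / FIELDS_PER_RECORD;
  }

  get buffer() {
    return this.values.buffer;
  }

  set(index, lat, lon) {
    const offset = index * FIELDS_PER_RECORD;
    this.values[offset] = lat;
    this.values[offset + 1] = lon;
  }

  get(index) {
    const offset = index * FIELDS_PER_RECORD;
    return { lat: this.values[offset], lon: this.values[offset + 1] };
  }
}

const store = new CoordinateStore(3);
store.set(1, 48.8566, 2.3522);

// Transfer moves ownership of the underlying memory; the source buffer
// is detached afterwards and no serialization of entries takes place.
const moved = structuredClone(store.buffer, { transfer: [store.buffer] });
const received = new CoordinateStore(moved);
console.log(received.get(1));
```

After the transfer the original store is empty, so the sending side must not keep using it.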
I have a set of data associating zipcodes to GPS coordinates (namely latitude and longitude). The very nature of the data makes it immutable, so it has no need to be updated.
What are the pros and cons of storing them in a SQL database vs. directly as a JavaScript hashmap? The table resides on the server, which runs Node.js, so this is not a server vs. browser question.
When retrieving data, one is sync and the other async, but there are fewer than 10k elements, so I'm not sure whether storing these in MySQL and querying them justifies the overhead.
As there is no need for complex querying, are there points to consider that would justify having the dataset in a database, such as:
* querying speed and CPU used for retrieving a pair,
* RAM used for a big dataset that would need to fit into working memory.
I guess that for a much bigger dataset (like 100k, 1M or more), it would be too costly in memory and a better fit for the database.
Also, JavaScript objects are generally backed by hash-table-like structures internally, so we can infer they perform well even with non-trivial datasets.
Still, would a database be more efficient at retrieving a value from an indexed key than a simple hashmap?
Anything else I'm not thinking about?
You're basically asking a scalability question: "At what point do I swap from storing things in a program to storing things in a database?"
Concurrency, persistence, maintainability, security, etc. are all factors.
If the data is open knowledge, only used by one instance of one program, and will never change, then just hard code it or store it in a flat file.
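As a rough illustration of the hard-coded option for the zipcode case, here is a sketch using a `Map`. The two entries are illustrative placeholder values, not authoritative coordinates, and `lookupZip` is a hypothetical helper name.

```javascript
// Placeholder entries for illustration only; a real table would be
// generated from the actual dataset.
const ZIP_COORDINATES = new Map([
  ['10001', [40.75, -73.99]],
  ['94103', [37.77, -122.41]],
]);

// Synchronous lookup; returns null when the zipcode is unknown.
function lookupZip(zip) {
  const entry = ZIP_COORDINATES.get(zip);
  return entry ? { latitude: entry[0], longitude: entry[1] } : null;
}

console.log(lookupZip('10001'));
console.log(lookupZip('00000')); // prints null
```

Keys are kept as strings so that zipcodes with leading zeros are not altered.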
When you have many applications with different permissions calling a set of data and making changes, a database really shines.
Most basically, an SQL database will [probably ...] be "server side," while your JavaScript hash-table will be "client side." Does the data need to be persisted from one request to the next, and between separate invocations of the JavaScript program? If so, it must be stored ... somewhere.
The decision of whether to use "a hash table" is also up to you: hash tables are great when you are looking for explicit keys. But they're not the only data-structure available to you in JavaScript.
I'd say: carefully work out all the particulars of your exact situation, and use these to inform your decision. "An online web forum like this one" really can't step into your shoes on this. "You're the engineer ..."
My Node server gathers data in the form of nested arrays, once every minute or so. The data looks like this: [[8850, 3.1, '2009jckdsfj'], ..., [8849.99, 25.3, '8sdcach83']]
There are about 2000 of these arrays that need to be cached. Persistence isn't important since I'm updating it so often.
Using Redis seems like the "thing to do," however I can't see the benefit. Using a JavaScript object, I wouldn't need to stringify and parse the arrays to store and consume them.
What advantages would Redis offer in this situation?
Here are some reasons to use Redis:
Multiple processes can access the data. It runs in a separate process and has a networked API so multiple processes can access the data. For example, if you were using clustering and wanted all clustered instances to have access to the same data, you would need to use some external database (such as Redis).
Memory usage separate from Node.js. It runs in a separate process so its memory usage is separate from Node.js. If you were storing a very large amount of data, it's possible that Redis would handle that large memory usage better than Node.js would, or that you'd be better off with the usage split between two processes rather than all in Node.js.
Redis offers features that aren't part of standard JavaScript. These include pub/sub, data querying, transactions, expiring keys (ideal for data meant to expire such as sessions), LRU aging of keys (ideal for bounded caches), and data structures not built into JavaScript such as sorted sets and bitmaps, to name only a few.
Redundancy/replication/high availability. If your data doesn't need to be on disk long term, but does need to be robust, you may want data protection against the failure of any single server. You can replicate data to multiple Redis servers and thus have failover and backup without taking on the added performance drag of persisting to disk.
That said, there's no reason to use Redis just because it's the "thing to do". Use Redis only if you identify a problem you have that it solves better than just using an object store inside of Node.js.
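For comparison, an object store inside of Node.js can be as small as the following sketch. The `SnapshotCache` class and its method names are hypothetical; the sample rows are the ones from the question. Note that this cache is visible only to the process that owns it, which is exactly the limitation the first point above describes.

```javascript
// Hypothetical in-process cache holding the latest snapshot of rows.
class SnapshotCache {
  constructor() {
    this.rows = [];
    this.updatedAt = 0;
  }

  // Swap in a fresh snapshot; no stringify/parse step is involved.
  replace(rows, now = Date.now()) {
    this.rows = rows;
    this.updatedAt = now;
  }

  // True when the snapshot is older than maxAgeMs.
  isStale(maxAgeMs, now = Date.now()) {
    return now - this.updatedAt > maxAgeMs;
  }

  get() {
    return this.rows;
  }
}

const cache = new SnapshotCache();
cache.replace([
  [8850, 3.1, '2009jckdsfj'],
  [8849.99, 25.3, '8sdcach83'],
]);
console.log(cache.get()[1][1]); // prints 25.3
```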
FYI, the Redis site offers a whole series of white papers on all sorts of things related to Redis. Those white papers might also be a good source of further info.
Currently I'm experimenting with localStorage to store a large number of objects of the same type, and I am getting a bit confused.
One way of thinking is to store all the objects in an array. But then for each read/write of a single object I need to deserialise/serialise the whole array.
The other way is to store each object directly under its own key in localStorage. This makes accessing each object much easier, but I'm worried about the number of objects that will be stored (tens of thousands). Also, getting all the objects will require iterating over the whole of localStorage!
Which way is better in your experience? Also, would it be worthwhile to try a more sophisticated client-side database like PouchDB?
If you want something simple for storing a large amount of key/values, and you don't want to have to worry about the types, then I recommend LocalForage. You can store strings, numbers, arrays, objects, Blobs, whatever you want. It uses IndexedDB and WebSQL where available, so the storage limits are much higher than LocalStorage.
PouchDB works too, but the API is more complex, and it's better-suited for when you want to sync data with CouchDB on the server.
If you do not want to have a lot of keys, you can:
* concatenate the row JSONs with \n and store them under a single key;
* build and update one or more indexes stored under separate keys, each linking some key to a particular row number.
In this case splitting the rows is just .split('\n'), which is ~2 orders of magnitude faster than JSON.parse.
Please note that you may need special effort to synchronize simultaneously opened tabs. It can be a challenge in complex cases.
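The scheme above can be sketched as follows. The key names, the `id` field, and the small `storage` stand-in (which mimics the `getItem`/`setItem` part of localStorage so the sketch also runs outside a browser) are all assumptions made for illustration.

```javascript
// Minimal stand-in for localStorage; in a browser, use localStorage itself.
const storage = {
  data: new Map(),
  getItem(key) {
    return this.data.has(key) ? this.data.get(key) : null;
  },
  setItem(key, value) {
    this.data.set(key, String(value));
  },
};

// Hypothetical key names.
const ROWS_KEY = 'rows';
const INDEX_KEY = 'rows:index:id';

function saveRows(rows) {
  // JSON.stringify escapes newlines inside strings, so '\n' is a safe
  // row separator.
  storage.setItem(ROWS_KEY, rows.map((row) => JSON.stringify(row)).join('\n'));
  // The index maps each row's id to its line number.
  const index = {};
  rows.forEach((row, lineNumber) => {
    index[row.id] = lineNumber;
  });
  storage.setItem(INDEX_KEY, JSON.stringify(index));
}

function loadRowById(id) {
  const index = JSON.parse(storage.getItem(INDEX_KEY) || '{}');
  if (!Object.prototype.hasOwnProperty.call(index, id)) return null;
  const lines = (storage.getItem(ROWS_KEY) || '').split('\n');
  // Only the requested row goes through JSON.parse.
  return JSON.parse(lines[index[id]]);
}

saveRows([
  { id: 'u1', note: 'first\nwith a line break' },
  { id: 'u2', note: 'second' },
]);
console.log(loadRowById('u2').note); // prints "second"
```

This sketch rewrites everything on save; updating a single row in place would need extra bookkeeping that is not shown here.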
localStorage has both good and bad parts.
Good parts:
* synchronous;
* extremely fast: both reads and writes are essentially a memcpy, with 100+ MB/s throughput even on weak devices (for example, JSON.stringify is in general 5-20 times slower than localStorage.setItem);
* thoroughly tested and reliable.
Bad parts:
* no transactions, so you need an engineering effort to sync tabs;
* assume you have no more than 2 MB (because some systems have this limit);
* 2 MB of storage actually means about 1M characters you can save, since strings are typically stored as UTF-16 (2 bytes per character).
These points mark the limits of localStorage's applicability as a DB. localStorage is good for tasks where you need synchronicity and speed, and where you can trim your DB to fit into the quota.
So localStorage is good for caches and logs. Not more.
I haven't personally used localStorage to manage so many elements.
However, the pattern I usually use to manage data is to load the complete database into a JavaScript object, work with it in memory during the process, and save it back to localStorage when the process is finished.
Of course, this pattern may not suit your needs, depending on your project specifications.
If you need to save data constantly, data access could become a problem, and some type of small database is probably a better option.
If your data volume is exceptionally high, it could also be a problem to manage it in memory. However, depending on the data model, you may be able to organize it into efficient structures that let you load and save data only when it's needed.
I need a mechanism for storing complex data structures created in client side javascript. I've been considering using the stringify method to convert the javascript object into a string, store it in the database and then pull it back out and use the reverse parse method to give me the javascript object back.
Is this just a bad idea or can it be done safely? If it can, what are some pitfalls I should be sure to avoid? Or should I just come up with my own method for accomplishing this?
It can be done and I've done it. It's as safe as your database.
The only downside is it's practically impossible to use the stored data in queries. Down the track you may come to wish you'd stored the data as table fields to enable filtering and sorting etc.
Since the data is user created make sure you're using a safe method to insert the data to protect yourself from injection attacks (don't just blindly concatenate the data into a query string).
It's fine so long as you don't deserialize using eval.
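A small sketch of the stringify/parse round trip, assuming the stored value is plain JSON text read back from the database. `parseStored` is a hypothetical helper name. The sketch parses with `JSON.parse` (never `eval`) and also shows one pitfall to plan for: values that JSON cannot represent do not survive the round trip.

```javascript
// Parse stored text with JSON.parse; invalid or malicious input is
// rejected instead of being executed.
function parseStored(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

// Pitfall: Dates become ISO strings, functions and undefined are dropped.
const original = {
  title: 'draft',
  createdAt: new Date(0),
  render() {},
  missing: undefined,
};
const stored = JSON.stringify(original);
const roundTripped = parseStored(stored).value;

console.log(roundTripped.createdAt); // prints "1970-01-01T00:00:00.000Z"
console.log('render' in roundTripped); // prints false
console.log(parseStored('alert(1)').ok); // prints false
```

If Dates or other rich types matter, they need to be converted back explicitly after parsing.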
Because you are using a database, you need a server-side language to communicate with it. Any data you have is easily converted to and from JSON in most server-side languages.
I can't imagine a proper use case unless you have a huge amount of JavaScript, it needs to be very performant, and you have exhausted all other possibilities such as caching, query optimization, etc.
Another downside of doing this is that you can't easily query the data in your database, which is always nice when you want to get any kind of reporting done.
And what if your JSON structure changes? Will you update all the stored data in your database? Or will you force yourself to cope with the changes in the parsing code?
Conclusion
IMHO it is not dangerous to do so, but it leaves little room for manageability and future updates.