I am a beginner with Redis. I used it in my Node.js project and it's providing good results; the caching mechanism in particular has been impressive.
So basically, in a world where MySQL, Firebase and MongoDB are each at the top of their respective niches, where would Redis fit? Can we use Redis for better optimization by replacing any of these popular databases, or does it play a greater role alongside specific technologies? Maybe it should mostly be used with JavaScript and its frameworks (e.g. Node.js pairs well with Redis)?
Redis is widely used for caching. Meaning, in a high-availability infrastructure, when some data has to be accessed many times, you store it in your database and also store it in Redis under a unique key that you can easily rebuild from parameters. When the data is updated, you just clear that key in Redis and add it again with the new data.
Example:
You have thousands of users.
They all connect many many times and go on their profile.
You might want to store their profile info in redis with a key {userid}_user_info.
The user tries to access his profile:
first check if data exists in redis,
if yes return it,
else get it from db and insert it in redis
When the user updates his profile info, just refresh the Redis value.
etc.
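The flow above can be sketched in Node.js as follows. This is a minimal cache-aside example assuming a connected node-redis (v4) client; `loadUserFromDb` and `saveUserToDb` are hypothetical placeholders for your own database layer, not real APIs:

```javascript
// Cache-aside lookup for user profiles (sketch; assumes a connected
// node-redis v4 client; loadUserFromDb / saveUserToDb are placeholders).
const userKey = (userId) => `${userId}_user_info`;

async function getUserInfo(client, userId, loadUserFromDb) {
  const key = userKey(userId);
  const cached = await client.get(key);
  if (cached !== null) return JSON.parse(cached);            // cache hit
  const user = await loadUserFromDb(userId);                 // cache miss: go to the DB
  await client.set(key, JSON.stringify(user), { EX: 3600 }); // optional 1h expiry
  return user;
}

async function updateUserInfo(client, userId, newData, saveUserToDb) {
  await saveUserToDb(userId, newData);
  await client.del(userKey(userId)); // invalidate; the next read repopulates the cache
}
```

The delete-on-update at the end is the "clear that key and add it again" step: rather than writing the new value eagerly, the key is simply dropped and rebuilt on the next read.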
Redis is also used another way: for queuing tasks and synchronising WebSocket broadcasts across machines. Here is a useful article about it:
http://fiznool.com/blog/2016/02/24/building-a-simple-message-queue-with-redis/
As for using Redis as a database: for simple data it can be used, like settings where a simple key/value pair is enough. For storing complex data it's a bit of a hassle, especially if you want relational functionality or search features. Redis is fast because it does not have all these features and keeps data in memory (that's not the only reason, but it does contribute).
I have 2 requirements in my application:
I have multiple clients, which should be completely separated
Each client can have multiple subsidiaries that he should be able to switch between without re-authenticating but the data should be separated (e.g. all vendors in subsidiary 1 should not be shown in subsidiary 2)
As for the first requirement, I'm thinking of using a multi-tenancy architecture. That is, there will be one API instance, one frontend instance per customer and one database per customer. Each request from the frontend will include a tenant ID by which the API decides which database it needs to connect to / use. I would use mongoose's useDb method for this.
Question 1: is this method a good approach and/or are there any known drawbacks performance wise? I'm using this article as a reference.
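For illustration, here is a minimal sketch of that tenant routing with `useDb`. `conn` is assumed to be an existing mongoose connection; the `x-tenant-id` header and the `tenant_` database-name prefix are illustrative choices, not anything prescribed by the question or the article:

```javascript
// Pick the tenant's database off a shared mongoose connection (sketch).
function tenantDb(conn, tenantId) {
  // useCache: true makes mongoose reuse the underlying connection
  // for repeat tenants instead of opening a new one each time.
  return conn.useDb(`tenant_${tenantId}`, { useCache: true });
}

// Express-style middleware attaching the tenant's db handle to the request
// (hypothetical header name):
function withTenantDb(conn) {
  return (req, res, next) => {
    req.db = tenantDb(conn, req.headers['x-tenant-id']);
    next();
  };
}
```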
As for the second requirement, I would need to somehow logically separate certain schemas. E.g., I have my mongoose vendorSchema, but I would need to somehow separate the entries per subsidiary. I can only imagine adding a field to each of these "shared schemas", e.g.
const vendorSchema = new mongoose.Schema({
  /* other fields... */
  subsidiary: {
    type: mongoose.Schema.Types.ObjectId,
    ref: "Subsidiary",
    required: true
  }
})
and then having to pass a subsidiary in every request to the API so it can be used in the mongoose query to find the right data. That seems like a bad architectural decision and an overhead, though, and doesn't look very scalable.
Question 2: Is there a better approach to achieve this logical separation as per subsidiary for every "shared" schema?
Thanks in advance for any help!
To maybe answer part of your question..
A multi-tenant application is, well, normal. I honestly don't know of any web app that would be single-tenant, unless it's just a personal app.
With that said, the architecture you have will work, but as noted in my comments there is no need to have a separate DB for each user; that would be a bit overkill, and avoiding it is the reason SQL and Mongo queries exist.
Performance-wise, database servers are in general very performant; that's what they are designed for. But this will depend on many factors:
Number of requests
Size of requests
DB optimization
Query optimization
Resources of the DB server
I'm sure there are many more I didn't list, but you get the idea.
To your second question: yes, you could add a 'subsidiary' field, which would hold the subsidiary ID. Then when you query Mongo you use where subsidiary = 'id', which returns only the items for that subsidiary...
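A small sketch of what that scoped query could look like with mongoose. `Vendor` stands for the model built from the question's vendorSchema, and `scopedFilter` is a hypothetical helper, not an existing API:

```javascript
// Merge the subsidiary id into whatever other filter the request carries,
// so every "shared schema" query is scoped the same way (sketch).
function scopedFilter(subsidiaryId, extra = {}) {
  return { ...extra, subsidiary: subsidiaryId };
}

function vendorsForSubsidiary(Vendor, subsidiaryId, extra = {}) {
  // Equivalent to: where subsidiary = 'id' (plus any extra conditions)
  return Vendor.find(scopedFilter(subsidiaryId, extra)).exec();
}
```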
From the standpoint of making multiple requests to Mongo for each API call: yes, you want to try to limit the number of calls each time, but that's where caching comes in, using something like Redis to store the responses for x minutes etc. Then the response is mostly handled by Redis. But again, this is going to depend a lot on the size of the responses, the frequency, etc.
But this actually leads into why I was asking about DB choices. Mongo works really well for frequently changing schemas with little to no relation between them. We use Mongo for a chat application and it works really well, because for us it's more or less just a JSON store with simple queries for chats; but the second you need data to relate to each other, it can start to get tricky and end up costing you more time and resources trying to hack around Mongo to do the same task.
I would say it could be worth doing an exercise where you look at your current data structure, where it is today and where it might go in the future. If you can foresee your data being related in any way in the future, or maybe even needing encryption (yes, Mongo does have this, but only in the Enterprise version), then it may be something to look at.
I have a set of data associating zipcodes to GPS coordinates (namely latitude and longitude). The very nature of the data makes it immutable, so it has no need to be updated.
What are the pros and cons of storing them in a SQL database vs directly as a JavaScript hashmap? The table resides on the server; it's Node.js, so this is not a server-vs-browser question.
When retrieving data, one is sync and the other async, but there are fewer than 10k elements, so I'm not sure whether storing these in MySQL and querying them justifies the overhead.
As there is no complex querying need, are there some points to consider that would justify having the dataset in a database?
* querying speed and CPU used for retrieving a pair,
* RAM used for a big dataset that would need to fit into working memory.
I guess that for a way bigger dataset, (like 100k, 1M or more), it would be too costly in memory and a better fit for the database.
Also, JavaScript objects use hash tables internally, so we can infer they perform well even with non-trivial datasets.
Still, would a database be more efficient at retrieving a value from an indexed key than a simple hashmap?
Anything else I'm not thinking about?
You're basically asking a scalability question... "At what point do I swap from storing things in a program to storing things in a database?"
Concurrency, persistence, maintainability, security, etc.... are all factors.
If the data is open knowledge, only used by one instance of one program, and will never change, then just hard code it or store it in a flat file.
When you have many applications with different permissions calling a set of data and making changes, a database really shines.
Most basically, an SQL database will [probably ...] be "server side," while your JavaScript hash-table will be "client side." Does the data need to be persisted from one request to the next, and between separate invocations of the JavaScript program? If so, it must be stored ... somewhere.
The decision of whether to use "a hash table" is also up to you: hash tables are great when you are looking for explicit keys. But they're not the only data-structure available to you in JavaScript.
I'd say: carefully work out all the particulars of your exact situation, and use these to inform your decision. "An online web forum like this one" really can't step into your shoes on this. "You're the engineer ..."
My Node server gathers data in the form of nested arrays, once every minute or so. The data looks like this [[8850, 3.1, '2009jckdsfj'], ..., [8849.99, 25.3, '8sdcach83']]
There are about 2000 of these arrays that need to be cached. Persistence isn't important since I'm updating it so often.
Using redis seems like the "thing to do," however I can't see the benefit. Using a javascript object, I wouldn't need to stringify and parse the arrays to store and consume them.
What advantages would redis offer in this situation?
Here are some reasons to use redis:
Multiple processes can access the data. It runs in a separate process and has a networked API so multiple processes can access the data. For example, if you were using clustering and wanted all clustered instances to have access to the same data, you would need to use some external database (such as Redis).
Memory usage separate from node.js. It runs in a separate process so its memory usage is separate from node.js. If you were storing a very large amount of data, it's possible that redis would handle that large memory usage better than node.js would or that you'd be better off with the usage split between two processes rather than all in node.js.
Redis offers features that aren't part of standard Javascript. These include pub/sub, data querying, transactions, expiring keys (ideal for data meant to expire such as sessions), LRU aging of keys (ideal for bounded caches), data structures not built into Javascript such as sorted sets, bitmaps, etc... to name only a few.
Redundancy/replication/high availability. If your data doesn't need to be long term on disk, but does need to be robust, you may want data protection against the failure of any single server. You can use data replication to multiple redis servers and thus have failover, backup without taking on the added performance drag of persisting to disk.
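To make the trade-off concrete for the nested-array case in the question, here is a minimal sketch using one of the features above, an expiring key. It assumes a connected node-redis (v4) client; the key name and TTL are arbitrary choices:

```javascript
// Cache the nested tick arrays under one key that expires on its own (sketch).
async function cacheTicks(client, ticks, ttlSeconds = 90) {
  // Redis stores strings, so this is the stringify step the question mentions
  await client.set('ticks:latest', JSON.stringify(ticks), { EX: ttlSeconds });
}

async function readTicks(client) {
  const raw = await client.get('ticks:latest'); // null once the TTL elapses
  return raw === null ? null : JSON.parse(raw);
}
```

The stringify/parse round trip is real overhead, which is exactly why a plain in-process object wins if none of the reasons above apply to you.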
This said, there's no reason to use redis because it's the "thing to do". Use redis only if you identify a problem you have that it solves better than just using an object store inside of node.js.
FYI, the redis site offers a whole series of white papers on all sorts of things related to redis. Those whitepapers might be a good source of further info also.
Immutable.js states (https://facebook.github.io/immutable-js/)
Immutable.js provides many Persistent Immutable data structures including: List, Stack, Map, OrderedMap, Set, OrderedSet and Record.
These data structures are highly efficient on modern JavaScript VMs by using structural sharing via hash maps tries and vector tries as popularized by Clojure and Scala, minimizing the need to copy or cache data.
Redis states (https://redis.io/)
Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs and geospatial indexes with radius queries. Redis has built-in replication, Lua scripting, LRU eviction, transactions and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.
The similar language shared in their descriptions leads me to think I could use Immutable.js as a "light" Redis solution, or Redis as a "heavy" Immutable.js solution, to track my multiplayer game's state (all active games' state and players' state).
What are some key differences between these?
When should someone select immutable.js over redis?
When should someone select redis over immutable.js?
At present I cannot find a comparison of these two libraries (when I search Google or Stack Overflow), which makes me believe my assumption (that they can be used in place of each other at times) is woefully inaccurate, but I can find nothing to confirm or deny that.
I am not wondering if they can be used together (unless it is in the context of how they are different and complement each other). I recognize Immutable.js is focused on handling immutable structures, while Redis does not seem to be focused on that, but that still begs the question: can't I just stick with Immutable.js?
Here is my hang up... both provide data structures in memory. If I don't care about persistence, why should I care which I choose?
To be clear, the context is on a server, not a browser, and if the server resets I don't want server state to persist (outside what is stored in mongodb).
Immutable.js will hold your data structures on the client side and will be wiped whenever the user's browser memory is cleared, i.e when they close the browser.
Redis is an in-memory database for your server that will be able to do very quick lookups and send that data to your clients. You can use redis in lieu of a database, however it's usually a good idea to have a database as backup as redis memory is quite volatile.
They are both used to solve very different problems; for your case, it sounds like all you need is Immutable.js.
If you needed your application to have save states you would then set up a way to send your immutable.js data structures to your server to be stored in a redis database.
I'm writing a multi chatroom application that requires persistent storage of the conversations (ie new users should be able to see old messages). I'm using socket.io if that helps.
Currently, when a user logs into a room, my node app checks to see if someone has been to that room yet (and each room has a hierarchical parent, for instance the room called Pets may belong to North America since there'd be a separate European Pets room).
If this is the first time a person has been in the room for a while, it loads all messages from redis for that room. (Eventually redis stored conversations make their way into MySQL).
So I have a multidimensional array called messages["PARENT"]["ROOM"], such that messages["North America"]["Pets"] is an array holding all the messages for that room. Aside from my misunderstanding of how arrays in JS work (as explained in this question: javascript push multidimensional array), it feels like I'm overcomplicating the situation. My reasoning for using the MD array was that it didn't make sense to make round trips to Redis requesting all the messages for a room that was active.
What would be the most logical approach to what I'm trying to accomplish? Should I just be using Redis and forgo this? How about some message queue or maybe a pubsub server? (I'm trying to not complicate this as it's just a prototype for a friend).
Thank you,
From an architectural point of view, this is a poor design. What if tomorrow you wanted to scale this application by adding more servers? These multi-dimensional arrays would be specific to each Node instance. Decoupling the storage has its own advantages, one being scaling out: the storage is now shared among several servers. It all depends on what you want to achieve. You may also run out of memory if your MD array grows in size, hampering your application's performance.
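One decoupled alternative is to keep each room's history in a Redis list instead of a per-process array. A minimal sketch, assuming a connected node-redis (v4) client; the key scheme is an illustrative choice:

```javascript
// One Redis list per room replaces messages[parent][room] (sketch).
const roomKey = (parent, room) => `messages:${parent}:${room}`;

async function appendMessage(client, parent, room, message) {
  // RPUSH keeps messages in arrival order
  await client.rPush(roomKey(parent, room), JSON.stringify(message));
}

async function roomHistory(client, parent, room) {
  // 0..-1 fetches the whole list; any server instance sees the same history
  const raw = await client.lRange(roomKey(parent, room), 0, -1);
  return raw.map((m) => JSON.parse(m));
}
```

Because the list lives in Redis rather than in one Node process, every clustered instance serving the same room reads and writes the same history.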