Node chatroom, store messages in array or just redis?

I'm writing a multi chatroom application that requires persistent storage of the conversations (ie new users should be able to see old messages). I'm using socket.io if that helps.
Currently, when a user logs into a room, my node app checks to see if someone has been to that room yet (and each room has a hierarchical parent, for instance the room called Pets may belong to North America since there'd be a separate European Pets room).
If this is the first time a person has been in the room for a while, it loads all messages from redis for that room. (Eventually redis stored conversations make their way into MySQL).
So I have a multidimensional array called messages["PARENT"]["ROOM"], such that messages["North America"]["Pets"] will be an array that has all the messages for that room. Aside from misunderstanding how arrays in JS work (as explained in this question: javascript push multidimensional array), it feels like I'm overcomplicating the situation. My reasoning for using the MD array was that it didn't make sense to make round trips requesting all the messages from Redis for a room that was active.
What would be the most logical approach to what I'm trying to accomplish? Should I just be using Redis and forgo this? How about some message queue or maybe a pubsub server? (I'm trying to not complicate this as it's just a prototype for a friend).
Thank you,

From an architectural point of view, this is a poor design. What if tomorrow you wanted to scale this application by setting up more servers? These multidimensional arrays would be specific to each node instance. Decoupling the storage has its own advantages, one being scaling out: the storage is then shared among several servers. It all depends on what you want to achieve. You may also run out of memory if your multidimensional array keeps growing, which would hamper your application's performance.
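If an in-process cache is kept at all in front of Redis, one way to address the memory concern is to cap how many messages each room holds and to treat Redis as the source of truth. The following is only a minimal sketch under that assumption; the class name RoomCache, the maxPerRoom limit and the "parent:room" key format are hypothetical and not from the original post.

```javascript
// Hypothetical helper: a bounded in-process cache of recent messages per room.
class RoomCache {
  constructor(maxPerRoom = 100) {
    this.maxPerRoom = maxPerRoom;
    this.rooms = new Map(); // "parent:room" -> array of messages
  }

  // One flat key instead of nesting arrays by parent and room.
  // Assumes parent names never contain ":".
  static key(parent, room) {
    return `${parent}:${room}`;
  }

  add(parent, room, message) {
    const k = RoomCache.key(parent, room);
    if (!this.rooms.has(k)) this.rooms.set(k, []);
    const list = this.rooms.get(k);
    list.push(message);
    // Drop the oldest entries so memory use stays bounded.
    if (list.length > this.maxPerRoom) {
      list.splice(0, list.length - this.maxPerRoom);
    }
  }

  recent(parent, room) {
    return this.rooms.get(RoomCache.key(parent, room)) || [];
  }
}

const roomCache = new RoomCache(2);
roomCache.add("North America", "Pets", "hello");
roomCache.add("North America", "Pets", "anyone here?");
roomCache.add("North America", "Pets", "yes");
console.log(roomCache.recent("North America", "Pets")); // the two newest messages
```

Each node instance would still have its own copy of this cache, so anything that must be shared between servers has to be read from and written to the external store.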

Related

When to use redis for better optimization?

I am a beginner with redis and have used it in my node.js project, where its caching mechanism is giving good results.
So basically, in a world where MySQL, Firebase and MongoDB are at the top in their respective areas, where would redis fit? Can we use redis for better optimization by replacing any of these popular databases, or does it have a greater role alongside specific technologies? Maybe it should be used more with JavaScript and its frameworks (e.g. node.js seems to pair well with redis)?

Redis is widely used for caching. Meaning, in a high availability infrastructure, when some data has to be accessed many times, you would store it in your database and then store it in redis with some unique key which you could rebuild easily with parameters. When the data is updated, you just clear that key in redis and add it again with the new data.
Example:
You have thousands of users.
They all connect many many times and go on their profile.
You might want to store their profile info in redis with a key {userid}_user_info.
The user tries to access his profile:
first check if data exists in redis,
if yes return it,
else get it from db and insert it in redis
The user updates his profile info, just refresh the redis value.
etc.
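The profile example above can be sketched roughly as follows. To keep it runnable on its own, both the cache and the database are plain Maps standing in for Redis and a real database (an assumption for illustration; a real Redis client would be asynchronous). The function names getProfile and updateProfile are hypothetical; only the {userid}_user_info key format comes from the answer.

```javascript
// Stand-ins for Redis and the database so the sketch runs by itself.
const profileCache = new Map();
const profileDb = new Map([[42, { name: "Ada", city: "London" }]]);
let dbReads = 0; // counts how often the database is actually hit

const cacheKey = (userId) => `${userId}_user_info`;

function getProfile(userId) {
  const key = cacheKey(userId);
  // Cache hit: return the stored copy without touching the database.
  if (profileCache.has(key)) return JSON.parse(profileCache.get(key));
  // Cache miss: read from the database and fill the cache.
  dbReads += 1;
  const profile = profileDb.get(userId);
  if (profile !== undefined) profileCache.set(key, JSON.stringify(profile));
  return profile;
}

function updateProfile(userId, changes) {
  const updated = { ...profileDb.get(userId), ...changes };
  profileDb.set(userId, updated); // write to the database first
  profileCache.set(cacheKey(userId), JSON.stringify(updated)); // then refresh the cached value
  return updated;
}

getProfile(42); // miss: reads the database and fills the cache
getProfile(42); // hit: served from the cache
updateProfile(42, { city: "Paris" });
console.log(getProfile(42).city, dbReads); // Paris 1
```

Values are serialized with JSON here because Redis string values hold text, not JavaScript objects.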
Redis is also used in another way: for queuing tasks and synchronising websocket broadcasts across machines. Here is a useful article about it:
http://fiznool.com/blog/2016/02/24/building-a-simple-message-queue-with-redis/
As for using redis as a database, it can be used for simple data, like settings where a simple key/value is enough. For storing complex data it's a bit of a hassle, especially if you want relational functionality or search features. Redis is fast because it does not have all these features and keeps data in memory (that is not the only reason, but it does contribute).

Is immutable.js interchangeable with redis on a server?

Immutable.js states (https://facebook.github.io/immutable-js/)
Immutable.js provides many Persistent Immutable data structures including: List, Stack, Map, OrderedMap, Set, OrderedSet and Record.
These data structures are highly efficient on modern JavaScript VMs by using structural sharing via hash maps tries and vector tries as popularized by Clojure and Scala, minimizing the need to copy or cache data.
Redis states (https://redis.io/)
Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs and geospatial indexes with radius queries. Redis has built-in replication, Lua scripting, LRU eviction, transactions and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.
Similar language shared in their descriptions leads me to think I could use Immutable.js as a "light" Redis solution or Redis as a "heavy" Immutable.js solution to track my multiplayer game's state (all active games' state and players' state).
What are some key differences between these?
When should someone select immutable.js over redis?
When should someone select redis over immutable.js?
At present I cannot find a comparison of these two libraries (when I search Google or Stack Overflow), which makes me believe my assumption (that they can be used in place of each other at times) is woefully inaccurate, but I can find nothing to confirm or deny that.
I am not wondering if they can be used together (unless it is in a context of how they are different and complement each other). I recognize Immutable.js is focused on handling structures that are immutable, while Redis does not seem to be focused on that, but that still raises the question: can't I just stick with Immutable.js?
Here is my hang up... both provide data structures in memory. If I don't care about persistence, why should I care which I choose?
To be clear, the context is a server, not a browser, and if the server resets I don't want server state to persist (outside what is stored in MongoDB).

Immutable.js will hold your data structures on the client side and will be wiped whenever the user's browser memory is cleared, i.e. when they close the browser.
Redis is an in-memory database for your server that will be able to do very quick lookups and send that data to your clients. You can use redis in lieu of a database, however it's usually a good idea to have a database as backup as redis memory is quite volatile.
They solve very different problems; for your case it sounds like all you need is immutable.js.
If you needed your application to have save states you would then set up a way to send your immutable.js data structures to your server to be stored in a redis database.

Caching query results, to do or not to do, overkill or performance energizer?

Good evening,
my project uses the MEAN Stack and has a few collections and a single database from which the data is retrieved.
Thinking about how the user would interact with the webapp I am going to build, I figured that my design for the application is rather wasteful.
Now, the application is hosted on a private server on the LAN, making it very fast on requests and it's running an express server.
The application is made around employee management, services and places where the services can take place. Just describing, so to have an idea.
The "ring to rule them all" is pretty much the first collection, services, which starts the core of the application. There's a page that lets you add rows, one for each service that you intend to do, and within that row you choose an employee to "run the service", based on characteristics that this employee has, meaning that if the service is about teaching Piano, the employee must know how to play Piano. The same logic works for the rest of the "columns" that will build up my row into a full service recognized by the app as such.
Now, what I said above is pretty much information retrieval from a database and logic to make the application model the information retrieved and build something with it.
My question, or rather my doubt, comes from how I imagined the querying to work for each field that is part of the service row. Right now I'm thinking about querying the database (mongodb) each time I have to pick a value for a field, but if you consider that I might want to add 100 rows, each of which would have 10 fields, that would make for a lot of requests to the database. Now, that doesn't seem elegant, nor intelligent to me, but I can't come up with a better solution or idea.
Any suggestions or rules of thumb for a MEAN newb?
Thanks in advance!
EDIT: Answer to a comment question which was needed.
No, the database is pretty static (unless the user willingly inserts a new value, say a new employee that can do a service). That wouldn't happen very often. Considering the query that would return all the employees for a given service, those employees would (ideally) be inside an associative array, with the possibility to be "pop'd" from it if chosen for a service, making them unavailable for further services (because one person can't do two services at the same time). Hope I was clear, I'm surely not the best person at explaining myself.

It would query the database on who is available when a user looks at that page and another query if the user assigns an employee to do a service.
In general 1 query on page load and another when data is submitted is standard.
You would only want to use an in-memory cache for:
- frequent queries, but most databases will do this automatically.
- values that change frequently, like:
  - How many users are connected
  - Last query sent
  - Something that happens on almost every query (>95%)
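As a rough illustration of the "one query on page load" approach, the employee list can be fetched once and then filtered in memory for each row, removing an employee once chosen. This is a sketch only: the employees array below is a made-up stand-in for the result of that single query, and candidatesFor and assign are hypothetical names.

```javascript
// Hypothetical result of the single page-load query.
const employees = [
  { id: 1, name: "Ana", skills: ["piano", "guitar"] },
  { id: 2, name: "Ben", skills: ["piano"] },
  { id: 3, name: "Cleo", skills: ["violin"] },
];

// Employees still free to be assigned, keyed by id.
const available = new Map(employees.map((e) => [e.id, e]));

// Candidates for one row are filtered in memory; no extra query is needed.
function candidatesFor(skill) {
  return [...available.values()].filter((e) => e.skills.includes(skill));
}

// Choosing an employee removes them so they cannot be booked twice.
function assign(employeeId) {
  const chosen = available.get(employeeId);
  if (!chosen) throw new Error(`employee ${employeeId} is not available`);
  available.delete(employeeId);
  return chosen;
}

assign(1);
console.log(candidatesFor("piano").map((e) => e.name)); // [ 'Ben' ]
```

The second query mentioned above would then run once, when the finished rows are submitted.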

What's the maximum number of rooms socket.io can handle?

I am building an app using socket.io
I'm using socket.io's rooms feature, there are 5 "topics" a user can subscribe to. Each message broadcast in that topic has a message type, of which there are 100. A user will only receive the messages of the types they are allowed to receive, which could be between 30 and 70.
My question: is it feasible to create a room for every topic + message type combination, which will be 5 x 100 rooms? Will socket.io perform well like this, or is there a better way to approach this problem? Would emitting individual messages to each individual socket, instead of using rooms, be better?
Thanks for your help.

socket.io rooms are a lightweight data structure. They are simply an array of connections that are associated with that room. You can have as many as you want (within normal memory usage limits). There is no heavyweight thing that makes a room expensive in terms of resources. It's just a list of sockets that wish to be associated with that room. Emitting to a room is nothing more than iterating through the array of sockets in the room and sending to each one.
A room costs only a little bit of memory to keep the array of sockets that are in each room. Other than that, there is no additional cost.
Further, if your alternative is to just maintain an array of sockets for each topic anyway, then your alternative is probably not saving you much or anything.
"My question: is it feasible to create a room for every topic + message type combination, which will be 5 x 100 rooms?"
Yes, that is easily feasible.
"Will socket.io perform well like this, or is there a better way to approach this problem?"
There's no issue with having this many rooms. Whether it performs well or not depends entirely upon what you're doing with that many rooms. If you're regularly sending lots of messages to lots of rooms that each have lots of sockets in them, then you'll have to benchmark whether that has a performance issue or not.
"Would emitting individual messages to each individual socket, instead of using rooms, be better?"
There won't be an appreciable difference. A room is just a convenience tool. Emitting to a room just iterates through each socket in the room and sends to it anyway, the same as you proposed doing yourself. You may as well use the built-in rooms capability rather than reimplementing it yourself.
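To make the "a room is just a list of sockets" description concrete, here is a toy registry that behaves the way the answer describes. It is not socket.io's actual implementation; the "topic:type" room naming scheme, the fake sockets and the function names are all assumptions made for illustration.

```javascript
// Toy model only: a room is a named collection of sockets.
const rooms = new Map(); // room name -> Set of sockets

// Hypothetical naming scheme: one room per topic + message type pair.
const roomName = (topic, type) => `${topic}:${type}`;

function join(socket, topic, type) {
  const name = roomName(topic, type);
  if (!rooms.has(name)) rooms.set(name, new Set());
  rooms.get(name).add(socket);
}

// Emitting to a room is a loop over its members.
function emitToRoom(topic, type, payload) {
  const members = rooms.get(roomName(topic, type)) || new Set();
  for (const socket of members) socket.send(payload);
  return members.size; // how many sockets received the payload
}

// Fake sockets that record what they receive.
const makeSocket = () => ({ inbox: [], send(msg) { this.inbox.push(msg); } });
const alice = makeSocket();
const bob = makeSocket();

join(alice, "topic1", "type7");
join(bob, "topic1", "type7");
join(bob, "topic2", "type7");

console.log(emitToRoom("topic1", "type7", "hi")); // 2
console.log(5 * 100); // 500 rooms if every combination is used
```

With socket.io itself the equivalent calls would be socket.join(name) on the server and io.to(name).emit(...), so a registry like this does not need to be written by hand.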

node.js storing gamestate, how?

I'm writing a game in javascript, and to prevent cheating, I'm having the game be played on the server (it's a board game like a more complicated checkers). Since the game is fairly complex, I need to store the gamestate in order to validate client actions.
Is it possible to store the gamestate in memory? Is that smart? Should I do that? If so, how? I don't know how that would work.
I can also store in redis. And that sort of thing is pretty familiar to me and requires no explanation. But if I do store in redis, the problem is that on every single move, the game would need to get the data from redis and interpret and parse that data in order to recreate the gamestate from scratch. But since moves happen very frequently this seems very stupid to me.
What should I do?

If you really, really don't want the overhead of I/O then just store the game state in a global object keyed by the game id:
var global_gamestate = {};
Then on each connection check what the game id is to retrieve the game state:
var gamestate = global_gamestate[game_id];
Presumably you already have a mechanism to map client sessions to game id.
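A slightly fuller sketch of the same idea is shown below, using a Map keyed by game id and validating a move against the state already held in memory. The state shape (turn, moves) and the function names are hypothetical; a real board game would track much more.

```javascript
// In-process game state keyed by game id.
const gameStates = new Map();

// Hypothetical initial state for a new game.
function getOrCreateGame(gameId) {
  if (!gameStates.has(gameId)) {
    gameStates.set(gameId, { turn: "white", moves: [] });
  }
  return gameStates.get(gameId);
}

// Validate against the in-memory state, then mutate it; nothing is
// fetched or parsed on each move.
function applyMove(gameId, player, move) {
  const state = getOrCreateGame(gameId);
  if (player !== state.turn) return false; // reject out-of-turn moves
  state.moves.push(move);
  state.turn = player === "white" ? "black" : "white";
  return true;
}

// Remove finished games so the map does not grow forever.
function endGame(gameId) {
  return gameStates.delete(gameId);
}

console.log(applyMove("g1", "white", "a3-b4")); // true
console.log(applyMove("g1", "white", "c3-d4")); // false, it is black's turn
```

State kept this way is lost if the process restarts and is not visible to other server processes, which is the trade-off against storing it in redis.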
Usually, game state is small and would hardly take up much RAM. Let's be pessimistic and assume each game state takes up 500K. Then you can serve about two thousand games (four thousand users if we assume two users per game) for each gigabyte of RAM on your server.
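As a quick check of that estimate, assuming decimal units (500K = 500,000 bytes, one gigabyte = 1,000,000,000 bytes) and two users per game:

```javascript
const bytesPerGame = 500 * 1000; // the pessimistic 500K per game state
const bytesPerGigabyte = 1000 ** 3;
const gamesPerGigabyte = Math.floor(bytesPerGigabyte / bytesPerGame);
const usersPerGigabyte = gamesPerGigabyte * 2; // two users per game
console.log(gamesPerGigabyte, usersPerGigabyte); // 2000 4000
```

This ignores the memory used by node itself and by per-connection overhead, so the real figure would be somewhat lower.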
However, I would like to point out that databases like MySQL already implement caching (which is configurable), so loading the most frequently used data basically loads from RAM with a minor socket I/O overhead. The advantage of databases is that you can have much more data than you have RAM because they store the rest on disk.
If your program ever reaches the load where you start thinking of writing your own disk-serialization algorithm to implement a swap file then you're basically re-inventing the wheel. In which case I'd say go with databases.
