What are the potential complexities / problems unique to a large NodeJS Application

I come from a "traditional web application" background: think Java, .NET, PHP, ColdFusion, etc.
In evaluating NodeJS for use as the primary server-side technology for non-trivial applications, I'm wondering what complexities, problems, challenges a team of developers and operations folks might expect to face which are unique to NodeJS. In short, I'd like to reduce my unknowns. Some (not all) examples:
How well does it lend itself to large-team development? What unique challenges exist, for Node, in a team of 20 or 50 or 200 developers?
What unique challenges exist with respect to Database access? "Enterprisey" data access concerns are handled mostly trivially in Java (connection pools, security, etc via Spring). Is this the case with Node?
Reporting-heavy applications often require Excel, PDF, even PNG export... How does Node fare in this type of application?
Does Node present any unique challenges with respect to development-time debugging?
What unique operations challenges exist? Everything from server restarts / hot code swap / load balancing to tools for monitoring and managing a production cluster.
And so forth. What lessons exist for developing, maintaining, and production-managing a 100+K LoC codebase, deployed across a farm of servers, touched by dozens of developers?

I'll try to answer some of your questions
How well does it lend itself to large-team development? What unique challenges exist, for Node, in a team of 20 or 50 or 200 developers?
It presents the same challenges as every other language: none that are unique! Teams of programmers normally use version-control systems such as Git, SVN, or Mercurial to deal with co-workers working on the same files.
What unique challenges exist with respect to Database access? "Enterprisey" data access concerns are handled mostly trivially in Java (connection pools, security, etc via Spring). Is this the case with Node?
Node is database agnostic. You can use any database with node for which drivers/wrappers exist (same as with PHP). There are many drivers/wrappers for relational databases (MySQL, SQLite) and NoSQL stores (MongoDB).
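For example, a connection pool with the popular mysql driver looks roughly like this (a minimal sketch; the credentials and limits are illustrative):

    // Minimal sketch using the 'mysql' npm driver; settings are illustrative.
    const mysql = require('mysql');

    const pool = mysql.createPool({
      connectionLimit: 10,     // max simultaneous connections
      host: 'localhost',
      user: 'app',
      password: 'secret',
      database: 'app_db'
    });

    // The pool checks a connection out, runs the query, and returns it.
    pool.query('SELECT 1 + 1 AS two', (err, rows) => {
      if (err) throw err;
      console.log(rows[0].two); // 2
    });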
Reporting-heavy applications often require Excel, PDF, even PNG export... How does Node fare in this type of application?
Node can, as PHP and others do, access the normal shell. If you handle an image with PHP, you are using a wrapper for ImageMagick or GD, and GD needs to be installed on the web server for it to work, because PHP sends the commands down to the command line. Same with node: find a wrapper (http://search.npmjs.org) and use the feature you desire.
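For instance, shelling out to ImageMagick from Node might look like this (a sketch; it assumes the convert binary is installed on the server, and the file names are made up):

    // Sketch: invoke ImageMagick's `convert` (must be installed on the server).
    const { execFile } = require('child_process');

    execFile('convert', ['chart.svg', '-resize', '800x600', 'chart.png'], (err) => {
      if (err) throw err;
      console.log('chart.png written');
    });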
Does Node present any unique challenges with respect to development-time debugging?
Node is JavaScript, so there is no ahead-of-time compile step; the code is JIT-compiled. A failure will therefore only be detected when the code is actually executed. You'd want a local environment for every developer, next to your staging/dev machine and live servers, so they can test code prior to committing it to the staging server.
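A trivial illustration: nothing flags the typo below until the function actually runs:

    function buildReport(totals) {
      // 'fromat' is a typo for 'format'; with no compile step, this only
      // throws a TypeError once buildReport() is actually called.
      return totals.fromat();
    }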
What unique operations challenges exist? Everything from server restarts / hot code swap / load balancing to tools for monitoring and managing a production cluster.
No "Unique" challanges I am aware of. You'll deal with monitoring/heartbeats as with all other languages (yeah, there are languages that do this, like Erlang). You'll need a service like forever or supervisord to startup node on server restarts, but thats the same with Apache+PHP/Perl.
(Node has a Cluster module which helps you handle multiple workers on multi-core servers.)
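A minimal sketch of that module: one worker per core, with crashed workers replaced:

    const cluster = require('cluster');
    const http = require('http');
    const os = require('os');

    if (cluster.isMaster) {
      // Fork one worker per CPU core and replace any worker that dies.
      os.cpus().forEach(() => cluster.fork());
      cluster.on('exit', () => cluster.fork());
    } else {
      http.createServer((req, res) => {
        res.end(`handled by worker ${process.pid}\n`);
      }).listen(8000);
    }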
Look at Git for managing your code, and choose the language based on what you want to be doing (high concurrency, scalability, etc.).

I will comment on the things for which I am qualified:
Connection Pooling to data sources.
The standard Node HTTP(S) client pools connections by default via http.Agent, and there are knobs you can configure to improve or control performance. The community is very active, so there are lots of other projects (like poolee) that implement either general-purpose or specialized connection pooling; you will need to look around.
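As a rough sketch of what those knobs look like on the built-in client, outbound pooling can be tuned via http.Agent (the limits shown are illustrative, not recommendations):

    const http = require('http');

    // Reuse TCP connections across requests instead of opening one per request.
    const agent = new http.Agent({
      keepAlive: true,
      maxSockets: 50,      // cap on concurrent sockets per host
      maxFreeSockets: 10   // idle sockets kept open for reuse
    });

    http.get({ host: 'example.com', path: '/', agent }, (res) => {
      console.log('status:', res.statusCode);
    });

In fact, given your traditional Webdev background...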
1.1 Sidenote: Third-party libraries
When developing with Node, you may find yourself using a lot of third-party libraries. Depending on your particular background, this may seem strange. To understand why, consider .NET or Java: the standard libraries for those platforms are gargantuan. As a result, when using those platforms you might pick very few third-party tools or plugins to help you get your work done. Node, by comparison, has an intentionally narrow and strictly controlled standard library. All the necessary APIs to unify "the way things are written" in the interest of server performance are there, but no more.
To help manage third-party libraries, the npm package manager was designed and included very early with node. It's known to be of high quality. The authors clearly anticipated a lot of third-party use.
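A project's third-party dependencies are declared in a small manifest, package.json, which npm reads; a minimal sketch (the names and versions are illustrative):

    {
      "name": "my-app",
      "version": "1.0.0",
      "dependencies": {
        "express": "^4.17.0",
        "poolee": "^1.0.0"
      }
    }

Running npm install then fetches everything listed (plus transitive dependencies) into node_modules.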
Compute Power
You mentioned image export. JavaScript libraries for all of these things exist, so as far as "it can be done easily", it can. Keep in mind that if you have a compute-heavy task, JavaScript may not be the most effective language to implement the core computation. Node allows you to write native modules in C/C++ if you need to, but forwarding the request to a specialized backend server is the thing that Node does extremely well.
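A minimal sketch of that forwarding pattern, using only the built-in http module (the backend host and port are hypothetical):

    const http = require('http');

    // Stream each incoming request to a specialized compute backend
    // (hypothetical internal host) and stream the response back.
    http.createServer((clientReq, clientRes) => {
      const upstream = http.request({
        host: 'render-backend.internal',  // hypothetical
        port: 9000,
        path: clientReq.url,
        method: clientReq.method,
        headers: clientReq.headers
      }, (upstreamRes) => {
        clientRes.writeHead(upstreamRes.statusCode, upstreamRes.headers);
        upstreamRes.pipe(clientRes);
      });
      clientReq.pipe(upstream);
    }).listen(8080);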
Operations Challenges
Node.js will not scale up to your hardware on its own. If your server has multiple cores (which by now it most certainly does), you will need to run multiple server processes on the same physical hardware to achieve high utilization. This may change the picture for operations: lots more processes than you would normally see, those processes groupable by physical or virtual machine.
Breaking your server into processes is not a bad thing for reliability, by the way: One large traditional process will crash on an unhandled exception, taking down with it eight (or whatever) cores worth of active sessions. Multiple Node processes will still crash individually, but with a proportionally smaller impact. There's probably room to explore the point, "how many processes is too many?", taking into account both performance and monitorability. It may actually be worth it to use more Node processes per server than there are available cores.


Options for updating/migrating remote client standalone db schema during updates

Scenario:
I am currently working on a desktop-based application involving multiple independent production sites, each having its own server, database and local clients.
The software undergoes continuous automated updates; as part of these updates, each production site's database (PostgreSQL) will need to undergo schema migrations (while preserving/transforming the existing data during the migration).
Accessing the client machines manually (e.g. SSH / Remote Desktop) to run scripts/perform the update is not an option. The client would need to be self-sufficient in checking, downloading and installing the updates.
Problem:
I have been looking around for a few months and have not yet found a suitable option apart from writing SQL scripts and using a separate service to run them one by one, or having the updated app run them.
Question:
What tools would you suggest would be suitable for this scenario to ease the migration process?
It is a generally accepted practice to run SQL scripts when doing database migrations. Tools like ActiveRecord (in the Ruby world) provide an intermediate layer (a DSL), but are essentially wrappers around SQL.
There is a good reason for that: a migration is something performed on the SQL server, and the only strictly defined interface for manipulating tables is the SQL (and procedural-language) flavor that the particular RDBMS server uses.
This means that if SQL scripts are an acceptable option for migrations (acceptable in the sense that the team knows SQL, nobody objects to this, etc.), you should use them.
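In that spirit, a self-sufficient updater can apply pending scripts itself. Here is a minimal sketch in Node, assuming the pg npm driver and a migrations/ directory of ordered .sql files (all names are illustrative):

    // Sketch: apply pending .sql migrations in order, recording each one.
    const fs = require('fs');
    const path = require('path');
    const { Client } = require('pg');

    async function migrate() {
      const client = new Client(); // connection settings from PG* env vars
      await client.connect();
      await client.query(
        'CREATE TABLE IF NOT EXISTS schema_migrations (version text PRIMARY KEY)');
      const done = new Set(
        (await client.query('SELECT version FROM schema_migrations'))
          .rows.map((r) => r.version));
      const files = fs.readdirSync('migrations')
        .filter((f) => f.endsWith('.sql')).sort();
      for (const file of files) {
        if (done.has(file)) continue;
        const sql = fs.readFileSync(path.join('migrations', file), 'utf8');
        await client.query('BEGIN');         // run each migration atomically
        try {
          await client.query(sql);
          await client.query(
            'INSERT INTO schema_migrations (version) VALUES ($1)', [file]);
          await client.query('COMMIT');
        } catch (err) {
          await client.query('ROLLBACK');
          throw err;
        }
      }
      await client.end();
    }

    migrate().catch((err) => { console.error(err); process.exit(1); });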
To make the task easier (the way ruby does it) you could take a look at: http://migrate4j.sourceforge.net/
They describe the tool as follows: "The initial intent of migrate4j was to make a Java version of Ruby's db:migrate." But this is not a fundamental requirement.

What would it take to implement a good redis-client in the web-browser?

This has been asked before at
Can I connect directly to a Redis server from JavaScript running in a browser?
(notice my comment)
and
Connecting directly to Redis with (client side) javascript?
but I wonder about something which would have a perfect realtime connection. Reading the source of ioredis (a node-redis client, https://github.com/luin/ioredis), I noticed that the net part of Node's standard library likely contains the kind of functionality we'd need to reproduce in the browser to do this.
Guessing maybe something hacked together from pieces of WebRTC functionality could do it?
Prospective benefits relate to building large distributed app systems infrastructure -- like social media (from comment on first question linked above):
I'm asking this question again, but stipulating we want a 'real' as in realtime redis-client -- not HTTP anything -- operating in the browser. Could build a great realtime 'infrastructure' with just CDN serving assets constituting the client webapp communicating with Redis directly. I want to cut out the unnecessary WebSocket server aspect of the system. All the control logic can be internalised to redis cluster in Lua.
To implement a direct redis-client in the web-browser you would need to change Redis itself so that it exposes a WebSocket interface. That way you get the simplest protocol a browser is allowed to use.
Other approaches will involve intermediate layers. I think it should be possible to proxy commands via ws-tcp-relay, which is pretty fast.
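As an illustration of the intermediate-layer idea, here is a minimal WebSocket-to-TCP relay sketch in Node, assuming the ws npm package; each browser socket gets its own TCP connection to Redis, and the browser code would still have to speak Redis's RESP protocol itself:

    const net = require('net');
    const { WebSocketServer } = require('ws');

    const wss = new WebSocketServer({ port: 8081 });
    wss.on('connection', (ws) => {
      const redis = net.connect(6379, '127.0.0.1'); // one TCP conn per browser
      ws.on('message', (data) => redis.write(data)); // browser -> Redis (raw RESP)
      redis.on('data', (data) => ws.send(data));     // Redis -> browser
      ws.on('close', () => redis.end());
      redis.on('error', () => ws.close());
    });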

Do I need nodejs or can I rely only on Nginx?

I have an application and I have a debate with my peers on if we need to use node.js or not.
Our decision is to use angular.js for the front-end and to communicate via a REST API with the app server. The application server will not be in node.js; it could be in .net or Java.
Nginx will be in front as it is better for serving static files, gzip etc..
There are many options to boilerplate your angular application, and many of them include nodejs. My first approach was to use node.js as the primary web server and scale it out to solve performance issues, though that wasn't a good approach, as node's role isn't to act as a web server. Which brings me to my question:
Keeping in mind the two aforementioned points, are there any reasons to generate the front-end using node.js?
Is there something I could benefit from that I haven't thought of?
Here's the short answer: If you are set on using nginx in front of a .net or Java back end, and you're just looking for a deployment tool for angular.js, then just choose whatever javascript deployment tool makes sense, which may well be built with node.
What follows is a little more exposition:
I'm not quite sure what you mean by "node's role isn't to act as a web server"... if you mean in general, then that's precisely how it's generally used; if you mean your application server will be .net or java, but not javascript, then fine.
Generally speaking, nginx will perform faster for static files, but the margin of improvement over node.js is likely to be meaningless for pretty much anyone. If you need to (or it makes sense to) include node as part of your stack for angular deployment, then it could make sense to use it as your reverse proxy and just eliminate nginx altogether. The odds that you'll get a measurable benefit from using nginx instead are vanishingly small.
That said... if you've already got nginx set up and moving to node instead represents extra work that you've already done once, then it loses its primary appeal.
What node.js has going for it more than any other project I'm aware of is that it's extremely capable at every level of the web stack. But it's not necessarily more capable than individual projects used in their appropriate level of the stack, and if you're not going to use it to reduce the complexity of your stack by homogenizing the technology and applications involved, then it just comes down to preference.
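To make the "node can fill nginx's role" point concrete, here is a minimal reverse-proxy sketch, assuming the http-proxy npm package and a backend on port 5000 (both illustrative):

    const http = require('http');
    const httpProxy = require('http-proxy');

    const proxy = httpProxy.createProxyServer({});

    // Forward every request to the .net/Java app server behind this proxy.
    http.createServer((req, res) => {
      proxy.web(req, res, { target: 'http://localhost:5000' }, (err) => {
        res.writeHead(502);
        res.end('Bad gateway');
      });
    }).listen(80);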
If you care about performance, especially for static files, you could add a caching layer as a proxy in front of your backend (nginx or a server of your choice); Varnish-Cache is a good choice.
If you want to serve static files at large scale, a better solution is to host your static files on a CDN for your live deployments. Cloud services are built for ease of use and are cost-effective; for example, fastly.com is a good choice for hosting static files, with a very persistent cache layer built on top of Varnish-Cache. CloudFront is another choice if you are a fan of Amazon services.
More resources that might help: comparison benchmarks of popular servers; another benchmark exists here as well.

Can I use node to power a web application on a separate server?

I asked this question (voted to be too broad) while working my way through a starter book on node. Reading this book, I'm sure I'll learn the answer to this later, but I'd be more comfortable if I knew this up front:
My Question: Can I (efficiently) continue using a usual webhost such as iPage or GoDaddy to host my web application, building and hosting the front end in a simple, traditional manner through an Apache web server, and communicate with a separate Node.js server (my application back-end) via AJAX for queries and other things that I can more efficiently process via Node?
More specifically, would this be a bad programming practice in terms of efficiency and organization? In other words, would it be likely that a large scale commercial application would ever be handled via this method?
Yes, you can separate the front-end of your web application and the APIs that power it. In fact, this is a very common configuration, especially for "large scale commercial applications".
Where you draw the separation line between the two specifically depends on what you are doing, how you're doing it, and what your needs are. Also, in terms of hosting, remember that if you're accessing something server-side across the internet, you're adding extra latency to everything. Consider getting off Go Daddy and using another host that gives you more flexibility, such as a VPS provider.
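One practical wrinkle with that split: the browser treats your Node API as a different origin than the Apache-hosted pages, so the Node side must send CORS headers. A minimal sketch (the origin is illustrative):

    const http = require('http');

    http.createServer((req, res) => {
      // Allow the Apache-hosted front end (illustrative origin) to call this API.
      res.setHeader('Access-Control-Allow-Origin', 'https://www.example.com');
      res.setHeader('Access-Control-Allow-Methods', 'GET, POST');
      res.setHeader('Access-Control-Allow-Headers', 'Content-Type');
      if (req.method === 'OPTIONS') {  // preflight request
        res.writeHead(204);
        return res.end();
      }
      res.setHeader('Content-Type', 'application/json');
      res.end(JSON.stringify({ ok: true }));
    }).listen(3000);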
It's ok. Actually, this is how things should be done: you have a backend API on a separate server and lots of apps which use the API. Just go with the Nginx server; check this Apache vs Nginx comparison.
Yes, you can use node js as a part of some big application. It depends on which type of interaction you would like to get, and whether you are comfortable mixing technologies; if so, node is a pretty good thing for working over the web. I've finished a part of a big nodejs-ruby-php-c-flash application (my part was nodejs) handling very large amounts of data. This application has different levels of interaction. Sometimes I use two languages at one time to build each part of my application in the way best suited to the task I'm working on. There are applications that initiate, run, and destroy multiple OS instances. So using a multi-environment application is not so hard.

Using NodeJS for a big project [closed]

Is NodeJS a good framework/codebase for a large server-side application? What I am looking to develop is a large application that will require HTTP transactions (states) and large numbers of concurrent users.
From what I've read online, NodeJS is not the best tool to use when it comes to large programs. What I've come across is as follows:
NodeJS runs on JavaScript which runs on event loops which are not very efficient when used in bulk.
NodeJS may be non-blocking, but all the requests are handled within a single thread so this can cause a bit of a bottleneck when many requests are handled.
NodeJS is built atop its own HTTP server so future maintenance will require its own sysadmin/developer hybrid to take care of the application.
There isn't as much well-tested and diverse software available for NodeJS that helps you build a bigger application.
Is there something I'm missing? Is NodeJS really as powerful as it can be?
Is NodeJS a good framework/codebase for a large server-side application?
That question is a bit subjective but I'm including actual objective points which solve real problems when working with node in a large project.
Update after working on the project for a while:
It is best as a front-end / API server which is I/O bound (most front-end/API servers are). If you have backend compute needs (processing etc...), it can be paired with other technologies (C# .NET Core, Go, Java, etc. worker nodes).
I created this project as a sample illustrating most points - Sane Node Development:
https://github.com/bryanmacfarlane/sanenode
NodeJS is not built atop its own http server. It's built atop the V8 Chrome JavaScript engine and doesn't assume an http server. There is a built-in http module as well as the popular express web server, but there are also socket modules (as well as socket.io). It's not just an http server.
The single thread does not cause a bottleneck because all I/O is evented and asynchronous. This link explains it well: http://blog.mixu.net/2011/02/01/understanding-the-node-js-event-loop/
As far as software modules go, you can search the npm registry. Always look at how many other folks use a module (downloads) and visit the GitHub repo to see if it's actively maintained (or whether there are a bunch of issues never getting attention).
Regarding "large project" what I've found critical for sane development is:
Compile time support (and intellisense): Find issues when you compile. If you didn't think you needed this like I did when I started, you will change your mind after that first big refactor.
Eliminate Callback Hell: Keep the performance, which is critical (noted above), but eliminate callback code. Use async / await to write linear code while keeping async perf; it integrates with promises but is much better than solely using promises (see the sketch after this list).
Tooling: Lots of options, but I've found the best to be TypeScript (ES6/7 today), VS Code (intellisense), and Mocha (unit testing).
Instrumentation/Logging: Getting insights into your app with tracing and instrumentation is critical.
Build on well vetted frameworks: I use express as an example but that's a preference and there's others.
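A small sketch of the callback-elimination point, using only the built-in fs module:

    const fs = require('fs').promises;

    // Linear-looking code; each await yields to the event loop, so other
    // requests are still served while the file reads are in flight.
    async function concatFiles() {
      const a = await fs.readFile('a.txt', 'utf8');
      const b = await fs.readFile('b.txt', 'utf8');
      return a + b;
    }

    concatFiles().then(console.log).catch(console.error);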
Node.js is a very good tool for building distributed network services. How to design your large-scale application is more than a question of 'which tools to use'. A lot of people use node.js in a very heterogeneous way, together with ruby, php, erlang, apache & nginx & HAproxy. If you don't know why you need node, you probably don't need it. Possible reasons to consider node:
you want to share common Javascript code between server and client
you expect highly concurrent load (thousands to hundreds of thousands simultaneous connections per server)
you (or your team) are much more proficient with JavaScript than with any other available language/framework
one of the 7600+ modules implements a large portion of the required functionality
One "issue" with NodeJS, is that building a large application requires discipline on the part of the developer / team.
This is particularly true with multiple teams within the same company. Existing frameworks are a bit loose, and different teams will come up with different approaches to solving an issue.
KrakenJS is a framework, built on top of express. It adds a layer of convention and configuration that should make it easy(er) to build large projects, involving multiple teams.
Really, NodeJs is powerful in its own way. Some more information:
You can run multiple instances of your app under a load balancer to handle massive request volume.
Choose NodeJs to read 2000 files rather than to calculate the 20th prime number.
Keep NodeJs busy with reading/writing to files or ports.
It is very useful when you need to broadcast your response to multiple clients.
Don't worry about deadlocks in NodeJs, but do care about how frequently you perform the same operation.
Most importantly, values live in the V8 engine until the process is terminated, so be mindful of how much code and data you feed into NodeJs.
I find the most important thing is to use as little CPU time as possible. If your application uses the CPU intensively, event-loop latency will increase and the application will fail to respond to other requests.
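A crude way to watch for that, as a sketch (the 100 ms interval and 50 ms threshold are arbitrary):

    // A timer that should fire every 100 ms; sustained drift beyond the
    // threshold means something is hogging the CPU on the event loop.
    let last = Date.now();
    setInterval(() => {
      const lag = Date.now() - last - 100;
      if (lag > 50) console.warn(`event loop lagging by ~${lag} ms`);
      last = Date.now();
    }, 100);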
As far as I have learned, it's the raw speed and async/await that make the biggest difference.
For those who are moderately skilled in Javascript and particularly understand REST as well as the above mentioned aspects, node.js is one of the best solutions for big enterprise applications.
Correct me if I am wrong, but even using raw express (without those frameworks built on top of it) is actually good enough if teams agree on certain conventions and standards to follow.
