I would like to deploy a nodejs project with frequent updates. npm is not available at the site so I must package the node_modules. This works ok but takes a long time to send to the customer over the available ftp connection (80MB of mostly node_module files every time). My workflow looks like this:
git clone project
npm install # installs all my dev tools which I need for packaging
grunt build
tar cvzf build.tar.gz build/
The build step minifies my code, packaging only what is needed. The node_modules folder is copied into the build folder. If I use npm install --production, I get a smaller footprint but miss the tools I need to build it in the first place. So in the end I go to some effort to make my code footprint small, but all my work is undone by having to package such a large node_modules tree.
Is my approach wrong? Is there a simpler way to deploy where npm is not available on the production server or is there a good way to reduce the size of the node_modules folder?
Update: Since writing this answer, npm3 (and yarn) arrived, and
flattened npm dependencies. This reduces the size of the
node_modules folder considerably (perhaps 20% - 30% for a
typical project). Nevertheless, some of the tips below can still
reduce your footprint by an order of magnitude.
I have compiled the list of findings for anyone wanting to
deploy without npm on the server or
reduce the footprint of the node_modules folder
Smaller node_modules footprint:
Use npm prune --production to remove devDependencies and purge additional modules
In my case this reduced the node_modules folder size by about 20%.
The bulk of large files under node_modules folders is confined to a small number of modules that are unused at runtime. Purging/deleting these (e.g. karma, bower, less and grunt) reduces the footprint by a factor of 10. Many of these are used by the modules themselves and have no place in a
production build. The drawback is that npm install has to be run before each build.
Use partial npm packages
Many npm packages are available in parts. For example, instead of
installing all of async or lodash, install only the bits you
need, e.g.
Bad: npm install --save lodash async
Good: npm install --save async.waterfall async.parallel lodash.foreach
Typically, individual lodash modules are 1/100th the size of the full package.
npm-package-minifier may be used to reduce the size of the node_modules tree
Compacting node_modules for client-side deployment
This basically deletes a lot of unused files in the
node_modules tree. This tool will reduce the size of
devDependencies also so it should be run on a 'production'
version of node_modules.
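To find out which modules are worth purging, it helps to measure them first. The following is a minimal Node.js sketch, not part of any of the tools named above: the helper names dirSize and largestModules are made up for illustration, and scoped folders such as @scope are simply reported as a single entry.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively sum the size in bytes of every file below `dir`.
// lstatSync is used so symlinks are measured as links, not followed.
function dirSize(dir) {
  let total = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      total += dirSize(full);
    } else {
      total += fs.lstatSync(full).size;
    }
  }
  return total;
}

// Return the `count` largest top-level folders of `nodeModulesDir`,
// biggest first (hypothetical helper, for illustration only).
function largestModules(nodeModulesDir, count = 10) {
  return fs.readdirSync(nodeModulesDir, { withFileTypes: true })
    .filter((entry) => entry.isDirectory())
    .map((entry) => ({
      name: entry.name,
      bytes: dirSize(path.join(nodeModulesDir, entry.name)),
    }))
    .sort((a, b) => b.bytes - a.bytes)
    .slice(0, count);
}

// Example: print the ten heaviest modules of the current project, if any.
if (fs.existsSync('node_modules')) {
  console.table(largestModules('node_modules'));
}
```

Running it before and after npm prune --production gives a quick picture of where the remaining bulk sits.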
Reducing size of updates
Differential deployment
As mentioned in the comments, updates may be split into updates where dependency changes are required or only business logic changes. I have tried this approach and it greatly reduces the footprint for most updates. However, it also increases the complexity of deployment.
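One way to decide which kind of update to ship is to fingerprint the declared runtime dependencies and compare that with the value recorded at the previous deployment. This is only a sketch under assumptions: the function names and the idea of storing the previous fingerprint somewhere are hypothetical, and it looks at the version ranges in package.json rather than at the versions actually installed.

```javascript
const crypto = require('crypto');

// Build a stable fingerprint of the runtime dependencies declared in a
// parsed package.json. Keys are sorted so that reordering entries does not
// change the result. devDependencies are ignored on purpose, because they
// are not shipped to production.
function dependencyFingerprint(pkg) {
  const deps = pkg.dependencies || {};
  const sorted = Object.keys(deps).sort().map((name) => [name, deps[name]]);
  return crypto.createHash('sha256').update(JSON.stringify(sorted)).digest('hex');
}

// Decide which kind of update to ship, given the fingerprint recorded at
// the previous deployment (hypothetical bookkeeping, e.g. a small text file).
function updateKind(previousFingerprint, pkg) {
  return dependencyFingerprint(pkg) === previousFingerprint
    ? 'code-only' // send just the build output
    : 'full';     // send the build output plus node_modules
}
```

A build script could call updateKind with the stored fingerprint and only add node_modules to the archive when the result is 'full'.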
Related
What are the differences between Yarn and NPM?
At the time of writing this question I can only find some articles on the Internet showing what the Yarn equivalent of an NPM command is, like this.
Do they have the same functionality? I know Yarn does local caching and it looks like you only need to download a package once, but other than this, are there any benefits to moving from NPM to Yarn?
UPDATE: March 2018 (bit late...)
Since version 5, npm
generates a 'lockfile' called package-lock.json that fixes your entire dependency tree much the same way the yarn (or any other) locking mechanism does,
A tool has been made
--save is now implied for npm i
Better network and cache usage
npm 5.7.0 further introduced the npm ci command to install dependencies more quickly in a continuous integration environment by only installing packages found in the package-lock.json (reporting an error if the package-lock.json and package.json are not synchronized).
Personally, I still use npm.
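To make the "not synchronized" case mentioned above more concrete, here is a rough sketch of a presence check between the two files. It is not npm's implementation: npm ci also verifies that the locked versions satisfy the declared ranges, which this deliberately skips. The function name missingFromLock is made up for illustration.

```javascript
// Return the names of packages that package.json declares but that have no
// entry in the lock file. `pkg` and `lock` are the parsed JSON objects.
function missingFromLock(pkg, lock) {
  const declared = Object.keys({
    ...(pkg.dependencies || {}),
    ...(pkg.devDependencies || {}),
  });
  return declared.filter((name) => {
    // lockfileVersion 2 and 3 key their entries by install path.
    const inPackages = Boolean(lock.packages && lock.packages['node_modules/' + name]);
    // lockfileVersion 1 (and 2, for compatibility) key their entries by name.
    const inDependencies = Boolean(lock.dependencies && lock.dependencies[name]);
    return !inPackages && !inDependencies;
  });
}
```

An empty result means every declared package has at least some entry in the lock file; a non-empty one names the packages that were added to package.json without updating the lock.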
Original
I am loath to quote directly from docs, but they do a great job of explaining why, concisely enough that I don't see how to further summarize the ideas.
Largely:
You always know you're getting the same thing on every development
machine
It parallelizes operations that npm does not, and
It makes more efficient use of the network.
It may make more efficient use of other system resources (such as RAM) as well.
What are people's production experiences with it? Who knows, it's an infant to the general public.
TL;DR from Yehuda Katz:
From the get-go, the Yarn lockfile guarantees that repeatedly running
yarn on the same repository results in the same packages.
Second, Yarn attempts to have good performance, with a cold cache, but
especially with a warm cache.
Finally, Yarn makes security a core value.
Nice blog post
“NPM vs Yarn Cheat Sheet” by Gant Laborde
Slightly longer version from the project:
Fast: Yarn caches every package it downloads so it never needs to
again. It also parallelizes operations to maximize resource
utilization so install times are faster than ever.
Reliable: Using a detailed, but concise, lockfile format, and a
deterministic algorithm for installs, Yarn is able to guarantee that
an install that worked on one system will work exactly the same way on
any other system.
Secure: Yarn uses checksums to verify the integrity of every installed
package before its code is executed.
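The checksum idea in the "Secure" point can be sketched with Node's built-in crypto module. This is not Yarn's code; it only illustrates the general technique of recording a digest once and comparing it on later installs. The "sha512-<base64>" string format and the function names are assumptions chosen for the example.

```javascript
const crypto = require('crypto');

// Compute an integrity string ("sha512-<base64 digest>") for some data.
function integrityOf(data) {
  return 'sha512-' + crypto.createHash('sha512').update(data).digest('base64');
}

// True when `data` still matches the checksum recorded earlier.
function verifyIntegrity(data, expected) {
  return integrityOf(data) === expected;
}

const tarball = Buffer.from('pretend this is a package tarball');
const recorded = integrityOf(tarball); // what a lock file would store

console.log(verifyIntegrity(tarball, recorded));                 // true
console.log(verifyIntegrity(Buffer.from('tampered'), recorded)); // false
```

The point of the check is that a package whose bytes changed after the checksum was recorded is rejected before any of its code runs.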
And from the README.md:
Offline Mode: If you've installed a package before, you can install it again without any internet connection.
Deterministic: The same dependencies will be installed the same exact way across every machine regardless of install order.
Network Performance: Yarn efficiently queues up requests and avoids request waterfalls in order to maximize network utilization.
Multiple Registries: Install any package from either npm or Bower and keep your package workflow the same.
Network Resilience: A single request failing won't cause an install to fail. Requests are retried upon failure.
Flat Mode: Resolve mismatching versions of dependencies to a single version to avoid creating duplicates.
More emojis. 🐈
What is PNPM?
pnpm uses hard links and symlinks to save one version of a module only ever once on a disk. When using npm or Yarn for example, if you have 100 projects using the same version of lodash, you will have 100 copies of lodash on disk. With pnpm, lodash will be saved in a single place on the disk and a hard link will put it into the node_modules where it should be installed.
As a result, you save gigabytes of space on your disk and you have a lot faster installations! If you'd like more details about the unique node_modules structure that pnpm creates and why it works fine with the Node.js ecosystem, read this small article: Why should we use pnpm?
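The hard-link mechanism itself is easy to try with Node's fs module. The sketch below is not how pnpm is implemented (its real store layout is more involved); it only shows, in a throwaway temp directory and with made-up file and folder names, that several hard links share one copy of the data. It assumes a file system that supports hard links.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create one "store" copy of a file and hard-link it into several project
// folders. Returns the stat info of the store file and of each link.
function linkFromStore(projectNames) {
  const root = fs.mkdtempSync(path.join(os.tmpdir(), 'store-demo-'));
  const storeFile = path.join(root, 'store', 'lodash.js');
  fs.mkdirSync(path.dirname(storeFile), { recursive: true });
  fs.writeFileSync(storeFile, 'module.exports = {};\n');

  const links = projectNames.map((name) => {
    const target = path.join(root, name, 'node_modules', 'lodash.js');
    fs.mkdirSync(path.dirname(target), { recursive: true });
    fs.linkSync(storeFile, target); // hard link: a second name for the same data
    return fs.statSync(target);
  });
  return { store: fs.statSync(storeFile), links };
}

const { store, links } = linkFromStore(['project-a', 'project-b']);
// All names share one inode, so the content exists only once on disk.
console.log(links.every((s) => s.ino === store.ino));
// nlink counts the names pointing at that inode: the store file plus each link.
console.log(store.nlink);
```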
How to install PNPM?
npm install -g pnpm
How to install npm package using PNPM?
pnpm install -g typescript # or your desired package
Benefits of PNPM over Yarn and NPM
Here is a progress-bar chart showing the installation time taken by NPM, YARN and PNPM (a shorter bar is better)
Click for the complete benchmark
for more details, visit https://www.npmjs.com/package/pnpm
Trying to give a better overview for beginners.
npm has historically (since 2010) been the most popular package manager for JavaScript. If you want to use it for managing the dependencies of your project, you can type the following command:
npm init
This will generate a package.json file. It contains all the dependencies of the project.
Then
npm install
would create a directory node_modules and download the dependencies (that you added to the package.json file) inside it.
It will also create a package-lock.json file. This file is used to describe the tree of dependencies that was generated. It allows developers to install exactly the same dependencies. For example, you could imagine one developer upgrading a dependency to v2 and then v3 while another one upgrades directly to v3.
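One common reason two installs can differ without a lock file is that package.json usually stores version ranges rather than exact versions. The sketch below is a simplified illustration, not npm's resolver: it only handles caret ranges with a major version of 1 or more, ignores pre-release tags, and the published version lists are invented for the example.

```javascript
// Parse "1.2.3" into [1, 2, 3]. Pre-release tags are not handled.
function parseVersion(version) {
  return version.split('.').map(Number);
}

// Compare two parsed versions: negative, zero or positive, like a sort callback.
function compareVersions(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Simplified caret matching for ranges with a major version of 1 or more:
// "^1.2.0" accepts any 1.x.y that is at least 1.2.0.
function satisfiesCaret(version, range) {
  const wanted = parseVersion(range.slice(1));
  const actual = parseVersion(version);
  return actual[0] === wanted[0] && compareVersions(actual, wanted) >= 0;
}

// Pick the highest published version allowed by the range.
function resolve(range, published) {
  return published
    .filter((v) => satisfiesCaret(v, range))
    .sort((a, b) => compareVersions(parseVersion(a), parseVersion(b)))
    .pop();
}

// One developer installs early, another after 1.4.0 was published.
console.log(resolve('^1.2.0', ['1.2.0', '1.3.1', '2.0.0']));          // 1.3.1
console.log(resolve('^1.2.0', ['1.2.0', '1.3.1', '1.4.0', '2.0.0'])); // 1.4.0
```

A lock file removes this drift by recording the exact version that was resolved, so later installs reuse it instead of resolving the range again.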
npm installs dependencies in a non-deterministic way, meaning the two developers could end up with different node_modules directories, resulting in different behaviours. npm has also suffered from a bad reputation. For example,
in February 2018 an issue was discovered in version 5.7.0 in which running sudo npm on Linux systems would change the ownership of system files, permanently breaking the operating system.
To resolve those problems and others, Facebook introduced a new package manager in 2016: Yarn, a faster, more secure, and more reliable package manager for JavaScript.
You can add Yarn to a project by typing:
yarn init
This will create a package.json file. Then, install the dependencies with:
yarn install
A folder node_modules will be generated. Yarn will also generate a file called yarn.lock. This file serves the same purpose as package-lock.json but is instead constructed using a deterministic and reliable algorithm, thus leading to consistent builds.
If you started a project with npm, you can actually migrate to Yarn easily. Yarn will consume the same package.json. See Migrating from npm for more details.
However, npm has been improved with each new release and some projects still use npm over Yarn.
The answer by @msanford covers almost everything; however, I'm missing the security (OWASP's Known Vulnerabilities) part.
Yarn
You can check them using yarn audit; however, you cannot fix them. This is still an open issue on GitHub (https://github.com/yarnpkg/yarn/issues/7075).
npm
You can use npm audit fix, so some of them you can fix by yourself.
Both of them, i.e. npm audit & yarn audit have their own Continuous Integration tools. These are respectively https://github.com/IBM/audit-ci (used, works great!) and https://yarnpkg.com/package/audit-ci (haven't used).
npm:
The package manager for JavaScript. npm is the command-line
interface to the npm ecosystem. It is battle-tested, surprisingly
flexible, and used by hundreds of thousands of JavaScript developers
every day.
NPM generates a correct lock file whereas a Yarn lock file could be
corrupt in some cases and has to be fixed with yarn-tools
Yarn:
A new package manager for JavaScript. Yarn caches every package it
downloads so it never needs to again. It also parallelizes
operations to maximize resource utilization so install times are
faster than ever.
Yarn doesn't support login with a password (while NPM does)
When you install a package using Yarn (using yarn add packagename), it places the package on your disk. During the next install, this package will be used instead of sending an HTTP request to get the tarball from the registry.
Yarn comes with a handy license checker, which can become really powerful in case you have to check the licenses of all the modules you depend on.
If you are working on proprietary software, it does not really matter which one you use. With npm, you can use npm-shrinkwrap.json, while you can use yarn.lock with Yarn.
For more information please read the following blog
https://blog.risingstack.com/yarn-vs-npm-node-js-package-managers/
Yarn
Advantages:
Supports features like parallel installation and Zero-Install, which result in better performance
More secure
Large active user community
Disadvantages:
Doesn’t work with older versions of Node.js (lower than version 5)
Problems with installing native modules
NPM
Advantages:
Ease of use, especially for developers working with older versions.
Optimized local package installation to save hard drive space.
Disadvantages:
Security vulnerabilities are still there
Conclusion:
Is Yarn better than NPM?
In terms of speed and performance, Yarn is better than NPM because it installs packages in parallel. Yarn is still more secure than NPM. However, Yarn uses more disk space than NPM.
I inherited maintenance of an NPM package. It is a little unusual in that its main file is in dist/; it’s built with webpack (via npm run build).
This is fine for our purposes, but when we install this package into a consuming application, we get just oodles and oodles of dependencies. It’s adding minutes to the consumer’s npm install time, and all for nothing, as the main is already built.
I’m pretty sure we’re “doing it wrong.” Is there a better way to distribute an npm package that delivers a pre-built js file such that dependencies aren’t needlessly passed on to users?
@mscdex nailed it. I wasn’t aware that npm already behaves the way I wanted: npm install will install devDependencies by default, but only your direct, top-level devDependencies, not recursively. So your dependencies' devDependencies will not be installed.
Good to know.
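The rule can be expressed as a tiny model. This is only an illustration of the behaviour described above, not npm's code; the function name and the example package (including its webpack version range) are hypothetical.

```javascript
// List the packages an install would fetch for one package.json.
// `isTopLevel` is true only for the project where `npm install` is run.
function packagesToInstall(pkg, isTopLevel) {
  const names = Object.keys(pkg.dependencies || {});
  if (isTopLevel) {
    names.push(...Object.keys(pkg.devDependencies || {}));
  }
  return names;
}

// Hypothetical pre-built library: webpack is only needed to produce dist/.
const library = {
  name: 'my-prebuilt-lib',
  main: 'dist/index.js',
  dependencies: {},
  devDependencies: { webpack: '^4.0.0' },
};

console.log(packagesToInstall(library, true));  // [ 'webpack' ]
console.log(packagesToInstall(library, false)); // []
```

In other words, as long as the build tooling is listed under devDependencies rather than dependencies, a consuming application does not pull it in.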
When installing anything via npm, it downloads dozens of unneeded files. Usually I am looking for a library's final build, a *.min.js file or anything like that, but the rest is useless.
How do you handle all these useless files? Do you remove them by hand or generate the final app with any build tool like gulp or grunt?
I'm quite confused as I have plenty of npm modules installed in my webapp and the folder size is about 50 megabytes but it could be 2mb only.
npm install --production
Just doing an npm install brings in both development and runtime dependencies. You could also set the ENV to production globally for the server: npm config set production.
See this github issue. Note that this won't get you only the final minified build of everything, but will greatly reduce the bloat. For instance, a library might rely on babel-cli, babel-preset-es2015, and uglifyjs to be built (devDependency), but you don't need any of that if it also includes the transpiled minified file.
Managing Packages
For front end non-development packages I prefer Bower. It maintains the minified and non-minified version of your packages.
Build Tool
Use either Gulp or Grunt. Gulp would be my tool of choice.
Gulp tasks that will greatly improve your code are:
minification of both css and js
optimization/compression of images
concatenation and caching to reduce the number of calls to the server
package versioning
automatic injection of project dependencies
automatic injection of external dependencies
static analysis of js and css
automatic builds on code changes
deployment
testing
Node
If you can, leave to node all your development tools and leave to bower all your release plugins. Most node packages that are used in released apps have a bower installation counterpart.
Edit
Don't delete anything from node_modules manually, as you don't know which packages have other packages as dependencies. If you are afraid that you may have junk in there, use rimraf (available from npm) to delete the node_modules folder, and then run npm install. Most importantly, check your package.json for unnecessary saved packages.
Whenever I make projects, I have to download all the dependencies into node_modules. Without copying the node_modules folder, is there any way to share a central node_modules across multiple projects?
As in the following, I have to run many commands every time:
npm install gulp-usemin
npm install gulp-wrap
npm install gulp-connect
npm install gulp-watch
npm install gulp-minify-css
npm install gulp-uglify
npm install gulp-concat
npm install gulp-less
npm install gulp-rename
npm install gulp-minify-html
You absolutely can share a node_modules directory amongst projects.
From node's documentation:
If the module identifier passed to require() is not a native module,
and does not begin with '/', '../', or './', then node starts at the
parent directory of the current module, and adds /node_modules, and
attempts to load the module from that location.
If it is not found there, then it moves to the parent directory, and
so on, until the root of the file system is reached.
For example, if the file at '/home/ry/projects/foo.js' called
require('bar.js'), then node would look in the following locations, in
this order:
/home/ry/projects/node_modules/bar.js
/home/ry/node_modules/bar.js
/home/node_modules/bar.js
/node_modules/bar.js
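The lookup order quoted above can be reproduced with a few lines of Node.js. This is a simplified sketch rather than Node's actual resolver (for instance, it does not skip parent folders that are themselves named node_modules), and the function name is made up. POSIX-style paths are used so the result is the same on every platform.

```javascript
const path = require('path');

// List the node_modules folders that would be searched, nearest first,
// when a file at `fromFile` requires a bare module name.
function nodeModulesPaths(fromFile) {
  const candidates = [];
  let dir = path.posix.dirname(fromFile);
  while (true) {
    candidates.push(path.posix.join(dir, 'node_modules'));
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached the root of the file system
    dir = parent;
  }
  return candidates;
}

// Lists the four folders from the quoted example, nearest first.
console.log(nodeModulesPaths('/home/ry/projects/foo.js'));
```

Inside a running module, the built-in module.paths array holds the real list that Node computed for that file, which is a handy way to check where a shared node_modules folder needs to live.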
So just put a node_modules folder inside your projects directory and put in whatever modules you want. Just require them like normal. When node doesn't find a node_modules directory in your project folder, it will check the parent folder automatically. So make your directory structure like this:
-myProjects
--node_modules
--myproject1
---sub-project
--myproject2
So like this, even your sub-project's dependencies can draw on your main node_modules repository.
One drawback to doing it this way is that you will have to build out your package.json file manually (unless someone knows a way to automate this with grunt or something). When you install your packages and add the --save arg to an npm install command, it automatically appends them to the dependencies section of your package.json, which is convenient.
Try pnpm instead of npm.
pnpm uses hard links and symlinks to save one version of a module only ever once on a disk.
If you have npm installed, you can install in your terminal with:
npm install -g pnpm
To update your existing installations (and sub-directories) use:
pnpm recursive install
Or use the shorthand command (leave off -r if you need to target only one directory)
pnpm -r i
One helpful note: You may find some rare packages don't have all their dependencies defined. They might rely on the flat node_modules file directory structure of npm or yarn installs. If you run into issues of missing dependencies, use this command to hoist all the sub dependencies into a flat-file structure:
pnpm install --shamefully-hoist
It's best to avoid using the --shamefully-hoist flag as it defeats the purpose of using pnpm in the first place, so try using the command pnpm i your-missing-package first (See pnpm FAQ).
I found a trick: take a look at symbolic links (symlinks) on Windows or Linux. They work just like shortcuts but are more powerful.
You simply need to make a junction for your node_modules folder anywhere you want. The junction is nothing but a shortcut to your original node_modules folder. Create it inside your project folder, where the actual node_modules would have been created if you had used npm install.
To achieve this you need at least one real node_modules folder; then make a junction to it in the other projects.
On Windows, you can either use the Command Prompt or use an application. Using the Command Prompt gives you a bit more control, while using an application is easier. I suggest Link Shell Extension.
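The same link can also be created from a small Node.js script instead of the Command Prompt or a separate application. The sketch below runs in a throwaway temp directory; the helper name and folder names are placeholders. fs.symlinkSync accepts 'junction' as its type argument, which only has an effect on Windows and is ignored on other platforms, where an ordinary symbolic link is created.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Make `<projectDir>/node_modules` point at a shared node_modules folder.
function linkSharedModules(sharedModulesDir, projectDir) {
  const linkPath = path.join(projectDir, 'node_modules');
  fs.symlinkSync(path.resolve(sharedModulesDir), linkPath, 'junction');
  return linkPath;
}

// Demo in a throwaway directory (folder names are placeholders).
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'shared-nm-'));
const shared = path.join(root, 'node_modules');
const project = path.join(root, 'Project 1');
fs.mkdirSync(shared);
fs.mkdirSync(project);

const link = linkSharedModules(shared, project);
// Both paths resolve to the same real folder.
console.log(fs.realpathSync(link) === fs.realpathSync(shared));
```

Note that the call fails if a node_modules entry already exists at the link location, so an existing folder has to be removed first.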
Main directory should look like this
node_modules
Project 1
Project 2
Project 3
Project 4
just open the file Project 1/.angular-cli.json
change the schema
"$schema": "./node_modules/@angular/cli/lib/config/schema.json",
to
"$schema": "./../node_modules/@angular/cli/lib/config/schema.json"
and don't forget to create node_modules empty folder inside your project directory
See also npm v7.0.0's support for workspaces
RFC
https://github.com/npm/rfcs/blob/latest/implemented/0026-workspaces.md
Documentation
https://docs.npmjs.com/cli/v7/using-npm/workspaces
By looking at some articles it seems that Lerna
is a good tool for managing multiple projects inside a single directory (monorepo). It supports modules sharing without duplicating the entire packages in every folder and commands to install them in multiple projects.
Javascript monorepos
Monorepos by example
Building large scale apps in a monorepo
pnpm is also a simple and efficient tool, which doesn't duplicate those modules which are already installed for other projects.
Let's assume you have a single node_modules folder that contains all the packages for all applications. Your apps will then also share most of the package.json entries (only the name should change).
my idea would be to have a single root and multiple src level as below
root\package.json
root\node_modules
root\\..
root\app1\src\\..
root\app2\src\\..
The only issue you might face is having to keep a backup of the json (or tsconfig) files for each app and restore them when you work on it, or having to set up your startup scripts to serve any app.
I created a small test program for web applications that uses jasmine, and I'm preparing it for easy downloads. Before installing my package, the user's project should look something like this:
myProject/
app/
lib/
...
I want to be able to have the user cd to myProject in the terminal, issue a single command that points to the app and lib folders, and then end up with this:
myProject/
app/
lib/
requirejs
test/
lib/
node_modules/
specs/
SpecRunner.html
server.js
...
app/ should contain the js project files, lib/ should contain all the external js dependencies for the project, and test/lib/ should contain all the external dependencies for the tests. server.js runs with nodejs and depends on apps installed in node_modules/.
What's the best way to go about doing this? I could make a bash script, but I'd rather use a package manager. I'm not sure how I'd do this in bower or npm. And am I right in thinking it's better to have two libs, one for the project and one for testing, rather than one? I know I can declare certain packages as test packages in bower, but it seems like they should live in a separate libraries.
And am I right in thinking it's better to have two libs, one for the project and one for testing, rather than one?
No. The idiomatic way in the npm-verse is to have tests in the same package in the test folder. Since bower is based on npm I'd say the same applies there too. If you don't want bower users to have to download test-stuff you should be able to ignore the test folder in the bower.json file (according to this answer). You should also specify node modules that are only used for tests as devDependencies.
Developers who want to run your tests should IMO install it directly from source using e.g. git clone git@github.com:your/repo.git (and then just run npm install). Or simply npm install x if it's available on npm. Even if you really want the tests in their own package, I'd still suggest not using a package manager but asking the developer to clone it from the repo into the test folder.
Anyway to answer the question, the following one-liner should work (assuming npm, I'm not too familiar with bower):
npm install x-test && mv node_modules/x-test test