How does Node.JS handle duplicate transitive dependencies?

How does Node.JS handle duplicate transitive dependencies? - javascript

I apologize if my questions are naïve. In full disclosure I'm relatively new to Node.JS and JavaScript in general. I'm hoping someone could shed some light on how Node.JS handles duplicate, possibly transitive, dependencies? Not even in terms of the global namespace or any kind of conflicts, or different versions of the same module (e.g. v0.1 vs v0.2 elsewhere in your app), but more around being smart and efficient where possible. For example:
Is there any chance that Node is smart enough in terms of footprint to not have multiple copies of the same exact version of the library in your modules folder? Something like 1 copy for each required version with symbolic links or something similar pointing to this code for each module that depends on that version of that module?
What about in terms of loading duplicate modules into memory at runtime? If v0.1 of module x is already loaded into memory, if some other depended upon module comes along that requires that same version of that module, will the code be re-loaded into memory, or is Node smart enough to see that code is already loaded and re-use it? How sandboxed is Node in this regard?
Thanks!

Node.js has no concept of versions. The require() function resolves its argument to a full path to a .js file, and caches them by filename.
You may be asking how npm installs modules; that depends on the order in which you install them.
You can run npm dedup to do nice things here.

Related

Any ideas on getting away from lengthy relative imports in cloud functions? Written in node.js and javascript

I was wondering if anyone here had some ideas or experience getting rid of long, hard-to-maintain relative imports inside of large node.js cloud functions projects. We’ve found that the approach which uses local NPM packages is very sub-optimal because of how quickly we tend to roll out and test new packages and functionality, and refactoring from JS to TS is impossible for us at the moment. We'd love to do it in the near future but are so slammed as it is currently :(
Basically what I’m trying to do in cloud functions is go from const {helperFunction} = require(‘../../../../../../helpers’)
to
const {helperFunction} = require(‘helpers’)
I have been unable to get babel or anything similar to that working in cloud functions. Intuitively I feel like there is an obvious solution to this beyond something like artifact registry or local NPM packages but i’m not seeing it! Any help would be greatly, greatly appreciated :)

For CommonJS modules in general, you have several options. I don't know the Google Cloud environment so you will have to decide which options seem appropriate to it.
The most attractive option to me is #5 as it's just a built-in search up the directory tree, designed specifically to look in the node_modules sub-directories of your parent directories. If you can install the shared modules in that way, then nodejs should be able to find them as long as your directory hierarchy is retained by Google Cloud.
Options #3 and #4 are hacking on the loader which has its own risks, but does give you imlementation flexibility as you could implement your own prefix that looks in a particular spot. But, it's hacking and may or may not work in the Google cloud environment.
Options #1 and #2 rely on environment variables and shared directories which may or may not be relevant in the Google cloud environment.
You can specify the environment variable NODE_PATH as a colon-delimited (or semi-colon on Windows) list of paths to search for modules. Doc here.
In addition, nodejs will search $HOME/.node_modules and $HOME/.node_libraries where $HOME is the user's home directory.
There is a package called module-alias here that is designed specifically to help you solve this. You define module aliases in your package.json, import this one module and then you can use the directory aliases in your require() statements.
You can make your own pre-processor for resolving a module filename that is being loaded by require. You cando this by monkey patching Module._resolveFilename to either modify the filename passed to it or to add additional search paths to the options argument. This is the general concept that the module-alias package (mentioned in point #3 above) uses.
If the actual location of the helper module you want to load is in a node_modules directory somewhere above your current module directory on this volume, it can be found automatically as long as the require is just a filename as in require("helpers"). An example in the doc here describes this:
For example, if the file at /home/ry/projects/foo.js called require('bar.js'), then Node.js would look in the following locations, in this order:
/home/ry/projects/node_modules/bar.js
/home/ry/node_modules/bar.js
/home/node_modules/bar.js
/node_modules/bar.js
You can see how it automatically searches up the directory tree looking in each parent node_modules sub-directory all the way up to the root. If you put these common, shared modules in your main project file's node_modules or in any directory above, then it will be found automatically. This might be the simplest way to do things as it's just a directory structuring and removing of all the ../../ stuff you have in the paths. Clearly these common, shared modules are already located somewhere common - you just need to make sure they're in this search hierarchy so they can be found automatically.
Note this info is for CommonJS modules and may be different for ESM modules.

Excessive Recursion of Node.js Project Dependencies

So after a long day at work, I sit down and see an alert for Windows SkyDrive in the system tray:
Files can't be uploaded because the path of this file or folder is too long. Move the item to a different location or shorten its name.
C:\Users\Matthew\SkyDrive\Documents\Projects\Programming\angular-app\server\node_modules\grunt-contrib-nodeunit\node_modules\nodeunit\node_modules\tap\node_modules\runforcover\node_modules\bunker\node_modules\burrito\node_modules\traverse\example\stringify.js
... and for a while, I laughed at that technological limitation.
But then, I wondered: is that amount of directory recursion within a Node project really necessary? It would appear that the paths beyond "angular-app\server\node_modules" are simply dependencies of the project as a whole and might be better expressed as:
C:\Users\Matthew\SkyDrive\Documents\Projects\Programming\angular-app\server\node_modules\grunt-contrib-nodeunit\
C:\Users\Matthew\SkyDrive\Documents\Projects\Programming\angular-app\server\node_modules\nodeunit\
C:\Users\Matthew\SkyDrive\Documents\Projects\Programming\angular-app\server\node_modules\tap\
C:\Users\Matthew\SkyDrive\Documents\Projects\Programming\angular-app\server\node_modules\runforcover\
C:\Users\Matthew\SkyDrive\Documents\Projects\Programming\angular-app\server\node_modules\bunker\
C:\Users\Matthew\SkyDrive\Documents\Projects\Programming\angular-app\server\node_modules\burrito\
C:\Users\Matthew\SkyDrive\Documents\Projects\Programming\angular-app\server\node_modules\traverse\
I hadn't really given it much thought before, as package management in Node seems like magic compared to many platforms.
I would imagine that some large-scale Node.js projects even contain many duplicate modules (having the same or similar versions) which could be consolidated into a lesser amount. It could be argued that:
The increased amount of data stored and transmitted as a result of
duplicate dependencies adds to the cost of developing software.
Shallower directory structures (especially in this context) are
often easier to navigate and understand.
Excessively long path names can cause problems in some computing
environments.
What I am proposing (if such a thing does not exist) is a Node module that:
Recursively scans a Node project, collecting a list of the nested node_modules folders and how deeply they are buried in relation to the root of the project.
Moves the contents of each nested node_modules folder to the main node_modules folder, editing the require() calls of each .js file such that no references are broken.
Handles multiple versions of duplicate dependencies
If nothing else, it would make for an interesting experiment. What do you guys think? What potential problems might I encounter?

See if
npm dedupe
sets you right.
API Doc here

See fenestrate, npm-flatten, flatten-packages, npm dedupe, and multi-stage-installs.
Quoting Sam Mikes from this StackOverflow question:
npm will add dedupe-at-install-time by default. This is significantly more feasible than Node's module system changing, but it is still not exactly trivial, and involves a lot of reworking of some long-entrenched patterns.
This is (finally) currently in the works at npm, going by the name multi-stage-install, and is targeted for npm#3. npm development lead Forrest Norvell is going to spend some time running on Windows in the new year, so please do create windows-related issues on the npm issue tracker < https://github.com/npm/npm/issues >

Load remote scripts with browserify

I really like using cdnjs to load up javascript on the client-side, it makes my project smaller and cleaner, and loads everything faster as well. I currently use require.js for module loading, which can load from cdnjs and shim traditional scripts to work with it easily. I've been looking more into browserify recently as an alternative, and while I did find browserify-shim, which can shim non-cjs modules much like require does, I'm curious if there is a way to load a script from a remote source with browserify, or if you have to install everything locally no matter what.
If the answer is that you would have to install everything locally through npm, this makes things a little weird. On one hand, you can add node_modules to the .gitignore file and not have to worry about keeping all the deps on version control if you are using a package.json, but on the other hand, you'd need to get the modules back in there on deploy, which means an additional post-deploy step that would run npm install and that node would need to be installed wherever you are deploying to, which also seems a little awkward to me for a static site especially.
Really, any ideas or discussion on this would be great : )

The way I think about it is this, you have three options: concat the JS files together locally (browserify) before deployment, load them in real-time (require.js), or a mix of both. To be fair, you can use require.js to concat your files with r.js too. For me at least, I like how browserify is designed to use the same syntax and mentality as npm modules. I think in the end the weirdness your experiencing doesn't really matter. If all the code is packaged together, you deploy, and there aren't any dependencies, seems like a win to me. Also, I think this is more in line with Java and similar compiled languages are doing, which is putting all the deps together in a deployable package. I know I mention Java but don't let that scare you because really we are all benefitting from the ideas of those around us even the languages we think we don't like. If I had to bet my money, I would bet on browserify since it's offering (what I consider) to be a more mature means of handling modules (organized by file based rather than syntax). The npm also gives us a great way to share our code so two thumbs up for them.

What is the difference between Bower and npm?

What is the fundamental difference between bower and npm? Just want something plain and simple. I've seen some of my colleagues use bower and npm interchangeably in their projects.

All package managers have many downsides. You just have to pick which you can live with.
History
npm started out managing node.js modules (that's why packages go into node_modules by default), but it works for the front-end too when combined with Browserify or webpack.
Bower is created solely for the front-end and is optimized with that in mind.
Size of repo
npm is much, much larger than bower, including general purpose JavaScript (like country-data for country information or sorts for sorting functions that is usable on the front end or the back end).
Bower has a much smaller amount of packages.
Handling of styles etc
Bower includes styles etc.
npm is focused on JavaScript. Styles are either downloaded separately or required by something like npm-sass or sass-npm.
Dependency handling
The biggest difference is that npm does nested dependencies (but is flat by default) while Bower requires a flat dependency tree (puts the burden of dependency resolution on the user).
A nested dependency tree means that your dependencies can have their own dependencies which can have their own, and so on. This allows for two modules to require different versions of the same dependency and still work. Note since npm v3, the dependency tree will be flat by default (saving space) and only nest where needed, e.g., if two dependencies need their own version of Underscore.
Some projects use both: they use Bower for front-end packages and npm for developer tools like Yeoman, Grunt, Gulp, JSHint, CoffeeScript, etc.
Resources
Nested Dependencies - Insight into why node_modules works the way it does

This answer is an addition to the answer of Sindre Sorhus. The major difference between npm and Bower is the way they treat recursive dependencies. Note that they can be used together in a single project.
On the npm FAQ: (archive.org link from 6 Sep 2015)
It is much harder to avoid dependency conflicts without nesting
dependencies. This is fundamental to the way that npm works, and has
proven to be an extremely successful approach.
On Bower homepage:
Bower is optimized for the front-end. Bower uses a flat dependency
tree, requiring only one version for each package, reducing page load
to a minimum.
In short, npm aims for stability. Bower aims for minimal resource load. If you draw out the dependency structure, you will see this:
npm:
project root
[node_modules] // default directory for dependencies
-> dependency A
-> dependency B
[node_modules]
-> dependency A
-> dependency C
[node_modules]
-> dependency B
[node_modules]
-> dependency A
-> dependency D
As you can see it installs some dependencies recursively. Dependency A has three installed instances!
Bower:
project root
[bower_components] // default directory for dependencies
-> dependency A
-> dependency B // needs A
-> dependency C // needs B and D
-> dependency D
Here you see that all unique dependencies are on the same level.
So, why bother using npm?
Maybe dependency B requires a different version of dependency A than dependency C. npm installs both versions of this dependency so it will work anyway, but Bower will give you a conflict because it does not like duplication (because loading the same resource on a webpage is very inefficient and costly, also it can give some serious errors). You will have to manually pick which version you want to install. This can have the effect that one of the dependencies will break, but that is something that you will need to fix anyway.
So, the common usage is Bower for the packages that you want to publish on your webpages (e.g. runtime, where you avoid duplication), and use npm for other stuff, like testing, building, optimizing, checking, etc. (e.g. development time, where duplication is of less concern).
Update for npm 3:
npm 3 still does things differently compared to Bower. It will install the dependencies globally, but only for the first version it encounters. The other versions are installed in the tree (the parent module, then node_modules).
[node_modules]
dep A v1.0
dep B v1.0
dep A v1.0 (uses root version)
dep C v1.0
dep A v2.0 (this version is different from the root version, so it will be an nested installation)
For more information, I suggest reading the docs of npm 3

TL;DR: The biggest difference in everyday use isn't nested dependencies... it's the difference between modules and globals.
I think the previous posters have covered well some of the basic distinctions. (npm's use of nested dependencies is indeed very helpful in managing large, complex applications, though I don't think it's the most important distinction.)
I'm surprised, however, that nobody has explicitly explained one of the most fundamental distinctions between Bower and npm. If you read the answers above, you'll see the word 'modules' used often in the context of npm. But it's mentioned casually, as if it might even just be a syntax difference.
But this distinction of modules vs. globals (or modules vs. 'scripts') is possibly the most important difference between Bower and npm. The npm approach of putting everything in modules requires you to change the way you write Javascript for the browser, almost certainly for the better.
The Bower Approach: Global Resources, Like <script> Tags
At root, Bower is about loading plain-old script files. Whatever those script files contain, Bower will load them. Which basically means that Bower is just like including all your scripts in plain-old <script>'s in the <head> of your HTML.
So, same basic approach you're used to, but you get some nice automation conveniences:
You used to need to include JS dependencies in your project repo (while developing), or get them via CDN. Now, you can skip that extra download weight in the repo, and somebody can do a quick bower install and instantly have what they need, locally.
If a Bower dependency then specifies its own dependencies in its bower.json, those'll be downloaded for you as well.
But beyond that, Bower doesn't change how we write javascript. Nothing about what goes inside the files loaded by Bower needs to change at all. In particular, this means that the resources provided in scripts loaded by Bower will (usually, but not always) still be defined as global variables, available from anywhere in the browser execution context.
The npm Approach: Common JS Modules, Explicit Dependency Injection
All code in Node land (and thus all code loaded via npm) is structured as modules (specifically, as an implementation of the CommonJS module format, or now, as an ES6 module). So, if you use NPM to handle browser-side dependencies (via Browserify or something else that does the same job), you'll structure your code the same way Node does.
Smarter people than I have tackled the question of 'Why modules?', but here's a capsule summary:
Anything inside a module is effectively namespaced, meaning it's not a global variable any more, and you can't accidentally reference it without intending to.
Anything inside a module must be intentionally injected into a particular context (usually another module) in order to make use of it
This means you can have multiple versions of the same external dependency (lodash, let's say) in various parts of your application, and they won't collide/conflict. (This happens surprisingly often, because your own code wants to use one version of a dependency, but one of your external dependencies specifies another that conflicts. Or you've got two external dependencies that each want a different version.)
Because all dependencies are manually injected into a particular module, it's very easy to reason about them. You know for a fact: "The only code I need to consider when working on this is what I have intentionally chosen to inject here".
Because even the content of injected modules is encapsulated behind the variable you assign it to, and all code executes inside a limited scope, surprises and collisions become very improbable. It's much, much less likely that something from one of your dependencies will accidentally redefine a global variable without you realizing it, or that you will do so. (It can happen, but you usually have to go out of your way to do it, with something like window.variable. The one accident that still tends to occur is assigning this.variable, not realizing that this is actually window in the current context.)
When you want to test an individual module, you're able to very easily know: exactly what else (dependencies) is affecting the code that runs inside the module? And, because you're explicitly injecting everything, you can easily mock those dependencies.
To me, the use of modules for front-end code boils down to: working in a much narrower context that's easier to reason about and test, and having greater certainty about what's going on.
It only takes about 30 seconds to learn how to use the CommonJS/Node module syntax. Inside a given JS file, which is going to be a module, you first declare any outside dependencies you want to use, like this:
var React = require('react');
Inside the file/module, you do whatever you normally would, and create some object or function that you'll want to expose to outside users, calling it perhaps myModule.
At the end of a file, you export whatever you want to share with the world, like this:
module.exports = myModule;
Then, to use a CommonJS-based workflow in the browser, you'll use tools like Browserify to grab all those individual module files, encapsulate their contents at runtime, and inject them into each other as needed.
AND, since ES6 modules (which you'll likely transpile to ES5 with Babel or similar) are gaining wide acceptance, and work both in the browser or in Node 4.0, we should mention a good overview of those as well.
More about patterns for working with modules in this deck.
EDIT (Feb 2017): Facebook's Yarn is a very important potential replacement/supplement for npm these days: fast, deterministic, offline package-management that builds on what npm gives you. It's worth a look for any JS project, particularly since it's so easy to swap it in/out.
EDIT (May 2019)
"Bower has finally been deprecated. End of story." (h/t: #DanDascalescu, below, for pithy summary.)
And, while Yarn is still active, a lot of the momentum for it shifted back to npm once it adopted some of Yarn's key features.

2017-Oct update
Bower has finally been deprecated. End of story.
Older answer
From Mattias Petter Johansson, JavaScript developer at Spotify:
In almost all cases, it's more appropriate to use Browserify and npm over Bower. It is simply a better packaging solution for front-end apps than Bower is. At Spotify, we use npm to package entire web modules (html, css, js) and it works very well.
Bower brands itself as the package manager for the web. It would be awesome if this was true - a package manager that made my life better as a front-end developer would be awesome. The problem is that Bower offers no specialized tooling for the purpose. It offers NO tooling that I know of that npm doesn't, and especially none that is specifically useful for front-end developers. There is simply no benefit for a front-end developer to use Bower over npm.
We should stop using bower and consolidate around npm. Thankfully, that is what is happening:
With browserify or webpack, it becomes super-easy to concatenate all your modules into big minified files, which is awesome for performance, especially for mobile devices. Not so with Bower, which will require significantly more labor to get the same effect.
npm also offers you the ability to use multiple versions of modules simultaneously. If you have not done much application development, this might initially strike you as a bad thing, but once you've gone through a few bouts of Dependency hell you will realize that having the ability to have multiple versions of one module is a pretty darn great feature. Note that npm includes a very handy dedupe tool that automatically makes sure that you only use two versions of a module if you actually have to - if two modules both can use the same version of one module, they will. But if they can't, you have a very handy out.
(Note that Webpack and rollup are widely regarded to be better than Browserify as of Aug 2016.)

Bower maintains a single version of modules, it only tries to help you select the correct/best one for you.
Javascript dependency management : npm vs bower vs volo?
NPM is better for node modules because there is a module system and you're working locally.
Bower is good for the browser because currently there is only the global scope, and you want to be very selective about the version you work with.

My team moved away from Bower and migrated to npm because:
Programmatic usage was painful
Bower's interface kept changing
Some features, like the url shorthand, are entirely broken
Using both Bower and npm in the same project is painful
Keeping bower.json version field in sync with git tags is painful
Source control != package management
CommonJS support is not straightforward
For more details, see "Why my team uses npm instead of bower".

Found this useful explanation from http://ng-learn.org/2013/11/Bower-vs-npm/
On one hand npm was created to install modules used in a node.js environment, or development tools built using node.js such Karma, lint, minifiers and so on. npm can install modules locally in a project ( by default in node_modules ) or globally to be used by multiple projects. In large projects the way to specify dependencies is by creating a file called package.json which contains a list of dependencies. That list is recognized by npm when you run npm install, which then downloads and installs them for you.
On the other hand bower was created to manage your frontend dependencies. Libraries like jQuery, AngularJS, underscore, etc. Similar to npm it has a file in which you can specify a list of dependencies called bower.json. In this case your frontend dependencies are installed by running bower install which by default installs them in a folder called bower_components.
As you can see, although they perform a similar task they are targeted to a very different set of libraries.

For many people working with node.js, a major benefit of bower is for managing dependencies that are not javascript at all. If they are working with languages that compile to javascript, npm can be used to manage some of their dependencies. however, not all their dependencies are going to be node.js modules. Some of those that compile to javascript may have weird source language specific mangling that makes passing them around compiled to javascript an inelegant option when users are expecting source code.
Not everything in an npm package needs to be user-facing javascript, but for npm library packages, at least some of it should be.

How can I convert a multi-file node.js app to a single file?

If I have a node.js application that is filled with many require statements, how can I compile this into a single .js file? I'd have to manually resolve the require statements and ensure that the classes are loaded in the correct order. Is there some tool that does this?
Let me clarify.
The code that is being run on node.js is not node specific. The only thing I'm doing that doesn't have a direct browser equivalent is using require, which is why I'm asking. It is not using any of the node libraries.

You can use webpack with target: 'node', it will inline all required modules and export everything as a single, standalone, one file, nodejs module
https://webpack.js.org/configuration/target/#root
2021 edit: There are now other solutions you could investigate, examples.
Namely:
https://esbuild.github.io
https://github.com/huozhi/bunchee

Try below:
npm i -g #vercel/ncc
ncc build app.ts -o dist
see detail here https://stackoverflow.com/a/65317389/1979406

If you want to send common code to the browser I would personally recommend something like brequire or requireJS which can "compile" your nodeJS source into asynchronously loading code whilst maintaining the order.
For an actual compiler into a single file you might get away with one for requireJS but I would not trust it with large projects with high complexity and edge-cases.
It shouldn't be too hard to write a file like package.json that npm uses to state in which order the files should occur in your packaging. This way it's your responsibility to make sure everything is compacted in the correct order, you can then write a simplistic node application to reads your package.json file and uses file IO to create your compiled script.
Automatically generating the order in which files should be packaged requires building up a dependency tree and doing lots of file parsing. It should be possible but it will probably crash on circular dependencies. I don't know of any libraries out there to do this for you.

Do NOT use requireJS if you value your sanity. I've seen it used in a largish project and it was an absolute disaster ... maybe the worst technical choice made at that company. RequireJS is designed to run in-browser and to asynchronously and recursively load JS dependencies. That is a TERRIBLE idea. Browsers suck at loading lots and lots of little files over the network; every single doc on web performance will tell you this. So you'll very very quickly end up needing a solution to smash your JS files together ... at which point, what's the point of having an in-browser dependency resolution mechanism? And even though your production site will be smashed into a single JS file, with requireJS, your code must constantly assume that any dependency might or might not be loaded yet; in a complex project, this leads to thousands of async load barriers wrapping every interaction point between modules. At my last company, we had some places where the closure stack was 12+ levels deep. All that "if loaded yet" logic makes your code more complex and harder to work with. It also bloats the code increasing the number of bytes sent to the client. Plus, the client has to load the requireJS library itself, which burns another 14.4k. The size alone should tell you something about the level of feature creep in the requireJS project. For comparison, the entire underscore.js toolkit is only 4k.
What you want is a compile-time step for smashing JS together, not a heavyweight framework that will run in the browser....
You should check out https://github.com/substack/node-browserify
Browserify does exactly what you are asking for .... combines multiple NPM modules into a single JS file for distribution to the browser. The consolidated code is functionally identical to the original code, and the overhead is low (approx 4k + 140 bytes per additional file, including the "require('file')" line). If you are picky, you can cut out most of that 4k, which provides wrappers to emulate common node.js globals in the browser (eg "process.nextTick()").

Develop Reference

JavaScript is the programming language of the Web.