Flat Tree Dependency Resolution in Npm v3 (npmjs.com)
113 points by funerr on Dec 31, 2015 | 119 comments



I understand that this is a hairy, messy problem they are solving, and that they traded off one aspect of the problem for another (install-order dependency! As a new "feature" in 2015!).

But I wish they had aimed higher. The better goal would be that the entire state of the node modules directory is a pure function of the contents of the package.json file plus the platform details (compiler used for native modules).
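
Sketched as a property (installTree here is a hypothetical helper; the point is just that repeated installs from the same inputs must agree exactly):

    // hypothetical: node_modules as a pure function of (package.json, platform)
    const assert = require('assert');
    const a = installTree('package.json', process.platform); // fresh install
    const b = installTree('package.json', process.platform); // repeat, any history
    assert.deepStrictEqual(a, b); // same inputs => identical tree, always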

While they're at it, the ecosystem could be improved considerably if there was some sort of obvious "penalty" applied to any package that compiles native code, because such things cause considerable extra trouble for Windows users. A visible penalty, transitively carried up the dependency tree, would discourage use of such modules with native code; projects would use them (depend on them) only if absolutely necessary instead of accidentally, all over, all the time.


>The better goal would be that the entire state of the node modules directory is a pure function of the contents of the package.json file plus the platform details

Yes, this would have been much better than this new, also-broken approach. Functional package managers like the one I work on, GNU Guix, have this problem solved and solved well. Language-specific package manager developers would do well to either implement similar systems in their projects, or realize the inherent problems with limiting the scope of dependency management to a single language rather than the full dependency graph and switch to using functional package managers.


> Language-specific package manager developers would do well to either implement similar systems in their projects

This is what we did with Dart's package manager[1]. It works like Bundler where it finds a single set of package versions that satisfy all of the constraints and there are no duplicate packages. It's a challenge because constraint solving is NP-complete, but it works very well in practice.

[1]: https://pub.dartlang.org/

> realize the inherent problems with limiting the scope of dependency management to a single language rather than the full dependency graph and switch to using functional package managers.

The problem then is that your language now has a dependency on an outside package manager, one which is often OS-specific. Most modern languages need to support a variety of OSes.


>This is what we did with Dart's package manager[1]. It works like Bundler where it finds a single set of package versions that satisfy all of the constraints and there are no duplicate packages.

Constraint solving is the status quo, and I am strongly opposed to it. Rather than saying "I want foo >= 1.0" in a package manager, you should have the build system test the environment for the features you need. This is what the Autotools and other build systems that have been around for decades do. This way, the project isn't tied to a single package manager and, in the case of free software, greatly eases the burden on package maintainers that want to add your software to their distribution.
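
For a rough flavour of feature testing in Node terms (module names here are placeholders, not real packages): probe for the capability instead of declaring "foo >= 1.0":

    // prefer a native implementation when the environment provides one,
    // fall back to a portable one otherwise -- no version constraint needed
    let hash;
    try {
      hash = require('native-hash'); // placeholder: compiled addon, may be absent
    } catch (e) {
      hash = require('js-hash');     // placeholder: pure-JS fallback
    }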

Functional package management doesn't need constraint solving because every package in the system precisely describes itself: exactly which dependencies are needed for build-time and runtime (which precisely describe themselves, recursively, all the way down to libc), precisely which source code (tarball/directory/whatever + checksum), and the exact build script that turns that source code into a (hopefully bit-reproducible) binary. There's no constraint solving to determine what the dependency tree is, it's already been encoded in the package objects themselves. This is what any robust package manager should enable, but the only two that I know of that do this are Nix and GNU Guix.
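
As a loose sketch of such a package object (just the shape of the idea in JSON, not actual Nix or Guix syntax; all values are placeholders):

    {
      "name": "libfoo",
      "source": {
        "url": "https://example.org/libfoo-1.2.0.tar.gz",
        "sha256": "<checksum of that exact tarball>"
      },
      "build": "./configure && make && make install",
      "inputs": ["libbar-2.0.1-<hash>", "glibc-2.22-<hash>"]
    }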

>The problem then is now your language has a dependency on an outside package manager, one which is often OS specific. Most modern languages need to support a variety of OSes.

A programming language should not depend on any specific package manager. This is what build systems are for! More and more language communities conflate the two, but this is a huge mistake that we are paying for by making our software nearly impossible to reproduce from source code. [0] Everyone just uses pre-built binaries full of circular dependencies and reproducibility issues because things are so tangled that no one actually knows how to build anything from source anymore.

I think the value of language-specific package managers lies in providing an easy way to fetch pure (no native code) modules to facilitate simple code sharing and helping newcomers get bootstrapped quickly. However, for serious software development and deployment they are terrible and we desperately need better tools. I think functional package managers are the tools we need, as they have greatly simplified software building and deployment for me whilst also greatly increasing the reliability of the systems that use them thanks to transactional upgrades and rollbacks.

[0] http://www.vitavonni.de/blog/201503/2015031201-the-sad-state...


Language-specific package managers seem to actually work in real life and are cross-platform, e.g. npm. But I'm not aware of anything for e.g. C++ which would offer the features of npm (mainly installing dependencies with a single command and supporting GNU/Linux, OS X and Windows).

Is there a single non-language specific package manager which is cross platform, and supports a large percentage of packages for a given programming language?

Note that I'm not saying npm is excellent, but I'm not aware of a single successful language-independent package manager. If there are none, why?


The aforementioned GNU Guix and the similar Nix project are exactly that: Cross-platform (mostly, at least Nix requires a unixy layer on Windows) language-independent package managers.

One could argue that they aren't "successful" (in that they don't have all the packages you might want), but they are still quite new and are pretty different from traditional package managers. I believe that Nix/Guix are the future, but a future that will take some time to become widespread.


> Cross-platform (mostly, at least Nix requires a unixy layer on Windows)

Not good enough, language package managers must work on Windows, with a regular command prompt.


Windows is hopeless. If Windows is a must, you really cannot do much better than a package manager per language that you use. No binary reproducibility, no transactional management, no provenance, no thanks.


Conda aspires to be exactly that. It attempts to package the entire Python scientific stack, but its general-purpose nature lets you package other kinds of software as well (for instance, conda can replace compiling arbitrary C programs from source and installing them into your home folder).


Spack [1,2] is our attempt to do this for HPC, where we have two things going on: 1) Our applications aren't single-language. They include Fortran, Python, C, C++, Yorick (!), Lua, etc. etc. 2) We have a major combinatorial versioning problem, because HPC clusters (and scientific software in general) rely on ABI-incompatible libraries like MPI, and MANY different compilers.

Spack is build-from-source and takes a lot of inspiration from Nix and its descendants, but it adds support for things like swapping out compilers/MPI versions/etc., as well as a syntax for composing new builds of things on the fly. That is, you can build something with a slightly different version of a dependency without ever editing a package file. We want our users to be able to build a package with 6 different compilers and 3 different MPI library versions, and have their 18 versions coexist on the same system. This has been super helpful for testing production code at LLNL, because we have to run our stuff on other peoples' often very different clusters. For instance, at LANL, we prefer PGI compilers. At LLNL we tend to like Intel or Clang.
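
The spec syntax for that composition looks roughly like this (simplified; mpileaks is the example package from the Spack docs):

    spack install mpileaks@1.1 %gcc@4.9            # pick a compiler
    spack install mpileaks@1.1 %intel ^mpich@3.1   # swap compiler and MPI, no file edits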

One reason languages tend to grow their own package managers is that there are all kinds of peculiarities about how different languages manage modules/extensions. Have you seen how many different ways there are to deploy Python modules? It's very hard to have multiple versions of things coexisting in the same Python env, and things like pkg_resources require editing client code [3]. Ew. Spack tries to address this by allowing different language packages to implement their own module activation/deactivation logic, so you can kind of have a language-specific package manager for each install of the language. I'd rather have something more like Nix's profiles, though.

I wouldn't say that the project is huge yet, but we've had some interest from mostly other HPC sites. It would be cool if the project caught on in the "real" world, too.

[1] https://github.com/llnl/spack

[2] http://www.computer.org/csdl/proceedings/sc/2015/3723/00/280...

[3] http://stackoverflow.com/questions/5265731


I find that Spack in fact takes very little inspiration from the functional package managers Nix and GNU Guix. Dependencies are specified in a loose fashion which, on one hand, gives a lot of flexibility, but on the other hand is detrimental to reproducibility.

More importantly, part of the dependency graph, including compilers, is treated specially and essentially considered outside of the scope of Spack. This seriously hinders reproducibility, as we tried to explain in https://hal.inria.fr/hal-01161771/en .


>I'm not aware of a single successful language-independent package manager

Ummmm... dpkg and yum? You know, the package managers that manage the software on everyone's production servers. Or do these not count because they don't work on Windows or something?


Yes, his previous sentence specifically mentioned 'cross-platform'.


When we ship node modules to be embedded into Chocolat.app, we have to strip out all the tests and documentation because otherwise it bloats the download size. And the computer it eventually runs on may not be connected to the internet.

The nice thing about node is that it's quite small (3.4MB zipped). We don't have to ship npm with the app, and certainly an OS level package manager would be way too large!


Each package author should strip tests and documentation from the published package. Of course, many don't...


True enough... it's not something many authors stop to consider, since creating an npm module and building/pushing to npm is really easy to do.


This is also what Composer (package manager for PHP) did. They ported openSUSE's libzypp to PHP to solve the dependency management problem. After resolving versions the dependencies are installed in a flat list.


This "penalty" you propose would be a cosmetic thing, it wouldn't really make a difference. Many devs would choose to use the module as a dependency of their modules.

The real solution to the sub-dependency-insanity of npm is ... use fewer dependencies. Don't use dependencies that have sub-dependencies. Write more code to directly solve your problem.

This is a viewpoint you won't often hear, because it comes from someone who doesn't write js or use node ... I've just had to fix the deployment of some node applications written by frontend devs a few times. I've seen npm trees with over 1000 modules to do some very trivial stuff. Node actually seems nice, and it has all the basics you need built in.

I typically use C or Python, which pretty much never use nested dependencies, and have never felt the desire to install/update/manage/debug exponentially more dependencies.


I wouldn't recommend using fewer dependencies, but smaller ones. One of the best features of npm is that it solves the "dependency hell" problem. You can depend on modules freely and not worry about making it harder for users of your module to install it.

The issue comes when people create these huge "do-everything" modules. This is worth a read: https://github.com/sindresorhus/ama/issues/10#issuecomment-1...


That just creates a situation where you depend on an insane number of dependencies and your npm install takes 10 minutes.


> One of the best features of npm is that it solves the "dependency hell" problem. You can depend on modules freely and not worry about making it harder for users of your module to install it.

That statement is true in theory, as long as everyone follows semantic versioning and dependencies never have unexpected interactions because of bugs.

Unfortunately, in the real world, neither of these conditions holds perfectly. The moment you depend on any module in your package.json that directly or indirectly depends on any other module where any version is not precisely fixed, you are immediately running uncontrolled and probably untested (by you) code. Moreover, if anything does go wrong, you may also have considerable difficulty identifying and reproducing the exact combination of versions of everything that caused the failure.
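
For example, the difference is visible right in your package.json (module names here are hypothetical):

    "dependencies": {
      "exact-dep": "1.2.3",
      "floating-dep": "^2.1.0"
    }

The first entry installs the same code every time; the second silently picks up whatever 2.x is newest at install time, and its transitive dependencies can float in the same way.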

I have spent far too many days over the past few months tracking down exactly these kinds of issues, sometimes in some of the best known and most popular modules available via npm. For example, as the article itself mentions, what you wind up with in your node_modules directory can depend on install order under the new scheme. I don't think this is a good thing.

I therefore agree with those who advocate minimising dependencies, within reason of course. I also advocate strict version numbering and 100% reproducible builds, which is unfortunately quite hard to achieve once indirect dependencies start entering the picture with Node and npm, under either the old or the new scheme.


Where this really starts to bite, in my mind, is when you wind up seeing multiple modules in different dependencies that do the same thing.

Lodash, Underscore, Ramda and others come to mind in particular here... It's really easy to wind up with at least two of those three. That's part of why I wouldn't mind seeing a few "winners", even if slightly larger, win out. We're at a point where we get to tear jQuery out of projects, but it's easy enough to wind up with bundles that are larger than if we'd just used jQuery or Zepto, or whatever.


Why do dev communities have to continually pay the Windows tax? If Microsoft keeps insisting on being an outlier in the world of OSes, shouldn't they be the ones burdened?

I gotta say, as someone who never ever uses Windows but maintains a couple of open source packages, I'm really sick of the Windows only problems that crop up.


As someone who has no choice but to use Windows at work, I'm really sick of all the problems that crop up because so many tools we use were written by people who won't use Windows and haven't tested anything on it.


That's a funny point of view. Windows is made by a multi-billion dollar company, rented by countless wealthy software development shops, but it's volunteer open-source programmers who are supposed to take on the burden of supporting it? Seriously, I can't even comprehend that thought process.

I write software for fun, I release it on github, and some people find it useful. If there are other people who don't, they can improve it for their own purposes, or they can find or write something better. I don't think anybody has a right to expect me to support Microsoft's product. That's insane.


> I don't think anybody has a right to expect me to support Microsoft's product. That's insane.

True enough, though it's also fair to consider a language/platform/ecosystem inferior for some work if it relies on that kind of individual and often less portable contribution for its effectiveness. A lot of FOSS advocates will promote community support and a broad contributor base as advantages of that style of development, but there is another side to that coin, which is that sometimes you get just what you (didn't) pay for and you're on your own in terms of support or working around any problems.


If I had no choice but to use Windows at work, and using Windows meant I couldn't use a large amount of development tools that I rely on, then the last ones I would blame would be the open source developers that won't use or develop for Windows. Your employer is the problem here.


Why do you expect developers to pay for software, OSes, and licenses they don't use in order to develop open source software for free? Help them out by contributing, or blame Microsoft, which has the resources to make their own OS more compatible.

Doesn't Microsoft have their own V8 or node.js fork now?


Kinda funny isn't it, after a decade where one might be required to use a proprietary VPN or email client that only runs on windows. And those things didn't use standard, published, or even consistent protocols, nor were they open-source. In fact they were hostile to compatible implementations. They had to be reverse-engineered.

So now you get just a little, watered-down taste of using the non-preferred platform for something.


I feel your pain, but am not convinced open source maintainers should have to be the ones to resolve the issue. Especially since so many projects are done using spare time and with no compensation.


hi! i totally know this is a common feel for many npmjs users. this is why i've been running an empathy campaign for Windows, Windows Wednesday, where i do all my work (with/for npm) on windows. it's been good for finding bugs, either technical or UX. i'm working on improving the docs to make them friendlier to all platforms (not just OSX, which is the most common platform at npmjs).

npmjs cares about the enterprise + windows.


I think this is the wrong approach. We shouldn't pick up MS's slack, MS should. This just reinforces the current situation and keeps enabling MS to ignore the problem.


you might also consider the fact that OSX machines are expensive. and the education for learning linux is generally expensive or not accessible. so supporting windows also supports ... poor people, people from different countries that have different economies, etc.

should we not pick up capitalism's slack? capitalism sure isn't going to pick up its own slack, err... systemic method of restricting underprivileged/poor people's access to power.


> the education for learning linux is generally expensive or not accessible

Wow, this is incredibly wrong. The cost of education to learn Linux is living in a country where you are likely to be literate and have access to a computer. That is literally the cost to learn Linux, guy. There is nothing like an MSDN subscription for Linux documentation.


What problem is that? That they have a different operating system?

npm is also a bit different, in your objection: they're a company. Companies need users. Supporting more platforms to get more users seems like pretty much, well, business as usual.


That is a fair point. npm being a company does change the situation.


So, are you of the opinion that OSS projects should never be built for any commercial, proprietary OS or just Windows (i.e. because it's not Unix)?

For instance - OS X window managers suck. Do you think that a project like Spectacle should have never been built because they're just picking up Apple's slack?


I don't see how a project like Spectacle, which is OSX specific, relates. If someone wants to make an OS specific project, more power to them. The issue arises on platforms that span OSes, and they just never quite span onto Windows correctly.

Microsoft is very happy that Node, Java, Clojure, etc work on Windows. It helps them a lot. But they make little to no effort to help bridge the gap, they usually leave that to everyone else. That's my gripe, right there.

And to be fair, this is not a black and white situation. It's complex and Microsoft's role in all of this is complex too. And sometimes this issue does crop up on other OSes, for example Docker on OSX is sometimes a little painful.


"But they make little to no effort to help bridge the gap, they usually leave that to everyone else."

OK, but that's also where you're wrong, right there. In general, but also particularly for this exact issue - Microsoft considered attempting to fix the file path limit for the first release of Windows 10. Maybe you don't understand, but it's going to take a herculean effort to fix this. They want to do it and if you need proof of that I can find posts of them talking about what I just said.


Don't UNC paths solve this problem already (30000+ char limit)?


The problem is the default/legacy APIs don't use the new paths... in .NET I had this problem a lot, with serious issues: every method I tried to work around it would hard-crash VS, let alone the application I was working on... in the end, after days of trying, the solution was to say f*ck it and just put the directory I was working with closer to the root of the drive in question.

Why they can't fix .NET under the covers to use UNC paths in Windows to avoid this class of problem, I don't know, understand or comprehend... and that's just one very common platform for Windows. I would say the same for npm/node and libuv for that matter.

I use bash a lot in Windows (installed with msysgit), and that's got its own set of problems. More problematic is the number of node modules that rely on bash scripts for builds, which don't work at all in Windows... which makes it very hard to contribute.


I believe Microsoft worked to get Node.js supported on Windows.


Not just that, but they're working on making Node work with Chakra.


Can they make it so I can spawn a process and not worry about whether I'm spawning on unix or Windows?


I don't know it at that level of detail.


Yes, npm is very problematic/buggy on Windows. I wasted many hours because of https://github.com/npm/npm/issues/9696 (meaning any build on visual studio online breaks ~50% of the time simply running 'npm install').

Microsoft should finally fix the MAX_PATH issue in Windows, but npm should just fix their software: it's too important to be that buggy.


It sounds like you're getting annoyed that the devs of certain projects like Node.js have chosen to support an OS that you don't use without asking you about it first... If you're in control of the project - you have every right to ignore Windows users. If you're not in control, then you really have zero right to complain. Zero.

Devs certainly aren't forced to "pay the Windows tax" in any way. Plenty of OSS projects completely ignore Windows. I'm sure you're aware though that Windows has the majority market share of desktop operating systems. If your software is interesting to the vast, vast population of Windows users, you might get some requests to support Windows. And since it's not an outlier in the market, other desktop OSes should obviously strive to be more like Windows ;)


Windows is very much an outlier in the world of software development. Unless you're specifically using Microsoft tech (ie .NET and all that), then you're in a world dominated by Linux and OSX.

I'm annoyed that Microsoft insists on going forward with a proprietary OS that does not conform to standards that pretty much every single other OS does. Sure, platform specific issues do happen for other OSes, but they are rare. They are extremely common for Windows.

For sure, I have no obligation to fix Windows specific issues in my projects. All of my projects are done in my spare time and given away for free. But Windows specific issues are still annoying and disappointing.

Another aspect of this that is annoying is that major, mission critical open source projects have to dedicate so many resources to Windows. For example this bug in ClojureScript: https://github.com/clojure/clojurescript/commit/80d46bdb7969...

Imagine if the Clojure team didn't have to worry about file separator differences, and the multitude of other Windows annoyances. How much extra energy, time and resources would they gain? I suspect a lot.

EDIT: And I would also say I have every right to complain about Microsoft. Ignoring standards and doing their own thing for their own financial gain at everyone else's expense is definitely worthy of complaint. Also see Internet Explorer.


>> Windows is very much an outlier in the world of software development. Unless you're specifically using Microsoft tech (ie .NET and all that)

I think you need to get out more. Or at least go to meetups that don't only meet at coffee shops. You're cutting out a huge chunk there, and there are still huge chunks that use Windows to do Java, C++, etc.

I mean, game development is a multi billion dollar industry that is almost exclusively Windows-but-not-.NET, outside of consoles, and even on consoles it's still Windows in a large number of cases.

>> I'm annoyed that Microsoft insists on going forward with a proprietary OS that does not conform to standards that pretty much every single other OS does.

This is different from Apple with OSX and iOS how? This is different from Google with Android how? This is different from Canonical with Unity how? This is different than RedHat and Debian and pretty much every other major Linux distro with systemd how?

And for crying out loud, file separator issues? That's not even a hard one.


>> This is different from Apple with OSX and iOS how?

I'll never run "npm install" or "gem install" or what have you on iOS, Android, etc. Proprietary OSes come in all shapes and sizes, all bringing their own pros and cons to the mix. I just happen to be focusing on a very large con of Windows at the moment.

As for OSX, Redhat, Debian, etc. Sure, they're all different, of course. But generally speaking, at least when it comes to Node and npm (and as far as I know, Ruby, Clojure, Python, etc, please correct me if I'm wrong), these OSes tend to get along pretty OK. Windows, does not.

>> And for crying out loud, file separator issues? That's not even a hard one.

And yet how many times has that problem been hit over the decades? I mean heck, look at the commit message, "Another Windows path issue". And yeah, it's not even a hard one, it just gets worse from there.

I came to that issue from here (which is my project): https://github.com/city41/reagent-breakout/issues/11

So for "not even a hard one", still took me 20 minutes of digging to find out what was going on. X minutes across all developers across how long Windows has been around really adds up. For just one of Windows's differences.


I don't think you realize just how different even various Linux distros running even the same kernel can be, and I think you're giving waaay too much credit to OSX/Linux for being "alike". Make build scripts have notoriously been a mess for decades because of the differences between distros.

Any reasonable language has the ability to query all of the platform-specific stuff directly from the standard library. You can't even count on two Windows users having their home directories in the same place, or two Arch users having all of their config files in the same place. It's a necessity to be able to abstract all of these away even for a single operating system, which means you get it for free when moving to other operating systems.


I do get that. I totally understand all OSes are very different. How often those differences surface really depends on a lot of things, mostly at what level of the OS you are working in. Node modules and apps tend to stay pretty high level. But again, I still stand by my original statement that Windows specific issues crop up more than any other OS. Just take a look at the README for node-gyp: https://github.com/nodejs/node-gyp -- yup in OSX you have to get Xcode and a few other OSX specific steps to arrive at gcc and make, but compare that to the Windows section ...


Other projects have to do a lot of work to support OS X too. So I guess I still don't see your point. Electron, for instance, had to program an entirely different object model to represent OS X's kludgy global menu bar when every other OS has menus attached to application windows.

I can't believe you're defending OS X though, when Apple makes you buy a whole computer from them to use it... meanwhile I can run a 6 month trial of Windows (over and over again) on any virtual machine. Honestly, it's way more work to support OS X than Windows if you really bother to look at it.


Apple is definitely not innocent here. They force a lot of terrible decisions too. I think ultimately though, this conversation boils down to my bias and experience versus your bias and experience.


Sure, everybody's bias is painfully obvious here on HN.

But I don't think that negates the fact that to develop anything at all for OS X I have to buy a whole computer from Apple, since it's damn near impossible to get a stable OS X experience by running it in a VM or directly on non-Apple hardware. Furthermore, remotely accessing Xcode on a Mac from a non-Mac computer is really painful since the only option is VNC - the bottom of the barrel, lowest common denominator of remote screen sharing protocols.


There is no reason building Windows software has to be this hard, other than the maintainer just not putting even a reasonable amount of effort into it.

When developers are hostile towards Windows, this is the result. This doesn't do anything to Microsoft, this just hurts other developers and users.


gyp was created by Google for building Chrome. I would guess they've put a decent amount of effort into it.


Given that Windows is the vast majority of desktops, and that there is more variety between different Linux distros and OSX versions than there is in the decades of Windows APIs, I think you have things a little backwards.

I've done far more dev targeting linux deployments the past few years, and much of it is far nicer on *nix than windows... but your viewpoint seems to be kind of arrogant. I haven't worked for a company that has more than 100 employees that doesn't do most development on windows, even if that isn't the target for deployment. That includes two major financial institutions, some large internet services and many other smaller companies over the years.


I don't mean to be arrogant, and if I am honestly it comes from ignorance. I've worked in MS shops before (I did .NET, COM, etc for nearly a decade). This whole thing blew up out of frustration at Windows specific issues cropping up more often than any other OS in environments like Node. I do think that's hard to deny. If that's arrogant, then I don't know what to say :-/


There is a real problem here, but I think you're laying the blame a little unfairly at Microsoft's doorstep. It's probably fair to say that at least until relatively recently Microsoft has gone to far greater lengths to ensure compatibility and longevity of code built on its platforms than any other organisation in the history of software development.

We used to value writing portable software as a skill, we used to value languages and libraries with robust specifications that could be used to write portable code, and portable code is also relatively future-proof code. I've worked on large projects that shipped on literally a dozen or more different platforms at any given time and were maintained for well over a decade with the significant variations in platforms that happen over that kind of time frame. Those projects built probably 95+% of the same source code for each platform, with platform-specific APIs and conventions carefully isolated.

That attitude and the related skills seem to have been much less valued in recent years, not least by the Linux community. (What, you want to build this C or C++ code with a tool chain other than GCC and friends?) If people carelessly scatter platform-specific code all over their projects, then of course they won't be easily portable, but more often than not such limitations are entirely artificial and could easily be avoided at negligible cost. In my experience, this is also true of a lot of libraries with the likes of Node and Python where native code is used in managed packages, and I think it's fair to consider that the resulting portability limitations are a potential disadvantage when choosing the language for a project.


Seems to me that reasonable people could disagree over whether POSIX should be considered the be-all and end-all of OS standardization.


I wouldn't call it the be-all and end-all, but rather the bare minimum. "If you implement this API spec, there's a good chance a lot of software can successfully build"


Microsoft initially aimed for POSIX compatibility with Windows NT 3.1. There were POSIX subsystems for WinNT 3 through 5.1. Over time, though, things changed and got worse.


Better yet, why don't they just make a tool that fixes the Windows problem, as a standalone patch? No need to change npm, just fix the problem on Windows... The whole flat folder thing looks really bad.


Aimed higher is correct. I've always wondered why no one looked at the way git does this under the covers and applied it to npm. A directed graph that points to (with a sym link or similar mechanism) the correct package version on the file system.

That way you have the best of both worlds - dependencies don't collide and a single copy of a package version on the file system in node_modules.


The biggest bug is that the Windows Explorer GUI has trouble with long file paths.


I love the path that some take which is to maintain both a "pure" JS-only solution and a "native" solution, and have the native one be optionally installed. That way it works as best it can automatically.

Now I know that can't always be the case, as it adds a lot of work to the development and maintenance, but it is a really nice thing when I see it.


Meh. Anyone who ever worked with the v8 C++ interop API knows that it's a rather unpleasant experience (unlike e.g. Python's C interop API, which is very nice). Nobody uses it out of pure laziness, but because They Have A Very Good Reason To Do So™.


"dependency resolution depends on install order"

Does this sound insane to anyone else?

EDIT: I understand not wanting to modify node's `require` semantics, but this is an unacceptable sacrifice of consistency for efficiency. Surely it would have been possible for an `npm install --save x` to put the `node_modules` directory in a state identical to `npm install --save x && rm -rf node_modules && npm install`. It might take a little longer to shuffle some directories around, but certainly not longer than a full `npm install`.


What's more, standard `npm install` order is ALWAYS alphabetic, which causes some packages, purely arbitrarily, to influence directory structure.
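
For example (hypothetical packages), if `apple` and `zebra` both depend on incompatible versions of `common`, `apple` wins the top-level slot purely because 'a' sorts first:

    node_modules/
      apple/
      common/          <- apple's v1.x, hoisted only due to alphabetical order
      zebra/
        node_modules/
          common/      <- zebra's v2.x, nested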


Surely they will have to change that. Right? It can't last long.


Totally. That was the first thing that stuck out to me.


That's a great improvement over the old behavior, but I'm wondering why the team didn't go a different route.

Why not always store packages at the top level, and create a directory for each version that's required?

For example: directory "A" contains two subdirectories: "v0.1.0" and "v0.1.1"


Because then they'd have to change the semantics of how `require()` works. As it is, node packages are not version aware at runtime. Instead NPM is designed to place packages such that any library will have the correct version first in their load path.

Changing the directory structure to encode versions is probably the Right Way in a grand sense, but it's a much more substantial change than just promoting packages by default.


I thought about that, but clearly they have enough information to figure out whether to load the package from the top level or a nested directory. (I'm guessing simply by bubbling up and looking for the "node_modules" directory.)

It doesn't seem too big of a leap to do it the way I was suggesting.

Either bubble up and look for "node_modules" as before but store a symlink there, or add a dot file that tells you "this is where I found package.json".

I'm sure there was a reason they didn't go for either of those solutions because I've followed the team's discussions when they were iojs, and they have very smart people. I was just curious about the reasoning :)


The information may be available, but ultimately the work would have to be done by node, not npm, and it seems reasonable to not want to change such a fundamental node behavior, especially just to support a change in npm.


Ah, fair enough.

I assumed the same team was in charge of both.


Probably a lot of overlap, but they're separate projects with different versions and release schedules.


That doesn't handle peer dependencies. Those are resolved by walking up the filesystem. You can't walk up the file system with a symlinked structure. A single npm package can resolve to different peer dependencies depending on where it is in the tree.


Read the next-highest package.json file for whatever code is executing (dependencies copy their package.json file when they install), find the dependency listing in there, with the desired version constraint. If none exists (i.e. the dependency was installed manually), take whatever is the highest version by default. If that's not what the developer wants, it's their own fault for not specifying the dependency in their package file.
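
A rough sketch of that proposal (nothing like this exists in node today; the two helpers are hypothetical, and semver is the package npm itself uses):

    const semver = require('semver');

    // hypothetical: pick a version at require() time from the nearest manifest
    function resolveVersion(name, fromDir) {
      const manifest = findNearestPackageJson(fromDir); // hypothetical: walk up
      const range = manifest.dependencies[name];
      const versions = installedVersionsOf(name);       // hypothetical: list dirs
      return range
        ? semver.maxSatisfying(versions, range) // honour the declared constraint
        : versions.sort(semver.rcompare)[0];    // none declared: newest wins
    }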


When your actual Node.js file runs `require('a')`, it has no context to determine which version it is requiring. To expect Node to look for a package.json file at runtime is absurd. This presents a problem.


I'm guessing the current behavior for require is to bubble up the directory structure and look for "node_modules".

So resolving things at runtime is already happening.

I agree using package.json at runtime is not the best solution though.

A way to keep things clean and avoid using package.json is to use symlinks instead of downloading a fresh copy, which would make putting everything at the top level a possibility.

I'm sure that option was considered, but I'm curious about why it was not taken.

I suppose I should look at the discussion notes. Hope I'll remember to do that when I'm home!


Symlinks would make copying and archiving the node_modules structure more problematic.


Symlinks break peer dependencies.


Why is this absurd? Order-dependent installation and massive node_modules trees seem a lot more absurd to me. Why not just have node parse all the package.json's (or, perhaps, npm-shrinkwrap.json) at startup time? Then `require` could easily parse that and know what dependency version from the flat node_modules to deliver.


That version information could be encoded into symlinks if it weren't for the need to support Windows.


So, assuming I have two libraries, A and B, that both require the same version of library C, do those libraries still get their own separate in-memory copies of C, or do they share a singleton?

It's terrible practice, but it's not unheard of for an NPM module to monkey patch its dependencies, since before this the library could assume it had sole ownership of its whole subtree.


If you depend on A and B, and both A and B depend on the same version range of C, C is now a top-level dependency.

Your node_modules will look like this:

    - Package_A
    - Package_B
    - Package_C
It's only when A and B depend on different versions of C that cannot be resolved to a single safe version via semver that you get nesting:

    - Package_A
    -- node_modules
    --- Package_C
    - Package_B
    -- node_modules
    --- Package_C
I am pretty certain that monkey patching your dependencies is frowned upon in the Node world. It's best to fork the repo, make your changes, and then depend on that.
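
In the hoisted case they do share a singleton, because node's module cache is keyed by the resolved filename. A quick way to see it (names follow the sketch above):

    // from inside Package_A and Package_B alike, require('package_c') resolves
    // to the same <root>/node_modules/package_c entry point, so node's module
    // cache hands back one shared object:
    const fromA = require('package_c');
    const fromB = require('package_c');
    console.log(fromA === fromB); // true -- which is why monkey patches leak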


Sadly, this is the result of the second situation:

    - Package_A
    - Package_C_vX
    - Package_B
    -- node_modules
    --- Package_C_vY


Interesting. This would indeed be a problem. I hope they don't share them between modules, because if one mutates a dependency, it would be nearly impossible to debug/fix.


Jspm solves this by installing each module in a directory with the package version appended. There is no "maximally flat" tree; the tree is 100% flat. At most there are several versions of the same package side by side, but no nesting.

It even supports circular dependencies.
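
So the installed layout looks something like this (versions arbitrary):

    jspm_packages/
      npm/
        lodash@3.10.1/
        lodash@4.0.0/    <- two versions side by side, no nesting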


Are there any trade-offs to this approach? This seems so obvious to me that I'm confused why it isn't more widely used.


In node, require()'ing a dependency is stateless - it searches ./node_modules/ for the module, then ../n_m/ then ../../n_m/, etc, until it finds an appropriately named module.
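
Roughly (a simplified sketch of that walk, ignoring core modules, file extensions and package.json "main"):

    const fs = require('fs');
    const path = require('path');

    // walk upward from the requiring file, checking each node_modules dir
    function resolveModule(name, fromDir) {
      for (let dir = fromDir; ; dir = path.dirname(dir)) {
        const candidate = path.join(dir, 'node_modules', name);
        if (fs.existsSync(candidate)) return candidate;
        if (path.dirname(dir) === dir) throw new Error('Cannot find ' + name);
      }
    }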

In JSPM/SystemJS, require()'ing/importing a module is still by name (as it supports NPM modules), but the package.json file has to be parsed in order to map module names to an installed module version. Note, this mapping is only done in the developer environment - once you build a bundle all the mapping is statically compiled into one file.


Wow. Thanks for opening my eyes to that. I was wondering why I would use another package manager.


This is good news for client-side bundlers like Webpack and Browserify -- where size really matters -- since they no longer have to end up with multiple copies of the same module.

I would also assume for very large apps it may improve startup time because you don't have to initialize and retain multiple copies of the same module.


This has been around for a number of months. The sad thing is that because of this unpredictable (or rather arbitrarily alphabetical) `npm install` order, different dependency trees can result, which can still lead to a very common module being bundled multiple times. I was a fan of bower's strictly flat model because it prevents such duplication and even notifies you when incompatibilities occur. However, bower seems to be losing the battle with NPM as the de facto web/javascript module repo.

The allowed/unpredictable duplication can even cause very hard to identify bugs when a peer dependency relies on "instanceof" checks and there are multiple versions of this dependency. I've seen it happen with React and Backbone to name a couple.

If `npm install` allowed control over install order (instead of just being alphabetical), and there was a way to be notified of incompatibilities that would cause potentially unnecessary duplication, that would at least be something that could prevent problems like this from occurring.


I don't know why that would be the case. Browserify can do semver comparison of dependencies and not bundle multiple copies. I would assume they were already doing this; if not, why?

I work on a client-side module loader, StealJS[1], that implements the npm algorithm (2 and 3) in the browser, and 3 makes things worse for us. With NPM 2 we could load a project without causing 404s (except in the case of require("./folder")), but since NPM 3 makes install order matter, it's no longer deterministic and 404s are more common.

It would be nice if NPM had some working group to discuss their algorithm and invited in other implementers for feedback.

[1] http://stealjs.com/


[deleted]


Not really -- unless you want to fork each top-level dependency and update the common dependencies to the latest common subversion in package.json (and maintain that). Far from ideal IMO.


While I can understand the want for that, there are too many issues. The only reason npm does it this way is because a module that depends on B v1.3 and a module that depends on B v2.1 could be introducing some really bad bugs or breakage if you force all modules to use B v2.1.

That's one of the reasons bower is losing out.


forcing a single copy of a dep is exactly the opposite of the dependency strategy both the node module loader and npm take. the fact that the node module loader can load more than one version of a module into memory is its strength and npm plays to this, and in doing so avoids "dependency hell". you might checkout this page in the docs that talks more about this: https://docs.npmjs.com/how-npm-works/npm2


as the person who wrote these docs-- if you have questions or things you'd like to see addressed, i'd really love if you filed issues on the repo. https://github.com/npm/docs/issues


If this works it might reduce the slug size of a built node app by a lot or a little.

Though, if there are problems, I wonder - can the flat dep resolution be disabled using some CLI flag? Or when installing deps, or in .npmrc, or during a shrinkwrap?


I just filed a bug: https://github.com/npm/npm/issues/10999

I guess I'm not sure what level of non-determinism they expect, but on this page: https://docs.npmjs.com/how-npm-works/npm3-nondet it appears to make the claim that the only effect is on tree structure, not the actual versions of packages that are picked up. And in fact in their example this IS the case. I think this is fine btw.

However, I have found edge cases where install order actually changes the versions of packages that are picked up, and in ways that make it very very difficult to work around (basically you will be forced to manually edit a shrink-wrap file -- so it is necessarily on the end user not the package writer).

Basically, if any package lists an absolute dependency (vs a semver range), it will affect ALL the packages alphabetically later than it and FORCE them to take the same dependency.
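
If I'm reading it right, the failure mode looks like this (hypothetical names):

    node_modules/
      aaa/    <- depends on ccc@1.0.0 exactly
      ccc/    <- 1.0.0, hoisted because aaa installs first alphabetically
      zzz/    <- depends on ccc@^1.0.0; the hoisted 1.0.0 satisfies it, so it
                 never gets 1.4.0 even if that's the latest release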


I do not understand why they didn't just have a two-level directory structure:

node_modules/[module_name]/[version]

Then it would be flat and support multiple versions of the same module in a way that is completely deterministic and also fully deduplicated.

This new system is unnecessarily complex.


The file system layout is constrained by how dependency resolution works. For example, a require() call has a very (computationally) simple algorithm.

I certainly agree that your suggestion simplifies file-system layouts. The tradeoff is that the complexity shifts to other parts of the system.

That said, I'm not a fan of the v3 approach. I'd have preferred a central package cache with a structure similar to your suggestion. I'd add that each package in the cache should have all of its dependencies resolved in its own /node_modules/ dir with symlinks. Unfortunately, I still can't see a nice way to handle peer dependencies. Peer dependencies require the ability to walk up the file system to resolve, which you can't do with symlinks.


With that approach, the module loader needs to know which version of a module to load for you when you require() something - in the general case that can't be done without parsing your package.json. This would require changes to node itself, not just npm, as well as to all the external tools that have implemented node's current require algorithm.


Sorry to be negative but one of the things that I hate more than anything in my software development work is typing "npm install blah" and almost always being hit with wave after wave of errors typically related to dependencies. I don't know why it happens and I don't care I just wish they'd fix it. So many, many errors.

Go on, try installing X packages at random using npm - did they install cleanly?

The baseline outcome for using a package installer should not be reams of errors, it should be a cleanly installed package. Installing packages works fine with other language ecosystems, why not with npm?


NPM has some dumb legacy features like optionalDependencies. optionalDependencies are often native code dependencies that might fail but the package is still usable, maybe just some specific feature isn't. That's the most common reason I come across for errors.


Whatever the reason, they should fix it - I spend all day fighting through vast numbers of problems and errors in all sorts of software systems but nothing ranks as high as "npm install" for bad user experience. It's the one thing that browns me off more than anything - I dread having to type "npm install" and its my number one gripe in a world of broken software which is really saying something when so much software is broken.


This will help Windows users who have filesystem and zip compression issues with deeply nested dependency trees.


Windows to this day still has the 260-character path limit (MAX_PATH).

http://msdn.microsoft.com/en-us/library/aa365247(VS.85).aspx...

https://en.wikipedia.org/wiki/Long_filename

Microsoft should finally fix such old limitations (-> update the WinAPI), instead of workarounds being added to third-party projects like Node.js.

https://github.com/Microsoft/nodejstools/issues/69

They could also improve the command line shell (cmd.exe) that also PowerShell relies on.


That's a serious breaking change to existing windows programs and can safely be put in the "never going to happen" pile.


V2's deep nesting recently caused trouble for us at work. We had errors where paths in our node_modules folder were too long for Windows, due to the deep folder structure. Updating to v3 resolved the problem. Hopefully other flaws being discussed here can be worked out as npm evolves.


Will this prevent node_modules/ from having 17 copies of the same file in 17 different places?


yes, as long as there aren't version conflicts.


Which there will be, because everyone uses --save, so each module depends on whatever version was around when they added it.


--save adds the fairly loose ^ version prefix, so anything within the same major version should match, meaning fewer conflicts than if you'd added an exact version match, or a ~ match.
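
Concretely, per npm's semver rules:

    "foo": "1.2.3"    exact match only
    "foo": "~1.2.3"   >=1.2.3 <1.3.0  (patch updates only)
    "foo": "^1.2.3"   >=1.2.3 <2.0.0  (minor and patch updates)
    "foo": "^0.2.3"   >=0.2.3 <0.3.0  (caret narrows below 1.0.0)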


A good percentage of npm modules are <1.0.0 so you're very likely to get dupes.


I think they are trying to be too smart about it... I'd rather waste several GBs of HDD space than have my production crash once because of dependencies.



