Npm Private Modules (npmjs.com)
141 points by mts_ on April 14, 2015 | 74 comments



This reads to me as though they're simply not going to improve Git support. Right now you can point to a repo or a specific branch/tag, but npm doesn't read any of the versioning the way Bower does, so it can't handle versioning properly. If they improved Git support, you could eliminate the need for this feature entirely.

That kinda rubs me the wrong way; it feels like they're forcing this monetization when plenty of us in the community have been asking for better Git support for this very reason.

Honestly, npm isn't a very complex piece of software; you could even replace it with Bower if you want Git with versions. I wonder if any competitors will spring up who can simply iterate faster. I won't even get started on their login system, ugh.


> Right now you can point to a repo or a specific branch/tag, but npm doesn't read any of the versioning the way Bower does, so it can't handle versioning properly.

You can even point to a specific commit hash. But no, it doesn't read any of the versioning. What if two different branches have the same version number? The publish step is pretty important, because ...

> If they improved Git support, you could eliminate the need for this feature entirely.

... You'd still miss out on the prepublish hook - one of the major reasons I've looked into using NPM Enterprise.

Sometimes your package needs to do some housekeeping _before_ you publish, and the files generated by that should be distributed -- but not checked into source control.
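For example, here's a minimal sketch of a package.json that uses prepublish this way (the build script and output directory are hypothetical):

  {
    "name": "my-private-module",
    "version": "1.0.0",
    "scripts": {
      "prepublish": "node build.js"
    },
    "files": ["lib/"]
  }
With `lib/` in .gitignore, the generated files ship inside the published tarball without ever being checked into the repo.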


> You can even point to a specific commit hash. But no, it doesn't read any of the versioning. What if two different branches have the same version number?

The semver mechanism would be based on the tag, so it would not work when pointing to a specific commit hash (and you wouldn't want it to).

> ... You'd still miss out on the prepublish hook - one of the major reasons I've looked into using NPM Enterprise. Sometimes your package needs to do some housekeeping _before_ you publish, and the files generated by that should be distributed -- but not checked into source control.

That's a nice feature and all, but it isn't required in order to have proper Git support.


You can easily refer to a version by specifying it after the hash:

  some-user/some-module#1.0.0

Now, this is not very robust, as it does not automatically increment semver. That being said, a module is not a git repo. There are pre-publish hooks that can be used to execute scripts before publishing. Further, tracking off of git introduces an entirely new set of problems compared to tracking tarballs. I can see why the team would not be particularly interested in investing time in it.


> Now, this is not very robust, as it does not automatically increment semver.

That was the parent's criticism. Other systems do increment semver.


> read any of the versioning the way Bower does, so it can't handle versioning properly

What would the "proper" behavior be? How would npm know which commit is the one you intended to be the version change? Scan the history of package.json and only grab the commit where the version number changed?

Isn't that entirely the point of using tags?


NPM doesn't use tags to do semver comparisons. To NPM, a tag is no different from a branch or a commit hash. This differs from Bower and a half-dozen other package managers that do semver against tags.


Bower has a backend that reads the git repo and converts tags into semver versions. They automated npm's publishing process. Doing this in real time in the npm client would be very time-consuming.


> Doing this in real time in the npm client would be very time-consuming.

Nonsense; this is a very quick operation in git. Even in enterprise situations, you're probably not going to have a huge number of private repositories.


We're already using "private" modules by hosting them in a private git repo. NPM can install from these (e.g., git+ssh://user@server:Account/Repo.git) without problems. I'm not sure what value this would add for us, and the linked page doesn't do a good job of selling me on it.


This is indeed a great way to shim, but in case any readers aren't aware of the limitations:

* NPM is unable to keep track of the version. This means no semver, and `npm install` re-installs the module every time.

* Git submodules do not work. This can sometimes be a deal breaker.

* This requires either your username/password or private key to install. This gets especially messy if you are trying to do some type of CI build (e.g., building a Docker container) and don't want your keys to leak into the build artifact.


I would say, more correctly, that NPM doesn't keep track of the version, not that it isn't able to. Many git tags are version numbers; NPM just doesn't parse them.


PHP's Composer does this properly, so it's surprising that NPM doesn't.


Not really. Nobody would need private modules if it did, and they need some monetization strategy.


I would say that tying the capability of a language's main package installer to the monetary needs of the company hurts how I view node/npm as a stable platform.

Imagine if RubyGems/PyPI/NuGet went with this strategy.


We're glad private repos work for you :-) However, the overwhelming feedback from our users has been that git dependencies are a gigantic pain in the neck, and this has been our own personal experience as users of npm. Much like npm itself, we are doing something you could do without npm; it's just much easier with npm, and we hope that reduction in friction is valuable :-)


I take this to mean you won't be improving git dependencies, then.


Is it modeled strictly per-user, or is there any notion of an "organization" similar to GitHub? It's hard to imagine a business having to manage paying for a bunch of individual accounts just so they can have access to the company's private repo.


Organization accounts are coming soon:

  Currently, private packages are only available for individual users, but support for organization accounts is coming soon. Feel free to create a user for your organization in the meantime, and we can upgrade it to an organization when support is here.
https://www.npmjs.com/private-modules#organizations


NPM Enterprise is a potential solution. You have to host it yourself, but it gives you namespaced modules and the option of selectively mirroring the public registry.


How does versioning work with this? Do you have to point directly at a branch/tag and get the latest, or can you somehow use fuzzy semver versions, i.e., versions compatible with 1.4 ('^1.4')?


There is no versioning this way, unfortunately. So yeah, you have to point at a branch or tag.
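To illustrate the difference in package.json terms (module names and URL are made up): the first dependency below gets fuzzy semver resolution from the registry, while the second is pinned to whatever the ref happens to point at:

  "dependencies": {
    "from-registry": "^1.4.0",
    "from-git": "git+ssh://git@example.com/org/from-git.git#some-branch"
  }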


IMHO, tying a language's de facto package manager's capability to a single company's business model may hurt Node.

No other language does this. You have all the same capabilities for proprietary packages with RubyGems/Python/NuGet; even Ubuntu allows you to add third-party package repositories.

It gives me the impression that npm may no longer improve, or that many improvements will be "premium" features.

I hope someone forks npm and adds the ability to parse semver from tags for GitHub repos.


Everything looks pretty awesome, except the payment model.

I'm a little surprised they didn't go with a model similar to GitHub:

- Payment plans with X number of private modules, or $1/month per private module

- Unlimited paying/non-paying collaborators for private modules (perhaps only read access for non-paying)

With npm's model all my collaborators will have to pay for npm private modules as well.


Charging per-module like GitHub does would encourage people to author fewer packages and to lump more functionality into the same package, which goes completely against the ethos of npm and the spirit of tiny abstractions that do one thing well.

The way that npm has structured things, programmers pay once for membership in a commercial tier, where presumably money is already changing hands to work on private code. I think this makes much more sense and won't bias the code itself in a negative direction.


That's a very good point!

I hadn't thought of it from the perspective of encouraging/incentivizing the authoring of more public packages.

But couldn't the current model then discourage the authoring of public packages, and lead people to start primarily publishing private packages?

I assume whatever reasons people have for authoring public packages today won't change just because they get a paid account. Let's at least hope that's the case.


Sounds like a big pain to have to pay individually for each person on a team if your company wants to use private modules. We're generally willing to throw money at the kinds of problems private modules solve, but if we have to do it a dozen times, it probably isn't going to happen.


Does anyone know whether the open-source npm implementation allows me to implement and host my own private repository system without forking npm? Or is npm (and thus node and io.js) hard-tied to npm, Inc.'s proprietary offering for private modules?


Yes, the registry URL is configurable. As of npm 2.x, you can also specify a different registry URL for specific scopes, so, for example, `@foo/bar` can live at a different registry than `@baz/boo`. "Global" packages (i.e., those without a scope designator) are always fetched from the top-level "registry" config.

https://docs.npmjs.com/misc/config
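As a quick sketch, the per-scope mapping lives in .npmrc (the scope name and internal URL below are made up):

  registry=https://registry.npmjs.org/
  @myorg:registry=https://registry.internal.example.com/
With that in place, `npm install @myorg/bar` hits the internal registry while unscoped installs still go to the public one.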


Nexus serves npm packages, Ruby gems, and YUM packages, and also serves as a Maven repository.

https://books.sonatype.com/nexus-book/reference/npm-configur...

Source code: https://github.com/sonatype/nexus-oss


One of our DevOps guys set up Nexus to serve npm packages. It allows both publishing private packages (meaning you get semantic versioning support, unlike git links) and caching public packages from npm. It works extremely well.
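If anyone wants to try the same, pointing npm at Nexus is a one-line registry setting; the hostname and group path below are hypothetical and depend on how your Nexus is laid out:

  npm config set registry http://nexus.example.com/nexus/content/groups/npm-all/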


That was my first thought on reading the business plan: it sounds like the Nexus tools used with Maven.

I suppose if you are not using Maven, though, a custom tool integrated with the Node toolchain would be more comfortable.


You can configure different registries on a global and per-scope basis:

https://docs.npmjs.com/misc/scope#associating-a-scope-with-a...

Some of the other commercial npm hosts and open source offerings already support scoped packages.


Oh wow, I'm so behind the times :) Great, and thanks!

OK, in that case I hereby officially really like the way npm, Inc. is working on monetization.


Have a look at the sinopia package. You can run your own server on your own machine (at no cost).


Sinopia [1] lets you run a private npm registry. It can also act as a cache for the official npmjs.com registry.

[1] https://github.com/rlidwka/sinopia
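For anyone who wants to kick the tires, the basic setup is roughly this (sinopia listens on port 4873 by default; treat it as a sketch, not a hardened deployment):

  npm install -g sinopia
  sinopia
  npm set registry http://localhost:4873/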


Sinopia has iffy authorization support, last I looked.

We ended up going with NPM Enterprise instead.


Can you explain what you mean by that?


You had to manually add users' authorization information from their .npmrc to sinopia's YAML configuration file and then reload the application. This was a showstopper.

NPM Enterprise's GitHub authorization module requires an access token in the user's .npmrc file but no modification to the actual running app to add/remove users: they just have to configure their .npmrc correctly. (By default, if a user has read access to the GitHub repository referenced by the repository field in package.json, they can install the package; if they have write access, they can publish it too.) No fiddling with sending someone's npm credentials around the network, inserting them in a YAML file, and restarting the registry application.
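For a rough idea of the user side, it amounts to a couple of lines in ~/.npmrc (the registry host and token below are placeholders):

  registry=https://npm.internal.example.com/
  //npm.internal.example.com/:_authToken=<your-access-token>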

It was six months ago that we looked into Sinopia, though, so it may have improved since then.

We also looked at using nodejitsu, but their administrative portal was abysmal, their billing buggy, and their reliability unimpressive. All of which explains the GoDaddy purchase ;)


I don't know much about this, but our sysadmin reports that Sinopia has a pluggable authentication system, and that we ended up writing our own plugin that uses PAM (and thus system accounts) underneath.

The auth token you need locally in your own .npmrc is added automatically when you do "npm login".


That's new, then. When we were looking at it, you had to supply the npm auth token in the YAML config: it just did plaintext checking against its list of available logins.

We wanted something available "now," not "in a bit, after we write an auth plugin for it."


We've made an attempt to make sure that scoped modules are supported in npm-registry-couchapp:

https://www.npmjs.com/package/npm-registry-couchapp

npm's open-source project, which lets you run your own registry on CouchDB.


The Python equivalent is simply to place your source release tarballs in a private web or filesystem directory:

    pip install --find-links http://dist.example.com/packages/
This also makes it simple to mirror all requirements locally for more reliable installs.

I wish I could do the same with npm.


If anyone is wondering about the pricing like me, just head to the home page: "publish unlimited private modules for just $7/month".

Here's a question: do people with read-only access to my private packages have to be paid users too?


From the page:

> Give read access or read-write access for those packages to any other paid user


My bad, I completely missed that.

So in the end it's $7 per user; you can't have just one paying user publishing for everyone else. I guess I'll have to wait for organization accounts for now.


Suggestion: add a pricing link? I usually Ctrl-F for "pricing" on a page like this; make it easier for me to find.


If anybody else is wondering, it costs $7 per month, per user.


I'm fearful of where this monetization is going. Part of what I like about npm is that everything is free, from the dinky little packages that do one thing really well, to the monumental ones that provide a host of functionality.

I would hate it if npm went the way of WordPress plugins, where every stupid little plugin costs $5 to access.


I might be missing the obvious, but what is the difference between this and hosting your own private git repository?


And now it's down:

http://status.npmjs.org/


And it's back up, but you can see a dip in the charts.


Somewhat off-topic: "NPM Private Module" would be an excellent recursive acronym.


I recently got interested in Node.js. However, I then discovered that:

1. It doesn't support threads (which would facilitate structural sharing of large data structures between parallel tasks, something that cannot be done using ordinary processes).

2. The module-loading mechanism ("require()") doesn't natively support delayed loading, which is needed when loading from within a browser. Yes, there is the "browserify" package, but come on, something as basic as this should be supported out of the box. Especially considering that there is an "http" module hardwired inside Node.js (why isn't this a separate npm module, btw?).

3. To make my own privately held modules and install them properly, I have to run an npm server? This seems like an awful lot of work for something this basic. OK, so now I can use the cloud for this, but come on: I should be able to do this straight from the filesystem, like git does.

For people interested, you can use the "sinopia" package to host your own private modules. It seems to be a pretty decent package, but be aware that the out-of-the-box authentication settings are completely insecure.


1. Node.js is (for the most part) single-threaded; that's its draw. It's not trying to be a Swiss Army knife, and if your use case requires a threaded language, then Node.js certainly isn't the tool for that job. But it might find a useful place in your toolbox for other tasks.

2. require() is part of the CommonJS spec, and how it physically works is dependent on the implementation. You point out that Node's implementation doesn't work well in the browser, but Node itself does not work in the browser so that point is moot. I agree that it might be interesting to load remote modules in Node, but keeping that operation synchronous does simplify the language quite a bit.

3. You can also map modules to public or private git repositories in the package.json, as long as the private key used during npm install has access. If the git repo has tags, a tag can be specified in the git URI as well. Private npm repos are the superior way to distribute private modules with wider access, but I think this is handled fairly cleanly already.
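For instance, here's a hedged sketch of a dependency pinned to a tag (the host, org, and tag name are invented):

  "dependencies": {
    "private-module": "git+ssh://git@github.com/some-org/private-module.git#v1.0.0"
  }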


Thanks for your comments. Some remarks here.

1. Node.js is a tool for building servers. On a server, you generally cannot afford to have the event loop blocked by a computationally intensive task. You need threads.

2. It would only require a "promise" to make the module-loading asynchronous. Leaving that out is not what I would call "quite a bit of a simplification", especially if using asynchronous callbacks is the "modus operandi" of programming on the Node.js platform itself.

3. Okay, I stand corrected. I remember that I waded through the documentation quite a bit though, trying to figure this out.


> On a server, you generally cannot afford to have the event loop blocked by a computationally intensive task.

You are not supposed to use your main event loop for computationally intensive tasks.

Offload those tasks to separate workers and use queues.

That's basic Node knowledge. It's a trade-off that you're supposed to be aware of when using Node.
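As a rough illustration of the offloading pattern (file names hypothetical), the main process forks a worker and stays free to serve requests while the child grinds through the heavy job:

  // main.js -- fork a worker and delegate the heavy job to it
  var fork = require('child_process').fork;
  var worker = fork(__dirname + '/worker.js');

  worker.on('message', function (result) {
    console.log('job finished:', result);
  });
  worker.send({ n: 1e9 }); // the event loop stays free in the meantime

  // worker.js -- runs the blocking computation in its own process
  process.on('message', function (job) {
    var sum = 0;
    for (var i = 0; i < job.n; i++) sum += i; // CPU-heavy loop
    process.send(sum);
  });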


The problem with workers is that they don't have an event loop (like the main thread has). So it is not possible to use asynchronous code written for the main thread in those worker threads, which is of course quite limiting.

EDIT: I mean workers which run in a thread (as opposed to in a process). An example is given by the webworker-threads npm module. Threads allow one to structurally share large data structures, so one does not have to serialize them when calling a worker (serializing large structures would block the main thread).


Sorry you are getting a lot of downvotes. For what it is worth, I don't think you deserve them, as your comments just show inexperience and a lack of understanding of Node.js, and aren't trolling. However, I think you would be well served by doing some research into what Node.js is and how it works. Basically, every Node.js process has an event loop. Your workers have an event loop just like your servers do.

Here is how a typical node stack works:

An Nginx load balancer talks to a cluster of Node server processes, one per core. The server processes handle all incoming web requests that won't block the event loop. On a typical REST server this is 99% of your tasks, and each Node process can handle thousands of concurrent requests due to the way the event loop works.

If there is a heavy, blocking task like processing an image or PDF file (although even these should be doable in a non-blocking, streaming manner), the server processes send a message through a background queue such as RabbitMQ or Amazon SQS to a background process whose sole purpose is processing heavy tasks pulled from that queue. Fundamentally, if you are using Node.js properly, you don't need multiple threads. Instead you use multiple processes, and the processes are essentially "threads" that can talk to each other using parent/child process communication, HTTP, Redis pub/sub, or any other mechanism you want.

But there is no reason why anything should block a Node.js process if it is written properly. I've even done heavy video transcoding in a streaming manner in a Node.js process without blocking the event loop.


The reason for the downvotes, I suspect, is that this looks like an attempt to derail a thread to get tech support on a barely related topic. Worse, the initial comment was worded as "this thing sucks because..." instead of as a question, despite showing very little knowledge about the thing it complained about.


Thanks for the explanation and the moral support :)

I think most people here misread the line "facilitating structural sharing of large data-structures between parallel tasks, which cannot be done using ordinary processes" in my first post.

And by large data-structures, I don't necessarily mean structures which can be "naturally" streamed. I'm thinking more of a large index, for example, which can be used for fast lookup, and be used from several threads at the same time.

Having processes (here named workers) is a nice feature, but doesn't cut it when you want to share large amounts of data between threads (serializing that data would completely block the main thread). In my view, it is unfortunate that the designers of Node.js didn't opt for having multiple threads as opposed to putting every thread in a separate process.


I ended up replying to one of your other comments with more details, but the answer to this problem is streams. You can use streams for any and all incoming data, whether it is file data uploaded via multipart upload from a browser, a streaming result set from a database, or raw data streaming out of a storage service like S3 or Dropbox. There is even a streaming JSON parser for Node.js in case you have the ungodly situation of, say, a 500 MB JSON file or something horrible like that: http://oboejs.com/


You're describing the worst possible use case for Node, and one that it explicitly is not intended to handle well. If you're doing computationally intensive tasks on large data sets, use a language that supports that. Node.js is intended for I/O-heavy workloads.


Node workers are just new processes. They do have an event loop.


@1: You don't need threads if workers are enough for you; that's how you should do computation-intensive tasks...


Computation-intensive tasks often take large amounts of data as input. And sharing data with a worker always has to be done by serializing this data (in a message). So for large inputs, this approach doesn't work (the main thread would block while serializing the messages).

But my biggest problem with workers is that they don't have an event loop, so I can't share asynchronous code between the main thread and the workers.


There is no need to serialize large amounts of data. The way it is designed to work in Node is that you use a stream. So, for example, let's say you have a multi-TB data dump in Amazon S3, you want to process it, and then you want to upload a transformed multi-TB result set back to Amazon S3. (This is something I've worked on before.)

The way it works is you open a download stream from S3, pipe it into a Node.js transform stream, and then pipe that stream into an upload stream that uploads the data back to S3 using the multipart upload API.

The Node.js design is very much like using Unix pipes. You can pipe a huge multi-TB file through grep without blocking anything. The data just streams from disk into the grep process, grep filters it down to the things that match, and then streams the results onto the screen.

Computation on huge streams in Node.js works the same way. Your event loop remains unblocked even when operating on a stream TBs in size, because you are only ever touching a portion of the dataset at a time. Additionally, if you do it properly, your overall memory usage remains low, as you are exporting the data back out of the machine as fast as it comes in. I've used this technique to process streaming data many GB in size while keeping the Node process under 200 MB of memory used, from the system's perspective.
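To make that concrete, here is a minimal sketch of the pattern with local files standing in for S3 (swap the two ends for S3 download and upload streams in practice; the file names are made up):

  var fs = require('fs');
  var stream = require('stream');

  // Transform stream: uppercases each chunk as it flows through.
  // Only one chunk is in memory at a time, so the event loop stays free.
  var upcase = new stream.Transform();
  upcase._transform = function (chunk, enc, done) {
    this.push(chunk.toString().toUpperCase());
    done();
  };

  fs.createReadStream('huge-input.txt')
    .pipe(upcase)
    .pipe(fs.createWriteStream('huge-output.txt'));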

Recommended reading: https://nodejs.org/api/stream.html

Here is an example of an upload stream that I created for the use case of processing a large multi-TB data set and piping the result up to Amazon S3: https://www.npmjs.com/package/s3-upload-stream


For streaming, I can see that this can work.

But basically, what I wanted to do is implement a module that works as an index shared between threads (e.g., a search tree for fast lookup). However, since in Node.js all threads live in separate processes, it is (afaict) impossible to make this efficient, as processes do not share data.


So in Node.js this would be accomplished by using a shared data store like Redis. For example, I run eight processes per c3.xlarge instance, and the processes share a Redis instance which contains data like that. In particular, indexes could be stored in the Redis hash structure.

Basically Node.js is designed around the concept of microservices and separation of concerns. Rather than doing everything in one giant, multithreaded monolithic process you break your service up into loosely coupled components that talk to each other via messaging and share common datastores. Some people really like this pattern (I'm a strong advocate of it myself) because it scales really, really well.
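A tiny sketch of that idea using the `redis` client (the key and field names are invented); every worker process can hit the same index concurrently:

  var redis = require('redis');
  var client = redis.createClient(); // shared store that all processes connect to

  // Any process can write an index entry...
  client.hset('user-index', 'alice', '42');

  // ...and any other process can look it up, without serializing the whole index.
  client.hget('user-index', 'alice', function (err, id) {
    console.log('alice ->', id);
  });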


Well, the "index" was merely an example. Actually, what I want to do is implement persistent data structures (a.k.a. functional or immutable data structures) in a combination of JavaScript and C++. See [1].

[1] http://en.wikipedia.org/wiki/Persistent_data_structure


The `servicebus` module is a really cool way to coordinate events between microservices, especially if they don't necessarily "know" about each other.


It sounds like Node.js is not what you want, then.


One can also install modules from a git repo, where access to the module can be controlled in the regular git fashion:

  "dependencies": {
    "private-module": "git+ssh://..."
  }
or:

  "dependencies": {
    "private-module": "git+https://<user>:<password>@..."
  }

If one chooses to use GitHub, there is also the option of using an auth token in the URL instead of needing to distribute an SSH private key (bad) or putting login credentials in the package.json:

  "dependencies": {
    "private-module": "git+https://<token>:x-oauth-basic@github.com/<account>/<repo>.git"
  }
One could also just have a folder of private modules mounted from some shared file server or whatever. An NPM server is not an absolute requirement to use npm.
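As a sketch of that last option, npm 2.x can also resolve a dependency straight from a local path (the path here is made up):

  "dependencies": {
    "private-module": "file:../shared-modules/private-module"
  }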


> (why isn't this a separate npm module, btw?)

It is[0]. It's a dependency of browserify.

If you don't use it, part of the concat+minification step prunes dead code.

> To make my own privately held modules and install them properly, I have to run a npm server? This seems like an awful lot of work for something as basic as this. Ok, so now I can use the cloud for this, but come on, I should be able to do this just from within the filesystem, like e.g. git does it.

If you don't care about having your package published somewhere other than your local machine, you can use npm link[1], or `npm install <directory>`[2].

0. https://github.com/substack/http-browserify/

1. https://docs.npmjs.com/cli/link

2. https://docs.npmjs.com/cli/install
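For anyone who hasn't used it, `npm link` is a two-step symlink dance (the paths are hypothetical):

  cd ~/code/my-module && npm link        # registers a global symlink to the module
  cd ~/code/my-app && npm link my-module # symlinks it into my-app/node_modules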



