This should also be a reminder to everyone that you shouldn't rely on a single point of failure for your deploys. It's something that we in the Python community have already encountered (and hopefully learned from) due to the historical unreliability of our equivalent package repo, PyPI.
Have an internal repo that's accessible by your deploy servers, which locally caches anything you would otherwise have to fetch externally.
bundle package puts all your app's dependencies in vendor/cache. That directory can then be put into a git submodule.
The problem then becomes the Gemfile and Gemfile.lock, which should really be in that submodule as well. You need to pass flags to bundler commands because bundler assumes the Gemfile is in the project root.
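For reference, a rough sketch of that workflow (the submodule URL and the deps/ path are made up for illustration):

    # Cache every gem the app needs into vendor/cache
    bundle package

    # Keep Gemfile, Gemfile.lock, and vendor/cache together in a
    # separate repo, pulled in as a submodule (URL is hypothetical)
    git submodule add git@example.com:ourapp-deps.git deps

    # Install only from the local cache; --gemfile points bundler at
    # the submodule, so it resolves deps/vendor/cache
    bundle install --local --gemfile=deps/Gemfile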
I don't think Heroku's deploy is smart enough to recognize that you've packaged, right? It'll still try to bundle install, which would break in the current situation.
I think a full solution requires packaging, and using a modified buildpack that skips the bundle step.
It places the packaged .gem files in vendor/cache, as noted. Check those into source control.
"While installing gems, Bundler will check vendor/cache and then your system's gems. If a gem isn't cached or installed, Bundler will try to install it from the sources you have declared in your Gemfile."
Yeah, I knew that part... I wasn't sure what the default heroku ruby buildpack did. I'm still digging into the source to see what the build process is. It's non-trivial.
UPDATE:
For others' edification, the default heroku ruby buildpack respects vendor/cache, but will purge it in the following scenarios (roughly sketched in shell after the list):
* if vendor/ruby_version exists
* if vendor/heroku/buildpack_version exists, but vendor/heroku/ruby_version does not
* if the bundler cache exists, but the vendor/heroku/ruby_version file specifies a different version of ruby than the one actually being used.
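A rough shell translation of those checks (the variable names and the purge command are my guesses for illustration, not the buildpack's actual code):

    # $CACHE_DIR and $RUBY_VERSION are assumptions for illustration
    if [ -f "$CACHE_DIR/vendor/ruby_version" ]; then
      rm -rf "$CACHE_DIR/vendor/bundle"                  # scenario 1
    elif [ -f "$CACHE_DIR/vendor/heroku/buildpack_version" ] && \
         [ ! -f "$CACHE_DIR/vendor/heroku/ruby_version" ]; then
      rm -rf "$CACHE_DIR/vendor/bundle"                  # scenario 2
    elif [ -f "$CACHE_DIR/vendor/heroku/ruby_version" ] && \
         [ "$(cat "$CACHE_DIR/vendor/heroku/ruby_version")" != "$RUBY_VERSION" ]; then
      rm -rf "$CACHE_DIR/vendor/bundle"                  # scenario 3
    fi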
The way we handle that for our Python deploys is to have a separate "deploy" git repo which includes complete .tar.gz files of all of our dependencies, then have our pip requirements.txt file point to those file paths rather than using external HTTP URLs.
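Concretely, it looks something like this (the layout and package versions are illustrative):

    # Hypothetical layout of the "deploy" repo:
    #   deploy/
    #     packages/Django-1.4.3.tar.gz
    #     packages/requests-1.0.4.tar.gz
    #     requirements.txt
    #
    # requirements.txt lists local paths instead of package names:
    #   ./packages/Django-1.4.3.tar.gz
    #   ./packages/requests-1.0.4.tar.gz

    # --no-index additionally keeps pip from ever consulting PyPI
    cd deploy && pip install --no-index -r requirements.txt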
To avoid packages sneakily trying to download their own dependencies from the internet, we run pip install with a "--proxy http://localhost:9999" argument (where nothing is actually running on that port) so that we'll see an instant failure if something tries to pull a dependency over the network.
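In full, the fail-fast invocation is just:

    # Nothing listens on port 9999, so anything that tries to reach
    # the network (e.g. a setup.py fetching its own dependencies)
    # fails immediately instead of silently downloading
    pip install --proxy http://localhost:9999 -r requirements.txt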
We do something very similar, but like you said there are the occasional sneaky devils trying to download their own dependencies. Nine times out of ten it seems like it's some version of distribute that they insist on fetching.
The non-existent proxy trick seems useful; I'll have to try that out.
Indeed. I presume this is why Perl's package repository CPAN is actually a network of repositories ("Comprehensive Perl Archive Network"); Wikipedia says CPAN "is mirrored worldwide at more than 200 locations."
Does anyone know why rubygems does not work this way? I had always just assumed it did (due to the historical intertwining of Ruby and Perl communities).
The centralized architecture of rubygems lets you publish and yank gems within minutes; with CPAN, propagation takes some hours (and you may have no control over deletion).
Personally I'm a big fan of the CPAN approach as it is fairly simple: just mirror via FTP. It's a no-brainer to set up and run a mirror.
That said, CPAN's master (PAUSE.cpan.org) is a SPOF as well.
What I like is that no single party is responsible for paying the server bills and maintaining the platform. Ruby Central and the team of volunteers do a great job, but in the end, people only care when something breaks.
Instead, every big company/university that profits from the Ruby ecosystem should imho run a public rubygems mirror as a contribution to the open source world. That's common practice for other projects, too. Think of all the mirrors of Linux distributions, kernel.org, CPAN, Python, etc.
I also want to mention that ftp.ruby-lang.org is a single-homed box. There is no other official mirror of the MRI/C-Ruby source that can be used as a failover or load balancer. This is bad, too.
Agreed. We've looked into running our own mirrors for rubygems and there's nothing really supported out there. The addition of git gems in bundler means you'd really need a git mirror tool as well.
If I had to guess, I would wager it's because it's expensive and hard. Plus there's the fortunate coincidence that, as far as I recall, rubygems has mostly Just Worked Fine, Thank You Very Much™.
(I miss the days from when github also hosted a gem repository…)
Solving the authenticity problem alone is probably not fun, though obviously there is much to be learned from CPAN. Given the recent problems, there will probably be enough political will to make this happen in the future.
I only recently realized how easy it is to run your own PyPI: it just has to handle a few HTTP GETs and POSTs.
If you want to run your own PyPI internally, here's a very simple PyPI server (~150 lines of Python) that I wrote:
https://github.com/steiza/simplepypi
What I've personally been looking for is an easy to setup caching proxy for PyPI. Something that is pip-compatible and serves files if it has them but will also fetch and then store packages if it doesn't. That way you could build up a collection of 3rd party packages over time, without having to explicitly manage it.
It probably wouldn't be hard to roll my own with a reverse proxy but it never gets moved to the front burner.
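For what it's worth, nginx's proxy_store gets you most of the way there. A rough sketch, with the port, cache path, and upstream as placeholders:

    # Serve a file from the local cache if present; otherwise fetch
    # it from PyPI and store the response on disk for next time
    cat > /etc/nginx/conf.d/pypi-cache.conf <<'EOF'
    server {
        listen 8080;

        location / {
            root /var/cache/pypi;
            try_files $uri @pypi;
        }

        location @pypi {
            proxy_pass http://pypi.python.org;
            proxy_store /var/cache/pypi$uri;
            proxy_store_access user:rw group:rw all:r;
        }
    }
    EOF

    # Point pip at the mirror:
    pip install --index-url http://localhost:8080/simple/ requests

The catch is that the /simple/ index pages get cached too, so you'd still need some expiry logic before this is really hands-off.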
For most people/enterprises, no. But there are still many places in the world -- in the US, even -- with slow and/or spotty internet connections, so it would make sense for them.
> It's something that we in the Python community have already learned due to the historical unreliability of our equivalent package repo, PyPI.
"learned" sounds a touch condescending to me for some reason. The Python community has certainly run into it, but (anecdote time) in my experience people still often rely on PyPI for their deploys (but use the --mirrors option to pip).
"Encountered" may be more appropriate.
True, "learned" does sort of imply that it's a best practice now used by nearly everyone in the community. I know that's far from the truth. "Encountered" is more appropriate, so I'll edit my OP.