I maintain a database of all the nodes (servers, Linux or BSD, so not talking about routers) as well - one would go mad otherwise. And I feel that at certain scales one should tell hosts what to do, not rely on some auto-discovery process - especially if that process is not out-and-out cloning.
Edit: just to clarify, I am not attacking your choice here or your contribution to the state of the art on GitHub - we need more people making things. But I would like to understand why a guy who is unlikely ever again to run more than a couple dozen servers needs to learn to debug and battle something like Chef when the imperative scripting approach is simpler and more direct.
I suppose I am asking: at what scale does Chef et al. pay off compared to expect-like scripts (assuming the development time of a simple script is roughly equal in both)?
Then, yes, Chef is probably overkill. Although the resources (template, command, etc.) are nice and helpful. But I wouldn't replace an existing working process with Chef if there's no obvious benefit.
Do it early, for many reasons. I worked as a systems administrator for startups before we wrote Chef, and I wouldn't go back to managing infrastructure by hand.
A few reasons:
* You don't have to maintain that database manually, Chef collects and stores node information for you.
* If you've already automated building out your servers, you're all set when you need more of them.
* The recipes document your systems, so if you "get hit by a bus" someone else can figure out how the bits get around.
* When your datacenter burns down, you can rebuild quickly from bare metal, a chef repository, and backups of your application data.
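To make the first point concrete, here is a minimal sketch in Chef's Ruby DSL of how automatically collected node data can replace a hand-maintained inventory. The attribute names (`fqdn`, `platform`, `platform_version`) are from Ohai's standard set; the motd example itself is just illustrative:

```ruby
# Ohai populates node attributes at the start of every chef-client run,
# so recipes can read them instead of querying a hand-maintained database.
file '/etc/motd' do
  content "#{node['fqdn']} - #{node['platform']} #{node['platform_version']}\n"
  mode '0644'
end
```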
Also see this article by Jesse Robbins, one of the Opscode founders, written before Chef existed:
> needs to learn to debug and battle something like chef when the imperative scripting approach is simpler and more direct
The biggest difference is, as the previous replies hinted at, that Chef/Puppet/CFEngine each run "constantly", checking the state of the system against their rules.
With your Fabric approach, you can only cause changes to happen when you're in front of the computer and you run "fab apply", "fab apt-get-upgrade", etc.
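To illustrate the contrast, here is a minimal declarative sketch in Chef's Ruby DSL (the NTP example is just illustrative; the resource types are Chef's standard ones):

```ruby
# chef-client converges these resources on every scheduled run, so any
# drift is corrected automatically - no human typing "fab apply".
package 'ntp'

template '/etc/ntp.conf' do
  source 'ntp.conf.erb'
  mode '0644'
  notifies :restart, 'service[ntp]'
end

service 'ntp' do
  action [:enable, :start]
end
```

Each resource declares a desired end state rather than a sequence of commands, which is what lets the agent re-check it on every run.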
FWIW I also use a simpler alternative, recently documented, introduced, and explored here:
With the obvious caveat that you must have written the rules/policies/recipes for that to happen.
(I've seen some CFEngine repositories, for example, that assume Apache is present and just start tweaking the configuration files. If you're doing it properly, you'd first define a rule to install Apache. That's why it is often instructive to test your rules on a pristine/virgin installation of your preferred distribution, to see that they work from start to finish.)
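In Chef terms, the fix is simply to declare the package before touching its configuration, so the run succeeds on a bare box. A sketch (not taken from any particular cookbook; the template source name is made up):

```ruby
# Install Apache explicitly instead of assuming it is already present,
# so this recipe converges from a pristine install.
package 'apache2'

template '/etc/apache2/apache2.conf' do
  source 'apache2.conf.erb'
  notifies :reload, 'service[apache2]'
end

service 'apache2' do
  action [:enable, :start]
end
```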
> And I feel that at certain scales one should tell hosts what to do, not rely on some auto discovery process - especially if that process is not out and out cloning.
That's one thing in Chef that scares me: access control on node attributes is nonexistent, even in private/hosted Chef. A node can modify its own run_list, its own tags, etc.
A frontend node that gets rooted can suddenly become a backend database. If the environment is configured to allow for automated configuration of everything, this can get ugly very quickly.
Even with Chef, human approval of new nodes and roles is still mandatory.
I wrote a firewall cookbook that takes advantage of Chef's search capabilities to dramatically reduce the maintenance of our firewalls, while keeping the rules ultra-specific (inbound and outbound). Check it out: https://github.com/jvehent/AFW/ & http://jve.linuxwall.info/blog/index.php?post/2012/11/14/AFW
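The search-driven pattern looks roughly like this (a sketch only, not AFW's actual rule format; the role and environment names are made up):

```ruby
# Find every production frontend via Chef search and open the database
# port to each one. The rules stay specific: one source IP per rule.
frontends = search(:node, 'role:frontend AND chef_environment:production')
frontends.map { |n| n['ipaddress'] }.sort.uniq.each do |ip|
  # e.g. emit an iptables rule such as:
  #   -A INPUT -s #{ip} -p tcp --dport 5432 -j ACCEPT
end
```

Because the search re-runs on every chef-client run, adding a frontend node updates the database firewall automatically, without anyone editing rules by hand.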