I did a midrange deployment with bcfg2 (~250 machines), and I found it slow to use.
Firing off a bcfg2 push would take almost 20 minutes, which was pretty much unusable. The big problem was that it didn't handle large numbers of simultaneous connections well, so I ended up having to stagger them.
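For anyone wondering what staggering looks like in practice, here is a minimal Python sketch. It is not bcfg2 code: `staggered_push`, `batches`, the batch size and the pause length are all hypothetical names and values chosen for illustration, and `push` stands in for whatever actually triggers a client run on one host. The idea is simply to contact a limited number of machines at once and wait a little between groups.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List


def batches(hosts: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` hosts."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(hosts), size):
        yield hosts[start:start + size]


def staggered_push(
    hosts: List[str],
    push: Callable[[str], object],  # hypothetical: triggers a run on one host
    batch_size: int = 25,           # assumed cap on simultaneous connections
    pause_seconds: float = 5.0,     # assumed gap between batches
    sleep: Callable[[float], object] = time.sleep,
) -> List[str]:
    """Push to hosts one batch at a time, pausing between batches."""
    done: List[str] = []
    groups = list(batches(hosts, batch_size))
    for index, group in enumerate(groups):
        # Hosts inside a batch are contacted concurrently.
        with ThreadPoolExecutor(max_workers=len(group)) as pool:
            list(pool.map(push, group))  # list() re-raises any push error
        done.extend(group)
        # No need to wait after the final batch.
        if index < len(groups) - 1:
            sleep(pause_seconds)
    return done
```

With ~250 machines and a batch size of 25 this makes ten rounds, so the server never sees more than 25 connections at a time. The right numbers depend entirely on what the server can handle.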
CFengine was able to do the same push in < 30 seconds. Win.
I'll be looking at Chef again in the near future. It's got a LOT of nice features. Hopefully we can make it scale out quickly.
I use a custom AMI. I started with a base image a long time ago, and since then I have been upgrading and tweaking it. There are several tutorials on creating the image. API tools are available for the whole process if you are on an instance-store image. The bundling step creates an image of the instance filesystem in /mnt, breaks it into chunks to upload to S3 along with a manifest, and registers the manifest as a new private AMI.
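To make the chunk-and-manifest idea concrete, here is a rough Python sketch of the concept only. It is not what the Amazon tools actually run: the function name `split_with_manifest`, the part naming, the chunk size and the JSON manifest layout are all assumptions for illustration, and the real tools do more than this (and produce their own manifest format). It only shows how one image file becomes numbered parts plus a manifest that lists them.

```python
import hashlib
import json
import os


def split_with_manifest(image_path: str, out_dir: str,
                        chunk_size: int = 10 * 1024 * 1024) -> str:
    """Split image_path into parts in out_dir and write a manifest.

    Returns the path of the manifest. The JSON layout is hypothetical.
    """
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.basename(image_path)
    parts = []
    with open(image_path, "rb") as image:
        index = 0
        while True:
            data = image.read(chunk_size)
            if not data:
                break
            name = "%s.part.%02d" % (base, index)
            with open(os.path.join(out_dir, name), "wb") as part:
                part.write(data)
            # Record enough to verify and reassemble the image later.
            parts.append({
                "name": name,
                "size": len(data),
                "sha1": hashlib.sha1(data).hexdigest(),
            })
            index += 1
    manifest_path = os.path.join(out_dir, base + ".manifest.json")
    with open(manifest_path, "w") as manifest:
        json.dump({"image": base, "parts": parts}, manifest, indent=2)
    return manifest_path
```

Uploading would then mean copying every part and the manifest to a bucket, and registration points at the manifest rather than at any single part.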
Yes, that is what I've done, or rather I just log in, set things up, save a snapshot, and repeat as needed. I'm not sure this will speed up iterations in general, but it may save the $0.10 you would otherwise spend setting up a workable VM.