Shoutout for Hetzner's 99 euro/month server with a GTX 1080, much better than the pseudo-K80s that Google Cloud provides for $520/month. The Google K80s are half or a quarter the speed of a real K80, which is part of the reason they show up so badly in the comparison.
Just to reiterate barrus's point (he's the product manager for K80s): K80s come with two dies per board, so we're giving you the finer granularity. We struggled with the wording, but since both NVIDIA and AMD ping-pong between one and two dies per board for their top part, we didn't want to define the minimum granularity as "a part sold by a vendor". So there's no conspiracy and no half- or quarter-speed nonsense; it's just probably not as clear as it should be that this is half a K80 board.
You could survey your customers to see how widely this is understood. Likewise "cores"/hyperthreads.
Meanwhile, the 10x price/performance difference is the main point. Really eager to see the TPUs rolled out broadly; please do price them to take market share from NVIDIA.
GTX 1080 GPU, i7-6700 Skylake, 64 GB DDR4 RAM, 2x 500 GB 6 Gb/s SSDs for €99/month with a one-time €99 setup fee.
My lord. HPC resources are incredibly affordable. Hetzner and some of the other dedicated server companies in Europe/Canada have some amazing deals (we've used OVH in the past with great success, and right now we use Paperspace for CPU-intensive stuff we want to share expensive licensing on, like Visual3D).
Wow, to go off topic: I've been using Versaweb for the past 3-4 years, but became very unhappy after they forced a "server management" fee down our throats.
We've been looking at setting up a small cluster of servers at work (budget of about $500), and I was still going to go with Versaweb. After seeing Hetzner, I'm going to reassess, and likely move everything there.
I'm paying €150 for what it seems I could pay €100 for. There was something that made me decide against Hetzner a few years ago, but I'll research and see if their TOS are now different.
Thanks again!
EDIT: My numbers are wrong; I'm going to pay less for 4x the RAM (256 GB).
How so? I understand that there are some differences between the two, but the fundamentals are the same; which is that I want a physical server that I can manage.
Their pricing structures are slightly different: Versaweb gives me a bit more flexibility when configuring, a wider IP subnet bundled (instead of 1 usable IP), and a few other things I'm still investigating.
I also have to consider laws and network latency as these are in different regions.
In the end, I am paying $180 for a Haswell Xeon with lots of disk space and IO. I could pay the same amount for more RAM on the same CPU, albeit with slightly less space.
If I keep the same setup at a fraction of the cost, I could end up adding the 1080 GPU in the same datacenter. It feels like the same or a similar market for my needs...
>How so? I understand that there are some differences between the two, but the fundamentals are the same; which is that I want a physical server that I can manage.
They're on different continents, which is a pretty fundamental difference.
I'm sure they are, but there's turnover, customer service, attrition, obsolescence, etc. all baked in, though clearly GTX 1080s will be valuable for some time, as will the i7 Skylake architecture.
Relative to the market, that price is very, very good.
If I were to buy such a system it would be over €2000, and that's not including cooling or a case for it either. Granted, I live in Sweden, so taxes are a bit on the high side.
Regardless, it will be at least 20 months before they make a dime (€2000 / €99 per month ≈ 20 months, assuming they can rent it 100% of the time). And during that time it will take up rack space along with electricity, bandwidth (2 Gbit/s and 50 TB per month), and a dedicated IP.
And after all that time the hardware is no longer that hot, but it still draws just as much electricity.
Just signed up and ported my model + data:
- it's indeed noticeably faster than the Google VMs. As usual, I compiled TensorFlow for this GPU vs the K80 (compute capability 6.1 vs 3.7; see the sketch after this list).
- Ubuntu 16 minimal is indeed "minimal"! But it worked...
- the GTX 1080 (7.92 GiB) has less GPU RAM than the K80 (11.17 GiB) - this required me to reduce the model design slightly.
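In case it helps anyone porting over, this is roughly the sanity check I run first (a sketch against the TF 1.x-era API; nothing here is Hetzner-specific):

    # Print each GPU's description - on Pascal it includes "compute capability: 6.1"
    import tensorflow as tf
    from tensorflow.python.client import device_lib

    for d in device_lib.list_local_devices():
        if d.device_type == 'GPU':
            print(d.physical_device_desc)

    # Keep TF from grabbing all ~7.92 GiB up front; allocate as needed instead
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    sess = tf.Session(config=config)

allow_growth matters more on the 1080 than on the K80, since you only have ~7.92 GiB to play with.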
For my model/data, Hetzner runs 1 training epoch in 1 hr vs 1.75 hr for Google. I'm moving the rest of my work over tomorrow. When Google has TPUs available, I'll look at it again.
This is presumably just the full-board-versus-half nomenclature noted above. But yes, consumer GPUs are way more cost competitive than Tesla-class parts. Being able to train bigger models is valuable to some folks, but not everyone, so I don't begrudge anyone using the GTX line.
(Also, I can no longer edit, but a colleague pointed out that I should have read more carefully. A GTX 1080 is a Pascal part, so compared to the poor old Kepler in the K80 it'll really shine. Volta all the more so in the next year.)
Yeah, I also have a €30/month Hetzner dedicated server with 2x TB HDDs and 32 GB RAM.
At the same time, my company sometimes pays up to $1000/month for a really weak AWS machine because of the cost of traffic and storage. Ridiculous, but... yeah, it's not my money.
Looked into this a bit more. The GTX 1080 is based on the Pascal architecture and so will be faster than any Kepler-based K80 on any cloud - even faster than a K80 card with 2 GPUs. The GTX is a consumer board and is less expensive than the datacenter equivalent, the P100 PCIe card. The P100 has 16 GB of HBM2 memory (twice the memory of the GTX 1080 and more than twice the memory bandwidth) and supports ECC if you care about detecting memory corruption. The P100 will be faster than the GTX 1080 once it is available. As I said before,
GCP offers K80 GPUs in passthrough mode and you can use a single K80 die ($0.70 / hour billed by the minute) or you can attach up to 8 K80 GPUs to a single VM. Disclaimer: I am a product manager for GPUs in Google Cloud.
The P100 is about 10x the price of the 1080 ($6000-9000 vs $500 for the 1080 and $700 for the 1080 Ti).
I've talked with several second-tier cloud providers, and the GTX 1080 Ti is what their large-deployment customers use. At the NVIDIA conference they were all promoting the P100 (NVIDIA insisted), but they all admitted that nobody has asked them to deploy P100s at scale.
The Hetzner box is about €0.15 an hour. That means more GPUs per developer.
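(Back of the envelope: €99/month over ~730 hours is roughly €0.14/hour, versus $0.70/hour for a single K80 die on GCP - about 5x cheaper per hour before you even count the Pascal-vs-Kepler speed gap.)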
Each Google K80 GPU is one die, i.e. 1/2 of a K80 board, so technically you are correct that a Google K80 GPU is half of a K80 board. However, they are offered in passthrough mode and achieve full performance. If you want a whole K80 board, attach 2 K80 GPUs to a single VM. You can have 1, 2, 4 or 8 K80 GPUs attached to each VM in GCP. (I'm one of the GPU product managers at Google Cloud.)
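A quick way to convince yourself of the board/die distinction (a sketch against the TF 1.x device API):

    # With 2 K80 GPUs attached to the VM, TensorFlow should enumerate two
    # CUDA devices - i.e. one full K80 board.
    from tensorflow.python.client import device_lib
    gpus = [d for d in device_lib.list_local_devices() if d.device_type == 'GPU']
    print(len(gpus))  # expect 2 when a full board (2 dies) is attached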
It would be polite to indicate that on the price list (understatement).
While you're here: the other reason we switched to Hetzner is reliability. Sure, we can resume training from the last checkpoint, but we still lost half a day on average to the many surprise reboots. We suspect that you've overbooked the GPUs and someone has to lose when too many connect.
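For context, the resume pattern we rely on is roughly this (a minimal TF 1.x sketch; 'ckpt_dir' is a made-up path):

    import tensorflow as tf

    global_step = tf.Variable(0, name='global_step', trainable=False)
    # ... model definition goes here ...
    saver = tf.train.Saver()

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        latest = tf.train.latest_checkpoint('ckpt_dir')
        if latest:
            saver.restore(sess, latest)  # pick up where the reboot killed us
        # ... training loop; periodically:
        saver.save(sess, 'ckpt_dir/model', global_step=global_step)

Even with that in place, every reboot still costs the time since the last save plus the restart overhead.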
If you upvote it, you can find it again through your profile. I'm on mobile, but I think you can also favourite comments via the timestamp link; those show on your profile too.
I never got to use Hetzner, so I can't comment on how good they are, but I ran into issues with them because of their convoluted ordering process.
I was trying to calculate the total cost, as their list price excludes VAT. It turns out they just booked the server for me and started sending invoices. Of course they allow you to cancel within 14 days, but I was handling a personal issue and didn't check my emails for almost a month. It turned messy.
If Hetzner support are listening, please improve the process and, if possible, take credit card/payment details upfront so the person is aware that you are spinning up a server for them.
https://www.hetzner.com/dedicated-rootserver/ex51-ssd-gpu?co...