I would expect ~2,000 tapes in the back of a pickup, at 6TB each. Note that:
- Tapes are easier to stack and load in boxes
- Tapes are more resistant to shock and damage from vibrations
- Tapes are generally more resistant to damage from the environment
- Tapes weigh less than hard drives (this is true whether measured per tape or per byte)
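A rough sanity check on the ~2,000-tapes figure, assuming LTO-style cartridge dimensions (~102 x 105 x 21.5 mm) and a guessed pickup bed of ~1.5 m³ at 50% packing efficiency (the bed volume and packing factor are my own assumptions):

```python
# Back-of-envelope: how many cartridges fit in a pickup bed, and what 2,000 of them hold.
cartridge_volume_m3 = 0.102 * 0.105 * 0.0215   # LTO-style cartridge, ~0.23 L
bed_volume_m3 = 1.5                             # assumed full-size pickup bed
packing_efficiency = 0.5                        # boxes, padding, wasted space (guess)

cartridges = int(bed_volume_m3 * packing_efficiency / cartridge_volume_m3)
capacity_pb = 2_000 * 6 / 1_000                 # ~2,000 tapes at 6 TB each

print(f"cartridges that fit: ~{cartridges}")    # ~3,300: same ballpark as the ~2,000 estimate
print(f"capacity of 2,000 tapes: {capacity_pb:.0f} PB")  # 12 PB
```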
But if you transmit the same data over a network, you still need to read from and write to the tape/HDD/whatever on each end, unless you're talking about RAM-to-RAM speed.
IMHO you should just account for the time to mount/connect the HDDs to the computer.
I don't think this is true for the general case, though. At some point one has to declare that the data has "arrived" in its usable form at the destination, and RAM seems like the wrong threshold (since, for now, it's volatile).
Although one could consider HDDs to be immediately usable at the destination once they're "plugged in", there are too many exceptions (including good reasons not to re-use a device subjected to such conditions in production) to simply assume that's true.
If they were SSDs, I might agree, but those are still too expensive for the use case.
If it takes 4 hours to write to the drive/tape, 4 hours to transport it, and 4 hours to read it back in, that is 12 hours. If you are sending it over a wire and the transfer itself takes 6 hours, the reading and writing happen at the same time as the transmission, so the total time is still 6 hours (you don't have to read the entire drive before beginning the transmission, and you write out to the remote side as it is being received).
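A minimal sketch of that timing argument, with the stage durations as illustrative placeholders rather than measured figures:

```python
# Sneakernet: write, transport, and read happen one after another.
write_h, transport_h, read_h = 4, 4, 4
sneakernet_total_h = write_h + transport_h + read_h   # 12 hours

# Network: reading, transmitting, and writing overlap, so the slowest stage
# dominates (ignoring a small pipeline fill/drain at either end).
wire_h = 6
network_total_h = max(read_h, wire_h, write_h)        # 6 hours

print(sneakernet_total_h, network_total_h)            # 12 6
```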
And you could read/write transported HDDs or tape in parallel. If you’re transmitting over the internet, it’s much more difficult to have parallel transmission paths.
You can still have multiple drives on either side. Sure they are expensive, but if we're talking hundreds or thousands of tapes (we have the size of a station wagon to work with), there's a budget for more than one drive on either side. Even with drives, if you get too elaborate of an enclosure or JBOD, you're looking at thousands too.
Hell, you could even bring the drives with the tapes!
My point isn't that parallelism isn't possible with tapes, but rather that it isn't comparable, due to cost, being off by an order of magnitude or two.
> Even with drives, if you get too elaborate of an enclosure or JBOD, you're looking at thousands too.
This actually supports my point (except the "too", which is misleading). One can reasonably expect that $2k-$3k enclosure to support 16-44 disks for that price [1], compared to a tape drive's singleton.
[1] A full-featured Synology NAS with 8 drives is under ~$1k, for example, while a CSE-847E1C-R1K28JBOD holds 44 drives for $2800.
If you’re saving on the order of a hundred dollars per tape (12TB), you’re going to have enough left over for a few readers.
But - this is a silly argument. I think we actually agree on all of this and we’re talking about an extreme hypothetical that while I’d love to test out, isn’t something I’m going to do in the near future!
Although, a few years ago, I did have a similar issue. I needed to move about 500TB of data from one data center in Menlo Park to a new one in SF. We tried everything we could to make sufficient backups, but it was hard to back up 500TB of data over even a 10Gb/s link. We ended up just moving the physical JBODs in the back of a rental car. Most stressful drive I’ve ever made.
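For a sense of why that was hard: even a perfectly saturated 10Gb/s link (no protocol overhead, no contention, both optimistic assumptions) takes days for 500TB:

```python
data_bits = 500e12 * 8      # 500 TB in bits
link_bps = 10e9             # 10 Gb/s, assumed fully saturated
days = data_bits / link_bps / 86_400
print(f"{days:.1f} days")   # ~4.6 days at line rate; real-world transfers take longer
```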
> If you’re saving on the order of a hundred dollars per tape (12TB), you’re going to have enough left over for a few readers.
That's a good point, and it certainly improves the attractiveness of tape overall. Still, a $200 saving on media doesn't make up for a $2600 difference in drive cost. How much does it improve the appeal of tape for parallelism?
The media+drive cost of LTO-8 for 12TB would be around $2850, while for disk it's only $550. Considering LTO-8 has transfer speeds around 1.6x that of disks, only 5 tape units are needed to match 8 disks. That's still $14.2k compared to $4.4k, over 3x. (With disks, I'd want to do RAID6 or RAID-Z3 at the very least [1], which further reduces that ratio, but not below 2x.)
That's certainly much closer to parity than an off-the-cuff estimate would suggest, but it's still not close enough to be practical. Anyone needing to transfer data in bulk, fast, would still do well to choose disks, not tape.
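A quick check of that arithmetic, using the per-unit figures quoted above (street prices as stated in the comment, not independently verified):

```python
import math

tape_unit_cost = 2_850      # LTO-8 drive + one 12 TB cartridge (figure from above)
disk_unit_cost = 550        # one 12 TB HDD (figure from above)
tape_speed_vs_disk = 1.6    # LTO-8 streaming vs. HDD sequential (rough ratio from above)

disk_lanes = 8
tape_lanes = math.ceil(disk_lanes / tape_speed_vs_disk)   # 5 tape drives match 8 disks

tape_total = tape_lanes * tape_unit_cost    # $14,250
disk_total = disk_lanes * disk_unit_cost    # $4,400
print(tape_lanes, tape_total, disk_total, round(tape_total / disk_total, 1))  # 5 14250 4400 3.2
```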
> this is a silly argument. I think we actually agree on all of this and we’re talking about an extreme hypothetical
I disagree as to silliness, and as to argument, since I also think we generally agree. This is a discussion in the spirit of HN, satisfying intellectual curiosity.
I also disagree that it represents an extreme hypothetical. I believe your own anecdote is nowhere near as rare as you imply. Even at lesser scale, it's an issue: the existence of the AWS Snowball is a testament to that.
[1] Ideally true sector-level ECC, though AFAIK, ZFS can provide a close enough approximation
So that’s 10PB/4 days = 231 Gbps
Not bad!
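Checking that figure, with decimal petabytes and 4 x 24-hour days:

```python
data_bits = 10e15 * 8       # 10 PB in bits (decimal)
seconds = 4 * 24 * 3_600    # 4 days
gbps = data_bits / seconds / 1e9
print(f"{gbps:.0f} Gbps")   # ~231 Gbps
```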