Since the OP noted that they were trying to accomplish this task on a c4.xlarge machine, the data is already "in the cloud" and so transfer should be blazingly fast.

Using the zipped CSV at [1] as a size estimate, I'd ballpark 300 million CSV rows at somewhere between 6.7 and 10.6 GB:

    (300000000/500000) * 11.10 MB = 6.7 GB
    (300000000/500000) * 17.68 MB = 10.6 GB
That could take about one minute to transfer. (Obviously, if the rows contain a ton of data, these size estimates could be off by an unbounded amount.)
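
For anyone who wants to plug in their own numbers, here is a rough sketch of the same back-of-envelope arithmetic. It assumes the 11.10 and 17.68 figures are the sizes in MB of two 500,000-row sample files, and uses 1 GB = 1000 MB; the function names are made up for illustration. It also works out what throughput a one-minute transfer would imply, rather than asserting what the instance can actually sustain.

```python
# Back-of-envelope CSV size estimate, scaled up from a smaller sample file.
# Assumption: the sample sizes are in MB and each sample has 500,000 rows.
SAMPLE_ROWS = 500_000
SAMPLE_SIZES_MB = (11.10, 17.68)
TARGET_ROWS = 300_000_000


def estimate_gb(target_rows, sample_rows, sample_mb):
    """Scale a sample file's size linearly with row count (1 GB = 1000 MB)."""
    return target_rows / sample_rows * sample_mb / 1000


def implied_throughput_mb_s(size_gb, seconds):
    """Throughput in MB/s needed to move size_gb within the given time."""
    return size_gb * 1000 / seconds


low_gb, high_gb = (
    estimate_gb(TARGET_ROWS, SAMPLE_ROWS, mb) for mb in SAMPLE_SIZES_MB
)

if __name__ == "__main__":
    print(f"estimated size: {low_gb:.1f} to {high_gb:.1f} GB")
    for size in (low_gb, high_gb):
        rate = implied_throughput_mb_s(size, 60)
        print(f"{size:.1f} GB in 60 s needs about {rate:.0f} MB/s")
```

The linear scaling is the weak point: it only holds if the real rows are about as wide as the sample's rows.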

[1] http://eforexcel.com/wp/downloads-18-sample-csv-files-data-s...