Using the “Range” header and greenlets, you can do very fast downloads from S3. The speed improvements are considerably higher than with uploads. With a 130M file, it was more than three-times faster to parallelize a download.
I’ve put the code in RandomUtility.
Usage:
$ python s3_parallel_download.py (access key) (secret key) (bucket name) (key name) ... 2014-06-23 11:45:06,896 - __main__ - DEBUG - 19% 6% 13% 6% 6% 13% 27% 2014-06-23 11:45:16,896 - __main__ - DEBUG - 52% 26% 26% 26% 39% 26% 68% 2014-06-23 11:45:26,897 - __main__ - DEBUG - 85% 32% 52% 39% 52% 45% 100% 2014-06-23 11:45:36,897 - __main__ - DEBUG - 100% 78% 78% 59% 65% 65% 100% 2014-06-23 11:45:46,897 - __main__ - DEBUG - 100% 100% 100% 78% 91% 91% 100% Downloaded: /var/folders/qk/t5991kt11cb2y6qgmzrzm_g00000gp/T/tmpU7pL8I