
Here is a small test:

  dd if=/dev/zero of=file count=50k bs=100 # create a ~5MB file of zeros
  aes -e -f file -o file2 -p asdfasdf # create aes encrypted version of the file
  tar -czf file.tar.gz file # compress file
  tar -czf file2.tar.gz file2 # compress encrypted file
  du -sh file* # check size of all files
Here is the output I got:

  4.9M	file
  5.0M	file2
  8.0K	file.tar.gz
  5.0M	file2.tar.gz
These results pretty much speak for themselves. Think of it this way: compression works by finding patterns (like "every byte is zero") and storing only the patterns. If the encrypted data still contained patterns, those patterns would leak information about the plaintext, making it easier to recover.
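(If you don't have that particular aes binary, here's a rough equivalent of the whole test using OpenSSL's CLI; assuming a reasonably recent openssl, and the passphrase is just a throwaway:)

  dd if=/dev/zero of=file count=50k bs=100                  # same ~5MB file of zeros
  openssl enc -aes-256-cbc -in file -out file2 -k asdfasdf  # AES-encrypt it
  gzip -c file  > file.gz                                   # compress the plaintext
  gzip -c file2 > file2.gz                                  # compress the ciphertext
  du -sh file*                                              # compare sizes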



Compressing an all-zeros file is not representative of anything. Why not do it with actual text?


It depends on what you want to show; in this case I was showing the size difference between an unencrypted file and an encrypted one. An all-zeros file has almost no randomness, so it compresses extremely well. The encryption step then destroys that structure, leaving a file that is nearly incompressible. I'm really just exaggerating the scale to make the effect more 'dramatic'.

If I happened to be showing how already-compressed formats like JPEG, MP3, or H.264 aren't further compressible, I would definitely pick something more like an actual text file.
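You can also eyeball the difference in structure directly with a hex dump; a quick sketch, assuming xxd is installed (any hex dumper works):

  xxd file  | head -3   # plaintext: the same zero bytes over and over
  xxd file2 | head -3   # ciphertext: indistinguishable from random bytes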


(You know you don't have to use tar for a single file, and can just gzip it outright, right? :P)
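Something like this, for instance (same result, one less layer):

  gzip -c file > file.gz   # gzip the single file directly, no tar wrapper
  du -sh file.gz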



