
The format function is going to end up allocating a string for every single byte. That's a huge overhead.

Edit: Turns out to be about 22% overhead, see https://github.com/sharkdp/hexyl/pull/23. Also it was 2 strings per byte, not 1.
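For anyone curious what the per-byte allocation looks like, here is a minimal sketch. It is not taken from hexyl's source or from the linked PR; the helper names `hex_alloc` and `hex_into` are hypothetical, and it shows one string per byte rather than two. The first function allocates a new `String` on every call through `format!`; the second writes the two hex digits into a buffer the caller already owns.

```rust
// Hypothetical helpers for illustration; not hexyl's actual code.

/// Allocates a fresh String for every byte via `format!`.
fn hex_alloc(byte: u8) -> String {
    format!("{:02x}", byte)
}

/// Lookup table for the sixteen lowercase hex digits.
const HEX_DIGITS: &[u8; 16] = b"0123456789abcdef";

/// Appends two hex digits to a caller-provided buffer.
/// No per-byte allocation as long as the buffer has spare capacity.
fn hex_into(byte: u8, out: &mut Vec<u8>) {
    out.push(HEX_DIGITS[(byte >> 4) as usize]);
    out.push(HEX_DIGITS[(byte & 0x0f) as usize]);
}

fn main() {
    let data = [0x00u8, 0x7f, 0xff];

    // One heap allocation per byte.
    let mut slow = String::new();
    for &b in &data {
        slow.push_str(&hex_alloc(b));
    }

    // One allocation up front, then plain byte writes.
    let mut fast = Vec::with_capacity(data.len() * 2);
    for &b in &data {
        hex_into(b, &mut fast);
    }

    // Both approaches produce the same text.
    assert_eq!(slow.as_bytes(), fast.as_slice());
    println!("{}", slow);
}
```

How much this matters in practice depends on the allocator and on the rest of the output path, so treat the sketch as an illustration of the idea rather than a benchmark.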




Thanks to that PR, hexyl is now slightly faster than hexdump. Both are about a factor of 2-3 slower than xxd:

    Benchmark #1: hexyl $(which hexyl)
      Time (mean ± σ):     169.8 ms ±   8.2 ms    [User: 152.5 ms, System: 17.1 ms]
      Range (min … max):   162.2 ms … 189.1 ms    16 runs
     
    Benchmark #2: hexdump -C $(which hexyl)
      Time (mean ± σ):     188.5 ms ±   4.4 ms    [User: 186.2 ms, System: 2.2 ms]
      Range (min … max):   184.1 ms … 198.2 ms    14 runs
     
    Benchmark #3: xxd $(which hexyl)
      Time (mean ± σ):      72.8 ms ±   2.7 ms    [User: 71.9 ms, System: 1.1 ms]
      Range (min … max):    71.0 ms …  87.8 ms    40 runs


I made a little clone for fun and got a bit carried away optimising. Now at about 3x the speed of hexyl 0.3.1:

https://github.com/sjmulder/hxl

Most of the improvement came from dropping printf, fputs, and putchar and instead building each line directly in an array, which can then be written out with a single fwrite call.
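A rough sketch of that line-buffer idea, written in Rust to match hexyl rather than in C like hxl. This is not hxl's code: the line layout, the constants, and the `format_line` helper are all assumptions made for illustration. Each line is assembled in a fixed-size array and handed to the output with one `write_all` call, which plays the role of the single fwrite.

```rust
use std::io::{self, Write};

const HEX: &[u8; 16] = b"0123456789abcdef";
const PER_LINE: usize = 16;
// 16 "xx " groups, one extra space, up to 16 ASCII chars, one newline.
const LINE_LEN: usize = PER_LINE * 3 + 1 + PER_LINE + 1;

/// Formats up to PER_LINE bytes into `line` and returns the number of
/// bytes used. Hypothetical layout: hex pairs, padding, ASCII column.
fn format_line(chunk: &[u8], line: &mut [u8; LINE_LEN]) -> usize {
    debug_assert!(chunk.len() <= PER_LINE);
    line.fill(b' ');
    for (i, &b) in chunk.iter().enumerate() {
        line[i * 3] = HEX[(b >> 4) as usize];
        line[i * 3 + 1] = HEX[(b & 0x0f) as usize];
        // Non-printable bytes are shown as '.' in the ASCII column.
        let printable = b.is_ascii_graphic() || b == b' ';
        line[PER_LINE * 3 + 1 + i] = if printable { b } else { b'.' };
    }
    let end = PER_LINE * 3 + 1 + chunk.len();
    line[end] = b'\n';
    end + 1
}

fn main() -> io::Result<()> {
    let data = b"Hello, hexdump!\x00\x01\x02";
    let stdout = io::stdout();
    let mut out = stdout.lock();
    let mut line = [b' '; LINE_LEN];
    for chunk in data.chunks(PER_LINE) {
        let n = format_line(chunk, &mut line);
        // One write call per line instead of one per byte or field.
        out.write_all(&line[..n])?;
    }
    Ok(())
}
```

The array is reused for every line, so the loop does no allocation and no format-string parsing.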



