Edit: Turns out to be about 22% overhead, see https://github.com/sharkdp/hexyl/pull/23. Also it was 2 strings per byte, not 1.
Benchmark #1: hexyl $(which hexyl)
  Time (mean ± σ):     169.8 ms ±   8.2 ms    [User: 152.5 ms, System: 17.1 ms]
  Range (min … max):   162.2 ms … 189.1 ms    16 runs

Benchmark #2: hexdump -C $(which hexyl)
  Time (mean ± σ):     188.5 ms ±   4.4 ms    [User: 186.2 ms, System: 2.2 ms]
  Range (min … max):   184.1 ms … 198.2 ms    14 runs

Benchmark #3: xxd $(which hexyl)
  Time (mean ± σ):      72.8 ms ±   2.7 ms    [User: 71.9 ms, System: 1.1 ms]
  Range (min … max):    71.0 ms …  87.8 ms    40 runs
https://github.com/sjmulder/hxl
Most of the improvement came from dropping printf, fputs, and putchar in favour of building each output line directly in an array, which can then be written with a single fwrite call.
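To illustrate the idea, here is a minimal sketch of the technique. It is not the actual hxl code: the names format_line, dump, BYTES_PER_LINE and LINE_CAP are made up for this example, and the output layout (8-digit offset, hex column, ASCII column) is an assumption loosely modelled on hexdump -C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define BYTES_PER_LINE 16
/* Longest line: 8 (offset) + 2 + 16*3 (hex) + 2 + 16 (ascii) + 2 = 78 bytes. */
#define LINE_CAP 80

static const char hex_digits[] = "0123456789abcdef";

/* Format up to BYTES_PER_LINE bytes into `out` (at least LINE_CAP bytes)
 * and return the number of characters written. No stdio calls in here. */
size_t format_line(char *out, unsigned long offset,
                   const unsigned char *data, size_t n)
{
    size_t pos = 0;
    size_t i;

    assert(n <= BYTES_PER_LINE);

    /* Offset as 8 hex digits, most significant nibble first. */
    for (i = 0; i < 8; i++)
        out[pos++] = hex_digits[(offset >> (28 - 4 * i)) & 0xf];
    out[pos++] = ' ';
    out[pos++] = ' ';

    /* Hex column: "xx " per byte, blanks for missing bytes on a short line. */
    for (i = 0; i < BYTES_PER_LINE; i++) {
        if (i < n) {
            out[pos++] = hex_digits[data[i] >> 4];
            out[pos++] = hex_digits[data[i] & 0xf];
        } else {
            out[pos++] = ' ';
            out[pos++] = ' ';
        }
        out[pos++] = ' ';
    }

    /* ASCII column: printable bytes as-is, everything else as '.'. */
    out[pos++] = ' ';
    out[pos++] = '|';
    for (i = 0; i < n; i++)
        out[pos++] = (data[i] >= 0x20 && data[i] < 0x7f) ? (char)data[i] : '.';
    out[pos++] = '|';
    out[pos++] = '\n';

    return pos;
}

/* Dump `in` to `out`, issuing exactly one fwrite per output line. */
int dump(FILE *in, FILE *out)
{
    unsigned char buf[BYTES_PER_LINE];
    char line[LINE_CAP];
    unsigned long offset = 0;
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        size_t len = format_line(line, offset, buf, n);
        if (fwrite(line, 1, len, out) != len)
            return -1;
        offset += n;
    }
    return ferror(in) ? -1 : 0;
}
```

Calling dump(stdin, stdout) from main gives a basic filter. The point is that the per-byte work is plain array stores and table lookups, so the cost of format-string parsing and per-character stdio calls is paid once per line (the fwrite) rather than once per byte.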