Show HN: A (marginally) useful x86-64 ELF executable in 466 bytes

meribold · 2024-03-27T22:12:05 1711577525

This project is a rewrite of a Bash script; I started the rewrite to teach myself a little bit of x86 assembly. I still don't know much, but I did learn that ELF executables can be surprisingly small and simple for a very basic program. A 64-byte file header and a single 56-byte program header are the only "overhead" that is really mandatory (and it's even possible to make those overlap a little).
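
To see those two sizes in a real binary, here is a rough sketch that reads the relevant fields straight out of the ELF header with `od`. The function name `elf_header_sizes` is made up for illustration; the offsets 52, 54, and 56 are those of `e_ehsize`, `e_phentsize`, and `e_phnum` in the 64-bit ELF header; and it assumes a little-endian host, so that `od`'s native 16-bit reads match the file's byte order.

```shell
# Hypothetical helper: print the header-size fields of a 64-bit ELF file.
# Assumes a little-endian host (od reads 16-bit values in native byte order).
elf_header_sizes() {
    file=$1
    # e_ehsize, e_phentsize and e_phnum sit at offsets 52, 54 and 56.
    ehsize=$(od -An -tu2 -j52 -N2 "$file" | tr -d ' ')
    phentsize=$(od -An -tu2 -j54 -N2 "$file" | tr -d ' ')
    phnum=$(od -An -tu2 -j56 -N2 "$file" | tr -d ' ')
    echo "ehsize=$ehsize phentsize=$phentsize phnum=$phnum"
}

# Usage: elf_header_sizes ./btry
```

For a 64-bit executable with a single program header this should report 64, 56, and 1. When the headers overlap, the real overhead is smaller than the sum of those sizes.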

But it's difficult to stop the linker from adding extra stuff, which is why I eventually specified the headers (and three addresses) by hand in the assembly file and stopped using a linker.

LegionMammal978 · 2024-03-28T00:13:52 1711584832

Good luck on your x86-golfing journey! If it helps, I have a nice template for overlapping the two headers into 80 bytes, and stuffing up to ~24 bytes of instructions into them. It's included near the top of my article on the smallest x86-64 ELF Hello World [0]. (In the same article, I have a 73-byte template that's a bit shorter but trickier to use.)

You may find some of the other tricks in the article helpful, but it might be hard to follow depending on your level of experience with assembly. My general advice would be that 'push' and 'pop' are your two best friends if you want to move around 64-bit values.

[0] https://tmpout.sh/3/22.html

meribold · 2024-03-28T15:46:14 1711640774

I'll have a look. Thanks! I'll avoid relying on potentially transient details of Linux's ELF loader, though: I use `btry` daily and don't want to tempt fate too much by having it suddenly break after a kernel update. (Perhaps that's a bit silly given the hardcoded `/sys` path and syscall numbers.)

LegionMammal978 · 2024-03-28T17:47:38 1711648058

You're in the clear with the syscall numbers, at least. Linux treats them as part of the stable public API for each platform. If they want to update a syscall, they have to make a new version with its own number, and keep the old version around for as long as x86-64 is supported.

samatman · 2024-03-27T23:13:45 1711581225

The installation instructions made me smile. "turn this Base64 string into an executable" is the new "curl | sh"!

Why did you decide to build the string backward?

meribold · 2024-03-27T23:38:21 1711582701

I don't remember if it's why I initially did it, but it allows hardcoding the final characters of the output string ("%)\n") somewhere into the ELF header where they don't do damage ahead of execution.
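
Repeated division by ten also yields decimal digits least-significant first, which is one more reason filling a buffer from the end is natural. A minimal shell sketch of the idea (not the actual assembly; `prepend_decimal` is a name made up here), where the suffix is already in place and the digits are prepended in front of it:

```shell
# Hypothetical helper: build "<number><suffix>" from right to left.
# The suffix plays the role of the bytes that are already sitting in memory.
prepend_decimal() {
    n=$1
    out=$2
    while :; do
        # The remainder is the next digit, least significant first.
        out="$((n % 10))$out"
        n=$((n / 10))
        [ "$n" -eq 0 ] && break
    done
    printf '%s\n' "$out"
}

# Usage: prepend_decimal 87 '%)'
```

Called with `87` and `%)`, this prints `87%)`.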

fear91 · 2024-03-27T22:20:49 1711578049

You can shrink it further by using xorl reg, reg. On x86-64, writing to a 32-bit register clears the upper 32 bits of the full register for you, so there's no need for a 64-bit xor reg, reg.

Instead of doing cmp $0, %eax, you can use test %eax, %eax. That's another piece of low-hanging fruit.

It seems that you could also preset a dedicated reg to 0 and another to 1, further shaving a few bytes.
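
For reference, here are the byte counts behind the first two suggestions, as a small tally. The encodings listed are the standard ones for `%rax`/`%eax`; the extended registers (`r8` to `r15`) need a prefix byte either way, so the xor trick saves nothing there. `nbytes` and `saving` are helper names made up for this sketch.

```shell
# Hypothetical helpers: count the bytes in a space-separated hex encoding
# and report how many bytes a replacement saves.
nbytes() { set -- $1; echo "$#"; }
saving() { echo "$(( $(nbytes "$1") - $(nbytes "$2") ))"; }

# xor %rax,%rax (48 31 c0)  ->  xor %eax,%eax (31 c0)
echo "xor:  saves $(saving '48 31 c0' '31 c0') byte(s)"
# cmp $0,%eax (83 f8 00)  ->  test %eax,%eax (85 c0)
echo "test: saves $(saving '83 f8 00' '85 c0') byte(s)"
```

Each substitution saves one byte per occurrence, which adds up in a file this small.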

meribold · 2024-03-27T22:37:49 1711579069

Thanks for the suggestions! I'll definitely look into those. I'd been hoping that posting on HN would help me shave off a few more bytes.

userbinator · 2024-03-28T00:11:57 1711584717

Also learn the string instructions --- I can see plenty of places where a lodsb would help greatly.

im3w1l · 2024-03-28T00:44:43 1711586683

This is kind of a tangent, but there are two types of programming that seem like fun little games: on the one hand, low-level assembly, and on the other, high-level code that uses logic or type-system trickery to prove correctness, or at least some nice properties, of the program.

I just hope that someone at some point figures out how to combine these two things, so we can pursue the best possible way of writing a program, and then prove it correct too. This would be a sort-of immortal program that would never need updating. Well, until it becomes obsolete because the world has changed around it.

binary132 · 2024-03-28T04:07:47 1711598867

I am slowly working on a “programmable assembler” for basically this purpose — the hope is to start with making it easier to construct simple binaries while also leaving headroom to build abstractions for proofs as scripts.

kitd · 2024-03-28T07:01:31 1711609291

Love the installation instructions. No downloads, the instruction is the binary.

lifthrasiir · 2024-03-28T07:22:17 1711610537

It is rather an xz-compressed binary. :-) The following would be an uncompressed version, and I personally don't see any reason to use xz [1]. (Caution: Do not trust me on the validity of this instruction!)

  base64 -d <<EOF > btry && chmod a+x btry
  f0VMRgIBAQAAAAAAAAAAAAIAPgABAAAACQFAAAAAAAA4AAAAAAAAAAAAAAAAJSkKAAAAAEAAOAAB
  AAAABwAAAAAAAAAAAAAAAABAAAAAAAAAAEAAAAAAANIBAAAAAAAA0wEAAAAAAAAAEAAAAAAAAEgx
  wLACSDH2DwVIhcB4M0iJx0gxwEiNdCT3SDHSsgkPBf7ITTH2SDHJSDHSTWv2CopUDPeA6jBJAdZI
  /8FIOch16sMx0kG5QEIPAEH38UGJwInQMdJBuaCGAQBB9/FBsQpB9vGAxDBJ/8qI4EGIAkn/ykHG
  Ai5EicDoAQAAAMNBuQoAAAAx0kH38YDCMEn/ykGIEoP4AHXtw0nHwi0AQABBvCBXaCBIx8eqAUAA
  6E7///9Ix8eqAUAASIXAeGdFiffHRyRub3cA6DP///9IMdJMifBIa8BkSff36KD///9BxkL/KEmD
  6gVFiSJEifjoUP///2ZBx0L+LyBJg+oGRYkiRInw6Dr///8xwP/AicdMidZIx8IwAEAATCnSDwWw
  PEAw/w8FZkG8IEHHRx1jaGFyxkciZelz////L3N5cy9jbGFzcy9wb3dlcl9zdXBwbHkvQkFUMC9l
  bmVyZ3lfZnVsbA==
  EOF

[1] Well, maybe that makes a sort-of free checksum though.
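
Whichever variant gets pasted, it is cheap to sanity-check the result before running it. A sketch, assuming the decoded file should be the 466 bytes from the title and should start with the ELF magic bytes (`check_elf` is a made-up helper, not part of the project):

```shell
# Hypothetical helper: succeed if FILE has the expected size and ELF magic.
check_elf() {
    file=$1
    expected_size=$2
    size=$(wc -c < "$file" | tr -d ' ')
    # First four bytes as hex; a valid ELF file starts with 7f 45 4c 46.
    magic=$(od -An -tx1 -N4 "$file" | tr -d ' \n')
    [ "$size" -eq "$expected_size" ] && [ "$magic" = "7f454c46" ]
}

# Usage: check_elf btry 466 && echo "looks plausible"
```

This only catches a truncated or mangled paste; it says nothing about what the code actually does.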

Kluggy · 2024-03-27T22:07:12 1711577232

So this just reads /sys/class/power_supply/BAT0/energy_now (or charge_now) and /sys/class/power_supply/BAT0/energy_full (or charge_full) and prints the result nicely. While the asm is cool, how large would a Bash script that does the same be?

meribold · 2024-03-27T22:17:59 1711577879

This project started as a rewrite of a Bash script (for learning purposes), so I can easily answer this question. It's indeed small. Here's the full Bash script I used to use (including comments):

    #!/usr/bin/env bash

    # Print the remaining amount of energy (or charge)
    # stored in the battery identified by `BAT0`.  This
    # is what works on my ThinkPad X220 and may not be
    # portable to other laptops without changes.

    cd /sys/class/power_supply/BAT0 || exit $?

    # Sometimes there are `energy_now` and `energy_full`
    # pseudofiles, and sometimes there are `charge_now`
    # and `charge_full` pseudofiles instead.
    if [[ -e energy_now ]]; then
       now=$(<energy_now)
       full=$(<energy_full)
       unit=Wh
    elif [[ -e charge_now ]]; then
       now=$(<charge_now)
       full=$(<charge_full)
       unit=Ah
    fi
    percent=$((100 * now / full))
    # Convert from microwatt-hours (or microampere-hours)
    # to watt-hours (or ampere-hours).
    now=$(bc <<< "scale=1; $now / 1000000")
    full=$(bc <<< "scale=1; $full / 1000000")
    echo "$now $unit / $full $unit (${percent}%)"
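
The two `bc` calls could also be replaced with plain integer arithmetic, which is presumably closer to what a hand-written assembly version has to do anyway. A sketch (the function name `micro_to_units` is made up here): like `scale=1` in `bc`, it truncates rather than rounds, but it prints a leading zero for values below one, where `bc` would print `.5`.

```shell
# Hypothetical helper: format a value given in micro-units (e.g. microwatt-hours)
# as whole units with one decimal place, truncating instead of rounding.
micro_to_units() {
    # Integer part, then the first decimal digit taken from the remainder.
    printf '%d.%d\n' "$(($1 / 1000000))" "$(($1 % 1000000 / 100000))"
}

# Usage: micro_to_units 45678900
```

With `45678900` as input this prints `45.6`.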

bregma · 2024-03-28T10:37:41 1711622261

One is really hard-pressed to argue these little loadable executables are ELF files. They do not conform to the ELF standard. They're simply tiny loadable Linux executables because they exploit the Linux kernel loader's slackness in interpreting ELF binaries.

qweqwe14 · 2024-03-29T02:15:55 1711678555

Prior art: http://www.muppetlabs.com/~breadbox/software/tiny/return42.h...

benterix · 2024-03-28T12:12:22 1711627942

Funny how the tech world comes full circle. I remember writing a similar program (for CPUID) as a COM file in the mid-90s, and it was definitely smaller than 466 bytes, maybe a tenth of that.

Alifatisk · 2024-03-27T21:14:35 1711574075

[flagged]

snvzz · 2024-03-27T23:38:32 1711582712

>unportable

(X) Doubt.

It's no stretch to say that many an HN user could easily port such a trivial program to other assembly languages.

msk-lywenn · 2024-03-27T21:17:22 1711574242

it's not (and I don't think it's supposed to be)

freedomben · 2024-03-28T11:54:11 1711626851

Yeah, that seems like just a factual statement to me.

If you start from the premise that everything is a flex, then you can easily confirmation bias your way into reading it that way, but IMHO that's not a helpful way to live life unless you want to feel like everybody is constantly flexing regardless of whether they are or not.