raimue's comments

Read the code of the check again. It mostly checks that the required SYS_* constants are defined so that the syscalls can be used at all. You can compile this on a system that does not have Landlock enabled in the running kernel, but the libc (which imports the kernel system call interface) has to provide the syscall numbers.


You're right. I didn't see the SYS_* symbols actually being used, but they are: https://git.tukaani.org/?p=xz.git;a=blob;f=src/xz/sandbox.c;...

This doesn't change my opinion in general: that version should be exposed through a library call, and knowing about the specific syscalls shouldn't be needed in xz.


I see your point, but suggesting adding an additional library dependency while we're discussing a supply chain attack is quite ironic.


Should've said function call, not library call. My bad. Basically, if you already have linux/landlock.h, that should provide everything you need without explicit references to SYS_* constants.


Now we are going in circles. As you can see in the git commit, the compile check was added because the existence of linux/landlock.h alone was not enough to check that the feature can be used.

This header defines the data types for the Linux kernel interface, but not how the syscall landlock_create_ruleset(2) will be issued. That is provided by libc, either as a dedicated wrapper function (which does not exist in glibc) or by calling syscall(SYS_landlock_create_ruleset, ...), with the constant also provided by libc. That is how it works for all syscalls, and you won't be able to change this.
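
A minimal sketch of that kind of configure-time compile check, assuming a POSIX shell and cc on PATH: it fails when libc does not provide the SYS_landlock_create_ruleset constant, even if linux/landlock.h is present.

    cat > conftest.c <<'EOF'
    #include <linux/landlock.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    int main(void)
    {
        /* glibc has no wrapper, so the raw syscall(2) interface with the
           SYS_* constant from libc is the only way to issue the call. */
        return (int)syscall(SYS_landlock_create_ruleset, (void *)0, 0, 0);
    }
    EOF
    cc conftest.c -o conftest && echo "Landlock syscall numbers available"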


The only source of the claim that the existence of linux/landlock.h is insufficient is (AFAICT) the malicious git commit. Why trust the comment, written by the attacker, to explain away a malicious change?


I already explained above why the existence of linux/landlock.h is not sufficient. Why do you still question it? If you know a bit about system programming and how configure checks work, the change in itself is totally reasonable.


The question is whether you can expect this format to stay stable and reproducible across git versions. Remember the fallout from git 2.38, when the output of 'git archive' changed. Although for this backup use case, a format change would just mean that the next backup makes a full copy once.


git bundles have a standardised format (defined in [1]), while git archives did not and still do not have one. Git can still handle all previous bundle versions, and you can specify which version of the bundle standard you want to use when creating bundles.

So no, the git 2.38 issue should not be a problem for bundles, even if the format changes in the future.

1. https://git-scm.com/docs/gitformat-bundle
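
For example, pinning the bundle format version explicitly (both commands are documented in git-bundle(1); --version requires a reasonably recent git):

    git bundle create --version=3 backup.bundle --all
    git bundle verify backup.bundle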


Standardized format != guaranteed reproducible. Git makes no promises that it'll keep PACK contents stable, just that they're guaranteed to "deflate" to the same contents.

Which is what the linked article discovered. Threading is a trivial way to discover this, but there are other ways PACK contents might differ across versions.


Yeah, even tar is not stable bit for bit across versions.


Absolute stability is not needed for this use case. Being reproducible within a single git version would be enough to mostly avoid redundant backups.


In the 2000s, Apple ran a big campaign with the phrase "I'm a Mac - I'm a PC" to highlight the difference.


Back then, your suggestions were niche languages (and some still are), and they are still not popular for embedded systems or network equipment. Large runtimes or huge static binaries are not suitable due to the memory and storage constraints.

You have to consider the surrounding ecosystem. Those interested in such languages are not necessarily those interested in contributing solutions to the problem space. Any project attempting to use such a language in OpenWrt would very likely not have survived until today.


The author missed that sshd will always execute the user's shell and pass it the command and its arguments as a single `-c` argument. This means that the given command string will always be parsed by the remote shell. This is required to restrict users to certain commands with special shells like scponly or rbash.

When you keep in mind that the given command string will be parsed twice, first by your local shell and then again by the remote shell, it becomes clear why running a remote ssh command behaves like this.
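
A quick way to see the two parsing passes, assuming $USER differs between the two machines:

    ssh host 'echo $USER'   # single quotes: expanded by the remote shell
    ssh host "echo $USER"   # double quotes: expanded locally, before ssh runs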


Yep! God though, this hits me in the face so often. Trying to add `sh -c` to fix it is a trap, because obviously, you just create yet another layer of escaping.

It really becomes one hell of a puzzle sometimes, especially if you have to nest another layer of escaping. It feels like you're trying to write a quine.

This works:

    ssh host -- ls "folder\ name"
This also works:

    ssh host -- ls \"folder name\"
This works:

    ssh host -- sh -c \"ls \\\"folder name\\\"\"
OK, so clearly, just throwing more escaping at it fixes it. But even if you figure that out, the real mental gymnastics would be figuring out which of the three shells interpreting your command line in the last case would handle shell expansion.

In this case, it's the host:

    ssh host -- sh -c \"ls \\\"folder nam\\\"*\"
In this case it's the remote:

    ssh host -- sh -c "\"ls \\\"folder nam\\\"*\""
Of course where you put the quotes makes no difference. All it does is prevent your shell from processing it. So this works just as well:

    ssh host -- "sh -c \"ls \\\"folder nam\\\"*\""
If you sit and think each layer through, it usually isn't completely impossible to understand, but the odds that you are going to get something wrong the first time are astonishingly high.

It does make me wonder why ssh handles it the way it does, though. Because with the way SSH handles it, it may as well just automatically escape the spaces. Right now, not putting an SSH command in quotes doesn't make much sense unless you for some reason want local shell expansion for something.


If you can invest a little bit of time in configuration of the remote host, then my "special shell" (also mentioned elsewhere on this page) may be of use, and you will no longer need to hit yourself in the face so often!

mention: https://news.ycombinator.com/item?id=36726111

crate: https://crates.io/crates/arghsh


Neat. One little pedantic note: arghsh requires all arguments to be valid Unicode strings, while I think at least Linux allows argv to be arbitrary sequences of non-zero bytes. But then I really hope there are not many cases where that is a relevant consideration.


Good point, I'll add that to the README next time I touch it. To warn off those who were thinking about passing jpegs via argv ;-)


From my experience, it's possible to just do

  ssh host 'ls "folder name"'
and practically ignore the `command [arguments]` split in SSH's synopsis. It's all passed to a shell and parsed again, anyway.


For simple tasks, that's how I have always done it.

Alternatively,

   echo ls \"folder name\"|ssh -T host
or

   cat > 1
   ls "folder name"
   ^D

   ssh -T host < 1


The last example really looks like a perfect case for the `here document` feature.

    ssh -T host <<EOF
    ls "folder name"
    EOF


Once I run into nested escape wrangling I start to seriously question how I'm trying to accomplish something.


Once I run into nested escape wrangling I seriously question the sanity of my tools.


If I can't fix it in like 3 tries, I switch to a local shell script that keeps the script to run on the remote end in a HEREDOC, then just scp it over and ssh exec it. Inside the HEREDOC, it's a sane environment.
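
A minimal sketch of that pattern (paths and the host name are illustrative); the quoted 'EOF' keeps the local shell from touching anything inside:

    cat > /tmp/task.sh <<'EOF'
    ls "folder name"
    EOF
    scp /tmp/task.sh host:/tmp/task.sh
    ssh host 'sh /tmp/task.sh'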


Copying over the commands to run also has another benefit: you have quite good documentation of what was run and when. It also allows one to easily run a failing thing again (in case output gets mangled in the executing script).


And idempotency is essential for shell scripts that alter state. Otherwise you have to keep track of where failures happened.


Wrap it in base64, then: cat it | ssh host 'base64 -d | bash'
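
Spelled out (with script.sh standing in for whatever should run remotely), no escaping of the script contents is needed at all:

    base64 script.sh | ssh host 'base64 -d | bash'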


`printf '%q'` is your friend.
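
For example, assuming bash locally (%q is a bash extension to printf), this expands to ssh host "ls folder\ name":

    ssh host "ls $(printf '%q' 'folder name')"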


> This works:

  ssh host -- ls "folder\ name"
> This also works:

  ssh host -- ls \"folder name\"
Uh, why not

  ssh host -- ls '"folder name"'
? Single quotes are the shell’s ultimate bulk “no touchy” escape, so if you don’t need them in the inner command, it seems easier to use them for everything. (Also when passing programs to sed, awk, jq, xmlstarlet, etc.)


If you do need single quotes in the inner command, you can use the "'" trick:

    ssh host -- ls "'"'folder$name'"'"


$name will never be expanded. Is that what you expected?


Yes, that’s the intent: using single quotes at both levels will prevent either your shell or the remote shell from expanding the name, so you can interact with a file name containing a literal $.


On another note, avoiding having paths with troublesome chars in them on servers does generally make the sailing much more pleasurable.


I believe that also works, but I don't do it often because I often wind up wanting to be able to use variables from the host, especially e.g. in cron jobs and scripts.


Yes, that’s a valid thing to want.

At that point, though, I’d try to write a general shell escaping function, because I don’t trust myself to figure out which host things are OK to include unescaped in such a situation. (Here’s when I start to long for Tcl, despite all the times I’ve had to spell out [string index ...].)
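
A sketch of such an escaping function, assuming POSIX sh and sed, with $dir holding an arbitrary path (note that command substitution drops trailing newlines):

    # Wrap the argument in single quotes, escaping any embedded ones,
    # so the result survives the remote shell's second parse.
    quote() {
        printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
    }
    ssh host "ls $(quote "$dir")"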

The optimal solution would be to have a separate side channel for passing things to the quoted program, like -v in awk or --arg in jq (or whatever it is in your favourite SQL DBMS binding) but I don’t think SSH will let you do that.


This is more or less baked into the SSH protocol because the exec request only has a single string for the command line.

Also the reason echoing a MOTD from rc files or similar crap breaks tools like rsync or scp which use SSH as a neutral transport. SFTP isn’t affected because, while using the three-pipe as a transport, it’s a separate subsystem with its own SSH request type to initiate the channel (just like X11 or port forwarding).

If you control client and server you can define your own subsystems and invoke them directly, which avoids this whole mess.
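
A minimal sketch, with made-up names:

    # Server side, in sshd_config:
    Subsystem mytool /usr/local/bin/mytool-server

    # Client side: the subsystem is requested by name in the protocol,
    # so no command string is sent for the remote shell to parse.
    ssh -s host mytool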


I think the confusion comes from the documentation where ssh(1) says that the command "is executed on the remote host instead of a login shell.". Which is true from the perspective of ssh(1) in the sense of the protocol. The client has no control over what the server does with that string.

However, sshd(8) clearly documents that it will always execute the login shell, even when a command has been passed. ("All commands are run under the user's login shell as specified in the system password database.")

Subsystems open secondary channels to communicate separately from stdin/stdout of the remote command. I have used X11 forwarding before to run a remote command with sudo without the password prompt interfering with the protocol: https://raimue.blog/2016/09/25/how-to-run-rsync-on-remote-ho...


An extra subtlety is that it is the user’s login shell (in the getpwnam sense) but the command is not run in a login shell (in the sh - / sh -l sense). That’s why proper motds or profile files work ok (requesting a shell runs the user’s login shell as a login shell) and don’t break applications. Also the reason why $PATH often differs between the two modes.
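
Assuming the remote login shell is bash, the difference is easy to observe; with a command this prints "non-login", while an interactive session runs as a login shell:

    ssh host 'shopt -q login_shell && echo login || echo non-login'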


Yeah, the exec trace from the article is wrong; it is missing one sh -c in the chain:

  $ ssh localhost figlet foobar bar\ baz
  execve("/usr/bin/ssh", ["ssh", "localhost", "figlet", "foobar", "bar baz"], …
  execve("/usr/bin/figlet", ["figlet", "foobar", "bar", "baz"], …
in practice it looks more like this (traced with execsnoop):

    PCOMM            PID     PPID    RET ARGS
    ssh              4255    2058      0 "/usr/bin/ssh" "localhost" "figlet" "foobar" "bar baz"
    sshd             4256    2147      0 "/usr/bin/sshd" "-D" "-R"
    bash             4259    4258      0 "/bin/bash" "-c" "figlet foobar bar baz"
    figlet           4259    4258      0 "/usr/bin/figlet" "foobar" "bar" "baz"


> This is required to restrict users to certain commands with special shells like scponly or rbash.

I don't think this is some specific design goal of OpenSSH, I think it's just a side effect of how shell escaping works.

> When you keep in mind that the given command string will be parsed twice, first by your local shell and then again by the remote shell, it becomes clear why a running a remote ssh command behaves like this.

I get that this behavior may be surprising to new users, but anybody working with ssh regularly will encounter these kinds of escaping issues. SSH isn't even the only place you'll encounter this. Things like docker etc will have the same "problem".

In the case of ssh you can simply write your commands to a file and send them via stdin, or copy a script to the target.

The tone of this blog post rubs me wrong. Yes this is a footgun (in the same way many POSIX shell-related things are), but it's not like it's some "problem" with the design of SSH.


Only tangentially related... Is there something like rbash that is actually secure and more restrictive? Like a shell that only "sees" certain files and folders and can only execute certain commands in a non-privileged manner.


The shell rarely "sees" files and folders, except for expanding a glob like "*".

When the shell executes "cmd folder/file", the "folder/file" is just a string as far as the shell is concerned. It is the command that uses that string with a function like unlink or open.


Okay... you're right. So should I say "the process" and every process forked or exec'd instead of "the shell"? But is it clear what I am looking for?


This is most probably because they are actually TIFF, but with the .png file extension they are served as image/png.
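
Easy to check with file(1), which inspects the magic bytes rather than the extension (image.png standing in for one of the affected files):

    file image.png
    # e.g.: image.png: TIFF image data, little-endian, ...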


There is git-fixup, which provides the 'git fixup' command that suggests to which commit the currently staged changes should be added.

https://github.com/keis/git-fixup

git-fixup will add fixup! commits, so it still needs the mentioned 'git rebase -i --autosquash' afterwards. Usually you do not even need to give it a specific commit if your branch is set to track an upstream branch.
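
For comparison, the same flow with plain git's built-in flags, picking the target commit by hand:

    git add -p                              # stage the fix
    git commit --fixup=<commit>             # record a "fixup! ..." commit
    git rebase -i --autosquash <commit>~1   # squash it into place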


Still not quite the same, because absorb splits up your working directory changes into all relevant commits, as deduced by diff context.


Staged or unstaged? Sounds interesting, will give it a try. Thanks for the details.


Dipped into it.

Staged changes, excluding renames.

So git-fixup and git-absorb will now happily live together for the moment on my box.

Made a good first impression, thanks again for the reference.


It would be easier to intercept the open() of the /proc/cpuinfo file with an LD_PRELOAD library. Of course, that would still be detectable by the benchmark. You could also use a modified libc to spoof it. Then their only way out would be static linking.
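
A sketch of that interposer, assuming glibc (programs built with large-file support may call open64() instead, which would need the same treatment, and statically linked binaries bypass it entirely):

    cat > spoof.c <<'EOF'
    /* LD_PRELOAD interposer: redirect open("/proc/cpuinfo") to a spoofed file. */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <fcntl.h>
    #include <stdarg.h>
    #include <string.h>

    int open(const char *path, int flags, ...)
    {
        static int (*real_open)(const char *, int, ...);
        mode_t mode = 0;
        if (!real_open)
            real_open = (int (*)(const char *, int, ...))dlsym(RTLD_NEXT, "open");
        if (flags & O_CREAT) {            /* mode argument only exists here */
            va_list ap;
            va_start(ap, flags);
            mode = va_arg(ap, mode_t);
            va_end(ap);
        }
        if (strcmp(path, "/proc/cpuinfo") == 0)
            path = "/tmp/fake_cpuinfo";   /* file with spoofed contents */
        return real_open(path, flags, mode);
    }
    EOF
    cc -shared -fPIC -o spoof.so spoof.c -ldl
    LD_PRELOAD=$PWD/spoof.so cat /proc/cpuinfo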

You could even just modify the kernel to report arbitrary strings in /proc/cpuinfo.

So they use the CPUID instruction instead. On x86, the CPUID instruction can be trapped with arch_prctl(ARCH_SET_CPUID, 0) so that it raises SIGSEGV instead of returning real values. An injected SIGSEGV handler could then spoof it.

But even then, you could also trap the CPUID instruction in the kernel, and spoof it from there, which would be even harder to detect from user space.

In the end, the benchmark program always needs to trust the kernel. Is it really worth trying to detect spoofing?


Hamburg, as announced in an earlier blog post.

https://www.ccc.de/en/updates/2022/37-chaos-communication-co...


