I am not in any way against file naming conventions, I am against enforcement of...

nessus42 · on March 12, 2012

If somebody creates files with carriage return in source tree because kernel doesn't stop him, problem is social, not technological.

Your position boggles and dismays me. I have seen so many heinous bugs that appear only intermittently, and are nigh impossible to track down, due to this kind of issue. The problem absolutely positively is not social. It is technical. The only social thing about it is that people persist in taking the wrong side on this issue.

As to having different rules for filenames in different places, that is just nuts. Programs should not be fragile, and shouldn't have subtle edge cases. Having software work that way has all sorts of downsides and hidden costs. E.g., people need to remember a lot more. More documentation is needed. Things go wrong when they didn't have to. All of this costs time and money and helps to sap enthusiasm as people track down chimeras they shouldn't have had to.

Furthermore, one of the prime use cases for scripting is system administration, and such scripts need to handle all files. The stories of sysadmin scripts that have run afoul of files with strange filenames are legendary.

Regarding your example with "process $FOO": that's completely a red herring. You might as well assert, "We can't solve everything, so we should solve nothing." In this particular case, we were talking about the problems caused by filenames that are hard to deal with in a scripting environment, not about programs directed by user input. The first problem is easily solvable once and for all, while the second problem is less so and will always require care. Just because some things require great care does not mean that we should make all things require great care.

I just can't fathom that there are still people who actually argue for a world that fosters subtle bugs and lack of robustness. It is downright wrong, and it may someday be our undoing. Quite literally.

jarman · on March 13, 2012

Spaces (parentheses, semicolons, bangs, etc.) in file names are not subtle edge cases if you consider the system as a tool for reaching the user's goals. Programs have to process file names with spaces not because of kernel aesthetics, but because users want and need files with normal, readable names.

Actually, I would happily agree to ban \n as an edge case: it's useless for the end user, and scripts need a readable filename separator (compare \0, which is forbidden because it is useless for users and extremely inconvenient to work with in C).
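To make the separator point concrete, here is a minimal sketch that counts the files in a directory twice, once from newline-separated output and once from NUL-separated output. The scratch directory and file names are made up for illustration, and it assumes a `find` that supports `-print0` (GNU and BSD both do).

```sh
# Scratch directory with two files; one name contains an embedded newline.
nl='
'
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/two${nl}lines.txt"

# Newline-separated: the embedded newline makes one file look like two records.
by_line=$(( $(find "$dir" -type f | wc -l) ))

# NUL-separated: keep only the NUL terminators and count them, one per file.
by_nul=$(( $(find "$dir" -type f -print0 | tr -dc '\0' | wc -c) ))

echo "line-based count: $by_line, NUL-based count: $by_nul"
rm -rf -- "$dir"
```

With two files present, the line-based count comes out as 3 and the NUL-based count as 2. NUL works as a separator precisely because it can never appear inside a name.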

nessus42 · on March 13, 2012

What if users want rich text in their filenames? Why shouldn't they have the ability to do that? And certainly they want slashes in their filenames! But Unix doesn't give them that either. Horrors!

What people want most of all is reliable, robust software. Features that don't work right are worse than no feature at all. What you fail to consider is that every feature has a cost. In this case, the cost was WAY too high. If this cost is to be paid, then it should have been paid in a lower-cost manner.

Contrary to what you say, I'm perfectly sure that users would have dealt fine with having more limited filenames. In fact, they did quite fine with 8.3 filenames for many years. I must concur, however, that those were more limiting than humans should be forced to adapt to.

This being said, I have nothing against giving people the ability to have all of these things in the display name for a file, if it is deemed that the extra flexibility is worth the trouble. This extra flexibility just shouldn't be in the unique identifier for a file. There are perfectly good ways to provide this capability in a manner that has far fewer costs.

Alternatively, I'm not opposed to adopting the attitude of the kernel hackers and shifting the burden onto the shells to generate such meta-character-free identifiers from richer display names, but if that was the way it was to be, it would then have been essential that a standard library for generating such unique identifiers from display names have been created, and that the shells uniformly use this library.
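Purely as an illustration of the idea, a mapping from display name to a restricted identifier might look like the sketch below. The function name `to_identifier` and the allowed character set are assumptions, not an existing library. The mapping is also lossy: two display names can collapse to the same identifier, so a real library would have to resolve collisions to keep identifiers unique.

```sh
# Hypothetical helper: replace every byte outside a conservative set
# (ASCII letters, digits, dot, underscore, dash) with an underscore.
to_identifier() {
    printf '%s' "$1" | LC_ALL=C tr -c 'A-Za-z0-9._-' '_'
}

# Example display name containing spaces, parentheses, a semicolon and a bang.
to_identifier 'Q1 report (final); v2!.txt'
echo
```

The example prints `Q1_report__final___v2_.txt`, which contains nothing a shell would treat specially.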

jarman · on March 14, 2012

There are perfectly good ways to provide this capability in a manner that has far fewer costs.

Not really. It's either some specialized tools (throwing away all the benefits of a uniform environment) or another layer of indirection (display name->real name->inode), with its share of bugs (and having two close, often equal, but different identifiers won't make programming any less error-prone).

shells to generate such meta-character-free identifiers from richer display names

You still need to pass rich display names to the shell, so the old problems are still there, and, on top of that, a consistent mapping of display names to real ones is required.

In this case, the cost was WAY too high

-- "$FOO" instead of $FOO, and ls -1 instead of ls? (Not accounting for \n here, because in that case I agree on its abysmal benefit/cost ratio and on banning it)
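For readers who have not run into it, the quoting difference referred to here looks roughly like this. The helper `count_args` and the file name are hypothetical, chosen only to show word splitting, and the default `IFS` is assumed.

```sh
# Hypothetical helper: reports how many arguments it received.
count_args() { echo "$#"; }

FOO='my report.txt'

unquoted=$(count_args $FOO)      # split on the space: two arguments
quoted=$(count_args "$FOO")      # passed through intact: one argument

echo "unquoted: $unquoted, quoted: $quoted"
```

The unquoted expansion arrives as two words and the quoted one as a single argument.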

nessus42 · on March 20, 2012

-- "$FOO" instead of $FOO, and ls -1 instead of ls?

There's significantly more to it than that, so either you prove my point, or you are being disingenuous.
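The comment does not spell out what else is involved. One commonly cited case, offered here as an illustration rather than as the author's own example, is a name that starts with a dash: quoting keeps it in one piece, but the command still parses it as an option. The scratch directory and the name `-n` are made up for the sketch.

```sh
dir=$(mktemp -d)
cd "$dir" || exit 1

name='-n'                        # a legal filename that looks like an option
printf 'hello\n' > "./$name"

# Quoted, yet cat takes -n as an option and reads (empty) standard input.
as_option=$(cat "$name" < /dev/null)

# "--" ends option parsing, so the name is treated as a file operand.
as_file=$(cat -- "$name")

echo "without --: '$as_option', with --: '$as_file'"
cd / && rm -rf -- "$dir"
```

Only the second call reads the file, so quoting alone is not enough for names like this.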