
I am not in any way against file naming conventions; I am against enforcing a convention that is specific to one use case (programming) in a general-purpose system.

When a shell script is written for some specific task, you can rely on convention and receive all the productivity benefits even without kernel enforcement. If somebody creates files with a carriage return in a source tree because the kernel doesn't stop him, the problem is social, not technological.

> I find it amusing when people assert that having limitations in filenames is crippling system functionality, and yet they don't make the same assertion about identifier names in programming languages.

A programming language has a narrower field of use than an operating system. Naming a variable "Мой любимый щеночек (01.02 12:34).jpg" (my favorite puppy) is absurd. Having a file with such a name is perfectly reasonable.

My primary point is actually this: file names in general are not program internals. They are part of user data and should be treated as such.

> As to your proposed solution of having the shell escape special characters, rather than having the kernel disallow them

Not really what I meant. I was saying that a program (be it a shell script or an application calling 'system()') that intends to work on arbitrary, user-provided file names won't benefit from kernel-enforced limitations. "process $FOO" won't become protected from misuse and exploits if special characters are forbidden; the application will still have to check for "bar; rm -rf .", and checking and rejecting that is no harder than replacing it with "./bar\;\ rm\ -rf\ .". It's just calling escape_file_name instead of validate_file_name.
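
A minimal sketch of that last point, in Python rather than shell. The names escape_file_name and validate_file_name come from the comment above, but their bodies here are assumptions made for illustration. Also note that shlex.quote wraps the name in single quotes, which stands in for the backslash escaping shown in the comment.

```python
import shlex
import string

# Assumed whitelist for the "restricted names" approach; purely illustrative.
SAFE_CHARS = set(string.ascii_letters + string.digits + "._-")


def validate_file_name(name):
    """Accept only names made of whitelisted characters that do not start with '-'."""
    return bool(name) and not name.startswith("-") and all(c in SAFE_CHARS for c in name)


def escape_file_name(name):
    """Return the name quoted so that a POSIX shell treats it as a single word."""
    # Prefix relative names with "./" so a leading "-" is not read as an option.
    if not name.startswith(("/", "./")):
        name = "./" + name
    return shlex.quote(name)


hostile = "bar; rm -rf ."
print(validate_file_name(hostile))  # False
print(escape_file_name(hostile))    # './bar; rm -rf .'
```

Either way the caller has to make exactly one call before the name reaches the shell; the difference is whether the hostile name is refused or passed through intact.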

So:

1) Any productivity benefits provided by kernel file name limitations can be acquired by convention (which is what the UNIX world is doing).

2) Such limits won't make anything safer. Building a shell command by blind concatenation of user-provided data will still be unsafe. If the user is trusted, see case 1.

3) Files are used not only by programmers. Imposing such limits will either degrade the user experience or lead to display name !== actual file name, leading to indirections and kludges much worse than touch -- "$FOO".
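
To make point 2 concrete, here is a rough Python sketch of what happens to a name when it is pasted into a command line unquoted versus quoted. The helper shell_words is a hypothetical name, and shlex.split only approximates a POSIX shell: it models whitespace splitting and quoting, not globbing or operators.

```python
import shlex


def shell_words(command_line):
    """Approximate how a POSIX shell splits a command line into words.

    shlex.split handles whitespace and quoting only. A real shell parsing a
    concatenated string would also treat ';' as a command separator, so this
    understates the damage.
    """
    return shlex.split(command_line)


foo = "bar; rm -rf ."

unquoted = "touch -- " + foo             # comparable to: touch -- $FOO
quoted = "touch -- " + shlex.quote(foo)  # comparable to: touch -- "$FOO"

print(shell_words(unquoted))  # ['touch', '--', 'bar;', 'rm', '-rf', '.']
print(shell_words(quoted))    # ['touch', '--', 'bar; rm -rf .']
```

The unquoted form falls apart on the space alone, so forbidding ';' or other metacharacters in the kernel would not rescue it as long as spaces remain legal.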




> If somebody creates files with a carriage return in a source tree because the kernel doesn't stop him, the problem is social, not technological.

Your position boggles and dismays me. I have seen so many heinous bugs that appear only intermittently, and are nigh impossible to track down, due to this kind of issue. The problem absolutely positively is not social. It is technical. The only social thing about it is that people persist in taking the wrong side on this issue.

As to having different rules for filenames in different places, that is just nuts. Programs should not be fragile, and shouldn't have subtle edge cases. Having software work that way has all sorts of downsides and hidden costs. E.g., people need to remember a lot more. More documentation is needed. Things go wrong when they didn't have to. All of this costs time and money and helps to sap enthusiasm as people track down chimeras they shouldn't have had to.

Furthermore, one of the prime use cases for scripting is by system administrators, and such scripts need to handle all files. The stories of sysadmin scripts that have run afoul of files with strange filenames are legendary.

Regarding your example with "process $FOO": that's completely a red herring. You might as well assert, "We can't solve everything, so we should solve nothing." In this particular case, we were talking about the problems caused by filenames that are hard to deal with in a scripting environment, not about programs directed by user input. The first problem is easily solvable once and for all, while the second problem is less so and will always require care. Just because some things require great care does not mean that we should make all things require great care.

I just can't fathom that there are still people who actually argue for a world that fosters subtle bugs and lack of robustness. It is downright wrong, and it may someday be our undoing. Quite literally.


Spaces (and parentheses, semicolons, bangs, etc.) in file names are not subtle edge cases if you consider the system a tool for reaching the user's goals. Programs have to process file names with spaces not because of kernel aesthetics, but because users want and need files with normal, readable names.

Actually, I would happily agree to ban \n as an edge case: it's useless for the end user, and scripts need a readable file name separator (much like \0, which is forbidden because it is useless for users and extremely inconvenient to work with in C).
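
A small Python sketch of why the separator matters, assuming a tool that emits a list of file names as one byte string. The helper split_listing and the sample names are made up for illustration. NUL-separated listings (the kind find -print0 produces) round-trip cleanly because \0 can never appear in a name, while newline-separated ones do not as long as \n is legal.

```python
def split_listing(raw, sep):
    """Split a byte string of file names on a separator, dropping empty entries."""
    return [name for name in raw.split(sep) if name]


# Three files; the last one has a newline embedded in its name.
names = [b"plain.txt", b"with space.txt", b"two\nlines.txt"]

newline_listing = b"".join(n + b"\n" for n in names)
nul_listing = b"".join(n + b"\0" for n in names)

print(len(split_listing(newline_listing, b"\n")))  # 4
print(split_listing(nul_listing, b"\0") == names)  # True
```

Spaces survive both formats; only the embedded newline turns three files into four.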


What if users want rich text in their filenames? Why shouldn't they have the ability to do that? And certainly they want slashes in their filenames! But Unix doesn't give them that either. Horrors!

What people want most of all is reliable, robust software. Features that don't work right are worse than no feature at all. What you fail to consider is that every feature has a cost. In this case, the cost was WAY too high. If this cost is to be paid, then it should have been paid in a lower-cost manner.

Contrary to what you say, I'm perfectly sure that users would have dealt fine with having more limited filenames. In fact they did quite fine with 8.3 filenames for many years. I must concur, however, that those were more limiting than humans should be forced to adapt to.

This being said, I have nothing against giving people the ability to have all of these things in the display name for a file, if it is deemed that the extra flexibility is worth the trouble. This extra flexibility just shouldn't be in the unique identifier for a file. There are perfectly good ways to provide this capability in a manner that has far fewer costs.

Alternatively, I'm not opposed to adopting the attitude of the kernel hackers and shifting the burden onto the shells to generate such meta-character-free identifiers from richer display names, but if that was the way it was to be, it would then have been essential that a standard library for generating such unique identifiers from display names have been created, and that the shells uniformly use this library.


> There are perfectly good ways to provide this capability in a manner that has far fewer costs.

Not really. It's either some specialized tools (throwing away all the benefits of environment uniformity) or another layer of indirection (display name -> real name -> inode), with its share of bugs (and having two close, often equal, but different identifiers won't make programming any less error-prone).

> shells to generate such meta-character-free identifiers from richer display names

You still need to pass rich display names to the shell, so the old problems are still there, and, on top of that, a consistent mapping of display names to real ones is required.

> In this case, the cost was WAY too high

-- "$FOO" instead of $FOO, and ls -1 instead of ls? (Not accounting for \n here, because in that case I agree about its abysmal benefit/cost ratio and about banning it.)


> -- "$FOO" instead of $FOO, and ls -1 instead of ls?

There's significantly more to it than that, so either you prove my point, or you are being disingenuous.



