
Is there something you can do with named pipes that you couldn't also do with a regular text file?



A regular file has two obvious disadvantages: first, you write all the data to disk and read it back again, which costs you time and disk space. In a pipe, the data is streamed, with the OS doing buffering for you. Second, if you have one process writing to a plain file and another process reading from it, you need some way of signaling when there's data ready to be read and when the reader should wait, or when the stream has ended. A pipe provides this for you: in blocking IO mode, the reader just issues a read() and blocks until the writer writes something.
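Here's a minimal sketch of that blocking behavior, using an anonymous pipe() between a parent and a child (illustrative only; most error handling trimmed):

  #include <stdio.h>
  #include <unistd.h>

  int main(void) {
      int fds[2];
      if (pipe(fds) == -1) { perror("pipe"); return 1; }

      if (fork() == 0) {               /* child: the writer */
          close(fds[0]);
          sleep(1);                    /* the reader blocks during this */
          write(fds[1], "hello\n", 6);
          close(fds[1]);               /* reader's read() now returns 0 (EOF) */
          _exit(0);
      }

      close(fds[1]);                   /* parent: the reader */
      char buf[64];
      ssize_t n;
      while ((n = read(fds[0], buf, sizeof buf)) > 0)   /* blocks until data arrives */
          fwrite(buf, 1, (size_t)n, stdout);
      return 0;
  }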


What if the file is in /tmp or shared memory? Then it won't cost much time, and there's no disk space or disk access involved.

The blocking/signalling is still missing though.


I don't understand why you'd do it, but sure, if you're determined to avoid pipes even in situations where they'd be really well suited to the problem, a scheme using files on a RAM-based filesystem and roll-your-own signaling/blocking/buffering could conceivably be made approximately equivalent, performance-wise, to a pipe.

Out of curiosity, is there some good reason you'd do this instead of just using a mechanism that's specifically made to solve this problem?


Well, for one thing, on some systems (e.g., OS X), /tmp is disk-backed.

But also, when writing out to /tmp on memory-backed systems, the lack of blocking means memory use can grow indefinitely if the reader is slow or delayed for some reason. That will ultimately turn into swapping, which is just disk access again.

There's also no great way to truncate the beginning of a regular file.


The blocking/signalling is missing?

What's wrong with

  $ ./producer > my_rambacked_file &
  $ tail -f my_rambacked_file | ./consumer

Well, okay, I guess this way you don't get a signal that the producer is finished.
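That's exactly where a named pipe helps: the reader does get an end-of-stream signal, because read() returns 0 once every writer has closed its end. A rough sketch (the path /tmp/myfifo is made up for the example):

  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/stat.h>
  #include <unistd.h>

  int main(void) {
      mkfifo("/tmp/myfifo", 0600);             /* harmless if it already exists */
      int fd = open("/tmp/myfifo", O_RDONLY);  /* blocks until a writer opens it */
      if (fd == -1) { perror("open"); return 1; }

      char buf[4096];
      ssize_t n;
      while ((n = read(fd, buf, sizeof buf)) > 0)   /* blocks while the pipe is empty */
          fwrite(buf, 1, (size_t)n, stdout);
      /* n == 0: the producer closed its end -- that's the "finished" signal */
      close(fd);
      return 0;
  }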


It's hard to have multiple processes logging to the same file, but for pipes POSIX defines writes shorter than PIPE_BUF to be atomic. On Linux, PIPE_BUF is 4096 bytes. So if you log less than PIPE_BUF bytes per message from multiple processes, you can ensure they won't scramble each other's messages.

http://manpages.courier-mta.org/htmlman7/pipe.7.html

> POSIX.1-2001 says that write(2)s of less than PIPE_BUF bytes must be atomic: the output data is written to the pipe as a contiguous sequence. Writes of more than PIPE_BUF bytes may be nonatomic: the kernel may interleave the data with data written by other processes. POSIX.1-2001 requires PIPE_BUF to be at least 512 bytes. (On Linux, PIPE_BUF is 4096 bytes.) The precise semantics depend on whether the file descriptor is nonblocking (O_NONBLOCK), whether there are multiple writers to the pipe, and on n, the number of bytes to be written.
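A sketch of that logging pattern: several writers share one pipe, and each record is emitted with a single write() of fewer than PIPE_BUF bytes, so the kernel delivers it contiguously (the message format here is just for illustration):

  #include <stdio.h>
  #include <unistd.h>

  int main(void) {
      int fds[2];
      if (pipe(fds) == -1) return 1;

      for (int i = 0; i < 3; i++) {
          if (fork() == 0) {                    /* one of several writers */
              close(fds[0]);
              char msg[128];
              int len = snprintf(msg, sizeof msg,
                                 "writer %d (pid %d): message\n", i, (int)getpid());
              write(fds[1], msg, (size_t)len);  /* single write < PIPE_BUF: atomic */
              _exit(0);
          }
      }

      close(fds[1]);                            /* parent: the log collector */
      char buf[256];
      ssize_t n;
      while ((n = read(fds[0], buf, sizeof buf)) > 0)
          fwrite(buf, 1, (size_t)n, stdout);
      return 0;
  }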


A pipe is really just a memory buffer. When it gets full, the writing application will block and wait for the reading one to drain it. So in some respects they are primarily IPC and synchronisation mechanisms.
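You can observe that buffer directly: mark the write end non-blocking, and write() starts failing with EAGAIN once the buffer is full. A quick sketch (on typical Linux this prints 65536, though the capacity is tunable):

  #include <errno.h>
  #include <fcntl.h>
  #include <stdio.h>
  #include <unistd.h>

  int main(void) {
      int fds[2];
      if (pipe(fds) == -1) return 1;
      fcntl(fds[1], F_SETFL, O_NONBLOCK);   /* don't block when the buffer fills */

      char byte = 'x';
      long total = 0;
      while (write(fds[1], &byte, 1) == 1)  /* nobody reads, so this fills the pipe */
          total++;
      if (errno == EAGAIN)
          printf("pipe buffer filled after %ld bytes\n", total);
      return 0;
  }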


The main benefit is that it avoids needing to write to the disk (which can also be avoided by making the file on a ramdrive). In general, it behaves the same way that a normal file behaves, in the sense that you read what was written, in the order it was written.

The only difference in behavior I can think of is timing. Generally, when you read a text file, you see what has been written at that moment in time (barring race conditions) and then hit EOF. With named pipes, you read what has been written so far, then wait until the writing program closes the pipe; anything written in the meantime still reaches you. This lets you use them for message passing in a way that normal files do not.


Besides the speed aspect, which most other people have mentioned, you can avoid running out of hard disk space.



Avoid hitting the disk.



