> When we look at the fsync() and fdatasync() man pages, we see that those system calls only guarantee to write data linked to the given file descriptor. With ext4, as a side effect of the filesystem structure, all pending data and metadata for all file descriptors will be flushed instead. This creates a lot of I/O traffic that is unneeded to satisfy any given fsync() or fdatasync() call
Does it mean that, under ext4, a call to fsync is essentially the same as a call to sync(2)?
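Not quite — fsync() still takes a file descriptor and only *promises* durability for that one descriptor; the ext4 behavior described is a side effect, not part of the contract. A hedged Python sketch of what a caller actually asks for (os.sync() left commented out, since on ext4 the difference in practice may be small):

```python
import os
import tempfile

# fsync() requests durability for one descriptor, even though ext4
# (as described above) may flush far more as a side effect.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"important record\n")
    os.fsync(fd)       # flush data + metadata for *this* fd, per POSIX
    # os.sync()        # by contrast, explicitly flushes all dirty buffers system-wide
    os.lseek(fd, 0, os.SEEK_SET)
    data = os.read(fd, 64)
finally:
    os.close(fd)
    os.unlink(path)
```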
Yes, this was a source of many problems when Firefox moved its history and bookmarking engine to SQLite (I think), which used fsync for consistency guarantees. That led to situations where creating a bookmark could cause your entire computer to pause while disk writes spun up.
Last year I hit a problem where a function of mine was logging, and a sync() call in another process blocked the logging for more than 10 seconds, causing a timeout that aborted the operation my code was performing. I have since moved all logging into a queue that is drained by its own thread. This was a gotcha I never anticipated.
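For anyone wanting to do the same, the stdlib already has the pieces: a QueueHandler enqueues records and a QueueListener writes them out on its own thread, so a stalled fsync()/sync() elsewhere delays only the listener, not the calling code. A minimal sketch (the "app.log" path is just an example):

```python
import logging
import logging.handlers
import queue

# Decouple callers from disk: records go onto a queue, and a background
# thread owned by QueueListener does the actual (possibly slow) writes.
log_queue = queue.Queue()  # unbounded here; see note below on bounding it

file_handler = logging.FileHandler("app.log")  # example log path
listener = logging.handlers.QueueListener(log_queue, file_handler)
listener.start()

logger = logging.getLogger("app")
logger.addHandler(logging.handlers.QueueHandler(log_queue))
logger.setLevel(logging.INFO)
logger.info("this call enqueues and returns immediately")

listener.stop()  # blocks until remaining records are flushed
```

One caveat: QueueHandler uses put_nowait(), so with a bounded queue an overflow raises queue.Full (routed to handleError) rather than blocking — if you want the producer to block instead, you'd need to subclass it.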
Unless you have infinite memory, at some point you want a task to slow down, block, or whatever in the face of resource exhaustion.
It's not a bad idea to maintain a local buffer that gives you a certain amount of cushion; I recently helped a team resolve the exact problem you had, with a similar solution. But excessive, unnecessary use of non-pageable memory is one of the things that induces early I/O contention, causing these stalls in the first place. (Consider an overloaded or errant process generating and buffering a lot of logging noise precisely because the overtaxed system is under heavy I/O contention.)
To reiterate: you want backpressure, which means you want a process that is exhausting limited resources to slow down or block, and you want that to transitively slow down or block upstream requests. Too many developers don't understand this and insert hacks to solve their immediate problem (e.g. closing a ticket complaining about intermittent SLA latency misses) without appreciating the broader issue, which at the end of the day just contributes to these problems.
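A toy sketch of that transitive slow-down, with a bounded stdlib queue standing in for the limited resource:

```python
import queue
import threading
import time

# Backpressure in miniature: a bounded queue makes a fast producer
# block as soon as the slow consumer falls behind, instead of letting
# the backlog (and memory use) grow without limit.
work = queue.Queue(maxsize=4)
processed = []

def consumer():
    while True:
        item = work.get()
        if item is None:
            break
        time.sleep(0.005)      # stand-in for a slow downstream resource
        processed.append(item)

t = threading.Thread(target=consumer)
t.start()

for i in range(20):
    work.put(i)                # blocks when the queue is full: backpressure
work.put(None)                 # sentinel: tell the consumer to exit
t.join()
```

The key design point is that the producer's put() itself blocks — no retry loops or drop policies needed — so the slowdown propagates upstream automatically.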
One of the alternatives people attempt is to insert a gazillion knobs to permit dedicated resource allocation. But now you have two problems, the second being figuring out what the magic values should be — a never-ending and often intractable task. This rarely ends well except for highly specialized workloads, e.g. a dedicated DB administrator who spends all day attending to and tuning a database instance.
That said, in the old days you mounted /var (and if you were super fancy, /var/log) on different disks to minimize unrelated I/O contention.
Yes. I consider it one of my worst bugs in the Linux kernel, and on big servers you pretty much can't have any programs that call fsync() because it will cripple your performance.
That paragraph caused a whiplash of emotions while reading, "Cool!!!... what???? Ug."
Yes, though when you're running a DBMS on your server, I'd guess it will be doing fsync() all the time. I wonder whether other filesystems like XFS or ZFS behave differently here, and how much performance a typical database workload could gain on them compared to ext4.