FYI, this requires profile data to do the splitting. Yet another reason to do PG...

repsilat · on Sept 11, 2020

For many use-cases, FDO ("feedback-driven") can be more convenient and sometimes more effective than PGO ("profile-driven").

The difference is sampling prod vs sampling separately in test. The arguments for FDO:

- Prod behaviour/data is always "representative", whereas synthetic or recorded data can go out of date quickly.

- PGO test fixtures can contain sensitive user data. Instrumenting production processes doesn't put data in more places.

The benefits of both are huge though. The rule of thumb I've seen is a 20% improvement for FDO over -O3.

throwaway17_17 · on Sept 11, 2020

I’m interested in reading more about this concept. Wikipedia, which is usually my first stop for a broad overview and maybe some linked research, seems to suggest that profile-directed/guided and feedback-guided optimization are the same thing. Is there anywhere I can read about the varying approaches?

jeffbee · on Sept 11, 2020

They are the same thing. The GP makes a distinction which does not exist. Witness the fact that two different compilers refer to the exact same feedback technique as either AutoFDO or SamplePGO.

repsilat · on Sept 11, 2020

Ah, I definitely worked in a place where that distinction in terminology was used, but maybe it isn't widespread. In any case, whatever you call either one of them, sampling from production processes can have benefits over sampling from synthetic workloads.

justinclift · on Sept 11, 2020

Surely that's a given?

It's literally sampling from a representative workload (production) vs a non-representative one (anything synthetic).

repsilat · on Sept 11, 2020

Maybe so. I think it was a novel idea to me because my intuition around profiling was formed with an assumption of delivering binaries to users, and not running server processes. (It's probably also a pre-internet bias of thinking it would be hard to get prod data, whereas profiling data generated in a horrible, enormous compile&run&profile&recompile process at least doesn't need to "go" anywhere.)

account42 · on Sept 11, 2020

There is a distinction, even if you don't like the terminology OP used to differentiate the two things:

- With traditional PGO (i.e. -fprofile-generate, then -fprofile-use) you first build a separate instrumented binary that records profile information. You generally wouldn't want to run this binary in production because of the overhead the profile generation incurs.

- With sample-driven PGO (i.e. -fprofile-sample-use) you use external tools to sample profile information from an uninstrumented binary - the same binary you'd use in production.
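
To make the two workflows concrete, here is a rough dry-run sketch that only assembles and prints the command lines; it does not invoke a compiler. The flags follow the examples in the Clang user manual as far as I know them. The file names (app.cc, app, pgo-data) are hypothetical placeholders, and the sampling half assumes Linux perf on hardware with branch-record (LBR) support plus the create_llvm_prof tool from the AutoFDO project. GCC differs in the details (as far as I know it reads its instrumented profiles directly and uses -fauto-profile for sampled ones), so treat this as an illustration rather than a recipe.

```python
"""Dry-run sketch: print the build steps for the two PGO workflows.

Nothing here runs a compiler; it only assembles command lines.
File names (app.cc, app, pgo-data) are hypothetical placeholders.
"""
import shlex


def instrumented_pgo_steps(src="app.cc", exe="app", profile_dir="pgo-data"):
    """Traditional PGO: separate instrumented build, training run, rebuild."""
    merged = exe + ".profdata"
    return [
        # 1. Build a separate instrumented binary (slower; not meant for prod).
        ["clang++", "-O2", f"-fprofile-generate={profile_dir}", src,
         "-o", exe + "-instr"],
        # 2. Run it on a training workload; raw profiles land in profile_dir.
        ["./" + exe + "-instr"],
        # 3. Merge the raw profiles into a single indexed profile.
        ["llvm-profdata", "merge", f"-output={merged}", profile_dir],
        # 4. Rebuild the shipping binary using the merged profile.
        ["clang++", "-O2", f"-fprofile-use={merged}", src, "-o", exe],
    ]


def sampled_pgo_steps(src="app.cc", exe="app"):
    """Sample-driven PGO: profile the same uninstrumented binary you ship."""
    prof = exe + ".prof"
    return [
        # 1. Normal optimized build, with line tables so samples map to source.
        ["clang++", "-O2", "-gline-tables-only", src, "-o", exe],
        # 2. Sample the running binary with perf (-b records branch stacks).
        ["perf", "record", "-b", "./" + exe],
        # 3. Convert perf.data into a compiler-readable profile (AutoFDO tool).
        ["create_llvm_prof", f"--binary=./{exe}", f"--out={prof}"],
        # 4. Rebuild the same target with the sampled profile.
        ["clang++", "-O2", "-gline-tables-only",
         f"-fprofile-sample-use={prof}", src, "-o", exe],
    ]


def render(steps):
    """Format each command as one shell-quoted line."""
    return "\n".join(shlex.join(cmd) for cmd in steps)


if __name__ == "__main__":
    print("# instrumented PGO")
    print(render(instrumented_pgo_steps()))
    print("# sample-driven PGO")
    print(render(sampled_pgo_steps()))
```

Note that the first flow produces two different binaries (app-instr for training, app for shipping), while in the second flow the binary being sampled is the one you'd deploy, which is what makes sampling from production practical.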