> C++ has threads which ACTUALLY run in parallel on the CPU. Why bother complicating the language further?
Actual parallelism is surprisingly slow and resource-heavy in practice. When everything runs on one core, your data stays local in the L1, L2, and L3 caches.
However, add a second core and you suddenly get "ping-ponging". Let's say Core#0 has data in its L1 cache, and now Core#1 needs to read-modify-write it. The cores then need to:
1. Core#1 requests exclusive ownership of that cache line.
2. Core#0 must respond by marking its copy of the line invalid. Core#0 then ejects the data from L1 and passes it to Core#1.
3. Core#1 can finally begin to work on the data.
4. If Core#0 uses the data again, the same handoff happens in reverse.
--------
This is called "ping-ponging". An L1 cache read or write takes about 1 nanosecond, but a ping-pong can cost 30 nanoseconds or more: 30x slower for no reason.
You don't want to add another core to your problem unless you're _actually_ getting a speed benefit. It's more complex than it may seem at first glance. You could easily add a bunch of cores and find your program is suddenly much slower because of these kinds of issues.
Another problem: false sharing. Core#0 is working on "int x" and Core#1 is working on "int y", but x and y sit on the same cache line. So the cores ping-pong even though they never actually touch the same data.
Your example implies some combination of (a) excessive data sharing between threads (b) possible need for core pinning.
If thread A is going to need to read all the data touched by thread B, then it's unclear why you've split the task across threads. If there are still good reasons, then probably pin A and B to core N (never pin anything to core 0, unrelated story), and let them interleave there.
If that doesn't make sense, then yep, you'll have to face the cost of ping-ponging.
In my benchmark of a simple ledger, I generate and execute 25,021,365 (~25 million) random transactions per second (withdrawals and deposits across 80,000 accounts).
Parallelise it via sharding and I get 80,958,379 transactions per second.
If I remove the randomness, I get 134,600,233 transactions per second. If I parallelise with 12 threads I get 931,024,042 transactions per second.
My point being: if you parallelise properly and do not use shared data, you can boost performance by a multiple of the number of threads.
This is unlikely in most workloads unless multiple threads are hitting very close or adjacent regions of memory. Most multithreaded workloads operate on memory that isn't packed that tightly together. It's good to be aware of, but suggesting it's the most likely outcome of multithreading is misleading.