Ask HN: Best books on modern distributed systems
62 points by eatonphil on Sept 2, 2021 | 21 comments
I've read Designing Data-Intensive Applications, and it covered distributed systems a bit.

I don't find most textbooks to be a good intro outside of a course setting. For example, I own Andrew Tanenbaum's Distributed Systems book and a few others of his, but his writing style is too dense for me to make enough progress without giving up.

What other books (probably not textbooks) do you recommend on distributed systems?




Designing Data Intensive Applications has treated me well, getting me through pretty much every system design interview I have had.

I am going to move on to either

* O'Reilly Designing Distributed Systems or

* O'Reilly Architecture Patterns with Python

Could anyone recommend either one?

I am mainly a Python dev (with no particular love for Python), but that doesn't necessarily make me want to use the Python book more. Having code examples I can grok immediately is nice, but I also enjoy the exercise of translating concepts to code myself.


O’Reilly Building Event-Driven Microservices is a solid book to continue.

I would not recommend Architecture Patterns with Python. Some of its patterns are questionable and introduce more complexity. The book also talks about using Kafka and Service Bus, but never mentions an important issue you may face in a real implementation: keeping a message bus and a database consistent when you update both, which is usually addressed with the Outbox pattern: https://microservices.io/patterns/data/transactional-outbox....
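For readers who have not met the Outbox pattern, here is a minimal sketch of the idea, assuming SQLite as a stand-in database and a caller-supplied `publish` callable as a stand-in for the message bus. The table names, the event shape, and the functions `place_order`, `relay_outbox`, and `make_db` are all hypothetical, not taken from the book or the linked page.

```python
import json
import sqlite3


def make_db():
    """Create an in-memory database with a business table and an outbox table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT, sent INTEGER DEFAULT 0)"
    )
    return conn


def place_order(conn, order_id, total):
    """Write the business row and its event in ONE local transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute("INSERT INTO orders (id, total) VALUES (?, ?)", (order_id, total))
        payload = json.dumps({"type": "OrderPlaced", "order_id": order_id, "total": total})
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))


def relay_outbox(conn, publish):
    """Publish unsent outbox rows in order; mark each as sent only after publish succeeds."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))  # if this raises, the row stays unsent for a retry
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

The point of the design is that the service never writes to the database and the bus in two separate steps that can half-fail. A crash between `publish` and the `UPDATE` can still deliver an event twice, so consumers in this sketch would need to tolerate duplicates.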


What do you think of avoiding implementing Outbox by just having things like Kafka Connect subscribe to the table changes in the database?


It can be a problem if you care about schema compatibility for your events, and I would care about it in a multi-team microservices setup. Your database schema essentially becomes a contract, and your service is not a "black box" anymore. You have locked yourself out of changing the schema freely.

Also, the messages produced reflect database changes, not domain events. I would prefer to have an explicitly defined interface layer for consumption. At least, use database views as contracts with Kafka Connect.


One thing would be that you are then just syncing the database as opposed to sending domain events, aren't you?


Good point, but you could have the sender/pusher process read from Kafka Connect and parse into a domain event before pushing.

It sounds roundabout, but you're letting a battle-tested implementation handle the database-table-listening aspects with no special outbox implementation needed on the database write.
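A rough sketch of that translation step, assuming a hypothetical change-record shape (`table`, `op`, `before`, `after`) rather than the exact envelope any particular connector emits. The function name `to_domain_event` and the event names are made up for illustration.

```python
def to_domain_event(change):
    """Map one row-level change record to a domain event, or None if it carries no domain meaning."""
    if change["table"] != "orders":
        return None
    before, after = change.get("before"), change.get("after")
    if change["op"] == "c":  # row created
        return {"type": "OrderPlaced", "order_id": after["id"]}
    if change["op"] == "u" and before["status"] != after["status"]:  # status transition
        if after["status"] == "shipped":
            return {"type": "OrderShipped", "order_id": after["id"]}
        if after["status"] == "cancelled":
            return {"type": "OrderCancelled", "order_id": after["id"]}
    return None
```

The mapping is only as good as what the row diff exposes. If two different business actions produce the same column change, a function like this cannot tell them apart.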


In our case it's not that easy to reconstruct domain events from CDC: in many cases, multiple events affect the same tables.


How do you handle domain events that are the composition of columns of different tables?


You can use views.
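To make that concrete, here is a small sketch using SQLite, assuming two hypothetical tables (`customers`, `orders`) and a hypothetical view name `order_events_v1`. The view joins columns from both tables into one stable shape, so consumers depend on the view rather than on the underlying tables. Whether a given connector can actually read from a view depends on the connector and is not verified here.

```python
import sqlite3


def make_db():
    """Create two private tables and one view that acts as the published contract."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
        CREATE VIEW order_events_v1 AS
            SELECT o.id AS order_id, c.email AS customer_email, o.total AS total
            FROM orders o JOIN customers c ON c.id = o.customer_id;
        """
    )
    return conn


def read_contract(conn):
    """Read rows through the view only, returning one dict per row keyed by contract column."""
    cur = conn.execute(
        "SELECT order_id, customer_email, total FROM order_events_v1 ORDER BY order_id"
    )
    cols = [d[0] for d in cur.description]
    return [dict(zip(cols, row)) for row in cur]
```

If the tables are later renamed or split, only the view definition has to change, as long as it keeps exposing the same three columns.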


Thank you!


Designing Distributed Systems: this is more for DevOps folks than for people creating their own distributed system.


@skrtskrt: Sorry for hijacking this post, but is there a way to contact you directly (my email is in my profile)? Your approach to building Django apps closely resonates with me and I would love to discuss these concepts in more detail.


Concurrency and Scalability for Distributed Systems is on my reading list. It's in early release, but I think it's what you want: https://learning.oreilly.com/library/view/concurrency-and-sc...


This sounds great!


Seek out distributed systems research papers from real-world practitioners. A quick search led me to this nice collection: https://dancres.github.io/Pages/


If you read https://www.worldcat.org/title/designing-data-intensive-appl... you are probably ready to read the research articles on basic concepts that Tanenbaum's book references.

When doing my MSc in CS at Uppsala University, we implemented a distributed system. We used Tanenbaum's book, and while it was very good, it was a bit terse on details. Looking up the actual research papers it referenced was helpful during implementation.


Ken Birman's (https://www.cs.cornell.edu/ken/) book:

"Guide to Reliable Distributed Systems: Building High-Assurance Applications and Cloud-Hosted Services"

He even gives a course on topics in cloud computing: https://www.cs.cornell.edu/courses/cs5412/2021sp/

Another interesting course about distributed systems is from MIT: https://pdos.csail.mit.edu/6.824/schedule.html


This depends a bit on what you aim to get out of reading them. I personally liked both of the books you mentioned, for different reasons. Do you have a set of goals you want as an outcome? Could you say a bit more about your background as well, please?


While they are not solely focused on distributed systems, these five books -- https://rvprasad.medium.com/books-about-designing-systems-of... -- cover different aspects of designing systems of scale, which are most often distributed systems.


While not a book: I took Udi Dahan's Advanced Distributed Systems Design course in 2018 and can't recommend it enough: https://particular.net/adsd

I'm pretty sure there is an online course available which you can watch, so it's a slightly different medium but may be a nice change of pace from reading. The course was really informative for me and has really been paying dividends in my career.


I also enjoyed:

* Release It! by Michael Nygard

* System Design Interview by Alex Xu



