This is a great outline. Martin Kleppmann's Designing Data-Intensive Applications [0] covers very much the same domain as these notes.
The notion of what constitutes a ‘distributed system’ has become vague. The original meaning implied a high degree of transparency, to the point that if you opened a new tab in Chrome (you know, a web browser), it might run on a different node. The idea was for the cluster to appear to the user as a single computer. I think Plan9/Inferno was the last example of such a system.
The Erlang VM presents such a unified view to programs that run on it. The asynchronous messaging built into the language works the same between servers as it does locally.
You can determine whether the process you need to communicate with is local or remote if desired, but it's unusual to do so.
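To make the idea concrete, here is a toy sketch in Python (not Erlang, and not how the Erlang VM is implemented) of what location-transparent messaging looks like from the caller's side. Everything in it is hypothetical and for illustration only: the `Pid` and `Node` classes, the `network` dict standing in for a real transport, and the node names. In real Erlang the equivalent of `is_local` would be comparing `node(Pid)` with `node()`.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Pid:
    """Hypothetical process identifier: the owning node plus a local id."""
    node: str
    local_id: int


class Node:
    """Toy node. `network` is a plain dict (node name -> Node) that stands in
    for a real network transport; this is an assumption of the sketch."""

    def __init__(self, name, network):
        self.name = name
        self.network = network
        self.mailboxes = {}
        self._next_id = 0
        network[name] = self

    def spawn(self):
        # Create a mailbox for a new "process" and return its identifier.
        pid = Pid(self.name, self._next_id)
        self._next_id += 1
        self.mailboxes[pid.local_id] = deque()
        return pid

    def is_local(self, pid):
        # Available if you need it, but callers rarely have to ask.
        return pid.node == self.name

    def send(self, pid, message):
        # The caller uses the same call either way; routing is hidden here.
        target = self if self.is_local(pid) else self.network[pid.node]
        target.mailboxes[pid.local_id].append(message)

    def receive(self, pid):
        # Pop the oldest message from a mailbox owned by this node.
        return self.mailboxes[pid.local_id].popleft()


network = {}
a = Node("a@host1", network)
b = Node("b@host2", network)

local_pid = a.spawn()
remote_pid = b.spawn()

# Identical call sites for a local and a remote destination.
a.send(local_pid, "hello")
a.send(remote_pid, "hello")
```

The point of the sketch is only that the sending code does not branch on where the recipient lives; a real system also has to deal with latency, partial failure and message loss, which this toy ignores.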
I've lost track of Kyle's work since my last job crumbled, but for anyone unfamiliar with him, it's a pretty safe bet that his material is among the best available on this topic.
His tool Jepsen is the gold standard for testing the consistency guarantees of distributed databases.
We (backend engineering at Remind) went through this repo over the course of 6 weeks, taking turns leading the discussion. It went really well. We had a variety of participants, from seasoned backend engineers to boot camp grads, and everyone got a lot out of it.
Speaking of scale and distributed systems, here are some AWS-specific resources I find interesting from their EdgeEngineering/NetEng teams (I haven't seen many other service teams in AWS share as much about design as openly as they do):
- https://aws.amazon.com/blogs/architecture/category/networkin... series of articles on Route53's 100% data-plane availability architecture [1].
- https://www.youtube.com/watch?v=O8xLxNje30M colmmacc [2] (seems to be the eng behind AWS HyperPlane [3][4]?) on 10 design patterns for building resilient systems.
- https://www.youtube.com/watch?v=swQbA4zub20 Peter Vosshall (co-creator of Dynamo [5]) presenting the "cell-based" design in use at AWS.
----
[0] https://dataintensive.net/
[1] https://www.slideshare.net/AmazonWebServices/under-the-hood-...
[2] https://news.ycombinator.com/user?id=colmmacc
[3] https://atscaleconference.com/videos/networking-scale-2018-l...
[4] https://www.youtube.com/watch?v=dfEcd3zqPOA&feature=youtu.be...
[5] DeCandia, G.; Hastorun, D.; Jampani, M.; Kakulapati, G.; Lakshman, A.; Pilchin, A.; Sivasubramanian, S.; Vosshall, P.; Vogels, W. (2007). "Dynamo: Amazon's Highly Available Key-value Store".
[6] Bonus: The SRE Book https://landing.google.com/sre/sre-book/toc/index.html