System Design 101 (github.com/bytebytegohq)
250 points by saikatsg on Oct 25, 2023 | 43 comments



ByteByteGo seems to almost be a content generation concept for a niche market. And the thing is, aspiring developers do, apparently, eat this stuff up. Thing is though, you don't truly assimilate these concepts from a simple diagram, not just because they are gross simplifications or crude approximations (they are), but because you need to "feel" what these systems are like in a real-world scenario, or at least to have broken your teeth designing one, for any of these ideas to sink in in a meaningful way.

So it's sort of like...if you have designed these types of systems or know some of these ideas from firsthand experience, the cute diagrams hold very little value. If, on the other hand, you're a new engineer, you're gobbling up these diagrams, but they don't really hold any meaning. You can't just binge on some doc that attempts to collate all system design ideas and think you've got it. Knowledge isn't acquired by "carrying books". Now, I'm not saying there might not be value in these diagrams as a refresher or tool for someone already familiar with these concepts, but it's limited to that, and I suspect newer devs are being led to believe they can just memorize these diagrams and thereby grok these ideas. Not so.

I also cannot help but feel the author sees this area and audience as fertile ground for producing a ton of content and gaining readership - it's substack, it's blog posts...and the list goes on. I really try to avoid imputing bad motive or malintent, but I can't help but feel this is cynical, in the sense that the author knows this information, in this format, targeting this audience, does very little for them and, worse, leads them astray because of how at odds the presentation is with the way real knowledge is actually acquired.

I guess what I'm mainly saying is: beware, young devs. This ain't it.


Master in CS with 20+ years of professional experience chiming in here

> because you need to "feel" what these systems are like

I've done a lot of job hopping and seen a lot of technologies and companies. There is no way you will ever really experience even 10% of the information found on that page.

In my opinion, it seems like a great resource, and I bookmarked it to go through when I have some time.

Telling young devs to "gain experience" is pretty useless information. Learning theory is a large part of our job.


This seems to be a never-ending story. There are companies out there giving system design interviews to candidates with 3 years of experience, and that's mostly the target of ByteByteGo: I'd say, desperate people trying to digest as much as they can in the least time to increase their chances of getting a job. Same as people grinding LeetCode in record time to land a job.

BBG content is good if you want to learn about use cases and how things work, and you're already in a position to take responsibility for implementing technologies like those, or you have a say in your team/project. If you're never going to work with them, then it's just reading for nothing; you'll end up forgetting those things exist.


Still better than nothing, because on YouTube there really isn't a lot of content like this. Most people who create content like this are FAANG employees, and you're still missing a lot of nuance.

Sure, you can't really translate it 100% to your requirements, but it gives a lot of people a mind map to start from.


Disagree. ByteByteGo is great for a 50,000ft view of a topic. It's useful to watch/listen/read ByteByteGo for a first pass at a topic and learn the correct terminology in that domain.

It's also useful for scratching the itch of "huh, how does that work?" without having to allocate 100 hours of "breaking your teeth" learning the concept.

IMO any knowledge is good knowledge, even if it's a simplification of the topic (and the reader is aware it's a simplification).


Maps won't teach you an intuition for tactics. Without maps tho, the best tactician is severely handicapped.


> Stack Overflow serves all the traffic with only 9 on-premise web servers, and it’s on monolith! It has its own servers and does not run on the cloud. This is contrary to all our popular beliefs these days.

Love this part. I wonder how much of the complexity of microservices and various cloud services is actually adding value on a net basis vs. being the resume-driven development of some bored backend engineer.


I also love this anecdote -- but the SO engineering team that made that possible is _not_ your average LOB engineering team in my experience. Those guys were like C#/SQL technomancers. SO is a great example of just how much is possible with a .NET monolith, but very few teams are going to be capable of driving the machines that hard.


As I understand it, SO sites are mostly read traffic and they keep something like 2/3 of all of their data in cache at any given time. When your entire dataset is sitting in ram, you can handle insane amounts of traffic with few machines using any programming language, because most of the traffic is never even touching any code you wrote. Still impressive though!

Edit: I'd be curious how they hold up to a Redis `FLUSHALL` or memcached `flush_all`. I wonder if they've ever game day'd such a scenario.


> how they hold up to a `flush all`

Populate the cache as part of the startup sequence.


I was talking more about the case where the app servers are up and running and their cache layer just goes offline, so the app servers get 100% of the traffic. The `flush all` would be to simulate the cache going away. Populating caches on app boot is a bad idea: it'll slow boot times, and if the cache is down, now your apps can't boot either to at least run in a degraded state. Cache should never be in the critical path like that. It's a cache, and it going away is not unexpected. Just my 2 cents.
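To make "cache should never be in the critical path" concrete, here is a minimal sketch of a cache-aside read that keeps serving when the cache disappears. Everything in it is hypothetical: `FlakyCache` is an in-memory stand-in for Redis or memcached, `CacheDown` and `load_from_db` are made-up names, and nothing here is claimed to be how Stack Overflow does it.

```python
class CacheDown(Exception):
    """Raised by the hypothetical cache client when the cache is unreachable."""


class FlakyCache:
    """In-memory stand-in for Redis/memcached that can be switched off."""

    def __init__(self):
        self.data = {}
        self.up = True

    def get(self, key):
        if not self.up:
            raise CacheDown(key)
        return self.data.get(key)

    def set(self, key, value):
        if not self.up:
            raise CacheDown(key)
        self.data[key] = value


def load_from_db(key):
    # Placeholder for the slow, authoritative lookup (assumption).
    return f"row-for-{key}"


def fetch(cache, key, db_calls):
    """Cache-aside read: try the cache, fall back to the database on a miss or outage."""
    try:
        value = cache.get(key)
        if value is not None:
            return value
    except CacheDown:
        pass  # degraded mode: keep serving, just slower
    value = load_from_db(key)
    db_calls.append(key)  # record that the database took the hit
    try:
        cache.set(key, value)  # fill lazily on first use, not at boot
    except CacheDown:
        pass
    return value
```

Because the cache is filled lazily, the app can boot with the cache down. The cost is that every request lands on the database during an outage, which is exactly what a game day would need to measure.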

Some cool info about SO

From 2019

> Layers of Cache at Stack Overflow

> We have our own “L1”/“L2” caches here at Stack Overflow, but I’ll refrain from referring to them that way to avoid confusion with the CPU caches mentioned above. What we have is several types of cache. Let’s first quickly cover local and memory caches here for terminology before a deep dive into the common bits used by them:

> “Global Cache”: In-memory cache (global, per web server, and backed by Redis on miss)

> Usually things like a user’s top bar counts, shared across the network

> This hits local memory (shared keyspace), and then Redis (shared keyspace, using Redis database 0)

> “Site Cache”: In-memory cache (per site, per web server, and backed by Redis on miss)

> Usually things like question lists or user lists that are per-site

> This hits local memory (per-site keyspace, using prefixing), and then Redis (per-site keyspace, using Redis databases)

> “Local Cache”: In-memory cache (per site, per web server, backed by nothing)

> Usually things that are cheap to fetch, but huge to stream and the Redis hop isn’t worth it

> This hits local memory only (per-site keyspace, using prefixing)

and

> For the curious, some quick stats from last Tuesday (2019-07-30) This is across all instances on the primary boxes (because we split them up for organization, not performance…one instance could handle everything we do quite easily):

> Our Redis physical servers have 256GB of memory, but less than 96GB used.

> - 1,586,553,473 commands processed per day (3,726,580,897 commands and 86,982 per second peak across all instances – due to replicas)

> - Average of 2.01% CPU utilization (3.04% peak) for the entire server (< 1% even for the most active instance)

> - 124,415,398 active keys (422,818,481 including replicas)

> - Those numbers are across 308,065,226 HTTP hits (64,717,337 of which were question pages)
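For a sense of scale, the quoted daily totals can be turned into per-second and per-request rates. This is only back-of-the-envelope arithmetic on the figures above, assuming the traffic is spread evenly over a 24-hour day and that the first command count excludes replica traffic, which is how the quote reads.

```python
SECONDS_PER_DAY = 24 * 60 * 60

commands_per_day = 1_586_553_473        # as quoted, assumed to exclude replicas
commands_with_replicas = 3_726_580_897  # as quoted, across all instances
peak_per_second = 86_982                # as quoted, across all instances
http_hits = 308_065_226
question_hits = 64_717_337

avg_per_second = commands_per_day / SECONDS_PER_DAY
avg_with_replicas = commands_with_replicas / SECONDS_PER_DAY
commands_per_hit = commands_per_day / http_hits
question_share = question_hits / http_hits

print(f"average commands/s:                 {avg_per_second:,.0f}")
print(f"average commands/s incl. replicas:  {avg_with_replicas:,.0f}")
print(f"peak vs. average (incl. replicas):  {peak_per_second / avg_with_replicas:.1f}x")
print(f"Redis commands per HTTP hit:        {commands_per_hit:.2f}")
print(f"share of hits on question pages:    {question_share:.0%}")
```

That works out to roughly 18,000 commands per second on average and about five Redis commands per HTTP hit, which fits the single-digit CPU utilization in the quote.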

https://nickcraver.com/blog/2019/08/06/stack-overflow-how-we...
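The "local memory, then Redis on miss" layering in the quote can be sketched roughly as below. This is an illustration only: the shared tier is a plain dict standing in for Redis, `loader` is a hypothetical function for the authoritative source, and expiry and invalidation are left out entirely. What happens when both tiers miss is an assumption, not something the quote spells out.

```python
class TwoTierCache:
    """Per-server memory in front of a shared store, in front of a loader."""

    def __init__(self, shared, loader):
        self.local = {}        # per-web-server memory
        self.shared = shared   # stand-in for Redis (assumption: a plain dict)
        self.loader = loader   # hypothetical authoritative lookup

    def get(self, key):
        """Return (value, tier) where tier names the layer that answered."""
        if key in self.local:
            return self.local[key], "local"
        if key in self.shared:
            value = self.shared[key]
            self.local[key] = value  # promote to local memory
            return value, "shared"
        value = self.loader(key)
        self.shared[key] = value
        self.local[key] = value
        return value, "source"


# Two "web servers" sharing one store.
redis_stand_in = {}
server_a = TwoTierCache(redis_stand_in, lambda key: f"value-of-{key}")
server_b = TwoTierCache(redis_stand_in, lambda key: f"value-of-{key}")

print(server_a.get("question-list"))  # answered by the source
print(server_b.get("question-list"))  # answered by the shared tier
print(server_b.get("question-list"))  # answered by server_b's local memory
```

The "Local Cache" variant from the quote would be the same class with the shared tier removed.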


You have it work the way you chose, therefore not a bad idea.


Stack Overflow is a much simpler, mostly read-only, easily cacheable site compared to a workflow-heavy SaaS. They're not even remotely comparable.


I think you're underestimating how many writes happen on SO, and overestimating how many writes happen on the 99% of SaaS products that have very few paying customers.


Currently work for a very large household name SaaS, previously on Salesforce. I'm willing to bet good money that SO does not remotely come close to the amount of writes in either of these systems.


Resume AND sales-driven.

It is hard to make a high-growth startup selling co-located servers whose docs page is a link to Linux documentation and RFCs.


This is bizarre. A tiny bit of information about incredibly complex subjects. And this is "interview prep" ? Like for those "consultants" that try to bluster their way into a 6 figure gig?


To be honest, these "system design" rounds are a hassle for me and even those in-depth YouTube videos don't help.

It's all so theoretical and I feel like everything I do/say is bluffing my way through them. I prefer straight algo rounds even though those are a grind.


See also: "The System Design Primer - Learn how to design large-scale systems"

https://github.com/donnemartin/system-design-primer (see https://hn.algolia.com/?q=system-design-primer for discussions)


Good Stuff,

About GraphQL: I read that GraphQL is versionless. Is it really versionless? If so, how? The link below [0] says you will still need to version, but we have some GraphQL fans at work and they insist that it does not need to be versioned.

[0] https://medium.com/swlh/no-graphql-doesnt-magically-fix-api-...


It's not versionless per se, but given you ask for the specific fields you want (e.g. user { name }), you don't have to be anal about bumping the version every time a new prop is added to a response, such as in REST, e.g. GET /users/1234

However, if you change the data schema around, like `user` is now `person` or something, or a mutation call changes the required params, there's no magic, you either update the clients, or add versioning.
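A toy illustration of the difference between an additive change and a rename. This is not a GraphQL implementation, just a dict projection that mimics a client naming the fields it wants (as in `user { name }`); the `select` helper and the sample records are made up for the example.

```python
def select(record, fields):
    """Return only the requested fields, failing on any field the record lacks."""
    missing = [f for f in fields if f not in record]
    if missing:
        raise KeyError(f"unknown field(s): {missing}")
    return {f: record[f] for f in fields}


user_before = {"name": "Ada"}
user_added_field = {"name": "Ada", "avatar_url": "https://example.invalid/a.png"}
user_renamed_field = {"full_name": "Ada"}

# Additive change: the client's result is identical, no version bump needed.
print(select(user_before, ["name"]) == select(user_added_field, ["name"]))

# Rename: the same query now fails, so clients must change or the API must version.
try:
    select(user_renamed_field, ["name"])
except KeyError as err:
    print("breaking change:", err)
```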


To be fair, you don’t have this problem with REST either (any new properties can be safely ignored). Where it breaks down, however (and this applies more to GraphQL because of its insistence on not being versioned), is when you want to deprecate some fields that a lot of clients depend on.


I agree with you, though I have worked with a lot of folks who insist new fields are worthy of a new version. I mostly use Javascript which is really forgiving, but sometimes people have these C++/etc client libraries that blow up with unexpected data apparently.


And in case it's not clear: GraphQL is not "SQL on the web", so just because your DB schema changes, your GraphQL API doesn't have to change (basically like REST).


This is a very important point.

I'd argue all these "publish your DB schema as a GraphQL endpoint" frameworks that seem to proliferate have done a lot of damage to GraphQL's reputation. Strongly coupling data to presentation seems like such an obvious anti pattern, yet tools doing just that seem to be very popular for some reason.


Agreed, the reason I really like GraphQL is you can map different parts of your schema "tree" to different backend systems, APIs, etc. This makes your API plane/BFF nicely decoupled from those details like DB tables, etc.
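A rough sketch of that idea, with each top-level field resolved by a different backend so callers never see where the data lives. The backends (`users_service`, `orders_api`) and the `RESOLVERS` table are hypothetical. Real GraphQL servers attach resolvers per type and field rather than using one flat dict like this.

```python
def users_service(user_id):
    # Hypothetical backend 1: could be a SQL database.
    return {"id": user_id, "name": "Ada"}


def orders_api(user_id):
    # Hypothetical backend 2: could be another team's HTTP API.
    return [{"order_id": 1, "total": 42}]


# The API layer owns this mapping; swapping a backend changes only this table.
RESOLVERS = {
    "user": users_service,
    "orders": orders_api,
}


def resolve(requested_fields, user_id):
    """Build the response by calling the resolver behind each requested field."""
    return {field: RESOLVERS[field](user_id) for field in requested_fields}


print(resolve(["user", "orders"], user_id=7))
```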


If you want to make a breaking change, yeah - you need to version it somehow. No question. It's fairly flexible in how much you can change without it being breaking, but it's not magic.

Beyond that it's arguably "versionless" because it does nothing to address the problem. Which is essentially fine, and is what most protocols do: you make a completely new API or change the endpoint (e.g. http path) to gradually make breaking changes (expose both, migrate callers whenever), because something like that is basically always an option.
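"Expose both, migrate callers whenever" might look something like the sketch below, reusing the `user` to `person` rename from earlier in the thread. The paths, handlers, and `ROUTES` table are invented for illustration and are not tied to any particular framework.

```python
def get_user_v1(user_id):
    # Old response shape, kept alive until callers have migrated.
    return {"user": {"id": user_id, "name": "Ada"}}


def get_user_v2(user_id):
    # New response shape containing the breaking rename.
    return {"person": {"id": user_id, "name": "Ada"}}


# Both versions are served side by side.
ROUTES = {
    "/v1/users": get_user_v1,
    "/v2/users": get_user_v2,
}


def handle(path, user_id):
    """Dispatch to whichever version the caller asked for."""
    return ROUTES[path](user_id)


print(handle("/v1/users", 1234))  # existing clients keep working
print(handle("/v2/users", 1234))  # migrated clients get the new shape
```

Once nothing calls `/v1/users` any more, that entry can be dropped from `ROUTES`.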


ByteByteGo is one of the best YouTube channels I've come across. Really well done content, all explained very clearly.


There is a dead comment in reply to the above that says that it is very shallow. Maybe it is too crudely worded, but I also have been left feeling disappointed with the ByteByteGo videos. They actually do feel more like bullet point lists. I don't mean to say it was not a lot of work to produce (especially the animations), but they are not deep dives. It could be a good resource for learning where to start, but it's not what I would call comprehensive or authoritative.


It's rote memorization. That's the teaching strategy they employ, and perhaps even learned from themselves. If you asked a follow-up, you'd likely just get a rephrasing of the same thing they already said.

This is completely synonymous with “learning” in some cultures.

To be fair, I think this can be a useful angle for learning, but it's not sufficient. Treat their newsletter like flashcards.


What I find valuable is the distillation of a complex topic I’m unfamiliar with, to the point where I can easily understand what it is at a high level in just a few minutes.

This is easier said than done; there are plenty of technologies that have horrendous intro documentation that dives into all sorts of overly complicated and confusing shit. It’s really hard to stay on topic and present only what’s relevant for someone who’s never heard of the technology before.

Not everything needs to be a comprehensive deep dive.


Yeah, I would say I'm fairly experienced and I absolutely love their gifs and images. But I find the content very shallow overall and very oriented towards passing interviews - which I believe is their goal so :)

I appreciate the gifs !


I find it very shallow. Sometimes I wonder if the person presenting it actually has deep knowledge or is just reading bullet point articles.


It is indeed very shallow, especially the shorter videos. But what I love about them is that they're very valuable for anyone not familiar with the topic at hand.


Keep up the good work! I feel that your content is bringing a lot of educational value to the tech community. I would even say that the visualisations extend the understanding of tech-averse people and lead to more people understanding our rather complex ecosystems. Thank you


What a little treasure trove of information.

I can see a recent CS grad who’s never built anything benefiting immensely from this.

The bigger thing here is that it teaches you the vocabulary for you to then go on and research and form your own opinions etc.

All you need is a sound starting point and this is definitely it.

Kudos to you man.


This is a great overview and explains the subjects briefly but well. I definitely was surprised by seeing a whole section dedicated to payments.


I was also intrigued by that section. It's surely interesting but looks more like an important special case than a technical subject. Not complaining tho.


bytebytego is insanely high quality compared to the geeksforgeeks-like websites that came before.


given geeksforgeeks is/was spammy nonsense, I can't tell what you're trying to say about bytebytego


Draw.io is used everywhere. I see it all over in different enterprises...


No wonder:

It is open source and free, insanely powerful, easy to use and has impressive backwards compatibility.


Great sharing



