WookieRushing's comments

Yup! This is the reason it's so cheap for them. Other companies in similar positions have cache nodes in ISPs, which dramatically lowers the cost.


Why do you think it's easy?

There are a lot of systems where you can easily take down some hosts, but taking down more than N% at a time causes issues. If your fleet is large enough, you're limited by the largest set of hosts where you can only take N% down at a time. You could say "keep the sets of hosts small" or "make N% large", but that causes other issues, since you typically lose efficiency or zonal outage protection.

A solution to this could be VM live migration or something similar. That breaks down for storage systems, where you can't just migrate the disks virtually because they're physical disks, and for places that don't use VMs at all.
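A toy sketch (plain Python, made-up numbers) of how that N% cap slows things down: the number of update waves is set by the percentage, so a big stateful tier dominates the wall-clock time of any fleet-wide change.

  import math

  # Toy model: a host group tolerates at most a fraction of its members
  # being down at once, so a full update has to proceed in waves.
  def update_waves(group_size, max_unavailable_fraction):
      per_wave = max(1, math.floor(group_size * max_unavailable_fraction))
      return math.ceil(group_size / per_wave)

  # A 20,000-host storage tier that can only lose 2% at a time needs 50 waves;
  # if each wave takes an hour to drain, update, and re-replicate, that's ~2 days.
  print(update_waves(20_000, 0.02))  # 50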


My bet is on cache servers having a bad release


It’s millions a year and MS Teams is nearly free. In a cost-cutting environment this is near perfect.


How big of a team costs that much? Looks like the highest listed price is $180/person/year before it becomes "call us" pricing. I'm sure that enterprise plan can get expensive, but even at $500/person/year that's at least 2000 users.

I'd hope there are lots of things higher on the list of effective cost savings before trying to migrate that many users, integrations, and history.

Such migrations are happening and will continue to happen, but I expect most will go about as well as the one discussed in a sibling comment.


Why is there input lag?

I’ve used something identical and mosh makes this just work. Most devs at that company swear by remote builds and hate laptop builds


Mosh doesn't work with SSH bastions (kind of obviously, but admittedly a bummer indeed), which Uber's blog post shows they use (as do many other similarly-sized companies).


They can get it from your Android phone or from Chrome if you're running it.


No, cryptocurrency cannot be used as a normal currency. Nothing as easy as handing cash to someone else exists in cryptocurrency.

There's one giant benefit of using cryptocurrency, plus a few smaller niches. It is a currency of last resort: a guard against hyperinflation or currency controls, or a way to pay for drugs. You should really only be using it if you can't turn your money into dollars but can earn cryptocurrency some other way.

Check out an example like https://mission.org/hidden-in-plain-sight/bitcoin-a-lifeline... . In this case it's easier to earn crypto as payment for doing work online than it is to be paid in USD.

The smaller niches include things like flash loans.

Flash loans are actually a new thing in the world, but still pretty dangerous. You can get a massive loan (think $1 billion) without putting anything down, as long as you pay it back in the same transaction you take it out. Ethereum can guarantee that the whole transaction reverts unless the repayment succeeds, so the lender really can loan out $1 billion. Of course, this only works as long as whatever tool the loan contract uses to check that the borrower has repaid actually works. If not, the money is going to be stolen...
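A minimal sketch of that invariant in plain Python (all names made up; a real flash loan lives in a smart contract): the loan and the repayment happen inside one atomic transaction, and if the repayment check fails, everything is rolled back as if the loan never happened.

  class TransactionReverted(Exception):
      pass

  class Pool:
      def __init__(self, balance):
          self.balance = balance

      def flash_loan(self, amount, borrower_callback):
          """Lend `amount` with zero collateral; revert unless it all comes back."""
          snapshot = self.balance
          self.balance -= amount
          try:
              repaid = borrower_callback(amount)   # borrower arbitrages, liquidates, etc.
              self.balance += repaid
              if self.balance < snapshot:          # not fully repaid
                  raise TransactionReverted("loan not repaid in the same transaction")
          except TransactionReverted:
              self.balance = snapshot              # atomic: as if nothing happened
              raise

  pool = Pool(balance=1_000_000_000)
  pool.flash_loan(1_000_000_000, lambda amt: amt)      # repaid in full -> succeeds
  try:
      pool.flash_loan(1_000_000_000, lambda amt: 0)    # tries to keep it -> reverted
  except TransactionReverted:
      pass
  print(pool.balance)  # 1000000000 either way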

As for a goal, there isn't one. It's whatever folks want it to be, and a large number think Ponzis are the goal...

As for being a normal currency, it does not need to be stable; see Venezuela as a recent example. It typically can't track a real currency like USD exactly. In fact, trying to hold the exchange rate between a strong currency and something else fixed almost always ends horrifically, as with https://en.wikipedia.org/wiki/Black_Wednesday


It's like Fish shell vs Bash shell.

Bash has weird defaults so you end up googling for everything. In fish, it just works and you barely need to search for anything.

Sane defaults matter. With hg, I don't need to struggle to get it to do what I want; it just gets out of the way. With git, sure, it works, but like you said, you end up with a bunch of duct-taped tools that change the defaults or just generally make things easier.

Now, hg is only half the pattern here. The other half is stacked commits: each commit should build and get reviewed separately. There isn't any waiting for reviews on each commit; they all get reviewed over time and you rebase in any changes that are requested. With git this is amazingly painful, and half my zshrc is about making it simpler. With hg, it just works. Take a look at hg absorb or hg split; they're features built on top that, yeah, could be replicated in zsh scripts, but it's nice when you can assume they just work. It means junior engineers don't spend hours fighting git over stacked diffs.
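For example, the fix-up loop on a stack is roughly this (file name made up; absorb and split are real commands, shipped with Mercurial as bundled extensions you enable in your .hgrc):

  $ vim api/handlers.py   # apply the reviewer's requested change
  $ hg absorb             # fold each hunk into the draft commit that last touched those lines
  $ hg split              # interactively break up a commit that grew too large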

Sapling is trying to fight the network effect here with the classic move of building a compatible but legitimately better front end. Compatible with GitHub but with sane defaults is a BIG thing.


> In fish, it just works and you barely need to search for anything.

I keep having to google the location of my configuration file. It's ~/.config/fish/config.fish, I think, if it's not in ~/.local.

The whole function thing is also not the easiest to understand, although I love that it hot reloads and is global across all instances and so on, along with all sorts of other things.

Overall fish is one of my favorite shells but it's not 100% intuitive at first.


When the risk to society becomes high enough, it becomes necessary.

See paying taxes, conscription, jury duty

Though judging the level of risk is very hard, and there is often no clear line for when force is needed.


I was also surprised by this. 300K nodes for a distributed DB is kind of crazy. I’ve worked with similar systems, but they stored much more than 100 PB with 10x fewer nodes.

Apple is using less than one TB per server… (100 PB across 300K nodes is roughly 0.3 TB each.)

But when you see the thousands of clusters it starts to make sense. They probably offer a Cassandra cluster as the default storage for any use case, and each one probably requires at least 3 nodes. They're keeping the blast radius of any issue small while being super redundant. It probably grew organically rather than through any central capacity management.


What you describe was best practice for older versions of Cassandra on older versions of the Oracle JVM with spinning disks, and at that time Apple already had a massive number of Cassandra nodes. This was back when 1TB disks were what we had started buying for our servers. Cassandra was designed to run on large numbers of cheap x86 boxes, unlike most other DBs, where people had to spend hundreds of thousands or millions of dollars on mainframes and storage arrays to scale to the size they needed.

Half a TB per node was the guideline, and during regular compaction that could double. If you went over, your CPU and disk spent so much time on overhead such as JVM garbage collection that your compaction processes backlogged, the node got slower and slower, the disk eventually filled up, and it fell over. Later things got better, and you could use bigger nodes if you knew what you were doing and didn't trip over any of the hidden bottlenecks in your workload. Maybe it's even fixed in the last few versions of Cassandra 3.x and 4.0.


What psaux mentioned makes more sense: a node == one Cassandra instance rather than one physical server.

Past 100k servers you start needing really intense automation just to keep the fleet up with enough spares.

If you’ve got, say, 10k servers, it’s much more manageable.

The fun thing is Cassandra was born at FB but they don’t run any Cassandra clusters there anymore. You can use lots of cheap boxes, but at some point the failure rate of using so many boxes ends up killing the savings and the teams.


Yes, you can run multiple nodes on a single physical server. However, then you have the additional headache of ensuring that only one copy of the data gets stored on that physical server, or else you can lose data if that server dies. It's similar to having multiple nodes backed by the same storage system, where you need to ensure that losing a disk or volume doesn't take out two or more copies of the data. Cassandra lets you organize your replicas into 'data centers', and gives you some control inside a DC by allocating nodes to 'racks' (with some major gotchas when resizing, so not recommended!). Translating that onto VMs running on physical servers and shared disk is (was?) not documented.
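A toy illustration of the placement invariant you end up maintaining yourself in that setup (plain Python, hypothetical names, not Cassandra code): no two replicas of the same data may land on the same physical server.

  # Given a node -> physical host map, verify that no replica set puts
  # two copies of the same data on one physical server.
  def placement_is_safe(replica_sets, node_to_host):
      for replicas in replica_sets:
          hosts = [node_to_host[node] for node in replicas]
          if len(hosts) != len(set(hosts)):
              return False   # two replicas share a physical server
      return True

  node_to_host = {"node-a": "server1", "node-b": "server1", "node-c": "server2"}
  print(placement_is_safe([("node-a", "node-c")], node_to_host))  # True
  print(placement_is_safe([("node-a", "node-b")], node_to_host))  # False: same box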


> The fun thing is Cassandra was born at FB but they don’t run any Cassandra clusters there anymore.

Isn't Instagram mostly Cassandra?

https://instagram-engineering.com/open-sourcing-a-10x-reduct...


It wasn’t when I last saw it. Rocksandra ended up being a stepping stone to FB's most common distributed DB, ZippyDB: https://engineering.fb.com/2021/08/06/core-data/zippydb/

ZippyDB is honestly one of the best parts of FB infra. It lets you select levels of consistency vs latency.


> ZippyDB is honestly one of the best parts of FB infra. It lets you select levels of consistency vs latency.

How is that different from Cassandra's Tunable consistency model?

https://cassandra.apache.org/doc/4.1/cassandra/architecture/...
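For reference, with the DataStax Python driver the consistency level is chosen per statement, roughly like this (keyspace and table names are made up):

  from cassandra import ConsistencyLevel
  from cassandra.cluster import Cluster
  from cassandra.query import SimpleStatement

  cluster = Cluster(["127.0.0.1"])
  session = cluster.connect("my_keyspace")

  # Stronger read: a quorum of replicas must answer.
  strong = SimpleStatement("SELECT * FROM users WHERE id = %s",
                           consistency_level=ConsistencyLevel.QUORUM)
  session.execute(strong, (42,))

  # Faster, weaker read: any single replica is enough.
  fast = SimpleStatement("SELECT * FROM users WHERE id = %s",
                         consistency_level=ConsistencyLevel.ONE)
  session.execute(fast, (42,))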


Seagate introduced 2TB drives no later than 2010.


Interestingly, using the highest-capacity drives available at any point in time would have worked even worse, since they spun slower and had lower sequential write speed. That's if you could even get them from your preferred vendor, which for us seemed to be several years after introduction!

