I'm the senior dev on my team, and whenever a new dev joined, they would look at the codebase and go "ew, python2? Just use python3."
That gave me a chance to explain the testing and refactoring cost that would come with changing python versions, and how the benefits to users would be almost zero. And then at some point one of the new juniors said, "hey, there's a lot of filesystem performance improvements and readability improvements (f-strings) in 3. I can get the test/refactor done in a month, and I think it's a net positive." They were right, and now we're on python3.
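For anyone curious what those wins look like in practice, here's a tiny illustrative sketch (nothing project-specific) of the two things the junior called out, f-strings and the cheaper directory scanning you get from os.scandir (Python 3.5+):

    import os

    label = "reports"
    count = 0

    # Python 2 style:  print("scanning %r" % label)
    # Python 3 f-string, easier to read at a glance:
    print(f"scanning {label!r}")

    # os.scandir yields DirEntry objects; is_file() can often be answered
    # from the directory listing itself, avoiding an extra stat() per entry.
    with os.scandir(".") as entries:
        for entry in entries:
            if entry.is_file():
                count += 1

    print(f"found {count} files in the current directory")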
The reverse of this also happens: a new team manager joins a team of 4-5 devs and goes "eww... a monolith, we'll write an MVP in 3 weeks with microservices, CQRS and all".
Long story short, a year and a half passes and the MVP is still not finished, the architect leaves the company, and some poor guys (from an outsourcing company) are still going at it with the same architecture.
CQRS and microservices have so much overhead; I'm amazed at how many companies adopt microservices without anyone who knows a single thing about distributed systems. I think people underestimate the glue required to make microservices useful. I had a similar situation at my last company, where they spent a year and a half converting their monolith into half-assed microservices and it still wasn't working properly. They also stopped releasing new features on their product during this entire conversion, which obviously led to a drop in customers. The mass exodus of employees at the end was something beautiful though.
I would argue that the main reason for adoption of microservices is actually to prolong development time (moar dollars) and make devops as convoluted as possible to safeguard data. Or foolishness.
I don't understand the hate for microservices on HN. I have found microservices are a great way to extend monoliths developed with legacy frameworks. There are so many pros, and the cons are easily avoidable with the right tooling/processes.
I just started working on a new project about half a year ago, completely greenfield. The backend developer (there is only one...) jumped into microservices directly, deploying on AWS Fargate, trying to split out as many things into containers as possible, because that's the "proper" way of doing it. We still have few users as the project is new, but hey, at least we could scale to 100,000s of users in a few minutes, instead of having easier logging, debugging and versioning.
A lot of stuff in software engineering is just driven by what's popular, not what's needed.
A monolith can handle millions of daily unique users. The database is the hard part.
In the web world, if you have fewer than tens of millions of daily users, as long as you design an application that holds no state itself, the architecture usually matters more for scaling your team and the size of your codebase than for the number of users you can handle.
I've been in discussions about internal applications with a few hundred users and very moderate amounts of data being shuffled and stored, and some people, of various backgrounds, are just so convinced that we need to go all in on Kubernetes, Microservices, Istio etc.
And all I can think about is "Hey, you could build this with a small team as a simple monolith, and with proper caching you could probably run this on one or two Raspberry Pis; that is the amount of power you actually need here".
Don't get me wrong, I do think they absolutely have their place, and in other parts of the company we have much larger software development projects that are making great use of microservice architectures and Kubernetes and are getting a lot out of it. But that is 100+ teams building a product portfolio together.
> And all I can think about is "Hey, you could build this with a small team as a simple monolith, and with proper caching you could probably run this on one or two Raspberry Pis; that is the amount of power you actually need here".
If your product has a global audience who needs to CRUD stuff, caching and a single raspberry pi won't get you very far in the game.
If it's just intranet stuff with a few hundred users and your front-end isn't very chatty, then you're right, you don't need much.
If you don't pile abstractions on top of abstractions on top of kubernetes on top of docker, you'd be surprised how much you can do with a single small-sized instance.
I'm not talking about performance overhead, I'm talking about architecture overhead. Kubernetes doesn't have any advantages over not using Kubernetes for simple applications. Reproducible dev environments and CI you can get easily without Kubernetes, without having to add complex solutions for logging, profiling and other introspection tools.
One could argue that you need reproducible dev environments and CI to be solved BEFORE you even start using Kubernetes.
> Kubernetes doesn't have any advantages over not using Kubernetes for simple applications.
So out-of-the-box support for blue/green deployments and fully versioned deployment history with trivial undo/rollbacks are of no advantage to you?
> Reproducible dev environments and CI you can get easily without Kubernetes, without having to add complex solutions for logging, profiling and other introspection tools.
I'd like to hear what you personally believe is a better alternative to kubernetes.
And by the way, Kubernetes neither supports nor requires distributed tracing tools, nor "logging, profiling, and other introspection tools". That's something entirely different and separate, and something you only use if for some reason you really want to and make it a point to go out of your way to adopt it.
In fact, distributed tracing is only a thing not due to Kubernetes but due to you operating a distributed system. If you design a distributed system and get it up and running somewhere else, you still end up with the same challenges and the same requirements.
Apparently HackerNews used to run from a single application server + Cloudflare caching, but later moved away from Cloudflare. Unsure how many web servers it runs on now. [0]
In 2016, Stack Overflow ran on 11 IIS web servers, but they really only needed 1.
At least that is what I often think when I hear people describing micro-services. If there is no data sharing between computations, then the problem is embarrassingly parallel [1] and thus easy to scale. The problem is not the monolith, it is the data sharing, which micro-services only solve if each service owns its own data. To be fair, people advocating micro-services also often argue that they should own their data, but in quite a few of the instances I’ve heard described, this is not the case.
> With microservices, you lose the ability to use database transactions across services/tables. Later on, when people realize this, it's too late.
I don't follow your reasoning. I mean, the ACID vs BASE problem you described is extensively covered in pretty much any microservices 101 course or even MOOC, along with other basic microservices tradeoffs such as the distributed tax, and ways to mitigate or eliminate these issues, like going with bounded contexts. Why do you believe this is a mystery that no one is aware of?
I've seen a couple of projects where the original team didn't think they needed transactions, or underestimated how much harder eventually consistent distributed systems are to reason about than an ACID system.
It doesn’t matter if you have transactions or not. Just make sure you execute things in the right order! Same for referential integrity, if you just make sure nothing ever goes wrong, there’s no need for it! /s
Sequences of interactions with the DB which are either atomically committed to on success of the sequence, or rolled back on failure, so that the DB is in the same state it had before the interactions began.
One of the fundamental laws of microservices is that each service is responsible for storing its own state. Nothing else is allowed access to its backing store.
Given that, once you have more than one service in your architecture, you cannot coordinate transactions across the distinct storage mechanisms - they are distinct databases and are therefore subject to the CAP theorem and other complexities of distributed computing.
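To make that concrete, here is a minimal sketch (plain sqlite3, made-up table) of what a transaction buys you inside a single database: both writes commit together or neither does. Split the two rows across two services with separate stores and there is no equivalent single commit to lean on.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
    conn.commit()

    try:
        # one transaction: commits on success, rolls back on any exception
        with conn:
            conn.execute("UPDATE accounts SET balance = balance - 30 WHERE name = 'alice'")
            conn.execute("UPDATE accounts SET balance = balance + 30 WHERE name = 'bob'")
            # raise RuntimeError("boom")  # uncomment: both updates are rolled back
    except RuntimeError:
        pass

    print(conn.execute("SELECT name, balance FROM accounts ORDER BY name").fetchall())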
Of course, nothing's stopping you from putting all your features into one service to avoid this thorny problem. It's a wise approach for most of us.
When you do that you have a monolithic architecture, not a microservice one.
If your application is stateless by your definition, then it's trivial to break the monolith down into microservices that focus on bounded contexts while sharing a common db. This is a very well known microservices pattern.
Yes, and what does that buy you? Instead of a single deployable unit that owns the database, there are many deployable units, none of which can own the database. That doesn't seem like an improvement.
Martin Fowler lists the benefits of microservices as:
• Strong module boundaries
• Independent deployment
• Technology diversity
And with a single database you severely limit the benefits from strong module boundaries and independent deployments. Suddenly you have to synchronize deployments, and anyone can pull or change any data in the database, so module boundaries must now be enforced with discipline, which puts you in the same boat as a monolith.
You can still wind up with distributed state errors. E.g.: a user requisitions a billable resource and the request hits node A. The user immediately terminates the billable resource, which hits node B. The requisition request has a long latency (and the termination request does not). So in the billing system, there is a negative time associated with the resource.
Our initial prototype was a cluster-based approach, and once we had a better estimate of user growth and resources we moved to a monolith for all the reasons you cite.
It’s not the log writing that’s hard; it’s the log insight extraction that is. Poring over logs from multiple invocations, even with request identifiers, has worse tooling than looking over stack traces/core dumps.
A calls B, C, and D. B puts an item on a queue that eventually causes a call to C. Both B and C call D. Which D call is slow/erroring? Is it the AD, the ABD, the ABqCD, or the ACD call?
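Inside one process you can answer that by propagating a call path along with each call; the point of tracing tooling is that in a real distributed system you have to ship that context across every network hop and queue yourself. A toy sketch (the A/B/C/D names are just the ones from above):

    import contextvars
    import logging
    import time

    logging.basicConfig(level=logging.INFO, format="%(message)s")
    call_path = contextvars.ContextVar("call_path", default="")

    def traced(name):
        """Tag every timing log line with the full call path, not just the callee."""
        def wrap(fn):
            def inner(*args, **kwargs):
                prefix = call_path.get()
                token = call_path.set(f"{prefix}->{name}" if prefix else name)
                start = time.monotonic()
                try:
                    return fn(*args, **kwargs)
                finally:
                    logging.info("%s took %.3fs", call_path.get(), time.monotonic() - start)
                    call_path.reset(token)
            return inner
        return wrap

    @traced("D")
    def d(): time.sleep(0.01)

    @traced("C")
    def c(): d()

    @traced("B")
    def b(): d()   # the "via queue" hop would need the path carried in the message itself

    @traced("A")
    def a():
        b(); c(); d()

    a()  # logs A->B->D, A->C->D and A->D as separate, attributable timings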
No, contrarian is a good description of the problem. We see a lot of posters complaining loudly about problems they never had or tools they never used. Some appear to simply enjoy arguing for the sake of it.
Having heard mainly "nothing but guff" justifications for microservices in non-enormous orgs, I think (as usual, and some law I forget dictates) the latter is more likely.
Why do people always associate CQRS with microservices?
I have a project where only 3 developers work; we redesigned it to be CQRS. I was sceptical at first - I especially don't like some of the boilerplate it creates - but I'm now sold on it after combining it with the mediator pattern: now I can have validation, logging and performance checks on every command and query without repeating code; it works like a middleware.
And of course, being only 3 devs, it would be nuts to have microservices, so we are happy with a CQRS monolith.
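For anyone wondering what the mediator-with-middleware setup looks like without a framework, here's a rough sketch (class and command names are made up, not from any particular library) of a mediator that pushes every command through a middleware chain, so validation and timing are written once instead of in every handler:

    import time
    from dataclasses import dataclass

    @dataclass
    class CreateOrder:           # a command
        customer_id: int
        amount: float

    class Mediator:
        def __init__(self, middlewares=()):
            self.handlers = {}
            self.middlewares = list(middlewares)

        def register(self, message_type, handler):
            self.handlers[message_type] = handler

        def send(self, message):
            def handle(msg):
                return self.handlers[type(msg)](msg)
            # wrap the real handler in each middleware, outermost listed first
            for middleware in reversed(self.middlewares):
                handle = middleware(handle)
            return handle(message)

    def timing_middleware(next_step):
        def handle(message):
            start = time.monotonic()
            result = next_step(message)
            print(f"{type(message).__name__} handled in {time.monotonic() - start:.4f}s")
            return result
        return handle

    def validation_middleware(next_step):
        def handle(message):
            if getattr(message, "amount", 1) <= 0:
                raise ValueError("amount must be positive")
            return next_step(message)
        return handle

    mediator = Mediator([validation_middleware, timing_middleware])
    mediator.register(CreateOrder, lambda cmd: f"order created for customer {cmd.customer_id}")
    print(mediator.send(CreateOrder(customer_id=42, amount=9.99)))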
On the other other hand, I’ve worked on a system that stuck with “the old way” (like ColdFusion) for so long that it was impossible to even find documentation on the old programming environment, even if you could find somebody willing to maintain it. The longer you wait to upgrade, the more it’s going to hurt when you finally do.
I'm not against replacing old systems, but this has to happen in incremental steps.
This way, if an implementation path isn't worth the effort, you can just drop it, instead of spending dozens of months on a full rewrite and realizing that you are spending a week to implement a CRUD for a simple entity. It's even worse when devs are new to the project and have no knowledge of the domain; I've been burned by this, never again.
At the end of the day, we are not in the research field; we are paid to fix business problems, to deliver something that brings business value. Yeah, it's nice to write some really complex software that would make the system a lot more optimized, but we have to deliver it before our bosses get tired of excuses and pull the plug. Learned that the hard way.
Having been in this situation as well, I find that the aging out of technology MUST be led by a CTO who is competent enough to know that tipping point of cost-benefit. Their job immediately becomes a political choice (pass the pain to the next guy) when the point has passed, and it gets worse with every new CTO. Some of these systems are HUGE (hundreds of millions of lines of ColdFusion, where I worked).
The reason I have been left without product documentation is because Oracle bought the software the organisation was using. Cannot stress enough the value of keeping an offline copy of the documentation of a product rather than relying on a company's website or similar.
Edit: Also want to add software installation files are very important to keep offline as well.
On the other hand, don’t hire someone who wants to understand everything first either. Sometimes a bad choice was just a bad choice, and it’s a waste of time to figure out if it was taken for a reason.
That's a very "junior senior" type of developer. Rewrite from scratch needs external reasons to be a good idea, because you have a huge uncertainty risk of "i don't know enough about this system to rewrite it, and now i'm spending months hacking back in edge cases to the new no-longer-beautiful-design." Uncertainty risk doesn't mean never do it, but it means make sure you understand it and make sure it's gonna be worth it.
I've never worked in outsourcing and almost every job I had in my >15 years career as a developer included at least somewhat-gnarly & often truly-gnarly code.
It is very rare to see a codebase several years old that can be fairly described as "in good shape".
This sounds like the phenomenon dubbed Chesterton's Fence [0].
A core component of making great decisions is understanding the rationale behind previous decisions. If we don’t understand how we got “here,” we run the risk of making things much worse.
So you helped the new dev understand the current lay of the land. They listened, then suggested an improvement based on their new understanding. You agreed and together improved the code.
This isn't really Chesterton's Fence. It doesn't take any soul searching or insight to answer "why was this not written in Python3 in the first place" - the answer is almost certainly that Python3 or some key libraries didn't exist.
It's a pure cost/benefit analysis question. Switching has some cost, and some benefits, and you have to decide which are likely greater.
Knowing how so many devs are hungry to move to the new shiny thing, the question is often more like "why hasn't this been rewritten in Python3 already?".
Possible answers:
• Legit technical or cost/benefit reasons
• Probably a good idea but no budget/time for the effort
• Somebody already tried and it was a disaster
• No manager has been stupid enough to humor the devs
Most reasonable devs are not jackdaws and do not want a "new and shiny" thing for its shine.
Mostly, devs want to get away from the old and increasingly creaky thing. And when the cost/benefit ratio is finally right (that is, it's creaking loudly enough and slowing things down enough), the move hopefully happens.
Really? The only other answer I can imagine is "all our other projects are in Python 2". And generally that means you will reuse some code from those projects. What other reasons have you seen?
I'm working with some old code and spend a lot of my time thinking "There's got to be a reason this is as wonky as it is."
Sometimes I just can't find out (the guy who wrote it is gone, nobody knows) then I have to just do it the way I think it should be, test ... then I find out why, sometimes spectacularly ;)
I recently started a new job and whenever I can I try to document the code I contribute with the "whys" instead of documenting self-explanatory parts. I've extended this to parts of the code that I now maintain but did not initially write as this helps me mentally map out the rationale.
I think this is the most forgotten part of this type of issue. It's not a static issue. With these types of things and time the problem often gets worse, and the answers get easier. They should be periodically re-evaluated.
Parts of the problem get worse as time goes on. (more code to convert, more complexity, more test cases, EOL for the python2 version of some lib, more user data, blah)
Parts of the solution get easier as time goes on. (easier syntax sugars, better testing frameworks, infrastructure abstracted easier, remove dead features, etc)
Parts of the desired solution become less relevant as time goes on. (Why not use golang or node or elixir, php and python are so dated!)...
Just because last year was a bad time for the upgrade doesn't mean today is. Knowing how to get these things done at the right time by the right people for the business is what separates a great engineering manager from one that is just "shipping features and bug fixes".
I am a senior dev and I always try to push new ideas to my team lead (a senior dev and older than me), but all I get is blatant criticism because he says "I've tested it and didn't like it". A clear example: refusing to move to Spring Boot and still staying on the dead horse of Java EE, which has gotten more complicated and fragmented than ever since Java 9.
Good Lord. And here I thought I was being a stick-in-the-mud by gently reminding our new upstart that, yes, moving from Spring to Guice would bring us some new functionality, but it would also lose other functionality and have a non-zero migration cost.
Just imagine that all of it is called Jakarta EE now, and yes, you have an artifact for the interface and one for the implementation, and they are not even compatible with jpackage (Java 14).
and... if after 3-4 weeks, it was nowhere near completion, or you'd hit so many snags it was going to take several more months, I'd hope you'd have the good sense to put a cap on it, and revisit later. There's nothing inherently wrong about trying something like that, especially if there's some tests in place, a known goal, a reasonable time boundary relative to the potential benefit, and a willingness to stop if the effort is growing beyond the benefit.
Your experience sounds great, but it also sounds like we don't see that level of middle-ground pragmatism enough - it's either "no, we have to stay using PHP4.3.3 because it's what I know" or "we have to rebuild to cloud micro services to be able to scale infinitely without being restricted by schedules or budgets".
We mostly stick to a single master with our code, but this was one case where we branched so we could watch progress via test passing percentage to make sure we were moving fast enough to finish before we got sick of it.
There have definitely been one or two refactors that have been shut down. We've been lucky to have clients that give us the freedom to retire some technical debt/risk instead of just churning out features. And we're small enough (approx. 200 kLoC, 6 devs) that full codebase refactors are still doable.
Here's a tricky one that I encountered a year ago.
A junior tried to advocate for browser tests. A senior from the developer productivity team said browser tests couldn't possibly be made non-flaky.
I didn't know how to advise the junior because I agreed with them.
Saying "something is infeasible/costly" is a blanket statement. There was no way to quantify that.
On the flip side, we couldn't justify the impact of browser tests either. But I felt, at the time, that we should've been biased toward implementing every type of test (rather than not), since there are only 3 types: backend, JS, and browser.
Akin to your example, it was the same argument "X is costly/infeasible".
Your example doesn't have any issue because everyone probably agrees with it. But browser tests being infeasible to set up sounds strange.
There's another type of tests: visual diffing. I've used it a lot and I love it.
1) Keep a set of a few thousand URLs to diff, add new ones as needed.
2) Use a tool that can request these URLs from two versions of your app, make a side-by-side visual diff of the resulting pages, and present a report of any differences.
3) Before committing any change to your codebase, run that tool automatically on the whole URL set, comparing the new version of the app with the current version. Then the committer should review the diff report manually and it gets attached to the commit history.
This way, adding coverage for new functionality is easy (you just add some URLs to a text file) and the whole thing runs fast. And it catches all kinds of problems. Not just UI bugs where some bit of CSS messes up something unrelated, but also you can run the frontend diff after making any backend change and it will catch problems as well. It won't solve all your testing needs, but it covers a lot of ground cheaply, so you can concentrate the custom testing code where it's actually needed.
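As a rough idea of step 2, the fetch-and-compare core might look something like this in Python (the base URLs and urls.txt are hypothetical, and a plain text diff of the HTML stands in for a true pixel-level screenshot comparison):

    import difflib
    import urllib.request

    CURRENT = "https://current.example.com"      # deployed version
    CANDIDATE = "https://candidate.example.com"  # version under review

    def fetch(base, path):
        with urllib.request.urlopen(base + path, timeout=10) as resp:
            return resp.read().decode("utf-8", errors="replace")

    def diff_report(url_file="urls.txt"):
        with open(url_file) as f:
            paths = [line.strip() for line in f if line.strip()]
        for path in paths:
            old, new = fetch(CURRENT, path), fetch(CANDIDATE, path)
            if old != new:
                diff = difflib.unified_diff(
                    old.splitlines(), new.splitlines(),
                    fromfile="current" + path, tofile="candidate" + path, lineterm="")
                print("\n".join(diff))

    if __name__ == "__main__":
        diff_report()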
My team went through the browser testing issue as well and agree they're hard tests to write. We ended up writing a few using selenium, but we don't have as much coverage as we'd like. Luckily, unlike an interpreter upgrade, we can add them incrementally.
If the test is slow and hard to deflake, then let's add it slowly. Maybe only test the critical path. There are ways to manage it, instead of discarding it entirely.
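A "critical path only" browser test can stay pretty small; something like this Selenium 4-style sketch (hypothetical staging URL and element names, with an explicit wait rather than a sleep, which is usually the first deflaking step):

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    def test_login_critical_path():
        driver = webdriver.Chrome()
        try:
            driver.get("https://staging.example.com/login")   # hypothetical URL
            driver.find_element(By.NAME, "username").send_keys("test-user")
            driver.find_element(By.NAME, "password").send_keys("test-pass")
            driver.find_element(By.CSS_SELECTOR, "button[type=submit]").click()
            # wait for the dashboard instead of sleeping a fixed amount of time
            WebDriverWait(driver, 10).until(
                EC.presence_of_element_located((By.ID, "dashboard")))
        finally:
            driver.quit()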
I'm a senior dev/tech lead and I empower my junior team mates/reports to do what they think is right. As such, I get shot down just as much as I shoot them down.
I learn a lot from them, because I'm asking them to do a lot, and they also learn from me when I review the code or advise them on an alternative. They push back a lot and that's what I want. I want my reports to prove me wrong and show me better.
This approach means that if I really have to put my foot down or enforce something, then there is enough mutual trust to allow that to happen.
I’m trying to get to that point. It’s interesting to see that any new member that joins the team has a sort of adjustment period where they still come to ask everything, but eventually realize that “do what you think is best” is going to be the answer to any technical question regardless, and they realize it’s ok to have (and fix!) their own problems with the code.
There's no excuse for not converting Python 2 to Python 3 slowly, one commit at a time. We did that in Chromium. Using newer libraries instead of deprecated, unsupported ones brings a bit better team morale and enjoyment in what you do. Imagine an intern coming in and having to use Fortran.
We wrapped up our conversion about 1 week after the sunset date. I think we could have lived with no new features, but the no security updates issue was a big reason why we upgraded.
Wins for the user were performance improvements and security. The dev quality of life features (better type checking, f-strings, nicer pathing syntax), as someone else mentioned, will improve our development velocity in the future, which gets back to the users as new features faster.
Will it save a person-month in future work? If there are 10 devs working on this project and it will be around for at least another year, then paying that back requires less than a 1% increase in average efficiency (one person-month against 10 devs × 12 months = 120 person-months, i.e. under 1%).
For every one of these stories there are two ways it can go: either it turns out the senior dev is jaundiced in their view and the problem was improvable/fixable, or the senior dev is right and the junior dev is about to go and waste a whole load of time on something they've already been told is a waste of time.
As a manager I always hated these situations, firstly because the problem the junior dev is trying to fix is almost never as productive as "Hey, let's upgrade to Python 3"; it's more like "Hey, let's migrate to this specific version of this specific tool that I happened to use on one of my pet projects" or "I've got this incredibly ambitious plan to change everything about this piece of code you gave me to work on" (you're only looking at that code at all because it's simple, relatively unimportant, and we're trying to ease you into the team).
The problem is, if it is the junior dev who's wrong, you're now going to start seeing all these new issues because you've taken someone whose intuition isn't quite there yet and given them something incredibly complex to do. That's when you get into work 2 weeks later and find their mega-commit touching 2,395 files, changing tabs to spaces and auto-modifying everything to camel case. Taking a chance on that intern's pet project means a lot of support from others in the team.
That is, the fact that Python 2 is not officially supported any more, and at most you'll see some fixes for the most egregious security holes, is not something you considered back then?
I knew about the security risk, but I wasn't aware of how many quality of life and performance improvements we could use in the new version. And I also learned that some people enjoy researching/debugging the interpreter, whereas I originally thought the refactor would be a chore that hurt morale.
This is a nice counter example. Although it's clear the article has good intentions, it's promoting unscientific thinking with an appeal to authority ("yes, they go brrrrrr extremely quickly much of the time, and senior developers know that!"). I think there is a point to be made about how we all could benefit from quelling our outrage at decisions we initially disagree with before we've heard all the evidence, but that has nothing to do with seniority.
It's not clear to me why you think that this anecdote had any particular intention. My intention was not to promote any position at all, but rather to tell an amusing personal story that I was reminiscing about because of an email I got from a young friend.
Anecdotes are by definition anecdotal; I am not promoting an anti-science position by relating a personal anecdote and I resent the statement that I am doing so.
If you'd like to write a blog article that promotes scientific thinking, I strongly encourage you to do so.
A lot of times the creators/maintainers of a project just haven't put any effort into 'upgrading' stuff. Whether it's a newer version of a language, library, operating system...
Some people will use any excuse to not have to change. Once in a while it's because they are genuinely too busy, but that just signals other issues.
And maybe the Junior developer is allowed to do this because they cost less, so if it gets abandoned the company can take it, but a senior out for a month would impact other areas.
Only if you just fix whatever breaks in the upgrade and never use the features in Python 3. The static typing benefits alone should either make your software more reliable (benefit to users) or speed up feature delivery (benefit to users).
> That gave me a chance to explain the testing and refactoring cost that would come with changing python versions, and how the benefits to users would be almost zero. And then at some point one of the new juniors said, "hey, there's a lot of filesystem performance improvements and readability improvements (f-strings) in 3. I can get the test/refactor done in a month, and I think it's a net positive." They were right, and now we're on python3.
So, sometimes we all learn something.