If you actually measure real latencies (e.g. collect metrics from your mobile apps), this is not true most of the time, at least in my experience. Most responses are fairly small; making them a bit smaller doesn’t have much impact on mobile app performance in the real world.
I’d guess for most apps, there are maybe a handful of endpoints where the response size can be reduced a tonne for mobile clients, AND that makes a big performance difference in practice. But instead of hopping straight to BFFs, or other approaches to solving this like GraphQL, you can just add a “fields” query param to those few key endpoints. If provided by the caller, you return only the fields they ask for. No massive re-architecture to BFFs or GQL, just a small change to a couple endpoints that a single dev can implement in a day or two.
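For what it's worth, a sketch of what that "fields" param can look like, assuming a Flask endpoint with made-up field names; callers that don't pass the param are unaffected, which is what keeps the change small:

```python
# Hypothetical endpoint with an optional "fields" query param:
#   GET /api/profile                        -> full payload
#   GET /api/profile?fields=id,name,email   -> only those keys
from flask import Flask, jsonify, request

app = Flask(__name__)

FULL_PROFILE = {  # stand-in for whatever the endpoint actually assembles
    "id": 42, "name": "Ada", "email": "ada@example.com",
    "bio": "...", "preferences": {"theme": "dark"}, "friends": [1, 2, 3],
}

@app.get("/api/profile")
def profile():
    payload = dict(FULL_PROFILE)
    fields = request.args.get("fields")
    if fields:
        wanted = set(fields.split(","))
        payload = {k: v for k, v in payload.items() if k in wanted}
    return jsonify(payload)
```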
Now, there are other benefits to BFFs besides minimal, client-specific payloads. One is that you can make breaking changes to underlying APIs and not break always-so-slow-to-upgrade mobile clients, by preserving backwards compatibility in your mobile BFF layer. Another is coordinating multiple calls - turning many sequential round trips into a single round trip can be a big win. But if your main concern is payload size, BFFs are probably an over-engineered solution.
Essentially, it was a really nice and concise explanation of links to other data, the problems with those, and adding resource expansion to API endpoints.
That said, it feels like HAL never really took off, and the implementation of something like that is very framework-specific.
> ...or other approaches to solving this like GraphQL...
GraphQL does seem to solve this, but it's like someone made an equivalent to SQL just for APIs, except that they forgot to include the bit that actually talks to your data store based on the queries and returns data, so you have to write that yourself (e.g. query resolvers).
Unless you're using something like PostGraphile/Prisma/Hasura, which exposes your database (provided you use one, which might be the situation in most cases) through GraphQL, but then you run into problems with intermediate data processing or additional access controls that you might need in your API logic before your data is returned.
Which kind of feels worse from a complexity standpoint than where we were previously, especially if the queries are dynamic instead of largely static.
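For anyone who hasn't written one, "the bit you have to write yourself" looks roughly like this: a minimal resolver sketch using the ariadne Python library, with a made-up schema and a stubbed data-access helper standing in for the real query.

```python
from ariadne import QueryType, gql, make_executable_schema

type_defs = gql("""
    type Query { user(id: ID!): User }
    type User  { id: ID! name: String! }
""")

def fetch_user_from_db(user_id):
    # placeholder for the real lookup, e.g. SELECT id, name FROM users WHERE id = ?
    return {"id": user_id, "name": "Ada"}

query = QueryType()

@query.field("user")
def resolve_user(_, info, id):
    # GraphQL parses and validates the query; mapping it onto your data store
    # is still entirely on you.
    return fetch_user_from_db(id)

schema = make_executable_schema(type_defs, query)
```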
Back to the topic, BFFs seem like a good idea, especially if they take over some of the gateway functionality as well. I've heard them described as facade APIs in some cases.
> Which kind of feels worse from a complexity standpoint than where we were previously, especially if the queries are dynamic instead of largely static.
Agreed on increased complexity, but isn't the point largely the dynamic queries? You set up all this extra architecture, but in return you don't have to write as many "controllers" with custom API responses. Just let the frontend request what it needs via GraphQL. So it's more upfront work, less down the line. Or is writing all the API-level sanitizers and query resolvers more work?
> Agreed on increased complexity, but isn't the point largely the dynamic queries? You set up all this extra architecture, but in return you don't have to write as many "controllers" with custom API responses. Just let the frontend request what it needs via GraphQL.
What about cases like: "The user should only be able to see the data of their own submitted requests, when the parent document is any of the given statuses: FOO and BAR, and there is no related BAZ request that's been saved and is approved in the system by the manager. In cases where the BAR request exists, ..." (a long list of other edge cases that might affect the visibility of any particular document or even individual fields in some form)
My point is that in many systems the cases where you might want to just give someone the ability to retrieve the data they need are rather limited. So in those cases the dynamic nature of everything is more of a risk than anything else.
At large scale, the queries you can handle are limited by the indexes you have in place, and you can’t index everything. So you already have to think about which queries you’ll really need.
There’s also caching, which is another side of the same coin as indexing. You may want to cache certain common/expensive queries. You’ll probably need to think about the granularity of what gets cached, and how to invalidate it. Hard problems for a dynamic, “query anything” API.
Also you probably need to control which data can be retrieved. KronisLV covered that.
Replacing your data fetching logic with GraphQL just means translating your JS/Python to GraphQL - something that isn't as flexible, Turing-complete, or predictable (in execution).
If anything, I'd prefer to go the other way and get rid of SQL altogether - let me express queries in code:
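One way that can look in practice (a sketch only, and not necessarily what the parent has in mind) is a composable query builder, here SQLAlchemy Core in 1.4+ style with a made-up table:

```python
from sqlalchemy import (Column, Integer, MetaData, String, Table,
                        create_engine, select)

engine = create_engine("sqlite:///:memory:")
meta = MetaData()
users = Table(
    "users", meta,
    Column("id", Integer, primary_key=True),
    Column("name", String),
    Column("country", String),
)
meta.create_all(engine)

# The query is an ordinary Python value: compose it, add filters
# conditionally, pass it around; no string concatenation involved.
query = select(users.c.id, users.c.name).where(users.c.country == "NZ").limit(30)

with engine.connect() as conn:
    rows = conn.execute(query).all()
```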
> except that they forgot to include the bit that actually talks to your data store
Which stands to reason given that GraphQL was designed for talking to services, not data stores. It was built for BFF. The backend is responsible for talking to the data store, as usual.
[client] <-dynamic GraphQL query-> [GraphQL resolver] <-boring old RPC/REST/whatever-> {Set of backend services}
But doesn't that essentially mean that you've simply shifted the problems of fetching data (overfetching, N+1 problems and resource expansion) a little bit to the right, without solving the actual issue?
It would technically be faster because of lower latency between services in the same data center and sending the final results to the user in one go, but at the same time if your DB is hit 50 times to service that request, you still have a problem.
Essentially you just give the client the ability to run dynamic queries (whose complexity, in the form of resolvers and their logic, you now need to deal with), without actually improving anything else, at the cost of noticeably increased complexity and yet another abstraction layer.
That tradeoff seems like a tough sell to me, much like an SQL engine that's not tightly coupled with a solution to actually store/retrieve the data would be. Then again, there are solutions out there that attempt to do something like that already, so who knows.
> Which is a pretty big deal when you have practically the whole planet as users, many of whom are found where low latency is a luxury.
This is an excellent point!
Different scales will need vastly different approaches and the concerns will also differ. Like the anecdotes about a "Google scale problem", where serving a few GB of data won't faze anyone, whereas some companies out there don't even have prod DBs that size.
Another thing to consider is that while a query may be dynamic during the discovery/development phase, once it is solidified in the application for deployment it becomes static and is able to be optimized as such.
If you are a small team of developers that is no doubt a lot of unnecessary indirection. When you are small you can talk to each other and figure out how to create a service that returns the optimized results right away. At Facebook scale, if everyone tried to talk to each other there would be no time left in the day for anything else.
From my understanding, GraphQL's power play is in consolidating data sources rather than endpoints. That it all came through one endpoint was just a bonus.
Overfetching is a massive problem when your entity has multiple N-ary references to other entities, and those entities have multiple N-ary references to other entities, and those entities have multiple N-ary references to other entities.
Suddenly, your frontend or API exhibits high-degree polynomial time complexity.
But this is only a problem if one client _needs_ all that data at once, and another client doesn't. Otherwise you would just never return the referenced data (which isn't the type of overfetching the article is talking about).
Yeah so I'm at this point in our main service which is a rather large REST API. For certain endpoints — naturally lists of things — we run into polynomial time complexity, the N+1 queries problem. We query for a list of entities which, when serialized, queries for some child entity and on and on it goes. We've been able to stop some of the bleeding by refactoring queries to use `select_related` and `prefetch_related` (it's a Django app) so that we limit the number of queries, but this only gets us so far.
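For readers who haven't hit this in Django, the refactor being described looks roughly like this (the Order/customer/items models are hypothetical):

```python
from myapp.models import Order  # hypothetical app and models

# N+1 version: one query for the orders, then one query per order for the
# customer FK and one per order for its items during serialization.
orders = Order.objects.all()

# Bounded version: select_related pulls the FK in via a JOIN, and
# prefetch_related fetches all related items in one extra IN-clause query,
# so the total number of queries stays constant.
orders = (
    Order.objects
    .select_related("customer")
    .prefetch_related("items")
)

for order in orders:
    # uses the joined row and the prefetch cache; no per-row queries
    print(order.customer.name, len(order.items.all()))
```

As the parent says, this caps the query count but doesn't remove the cost of hydrating deeply nested structures.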
I'm now at a point where I'm not sure what the next step should be. I believe caching may be the next logical step, but I've been hesitant due to concerns about the staleness problem, knowing when to bust the cache, etc.
And then even if we do go the caching route, I'm unsure at what level we want to cache. Do we cache individual objects or do we cache the large, fully hydrated models (the ones that require many queries to assemble)? Then there's the problem of having to mix and match fetching objects from the cache in the context of other arbitrary DB queries that can be constructed.
How do people generally go about introducing caching? Do you try to limit the scope of caching as much as possible (like to the lowest/smallest objects) and see whether that's sufficient to get the performance you want and then increase the caching surface area from there?
Also, what approaches do people take from the technical side? We already use Redis for some things so I implemented a quick proof-of-concept that seemed to work in the small scale. I've also seen people use Elasticsearch to keep fully hydrated/rich models in there and then use ES as the read store. Are there merits to this approach? Crazy?
Really keen to hear anyone's thoughts or guidance on this!
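For context, the "smallest objects first" option usually starts as plain cache-aside with a TTL, so staleness is bounded even when an invalidation is missed. A sketch with redis-py; the entity, key scheme and helpers are all hypothetical:

```python
import json

import redis

r = redis.Redis()

def load_article_from_db(article_id: int) -> dict:
    return {"id": article_id, "title": "..."}  # placeholder for the real query path

def write_article_to_db(article: dict) -> None:
    pass  # placeholder

def get_article(article_id: int) -> dict:
    key = f"article:{article_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)
    article = load_article_from_db(article_id)
    r.set(key, json.dumps(article), ex=300)  # 5-minute TTL bounds staleness
    return article

def save_article(article: dict) -> None:
    write_article_to_db(article)
    r.delete(f"article:{article['id']}")  # invalidate on write
```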
The number of queries is not the problem; it's the total query time for the frontend that you're ultimately concerned with.
An API is designed to aggregate and format data from various data sources. The speed of your REST API will ultimately come down to the SQL queries that need to be executed and how your RDBMS executes them. Using a naive ORM (such as Django's) or a generic layer like GraphQL typically leads to poor performance, as you lack control over query execution.
Have you written your SQL from scratch and benchmarked slow JOINs? Sometimes doing multiple queries is faster than JOINs (when you only need to fetch related data for a small number of entities for example).
Are you using covering indices to accelerate entity lookups? This is probably the main way of speeding up your DB. De-normalizing your schema in certain cases is also a hack to be aware of.
Have you tuned caching in your RDBMS? You don't really need Redis when the RDBMS can do caching for you (hint: in-memory tables are awesome, RAM is cheaper than dev time).
Have you tried stored procedures? If you have complicated queries this may be helpful.
There's a whole list of things you can do to get great performance out of your RDBMS. Start there instead of introducing another layer of abstraction/tech that will complicate the architecture. I can guarantee that good schema & query design can yield massive improvements.
Thanks for the thorough reply, very helpful! I have looked into some of these, but not all of them and not for all cases. I really do need to look at constructing some of the queries by hand and comparing with the queries constructed by Django's ORM - I know that in many cases the ORM generates terribly inefficient queries.
Stored procedures are not something I've tried yet, but I will look into them. Any more color or examples of how this might work in an application?
As far as RDBMS caching goes, you're referring to the ordinary query caching, related to things like shared buffers in PG, etc.?
Just want to make sure I'm clear on all the suggestions you raised. Thank you!
You should be able to achieve a 10X improvement in speed just with good query and schema design, but ultimately everything depends on your data layout.
Avoid ORMs, GraphQL etc - write raw SQL specific to the views that your clients need, if you care about performance.
Low hanging fruit is:
- connection pool: your DB can in all probability execute many simple PK lookups at once; make sure you've got a large enough pool of DB connections to let it do this
- covering indices: if you are computing over columns A,B,C but looking up by PK you can add a covering index over (PK,A,B,C), which means no disk lookups. Make sure all queries are using an index!
- joins: ORMs generally don't know how to do joins and use multiple queries instead. You need to get the right balance, for example for HN it could be: get PKs for top 30 articles in one query (one covering index), and look up their attributes in another query (using a join across N tables where PK IN (1,2,3), which doesn't use a covering index). Reason being: covering indices are expensive, so you minimise the data they hold by splitting up your queries (rough sketch below).
- de-normalization: if you have very complex joins, de-normalizing your schema may help
I wouldn't worry about stored procedures to begin with, unless you have a very complex schema/query plan. For caching you just need to make sure you have allocated enough RAM for your DB to keep indices in memory, the more you allocate the more pages it can cache in RAM. Focus on schema + SQL.
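To make the covering-index and join-splitting bullets concrete, a rough sketch with a made-up HN-style schema (SQLite only so it runs anywhere):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
    CREATE TABLE users    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE articles (id INTEGER PRIMARY KEY, author_id INTEGER,
                           title TEXT, score INTEGER);
    -- small covering index: the ranking query below is served from it alone
    CREATE INDEX idx_articles_rank ON articles (score DESC, id);
""")

# 1) cheap ranking query over the covering index
top_ids = [row[0] for row in cur.execute(
    "SELECT id FROM articles ORDER BY score DESC LIMIT 30")]

# 2) hydrate only those 30 rows, joining whatever the page needs
placeholders = ",".join("?" * len(top_ids)) or "NULL"
rows = cur.execute(
    f"SELECT a.id, a.title, u.name "
    f"FROM articles a JOIN users u ON u.id = a.author_id "
    f"WHERE a.id IN ({placeholders})",
    top_ids,
).fetchall()
```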
Raw SQL is not a cure-all; it’s an exchange of design simplicity and team velocity for application performance.
Prisma could handle the query generation part somewhat intelligently, at least in the future.
Denormalization is not a low hanging fruit. It may seem so because the mass of all the tangled fruits makes the top branch droop down where it can be munched on, but never quite satisfactorily picked.
I don't agree. Extra layers of abstraction like Prisma / ORMs only add complexity.
Sure it means you don't have to learn SQL at all, and that's really the only advantage. But time spent learning an API/ORM is better spent learning SQL.
Word of advice: avoid stored procedures unless you can store them in your codebase and have a CI/CD pipeline integrating their deployment seamlessly with the rest of the DB/API/app.
Storing code in a database is a terrible idea unless the entire workflow can support developing, debugging, deploying and monitoring that code.
My advice would be to avoid attempting to solve the problem by introducing a cache layer or any auxiliary query cache, and instead remove the problem at storage level by leveraging a database solution that 1) stores relationships as links 2) allows queries to resolve relationships to an arbitrary level without specifying join conditions. These two things make sure your entity resolution can happen with direct lookups instead of index lookups (or pog forbid, table scans).
If you absolutely cannot do that, cache at entity resolution level. Although, it really doesn’t solve the actual problem, just makes it more bearable by offloading primary DB workload to another DB, so why not run a read replica instead?
Thanks for the reply. Regarding the type of DB you referenced, what are some examples of DBs that allow relationships as links and querying to an arbitrary level?
With regards to using a read replica, we do have one that we use for some queries, but that doesn't really resolve the problem with having to do many queries for certain endpoints. We can only eliminate so many queries before the backend code becomes unmanageable, because we effectively forgo any of Django's ORM in favor of our own handcrafted queries that are hard to reuse.
Potentially anything which supports Cypher or has Apache TinkerPop integration, Neo4j with GraphQL, PostgreSQL with Apache AGE, EdgeDB, SurrealDB (very fresh) and Dgraph (almost but not quite production grade).
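To give a flavour of the parent's point (2), an arbitrary-depth traversal in a graph query language looks something like this sketch, using Cypher via the neo4j Python driver; the label, relationship name and credentials are placeholders:

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# Follow CHILD_OF links 1 to 3 hops deep in a single query; the stored
# relationship itself is the link, so there are no join conditions to spell out.
query = """
MATCH (root:Entity {id: $id})-[:CHILD_OF*1..3]->(descendant)
RETURN descendant
"""

with driver.session() as session:
    rows = session.run(query, id=42).data()

driver.close()
```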
Got it, thank you. We are currently using PostgreSQL so I will look into AGE, Edge and Surreal. Any opinions on one vs the other? I guess some of that depends on which one has the best packages for Python..
No real opinion there, my usual stack is different — those are just some of the players that came to mind. AGE has a Python driver, so maybe that’s a good place to start.
Note that depending on your data model, you may not need ”complex” edges, just ”simple” links directly from vertex to vertex. Not all databases support this.
Then again, you already might have N:M tables which exactly correspond to edges in a graph model. This distinction is something to keep in mind.
Thanks, I will keep that in mind. EdgeDB looks really nice. It would be a rather large project to migrate our PG over to it, but maybe not so bad. I hopped on their Discord to ask if there any tools to aid with migration. They said:
>>> Tools to make migrations from Django (and other ORMs) are possible but we haven't yet started working on them. So you'd have to write the migration script manually.
Your comment on our Discord actually prompted an internal discussion on how we can implement this and we now have a pretty good idea. :) No ETA or anything at this point, but we'll be looking into this soon.
Haha great. Would be hugely helpful to anyone thinking about migrating, but I'm sure you have your work cut out for you already and there's lots of competing things to prioritize.
That just moves the problem to the frontend. Often it can be worked around with HATEOAS at the architectural level or lazy loading, pagination, etc. in the frontend, but not always.
BFF is perhaps the purest example of Conway's law. When the backend team cannot or will not make the necessary changes for the frontend team, another layer is introduced: the BFF. There is no particular reason why the real backend can't solve this problem; it is purely political.
BFF can live on a different stack from the heavier backend, and bear a different development cycle.
I've worked in a few orgs where backend devs work on both the standard backend and the BFF, often adjusting both when adding new features. It's just convenient to have a separate layer closer to the frontend, dedicated to it.
I wish Next.js talked about this some more and beefed up their API routes.
Writing your BFF in the same codebase, sharing types, and parsing incoming data on the server without filling up the client bundle is a fab experience.
Author here. Yes, it's all about Conway's Law. Unfortunately, it's still a law that governs all companies. Fun fact: I stopped working in projects because I mostly had to find technical solutions to organizational issues.
PS: I wouldn't call it "political" (negative connotation) but "organizational" (statement of fact)
This is the purest end goal, though. If you've got a large monolith, or you directly consume a shared backend with a large number of teams, how do you get there? BFF is just one attempt to decouple with another layer of abstraction. Eventually you can strangle the problem on the backend.
CDC (change data capture) on that backend would be one way. You could stream the data out to the various teams, who could then materialise a view of it within their application and then craft an API over that view.
It's not political that microservices owning well-defined slices of a pie can't cater to the frontend's every whim without creating a dependency spaghetti everywhere that you have cross-cutting concerns. It's just common sense.
If you're arguing that most microservice vs monolith decisions represent a political and not a technical decision, I'll grant you that.
I don't think it's necessarily bad. In fact it is quite an elegant solution to issues between frontend and backend teams. I just think it's a great example of Conway's law. Conway's law (IMO) was never a negative thing, merely an observation of the realities of software development.
Your post brings up an interesting angle which is that the frontend and backend people can have different expectations of what the backend should actually do (perhaps leading to dependency spaghetti). In this case, introducing a BFF can be a way to isolate that from the rest of the architecture.
100%. A backend API can be used from multiple frontends. Just one example from my experience working for an electricity provider: the same API was used from a desktop web app, an Android mobile app and an iOS mobile app, all retail customer facing. Then from a sales pipeline for new customers in a CRM app. And from B2B customers (electricians building new homes requesting connections) in a different web app.
Isn't the article suggesting that due to the different "front ends" (web app, mobile client, etc) there are different sets of requirements, which a backend should not deal with (for the sake of simplicity and separation of concerns)?
The BFF layer seems like a natural next step, basically a code shim to abstract away the peculiarities of each FE platform from the (presumably) "pure" business logic / API platform of the BE.
Ideally you wouldn't have (or need) microservices! I'd rather deal with a monolithic rails app any day.
However, assuming you've got them, BFF means only one group has to get the following right:
* User authentication
* Web performance analytics
* A/B testing
* More difficult bits of HTTP (fast TLS, prefetch, streaming responses)
If you have microservices but no BFF, you're pretty much stuck with your page being substantially rendered via js. Maybe that's okay for some things? I sure don't like it.
>> I'd rather deal with a monolithic rails app any day.
if your app gets "bigger" this can work OK, but I've found they typically get "wider" and then any work on the monolith gets really painful and slow. My last two gigs have been at opposite ends of the spectrum and it's kind of a "pick your poison" situation.
Comparing alternatives, BFFs add reliability that an API Gateway pattern doesn't have. One choice is to shard the API Gateway, but then the distinction between BFF and Gateway becomes more philosophical.
You grossly over-simplify the problem though of when you're trying to consume part of a monolith without contributing to it. BFF can be a valid approach to making a break. We also have 4 back-end teams and a poor release process, so for net-new projects this pattern has given teams a lot of deployment flexibility. It's definitely not cost-free, but I'd argue it's far less political than your inferred recommendation of "change the organizational structure to have a 'real' backend team".
There's a massive benefit missed here: backwards compatibility! Or more specifically, the lack of a need for it.
If you force your frontend to only use what makes sense as a "general" backend API, it ends up being inefficient. There are 1:n calls needed to get all data, or a lack of a specific filter requires pulling back everything to do client side filtering, etc.
On the other hand, if you build out your back-end to support every single front-end use case, it gets massive quickly. If other third parties start using it (as part of your published API) you now are forced to support it long term - even if your frontend changes and no longer consumes it - until you finally decide to break backwards compatibility (and the third party app). Break things enough and your users will quickly hate you.
This is especially true early on, when your frontend is going through rapid and radical changes.
I see BFF as the intermediate, giving best of both. You can rapidly adapt to front-end needs but don't commit to long-term support. There's actually no need for BC support at all if you can deploy front-end and BFF atomically.
If you build your code in a structured way, you're sharing the actual business and data logic, and it's trivially easy to "promote" a BFF resource to the official API, with whatever BC commitment you want to provide. The key thing is you are doing this deliberately, and not because the front-end is trying something out that might completely change in the next iteration.
I think there might be a middle ground here that avoids introducing a BFF to be maintained and operated: in the back-end, introduce "faster moving" endpoints that are specifically targeted for front-end clients and that don't commit for long-term support.
What do you see as the difference between a BFF and "fast-moving endpoints"? To me, BFF is an unpublished API for internal use only. I can change the BFF whenever I want (as long as I make the corresponding change to the frontend) and don't have to go through the effort of building+maintaining facades/wrappers/whatever for backwards-compatibility.
If you have a 3rd party using an endpoint, it kind of changes things. Providing short-term instead of long-term BC is maybe slightly less effort, but any BC at all is an order of magnitude more work than none.
With no BC, you would basically say "Hey on Thursday afternoon we're deploying a new version and the old one goes away -- be ready!". That nearly guarantees a 3rd party will have downtime, and is frankly a very hostile way to treat them. If I was the 3rd party I'd either be hounding your business for a long-term BC commitment for anything I use, or complaining about how it's a garbage API that constantly breaks on us and advocating switching away to a competitor ASAP.
I can't say I see the point of this, unless you have a criminally unresponsive backend dev team.
Any half-decent backend API will offer parameters to limit the response of an endpoint, from basic pagination to filtering what info is returned per unit. What's the use of the extra complexity of a "BFF" if those calls can be crafted on the fly?
And to be clear, I am not suggesting that a custom endpoint be crafted for every call that gets made - that just seems like a strawman the article is positing - but rather that calls specify what info they need.
The article doesn't do a great job at explaining that this isn't always just filtering, sometimes it's aggregation too.
A mobile client may need data points to display a single page that require calling 20 different APIs. Even if every single backend offered options for filtering as efficiently as possible, you may still need an aggregation service to bundle those 20 calls up into a single (or small set) of service calls to save on round-trip time.
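As a sketch of what that aggregation service does: one client call fans out to several backend calls in parallel inside the data center and comes back as a single payload (httpx + asyncio here; the URLs and response shape are invented):

```python
import asyncio

import httpx

BACKENDS = {
    "profile":       "https://users.internal/api/profile/42",
    "notifications": "https://notify.internal/api/unread?user=42",
    "feed":          "https://feed.internal/api/top?user=42&limit=20",
}

async def home_screen() -> dict:
    async with httpx.AsyncClient(timeout=2.0) as client:
        responses = await asyncio.gather(
            *(client.get(url) for url in BACKENDS.values())
        )
    # one aggregated, client-shaped payload instead of N mobile round trips
    return {name: resp.json() for name, resp in zip(BACKENDS, responses)}

# e.g. served from a single /api/home-screen endpoint on the BFF
payload = asyncio.run(home_screen())
```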
You still have to aggregate somewhere. You can do it on the client or the frontend backend, it still has to get done. In the case of the latter we’re adding one extra hop before the client gets their data.
This pattern is advocating for reduced technical performance to accommodate organizational complexity, which I think the parent finds odd. You either have the client call 20 service/data?client_type=ios or you have the frontend backend call 20 different service/data?client_type=ios (after the client called)
> In the case of [backend for frontend] we’re adding one extra hop before the client gets their data.
> You either have the client call 20 service/data?client_type=ios or you have the frontend backend call 20 different service/data?client_type=ios
The article touches on this point, and it mirrors what I've seen as well. The time from client -> backend can be significant. For reasons completely outside of your control.
By using this pattern, you have 1 slow hop that's outside of your control followed by 20 hops that are in your control. You could decide to implement caching a certain way, batch API calls efficiently, etc.
You could do that on the frontend as well, but I've found it more complex in practice.
Also a note: I'm not really a BFF advocate or anything, just pointing out the network hops aren't equal. I did a spike on a BFF server implemented with GraphQL and it looked really promising.
You won't necessarily have to have ?client_type=xyz params on your endpoints if the BFF can do the filtering, so it saves having to build out all sorts of complexity in each backend service to write custom filtering logic. Of course, you'll pay the price in serialization time and data volume to transmit to the BFF, but that's negligible compared to the RTT of a mobile client.
I'd much rather issue 20 requests across a data center with sub-millisecond latency and pooled connections than try to make 20 requests from a spotty mobile network that's prone to all sorts of transmission loss and delays, even with multiplexing.
Tbh, I'm not entirely sold on this - although I see this (server-side aggregation across data sources) as the main idea behind graphql. So it seems like it belongs in your graphql proxy (which can proxy graphql, rest, soap endpoints - maybe even databases).
But for the "somewhere" part - consider that your servers might be on a 10gbps interconnect (and on 1 to 10gbps interconnect to external servers) - while your client might be on a 10 Mbps link - over bigger distances (higher latency).
Aggregating on client could be much slower because of the round-trip being much slower.
In addition, you might be able to do some server-side caching of queries that are popular across clients.
I agree with your assessment here, but one additional benefit is the capability to iterate faster on the backend. You have control over _where_ the aggregated data is coming from without waiting months for users to update their mobile app so that it sends requests to a new service, for example.
The article implicitly assumes that you have multiple backend teams and you need to combine the results from different services that belong to different teams. The services having interdependencies would lead to a giant ball of mud, so you need a service in front doing that for you. Now if you also have more frontends with different requirements, who is going to take responsibility in the central service?
The architecture really only makes sense if you have a lot of people who would step on each other's toes if you didn't assign them their areas, and who would come to a halt if you didn't give them enough autonomy.
The article puts forward a slightly different proposition, but IMO BFF in smaller organizations are often managed by backend teams (makes sense seeing that "backend" is still the name...). Same goes for teams with full stack devs, they'll be touching whatever layer they want.
Having multiple layers of backend services can have benefits, and one of these layers would just happen to be dedicated to the public-facing frontend. To me, one of the main advantages is that it makes it easier to manage the security settings and apply different assumptions across the whole API. It is single purpose, so it helps a lot for customization and management (or even choosing a different stack altogether, having different scaling strategies etc.)
It becomes something managed really differently when the "real" backend is opaque and off limits (as described in the article), but then I'm not sure we should call that "BFF"; it's just a regular backend for that team as they have no other backend in charge (i.e. if I had a single backend API for a mobile app, that I use to interface with Stripe, I wouldn't call it "BFF", that would make no sense)
I've got some vendors. One of them is still in the Stone Age and has difficulty with anything more than http basic auth. The other can handle a more modern OAuth setup.
Each of these vendors needs a list of accounts in an area. Deep in the stack, the list of accounts and managing individuals tied to those accounts is one service. The vendor doesn't need access to the rest of the service, just the list of accounts.
Each vendor also needs to be able to submit orders and issue refunds. That's a different backend service.
One of the vendors has been known to be... shall we say "inconsiderate" in the rate of requests. It is important that the inconsiderate vendor doesn't impact the other one's performance.
We could add in basic auth into those back end services and add in some more roles for each one of the vendors. This would complicate the security libraries on the back end services needing to accept internal OAuth provider, external OAuth provider, and basic auth - and making sure that someone internal isn't using basic auth because they shouldn't be. And trying to handle per account/application rate limiting on those back end services that really don't need to or want to care about the assorted vendors that are out there.
So, we have a pair of BFFs. They're basically the same, though there's a different profile setting in Spring Boot to switch which set of auth is configured - if it's the vendor that uses the external OAuth provider then the external OAuth configuration is used. If the profile is for the other vendor, then basic auth is used.
Each BFF calls the appropriate internal services and aggregates them - so that the internal services don't need to be concerned with the vendors. There's a BFF that has two instances and has access to these endpoints - that's good enough for the internal services. Likewise, each BFF has its own rate limit so that the inconsiderate vendor won't overload the backend or accidentally rate limit the other vendor out of being able to make requests.
The BFF handles the concern of the vendor. Aggregating the internal requests, rate limiting that vendor, providing isolation between the two vendors, and each handling the authentication and authorization for the vendor. By putting those in the BFF, those aspects of making requests to the vendor are kept isolated.
Additionally, the API contract for the vendor can be held constant while internal teams can change the internal API calls without worrying about if that data is leaking out to the vendor accidentally. Services internal can be updated with new endpoints and the BFF can be changed to point to them as long as the external vendor API contract remains the same - without needing to involve the vendor with a migration from a V1 endpoint to a V2 endpoint.
I think it’s a great idea, except that instead of BFF, we should call that layer “a view”. The microservices can be then called “controllers and models”
I feel like another alternative in this space is to use GraphQL: while it does introduce complexity in terms of query handling, it does allow each frontend to more precisely query the data model and avoids having to tightly couple the frontend data model to the backend data model.
GraphQL is practically designed to support this pattern. Facebook was one of the early proponents of this architecture and GraphQL was a product of that. The federation features allow individual teams to maintain their slice of the pie and communicate with the real backends over RPC.
GraphQL was explicitly designed for querying the service model. I'm not sure that makes it an alternative, but rather an early advocate and adopter of BFF, promoting the BFF approach that worked for Facebook.
It has been used to query the data model by some. Data is data at the end of the day. But that is not where it really shines and misses some of its advantages when the service layer is skipped.
Yes, by using GraphQL's Query DSL, the client can control what to fetch :) In my opinion, the main difference is that an API Gateway can filter requests and protect the upstream services (GraphQL can't).
GraphQL can totally do these things, with an appropriate resolver. I mean, in the end, anything you can do using REST can be achieved with GraphQL.
But shouldn’t the individual services be protecting themselves?
That said, I think there’s a case for BFFs - even ones implemented in GQL - when there is a need to support devices or customers which you can change, and which may live longer than your API.
Call me an old gray beard here. But IMHO the client is 10x more complex on all platforms. Pumping out CRUD API calls is fast. Dealing with React/Swift/state management/product managers is not. So with this BFF project I feel it's going the wrong way. We need ways to help the clients do less work faster, not the backends. Client team members don't need more work - working on a BFF. They need less work; maybe backend team members do more client work near the API layer.
If it is true that the client is 10x more complex than the server, then shifting some of the client-side complexity to a 10x simpler BFF server is going to be a net positive.
Logically if you want to rip out work from the client you would put more work into the backend? I don't understand what you're getting at.
CRUD may be fast on the server side, but if the client is performing a complex workflow of CRUD calls, you're incurring the round-trip network cost each time. The more you can coalesce the workflow on the server side, the lower round-trip network costs you'll incur, which is the real bottleneck here (10ms server processing time vs a wild range of 10ms-1000+ms network latency, depending on where you are/what you're on).
Our company has been working on a new large platform built around BFF and it's been really nice to work with.
We have several SPA and mobile apps which interact with our data very differently. It took a bit to break the mindset of using the HTTP layer as the "point of reuse" but now it's far easier to add and maintain functionality.
We feel much more productive compared to past projects where we spent a lot of time trying to make "one API to rule them all" with a single ever-changing model.
Originally this was a Netflix pattern I believe, and the key trade-off there is that there are lots of client implementations/versions and they are hard or impossible to upgrade (eg set top boxes).
And typically you want to call a few different backend services in a given API call, with perhaps different trade-offs; one device might need cover photos up front, another might just do text summaries, to take a made-up example.
If you have one API gateway then your clients tend to be thicker as they need to figure out how to weave the different API calls together. And/or your API gateway gets clogged up with lots of usecase-specific variants.
So if you extract a per-frontend(-family) backend you can keep more of the logic in your backend system where it’s amenable to change as the upstream services evolve, and keep the device logic better modularized.
If you just have iPhone vs Android apps it may well be cheaper to just eat the complexity cost of having different APIs on your single gateway, to avoid having to pay the cost to deploy multiple bff services.
I’ll answer your questions with another question: who is responsible for maintaining the single API gateway in your alternative scenario?
If you have a single module that needs to support customised APIs for each front-end client, and each of those APIs potentially needs to talk to the now-internal APIs for each back-end service, you’re back to the monolith problem where everyone has to co-ordinate changes and deployments with everyone else. Chances are that you adopted microservices to avoid exactly that kind of problem.
If the modules in your new layer instead correspond 1:1 with your front-end clients but each might need to depend on multiple back-end services, you create new problems if the back-end teams have any responsibility for maintaining the new modules. Now each back-end team has to understand all of the different front-end clients using its service and cater to each of them separately, and you also need all of the relevant back-end teams to co-ordinate on any changes and deployments of each front-end client’s back-end API. Again, this is adding new responsibilities and overheads that you didn’t have before.
So that brings us to the idea in the article, where each front-end team is responsible for maintaining its own back-end API module as well. This avoids the original performance problem of having each front-end client making an inefficient set of relatively slow requests to multiple general-purpose APIs for different back-end services, but without needing any extra knowledge or co-ordination between teams that you didn’t already need anyway.
I assume it's so that if your client wants, it can reach out to a different BFF to get some data not implemented in its own BFF, leading to a class of problems where the data might change, necessitating the need to build Backend for Backend for Frontends (BFBFFs) on top of the BFFs. That way you and your competing teams can wage an arms race to see whose jenga tech silo can grow faster.
Because the API is a REST API and you don't want your frontend to have to make 100 calls to render a screen. Each frontend may be a completely different app so they all need the data in their own format. And each team may control their app so you don't want all of the teams roadblocked by the API team.
If the API were a REST API then the client would only need to make 1 call to request the screen (representation of a resource) in a format (media type) the client expects.
This is not the reality when you go into most organizations. It may have started like that but a few cases of "just one more call" result in the client making a load of calls at startup followed by many others when user navigates to a detail screen. It probably caches a few items, but many others are not cached appropriately.
This is what the vast majority of developers are facing. Be thankful, if you don't have to clean these messes up only to move to another team to repeat it. It happens naturally because of the tension between shipping things and finding enough time to properly optimize.
Some mix of politics, management, and prioritization, in practice, when I've seen it. (It's often hard to fully separate those three things.)
Consider an app that has a TV interface, a desktop interface, and a phone interface. The UI might be very different on each. You might want different art at different sizes/aspect ratios for different interfaces (consider different ways of browsing content on a streaming media service, including potentially "live TV guide" views and/or series-vs-show-level navigation). You might iterate on those interfaces at different rates - it's easier to release to the web than to mobile, and it's a LOT easier to release there than to, say, Samsung TVs.
So either distinct sets of APIs for each, or some sort of graphql-style "get just what you ask for" open-ended API contract becomes necessary.
The former is often easier to optimize for performance if you don't truly need unlimited permutations, but just a handful of them.
Nothing here is impossible to do out of a single API gateway, but you're likely to run into some problems over time:
* at a certain number of people, a single team working in a single codebase often steps on each other's feet more than is productive
* at a certain feature complexity, the API gateway becomes much more "custom logic for each device" than "shared aggregation of stuff from individual backends"
* at a certain company/product organization/roadmap size, trying to sequence and deliver priorities for three competing client teams at the same time becomes tricky
Splitting up the teams to each focus on different clients lets you accomplish organizational resource splitting at a higher level (how much do we staff these teams) than stakeholders having to jockey for prioritization on every single feature.
Once the teams are split up, splitting up the codebases and deployables makes sense too - if you're responsible for the mobile API, you don't want to be paged because the TV team pushed something that screwed up your crap, etc. You can do finer-grained optimizations at that point, too.
1. You do not control the update cycle of the client layer
2. You want to decouple platform support. You can do this for technical reasons and for org reasons (you can scale up teams independently)
This way, you can have the client be simple with a slow update cycle. You restrict its slowness to it alone and not to the whole product. You can then scale your backend services teams independently. You can throw away entire backend services while your front-end team modifies their API layer to ensure that legacy clients can handle things. The front-end team working on a client with a slow upgrade cycle can mock API responses or something. Meanwhile, the front-end team that's working on a client with a fast upgrade cycle can just move on.
You can thin your platform-specific client and move more logic into the platform-specific backend while keeping your platform-agnostic backend moving fast.
>>> why do iOS, Android, etc. each have their own API gateway vs sharing one
You need to be aware of some lifecycle aspects when dealing with first versus third-party clients. If you don't have experience with third-party being a key component of your business model then this is hard to grasp.
When you list these mobile clients, I immediately assume that these are your first-party clients. These are clients which you deploy to various stores and manage completely. There is some chance that you are supporting old versions, but you are mostly in control and can drive the support lifecycle by deprecating old clients when analytics suggest it is OK.
When you have clients that are critical to your business (some weird set-top box) that you can't just deprecate, then it starts to become viable to do something like this. Your main system can evolve and grow, leaving only a single system to maintain for these special clients. Whether it's worthwhile is going to be completely dependent on the value of the client. It's a massive improvement in the quality of life for your backend team and those responsible for supporting the client.
Typically each channel takes its own course and evolves at different speeds with different needs. Trying to engineer a common set of APIs that serves all channels eventually leads to ever-increasing FE logic and a backend API with leaky abstractions (by virtue of trying to accommodate different UI concerns).
By decoupling the BFF APIs, which are geared towards a specific UI, you are in a way moving a big part of the logic and orchestration to the backend and keeping the UI layer super light. From practical experience, this has been a very helpful pattern both in my startups and enterprise career.
> Why isn't there just one API gateway with minimal front-end-specific endpoints?
After Googling, I found that we can use both API Gateway (e.g., Apache APISIX, https://apisix.apache.org/blog/) and GraphQL to achieve this:
1. GraphQL: Let developers choose which resources to receive.
2. API Gateway: Besides implementing BFF, we could also filter requests to make sure the upstream service is protected. (It's more flexible IMO)
For our app the back of the frontend provides the working data model.
So for our web app we have a Next-style server-side renderer + serverless functions
For the mobile app we stick with a queue-based architecture (the web back of the frontend resolves to the same underlying data structure).
So imo different front ends have different use cases. But in general the back of the frontend isn't a backend, it's the frontend server. So the round trip is the same for first paint, API requests etc. For the web app at least
Based on my experience seeing this in action, adopt this at work if you want to have every team create their own shitty ad-hoc service mesh that sucks. Now adding a column to a table takes three PRs and three deploys! And you have to write redundant tests for each stupid layer.
That is still complicated. To maximally reduce complexity you need less. After all the word complexity literally means many.
1. Use a single traffic receiver at both the browser application and the terminal application. This thing works like a load balancer. It receives messages from a network and sends them where they need to go. There is only 1 place to troubleshoot connectivity concerns. You can still choose between monolith and multi-service applications and have as many micro services as you want.
2. Reduce reliance on HTTP. HTTP is a unidirectional communication technology with a forced round trip. That is a tremendous amount of unnecessary overhead/complexity. I prefer a persistent websocket opened from the browser. Websockets are bidirectional and there is no round trip. This means no more polling the server for updates. Instead everything is an independent event. The browser sends messages to the server at any time and the server independently sends messages to the browser as quickly as it can, which is real time communication happening as quickly as your network connection has bandwidth and your application can process it.
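A minimal sketch of that second point, using the Python websockets library (recent versions; the message shapes are invented): one persistent connection, and each side sends whenever it has something to say.

```python
import asyncio
import json

import websockets

async def handler(socket):
    # the server can push without being asked...
    await socket.send(json.dumps({"type": "hello"}))
    # ...and independently consume whatever the browser sends; no polling
    async for raw in socket:
        event = json.loads(raw)
        await socket.send(json.dumps({"type": "ack", "of": event.get("type")}))

async def main():
    async with websockets.serve(handler, "localhost", 8765):
        await asyncio.Future()  # run until cancelled

asyncio.run(main())
```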
Your first sentence was my primary takeaway from this article. Rather than dealing with complexity, I feel like these ideas create more. Issue has already been taken with "overfetching is not an option". It certainly is an option, albeit a non-optimal one. In this business, non-optimal is probably optimal for now, for most of us.
I would like for there to be solutions to the problems identified in this article. I don't have them, and I also would like for the solutions to not be the ones laid out in this article. Better to build something new (?) than to put these band-aids on the existing framework.
At my last company, when we were building a greenfield application, we leveraged this pattern, at a time when GraphQL was not yet available. In my new company, I look at GraphQL and realize that it solves the BFF pattern in a standardized way.
There are a number of downsides to graphql that arise from its overly generic (and yet, at the same time, not generic enough) nature. No HTTP caching of graphql requests is one. Very deeply nested payloads is another. Difficulty of expressing complex query logic through parameters. Inability to natively represent structures such as trees of unknown depth. Special efforts spent on preventing arbitrarily deep/complex queries. And so on.
A BFF does not have any of these problems. It can be made as closely tailored to a particular frontend as possible. It does not need to concern itself with a generic use case.
I really like the BFF pattern as it can solve a lot of problems. That said, I've always found it quite complicated to build BFFs because there was no such thing as a "BFF Framework", which was one of the reasons we built WunderGraph. One of the core use cases is to make building BFFs easy and solve a lot of cross-cutting concerns. The idea is that you don't really "write" a BFF but rather generate most of it by treating APIs (origins) as dependencies. I'd love to hear what you think about this approach to implementing the BFF pattern:
> The latter [BFF] is responsible for:
>
> - Getting the data from each required microservice
> - Extracting the relevant parts
> - Aggregating them
> - And finally returning them in the format relevant to the specific client
I am pretty certain that the only reason why BFFs were invented is for security in mobile and SPA applications, to securely deal with OAuth tokens.
If you don't need that, don't use a BFF; keep things simple. Start by not building a SPA when it can be a classic server-rendered web app. I feel like most of the web should just be that.
I've seen this pattern frequently lately -- front-end team develops "full stack" MVP using a react/node type stack. Then someone like me comes along when real back-end features are needed, and writes something in python, like an admin page. The point of contact is the db (typically Mongo) - and later on someone will add postgres to properly handle the data...
I really like SvelteKit's way of helping you create a similar BFF with their endpoints system. I still have my "real" backends, but w/ SvelteKit endpoints I can even turn my BFFs into real APIs. And it's trivial to create an interface for getting/posting data from those BFF APIs, which I use for admin.
This looks like overengineering to me. The frontend cannot be that dumb. Every modern frontend technology out there can parse a bunch of JSON efficiently.
If for whatever reason you don't want to call a few REST endpoints to render some stuff, then you should probably replace those endpoints with some other technology.
Maybe this works out for some engineering organizations, but man am I glad not to work with this style of "backend" anymore. The folks at my last gig did this and never have I seen such a mess of services.
One particularly annoying pain point was that one of the engineering directors made it his personal mission to ensure that every service's Protobuf definition conformed to some platonic ideal. We'd have a BFF service Foo that essentially provides a wrapper around another service, Bar (among other things). Naturally, Foo's service definition included some Protobuf messages defined in Bar. Now the simple thing to do here would be to include those Protobuf message definitions in Foo.proto and expose them to consumers of the Foo service, but that apparently couples clients too tightly to Bar's service definition (?). So, it was decided that Foo's Protobuf file would contain actual copies of the messages defined in Bar. Adding a new field to a message in Bar wouldn't expose it to clients of Foo unless you also update the message definition in Foo.proto as well as the code that converts from Bar::SomeMessage to Foo::SomeMessage. Now imagine that the backend service is something commonly used, like UserService, and that there are potentially dozens of BFF services that contain copies of the definition of UserService::User. Welcome to hell.
The aforementioned engineering director would hunt down BFF services that dared to include a message dependency instead of copying the definition. He would then open a ticket and assign it to the service owner, informing them that they were not in compliance with the "Architectural Decision Record": the indisputable, no-exceptions, holy-of-holies log of technical mandates that were decided by a handful of people in a meeting three years ago.
This issue isn't necessarily inherent to BFF architecture, but I think it comes from the same place of overengineering and architectural astronautics.
My company uses this pattern extensively, just as indicated in the post. Frontend teams deliver their own backend-for-frontend and the backend teams just worry about their own microservices. Generally, it works out pretty well most of the time.
The big issue I've been seeing is that occasionally frontend teams will decide to develop "features" by stringing together massive series of calls to the backend to implement logic that a singular backend could do much more efficiently. For example, they commonly will have their backend-for-frontend attempt to query large lists of data, even walking through multiple pages, in order to deliver a summary that a backend service could implement in a SQL query. Unnecessary load on the backend service and on the DB to transmit all that data needlessly to the BFF.
I know the easy answer is to blame the frontend devs, but this pattern seems to almost encourage this sort of thing. Frontends try to skip involving the backend teams due to time constraints or just plain naivety, and suddenly the backend team wakes up one morning to find a massive increase in load on their systems triggering alerts, and the frontend team believes it's not their fault. This just feels like an innate risk of promoting a frontend team to owning an individual service living in the data center.
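A caricature of that failure mode next to the backend alternative (the endpoint, parameters and schema are hypothetical):

```python
import requests

# BFF-side anti-pattern: walk every page of raw data just to produce a summary.
def monthly_total_via_bff(user_id: int) -> float:
    total, page = 0.0, 1
    while True:
        resp = requests.get(
            "https://orders.internal/api/orders",
            params={"user_id": user_id, "page": page},
        ).json()
        total += sum(order["amount"] for order in resp["items"])
        if not resp["has_next"]:
            return total
        page += 1

# Backend-side alternative: the same summary as one aggregate the DB is built for:
#   SELECT COALESCE(SUM(amount), 0) FROM orders WHERE user_id = %(user_id)s;
```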
Many of these backend systems will be off-the-shelf products, e.g., SAP, Salesforce, Marketo, and lots of other systems people use. Many of these systems don't have APIs that work well for interactive frontends. And you have many different frontends, for your sales people, your project engineers, your marketing department, your different suppliers and partners. These different apps have different data & API needs, and will need to aggregate data from different systems. Systems that cannot be modified to execute the most optimal SQL query for your frontend. And it's too much business logic to put in an API gateway. And all these systems have their own lifecycles, so different project timing, different releases, etc. A BFF makes a lot of sense to make all of this manageable.
this is what happens when you give frontend teams tools with too much power in an environment where the frontend and backend teams don't communicate well with each other.
i have seen this first hand with a customer. to their credit, they didn't even have a backend team, until i was brought on board to help them rewrite some of their slower frontend functionality into some more efficient backend functions
Your frontend team needs an API call that retrieves only friend requests with user names/avatars because the current API call is too heavy.
Normally they'd go to the back-end team and ask them to add an endpoint or modify an existing endpoint to give them the exact data they need. The problem with that is global API surface area becomes more complicated. Complexity is bad because more mistakes are made.
With this model your front-end team can implement the endpoint themselves in the BFF and model the response data exactly as they need it.
They also don't need to deal with that one zealot on the back-end team that gives everyone a hard time about tiny irrelevant issues.
In the nineties, people were talking about n-tier architectures. This was of course before browsers were a common thing. But we did have networks already since the nineteen seventies. And actually, a lot of the architectural solutions people use today aren't that different from what people were doing a few decades ago. Things like resolving names, passing messages, figuring out security, etc. have been things that people have been figuring out for different systems. If you compare Kubernetes to CORBA for example, you'll see some common features between the two. Not surprising because they both provide a solution for allowing networked components to talk to each other.
Basically, the whole frontend/backend distinction is a simplification. There are more than two layers. And you can probably even identify layers in the frontend as well these days given that a typical SPA actually does a lot of stuff before it even talks to a server. Local browser storage for example is a storage layer. And you probably have some layer in between that and e.g. event handlers.
The reason the frontend/backend boundary is important is because they are physically separated by a network but also have relatively independent life cycles. You don't update them at the same time for example. You might have different versions of the frontend talking to the same backend. Or even multiple different frontend applications. And many companies have multiple backends as well. So, it's an ecosystem of frontends and backends talking to each other. Providing some kind of facade layer for that complexity is of course an obvious solution when you have a complicated backend with its complexity spilling over to frontend layers.
Indeed, there were tonnes of n-tier architecture stuff in Java EE land in the 90s. SOAP (Simple Object Access Protocol) was/is another concept from the 90s that is conceptually similar to GraphQL.
I don't like the terms front-end and back-end. In reality applications are using a variety of internal & external services these days, and not all of them are behind a backend necessarily. Your own micro-services are meant to be used in much the same way as any third party's endpoint.