I'm really interested in how you got the auth to interoperate between NextJS and Python. I find auth to be the most difficult part of blended-code projects like this, with JavaScript on the frontend.
Well you don't really need to get the auth to interoperate per se; the only machine that is allowed to connect to the FastAPI backend is the machine running the NextJS app, and it passes along the email address of the user making the request to the FastAPI backend.
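To give a flavour of the receiving side, here's a minimal sketch (the header name and endpoint are hypothetical, not the author's actual code): the FastAPI app simply trusts a forwarded email header, because only the NextJS machine can reach it.

    from fastapi import FastAPI, Header, HTTPException

    app = FastAPI()

    @app.get("/items")
    async def list_items(x_user_email: str | None = Header(default=None)):
        # The NextJS server (the only machine allowed to connect) forwards
        # the logged-in user's email; we trust it because nothing else can
        # reach this service.
        if x_user_email is None:
            raise HTTPException(status_code=401, detail="missing user email")
        return {"user": x_user_email, "items": []}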
And the user auth stuff in NextJS is incredibly easy using the standard Next-Auth flow: https://next-auth.js.org/
You basically just set up a new application in the Google Cloud console, enable the Google Plus API for the app, and create the OAuth credentials; that's about it. Just add the secret key and client identifier to the .env file for your NextJS app and it "just works".
Yeah, I restrict it by the IP address, so the FastAPI backend can only receive connections from localhost or the one machine running the NextJS app. There are certainly lots of ways you could restrict it, such as using a password or key.
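For the curious, the IP restriction can be done as a tiny middleware; this is a hedged sketch (the addresses are placeholders, not the author's setup):

    from fastapi import FastAPI, Request
    from fastapi.responses import JSONResponse

    app = FastAPI()

    # hypothetical allowlist: localhost plus the one NextJS host
    ALLOWED_IPS = {"127.0.0.1", "203.0.113.7"}

    @app.middleware("http")
    async def restrict_by_ip(request: Request, call_next):
        client = request.client
        if client is None or client.host not in ALLOWED_IPS:
            return JSONResponse(status_code=403, content={"detail": "forbidden"})
        return await call_next(request)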
Bit of an aside, but whilst merging the API validation and ORM models with SQLModel is fast, is it a good idea? There could be reasons these two things should grow independently, and is it wise to put so much logic into external frameworks?
Thanks. I don't really see how the validation schema and ORM model would ever really diverge... it's basically just specifying the various fields that are required or expected and the types of those fields. Before I found SQLModel, when I was separately using SQLAlchemy ORM and Pydantic, I would always end up with models that looked more or less the same, just with different syntax. And actually keeping them in sync was very annoying, since you always had to remember to change things in both places. It's very natural to combine them.
The beauty of how tiangolo (creator of FastAPI/SQLModel) implemented things is that an SQLModel class isn't just LIKE a Pydantic schema and LIKE an SQLAlchemy ORM data model: they literally ARE those things under the hood, so you can still use those libraries separately for more advanced tinkering and stuff "just works".
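For anyone who hasn't seen it, the canonical example from the SQLModel docs looks roughly like this; the single class is both the table definition and the validation schema:

    from sqlmodel import Field, SQLModel

    class Hero(SQLModel, table=True):
        # one class is simultaneously the SQLAlchemy table model
        # and the Pydantic validation schema
        id: int | None = Field(default=None, primary_key=True)
        name: str
        secret_name: str
        age: int | None = None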
> I don't really see how the validation schema and ORM model would ever really diverge...
If that were the case, then using a PostgreSQL API[0] that maps tables to APIs would be all that's required.
However, the real world is messy. Requirements change, which could lead to the project becoming a reimplementation of a full framework such as Django.
Django also comes with generic REST endpoints based on models, thus giving you the magic, but it still allows for all the different use cases and customizations that might present themselves over the full lifecycle of a project.
Well, for a start, there is a strong tendency for devs these days to ignore the old wisdom of abstraction, layered architectures, etc. If they keep at it long enough, they usually learn it eventually, though!
But this kind of thing can work in a well-architected system too. You're basically just saying: this API is pure CRUD, I simply record what the client tells me. Then you have to apply your business rules somewhere else. This is usually the event-driven approach: I just record commands/events, then some other code actions those, based on business rules.
Where I think this goes wrong is in not realising the business rules have to go somewhere else. I see this with Django a lot: you get started with pure CRUD, then someone says "oh, we shouldn't allow that record because xyz", and the whole thing starts to become a mess.
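A toy sketch of that separation, with made-up command names and a made-up rule, just to make the shape concrete:

    from dataclasses import dataclass, field

    @dataclass
    class CreateHero:
        payload: dict  # a recorded command; the CRUD layer just writes it down

    @dataclass
    class CommandLog:
        pending: list[CreateHero] = field(default_factory=list)

        def record(self, cmd: CreateHero) -> None:
            # the "pure CRUD" endpoint: record what the client said, no rules
            self.pending.append(cmd)

    def apply_business_rules(log: CommandLog) -> list[dict]:
        # the rules live here, in separate code, not in the CRUD layer
        return [cmd.payload for cmd in log.pending
                if cmd.payload.get("age", 0) >= 0]  # e.g. "no negative ages"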
No, it's a bad design that doesn't use type composition to rule out invalid representations. People are given type hints, but they still make the same dynamic-typing mistakes as before, including the authors of the lib.
Instead, it should distinguish between persisted entities and value objects:
    from dataclasses import dataclass
    from typing import Generic, Iterable, TypeVar

    A = TypeVar("A")

    @dataclass
    class Entity(Generic[A]):
        pk: int
        val: A

    @dataclass
    class Hero:
        name: str
        secret_name: str
        age: None | int = None

    class Service(Generic[A]):
        def persist(self, x: A) -> Entity[A]:
            # execute()/insert() are elided stand-ins for the DB layer:
            # run the INSERT and return the generated primary key
            pk = self.execute(insert(x))
            return Entity(pk, x)

        def select(self, *cond) -> Iterable[Entity[A]]:
            ...

        def copy(self, x: Entity[A]) -> Entity[A]:
            return self.persist(x.val)

        def delete(self, pk_ent: int | Entity[A]) -> A:
            ...
It should be obvious how the same model extends to different PK types too: serial ints, UUIDs, etc.
I think it's because you can create an object before you persist it in the db; the db will generate the id for you. So id=None is used to mean "not in persistent storage".
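Concretely, reusing the Hero model from the earlier SQLModel sketch (and assuming the table has been created):

    from sqlmodel import Session, create_engine

    engine = create_engine("sqlite:///heroes.db")

    hero = Hero(name="Deadpond", secret_name="Dive Wilson")
    print(hero.id)  # None: exists in memory, not yet in persistent storage

    with Session(engine) as session:
        session.add(hero)
        session.commit()
        session.refresh(hero)

    print(hero.id)  # now an int generated by the database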
The author uses Whisper and GPT-4o to get transcriptions into a nicely formatted Markdown file.
We just released Omnio, a new AI model that can do all this in a single step, since it works with audio directly. It does not generate a transcript and then modify it; it can generate structured output such as Markdown directly from the audio.
It's such a game changer... I always had this thought in the back of my mind: "Why do I need to create two versions of every data model that basically specify the same information (field name and type)?" And it turns out you don't have to: you can use the same model for both the ORM and the response validation schema.
I argued exactly this point with a talented junior dev on our team. He still implemented his solution with SQLModel, but started seeing the difficulties half a year into the project, once the business requirements became more apparent and the API schema and data model started diverging.
No, it doesn't do any forms stuff, but you can easily generate forms in your preferred UI framework from the data model definitions themselves using an LLM.
Make sure you know the limitations of SQLModel before committing to it. It doesn't support polymorphism, which was an immediate disqualifier for me, since that ends up being needed in every project.
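For context, this is roughly the joined-table inheritance that plain SQLAlchemy gives you (a sketch with hypothetical models), which SQLModel's combined classes don't map cleanly onto:

    from sqlalchemy import ForeignKey
    from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

    class Base(DeclarativeBase):
        pass

    class Person(Base):
        __tablename__ = "person"
        id: Mapped[int] = mapped_column(primary_key=True)
        type: Mapped[str]
        __mapper_args__ = {"polymorphic_identity": "person",
                           "polymorphic_on": "type"}

    class Engineer(Person):  # rows live in both tables, joined by id
        __tablename__ = "engineer"
        id: Mapped[int] = mapped_column(ForeignKey("person.id"),
                                        primary_key=True)
        __mapper_args__ = {"polymorphic_identity": "engineer"}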
This post comes at a great time. I've been looking into what the "perfect" stack would be for me (I'm OK with Python but haven't done any frontend work).
Is anyone actually using FastAPI in a commercial, large scale app? Would you prefer using...say Django or Flask + Gevent (since they're more mature) over FastAPI?
I recently found this thread[1] about FastAPI. It's somewhat old now but reviews are mixed. I'm wondering if the landscape has improved now. Additionally, OP is using NextJS for the frontend and even that isn't without complaints[2]. What's odd for me is that the React website also asks you to pick between either Next.js or Remix[3].
Indeed, we use FastAPI at quite a large scale! I would not trade FastAPI for anything else at this point. Before the typing module became ubiquitous, there used to be a lot of "magic" frameworks that made heavy use of Python's dynamic nature; both Django and Flask fall within this category. I am not a fan of untyped Python, at all.
There are some gotchas in the way FastAPI works, mostly due to its usage of thread pools and the GIL. I would recommend anyone starting a project to use asynchronous I/O exclusively (SQLAlchemy supports async with asyncpg), as FastAPI is very clearly meant as a pure ASGI framework, despite what they claim.
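A minimal sketch of that setup using SQLAlchemy 2.0's async API with asyncpg (the DSN and the Hero model are placeholders):

    from sqlalchemy import select
    from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine

    # placeholder DSN; asyncpg is the async PostgreSQL driver
    engine = create_async_engine("postgresql+asyncpg://user:pass@localhost/app")
    SessionLocal = async_sessionmaker(engine, expire_on_commit=False)

    async def list_heroes():
        # Hero is a stand-in for whatever mapped model you have
        async with SessionLocal() as session:
            result = await session.execute(select(Hero))
            return result.scalars().all()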
How is your experience with async SQLAlchemy and FastAPI? I've experienced a handful of nasty bugs, specifically around how connection pooling is handled.
Yup, I use FastAPI for some large services, handling a few million requests per day. Great experience overall, and speed of development is significantly improved. You can feel confident something is going to work after writing the code. With Flask and Django I was always second-guessing whether things were hooked together properly. One pitfall I would avoid, though, is mixing sync and asyncio together. I've found there are a lot of footguns in that type of setup. Use one or the other, but not both.
Hello, could you elaborate on the footguns with mixing sync and asyncio? I am currently developing an app with both kinds of endpoints plus websockets, and I'd rather not discover these problems too late.
I've found that when mixing the two types of routes, it can be quite easy to accidentally introduce blocking paths which freeze up the main event loop, especially if mixing async dependencies with sync ones.
When a route is async, it gets scheduled on the main event loop, meaning that any blocking calls can block all requests in flight, and block unexpected things like start up and teardown of api handlers. `asyncio.to_thread()` can help here, but it's easy to forget (and there's no warnings in case you forget).
If you do mix the two, I would be very careful to monitor the timings of your requests to detect early if blocking is occurring. For these metrics, I suggest using something like the statsd or OpenTelemetry libraries to add reporting to the endpoints, which you can feed into telemetry dashboards.
Basically, you want anything that has any kind of IO (network, disk, etc.) or which is very slow to compute to be fully async. If you have a function that returns almost instantly, like computing the hash of something, you can keep that sync.
If you're forced to use a slower sync function (say, something from a library that you can't easily change), then you can use asyncio.to_thread() so that it can run in a separate thread without blocking your main event loop. If you don't do that, then you sort of defeat the whole purpose of the architecture because everything would stop while you're waiting for that slow sync function to return.
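A small illustration of both cases, assuming a blocking library call you can't change:

    import asyncio
    import time

    from fastapi import FastAPI

    app = FastAPI()

    def slow_sync_call() -> str:
        time.sleep(5)  # stand-in for a blocking library call you can't change
        return "done"

    @app.get("/bad")
    async def bad():
        # blocks the event loop for 5 seconds, stalling every
        # other request in flight
        return {"result": slow_sync_call()}

    @app.get("/good")
    async def good():
        # runs the sync call in a worker thread, so the event loop stays free
        return {"result": await asyncio.to_thread(slow_sync_call)}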
Django/DRF are fine for APIs, particularly with DRF Spectacular to generate the OpenAPI specs. DRF couples tightly to the ORM and plays nicely with the Django DB models.
FastAPI will have more boilerplate, but I'm not sure that's an issue anymore in the age of AI coding assistants.
HTMX is also wonderful for both. It's a nice tech for lightweight SPAs. If you're going to go deeper into the JS side you can look at some of the more mature frameworks. I've been kinda partial to Vue.js.
And if you really want to go crazy, there's PyScript from Anaconda.
DRF's serializers make it really easy to generate N+1 queries and it's a very opinionated framework that doesn't share many opinions with other frameworks. There is also no plan to bring async support to DRF, even though it's already present in Django itself.
If you want to build a Django API these days, I'd lean towards django-ninja, which essentially bolts Pydantic onto Django, in a similar fashion to how FastAPI layers Pydantic on top of an async Flask-like framework.
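A django-ninja endpoint ends up looking a lot like FastAPI; a rough sketch (inside an existing Django project, with hypothetical schema names):

    from ninja import NinjaAPI, Schema

    api = NinjaAPI()

    class HeroIn(Schema):
        name: str
        secret_name: str

    @api.post("/heroes")
    def create_hero(request, payload: HeroIn):
        # payload has already been validated by Pydantic,
        # much like a FastAPI request body
        return {"name": payload.name}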
I personally would just use FastAPI, but I understand lots of folks have invested heavily in Django's ORM (or prefer it to raw SQL or SQLAlchemy).
There's a lot to like about the Django ORM. And Django still has its place.
What I'd like is a SQL that creates and consumes JSON structures as output/input; then there wouldn't really be a need to serialize/deserialize anything. Python would be a lightweight wrapper for authentication, error handling, logging, and the data processing that couldn't easily be done in SQL.
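You can get surprisingly close to this with Postgres's JSON functions today; a hedged sketch (the hero table is hypothetical) using psycopg:

    import psycopg

    # let Postgres assemble the JSON itself, so Python never
    # hand-serializes row objects
    with psycopg.connect("dbname=app") as conn:
        row = conn.execute(
            "SELECT json_agg(row_to_json(h)) FROM hero AS h"
        ).fetchone()
        heroes = row[0]  # psycopg decodes the json column to a list of dicts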
I mean, people are trying this, but SQL feels so arcane. If you have tabular data and are working with spreadsheets, SQL is great. But the JSON support in Postgres is a badly bolted-on design choice that forces tree-style data into tabular form. Instead, they should modify SQL to process tree-style data as tree-style data.
FastAPI automatically generates the openapi.json file for you, which is how it's able to also give you the Swagger page "for free" once you've defined the route structure. It's very convenient.
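Even a trivial app gets the spec and UI; for example:

    from fastapi import FastAPI

    app = FastAPI()

    @app.get("/heroes/{hero_id}")
    async def read_hero(hero_id: int):
        return {"hero_id": hero_id}

    # run with `uvicorn main:app`; the spec is served at /openapi.json
    # and the Swagger UI at /docs, with no extra code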
I have grown to believe that Python is not appropriate for any organization where multiple teams will work in the same codebase/repo. The system is prone to disorder and should be in the same category as Perl, for similar reasons.
Programming is a team sport, and static typing is just too useful; the bolted-on typing is insufficient. We are burning, literally, millions of dollars in salaries to make Python work at our org. It has been the same at all four shops where I've been at the staff/principal level. Dynamic languages lend themselves to less maintainable code, because work a compiler would otherwise do is offloaded to your squishy human working memory.
I tend to agree: the flexibility that non-statically-typed languages (e.g. Python) offer on smaller-scale projects (very) quickly devolves into chaos at larger scale. With scale, rules and rigidity provide structure; without it, they provide verbosity and bureaucratic obstacles. Unfortunately "scale" is a gradient, not discrete, so there's no "right answer", hence the waste you experience. Ultimately, waste is in the eye of the beholder... "One person's waste is another's GDP."
mypy helps, but I generally agree. Similar thing on the Node side: TypeScript is great, but it's still a lot harder to scale an application. It can be done in dynamic languages, obviously (PHP: Slack; TypeScript: VSCode), but IMO it's harder.
Despite having a big Django/DRF and FastAPI footprint, our in-development backends are using C#/ASP.NET.
> Is anyone actually using FastAPI in a commercial, large scale app? Would you prefer using...say Django or Flask + Gevent (since they're more mature) over FastAPI?
Django and FastAPI are two different things. The former is a huge and opinionated MVC-like web framework, and the latter is a simple library for making HTTP endpoints. I prefer FastAPI, but it really depends on what you are building and how many people are going to work on that API.
> What's odd for me is that the React website also asks you to pick between either Next.js or Remix[3].
It's indeed very odd; they made this change with the new docs. I would not recommend picking either of those; just start your project with vite.js so you can focus on React and keep it simple. You can use react-router for the routes and react-query to call your backend. You don't really need anything else.
Django with DRF will get you a very maintainable API, and you can have as little or as much boilerplate as you want. For example, you can inherit all of the model’s fields, or you can choose to specify them all, or somewhere in between, add additional fields, etc. You can have it generate pagination or certain filters. It’s got plenty of hooks for overriding functionality.
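For instance, a ModelSerializer lets you pick either extreme (the Hero model here is hypothetical):

    from rest_framework import serializers
    from myapp.models import Hero  # hypothetical Django model

    class HeroSerializer(serializers.ModelSerializer):
        class Meta:
            model = Hero
            fields = "__all__"           # inherit every model field...
            # fields = ["name", "age"]   # ...or spell them out explicitly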
If you see performance issues or situations where you're hitting the N+1 query issue, you can optimize by using the ORM's prefetch_related or select_related, or just drop into raw SQL.
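A quick illustration with a hypothetical Hero model that has a team foreign key:

    # N+1: one query for the heroes, then one more per hero for its team
    for hero in Hero.objects.all():
        print(hero.team.name)

    # a single JOINed query instead; select_related covers FK/one-to-one,
    # prefetch_related batches many-to-many and reverse relations
    for hero in Hero.objects.select_related("team"):
        print(hero.team.name)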
I'm obviously a fanperson, but I have yet to find a framework combo that I like more than those two. It's not very fashionable, but you'll end up with an app that's quick to develop and that's reasonably secure by default.