
Could you simplify your point? I was an ardent marshmallow user, and when I finally switched to pydantic it felt like sitting down after standing forever. The documentation seems good enough to me, but more importantly, the interface pydantic provides for defining your JSON schema is the most elegant I've seen in any language and miles better than the mess marshmallow provided.

For many of us, especially on the SaaS side, the speed of these operations is a distant third priority behind ease of writing and understanding the code and keeping it reliable and free of bugs. The actual compute happens on a cluster with Spark or Snowflake anyway.



There is no reference doc. The docs cover a lot of material in a small amount of space, burying important pieces of information and mixing up a large number of topics under unintuitive headlines. Reading the source code is occasionally necessary just to figure out how it all works.

The API is a little weird, particularly around defining validators. The parameter name-matching is an "interesting" design choice. Accessing "values" as a dict[str,Any] is messy if you care about static typing, although I can understand why they did it.
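To make the name-matching complaint concrete, here is a minimal stdlib-only sketch of the general technique: a dispatcher that inspects a validator's parameter names and passes extra context only when a parameter is literally called `values`. This is not Pydantic's implementation; `call_validator` and the example validators are hypothetical names made up for illustration.

```python
import inspect
from typing import Any, Callable


def call_validator(func: Callable[..., Any], value: Any, values: dict[str, Any]) -> Any:
    """Hypothetical dispatcher: pass `values` only if the function
    declares a parameter with exactly that name."""
    params = inspect.signature(func).parameters
    kwargs: dict[str, Any] = {}
    if "values" in params:
        kwargs["values"] = values
    return func(value, **kwargs)


def strip_name(v: str) -> str:
    # Declares no `values` parameter, so it only receives the field value.
    return v.strip()


def check_end(v: int, values: dict[str, Any]) -> int:
    # `values` is an untyped dict, so a type checker cannot verify
    # that "start" exists or that it is an int.
    if v < values["start"]:
        raise ValueError("end must not be before start")
    return v


def check_end_renamed(v: int, vals: dict[str, Any]) -> int:
    # Same intent, but the parameter name does not match, so the
    # dispatcher never supplies it and the call fails with a TypeError.
    return v


print(call_validator(strip_name, "  alice ", {}))
print(call_validator(check_end, 5, {"start": 1}))
```

The point of the sketch is that the contract lives in the parameter's name rather than in its type, so renaming an argument silently changes what gets passed.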

Furthermore, the behavior of validators and the exact sequence in which they run is not defined by the docs. It's not that hard to figure out, but it also might change at any time because there's no user contract. Attrs is significantly nicer in just about all respects here, especially their attention to detail in their extensive user guide and reference docs.

Speaking of user contract, there's no clear separation between private and public. Without a reference doc it all looks like fair game, but without a reference doc it also might all change at any moment. Either you stick to the examples, or you're off doing a guess-and-check dance and hoping something doesn't break.

Even with the Mypy plugin, I often have to write `if TYPE_CHECKING` all over any nontrivial Pydantic class consuming data from external sources. Variable annotations in Pydantic are fundamentally not PEP 484 type hints. That's fine, but it's confusing that they're almost the same, and, as above, it's almost entirely up to you to figure out how it all works, either by trial and error or by digging around in the issue tracker and StackOverflow.

Ease of writing and reliability are precisely my big areas of annoyance and concern. Speed of (de)serialization is comparatively unimportant (although I don't like the huge amount of overhead involved and I avoid using it in hot code paths).

I also don't like using Pydantic-defined classes very much, because the actual init method signature is just *args, **kwargs, which doesn't work well with any tooling. It feels like being back in the Tornado & PyMongo dark ages where everything is dynamic or dynamically-generated and classes are just glorified hash tables.
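A small stdlib-only illustration of why a catch-all constructor is hard on tooling. `DynamicUser` is a hypothetical stand-in for any class whose `__init__` accepts `*args, **kwargs`, not a Pydantic model, and `ExplicitUser` is a plain dataclass for comparison.

```python
import inspect
from dataclasses import dataclass


class DynamicUser:
    """Hypothetical class whose constructor accepts anything and stores it."""

    def __init__(self, *args, **kwargs):
        self.__dict__.update(kwargs)


@dataclass
class ExplicitUser:
    name: str
    age: int = 0


# Introspection (and therefore IDEs, doc generators, and type checkers)
# can only see what the signature declares.
print(inspect.signature(DynamicUser))
print(inspect.signature(ExplicitUser))

# The dynamic class accepts a misspelled field without complaint ...
DynamicUser(nmae="alice")

# ... while the explicit one rejects it at call time.
try:
    ExplicitUser(nmae="alice")
except TypeError as exc:
    print("rejected:", exc)
```

The first signature exposes no field names at all, which is the "glorified hash table" feeling described above.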

I agree that the JSONSchema integration is outstanding. BaseSettings is also a tremendous productivity improvement: I love that I can define a class and immediately get a proper app-wide config reading from both env vars and a dotenv file. I also like the default error messages that tell you exactly which field failed validation. And I like the validator system (once I figured out how it worked), which respects the order in which I define the validators and supports validators that run before or after the default set of validators (pre=True and pre=False respectively). I was probably being a little too negative before, but my annoyance level with the developer-facing API and documentation remains high, and I will gladly jump to an Attrs-based alternative as soon as one exists.*


Please please take a look at V2, both the code and the documentation (although I admit, the documentation for V2 isn't finished).

I (the developer of Pydantic) had many of the same frustrations with Pydantic V1, which is why I've spent so long rewriting it to try and fix these concerns.

In particular:

* we now have API documentation [1]
* we have first class support for validating `TypedDict` which gives you a typing-valid dict representation of your data straight out of validation
* we now have strict mode
* we're working hard to define an exact spec for what validates to what [2]
* we have a strict separation between public/private - everything private is in a `pydantic._internal` module, and we have unit tests that everything which can be publicly imported is explicitly public
* we now use `Annotated[]` for defining custom validations/constraints, together with annotated-types [3]
* the protocol for customising validation and serialization has been significantly improved [4]

I'd really love to hear your feedback on V2 and what more we can do to improve it - your feedback seems unusually reasonable for HN ;-) - please email samuel@pydantic.dev or create an issue/discussion if you have any thoughts.

1: https://docs.pydantic.dev/latest/api/main/
2: https://docs.pydantic.dev/latest/usage/conversion_table/
3: https://github.com/annotated-types/annotated-types
4: https://docs.pydantic.dev/latest/usage/types/custom/
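For readers unfamiliar with the `Annotated[]` approach mentioned in the list above, here is a rough stdlib-only sketch of the underlying idea: constraints travel as metadata next to an ordinary PEP 484 type, so the annotation stays valid for type checkers. The `Gt` marker, `Order` class, and `check` helper are hypothetical stand-ins written for this example; they are not the annotated-types or Pydantic V2 implementations.

```python
from dataclasses import dataclass
from typing import Annotated, Any, get_type_hints


@dataclass(frozen=True)
class Gt:
    """Hypothetical constraint marker: value must be greater than `bound`."""

    bound: float


class Order:
    # To a type checker this is simply `int`; the constraint is metadata.
    quantity: Annotated[int, Gt(0)]
    note: str


def check(cls: type, field: str, value: Any) -> Any:
    """Apply any Gt markers found in the field's Annotated metadata."""
    hint = get_type_hints(cls, include_extras=True)[field]
    # Plain annotations such as `str` carry no __metadata__ attribute.
    for meta in getattr(hint, "__metadata__", ()):
        if isinstance(meta, Gt) and not value > meta.bound:
            raise ValueError(f"{field} must be greater than {meta.bound}")
    return value


print(check(Order, "quantity", 3))
print(check(Order, "note", "gift wrap"))
```

Because the base type is untouched, static tooling keeps working while a runtime library reads the extra metadata to decide what to validate.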



