> there is probably a way to make typing work with pandas
So we did use a lot of pandas as well. The way to make it work is to create custom types, and of course each dataframe will have its own type, which is going to be a total mess.
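For anyone wondering what "custom types per dataframe" looks like in practice, here is a minimal sketch using `typing.NewType`. The names (`UsersFrame`, `OrdersFrame`, `as_users`, `active_emails`) and the column set are made up for illustration; this is one way to do it, not necessarily what the commenter's codebase did.

```python
from typing import List, NewType

import pandas as pd

# Hypothetical nominal types: one per dataframe "shape".
# At runtime these are plain DataFrames; only the type checker tells them apart.
UsersFrame = NewType("UsersFrame", pd.DataFrame)
OrdersFrame = NewType("OrdersFrame", pd.DataFrame)

# Assumed schema for the illustration.
USERS_COLUMNS = {"user_id", "email"}


def as_users(df: pd.DataFrame) -> UsersFrame:
    """Check the columns at runtime, then tag the frame with the nominal type."""
    missing = USERS_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return UsersFrame(df)


def active_emails(users: UsersFrame) -> List[str]:
    """Accepts only frames that went through as_users (per the type checker)."""
    return users["email"].tolist()
```

The mess shows up as soon as you transform anything: a merge of users and orders, or even dropping one column, produces a frame that needs yet another hand-written type and constructor.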
Typing pandas seems like an R&D-level problem: it sits quite close to tricky areas like row types / dependent types that mostly only work in theory.
Attempting manual workarounds like the pure nominal typing you describe goes against the grain of a type system and pushes a lot of work onto the user. We just... don't. We stick with the class ('pd.DataFrame'), and the one extension we are considering is Index; going into actual columns gets messy quite quickly.
I tried to engage some PL researchers on this a while back, but no go. IMO it would be a great project for most type system research students.
Not exactly this, but I've had some success using logic programming (miniKanren) to statically assert certain facts about dataframes in Spark. E.g. "if Row_a is filtered to Row_b, and Row_a is not null, then Row_b is also not null".
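To make the rule concrete, here is a rough sketch of the same inference in plain Python. It is simple forward chaining over a set of facts, not miniKanren, and the names (`derive_not_null`, `row_a`, and so on) are hypothetical; a relational engine would state the rule declaratively instead of looping.

```python
from typing import Set, Tuple


def derive_not_null(
    filtered_to: Set[Tuple[str, str]], not_null: Set[str]
) -> Set[str]:
    """Apply the rule 'src filtered to dst, src not null => dst not null'
    repeatedly until no new facts appear."""
    known = set(not_null)
    changed = True
    while changed:
        changed = False
        for src, dst in filtered_to:
            if src in known and dst not in known:
                known.add(dst)
                changed = True
    return known


# Hypothetical pipeline: row_a -> row_b -> row_c, plus an unrelated row_x -> row_y.
filtered_to = {("row_a", "row_b"), ("row_b", "row_c"), ("row_x", "row_y")}
facts = derive_not_null(filtered_to, {"row_a"})
```

Here `facts` contains `row_a`, `row_b` and `row_c`, while `row_y` is left out because nothing is known about `row_x`.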