Cross-compiling is finicky in any ecosystem, and packages like numpy are especially so. When I was working on armv7 projects, I got a few TinkerBoards and ran the builds natively. The first build took a while, but I kept my binary cache up to date, so normal CI/CD builds were fast. I'd also push core builds to https://arm.cachix.org. I haven't updated it recently, but I'm considering doing that again.