It isn't even that much of a speed hit with the classical sort-based CART implementation. However, XGBoost and LightGBM use histogram-based approximate split finding, which might be harder to adapt in a performant way. And the code will certainly be a lot messier.
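To make the speed argument concrete, here is a minimal sketch of a sort-based split search on a single feature that tolerates NaNs. It assumes one common strategy (try routing all missing values left, then right, at every candidate threshold), which is not necessarily how CloudForest does it. The function name `best_split_with_missing` and the squared-error criterion are illustrative choices, not taken from any library.

```python
import math


def best_split_with_missing(xs, ys):
    """Return (sse, threshold, missing_goes_left) for one numeric feature.

    Hypothetical helper: exact (sort-based) regression split search where
    rows with a NaN feature value are tried on both sides of each split.
    """
    # Sort only the rows whose feature value is present.
    present = sorted((x, y) for x, y in zip(xs, ys) if not math.isnan(x))
    missing = [y for x, y in zip(xs, ys) if math.isnan(x)]

    # Sufficient statistics (count, sum, sum of squares) for the missing rows.
    m_n, m_sum, m_sq = len(missing), sum(missing), sum(y * y for y in missing)
    tot_n = len(present)
    tot_sum = sum(y for _, y in present)
    tot_sq = sum(y * y for _, y in present)

    def sse(n, s, sq):
        # Sum of squared errors around the mean, from running statistics.
        return sq - s * s / n if n else 0.0

    best = (math.inf, None, None)
    l_n = l_sum = l_sq = 0.0
    for i in range(len(present) - 1):
        x, y = present[i]
        l_n += 1
        l_sum += y
        l_sq += y * y
        nxt = present[i + 1][0]
        if x == nxt:
            continue  # no threshold fits between equal feature values
        thr = (x + nxt) / 2
        r_n, r_sum, r_sq = tot_n - l_n, tot_sum - l_sum, tot_sq - l_sq
        # The only extra work for NaNs: evaluate both routing directions.
        for goes_left in (True, False):
            if goes_left:
                cost = sse(l_n + m_n, l_sum + m_sum, l_sq + m_sq) + sse(r_n, r_sum, r_sq)
            else:
                cost = sse(l_n, l_sum, l_sq) + sse(r_n + m_n, r_sum + m_sum, r_sq + m_sq)
            if cost < best[0]:
                best = (cost, thr, goes_left)
    return best
```

The sort still dominates the cost, and the NaN handling only doubles the constant-time work per candidate threshold, which is consistent with the speed hit being small in this setting.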
Came here to cite your work. I still mention "CloudForest" in my slides as "an interesting implementation that is also capable of handling NANs in DTs in a slightly different way." Crazy that it has already been 10 years.