It doesn’t look optimized for Apple Vision Pro, but this barebones app is free and runs local LLMs on visionOS, iOS, and iPadOS: https://apps.apple.com/us/app/mlc-chat/id6448482937
I’ve messed with it mostly out of curiosity, to see how fast Apple silicon can run LLM inference on mobile.