Even running a quantized, optimized LLM on a smartphone would, at minimum, drain the battery fast.
In the future(?), inference will probably run on the dedicated AI blocks (NPUs) instead of the GPU, since those are very low power.
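For what it's worth, apps can already target those AI blocks today: on Android, TensorFlow Lite can hand supported ops to the NPU/DSP through NNAPI. A minimal Kotlin sketch (the helper name and model file are made up for illustration):

```kotlin
import org.tensorflow.lite.Interpreter
import org.tensorflow.lite.nnapi.NnApiDelegate
import java.io.File

// Hypothetical helper: builds an interpreter for a quantized .tflite model,
// preferring the phone's AI accelerator over CPU/GPU.
fun buildNpuInterpreter(modelFile: File): Interpreter {
    val options = Interpreter.Options().apply {
        // Route supported ops to the NPU/DSP via NNAPI;
        // anything unsupported falls back to the CPU automatically.
        addDelegate(NnApiDelegate())
    }
    return Interpreter(modelFile, options)
}
```

Whether that actually saves power depends on how much of the model the NNAPI driver accepts; if most ops fall back to the CPU, you lose the benefit.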