Sorry for being vague. I meant LoRA, but used Apple as an example because their demo showed the potential. At a conceptual level, you can fine-tune a base model to be good at a specific task, e.g. summarization, proofreading, or generation. These fine-tuned adapter weights sit on top of the frozen base weights and can be swapped for a different set as the task changes. Apple demoed this by showing how their model identifies the task and then loads the right set of fine-tuned weights. Apple calls them adapters because they come via LoRA (Low-Rank Adaptation). The technique has been around for a while, but only shot into prominence once people figured out how to use it well.
> Our foundation models are fine-tuned for users’ everyday activities, and can dynamically specialize themselves on-the-fly for the task at hand. We utilize adapters, small neural network modules that can be plugged into various layers of the pre-trained model, to fine-tune our models for specific tasks. For our models we adapt the attention matrices, the attention projection matrix, and the fully connected layers in the point-wise feedforward networks for a suitable set of the decoding layers of the transformer architecture.
Not Apple adapters per se, but LoRA adapters. It's a way of fine-tuning a model such that you keep the base weights unchanged, but also keep a smaller set of tuned weights that help on specific tasks.
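A minimal numpy sketch of the idea (toy sizes and random weights, purely illustrative): the base weight matrix stays frozen, each task gets a small low-rank pair (B, A), and "specializing" the model is just picking which pair to add in.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r = 8, 2  # hidden size, adapter rank (r << d)

# Frozen base weight matrix, shared by every task.
W = rng.standard_normal((d, d))

# One tiny adapter per hypothetical task: the effective weight is
# W + B @ A, but we only ever store the 2*d*r adapter numbers,
# not a full d*d copy of the model per task.
adapters = {
    "summarize": (rng.standard_normal((d, r)), rng.standard_normal((r, d))),
    "proofread": (rng.standard_normal((d, r)), rng.standard_normal((r, d))),
}

def forward(x, task=None):
    y = x @ W.T  # base model, unchanged for all tasks
    if task is not None:
        B, A = adapters[task]
        y = y + x @ (B @ A).T  # low-rank correction for this task
    return y

x = rng.standard_normal(d)
base_out = forward(x)
summ_out = forward(x, "summarize")
# Swapping tasks never touches W; only the small (B, A) pair changes.
```

Here each adapter is 2×8×2 = 32 numbers versus 64 for the full matrix; at real model scale the ratio is far more dramatic, which is why keeping many task adapters on-device is cheap.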
(Edit) Apple is using them in their Apple Intelligence, hence the association. But the technique was around before.