That's not quite right. The models are pretty bad at generating a proper diff, so there are two common formats used. The main one is a search and replace, and the search is then done in quite a fuzzy manner.
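For a rough idea of what "search and replace with a fuzzy search" can mean, here is a minimal sketch. It is not aider's actual code: the function name `fuzzy_replace`, the whitespace normalization, and the 0.8 threshold are all assumptions for illustration. It slides a window over the file, scores each window against the search block with `difflib.SequenceMatcher`, and swaps in the replacement at the best match.

```python
import difflib


def fuzzy_replace(source: str, search: str, replace: str, threshold: float = 0.8) -> str:
    """Replace the run of lines in `source` that best matches `search`.

    Hypothetical helper: the name, the normalization and the threshold
    are illustrative assumptions, not any tool's real implementation.
    """
    src_lines = source.splitlines()
    search_lines = search.splitlines()
    n = len(search_lines)
    if n == 0 or n > len(src_lines):
        raise ValueError("search block is empty or longer than the source")

    def normalize(lines):
        # Ignore leading/trailing whitespace so indentation slips still match.
        return "\n".join(line.strip() for line in lines)

    target = normalize(search_lines)
    best_ratio, best_start = 0.0, -1
    # Score every window of n consecutive lines against the search block.
    for start in range(len(src_lines) - n + 1):
        window = normalize(src_lines[start:start + n])
        ratio = difflib.SequenceMatcher(None, window, target).ratio()
        if ratio > best_ratio:
            best_ratio, best_start = ratio, start

    if best_ratio < threshold:
        raise ValueError("no sufficiently similar block found")

    # Splice in the replacement (a trailing newline in `source` is not preserved).
    new_lines = src_lines[:best_start] + replace.splitlines() + src_lines[best_start + n:]
    return "\n".join(new_lines)
```

So a search block with the wrong indentation or a missing space still lands on the right lines, while a block that resembles nothing in the file is rejected rather than applied somewhere random.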
To be clear, the diff they generate is something you or I could apply manually without noticing an issue. The problems are things like very minor whitespace issues or, more commonly, the count saying how large the sections are. Nothing that affects the meat of the diff: they're fine with the hard part, but then there are small counting errors.
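To make the counting problem concrete: in a unified diff each hunk starts with a header like `@@ -1,3 +1,3 @@`, where the numbers after the commas say how many lines the hunk covers on the old and new side. Those counts can be recomputed from the hunk body, so a bad header is repairable. A minimal sketch, assuming a single-file unified diff whose file headers come before the first hunk; `fix_hunk_counts` is a made-up name, and treating a completely empty line as a context line is also an assumption.

```python
import re

# Matches a unified-diff hunk header such as "@@ -22,8 +22,9 @@ optional text".
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,\d+)? \+(\d+)(?:,\d+)? @@(.*)$")


def render(header, old, new):
    """Rebuild a hunk header from its start lines and recomputed counts."""
    _, old_start, new_start, tail = header
    return f"@@ -{old_start},{old} +{new_start},{new} @@{tail}"


def fix_hunk_counts(diff_text: str) -> str:
    """Recompute the line counts in every hunk header (hypothetical helper).

    Assumes one file per diff, with the ---/+++ lines before the first hunk.
    """
    out = []
    header = None  # (index in out, old start, new start, trailing text)
    old = new = 0
    for line in diff_text.splitlines():
        m = HUNK_RE.match(line)
        if m:
            if header is not None:
                out[header[0]] = render(header, old, new)
            header = (len(out), m.group(1), m.group(2), m.group(3))
            old = new = 0
        elif header is not None:
            if line.startswith("-"):
                old += 1          # removed line: old side only
            elif line.startswith("+"):
                new += 1          # added line: new side only
            elif line.startswith("\\"):
                pass              # "\ No newline at end of file" marker
            else:
                old += 1          # context line (or bare empty line,
                new += 1          # assumed to be context): both sides
        out.append(line)
    if header is not None:
        out[header[0]] = render(header, old, new)
    return "\n".join(out)
```

The start line numbers are left alone here; only the counts are rewritten, which is the kind of slip described above.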
Thanks, I didn't know how to respond to this as I never diff or use patch, but I know what they look like (@22,8 -/+ sort or whatever), and aider was outputting the green and red lines in inverse video, the same way GitHub looks. It's a reasonable facsimile of "diff output", but I shouldn't have asserted it was diff output.
That's how aider commands the models to reply, for example.