There’s no need, though: the entry point and return address are unique. It’s literally code that is sliced out and jumped to, and it jumps straight back to the function it was cut out of. The only thing you’d need to save is a register or two if you can’t make the jump without doing some math.
x86 should always be able to jump directly (a `jmp rel32` reaches ±2 GiB), and AArch64 sets aside x16 and x17 (IP0/IP1, the intra-procedure-call scratch registers) just for this kind of math. But all jumps should be PC-relative, so you shouldn't have to clobber anything anyway.
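To make the "PC-relative, nothing clobbered" point concrete, here is a rough Python sketch that encodes a direct jump for both architectures. The function names and the addresses are made up for illustration; the encodings are the standard x86 `jmp rel32` (opcode `E9`) and AArch64 `B` (26-bit word offset). It only shows that the displacement lives in the instruction itself, so no register is needed while the target stays in range.

```python
import struct

def encode_x86_jmp_rel32(src, dst):
    """Encode x86 `jmp rel32` placed at address src, landing on dst."""
    # The displacement is measured from the end of the 5-byte instruction.
    disp = dst - (src + 5)
    if not -2**31 <= disp < 2**31:
        raise ValueError("target outside rel32 range")
    return b"\xE9" + struct.pack("<i", disp)

def encode_aarch64_b(src, dst):
    """Encode AArch64 `B` placed at address src, landing on dst."""
    # The offset is measured from the branch instruction itself.
    offset = dst - src
    if offset % 4:
        raise ValueError("AArch64 branch targets must be 4-byte aligned")
    if not -2**27 <= offset < 2**27:
        # Past +/-128 MiB you would need an indirect branch via x16/x17.
        raise ValueError("target outside B range")
    imm26 = (offset >> 2) & 0x3FFFFFF
    return struct.pack("<I", 0x14000000 | imm26)

# Hypothetical layout: hot code at 0x1000, outlined chunk at 0x2000.
print(encode_x86_jmp_rel32(0x1000, 0x2000).hex())  # jump out
print(encode_aarch64_b(0x2000, 0x1000).hex())      # jump back
```

The only case where x16/x17 come into play is the out-of-range branch, where the address has to be materialized in a register first.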