Monad and Functor came first; if I remember correctly, Applicative made its debut when Parsec became popular. For a long time, the hierarchy was »Functor ===> Applicative, Monad is independent«.
We made Applicative a superclass of Monad in 2015 (GHC 7.10), and since then `return` has been a historical artifact (with a dangerously misleading name).
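To make that concrete, here's a rough sketch of what the current hierarchy looks like from the instance-writing side. `Box` is a made-up toy type, not anything from base; the point is only that you now have to write the Applicative instance before the Monad one, and that you can leave `return` out entirely because it defaults to `pure`.

```haskell
-- Box is a hypothetical toy type, used here only for illustration.
newtype Box a = Box a deriving (Show, Eq)

instance Functor Box where
  fmap f (Box a) = Box (f a)

-- Since Applicative became a superclass of Monad, this instance
-- must exist before a Monad instance is accepted.
instance Applicative Box where
  pure = Box
  Box f <*> Box a = Box (f a)

instance Monad Box where
  -- No definition of `return` here: it defaults to `pure`.
  Box a >>= f = f a

main :: IO ()
main = do
  -- `return` and `pure` are the same function for Box.
  print (return 1 == (pure 1 :: Box Int))
  -- Binding still works as usual.
  print (Box 2 >>= \x -> pure (x + 1))
```

Also the reason the name is misleading: `return` doesn't exit anything, it just wraps a value, exactly like `pure`.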
ty for this history! I write Haskell at work, but with an honestly middling understanding of the category theory that goes into which typeclasses are superclasses of which, mostly because we use an overly verbose version of Prelude. So it's wild to me that these mathy bits were not always set in stone!
Part of the fun of being a research language is that you don't know everything up-front!