In cryptography, how an output was generated matters far more than the output itself, down to very precise details. Crypto code demands far more care about the "how" than other software does. This is not intuitive: ordinarily there are several ways to do something, you can compare implementations by their output, and it's usually fine to build something in a functional but unorthodox way (performance consequences aside).
As the parent comment here mentions, subtle mistakes in how the algorithm and its implementation arrive at the output can have extremely bad consequences, even if the output is correct.
If Alice sends Bob an encrypted message and the custom crypto provides an incorrect output, Bob can't decrypt the message. But if Alice sends Bob an encrypted message with the correct output generated in an unsafe way, there are many ways by which the secret key could theoretically be recovered, which is far worse than Bob not being able to read the message in the first place.
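To make "correct output, unsafe generation" concrete, here is a minimal sketch using textbook square-and-multiply modular exponentiation rather than AES. The function name, the toy modulus 1009, and the two 8-bit "secret" exponents are all made up for illustration, and the multiply counter is only a stand-in for what an attacker might infer from timing or power measurements.

```python
def modexp_leaky(base, exponent, modulus):
    """Right-to-left square-and-multiply. Returns (result, multiply_count).

    multiply_count is instrumentation for this sketch only: it stands in
    for information that could leak through timing or power measurements.
    """
    result = 1
    base %= modulus
    multiply_count = 0
    while exponent:
        if exponent & 1:  # branch taken only when the secret bit is 1
            result = (result * base) % modulus
            multiply_count += 1
        base = (base * base) % modulus
        exponent >>= 1
    return result, multiply_count


# Two hypothetical secret exponents of the same bit length.
sparse_secret = 0b10000001
dense_secret = 0b11111111

sparse_result, sparse_mults = modexp_leaky(5, sparse_secret, 1009)
dense_result, dense_mults = modexp_leaky(5, dense_secret, 1009)
```

Both results agree with Python's built-in `pow(5, secret, 1009)`, so an output comparison passes. The amount of work still differs with the number of 1 bits in the secret, which is exactly the kind of thing an output check cannot see.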
> "Just because an AES implementation matches the test vectors does not make it correct or safe."
Actually, this does make it correct. Whether or not it is safe depends on the application. For example, I sent my friend Bob some cyphertext that I calculated by hand with a pencil and paper using the AES algorithm. I sent it via my trusted courier Eve. It took four hours for me to do the calculation. At the last minute I started to second-guess my math, so I double-checked it against a respected crypto library. It was fine, so I handed it off to Eve. I am pretty sure that in this application my choice to calculate the answer by hand was exactly as safe as if I had just used the library. In fact, I am so sure (given that the answers were identical) that just as Eve was walking out the door, en route to Bob, I pulled her close and whispered in her ear: "Eve, be very careful with this cyphertext, it took me four hours to create it."
...what? Yes, if you compute by hand and compare with a known implementation, then it's likely your computation is correct. But this has nothing to do with test vectors. You could match all test vectors while still giving incorrect results for values not in the test vectors.
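A small sketch of that last point, using the "multiply by 2 in GF(2^8)" step that AES MixColumns relies on (often called xtime). The function names and the one-entry test vector set are assumptions for illustration; the value xtime(0x57) = 0xAE is the worked example from FIPS 197.

```python
def xtime_buggy(b):
    # Bug: forgets to reduce by the AES polynomial when the high bit is set.
    return (b << 1) & 0xFF


def xtime(b):
    # Reference version: shift, then XOR with 0x1B if the high bit was set.
    # Written without a data-dependent branch.
    return ((b << 1) & 0xFF) ^ (0x1B * (b >> 7))


# A hypothetical test vector set that happens to avoid the buggy path.
test_vectors = {0x57: 0xAE}

passes_vectors = all(xtime_buggy(x) == y for x, y in test_vectors.items())

# Exhaustive comparison over all 256 byte values.
mismatches = [b for b in range(256) if xtime_buggy(b) != xtime(b)]
```

Here the buggy version passes its test vectors and is still wrong for every input with the high bit set. For a one-byte function you can check all inputs exhaustively; for a full block cipher you cannot, so passing the published vectors is evidence of correctness, not proof.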
Here is an excellent visual depiction of several different AES implementations, many of which do not run in constant time:
https://cr.yp.to/mac/variability1.html
Implementing AES with variable-time, secret-dependent operations can leak the key.
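As a rough sketch of the structural difference, compare a lookup indexed directly by a secret value with one that touches every table entry and selects the result with a mask. The table below is a made-up placeholder, not the AES S-box, and the function names are hypothetical. Python gives no timing guarantees at all, so this only shows the shape of the technique; real constant-time code is written in C or assembly and checked against the actual hardware.

```python
# Hypothetical 256-entry table for illustration only, NOT the AES S-box.
TABLE = [(i * 7 + 3) & 0xFF for i in range(256)]


def lookup_direct(secret_index):
    # The memory address accessed depends on the secret, so cache
    # behavior (and therefore timing) can depend on the secret.
    return TABLE[secret_index]


def lookup_scan(secret_index):
    # Reads every entry in the same order regardless of the secret,
    # and keeps only the wanted one using a mask instead of a branch.
    result = 0
    for i in range(256):
        diff = i ^ secret_index
        # mask is 0xFF when diff == 0, otherwise 0x00
        mask = ((diff - 1) >> 8) & 0xFF
        result |= TABLE[i] & mask
    return result
```

Both functions return the same value for every index, so no output-based test can tell them apart. The difference is entirely in the access pattern, which is the thing the linked page visualizes.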
There are much bigger and more terrifying subtleties to implementing crypto correctly than simply matching the test vectors.