Can someone help me with this: why do people need the full history of the Ethereum blockchain? Shouldn't the last few valid blocks be enough?
Each block does not replay the entire state of the system. There could be an unspent output in a block from a year ago that is then spent in a transaction tomorrow. You need to track all of the valid inputs that might go into a transaction. You don't need the full history (i.e., you don't need spent outputs), but you certainly need a lot more than just the most recent N blocks.
In Ethereum, you can track the account balances; in Bitcoin, you can choose to track just the unspent transaction outputs. The decision is pretty arbitrary from the pruning perspective.
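To make the pruning point concrete, here is a minimal sketch of a UTXO set in Python. It is a toy, not how any real node is implemented: the transaction layout and the `apply_block` helper are made up for illustration, and it skips signatures, amounts checks, and fees. It only shows that an output from an old block has to stay in the set until something spends it, while spent outputs can be dropped.

```python
# Toy UTXO set: maps (txid, output_index) -> amount.
# The transaction format and all names here are hypothetical.

def apply_block(utxo_set, block):
    """Apply every transaction in a block to the UTXO set (mutates and returns it)."""
    for tx in block:
        # Each input must reference an output that is still unspent.
        for ref in tx["inputs"]:
            if ref not in utxo_set:
                raise ValueError(f"input {ref} is missing or already spent")
            del utxo_set[ref]  # spent outputs can be pruned
        # Each new output becomes spendable by a later transaction.
        for index, amount in enumerate(tx["outputs"]):
            utxo_set[(tx["txid"], index)] = amount
    return utxo_set


# An output created "a year ago" ...
old_block = [{"txid": "a", "inputs": [], "outputs": [50]}]
# ... is spent by a transaction "tomorrow".
new_block = [{"txid": "b", "inputs": [("a", 0)], "outputs": [30, 20]}]

utxo = apply_block({}, old_block)
utxo = apply_block(utxo, new_block)
print(utxo)
```

Applying `new_block` without having processed `old_block` first fails, because the node has no record of output `("a", 0)`. That is why the last N blocks alone are not enough.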
Is this for Bitcoin or for Ethereum? I thought the former has UTXOs and the latter has accounts.
So the issue is that, to validate a transaction, I would have to find the block in which the input was created and confirm all the blocks from that one to the current block?
You're correct, Ethereum uses account balances. You don't have to keep track of inputs and outputs, just check that the balance is sufficient.
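As a rough illustration of the account model, the check boils down to something like the Python sketch below. The `apply_transfer` function and the plain dict of balances are assumptions for the example; a real Ethereum client also checks signatures, nonces, and gas, none of which is modeled here.

```python
# Toy account-model ledger: maps address -> balance.
# Names are hypothetical; nonces, gas, and signatures are deliberately left out.

def apply_transfer(balances, sender, recipient, amount):
    """Move `amount` from sender to recipient if the sender can afford it."""
    if amount < 0:
        raise ValueError("amount must be non-negative")
    if balances.get(sender, 0) < amount:
        raise ValueError("insufficient balance")
    balances[sender] = balances.get(sender, 0) - amount
    balances[recipient] = balances.get(recipient, 0) + amount
    return balances


balances = {"alice": 10}
apply_transfer(balances, "alice", "bob", 4)
print(balances)
```

Note that the check only needs the current balances, not the transactions that produced them. It still needs the balance of every account, though, which is far more than the last few blocks contain.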
Since each block header has a Merkle root of the block's state, you can just get all the block headers to check PoW, get the current state, and then validate transactions and states as many blocks back as you feel you need to make sure a miner hasn't faked the current state.
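The header-only approach can be sketched roughly as follows. This is a toy under loud assumptions: SHA-256 stands in for Ethereum's real hashing and PoW, a flat hash of the sorted balances stands in for the Merkle Patricia trie root, and the header fields and helper names (`state_root`, `header_hash`, `mine`, `verify_chain`) are invented for the example.

```python
import hashlib
import json


def state_root(balances):
    """Stand-in for the state trie root: a hash of the sorted balances."""
    encoded = json.dumps(balances, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def header_hash(header):
    """Hash the header fields (toy encoding, not the real header format)."""
    data = f'{header["parent_hash"]}|{header["state_root"]}|{header["nonce"]}'
    return hashlib.sha256(data.encode()).hexdigest()


def mine(parent_hash, root, target):
    """Brute-force a nonce so the header hash falls below the target."""
    nonce = 0
    while True:
        header = {"parent_hash": parent_hash, "state_root": root, "nonce": nonce}
        if int(header_hash(header), 16) < target:
            return header
        nonce += 1


def verify_chain(headers, target):
    """Check PoW on every header and that each one links to its parent."""
    for i, header in enumerate(headers):
        if int(header_hash(header), 16) >= target:
            return False
        if i > 0 and header["parent_hash"] != header_hash(headers[i - 1]):
            return False
    return True


TARGET = 1 << 248  # very easy: roughly 1 in 256 hashes qualifies

genesis = mine("0" * 64, state_root({"alice": 10}), TARGET)
block1 = mine(header_hash(genesis), state_root({"alice": 6, "bob": 4}), TARGET)

# A state downloaded from a peer is accepted only if it matches the header.
downloaded_state = {"alice": 6, "bob": 4}
state_ok = state_root(downloaded_state) == block1["state_root"]
chain_ok = verify_chain([genesis, block1], TARGET)
print(chain_ok, state_ok)
```

Changing any field in an earlier header changes its hash, so the next header's `parent_hash` no longer matches and `verify_chain` rejects the chain. The headers alone don't prove the state transitions were valid, which is why you would still replay some number of recent blocks.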
Yeah, looks like I got the terminology wrong, but the concept is the same for these purposes. Whether you track the outputs or the account balances, there's a lot more information that you need to keep an accounting of than what you can glean from just the last N blocks. Which makes sense; it is a distributed ledger after all.
Two years ago, when I was first experimenting with Ethereum, the full history was necessary for extracting smart contract addresses, among other things. Now I'd likely use Etherscan or something similar, unless I was building something proprietary.