I don't know why, but this bad practice seems to be in place almost everywhere. In public transit, "live tracking" seems to be almost uniformly code for timetable tracking. In the UK I can stand and watch the live information for the train departing from Waterloo, and it'll say the train is on time to leave at 5:30. I can then go into their own app, identify the physical train that is going to terminate at Waterloo and become my train, and find out that it left its last station 20 minutes late. Clearly the system is driven by timetables that are then modified by manual entries, rather than by some actual underlying representation of the trains (which must exist for them to be able to manage their fleet).
UK Network Rail APIs are probably the best I've seen. There are ~10,000 real-time monitoring sensors on the network. Check out realtimetrains.co.uk, for example, which is driven by the STOMP feed from NR (which sends thousands of messages per second on the real-time location and status of each and every train in the UK).
Regarding your point on outbound trains: if a train is running late inbound, they will often have another one take over its duties, so identifying the inbound train isn't a perfect way to predict whether your departure will be late.
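To make the inbound-train reasoning concrete, here is a rough sketch of how a departure estimate could be derived from the inbound service's lateness. Everything in it is an assumption for illustration: the function name, the 5-minute minimum turnaround, and the times are made up, and it only holds when the same physical train works the outbound service, which (as noted above) is not guaranteed.

```python
from datetime import datetime, timedelta

def estimate_departure(scheduled_departure, inbound_expected_arrival,
                       min_turnaround=timedelta(minutes=5)):
    """Estimate the outbound departure time from the inbound arrival.

    Assumes the same unit works the outbound service and needs at least
    `min_turnaround` at the platform (a hypothetical figure). If another
    train is swapped in, this estimate no longer applies.
    """
    earliest_possible = inbound_expected_arrival + min_turnaround
    # A train never leaves before its booked time, so take the later of the two.
    return max(scheduled_departure, earliest_possible)

# Hypothetical example: the 17:30 departure is formed by an inbound service
# booked to arrive at 17:22 that is currently running 20 minutes late.
scheduled = datetime(2024, 1, 8, 17, 30)
inbound_booked_arrival = datetime(2024, 1, 8, 17, 22)
inbound_delay = timedelta(minutes=20)

estimate = estimate_departure(scheduled, inbound_booked_arrival + inbound_delay)
delay_minutes = int((estimate - scheduled).total_seconds() // 60)
print(f"Estimated departure {estimate:%H:%M}, about {delay_minutes} min late")
```

The `max` is the whole point: a timetable-driven display effectively shows only `scheduled_departure`, while an estimate fed by the inbound train's position can flag the delay before the booked time has passed.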
The background data may be great, but I regularly find myself standing at a station where the display tells me the train is on time when it is already late (that is, after the point the train should have left).
As a counterexample the company running public transport where I’m from actually released this just a few days ago. So it doesn’t seem to be an impossible task.