One way is to learn clusters on the data from the whole time period, and then calculate the distribution over those same clusters for each time interval separately. You can then track the proportion of each cluster over time as a time series.

So, for example, if you have a word cluster that describes computers, you could see it start growing in the 1970s, while having a near-zero proportion in the 19th century.
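
As a rough sketch of the second step, assuming each document already has a cluster label from a model fit on the whole period (any clustering method would do), the per-interval proportions could be computed like this. The function name `cluster_proportions_by_interval` and the toy data are made up for illustration:

```python
from collections import Counter, defaultdict


def cluster_proportions_by_interval(intervals, labels, n_clusters):
    """Return {interval: [share of cluster 0, ..., share of cluster n_clusters-1]}.

    `intervals[i]` is the time interval of document i (e.g. its decade) and
    `labels[i]` is its cluster label from a model fit on the whole period.
    """
    counts = defaultdict(Counter)
    for interval, label in zip(intervals, labels):
        counts[interval][label] += 1

    series = {}
    for interval in sorted(counts):
        total = sum(counts[interval].values())
        # Clusters absent from an interval get a proportion of 0.
        series[interval] = [counts[interval][k] / total for k in range(n_clusters)]
    return series


# Toy data (invented): decade of each document and its cluster label,
# where cluster 1 plays the role of the "computers" cluster.
decades = [1890, 1890, 1890, 1970, 1970, 1970, 1970, 1990, 1990]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 0]

series = cluster_proportions_by_interval(decades, labels, n_clusters=2)

# Time series of the "computers" cluster's share, ordered by decade.
computer_share = [series[d][1] for d in sorted(series)]
```

Each cluster's list of shares across the sorted intervals is then the time series you can plot or analyse.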