https://cloud.google.com/blog/topics/public-datasets/github-...
It explains the full pipeline: how to download, collect, and analyze this sort of data.