Hundreds of millions of stars turned into a map of GitHub projects (anvaka.github.io)
471 points by anvaka on May 13, 2023 | 85 comments



To me this is proof that doing a good visualization is really an art. I saw this on HN for a couple of hours but really didn't think much of it. As soon as I clicked through, the funny names and "continents" brought a smile to my face.

I guess my little project is too small to make it, but now I too aspire to join the great nation of Golandia.


Thank you!


"Homelabia" made me smile like a schoolboy.


I love these things - seeing everything sort of classified, and all at once, helps me find things I would never think to look for. Forgive me if this has been posted somewhere else, but if you want to see a Reddit map, here it is (maybe you found it already in their GitHub, but I just have it as a bookmark): https://anvaka.github.io/map-of-reddit/?x=18083.096950551575...


Yeah, I was just about to paste this and found your answer :P. Now we have another similar tool for GitHub, much more useful for many people. Kudos to the author!


thank you :)


This gives a really cool top-level view into the whole Chinese open-source community, which has always been very mysterious to me, because it's absolutely massive and almost entirely off doing its own thing due to the language barrier.


Indeed! A lot of amazing things happen there. A few are getting worldwide adoption (echarts is awesome if you are into frontend viz, and so is the element-plus library for Vue components). There are giant ML communities too.


Check out anvaka's other projects, there are gems: https://news.ycombinator.com/from?site=anvaka.github.io



This is amazing, the GitHub ham radio and hardware scenes are way bigger than I previously thought. Shout-out to OpenRTX and M17 along with all the awesome SDR applications!


Is Vue overrepresented in this? Seems like there would be a Reactistan island somewhere before Vue would show up.


Vue is a pretty big deal in the PHP/Laravel community. Evan, the creator, even gave a talk at Laracon a few months in, showing the hockey-stick growth after it started to get traction in the community.


True. Though I also expected to see "Reactistan" somewhere more explicitly. I wonder if it is scattered across the Fronterra continent?

I am also puzzled myself why the algorithm has separated Fronterra from another large island to the west, seemingly related to node and some other JS libraries. Decided to keep it all as is in case I'm missing some reason


My feeling is that the React community is at least double to triple the size of the Vue one, so yes, I feel React should definitely be its own island.


I'm embarrassed to admit that I read "stars" in the title and spent far too long trying to figure out how these were arranged into constellations.


This is absolutely awesome. Such a good job well done! The search is fast! The sidebar is a great feature. I also love how you highlight the repositories you search for and draw red lines between things.

I found my journal repositories in the "Land of Node" in "Frontartia". I am surprised by that because I didn't realise I was associated with the node community!

I am so impressed with your visualization. It is intuitive, and it is interesting to see the different communities of GitHub.


Thank you so much!

I'm puzzled by some countries in Frontartia's island too. I'm not sure why it was even pulled away, as if there is something I'm missing.


This is amazing! I love this effect of showing me that my little project that I was working on fairly solitarily is actually part of a community of other people doing similar work that I can reach out to/collaborate with etc!

What does it mean if I click on a repo and it shows 5-6 links to specific projects? Does that just mean the Jaccard similarity index was below a threshold?


Yes! I picked only the highest scores to form an edge in the relationship graph, typically a sigma (standard deviation) or two above the mean. So if there is a direct link between your project and others, the similarities are abnormally high.

Note that I'm not rendering direct links outside of the country yet, there might be more there. Will probably add a "focused" view to see those better
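
For anyone curious, here is a rough sketch of that thresholding idea in Python. It is not the project's actual code: the function names, the sample data, and the choice of the population standard deviation are all assumptions made for illustration. It scores each pair of repos by the Jaccard index of their stargazer sets and keeps only pairs scoring more than `sigmas` standard deviations above the mean.

```python
from itertools import combinations
from statistics import mean, pstdev


def jaccard(a, b):
    """Jaccard index: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def strong_edges(stargazers, sigmas=1.0):
    """Return (repo_a, repo_b, score) for abnormally similar pairs.

    `stargazers` maps a repo name to the set of users who starred it.
    A pair is kept when its score exceeds mean + sigmas * std of all
    pairwise scores (an assumed cutoff rule, for illustration only).
    """
    scores = {
        (a, b): jaccard(stargazers[a], stargazers[b])
        for a, b in combinations(sorted(stargazers), 2)
    }
    values = list(scores.values())
    cutoff = mean(values) + sigmas * pstdev(values)
    return [(a, b, s) for (a, b), s in scores.items() if s > cutoff]


# Hypothetical data: repo names and user ids are made up.
sample = {
    "repo/a": {"u1", "u2", "u3", "u4"},
    "repo/b": {"u1", "u2", "u3", "u5"},
    "repo/c": {"u6", "u7"},
    "repo/d": {"u7", "u8"},
}
edges = strong_edges(sample, sigmas=1.0)
```

With this toy data only the `repo/a` and `repo/b` pair clears the cutoff. Comparing every pair is quadratic in the number of repos, so a map of this size would need something smarter than this brute-force loop.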


This is great, not just nice to look at but useful too. Top work!

How it was constructed (also really interesting) is described at the repo:

https://github.com/anvaka/map-of-github#readme


thank you!


I'm quite unhappy with SBCL being located in Schemaria:

https://anvaka.github.io/map-of-github/#12/13.469/-8.175

It should be Lispaña.


ML-family languages are placed in Schemaria as well; apparently it is a catch-all for functional languages.


Should I rename Schemaria to Lispania?


No, as all/most/some functional languages (Haskell, OCaml, Purescript, Idris, Coq, ...) are there. Lambdania would be a possibility. Although Prolog and Forth are located there too.


The Scheme people would be unhappy then.


True... Scheming Lispania :)?


Parentensia


I like this one.


AILandia was big before, but now it has been growing at an astounding rate, almost surpassing Frontera.

Some say it's a cancer, others say it's inevitable and here to stay. Probably both are right.

Also, why is Swiftoria so big?


I've been watching the Changelog Nightly emails for a while - they summarize the most-starred repositories each day, and the growth of AI there (subjectively) seems to be even higher than that of frontend tech.


This is a beautiful project. It's useful, aesthetically pleasing, and based on a clever methodology. Absolutely love it. Great work.


Thank you so much for your compliment!


1. this is so cool. i'd love to see more names for the regions! 2. nit: one repo dear to my heart, dbt-labs/dbt-core, was previously dbt-labs/dbt as well as fishtownanalytics/dbt. they show up as unique nodes on your map. what's most interesting to me is that each name of this repo links to a different set of related repos?


Thank you!

The connections are inferred from stargazers. If they lead to a different set of related projects, it might be a sign that different groups of people starred different things at that time.

GitHub of course has many more dimensions than a flat 2D surface can show.


Is there a page that explains what the island names mean? I understand Minecraftria, but Fluttopia (and most others)? No clue.


it usually becomes clear when zooming in and looking at the examples. E.g. "Fluttopia" is Flutter-related repos.


Yes! Thank you

Note that some countries are missing their names because I didn't have enough knowledge to assign a name to them.

Here is my naming process if anyone wants to help with the rest: https://github.com/anvaka/map-of-github#country-names


very cool! how did you decide on the boundaries and placement of the continents?


From the project readme on GitHub:

> A lot of country labels were generated with help of ChatGPT. If you find something wrong, you can right click it, edit, and send a pull request - I'd be grateful.


I found many ClickHouse-related projects in a dedicated cluster - "House of Clicks" :)

But ClickHouse itself somehow appears in Kubernation, and ClickBench - in Datapolis. Nice names btw.


Thank you!

It could be a curse of popularity? Some projects are so popular that you can place them nearly anywhere, and they would still find a densely connected group of neighbors.


This is really neat! Are there any insights you gleaned from doing this that were unexpected?


I knew GitHub was not a tiny website, but I didn't imagine how big it actually is. Each of those dots is a giant part of someone's life.

There are a lot of interests that I didn't know existed. For example https://github.com/cat-milk/Anime-Girls-Holding-Programming-... - someone collects anime girls holding programming books.

https://github.com/tylertreat/Comcast - and here is someone who is amazing at coming up with funny project names =)


What else did you use to calculate similarities? When I tried something similar with HN co-commenters, I just got one big blob with dang and other prolific commenters in the middle. That was not very interesting or insightful.

How long did it take to calculate the Leiden clustering and the force-layout? Do you think it would be somehow possible to compute a force-layout of the whole graph?


Somebody has been doing a lot of Covid-19 plots in Gnu-R


This is really neat. Clustering of data like this makes for such interesting graphics.


Very cool!

If you work on a dynamic version that allows users to understand changes in the open-source topography over time and detect/predict new clusters, this could be a very powerful tool for investment intelligence.


Easier than maintaining a wiki!

https://wiki.thingsandstuff.org/Audio etc.


This is so fun, thanks for making it.

I bet that several of these regions have a common image in their readme (the python logo, the nix logo, etc). Imagine little flags popping out of each region...


haha, I love the flags idea :). I wish I had the prowess to implement it in a way that is visually appealing and not obscuring the map.

I would also love to have a giant octocat hugging the archipelago, with some radial gradient emitting inside of it. Alas my design-gl foo is not there yet


If you type a capital letter into the search box, it gets stuck on "downloading index" indefinitely.

I was waiting for about 10 minutes wondering how big these indexes were...


I also experienced this looking up DontShaveTheYak


Oof, sorry! This should be fixed soon thanks to https://github.com/anvaka/map-of-github/pull/1


IDK how I feel about Kubernation: Firecracker, Grafana, Localstack, Podman, Terraform, Vault?

Seems like a really overreaching power. Wait, never mind, that makes sense.


:D btw, if you find any country label that is missing or could be improved - right click it, change it, and then send a PR (it's two clicks away)


how exactly did you name and cluster these things? i love how you presented it and don't know of any existing quantitative methodology


Check this out https://github.com/anvaka/map-of-github#how-was-it-made - I gave a short overview there - does this help?


Very cool visualisation!

Though I noticed a couple of odd groupings - like that MicroPython was clustered in Arduinoria rather than the adjacent MicroPythonia... :P


This is frickin awesome. Made me really see how the programming community can come together and contribute to something powerful.


This reveals an interesting dichotomy in the project I work on, Hail. I think of Hail as a serverless workflow, relational, and linear algebra system most similar to Dask, BigQuery, Snowflake, Spark, etc., but this map is constructed from the perspective of the user, so Hail lands squarely in the world of bioinformatics.

Also neat to see how bioinformatics is such a split-brained community. They land next to R but are filled with Python projects.


The title is clickbaity in a very clever way, and I mean this as a compliment! I thought about what it says for almost a minute before clicking. Stars being made into GitHub projects? Would I find Orion and Alpha Centauri? I thought hard about how to pull this off.

Then I clicked and realized that it is the other way around. And this realization brought a smile to my face.


You are very kind and your compliment put a smile on my face too :)


Haha, very nice. I was able to easily locate multiple projects, even ones with fewer than 1k stars.


The Ruby world is part of an island labeled Art of node, west of Fronterra. Strange choice.


I agree. I'm not sure why they are together. Maybe all that tooling stuff that was happening in the early days?


Amazing way to see related projects in an environment you aren't familiar with.


This is very cool, and it was pleasant, albeit surprising, to see colour in Gamedonia!


Group names are 10/10


Damn that's good


Did it break under the load? I can't search


The data seems a bit dated, but this is really cool.

I know this because I moved visionmedia/debug to debug-js/debug in November of 2021.


I've got data from 2020 to end of March 2023


Ah okay, that makes more sense. Did you scrape it yourself? Incredible work by the way :)


Thank you! Google BigQuery has it readily available, so it probably took 10 minutes to fetch it =)

https://cloud.google.com/blog/topics/public-datasets/github-...


I recommend checking https://ghe.clickhouse.tech/

It explains the full pipeline - how to download, collect, and analyze this sort of data.


Reminds me of dotlan Eve maps


just came here to give it my upvote. That's so insanely cool!


Nice!!

But how do I zoom in/out?


Scroll wheel works for me.


Wow.. that’s cool


This is amazing


incredible and so inspiring. thank you!!


W'



