Simulation of the world's universities in a single unified graph (thinkaurelius.com)
28 points by okram on May 14, 2013 | 7 comments



I will probably get downvoted for this, but since when is a monopoly that makes you pay ~£80 (if I remember correctly) for an obligatory book a good guy?


Ha. The world works in mysterious ways -- and organizations are large with many heads. The ideas I see emanating from the Pearson team that Aurelius works with are along the lines of -- Why are there brick-and-mortar schools? Why is education just a short period of your life? Why does a teacher only teach < 100 students in one session? Why do tests still exist? Why can't software be the "office hours" that helps struggling students get back on track with the concepts at hand?

Many of the algorithms we are working on with them go into the arena of computationally supporting education.

"If you want to understand X, given your personal knowledge graph, you will first need to understand Y."

"The graph is detecting that student A is struggling with X concept."

"Teacher, given the knowledge of your incoming students, you should focus more on Y concept."

"Teacher, no tests needed, here is a ranking of all your students based on their comprehension of the material."

... hopefully the 80£ (and 80 lbs) textbook will be a thing of the past.


A 121 billion edge graph is too large to fit within the confines of a single machine. Fortunately, Titan/Cassandra is a distributed graph database able to represent a graph across a multi-machine cluster. The Amazon EC2 cluster utilized for the simulation was composed of 16 hi1.4xlarge machines...The 10 terabyte, 121 billion edge graph was loaded into the cluster in 1.48 days at a rate of approximately 1.2 million edges a second with 0 failed transactions.

How many machines can you add so that Titan continues to scale linearly? And have you run the benchmarks on Google Compute Engine to compare?


Note that the codebase used in this benchmark was just released -- Titan 0.3.1.

https://github.com/thinkaurelius/titan/wiki/Downloads


Amazing. What is the cost of renting such a cluster?


I forget the exact cost, but it was, along with various dry runs at a smaller scale, around $30k.


Approximately $63 per hour on Amazon EC2.
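
For a rough sense of how the two figures above relate, here is a back-of-the-envelope sketch. It assumes the $63 hourly rate covers the whole 16-machine cluster and uses the 1.48-day load time quoted from the article; the function names are made up for illustration, and the result says nothing about how the ~$30k was actually split between dry runs and the final simulation.

```python
# Back-of-the-envelope cluster cost arithmetic.
# Assumption: the $63/hour figure is the price of the entire cluster.

HOURLY_RATE_USD = 63.0    # quoted in this thread
LOAD_DAYS = 1.48          # load time quoted from the article
TOTAL_SPEND_USD = 30_000  # "around $30k", including smaller dry runs


def load_cost(rate_per_hour: float, days: float) -> float:
    """Cost of keeping the cluster up for the given number of days."""
    return rate_per_hour * days * 24


def equivalent_cluster_hours(total_spend: float, rate_per_hour: float) -> float:
    """Hours of full-cluster time that the total spend would buy."""
    return total_spend / rate_per_hour


if __name__ == "__main__":
    cost = load_cost(HOURLY_RATE_USD, LOAD_DAYS)
    hours = equivalent_cluster_hours(TOTAL_SPEND_USD, HOURLY_RATE_USD)
    print(f"Bulk load alone: about ${cost:,.0f}")
    print(f"$30k buys about {hours:,.0f} full-cluster hours")
```

Under those assumptions, the 1.48-day load on its own comes to a bit over $2,200, so most of the ~$30k would have gone to dry runs and the rest of the simulation.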



