MSR Asia is based in Beijing and is one of the fastest growing research outfits in the world. They're slowly showing up all over the place in the CS world, and this is just one of the examples.
Consider Microsoft Academic Search. It BLOWS AWAY Google Scholar, Citeseer etc. in terms of features. Once they attain coverage parity, there is no reason to use anything else.
Academic Search, Trinity, and similar projects out of MSR Asia seem to be having a direct impact on Bing. In just a few years, a barebones search engine (Live.com) has built an R&D infrastructure, with prototypes and production modules feeding into what is now a pretty rocking search engine. [Example of Academic Search integration with Bing: http://www.bing.com/search?q=donald+knuth , scroll to bottom -- page also shows Freebase integration in middle]
Ironically, Microsoft is playing David in the David vs Goliath story here, and the passion is showing in terms of the ecosystem of computing projects and products that are making it into Bing.
If you want to look outside of Bing, consider the Kinect effort. Did you know that the _hardware_ ships with mathematical models built at MSR, trained using Dryad? (Dryad is a MapReduce competitor out of MSR.) The training is the "secret sauce" and why you don't have to spend days calibrating the Kinect.
None of this is privileged information -- you just have to follow the hyperlinks :) The open source world is built on collaboration and sharing, and hence the "story" is the backbone of most work. But that doesn't mean other people don't have stories and passion!
Side note: If you go back to the Bill and Steve interview from D5, listen to Mr. Gates mention the tech of Kinect years before the product. It is interesting how much MSR jibes with that talk.
Searching http://www.bing.com/search?q=donald+knuth, I see the Freebase integration, but I don't see anything other than search results and ads at the bottom of the page. What am I looking for?
I would say that's Powerset's technology merged into Bing, not direct Freebase.
If you want a more practical application for Trinity look at http://research.microsoft.com/probase/ (linked at the bottom of the page). Also to quote the page:
Microsoft Bing’s AEther project now uses Trinity for managing AEther’s experimental data,
which consists of large number of workflows, and the evolutions among the workflows.
Trinity is the backend graph storage engine of AEther's workflow management system.
We are adding more functionalities, in particular,
subgraph matching and frequent subgraph mining, to support the project.
Microsoft has become a company that produces technological advances as a side-effect of their primary goal, which is making money. They are not really interested in sharing, or promoting innovation - they are interested in generating lock-in, collecting licensing fees, and stamping out the competition.
What I find truly disappointing is that there are so many brilliant, creative people letting their work be owned and controlled by companies like Microsoft.
Open source projects get admiration and passion because they are OPEN. The top priority is to share the discovery or creation and promote productivity and innovation among others.
Actually, my understanding of Microsoft Research is that their main focus is on promoting and sharing innovative ideas. An MSR researcher's performance is reviewed based on academic publications and the amount of impact their work has on its research community. If their work happens to help one of MS's products, then that is a bonus.
That's what I've been told at least by someone who works for MSR in Seattle. In fact, some of the most passionate people in my field of study (I'm a graduate student in electrical/computer engineering) work for MSR.
^^ this. I didn't know about MSR until I became a grad student in comp sci and started seeing the name pop up time and again in the literature, whether it was graphics, computer vision, AI, etc.
If you like computer science, you like MSR. (Plus, they gave the world F# and C#, which are an oasis in the world of enterprise development.)
That is so true. I also did not understand what the real world application around this technology would be. Any ideas there? Something that can be understood in not so layman terms :)
1) A distributed Wikipedia/DBpedia. Instead of fighting with deletionists, everyone would just run their own node of the graph database, and keep/merge/sort the changes they see fit.
2) Recommendation engines that are seeded with the user data, at the edge. For instance, instead of me having to upload my song library to Apple or having a last.fm scrobbler running, I would be able to have a "Genius" feature without ever having data leaving my computer. If I want to get my friends' recommendations, I could just add them as peers.
3) A medical expert-system that can analyze my medical records that I hold, instead of one central place. Instead of trusting something like Google Health or Microsoft HealthVault to keep the data safe, I can be the only one with access to it, and only talk to my doctor if this system triggers some sort of alarm.
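To make idea 2 a bit more concrete, here is a minimal sketch of a recommender that runs entirely on my own machine. It assumes each peer shares a plain list of song titles with me; the function name `recommend`, the Jaccard weighting, and the sample data are all made up for illustration and have nothing to do with how Genius, last.fm, or Trinity actually work.

```python
from collections import Counter


def recommend(my_library, peer_libraries, top_n=3):
    """Suggest songs that peers own and I don't, weighted by taste overlap.

    my_library:     iterable of song titles held locally
    peer_libraries: dict mapping a peer name to that peer's song titles
    """
    mine = set(my_library)
    scores = Counter()
    for peer_songs in peer_libraries.values():
        theirs = set(peer_songs)
        union = mine | theirs
        if not union:
            continue
        # Jaccard similarity: shared songs divided by all songs either of us has.
        similarity = len(mine & theirs) / len(union)
        for song in theirs - mine:
            scores[song] += similarity
    # Highest score first; ties broken alphabetically so results are stable.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    # Peers with no overlap contribute a score of 0, which we drop.
    return [song for song, score in ranked[:top_n] if score > 0]


if __name__ == "__main__":
    peers = {"alice": ["a", "b", "d"], "bob": ["x", "y"]}
    print(recommend(["a", "b", "c"], peers))  # prints ['d']
```

Nothing here leaves the local process: the only data exchanged is whatever song lists the peers chose to hand over.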
Yeah, they should have put up a video of the researcher against a milk-white background with inspirational music, and while they visualize a subgraph we get to hear the climax of a Coldplay song.
I don't get why the passion and a story matter, really. What I feel bad about is that they make these cool projects that rarely ship as part of a real product that people buy.
What use is this database if it's not going to make my search results better or make my computer be a little smarter...
MSR might just be the Bell Labs of this century! They already employ Tony Hoare (Quicksort), Neeraj Kayal (the K in AKS), Simon Peyton Jones (GHC). What a heady list!
This looks very interesting, although it seems like they are relying on a very high speed network to get around the latency issues inherent in sharding a graph database. (They mention partitioning, but not whether it occurs online).
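A rough way to see why the network matters: with naive sharding, most edges end up crossing machines, and every crossing is a network hop during a traversal. The sketch below assumes integer vertex ids and a simple modulo placement; both are my assumptions for illustration, not Trinity's actual partitioning scheme, which the page doesn't describe.

```python
def partition_of(vertex_id, num_machines):
    """Assign a vertex to a machine by simple modulo placement (an assumption)."""
    return vertex_id % num_machines


def cross_partition_fraction(edges, num_machines):
    """Return the fraction of edges whose endpoints live on different machines."""
    if not edges:
        return 0.0
    remote = sum(
        1
        for source, target in edges
        if partition_of(source, num_machines) != partition_of(target, num_machines)
    )
    return remote / len(edges)


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
    print(cross_partition_fraction(edges, 1))  # prints 0.0
    print(cross_partition_fraction(edges, 2))  # prints 0.75
```

A smarter partitioner would try to lower that fraction; a fast enough network makes it matter less, which seems to be the bet here.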
Related: my Third Year project is a graph database, written in Clojure, that loads its data lazily from configurable back ends (databases, caches, APIs). Now I have more evidence for my dissertation that this kind of tech is useful.
I wonder what their language for queries is going to look like. The only main graph language I've found is Gremlin, which used to be XQuery derived, but recently switched to chained objects.
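For anyone wondering what "loads its data lazily from configurable back ends" means in practice, here is a minimal sketch of the idea in Python rather than Clojure. The `LazyGraph` class and the callable-backend convention are hypothetical names chosen for this example, not the project's real design.

```python
class LazyGraph:
    """Graph whose adjacency lists are fetched from a back end on first use.

    backend: any callable taking a node id and returning an iterable of
             neighbour ids (it could wrap a database, a cache, or an API).
    """

    def __init__(self, backend):
        self._backend = backend
        self._cache = {}

    def neighbours(self, node_id):
        # Only hit the back end the first time a node is asked for.
        if node_id not in self._cache:
            self._cache[node_id] = list(self._backend(node_id))
        return self._cache[node_id]


if __name__ == "__main__":
    data = {"a": ["b", "c"], "b": ["c"]}
    fetched = []

    def dict_backend(node_id):
        fetched.append(node_id)  # record every real fetch
        return data.get(node_id, [])

    graph = LazyGraph(dict_backend)
    graph.neighbours("a")
    graph.neighbours("a")
    print(fetched)  # prints ['a']
```

Swapping the back end is just a matter of passing a different callable, which is the "configurable" part.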
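By "chained objects" I mean queries built by calling one step after another on a traversal object. Here is a toy sketch of that style in Python; the `Traversal` class and its `out`, `dedup`, and `to_list` methods are invented for this example and should not be read as Gremlin's real API or as anything Trinity exposes.

```python
class Traversal:
    """Toy fluent traversal over a list of (source, label, target) edges."""

    def __init__(self, edges, current):
        self._edges = edges
        self._current = list(current)

    def out(self, label):
        """Follow outgoing edges with the given label from every current node."""
        reached = [
            target
            for node in self._current
            for (source, edge_label, target) in self._edges
            if source == node and edge_label == label
        ]
        return Traversal(self._edges, reached)

    def dedup(self):
        """Drop repeated nodes while keeping first-seen order."""
        return Traversal(self._edges, dict.fromkeys(self._current))

    def to_list(self):
        return list(self._current)


if __name__ == "__main__":
    edges = [
        ("alice", "knows", "bob"),
        ("alice", "knows", "carol"),
        ("bob", "knows", "dave"),
        ("carol", "knows", "dave"),
        ("alice", "likes", "jazz"),
    ]
    friends_of_friends = Traversal(edges, ["alice"]).out("knows").out("knows")
    print(friends_of_friends.to_list())          # prints ['dave', 'dave']
    print(friends_of_friends.dedup().to_list())  # prints ['dave']
```

Each step returns a new traversal, so a query reads left to right as a path through the graph.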
It's a company talking about an internal tech (in comparison to another company's internal tech, no less.) They didn't release it. It is completely useless from a hacker's POV.