I think there is a fair chance that much of the semantic web dream (i.e. "the internet organizing itself") will arrive pretty suddenly, with exponential-like growth.
There has been a very long period of ramping up, experimentation and hype, but it may only take a few good tools, some well-written blog posts and some serious incentives (e.g. Yahoo's embrace) to tip it from dream to reality.
This announcement may not actually reflect the tipping point, but it's certainly a good sign, and a major step in the right direction.
'I think there is a fair chance that much of the semantic web dream (i.e. "the internet organizing itself") will arrive pretty suddenly, with exponential-like growth.'
I'm skeptical. It seems like you'd have to either solve some pretty tricky AI problems, or get people writing for the internet to do all this extra work. It might still feature in some larger, professionally developed sites, but I'd be surprised if it took off.
Well, there is a pretty big incentive. Who wouldn't want their SERP real estate to have more attractive, more valuable information? Seems like a new kind of SEO opportunity for companies and freelance developers.
Actually the embrace of big search engines might be the death-kiss for the semantic web, as blackhat SEO will game the data and the little valuable content that exists will be drowned by junk & spam.
Structured data exists in data silos more commonly known as relational databases. This ramp-up will really only occur once SQL servers start speaking SPARQL.
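To make the "SQL servers speaking SPARQL" idea a bit more concrete, here is a minimal sketch of how rows in a relational table could be exposed as subject-predicate-object triples, which is the shape of data that SPARQL queries over. The `product` table, its columns, the `rows_to_triples` helper and the `http://example.org/` prefix are all made up for illustration; this is not how any particular database or mapping standard actually does it.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical URI prefix, not a real vocabulary


def rows_to_triples(conn, table, key_column):
    """Yield one (subject, predicate, object) triple per non-key, non-NULL cell."""
    # The table name is interpolated directly: fine for a sketch with trusted
    # input, not something to do with user-supplied names.
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        # The primary key identifies the row, so it becomes the subject URI.
        subject = f"{BASE}{table}/{record[key_column]}"
        for column, value in record.items():
            if column == key_column or value is None:
                continue
            # Each remaining column becomes a predicate; the cell is the object.
            yield (subject, f"{BASE}{table}#{column}", value)


# Invented sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.execute("INSERT INTO product VALUES (1, 'Widget', 9.99)")

triples = list(rows_to_triples(conn, "product", "id"))
for triple in triples:
    print(triple)
```

The point of the sketch is only that the silo already holds the structure; what is missing is a common way to name things and query across silos.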
Yahoo has really been impressing me in the last couple weeks. First the Open Search Platform and now this. I guess all it takes is the threat of a hostile takeover to get your productivity going.
I really hope they can stave off Microsoft, because I don't see the big M$ attempting to innovate like this if they took control.
Standardization and openness do not benefit the dominant player (or players, in close competition) in a market.
Standardization and openness tend to benefit all players that are not dominant in a market, by commoditizing the market and forcing value-based competition.
If the majority of a market is locked up by the dominant player(s), standardization tends to not happen.
If the majority of the market is not locked up by the dominant player(s), things tend towards standardization.
As it pertains to Yahoo, they're still an internet powerhouse, but their markets are eroding fast enough that they are probably falling out of the dominant category into the leader-of-the-non-dominant-pack category. Yahoo does some stuff very well, and I suspect commoditization of parts of web-space would be in their favor. The question is whether they'll be able to push hard enough, fast enough, to make some of this stuff catch on while people still care.
Now, if the dominant player in a market was "foolish" enough to joyfully embrace standards even if their competitors weren't adopting standards very quickly, would they necessarily lose their "edge" over competitors, considerations of benefitting the market as a whole aside?
Well, one classic example: IBM and the PC market (where standardization helped) vs. Apple and the Mac clones (where standardization almost wiped Apple out).
Here's some more theorizing: I think it would depend on how saturated the market was. If standardization caused the market to grow such that a smaller percentage of a bigger market was larger than a large percentage of a small market, the company could still benefit from standardization, but they'd also have to diversify in that time so that when saturation approached they wouldn't get stuck in a commodity market while their competitors caught up.
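To put rough numbers on the "smaller percentage of a bigger market" argument, here is a small sketch. Every figure in it is invented purely for illustration, and `revenue` and `break_even_growth` are hypothetical helper names, not anything from a real model.

```python
def revenue(market_size, share):
    """Company revenue: total market size times the company's fractional share."""
    return market_size * share


def break_even_growth(share_before, share_after):
    """Factor by which the market must grow for revenue to stay flat
    when the company's share drops from share_before to share_after."""
    return share_before / share_after


# Invented figures: a dominant player in a small, closed market...
before = revenue(market_size=100_000_000, share=0.60)
# ...versus a smaller slice of a market that standardization made four times bigger.
after = revenue(market_size=400_000_000, share=0.25)

growth_needed = break_even_growth(0.60, 0.25)

print(f"before standardization: {before:,.0f}")
print(f"after standardization:  {after:,.0f}")
print(f"market must grow at least {growth_needed:.1f}x to break even")
```

With these made-up numbers the company goes from roughly 60 million to roughly 100 million despite losing more than half its share, because the market grew 4x against a break-even of about 2.4x. Once the market stops growing, the same arithmetic runs in reverse, which is where the diversification point comes in.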
Nope, just being typically over-analytical. But I suspected that there are people who have written better-reasoned versions of this. Noted for my list of people to look up. :-) I started reading my first business books since freshman accounting about a month ago.
I don't see the big M$ attempting to innovate like this
Yahoo's recent openness binge is so strategically opposite to Microsoft that I wonder if they aren't doing this in part to impede the takeover, or as a sort of post-takeover insurance.
You're anthropomorphizing. Giant corporations don't react at the human time scale. Anything you see coming out of the corp today has been under development for months, if not years.
A soon-to-be-ex-colleague wrote a good piece on this recently.
Exactly.
"In fact, the gain from the Semantic Web comes much before [AI]. Maybe [in 2001] we should have written about enterprise and intra-enterprise data integration and scientific data integration. So I think data integration is the name of the game...What we should realize is that the return on investment will come much earlier when we just have got this interoperable data that we can query over." - Tim Berners-Lee, Feb 7 2008, interview (http://talis-podcasts.s3.amazonaws.com/twt20080207_TimBL.htm...)
"Trying to use the Semantic Web without SPARQL is like trying to use a relational database without SQL." - Tim Berners-Lee, W3C Director, 15 Jan 2008 Press Release (http://www.w3.org/2007/12/sparql-pressrelease)
"SPARQL's focus on querying the data models saves time for developers; there's no need for a host of little Web services to retrieve different aspects of the state of a system. This allows the user of the SPARQL endpoint to ask any question -- it is as though they could design their own interface instead of having to work with a limited set of fixed services." - Lee Feigenbaum, Chair of the RDF Data Access Working Group (i.b.i.d.)
I've looked carefully at their logic and epistemology, and what I found had many problems. Their data is full of contradictions, so they had to come up with a way to focus on subsets of the data to get a meaningful answer from a query. The reason for the contradictions, as far as I can tell, is their lack of a consistent model for what a concept is vs. what a word is, etc. Also missing seems to be the formal definition functionality necessary to make the model work, such as an unambiguous genus for a concept, and an unambiguous differentia of that genus reducible to the form of a logical formula. In other words, you should be able to take a genus, run the differentia formula on it, and get out only the instances of the concept that you're looking at. The contradictions are so deep that adding information was slowed significantly (I think I read this in some of their articles). So I think the CYC approach is simply the wrong one, not that some interesting information can't be harvested from their data and deployed in the right way.
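As a rough illustration of the "take a genus, run the differentia formula on it" idea, here is a minimal sketch that treats a genus as a set of members and a differentia as a boolean predicate over them. The polygon data and the `instances` helper are invented for the example; this is only one possible reading of the idea and has nothing to do with how CYC actually represents concepts.

```python
def instances(genus, differentia):
    """Return the members of the genus that satisfy the differentia predicate."""
    return {member for member in genus if differentia(member)}


# Invented toy data: each polygon name mapped to its number of sides.
SIDES = {"triangle": 3, "square": 4, "rectangle": 4, "pentagon": 5}

# Genus: polygons. Differentia: "has exactly four sides".
polygons = set(SIDES)
quadrilaterals = instances(polygons, lambda name: SIDES[name] == 4)

print(sorted(quadrilaterals))
```

Under this reading, a definition is unambiguous when the genus is a single well-defined set and the differentia is a formula that returns true or false for every member, with no appeal to how the word happens to be used.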
I often wonder whether the semantic web is really Web 3.0. Shouldn't it be more like Web 2.5?
(Or maybe we should drop the revision numbers altogether?)
Seriously, I wonder if we shouldn't be thinking bigger. Adding "meaning" to web pages is important, but it seems like a smaller goal on the way to, I dunno, maybe the realization of The Metaverse (à la Stephenson). Or something big like that.