
They’re in wikitext, which looks to be considerably less semantic than the crawled data. I’m not sure that’s the reason, but it could be a reason.



I'd say that's not the reason, since the wikitext is pretty semantic. The wiki source of https://en.wiktionary.org/wiki/subbureau#English is:

  ==English==

  ===Etymology===
  {{prefix|en|sub|bureau}}

  ===Noun===
  {{en-noun|s|subbureaux}}

  # A [[district]]-level public security bureau in [[China]].
So as long as one can parse wikitext, it's split up pretty well!
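
To make that concrete, here's a rough sketch of pulling the structure out of exactly that snippet with a few regexes. The function name `parse_sections` and the dict layout are made up for illustration, and this only handles flat, non-nested `{{...}}` templates and simple `[[...]]` links. Real wikitext has nested templates, HTML, and plenty of other constructs, so a proper parser library is the better choice for anything serious.

```python
import re

# The wikitext quoted above, verbatim.
WIKITEXT = """\
==English==

===Etymology===
{{prefix|en|sub|bureau}}

===Noun===
{{en-noun|s|subbureaux}}

# A [[district]]-level public security bureau in [[China]].
"""

# Assumption: headings are "==Title==" through "======Title======" on their own line.
HEADING = re.compile(r"^(={2,6})\s*(.+?)\s*\1\s*$")
# Assumption: templates are not nested (no "{{" inside "{{...}}").
TEMPLATE = re.compile(r"\{\{([^{}]*)\}\}")
# Matches [[target]] and [[target|display text]].
LINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def parse_sections(text):
    """Split wikitext into a flat list of sections (hypothetical helper)."""
    sections = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        heading = HEADING.match(line)
        if heading:
            current = {
                "level": len(heading.group(1)),
                "title": heading.group(2),
                "templates": [],
                "definitions": [],
            }
            sections.append(current)
            continue
        if current is None:
            continue  # ignore anything before the first heading
        # Each template becomes [name, arg1, arg2, ...].
        for body in TEMPLATE.findall(line):
            current["templates"].append(body.split("|"))
        # "# ..." lines are numbered definitions; replace links with their text.
        if line.startswith("# "):
            plain = LINK.sub(lambda lm: lm.group(2) or lm.group(1), line[2:])
            current["definitions"].append(plain)
    return sections


for section in parse_sections(WIKITEXT):
    print(section)
```

The language, the etymology (prefix "sub" + "bureau"), the part of speech, the plural, and the definition all come out as separate fields, which is the sense in which the source is already fairly semantic.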



