lambda on Dec 27, 2011 | on: GoDaddy Responds To Namecheap Accusations, Removes...

The Whois protocol (originally RFC 812, currently RFC 3912) does not specify the format of the result. The protocol itself is incredibly simple: you connect to port 43 via TCP, send a query terminated by a CRLF, and the server sends back a plain-text, human-readable response and closes the connection.

Now, many registrars use a format with lines that look like "Name: Value". But not all do. Those that do will frequently include other text, such as terms of service for their WHOIS service or instructions for using it. Others don't use a "Name: Value" format at all, or they allow multi-line indented values, or whatnot. Some seem to use "%" to mark informational lines, as if it were a comment character, while others don't bother to mark them at all.

So parsing whois output is more like scraping the web than like parsing a real file format. It can break sometimes.
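To make that concrete, here is a rough Python sketch of both halves: the query over port 43, and a best-effort parser for the "Name: Value" style. The function names `whois_query` and `parse_loose` are made up for illustration, and the parsing rules ("%" marks a comment line, an indented line continues the previous value) are assumptions drawn from the conventions described above, not anything the protocol guarantees.

```python
import socket
from collections import defaultdict


def whois_query(domain: str, server: str, port: int = 43, timeout: float = 10.0) -> str:
    """Send one WHOIS query and return the raw text response."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        # The query is a single line terminated by CRLF.
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        # The server ends the response by closing the connection.
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # The protocol does not fix an encoding, so decode leniently.
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_loose(text: str, comment_prefixes: tuple = ("%",)) -> dict:
    """Best-effort extraction of "Name: Value" pairs (heuristic, will miss cases)."""
    fields = defaultdict(list)
    last_key = None
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines and lines that look like comments (assumed convention).
        if not stripped or stripped.startswith(comment_prefixes):
            last_key = None
            continue
        # Assumption: an indented line right after a field continues its value.
        if line[0].isspace() and last_key is not None:
            prev = fields[last_key][-1]
            fields[last_key][-1] = f"{prev}\n{stripped}" if prev else stripped
            continue
        name, sep, value = line.partition(":")
        if sep and name.strip():
            last_key = name.strip().lower()
            # Repeated names (e.g. several name servers) accumulate in a list.
            fields[last_key].append(value.strip())
        else:
            # Free text such as terms of service: ignore it.
            last_key = None
    return dict(fields)
```

You would call `whois_query("example.com", server)` with whatever WHOIS server is responsible for the domain, then pass the text to `parse_loose`. Servers that indent every field, or that use some other layout entirely, will defeat the continuation heuristic, which is exactly the scraping-like fragility described above.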

rhizome on Dec 27, 2011

I understand, and while I do have experience in text processing, I would not want to take on a project like this. The number of people responding about this raises the question, though: why is this being reinvented so often? My impression is that everybody writes their own parser, which smells like NIH syndrome.
