Spell check your entire website in one go (spellr.us)
58 points by zemaj on March 23, 2009 | 27 comments


Been a while since I've been to news.yc, but I've been reading the RSS feeds more recently, so thought I should pimp my new job at some stage :)

Feedback welcome!


Crawling should respect the excluded pages in robots.txt and rel="nofollow" directives.


That's a tough one. This has been much debated, but at the end of the day we decided to go with our own filtering system since robots.txt and rel="nofollow" are created with search engines in mind, not spell checking. In general you want search engines to cover a subset of what you want spell checked.

You can filter on any CSS3 selector & a[rel="nofollow"] would block no-follow links if you don't want them included. By default we filter out any block with a class or an id that contains "comment" which does a pretty good job of filtering a lot of user comments.
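
For readers wondering what the default "comment" filter described above might look like, here is a minimal sketch using only Python's standard library. It is not spellr.us's actual code: the names `CommentFilteringExtractor` and `text_to_spell_check` are hypothetical, it covers only the "class or id contains comment" default rather than arbitrary CSS3 selectors, and it assumes reasonably well-nested markup.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class CommentFilteringExtractor(HTMLParser):
    """Collect page text, skipping subtrees that look like user comments.

    Hypothetical helper: a subtree is skipped when its root element has a
    class or id containing `needle` (default "comment").
    """

    def __init__(self, needle="comment"):
        super().__init__(convert_charrefs=True)
        self.needle = needle
        self.skip_depth = 0  # > 0 while inside a filtered subtree
        self.chunks = []

    def _is_filtered(self, attrs):
        for name, value in attrs:
            if name in ("class", "id") and value and self.needle in value.lower():
                return True
        return False

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # Once inside a filtered subtree, every nested element deepens it.
        if self.skip_depth or self._is_filtered(attrs):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def text_to_spell_check(html, needle="comment"):
    """Return the text that would be passed on to a spell checker."""
    parser = CommentFilteringExtractor(needle)
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Called on a page whose comment section sits in `<div class="user-comments">`, `text_to_spell_check` returns only the text outside that block. A real implementation would need a proper selector engine and tolerance for unclosed tags.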


You could always just have an option for it.

I feel people would trust your tool more if they are able to tell it what to ignore.


That was the intention of choosing CSS selectors over robots.txt or nofollow, although a checkbox to specify following robots.txt & nofollow would be useful as you say.
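
A checkbox like that could be sketched roughly as follows, using Python's standard `urllib.robotparser` and `html.parser`. This is only an illustration under stated assumptions: `LinkCollector`, `links_to_crawl`, the `respect_rules` flag, and the `example-spellbot` user agent are all made-up names, not anything the service is known to use, and the robots.txt text is passed in rather than fetched.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


class LinkCollector(HTMLParser):
    """Collect absolute link targets, optionally skipping rel="nofollow" links."""

    def __init__(self, base_url, skip_nofollow):
        super().__init__()
        self.base_url = base_url
        self.skip_nofollow = skip_nofollow
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        # rel is a space-separated token list, e.g. rel="external nofollow".
        rel_tokens = (attrs.get("rel") or "").lower().split()
        if self.skip_nofollow and "nofollow" in rel_tokens:
            return
        self.links.append(urljoin(self.base_url, href))


def links_to_crawl(page_url, html, robots_txt, respect_rules=True,
                   user_agent="example-spellbot"):
    """Return links to visit next; `respect_rules` plays the role of the checkbox."""
    collector = LinkCollector(page_url, skip_nofollow=respect_rules)
    collector.feed(html)
    collector.close()
    if not respect_rules:
        return collector.links
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())  # parse rules already downloaded
    return [url for url in collector.links if robots.can_fetch(user_agent, url)]
```

With `respect_rules=True` the crawler drops nofollow links and anything robots.txt disallows; with `respect_rules=False` it falls back to following every link, leaving the exclusions to the CSS selector filters.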


Which lib are you using for selector filtering?


Good one and complements http://www.sitereportcard.com/


It's a little bit silly that it says "The first 100 pages have been reviewed …" when the site I submitted only has a handful of pages.


This is wonderful. I scanned my website and found two spelling errors. I did not know "seeked" was not a word. It is supposed to be "sought".


The five things it found on my website (only five because, as zemaj explains, that's all they list if you use the quick scanner on their front page instead of creating an account) were all false positives. (Two non-English words, both in the middle of lengthy tracts of non-English; it seems to me that there might be useful heuristics for spotting this situation. One phonetic explanation of how to pronounce something; not much they could do about that. One colloquial neologism; not much they could do about that. One perfectly correct but uncommon word.)

I bet there are genuine typos, at least, on some of my pages. I find one every now and then. So whatever heuristics "spellFOCUS" is using to distinguish errors from non-errors seem like maybe they could be improved.

Nice interface.

The pay-for-service prices seem awfully high to me, but I'm not the target market for several different reasons.
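
One possible heuristic for the "lengthy tracts of non-English" case mentioned above: only flag an unknown word when most of its neighbours are known words. The sketch below is an assumption about how that could work, not a description of what "spellFOCUS" does; the function name, window size, and threshold are all invented for illustration.

```python
def likely_misspellings(words, dictionary, window=5, max_unknown_ratio=0.5):
    """Flag unknown words, except those surrounded mostly by other unknown words.

    words: list of tokens in page order.
    dictionary: set of known lowercase words.
    window: how many neighbours to inspect on each side (illustrative default).
    max_unknown_ratio: above this share of unknown neighbours, the word is
        assumed to sit in a foreign-language passage and is not flagged.
    """
    unknown = [w.lower() not in dictionary for w in words]
    flagged = []
    for i, word in enumerate(words):
        if not unknown[i]:
            continue
        lo = max(0, i - window)
        hi = min(len(words), i + window + 1)
        neighbours = unknown[lo:i] + unknown[i + 1:hi]
        ratio = sum(neighbours) / len(neighbours) if neighbours else 0.0
        if ratio <= max_unknown_ratio:
            flagged.append(word)
    return flagged
```

An isolated typo in an English sentence has mostly known neighbours and is flagged, while a word in the middle of a foreign passage is surrounded by other unknown words and is skipped. It would not help with the other false positives listed (neologisms, phonetic spellings, rare but correct words).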


Should "Your site has queued." be "Your site has been queued."?


Check http://www.grammarr.us ... just kidding, coming soon...


I think "queue" could arguably be used as a verb.


Both of the OP's examples are verb uses of "queue". I believe the question here is whether to use active or passive voice.

IMO the passive construction ("has been queued") sounds more natural, because you wouldn't normally think of an inanimate object like a website queuing itself in a line. But it could go either way, as "queue" technically can act as either a transitive or an intransitive verb.


Immensely useful. Well done.


I use it every week and find it extremely powerful. The custom dictionary and the ability to highlight exactly what is wrong are two key differentiating features. A grammar checker add-on would also be very useful.


Unleashing it on YouTube now, I have no high hopes.


Ok well after letting it go on several websites known for butchering the English language, it seems that the results have been capped at 5 for the free version.

This is either a bug, or an undocumented feature.


You're right, that totally wasn't clear from the text we had on there. Thanks for pointing that out. I've updated the copy so it makes more sense now. The quick scan on the homepage only lists the first 5 errors. You need to create an account to see the full list of errors.


Anytime.

I've got a couple suggestions for further things you could do with the tool. Should I just send that via the email on the contact page?


That'd be great, thanks!


It would be nice to have the ranking of errors by likelihood on my desktop, but only being able to spell check my site (where everything has already been through a desktop spell checker) is not much use to me.

It might be useful for user generated content (I assume that is what they mean by "multiple content contributors cannot be easily identified by the website content manager"), but then, should you be editing everything your users get wrong?

On my site the free scan found 0 likely errors, 1 possible error that is an accepted shortened form in the context, and four unlikely errors. The unlikely ones were my surname, the name of the site, a correct (if not often used) plural form and a common abbreviation.


Shameless plug but I'm working on the user generated content side at http://www.afterthedeadline.com


awesome, found a couple on my site

i have always looked for a service like this


Great implementation, I especially like how it visually shows me the error on my page. It would be nice if I could add the words it thinks are errors to my custom dictionary, so I don't have to type each one in.


Very nicely done. I would add 404 and other error checking as a premium feature if you haven't already done so.


I hate the name, but it found a spelling error on my site. Cool! "Langrangian" != "Lagrangian"



