Filter Hacker News by tags (70K past URLs and hourly sync) (archfinch.com)
70 points by drx on May 23, 2011 | 18 comments



Filter old and new Hacker News links by tags.

I've gone through old Hacker News links and tagged them. Also, every hour, new HN posts are synced to Archfinch.

Only posts with at least 10 points are shown. Tags are pulled automatically from the Delicious API (no content-based autotagging is involved). Users can also tag things manually (and they do).

You can follow and block specific tags, e.g. you can follow linux, python and math and block techcrunch, facebook and iphone.
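A minimal sketch of how follow/block filtering like this could work, assuming a story is visible when it meets the 10-point threshold, has at least one followed tag, and has no blocked tag. The `Story` class, the `filter_stories` function, and the sample stories are made up for illustration; the rule that a blocked tag overrides a followed one is also an assumption, not something Archfinch is confirmed to do.

```python
from dataclasses import dataclass, field

MIN_POINTS = 10  # the point threshold mentioned in the post


@dataclass
class Story:
    # Hypothetical record type, not Archfinch's actual data model.
    title: str
    points: int
    tags: set = field(default_factory=set)


def filter_stories(stories, followed, blocked, min_points=MIN_POINTS):
    """Keep stories that meet the point threshold, carry at least one
    followed tag, and carry no blocked tag (assumed: blocks win)."""
    followed, blocked = set(followed), set(blocked)
    return [
        s for s in stories
        if s.points >= min_points
        and s.tags & followed
        and not (s.tags & blocked)
    ]


# Invented sample data, for illustration only.
stories = [
    Story("A Python release", 120, {"python", "programming"}),
    Story("A Linux story that is also about Facebook", 80, {"linux", "facebook"}),
    Story("A small math puzzle", 4, {"math"}),
]

visible = filter_stories(
    stories,
    followed={"linux", "python", "math"},
    blocked={"techcrunch", "facebook", "iphone"},
)
print([s.title for s in visible])
```

Here the second story is dropped because of its blocked tag and the third because it has fewer than 10 points.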

Some highlights:

http://archfinch.com/tags/ycstartup - all YC company news

http://archfinch.com/tags/hn/videos - videos that appear on HN (automatically embedded)

http://archfinch.com/tags/hn/pics - same as above, except pics

http://archfinch.com/tags/programming/history/lisp - an example of how deep the tags go

Bonus: some stories have wiki-style summaries


Do you use anything other than Delicious tagging, e.g. some other content extraction and keyword tagging mechanisms?


No. From brief testing I found such automatic systems to be unreliable.

However, perhaps I could add tagging suggestions based on content extraction / keywords.


Seems like just looking at tf-idf would be useful. Could result in some false positive tags, but this shouldn't actually break the system, just perhaps add a little noise to the tag list.
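For what it's worth, the tf-idf idea could be sketched roughly like this, using only the standard library. It scores each term in one document by term frequency times the log of inverse document frequency and suggests the top-scoring terms as candidate tags. The function names, the tokenizer, and the three sample documents are assumptions made up for illustration; a real system would need stop-word handling and a much larger corpus.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Very crude tokenizer: lowercase alphabetic runs only.
    return re.findall(r"[a-z]+", text.lower())


def suggest_tags(docs, doc_index, top_n=3):
    """Return up to top_n terms of docs[doc_index] ranked by tf-idf."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    tf = Counter(tokenized[doc_index])
    total = sum(tf.values())
    scores = {
        term: (count / total) * math.log(n_docs / df[term])
        for term, count in tf.items()
    }

    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    # Terms that occur in every document score 0 and are not suggested.
    return [term for term, score in ranked[:top_n] if score > 0]


# Invented sample texts, for illustration only.
docs = [
    "lisp macros and the history of lisp",
    "the weather forecast and weather maps",
    "the history of unix",
]

print(suggest_tags(docs, 0, top_n=2))
```

Words that occur in every document (like "the" here) get a score of zero, which is what keeps common words out of the suggestions. The false positives mentioned above would show up as low-ranked terms such as "and" or "of" if `top_n` were set too high.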


For reference, here's the initial submission to HN with the discussion:

http://news.ycombinator.com/item?id=1601930

There was a lot of good feedback there. It would be interesting to see a report of how much of it was taken up and integrated, how much was tested, and how much was ignored.

This submission, however, is a repeat of this one from 4 days ago:

http://news.ycombinator.com/item?id=2563739


Regarding the original submission: yes, I got a ton of very useful feedback. The site differs greatly from what it was then (I wish I had comparison screenshots handy right now), and I implemented some of the feedback.

I did, however, ignore some feedback. Well, not really ignore: I listen to every piece of advice I get, but I can't implement every feature request. Plus, advice often contradicts other advice :)

For example, people tell me to get rid of 1-5 ratings and just do 'like/dislike'. I won't do that because there is a chasm between 'like' and 'love'.

I could write a detailed post on what exactly I took to heart and what I didn't do.

Regarding the repeat: the submission from 4 days ago didn't end up on the front page for some reason, even though from my observation it had the necessary upvotes (it had more upvotes than an older submission that was on the front page). I figured it wouldn't hurt to resubmit, and it worked this time.


Neat! This just helped me find out the name of the weather site I was trying to remember on Saturday (http://weatherspark.com). I knew it'd been on HN but searchyc.com returned too many results in this case:

http://archfinch.com/tag/hn/weather (result #3, "Incredible weather information UI")

http://searchyc.com/weather

Thanks!


I'm glad to see it's useful for someone.

It's interesting how accurate the tags are and how much more information they give. I've been playing with them for a while and I can sit and browse for hours.



Thanks, fixed. Turns out memcached keys can't have spaces in them, at least when using python-memcached:

            for char in key:
                if ord(char) < 33 or ord(char) == 127:
                    raise Client.MemcachedKeyCharacterError(
                            "Control characters not allowed") ...

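One common way around this kind of restriction is to hash the raw key before handing it to the cache client. A minimal sketch, assuming keys are built from arbitrary strings such as tag names: `safe_key`, `is_valid_key`, and the `af:` prefix are hypothetical names for illustration, not part of python-memcached and not necessarily how the fix was actually done.

```python
import hashlib

MAX_KEY_LENGTH = 250  # memcached's key length limit


def is_valid_key(key):
    # Mirrors the check quoted above: no characters below 33
    # (which includes the space, 32) and no DEL (127).
    return all(ord(c) >= 33 and ord(c) != 127 for c in key)


def safe_key(raw_key, prefix="af:"):
    """Map an arbitrary string to a short key made of the prefix plus
    a hex digest, so it contains no spaces or control characters."""
    digest = hashlib.sha1(raw_key.encode("utf-8")).hexdigest()
    return prefix + digest


key = safe_key("hacker news")
print(is_valid_key("hacker news"), is_valid_key(key), len(key))
```

The trade-off is that hashed keys are no longer human-readable when inspecting the cache; replacing or escaping the offending characters is an alternative if readability matters.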

Hey, is there any way to get an RSS feed from this?


I'll try to implement this as soon as I can, so check back later. Seems like a no-brainer but I forgot to do this :)


Ok, I think I've done the feeds. I need to test them for a while to see if they work as they are supposed to before adding them to the site, but you can start using them if you'd like:

http://archfinch.com/rss/tags/programming/history/lisp - example tag rss

http://archfinch.com/rss/followed/drx - things followed by me


Does it just pull from HN or are there other sources - if the latter, what are they?


For now just HN. Plus users sometimes submit links themselves.

I plan to expand on that: add other sources, and perhaps add a way for people to add their own sources. RSS feeds filtered by topics would be cool.


Love the site, I think it has a lot of potential. I'll be in touch through email.


Where do I click to go to the HN thread?


Right now you have to click 'discuss this' and in the sidebar, a link named 'via HN' pops up. This needs some improvement, but I can't put too much information in the link boxes, so it's a tough one.



