Show HN: A new way to view IAmA's on Reddit (topiama.com)
263 points by xpose2000 on June 22, 2012 | 65 comments



Very cool. It picks all the important stuff and presents it sans the fluff. Where do I subscribe? Really, where's the rss feed for new AMAs?

Also, links to questions in their original context would be nice.


Some questions are in response to other questions - some indentation to show this might be handy.


I have emailed Reddit about the legality of this project. (Have not received a response yet) So far it seems likely that I will have to take down the site.

So consider this a small preview I suppose. I plan to port this to a Chrome Extension.

Do you guys find this format much more readable than the current completed IAmA?


I've wanted to build something like this for a long time. The problem I encountered (and couldn't solve) is how to effectively and automatically find not just the questions and answers, but the discussions that stem from them.

A good AMA is one that creates discussion between the subject and the commenters, so the majority of good content is buried underneath multiple comments. For example, I did an AMA that had a few hundred comments. The "top" question was pretty mediocre, but it had a reply that made it a valuable question, and that reply was the most "popular" (most replies, most votes) thing in the AMA. Your site would not have picked this up.

If you can solve the problem of dealing with all the nested comments and working out the value of each reply (taking into account votes, length, number of replies and possibly the "relevant" words?) then your site will absolutely be worth using. Instead of just being a different way to display the data, create extra value. Good AMAs are not just Q&As, they're discussions.
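A rough sketch of what that kind of scoring could look like, assuming each comment is already available as a small tree with a vote count. The `Comment` class, the weights, and the 200-word cap are all made up for illustration; nothing here reflects how topiama.com actually ranks comments.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Comment:
    # Hypothetical in-memory shape for a comment; not reddit's own schema.
    author: str
    body: str
    votes: int
    replies: List["Comment"] = field(default_factory=list)


def count_descendants(comment: Comment) -> int:
    """Total number of replies at any depth below this comment."""
    return sum(1 + count_descendants(r) for r in comment.replies)


def score(comment: Comment) -> float:
    """Combine votes, reply count, and length into one number.

    The weights are arbitrary placeholders, not tuned values.
    """
    # Cap the length bonus so very long posts do not dominate.
    length_bonus = min(len(comment.body.split()), 200) / 200
    return comment.votes + 2.0 * count_descendants(comment) + 10.0 * length_bonus


def rank_all(root: Comment) -> List[Comment]:
    """Flatten the tree and sort every comment by score, best first."""
    flat, stack = [], [root]
    while stack:
        node = stack.pop()
        flat.append(node)
        stack.extend(node.replies)
    return sorted(flat, key=score, reverse=True)
```

Because every comment is scored regardless of depth, a heavily discussed reply can outrank the mediocre question it sits under, which is the case described above.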

> So far it seems likely that I will have to take down the site

Traditionally reddit is very good towards community projects; they pride themselves on their "different" approach to community. Unless you're causing harm to reddit, I don't see why they would require you to take this down. There are multiple tools that already exist that extend reddit and provide extra value, none of which have been taken down -- from what I've seen anyway.

Specific to your current implementation of the idea: your font size is way too small. The focus of the site is the words, so make them bigger! The size you have now isn't very readable. Check out some blogs for examples of what sizes to use for text.


The problem of dealing with nested comments is very tricky as you mentioned. I did my best to grab the "best" comments, but I know that it does miss some in a deep thread. I will try my best to further enhance its accuracy and ability to continue a discussion.

I will also work on the font sizes and perhaps even modify the font choices to further enhance readability.

Thanks for your suggestions and feedback! :)


It looks like you grabbed every comment by the OP along with the comment it was a response to.

In at least one instance, I see a situation where someone asked a question, I (not the OP) provided an answer, and the OP then commented on my answer to confirm that my answer was correct. The original question doesn't appear on your website.
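One way to avoid losing the original question in a case like this is to walk up the parent chain from each OP comment instead of stopping at the immediate parent. A minimal sketch, assuming the comments have already been loaded into a dictionary keyed by id; the `FlatComment` shape and the field names are hypothetical and not taken from the site or from reddit's API.

```python
from typing import Dict, List, Optional, TypedDict


class FlatComment(TypedDict):
    # Hypothetical flat record for one comment.
    author: str
    parent_id: Optional[str]  # None for a top-level comment
    body: str


def ancestor_chain(comments: Dict[str, FlatComment], comment_id: str) -> List[str]:
    """Return ids from the top-level ancestor down to comment_id."""
    chain: List[str] = []
    current: Optional[str] = comment_id
    while current is not None:
        chain.append(current)
        current = comments[current]["parent_id"]
    chain.reverse()
    return chain


def op_threads(comments: Dict[str, FlatComment], op: str) -> List[List[str]]:
    """For every comment by the OP, return its full chain of context."""
    return [
        ancestor_chain(comments, cid)
        for cid, c in comments.items()
        if c["author"] == op
    ]
```

With a question, a third-party answer, and an OP confirmation, the chain for the OP's comment contains all three ids, so the question can still be displayed.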


I ran into the same problem. I started a pet project that involved scraping reddit (though for a different purpose than AMAs). Their robots.txt and an admin writeup from somewhere on their site made me realize that I'd probably just have to take it down and/or my scraper would just get blacklisted. It's a bummer because there's 1001 great ideas out there for filtering, categorizing, and viewing reddit's data in different ways. And it seems like they encourage 3rd party interaction to some extent with their API and all, yet scraping is kind of needed in most cases.

I do like the format for sure. The only thing I would consider is maybe nesting the Q/A divs (.qitem) for threads because a lot of times the Q/A content is contextual to past Q/As. You already order them that way and that helps a lot but on one of the ones I was reading it got confusing on whether they were speaking in the context of a thread or if it was a fresh Q/A. Maybe set it as a view option to toggle or something (maybe have it be a carousel where each frame contains all the Q/A divs in a thread starting with the root level, and keep it displayed flat like they are now).


I made sure to be nice to reddit. The scraper is set to crawl reddit once every 12 hours for new "top monthly iamas".

Very good suggestion for nested threads. A good example of a reply to a question is on the westboro-baptist-church thread. I think it's possible to implement this suggestion. Will fool around with it on localhost and see what I come up with.


Yeah, I had a limiter put in mine as well so that it only made a request every 6 or 8 seconds.

No worries on the suggestion. Those threaded comments can be tricky sometimes.

Hey, if you do hear back from them about their stance on this sort of thing, I'd really appreciate if you could let me know what they say. I sort of halted my project after a certain point because I had the fear I'd just have to take it down as soon as I completed it.


You should be able to up that to 2 seconds with no problem, provided you're following the other usage rules of course.

https://github.com/reddit/reddit/wiki/API
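For anyone wanting to enforce a gap like the 2 seconds mentioned here, a minimal sketch of a limiter that makes the caller wait before each request. The class name and the injectable `clock` and `sleep` parameters are my own choices for illustration (they make it easy to test without real waiting); check the current API rules for the actual limit.

```python
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Enforce a minimum number of seconds between consecutive calls."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous call

    def wait(self) -> float:
        """Block until min_interval has passed since the last call.

        Returns the number of seconds actually slept.
        """
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        # Record when this call was allowed to proceed.
        self._last = now + delay
        return delay
```

Usage would be to create `MinIntervalLimiter(2.0)` once and call `wait()` immediately before every request.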


Threaded Q&A's are now supported and will show up for any new IAmA's collected.

Reddit has still not responded. However, I know a moderator of Reddit went to the site based on the link source. It seems like they don't care.


Why were you scraping? They offer a pretty good API.


User comments, but going by users rather than threads. That way you could get a profile where someone posts, or turn it around and see what prolific posters existed in a given subreddit.

The thing is it wouldn't sweep everything. Instead a user would only get scraped if a request was made to my app, and I had a tool that would go through a request queue (storing to my own DB) in a metered way so that reddit only experienced a handful of requests from me per minute.

Nonetheless it still breaks robots.txt, and if I could dig it up, admins have said in the past that they don't want automated/batched requests hitting their site.


Were you using the API, or just scraping HTML?


Very very awesome! I love IAmA's, but the signal to noise ratio is relatively low, so that I can only bear it for about 1 page. This does a fantastic job in separating out the great questions and answers!


I'm pretty sure you'll be fine. Erik's a Pretty Cool Guy(TM); I can't see him trying to take down a site whose express purpose is to make the content on reddit more accessible.


Absolutely. If you add link references, like related news, events or uncommon knowledge, this will be perfect.

I always imagine someone making a book about the IAmAs. There is just so much value in the posts; they offer an amazing human view of people. Sometimes amazing, sometimes heartbreaking, mostly just fascinating.

Great work!


This is MUCH better than the reddit version. If they're okay with you doing this, I whole-heartedly support your efforts here and wish you the best of luck with this endeavor


Why not just use their API instead of scraping?

https://github.com/reddit/reddit/wiki/API


Yes, I much prefer this format, great job. The only thing that I don't like is that the questions are hard for me to read. Maybe because they're large blocks of text in bold.


Why not build this as a browser extension or Greasemonkey script?


Why would scraping their site be illegal?


Scraping itself isn't always illegal. But scraping a site and then republishing the content elsewhere often is, unless you have permission.


I figured displaying their content in full is illegal.


Ah, I hadn't clicked any of the links. I thought you had just linked back to their site.

By the way, I like your site. It makes it easy to cut through the cruft that always appears in reddit threads. Do you just grab all responses by the poster and the parent comment?


If they have a full content RSS feed, then this use should be okay. But I hope it survives their scrutiny since this is vastly more readable than the native view.


It's an interesting debate. Surely the content is owned by the writers?


The concept is good and it's easier to scan than IAmA, but the design/typography just hurts to read. I would probably fix that asap. Make it bigger, easier to read and change the font.

Regarding your concerns about legality: Reddit seems to be ok with scrapers, bots and the like as long as you don't make more than 30 requests per minute.


Do you have any suggestions on fonts to use for readability? I was looking through Google Web Fonts and came up short.


I'd recommend changing some CSS on the H2 elements.

You have line-height and font-size both set to 15px, which gives the multi-line H2s no room to breathe.

I'd also ditch Open Sans for Helvetica Neue, which is more readable, remove font-weight:bold (there's too much bold on the page), and tweak the margins.

Before: http://i.imgur.com/vaf7U.png

After: http://i.imgur.com/a8bHS.png


Thanks a lot for those suggestions!

That looks fantastic. I actually tried to get a designer on board to help with the front-end, but unfortunately he wasn't able to help out. So I did the best I could. :)

I will definitely implement these font changes. :)


radsauce :)


That looks really slick. Nice job.


A couple months ago, reddit started refusing requests from my web scraper. Figured out they started checking the user agent and refusing connections that didn't look like they came from a user's browser. Unless I missed an announcement somewhere, it doesn't seem like they're overly friendly about allowing web scrapers.


They just don't want to be abused.

"We're happy to have API clients, crawlers, scrapers, and Greasemonkey scripts, but they have to obey some rules:"

https://github.com/reddit/reddit/wiki/API
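On the refused connections: a likely culprit is the default User-Agent that HTTP libraries send. A minimal sketch of setting a descriptive one with Python's standard library; the helper name and the example agent string are hypothetical, and the exact rules on the wiki page above may have changed since.

```python
import urllib.request


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build (but do not send) a request with an explicit User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Hypothetical agent string; use something that identifies your own tool.
EXAMPLE_AGENT = "example-iama-reader/0.1 (contact: you@example.com)"
```

The request would then be sent with `urllib.request.urlopen(build_request(url, EXAMPLE_AGENT))`, ideally behind whatever rate limiting you already use.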


They are fine with it as long as you abide by their terms. They have a subreddit dedicated to reddit development and the reddit API, which has discussion of scraping: http://www.reddit.com/r/redditdev


Also, the hover effect is very distracting.


Thank you!! Serious time saver when you cut the ding-dong-ping-pong Reddit karma whoring replies. Just enjoyed this AMA with a former Rosetta Stone employee:

http://www.topiama.com/r/21/iama-former-rosetta-stone-employ...


Surprisingly compelling. The IAmA's are some of Reddit's best content, and this is so much easier to scan than a real IAmA thread. Thanks!


As an alternative that we know is legal, you can use http://reddit.com/r/tabled.

To expand on that, you could create a stylesheet or such that modified the look of it. Something similar to the mobile HN sites where banner ads are preserved could be considered as well.


Minor problem: Zero based indexing, combined with not showing current page in navigation, is confusing. At main page I saw links to page 1, 2. Thought that 1 was current page, 2 was next page.


I like this, although I do think it would be better if each comment had a permalink back to the original so that we could see all the responses, not just the one by the person who created the thread.


Very cool. Reading the Nate Phelps AMA:

http://www.reddit.com/r/IAmA/comments/v99eg/iaman_exmember_o...

Right now the top-rated comment by Nate Phelps is not in direct response to a particular question.

This top-rated comment is not on topiama.com:

http://www.topiama.com/r/4/iaman-exmember-of-the-westboro-ba...

I don't know if this is a feature or a bug, or if it hasn't been crawled recently.

It would be great to see these top-level comments!

------------

Linking to an answer's context would be super-useful as well. A great deal of the fun and value reddit supplies is the community commentary and responses to an AMA's answers. This can range from the funny to the insightful to the scary - something topiama.com doesn't capture. Which is great - sometimes you don't want the peanut gallery.


There used to be a comment bot/person who put all the reddit AMA questions into a tabular format - I recently discovered it has a pretty useful subreddit: http://www.reddit.com/r/tabled

I actually prefer that format over this one.


Super, thanks! Saw a lot of stuff that I missed, e.g. the physicist who can calculate anything:

* If I gain a kilo of weight here on earth, how much more will my wife be attracted to me? Ans: 4x10^-9N

* How large would the wings of a pegasus have to be to allow a horse to actually fly? Ans: 1000m2


Substantial improvement. Now can you make one that shows r/funny minus all of the cat posts?


Neat. The presentation would be vastly improved without the text-align: justify; though.


This is awesome, so much cleaner and easier to read. I hope it stays online.

I'd make questions link to the original comment threads on Reddit, though. Sometimes comments by other users are useful too.


I'm doing something similar with all the funny pictures on reddit. not really successful though, and basically leeching.


Really enjoyed this... so much easier than navigating through hundreds of comments on Reddit's main interface


Does this also support http://www.reddit.com/r/ama?


It is not supported at the moment. However it could be added down the line.


Beat me to it!! I am currently working on the exact same problem. Let's see how my version turns out!


This is much more readable than IAmA's interface though I suggest some work on the font. Great job.


This is great! I second the request for an RSS feed for the new IAMAs. Thanks for the site!


I came here just to say the same thing. RSS please and thank you!


There definitely need to be context links to the original comments.


Once it denotes nested questions it will be perfected. Nice work.


This is great, would love to see it as a chrome extension!


Paging is buggy.


This is great! Thanks for making it :)


Fantastic job.


RSS Please!


How about one for HN?


I guess we should be glad Craig Newmark does not own Reddit.



