Ask HN: Please review our startup: historious
42 points by stavros on Aug 15, 2010 | 46 comments
historious is basically a search engine for sites you have bookmarked. Think of it as a cross between Delicious and Google: you bookmark something and the entire content of the page becomes searchable.

It's at http://historio.us/

I have posted it here once before, and you guys were very helpful, giving a deluge of very good suggestions. Right now I'd like to ask for your opinion on the pricing model.

Our model right now is a free option, which is limited to 1,000 bookmarks for the first month and then becomes unlimited. This is to avoid people coming in, importing their 20,000 bookmarks and leaving, never to return.

There are also some extra features here: http://historio.us/pricing/

The main problem, and what we'd like your feedback on, is that every bookmark a user adds is expensive, as we have to store the entire page. Right now the service is profitable, but we'd like to improve it so we can scale it better.

What features would make you pay for it? What sort of things would turn you away from it as a free user with a mind to convert later on?

A good point we heard for not having a bookmark limit is that, even with 1,000 bookmarks, users would think twice before bookmarking something, and we don't want that.

Any feedback you could give is greatly appreciated!




I use Google Bookmarks: http://www.google.com/bookmarks

It already merges my personal "Web History" (things I search for on Google) with my actual bookmarks (things I click 'bookmark' on a bookmarklet - http://www.google.com/support/chrome/bin/answer.py?hl=en&... ).

The key feature of Google bookmarks is that searching the bookmarks always searches the content of all bookmarked pages and the web history pages. It returns the results with bookmarked pages at the top, and web history below.

It also has tags (they call them labels) and collections (they call them lists), and collections are shareable, and tags are searchable (find X only in Erlang tagged bookmarks).

You can import bookmarks from other services, and export all bookmarks to XML (one of the reasons I stay with Google is that they allow me to move).

My question to you is: What do you offer that trumps that functionality that I already have? Why should I move? Snowman?


Well, the snowman is pretty awesome... There are other features, such as the ability to share bookmarks on your personal site (http://stavros.historio.us/, for example), automatically add things to "read later", one-click bookmarking, automatic sharing...

A few features that are coming are the ability to import RSS feeds from other places, as well as more social elements (and, of course, support).

For you, though, David, I'll throw in a free subscription!


Actually... I've just tried your site end-to-end and you do have a killer feature and it's not the thing you are making most prominent.

The killer feature is the cache.

To know that what I bookmark will have a snapshot at that moment in time, citable and impervious to changes that happen to the source.

That is a fantastic feature.

My advice to you would be to highlight that. Bookmarks against things that can and will change, disappear or move are moving targets... what you're offering here is a permanent bookmark as it stood at that moment in time.

That's a big deal, definitely enough to make me consider trying it out. But you need to communicate that this is possible... it would really help researchers and those who use bookmarks as a searchable source over a long period of time (which is when the effects of things changing becomes most obvious).

I think you should possibly look at features like verifying a source (it looks like you cache from the browser, which the user could've modified) and allowing tags to be shared with other users or publicly. Basically... allow academics to use this and to include web citations in their papers and such, with historio.us providing the verified cache.

Perhaps even extend the bookmarking functionality to include the ability to highlight and add notes to part of the bookmarked page.

Permanent cache = big deal.


Ah, thanks for that, you're right that we hardly publicise it at all (and we should). As for the sharing, you have your personal historious at:

http://buro9.historio.us/search/

You can publish sites there (including tags) and refer people to it by giving them a link of the form http://buro9.historio.us/?q=some+query. Personally, I think that's pretty handy for answering the question "do you have any sites about X?"
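A minimal sketch of building such a shareable search link, assuming only the URL shape quoted above (`http://<username>.historio.us/?q=...`). The function name is hypothetical, and this is not taken from the actual service code:

```python
from urllib.parse import urlencode


def public_search_url(username: str, query: str) -> str:
    """Build a link to a user's public historious search page.

    The URL shape is copied from the example in the comment above;
    the helper itself is purely illustrative.
    """
    # urlencode turns spaces into '+', matching the "some+query" form.
    return f"http://{username}.historio.us/?{urlencode({'q': query})}"


print(public_search_url("buro9", "some query"))
```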

No items are made public by default, but do note that, when you do make an item public, you are also sharing the cached version.

Thanks again for the feedback, and I'm good for that free subscription if you need it!


I've thought about it some more.

Beware and embrace the slashdotting (insert digg, reddit, HN here).

The cache is both your killer feature and your risk. I hope it's statically stored or can be put in memory quickly... are you using Varnish?

Encourage its use... if people are bookmarking using historio.us and a site goes down, you've just gained a very large audience for your service. But only if your site survives the onslaught itself.


We are using Varnish, but we're storing the sources in the DB (to take advantage of automatic compression and all-around ease). It's trivial to get Varnish to cache the entire source and invalidate the cache when the document changes (the public/private setting, basically, so it doesn't accidentally share a page that's been changed to private), so that's all good.

The biggest problem (by far) is disk usage. The rest of the service is very easily sharded, really, as every user is isolated. Solr is also fantastic (much better than Sphinx, in hindsight we should have gone with that for TP), so that can also scale very well (there's even an implementation of it on hadoop).

We'll add the caching feature to the front page as soon as we finish the current round of A/B testing, thanks again (that feature was basically an afterthought, so it was great that you noted its importance)!


Until the TP ref I wondered whether it was you. Hi again :)


It is, hi :)


He's right. You (historious) could market that to the academic and legal world, where citation is a big deal but URLs can be ephemeral; the other day I was reading a brief less than a week old, filed in a current (and very prominent) case, and two of the hyperlinks in it 404'ed. If historio.us can get a reputation for having 'certified copies' then everyone will stampede towards the service.

For bonus points, a button to search for the current version of the same page, which may be at a different address from the original one.


We definitely should emphasise that. Also, the way the service works now, each user's cached page might be different (e.g. if you historified it a month ago and I yesterday, the caches will be different if the page changed). Example:

http://stavros.historio.us/cached/354988/

Unfortunately, we just lost a day of A/B testing due to an erroneous setting, but we'll start testing this addition to the homepage first thing tomorrow. Thanks again!


Following on again... allowing a legal firm to have a company account whereby the company paid for the storage of their employees would be a good thing.

This is just a collection of users, perhaps identified strongly by their company email address... and as each user works on a different case they should see their own things or things with shared tags (multiple people working on the same account).

The company shouldn't lose access to a cached URL just because an employee leaves and decides not to renew their sub with you. Hence the need for a company account.

The company should also be able to control what is made public and to review all public items. This takes care of ensuring that research into defence isn't leaked... perhaps a simple approval process on what is made public (whereby a named person for a tag approves items within that tag being made public).

Then make it so that you bill the legal firm on a monthly basis and show the proportion of storage per tag (they will re-bill their client accounts) and you have a winner.

After that the only thing you have to do is sell it to a few firms.


Mrs Browl works for a discovery firm; she suggests that the rolling billing might not work, but getting a foot in the door might allow sales of a long-term cache at a premium price instead, e.g. $10 keeps a page up for 10 years.

Tricky pitch for a new company, of course - you haven't been around so long. Any sort of partnership with an existing one would help. Be aware that law firms in general are conservative about technology, so it might take a year or two for the idea to catch on. A way around this might be to target law students first. Contact the editors at 10-20 prominent law journals and give them free accounts for the journal and/or themselves as individuals; let them recommend your freemium service to their fellow students. Law librarians are another likely target: they spend quite a bit of time helping people with research tasks.

I think it would help if these 'permacache links' had some distinctive appearance that was easy to type from a printed page, e.g. 'http://historio.us/citations/username/98734545.htm' <- numeric is easier than mixed alphanumeric if you have to copy it by hand.

In fact you could do 10 digits easily using a telephone format, and then ditch the user id string. 'historio.us/citations/####' is easy enough to become a standard link for public citation, and 111-222-3333 gives you room for 10 billion citations before you need to change the naming schema.
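As a rough sketch of the telephone-style numbering suggested above: the helper names are hypothetical and the `historio.us/citations/` path is only the format proposed in this comment, not an existing endpoint.

```python
MAX_CITATIONS = 10**10  # ten digits give room for 10 billion ids


def format_citation_id(n: int) -> str:
    """Render a numeric id as ###-###-####, zero-padded to ten digits."""
    if not 0 <= n < MAX_CITATIONS:
        raise ValueError("id does not fit in ten digits")
    digits = f"{n:010d}"
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


def parse_citation_id(text: str) -> int:
    """Recover the numeric id from a hand-typed ###-###-#### string."""
    digits = text.replace("-", "").strip()
    if len(digits) != 10 or not digits.isdigit():
        raise ValueError("expected ten digits, optionally dash-separated")
    return int(digits)


def citation_url(n: int) -> str:
    """Hypothetical public citation link in the proposed format."""
    return f"http://historio.us/citations/{format_citation_id(n)}"


print(citation_url(98734545))
```

Zero-padding keeps every link the same length, which is what makes it easy to copy by hand from a printed page.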


Thank you for your points, they are very good. I agree that the lawyer market is hard to penetrate, and that students would be a much better target, so this is what we'll try to do.

As for your second point, due to a security vulnerability with having cached pages on the main domain, we moved them all to another one:

http://cache.historious.net/cached/354992/

I think that is much more readable, and the number doesn't really have to have a fixed number of digits; it can increment infinitely. If we ever need shorter URLs we can base36-encode it and reduce the number of digits right away.
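For illustration, a minimal sketch of the base36 idea, assuming lowercase letters and plain non-negative integer ids. The function names are made up for the example and are not from the historious codebase:

```python
import string

# 0-9 followed by a-z: the 36 symbols used for base36.
ALPHABET = string.digits + string.ascii_lowercase


def base36_encode(n: int) -> str:
    """Encode a non-negative integer id as a base36 string."""
    if n < 0:
        raise ValueError("id must be non-negative")
    if n == 0:
        return "0"
    out = []
    while n:
        n, rem = divmod(n, 36)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))


def base36_decode(s: str) -> int:
    """Decode a base36 string back to the integer id."""
    return int(s, 36)


print(base36_encode(354992))  # the id from the cached URL above -> 7lww
```

A six-digit id shrinks to four characters here, and decoding is a single `int(s, 36)` call, so the numeric URLs could keep working alongside the short ones.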

Thanks again for the insight!


That's a fantastic idea and something we've been looking for for a while. We couldn't find a feature a company would pay for, but the caching is it. Thank you again for pointing it out, we can approach many lawyers/journalists/universities this way.


This is absolutely an accurate comment. The fact that content is cached actually makes me see value in this to the point that I might pay for it. I haven't looked at your pricing, but I'd probably pay $10-20 a year. Or perhaps I'd just roll my own.


You (we, really) are in luck, the price is exactly in that range!


I really enjoy using Historio.us. Here are some features I might consider paying for:

Update notifications. Store the page content, url, and date I historified a document. Alert me to any changes in the document; sort of like an RSS feed. Use the wiki format to display changes in the document.
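To make the suggestion concrete, here is a rough sketch of how a stored snapshot could be compared against a freshly fetched copy. It assumes both versions are already available as plain text; fetching and HTML-to-text conversion are out of scope, and the function names are hypothetical:

```python
import difflib
import hashlib


def content_hash(text: str) -> str:
    """Cheap fingerprint used to detect whether anything changed at all."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def describe_changes(snapshot: str, current: str) -> list[str]:
    """Return a unified diff between the stored snapshot and the current
    page text, or an empty list if the two are identical."""
    if content_hash(snapshot) == content_hash(current):
        return []
    return list(
        difflib.unified_diff(
            snapshot.splitlines(),
            current.splitlines(),
            fromfile="historified",
            tofile="current",
            lineterm="",
        )
    )


old = "Install nginx\nRun apt-get update"
new = "Install nginx\nRun apt update"
for line in describe_changes(old, new):
    print(line)
```

The hash check is just a shortcut so that unchanged pages never reach the more expensive diff step.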

I also like being able to see all my links at once. I am playing around with an idea here[1]. I know there is an "iframe = bad" mentality, but most places use them, more so with the Ajax web. If all my links were on the right and there was a blank iframe that I could load them in on the historio.us website, you would keep users on your page longer and increase your traffic. Combine this with the updated views and you have a nice little tool.

I would also like some metrics tools: why store it and not do anything cool with it? Maybe the ability to put pages in sets and query those sets for keywords, images, etc. Do something different from Google Bookmarks/Delicious/et al.

Better Facebook/Twitter integration. Until you get on the social bookmark list and have your icon plastered on pages you should offer this directly. An icon beside an already historified link to Facebook or Twitter would be enough for a user to know what it is there for. "Jane just historified URL".

Other than that I am content with it just the way it is now. Keep up the good work.

[1] http://christopherwoodall.com/trendy/


Thank you very much, both for your feedback and praise!

Your idea is very very good, and there are many things we can do, especially with document clustering (grouping similar documents), automatically generated keyword clouds, etc etc. We are trying to get the service to a good, stable level for now, and then we'll devote more time to playing with new ideas.

One such idea we're considering is to have the search interface be a filter, where the results continuously narrow, in real time, as you type more and more keywords.

We already have Twitter integration (as in, posting things to Twitter when you historify them). We also want to add Facebook integration, but its priority is a bit lower right now.

Thanks again, we add all ideas to our wishlist and are very happy to implement most of them (it just might take some time)!


I absolutely love historious. I've been using it for about a month now and I can't imagine life without it.

I like the fact that you don't need to sign in with the bookmarklet (contrasted with instapaper).

The incredibly clean interface is another draw for me. A lot of these other sites are all "web 2.0"ish and to me it's bothersome.

I don't really like the chromium extension. I dunno, I think it's kinda useless seeing as though it's off to the side, whereas the bookmarklet is right on top of what I'm viewing. Also I like the notification window of the bookmarklet better than the extension. The window when a "page has already been indexed" on the extension is a bit annoying. I'd like to see maybe a keyboard shortcut for the extension, I think that would make it killer.

A couple things that I would like would be to search through a time range, also better browser integration would be nice. i.e. having pages show up under the autocomplete. The ability to select multiple items would be nice. Having a text only view (a la instapaper, readability). Also I wish there was a shorter url, I always spell historious wrong.

thanks!


Thank you very much for your feedback, it really is the best part of the day when a user tells you they like the service!

It's perfectly fine if you prefer the bookmarklet over the extension, as we spent more time working on it (it affects more users)! We absolutely want to add a shortcut, but don't know how to do it in Chrome yet (some research is in order).

You can already search for time ranges (the syntax is a bit weird, though), it's "added:[2010-07-01T00:00:00Z TO 2010-08-01T00:00:00Z]", for example.
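If you'd rather not type that by hand, here is a small sketch that builds the same query string from two dates. The `added` field name comes from the example above; the helper names are hypothetical, and the format simply mirrors the UTC timestamps shown:

```python
from datetime import datetime, timezone


def solr_timestamp(dt: datetime) -> str:
    """Format a timezone-aware datetime as a UTC ISO 8601 string ending in Z."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def added_range_query(start: datetime, end: datetime, field: str = "added") -> str:
    """Build a range query of the form field:[START TO END]."""
    return f"{field}:[{solr_timestamp(start)} TO {solr_timestamp(end)}]"


july = datetime(2010, 7, 1, tzinfo=timezone.utc)
august = datetime(2010, 8, 1, tzinfo=timezone.utc)
print(added_range_query(july, august))
```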

We are also planning to support better autocompletion, as well as bulk edits/deletions. I have never seen the text-only view, but it sounds like a good idea, I'll take a look now! As for the URLs, we also have historius.com and historious.net. Unfortunately we don't have historious.com for symmetry :/

Thanks again!


I just did the sign-up. I hate having to wait for the activation email, then click activate then having to "relog".

Sometimes when this happens, I don't even go further and never even try the service. Do you really need to wait for me to click an "activate" link before letting me try your service?


Hmm, thank you for that. We don't really need emails to be verified, it's just for forgotten passwords. I'll see if we can get it removed right away, thank you.

EDIT: We've changed the signup process to log you in immediately after activating, and we'll remove the activation requirement soon, thank you!


I'd be interested in your approach to competing with services like Delizzy (http://www.delizzy.com/). Easy full text search of my Delicious bookmarks is a huge draw-card for me.

The main advantages I see you having versus Delizzy are:

- Privacy (they require your Delicious username and password)

- Being able to snapshot the page as you read it

The main disadvantages I see you having versus Delizzy are:

- They integrate with Delicious



Thank you, is there any way to edit the post to make them clickable?


I made a suggestion about this (clickable links in submissions), FWIW, http://news.ycombinator.com/item?id=1569226.


I think that's an interesting suggestion, and would be a useful feature. It does suffer from the problem that links would only become clickable after a time (having to wait for votes) whereas it's the first few views that really most benefit.

It's a great piece of thinking though - kudos. I've upvoted you both here and there.


Thanks ROG - I figure that at least the first few people view the link anyway before someone posts a "clickable" comment.


URLs in submissions don't get linked, URLs in comments do. If you want your URLs to be clickable, add a comment, just as I did.

This isn't a shot at you specifically because it happens again and again and again and again, but I'm constantly surprised, constantly surprised that people like yourself who have been here for months don't know the rules. It's something you simply need to know, something that you learn from spending time here, and wondering "I wonder why that happens."


This is off-topic for this thread I think, but the thing that constantly surprises me is that there is no mention of or direct link to the submission guidelines from the submission page.


And now I know!

I still can't figure out when the downvote arrow appears, though.


From the FAQ at the bottom of every page:

http://ycombinator.com/newsfaq.html

    Why don't I see down arrows?

    There are no down arrows on submissions. They only
    appear on comments after users reach a certain karma
    threshold.


That's the odd thing, I saw them one day and then they went away again. It's no big deal, I was just wondering, thank you.


No.


Niggling:

>The best part: It's completely free!

It's not completely. I think you've over marketed that bit, would "It's free" do, or even "You can use it free forever" (I know, but it's clear IMO that forever is curtailed by heat-death of the universe, business ending and the like).


Hmm, thank you for that. It's a remnant from the days before subscriptions, we'll see about changing it (even though it depends on what one means by "completely", i.e. "not forcing you to sign up" vs "all features").


On the 'net with companies one hasn't previously used I think the barrier for "this company might be scamming me" is so low that anything that looks like a half-truth should be avoided. But it's probably just me!


As someone who uses Opera with its own native bookmark manager and synchronizer, why would I choose your service?

I haven't wrapped my mind around your concept completely yet, and I didn't get any hits by searching on your demo, so ... :)


As someone who has used Opera since version 4, historious is much superior. I have never been much into bookmarking, I have about 50 bookmarks now and the list is already unwieldy, even though it's categorised and synched. I just never look at them.

In Opera, you can see the titles, and you're lucky if you can make out the title of the item you're looking for. On historious, you just search for "nginx ubuntu" and it finds the document, even though the title may be "So and so's blog | Webservers".

Another thing is that, with historious, I don't have to think about where to put the bookmark. I just click the button and it's done, I can look at it later or not, it's always going to be there, even if I forget about it. I find myself searching historious first and Google later, as I usually find things that relate to what I want to do now which I had found interesting before but then forgot about.

Do give it a try, it's really changed the way I bookmark, and I love Opera to bits.

EDIT: Not to mention that I can create a custom search where typing "hist <keywords>" will immediately take me to the first result in my historified sites!


About the demo, we should make it more obvious, but there's a hint that says "try searching for historious". We can't include too many things in there, as everyone searches for different things and we'd never get it all :/


I just tried some things like Obama and thought that it would have something on that. Oh well. :)


I use http://wheatt.com

It has tags, but nothing social about it, a more nerdy command line type interface, and because it's mine :)


You really should have a look at Wajam, which is another startup that seems to have the same goal as you. They do it differently, though.


Thank you, I'd never heard of them! I'll have a look right away.


Hooeey seems to be their most similar competitor.


neat functionality idea!



