OP here. Definitely, great idea :) Briefly: Sites are archived using a system wr...

19eightyfour · on June 27, 2017

The issue is cost. Your costs are disk space for people's archives, instances for people's use, and bandwidth for the fetches and crawls and access.

How can you pay for this if it's free? It's unreliable unless its financially viable.

agamble · on June 27, 2017

Totally right, great observation :)

For now it's a free service with a single rate-limited form. Now it's time to work on adding specialty features that are worth paying for.

Faaak · on June 27, 2017

How about avoiding redundancies ? Are same CSS files cached twice or referenced by their hash ?

The page URI is a bit obscure though. I think a tresoro.io/example.tld/page/foobar/timestamp would look good.

What about big media content and/or small differences between them ?

agamble · on June 27, 2017

Great question.

Currently there is no global redundancy checking, only locally within the same page. So two CSS files which are the same from multiple archives are both kept. While this might not be ideal in terms of scaling to infinity, each archive + its dependencies are right now limited in size to 25MB, which should help keep costs under control until this is monetised. :)

WhiteOwlLion · on June 27, 2017

What file system are you using? Couldn't a deduplicating file system handle redundancy for you?