
Lots of tools available, best index I've found of the ecosystem is this: https://wiki.archiveteam.org/index.php/The_WARC_Ecosystem

Ultimately, this is the best viewer I've found so far: https://replayweb.page/



> Lots of tools available, best index I've found of the ecosystem is this: https://wiki.archiveteam.org/index.php/The_WARC_Ecosystem

It looks like there are a lot of tools for creating them, but not a lot for viewing.

What they really need is browser support, or at least an extension so a browser can open the files directly.


> What they really need is browser support, or at least an extension so a browser can open the files directly

That's probably the wrong thing. What browsers really need is a thin but standardized API that lets any third-party app the user has installed on their machine supply the content for various fetches/reads.

You'd open the WARC in Firefox or Safari or whatever, but Safari et al. wouldn't have any special understanding of the format. It would know that your app does WARCs, though, and then knock on the door and say, "Please tell me the content I should be showing here; I'll defer to you for any further 'requests' associated with the file/page loaded in this tab. Just tell me the content I should use for those, too."
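
To make that concrete, here's a rough sketch of the shape I have in mind. No browser exposes anything like this today; `ContentProvider`, `ArchiveProvider`, `Tab`, and `Response` are all names I made up, and the "archive" is just a dict standing in for a real WARC index.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass
class Response:
    status: int
    content_type: str
    body: bytes


class ContentProvider(Protocol):
    """Hypothetical interface an installed app would expose to the browser."""

    def fetch(self, url: str) -> Optional[Response]: ...


class ArchiveProvider:
    """Stand-in for a WARC-aware app; the 'archive' here is a plain dict."""

    def __init__(self, records: Dict[str, Response]) -> None:
        self._records = records

    def fetch(self, url: str) -> Optional[Response]:
        return self._records.get(url)


class Tab:
    """Browser side: knows nothing about the archive format."""

    def __init__(self, provider: ContentProvider) -> None:
        self._provider = provider

    def load(self, url: str) -> Response:
        response = self._provider.fetch(url)
        if response is None:
            # Assumption: a tab showing an archive never falls back to the live network.
            return Response(404, "text/plain", b"not in archive")
        return response


archive = ArchiveProvider({
    "https://example.com/": Response(200, "text/html", b"<img src='/logo.png'>"),
    "https://example.com/logo.png": Response(200, "image/png", b"\x89PNG"),
})
tab = Tab(archive)
page = tab.load("https://example.com/")
logo = tab.load("https://example.com/logo.png")   # subresource, same provider
missing = tab.load("https://example.com/app.js")  # never captured
```

The point is that `Tab` only ever sees URLs and responses, so the same hook would work for any archive format an app wants to support.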


That's too complicated, though.

One of the main use cases for an archived web page would be to share archives, and in that case I think you'd want them to be double-clickable with little fuss.


So ship an implementation with the browser.


There is a browser extension. It can record WARC files, but also has a viewing interface identical to ReplayWeb.page. https://archiveweb.page/guide


Sadly, it seems to be Chrome-only.


I've never easily been able to read WARC files.



