> Other free ebooks don’t put much effort into professional-quality typography: they use "straight" quotes instead of “curly” quotes, they ignore details like em- and en-dashes, and they look more like early-90’s web pages instead of actual books.
True. I hope you guys get proper funding and keep this project going.
Mailing lists are superior for async communications IMHO for endeavors such as this. Nothing needs to be addressed immediately (and as everyone is a volunteer, realistic expectations should be set for response latency; email helps with that, Slack/Discord do not), and the mailing list archive is a natural log of conversations and decisions that are open and accessible (Free Slack only keeps 10k lines of conversation history if I recall). A mailing list is also free (can be, not always, but can be), and does not require a chat client installed.
Mailing lists were superior in the '90s and '00s; now that Discourse/Slack/Discord/etc. exist, there's no reason to use a mailing list except nostalgia. Parsing tons of new emails isn't easy.
Also, I prefer to avoid Google services because of privacy issues.
I soured on discord/slack/etc when their absurdly bad performance caused my laptop to get so hot it probably neutered me.
Seriously though, those services are fine on a powerful tower PC plugged into the wall, but if you're on the move on battery power, they are unbearable.
I like the combo of regular forum (Xenforo, Discourse, etc) and chat (Discord, Slack, etc). Unlike mailing lists, modern forums are actually usable, fun to use, and appealing. And chat provides a place for more conversational community-building.
For example, the Elm community has both. The Discourse forum is technical and business-only yet there's a clean record of these discussions. The Slack chat is where I hang out, get to know people, and participate in more relaxed chit chat about Elm, webdev, and building applications.
Elm used to just have a mailing list but it was obsoleted and shut down with the creation of the Slack group and Discourse forum which were far more popular.
They have all sorts of modern features more conducive to discussion and community-building, like notifications when someone @mentions or replies to you, and even editing your post -- features that people generally like. If you don't think that's "fun", fair enough, but I also enumerated other benefits like their broader appeal.
Any community that only has a mailing list could benefit from experimenting with a proper forum. I've seen this experiment broaden a community time and time again as you move away from only selecting for the type of person who likes mailing lists. And notice that HN isn't a mailing list either.
For example, I would imagine that the sort of people interested in high-quality ebooks extend beyond mailing list loving super-techies. Even a subreddit would be a nice option.
Counterpoint: Old mailing list conversations are difficult to parse and encourage an "ignore it until the issue goes away" mentality if no one is enforcing a reply rate.
Mailing lists only really work for corporations imo
We're talking open source/free/non-profits here. No reply rate should be enforced unless by project owners (their time, their project, their rules). Some issues should be ignored until they go away. I myself ignore issues from some folks who engage me in my role as an open source tooling maintainer, after I have exhausted my patience working with them and they are not receptive to polite discussion.
> Mailing lists only really work for corporations imo
Fascinating subject. It seems like the difference between Slack/Discord and email is the difference between a water cooler conversation and an actual sit-down meeting.
That's a good point. I find Zulip better than Slack/Discord for discourse (even better than mailing lists, with some caveats), and it's Apache-licensed:
I knew about Zulip but wasn’t aware of the free hosted plan. I can truly see it now as an alternative for orgs who can’t afford running these things themselves.
It doesn't directly, only as a meme, or archetype. The same way that beer is not directly related to the concept of gratis, but you still use it in the illustrative phrase "free as in beer".
I was hoping I could contribute financially, which could help fund any software or hosting costs they have, but it doesn't look like they're accepting donations (which is also fine, and completely their prerogative).
But! If someone from SE is reading this and it turns out that you just don't have a way to donate because it doesn't seem like people will donate, definitely put a paypal button or something out there. :)
We have minimal hosting costs (ebooks are small) and no software costs, so the rest is just down to time. Luckily, the majority of the process is proofreading, which it turns out people quite enjoy doing regardless, and is easily parallelisable across multiple contributors.
So far there's no need for contributions, and it makes things simpler to not need them.
If your hosting costs ever mount, I can recommend Hetzner for hosting, they'll give you a whole bunch of bandwidth for free on their smallest plan ($3/mo) and you can even buy a 3 Tb pipe (IIRC) you can saturate for $20ish a month.
Otherwise, I'm sure some organization will be happy to provide some bandwidth in exchange for a shoutout.
So, not about curly vs straight quotes, but about whether to use en-dash, em-dash, minus, or hyphens on Wikipedia. For example, what mark do you put in "Mexican-American War"?
It's great that these are in a consistent formatting style. When trying to extract some contents programmatically from some Gutenberg texts, I kept running into different formatting styles. That combined with being able to check out the entire repository makes it much simpler to do data processing on the works.
And, of course, fixing more errors is a noble goal. Are these corrections going to make it upstream to Gutenberg?
> Are these corrections going to make it upstream to Gutenberg?
It's an issue worth raising to the team. In the spirit of GPL, I think reporting any instances of clear typos in the source text upstream would be a good idea.
The problem is that so much work is done to the text as part of Standard Ebooks production that we can't exactly just submit a single patch or diff. It would be difficult to automatically distinguish textual corrections from stylistic changes, unless we were to require that typo corrections occur in their own commits. We're currently encouraged to use an [Editorial] tag when modernizing spelling such as "any one" -> "anyone", so perhaps we should see about a [Transcription Error] tag for obvious typos.
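To make the tagging idea concrete, here is a minimal Python sketch of how tagged commit subjects could be grouped so that transcription fixes are easy to pull out later. The function name, the sample commit subjects, and the assumption that tags appear as a leading `[Tag]` in the subject line are all hypothetical; only the [Editorial] tag is described above as current practice, and [Transcription Error] is just the proposed one.

```python
import re

# Assumption: a tag is a leading "[Something]" at the start of a commit subject.
TAG_PATTERN = re.compile(r"^\[(?P<tag>[^\]]+)\]\s*(?P<subject>.*)$")


def group_by_tag(subjects):
    """Group commit subject lines by leading tag; untagged lines go under None."""
    groups = {}
    for line in subjects:
        line = line.strip()
        match = TAG_PATTERN.match(line)
        tag = match.group("tag") if match else None
        groups.setdefault(tag, []).append(line)
    return groups


# Made-up commit subjects, for illustration only.
sample = [
    "[Editorial] any one -> anyone",
    "[Transcription Error] teh -> the",
    "Typogrify and semanticate",
    "[Transcription Error] Restore missing closing quote",
]

grouped = group_by_tag(sample)
for subject in grouped.get("Transcription Error", []):
    print(subject)
```

In practice the subject lines could come from something like `git log --format=%s` run inside a book's repository, and the "Transcription Error" group would be the candidate list to send upstream.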
The upshot is that all of the books' sources are hosted on GH. So an interested party could, in theory, review the commit history and pull out what look to be typo corrections. See, for example:
"Modernizing" is a very questionable thing to do IMO.
Fixing typos is fine I guess, but books are the result of an era and grammar or writing style is an inherent part of a book that should not be altered.
On the other hand, modernising the texts might not be what Gutenberg wants, if they want to keep original errors from the books? I haven't looked it up, but I have been in other book projects where it was more important to keep everything as it was printed.
> Standard Ebooks puts significant work into designing, formatting, marking up, and hosting our ebooks. While some think we could, or even should, release our work with some kind of copyright notice, instead Standard Ebooks dedicates the entirety of each of our ebook files, including markup, cover art, and everything in between, to the public domain.
Public Domain projects that subtly argue for an extensive view of copyright in ambiguous (in the best case) situations make me somewhat suspicious.
Editing is a lot of work, and their efforts are appreciated. But most of that involves the application of existing rules (curly quotes etc.) and therefore doesn't meet the creativity standard of copyright.
I guess the texts are out there anyway, and it doesn't make much of a difference. But I'm reminded of the art world, where many a painting is long out of copyright, yet only the museums have access, and they insist that setting up a lightbox and taking a photo is a creative endeavour worthy of protection from the prying eyes of the non-paying public.
> [T]hose who are hesitant to use photos of works that are in the public domain […] should know that under the law if the image is “slavish,” a mere reproduction, a plain unadorned exact image, they can use it and do not have to pay anyone a licensing fee.
Not all the sources come from Gutenberg for a start. But for those that do I usually keep a running list of proofing corrections as I go and submit them back. Gutenberg are pretty responsive to any changes submitted if you supply a link to source scans as well.
It covers a plethora of subjects, but devotes a few chapters to the Difference Engine and the political difficulties in getting it funded. Bonus MathML (rendered to PNGs in most readers but real for the Kobo), and all diagrams support both normal and white-on-black dark mode if you’ve got your reader set up like that.
Is there any way to download them all? Ebooks are very small so it shouldn't be a problem to make an archive, or at least have a way to use curl/wget to download them all from a directory.
I ended up using this to download all the azw3 files for my kindle, it's probably not the best you could do, so feel free to use it as reference for yourself if you might do something similar.
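For anyone wanting to roll their own, here is a minimal Python sketch of one possible approach (not the script referred to above): pull the download links for one format out of an OPDS feed, which is an Atom XML document. The sample feed, its paths, and the function name are made up for illustration, and the real feed's layout is an assumption that should be checked before relying on this.

```python
import xml.etree.ElementTree as ET

# Atom namespace used by OPDS feeds.
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def acquisition_links(feed_xml, suffix=".azw3"):
    """Return the href of every Atom <link> whose target ends with `suffix`."""
    root = ET.fromstring(feed_xml)
    links = []
    for link in root.iter(ATOM_NS + "link"):
        href = link.get("href", "")
        if href.endswith(suffix):
            links.append(href)
    return links


# Hypothetical feed fragment; real entries will carry more metadata.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Example Book</title>
    <link href="/ebooks/example/book.epub" rel="http://opds-spec.org/acquisition"/>
    <link href="/ebooks/example/book.azw3" rel="http://opds-spec.org/acquisition"/>
  </entry>
</feed>"""

for href in acquisition_links(SAMPLE_FEED):
    print(href)
```

The resulting paths could then be joined to the site's base URL with `urllib.parse.urljoin` and fetched one at a time, ideally with a pause between requests to go easy on the server.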
For iOS there’s something called KyBook which reads that opds link and shows a list of books with cover art, book name, author and tags pertaining to the book's subject matter. It does not allow one to search the opds library in any way. I was able to download a book using the app, and from within the app move the book to iCloud. From within the Files app I found the KyBook folder, selected the book and using share opened it with iBooks and it is now in my iBooks collection. I found the app here https://www.maketecheasier.com/best-ebook-reader-ios/
It's always great to see public domain books being made available, and standardebooks is certainly worth a visit. However, while I read quite a lot and in particular the sort of books that are available there, I mostly give the site a miss purely because of its design. I don't want to seem too sarcastic, but having huge images as a listing for books is odd when most of the users can read quite well. I'm probably a bit sensitive about this, since our local library does the same thing - it almost looks like there is something about uncompromisingly textual information that provokes a reaction from web designers.
Not only is this an excellent project, I think this is also an incredible collection of books you've chosen to feature. I also really appreciate the art choices that have been made for the covers.
I have many of these in epub from Gutenberg, but plan to replace them with your versions when I have some time.
I know others have already asked about bulk download--have you considered offering a torrent of the full library or possibly one for each file format?
This is fantastic, and immediately I want to try and contribute engineering time to it. I've tried reading Gutenberg ebooks before and gave up because of how inconsistent and unreadable they could be.
Is there a wishlist of tools/software out there that someone could contribute to?
Alex explicitly does not want the "Standard Ebooks" name or mailing list used to coordinate similar projects elsewhere (including other primarily English-language nations), due to copyright issues: https://groups.google.com/d/msg/standardebooks/qRDTb-hHMxk/z...
It looks like they’ve decided not to publish any non-English books [1]. It’s a pity – I much prefer reading books in their original language if I’m able to understand it, and I was even considering contributing some German books to their collection. Maybe it would complicate the publishing process a bit though since different languages have different practices for things like punctuation.
> I much prefer reading books in their original language if I’m able to understand it
Absolutely. To the extent that I have trouble focusing on texts I know to be translations, unless there are inescapably good reasons for them to be, i.e. they come from a language I have no chance of understanding.
Currently battling to resurrect my high school German. Getting there, but cursing myself for not starting out with something a wee bit more accessible than Thomas Mann...
The only one I know of is http://projectoadamastor.org in Portuguese. I can't tell whether it uses the same tooling, but the project generally predates Standard Ebooks.
"This ebook is only thought to be free of copyright restrictions in the United States. It may still be under copyright in other countries. If you’re not located in the United States, you must check your local laws to verify that the contents of this ebook are free of copyright restrictions in the country you’re located in before downloading or using this ebook."
Most of the world uses life of the author + 70 years, but in the US anything published in 1924 or later is still under copyright (with a few exceptions, for example if copyright wasn’t renewed in the 60s).
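As a rough illustration of the US side of that rule, here is a small Python sketch. It assumes the 95-year term that applies to works published from 1923 through 1977 whose copyright was maintained, and it ignores the exceptions mentioned above (such as non-renewal). The function name is made up, and this is a back-of-the-envelope helper, not legal advice.

```python
def us_public_domain_year(publication_year):
    """Year on whose 1 January a work enters the US public domain.

    Assumes a 95-year term running to the end of the calendar year,
    which applies to works published 1923-1977 with copyright maintained.
    """
    if not 1923 <= publication_year <= 1977:
        raise ValueError("this sketch only covers works published 1923-1977")
    # The term covers 95 full calendar years after the publication year,
    # so the work becomes free on 1 January of the 96th year.
    return publication_year + 96


print(us_public_domain_year(1923))  # 2019
print(us_public_domain_year(1924))  # 2020
```

A per-country check would need the author's death date instead, which is why the US cut-off and the life + 70 rule can give very different answers for the same book.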
I don’t have a Kindle so haven’t tested, but I believe we build the AZW3 files from the epub2 ones, which have hyphenation baked in using the Python hyphenation library.
For the Amazon-compatible "azw3" files that I'm seeing, I'm curious why the book cover thumbnail images are a separate download from the ebook file itself?
Unless I'm missing a trick, it seems like you have to use Calibre (or some other application) to re-build the "azw3" file with the cover thumbnail properly embedded. Why not just ship the ebook files like that to begin with?
I applaud this idea & hope it goes well. I'm always a little disappointed when I download a book from Gutenberg and the formatting makes it virtually unreadable.
Notice the gray margin outside the sheet, and the padding inside the sheet. The first one gives you a general frame of view, and also you don't want the document to use the whole screen width when using a 16:9 monitor or similar screen.
The padding is necessary because you don't want characters too close to a margin, else they look like they're escaping the sheet.
I haven't been able to replicate this setup with Calibre, FBReader or any other epub reader.
Fonts are another issue. Default fonts always suck. FBReader uses DejaVu Serif, which, in my opinion, just looks bad. I changed it to Bitstream Charter, which looks decent, but then the line justification looked wrong, so I changed that, and then the paragraph margin looked wrong. There's a million little things that look horrible by default and you have to spend an hour per book setting up your reader so that it looks right.
I've tried generating PDFs with Calibre, the result: giant ugly fonts, zero sheet padding, nonsensical spacing, etc.
At some point you just give up and avoid epubs like the plague.
What about any of the ways to convert epub to pdf yourself? Are they no good?
If anything, I think an HTML version that you can view right on the website would be the best format addition. It's always interesting to me when a website doesn't offer a browser-native document format as an option to view text.
I've tried generating PDFs with Calibre, the result: giant ugly fonts, zero sheet padding, nonsensical spacing, etc.
I could spend the time trying to make them look right, but I don't want to spend my time that way. I prefer learning things that will give me more satisfaction per second spent.
Another ebook non-profit I'd like to see is one that shepherds books through the copyright maze. No doubt there are scads of books in the public domain that no one has proven are actually there. Perhaps this exists already but it strikes me as a good separate, and highly targeted, kind of effort.
BTW does Kindle let you load your own DRM-free ebook files instead of buying books on Amazon? I use a PocketBook (pocketbook-int.com) which emulates a mass storage device and lets me read everything. I once considered buying a Kindle but heard it won't let me load bare files this way. Is this true?
Calibre is a great way to manage synchronisation and reformatting to Kindle-supported file types. It even allows you to select your Kindle model to make sure it looks good:
I forgot to mention my PocketBook also has a microSD card slot. It came with just 1 GB of internal memory, ⅓ of which is occupied by the OS (Linux), so I recently bought an additional 16 GB microSD card to extend it. Now I just plug the card into my laptop's SD slot (using a microSD-SD converter that came with the card) and don't even need to use a cable (which never worked well; it often happened that the device would charge but wouldn't establish a connection).
Absolutely. I've been using Kindles for a decade and have never bought any ebooks from Amazon.
The only functionality issue I run into is that, as far as I can tell, Amazon has a feature where if you buy the book from them then they keep your progress synced between your Kindle and their phone apps. Can't use that with books from other sources.
I generally upload them to a file-hosting website--0x0.st works well--and download them to the kindle from there. Use calibre to convert them to mobi format if they're pdfs.
You can search by genre, but the only way I noticed to do it is to click on a book with the genre you want, then there'll be a link to browse other books with the same genre tags.
How are you sending the file to your Kindle? AZW3 should work if you're connected to your computer via USB and dragging over manually or sending over via Calibre, but I've seen it not work if you're trying to send it to the email addy associated with the kindle. For that, you are correct, MOBI is usually the preferred option.
It should still work if you transfer it over USB, but if you're trying to do it all wirelessly, simply download the EPUB and convert it to MOBI, and you should be good to go.
Without commenting on the contents, it wasn’t fully published in English until 1939, which means it doesn’t arrive into the US public domain for another 15 years. It’s also down to personal preference what people work on.
Unfortunately none of the formats given work with Amazon's "email to kindle" system, which is the most convenient way to load books: it allows you to download a PDF on your phone and send it to a special email address associated with your Kindle device. Considering all the work this site has already done preparing the book files, it seems like they might as well generate PDF files using a page size roughly equal to that of the most common Kindle readers.
Amazon's "email to kindle" system also accepts books in the mobi format, which would be preferable to mapping PDF page sizes to Kindle reader screen sizes.
Contribute: https://standardebooks.org/contribute/
(I was thinking a Slack or Discord would be better than Google groups mailing list for this?)