If they're not using the book text to train models (keeping the focus on this particular new Kindle feature), where's the room for objection? My device, my content, it's none of the author's business how I read it, in my view.
Edit: Given I've been a reader of HN for some time, I am perfectly aware that on Kindle you don't own the content, just a license to the content. Don't need any more people pointing this out! Lol. In my house we still call owning a license to something that is not likely to be revoked "owning it".
Audiobook publishers require/request this when you sell subsidiary rights. We’ve been able to push back citing accessibility concerns. I find it really annoying when not available for my own reading.
Couldn’t agree more. This is actually a super useful feature. I can’t think of how many times I’ve been reading a book and some minor character resurfaces and I’m like, who the hell is that guy? Now I can know. I can also get information on historical context. Who knows, maybe I can finally read Ulysses without having to have 5 other books.
I wish it was "my device, my content" but it absolutely isn't. If you want that you have to buy from a DRM-free source, and Kindle is the absolute opposite of that.
IMHO you should be able to enjoy your books however you want. If you want to run a local AI against it, more power to you.
But my opinion doesn't matter. Only Amazon's does. That's the point I was making. The premise of "my device, my content" is flawed (because of the DRM Amazon uses) and undermines the argument.
Right, under that argument it's their content, their rules then - making this situation even more of a non issue because they're adding this feature themselves.
I suspect most of the people arguing this way would be in favor of more end user rights if we were talking about anything except the right to use AI.
“Rights good, AI bad” somehow leads to the insane argument that it’s a good thing you don’t have rights over the book you bought.
“You don’t really own the book” is a crazy argument unless the person saying this wants the locked-down DRM world where you can’t own a piece of media.
Amazon is selling digital copies (or licenses, if you like) of the books, which means they need permission from the copyright holders. This permission is likely backed by a contractual agreement that covers some details about how Amazon presents the digital copies to the end users.
(This of course wouldn't be the case if they were reselling physical books.)
Sure. But you knew what this comment was trying to say. It is obviously saying that what happens on the Kindle is between the customer and possibly Amazon, specifically that authors should not be involved. They got their money. That part of the transaction is complete. I know you realize this, it's annoying to read the constant "not your keys not your coins" reframe.
No. The author incorrectly thinks they "own" the "content" like with a physical book, which is the prerequisite for all the discussions following it. I pointed out, factually and correctly, that they don't own anything (other than the license) or have any control over anything.
"Amazon DID NOT answer PubLunch’s questions about “what rights the company was relying upon to execute the new feature was not answered, nor did they elaborate on the technical details of the service and any protections involved (whether to prevent against hallucinations, or to protect the text from AI training).”
Yeah, the "but what about a human" argument doesn't really work here. Scale of data matters as always. And an AI for Kindle has the scale of 20 years of literature (and more if they just scrape the internet).
> Yeah, the "but what about a human" argument doesn't really work here. Scale of data matters as always. And an AI for Kindle has the scale of 20 years of literature
I haven't seen a convincing argument why not. There are millions of librarians with the knowledge of more than 20 years of literature under their belt. Why can they answer your questions about a book but the robot can't?
Is the "clerk" scanning the books and digitizing them to generate other products using an LLM under the guise of "Answering Questions"? I believe this is the question being asked.
Companies like Amazon and Google have some really sticky fingers when it comes to intellectual property and personal data. I think it's worth asking these questions and holding them accountable for exploiting data that doesn't rightly belong to them.
> Is the "clerk" scanning the books and digitizing them to generate other products using an LLM under the guise of "Answering Questions"? I believe this is the question being asked.
That's what I mean by presumptuous. If that's really what they want the answer to, and what they object to, they should ask it plainly instead of alluding to it by asserting that there's some requisite but missing entitlement for the feature to exist on its face.
Either the Clerk would have read it, because they bought it, or borrowed it from the library.
I mean they could have read it on company time as well.
However, let us not use a straw man here. It's not some company clerk; it's one of the largest companies on earth using other people's copyright to make more money for themselves.
The author also gets a cut of this, no? It is the author's prerogative to sell their books to be read on a Kindle and they get compensated, maybe perhaps unfairly, when I choose to buy the book. Whatever happens after that, other than copying it and sticking it on Anna's Archive, is basically free game as long as I'm making derivative works and making money off them. Anything short of that, I'm good.
You don’t need any rights to execute the feature. The user owns the book. The app lets the user feed the book into an LLM, as is absolutely their right, and asks questions.
1. The user doesn't own the book, the user has a revocable license to the book. Amazon has no qualms about taking away books that people have bought.
2. I doubt the Kindle version of the LLM will run locally. Is Amazon repurposing the author-provided files, or will the users' device upload the text of the book?
You agree that we should own our digital content but it sounds like you don’t want this particular capability because… fuck Amazon.
I can totally understand that sentiment but I don’t think giving up end user capabilities to spite Amazon is logically aligned with wanting ownership of digital media.
> All these weird mental gymnastics to argue that users should have less rights
We probably agree more than not. But users getting more rights isn’t universally good. To finish an argument, one must consider the externalities involved.
Hasn't training already been ruled to be fair use in the recent lawsuits against Meta and Anthropic? The courts ruled that works must be legally acquired, yes, but training was fair use.
> Edit: Given I've been a reader of HN for some time, I am perfectly aware that on Kindle you don't own the content, just a license to the content. Don't need any more people pointing this out! Lol. In my house we still call owning a license to something that is not likely to be revoked "owning it".
The number of people completely - and likely intentionally - missing your point is both frustrating and completely unsurprising.
A quick reminder that this is part of HN's guidelines
> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.
Once you've bought that food and it's on your plate, how would you feel about the farmer who grew it coming up and forcing you to eat it with a specific fork or only using approved utensils?
There are LLMs that can process a 1 million token context window. Amazon Nova 2 for one, even though it's definitely not the highest quality model. You just put the whole book in context and make the LLM answer questions about it. And given that the domain is pretty limited, you can just store the KV cache for the most popular books on SSD, eliminating quite a bit of cost.
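A rough sketch of the caching idea, in Python. This is not how Amazon does it, and it does not touch a real model's KV cache: `fake_prefill` is a hypothetical stand-in for the expensive pass over the book, and `PrefixCache` is a made-up name. It only shows the shape of "compute once per book, keep the result on disk, reuse it for every later question".

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


class PrefixCache:
    """Disk-backed cache keyed by a hash of the book text (illustrative only)."""

    def __init__(self, cache_dir: Path, build_state: Callable[[str], bytes]):
        self.cache_dir = cache_dir
        self.build_state = build_state  # the expensive step we want to avoid repeating
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.misses = 0  # how many times the expensive step actually ran

    def _path_for(self, book_text: str) -> Path:
        digest = hashlib.sha256(book_text.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.bin"

    def get(self, book_text: str) -> bytes:
        path = self._path_for(book_text)
        if path.exists():
            # Cache hit: read the stored state instead of recomputing it.
            return path.read_bytes()
        self.misses += 1
        state = self.build_state(book_text)
        path.write_bytes(state)
        return state


def fake_prefill(book_text: str) -> bytes:
    # Hypothetical placeholder for a model's prefill pass over the whole book.
    return book_text.encode("utf-8")[::-1]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = PrefixCache(Path(tmp), fake_prefill)
        cache.get("Call me Ishmael.")
        cache.get("Call me Ishmael.")
        print("expensive passes run:", cache.misses)
```

In a real serving stack the stored blob would be the model's attention state for the book prefix, which is far larger than the text itself, so whether this saves money depends on storage and load costs versus recomputation.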
If you want proper answers, yes. If you want to rely on whatever Reddit or TikTok says about the book, then I guess at that point you're fine with hallucinations and others doing the thinking for you anyway. Hence the issues brought up in the article.
I wouldn't trust an LLM for anything more than the most basic questions if it didn't actually have text to cite.
> It's not training on books, but it will answer questions about the book you're reading. Doesn't pass the sniff test.
What do you mean? Presumably the implication is that it will essentially read the book (or search through it) in order to answer questions about it. An LLM can of course summarize text that's not in its training set.
"Reads the book" is the issue, yes. It's possible they aren't training. But to be frank, we're long past the BOTD where tech companies aren't going to attempt to train on every little thing fed into their servers.
In the case of a novel, or even certain text books, the author relies on the reader not jumping ahead. Especially murder mysteries and those kind of genres. There are artistic reasons for that, and it can wreck the work.
In my experience, AI summaries often miss points or misrepresent work. There is a human element to reading a well written novel. An AI will miss some of the subtleties and references.
I agree but for some reason, there are people who enjoy doing that. I think they should be allowed to do as they like.
In any case, Amazon claims this feature is spoiler-free and that would be easy to implement. It likely works by feeding the book into an LLM context, and they could simply feed in the portion you've already read.
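To make that concrete, here is a minimal Python sketch of the "only feed in what you've read" approach. It is a guess at the general technique, not Amazon's implementation; `ReadingPosition`, `visible_text`, and `build_prompt` are hypothetical names, and the prompt wording is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadingPosition:
    chapter_index: int  # zero-based index of the chapter being read
    char_offset: int    # characters already read within that chapter


def visible_text(chapters: list[str], pos: ReadingPosition) -> str:
    """Return only the text up to the reader's current position."""
    if not 0 <= pos.chapter_index < len(chapters):
        raise ValueError("chapter_index out of range")
    finished = chapters[: pos.chapter_index]
    current = chapters[pos.chapter_index][: max(pos.char_offset, 0)]
    return "\n\n".join(finished + [current])


def build_prompt(chapters: list[str], pos: ReadingPosition, question: str) -> str:
    """Assemble a prompt that contains no text past the reading position."""
    return (
        "Answer using only the excerpt below. "
        "If the excerpt does not contain the answer, say so.\n\n"
        f"EXCERPT:\n{visible_text(chapters, pos)}\n\n"
        f"QUESTION: {question}"
    )
```

Note this only keeps unread text out of the context. If the model already knows the book from training, it could still leak spoilers, so truncation alone is not a guarantee.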
You don't mind having an LLM owned by a megacorp lecturing you about the meaning of a book?
"Yes this is a good question about 1984 by George Orwell, you could indeed be tempted to compare the events of this book with current authoritarianism and surveillance but I can assure you this book is a pure work of fiction and at best can only be compared to evil states such as China and Russia, rest assured that as a US citizen you are Free"
This sounds useful for when you forget something that happened chapters earlier or when you space out and need to figure out what's happening. This feature should work for the user; authors shouldn't be able to deprive me of this tool.
Fantasy series seem like they've gotten longer and longer and it's often years between volumes. Many authors have started doing recaps of their previous books at the start of later volumes, but not all.
I use LLMs for that all the time. Most frontier models have books trained in, so I just ask for a spoiler-free recap or ask about certain characters. Works well in my experience, and made jumping back into Wheel of Time a lot easier.
Or were away from the book for a while and are coming back to it. I've read 1000 page books that I just got tired of reading, so put 'em down for a bit to read something else. Anathem by Neal Stephenson comes to mind.
> when you space out and need to figure out what's happening
Ok it's not just me that gets to the end of a page and it's like the page didn't exist.
On the other hand the times I use the search function on the ereader most are when I stumble across a continuity error. It would be interesting if a story-reading AI can be used to detect those. Not that I want there to be less human editing in books, if anything we seem to need more.
Good lord, at this point just drone off in front of a Netflix show. How bad has it gotten that you even suggest that one can "forget what happened chapters earlier"? This is not normal.
This is hardly that strange, life gets in the way for many of us. I too have many times wished for an easy way to recap a book I've had to put down for a week or two - this is by no means an endorsement of how Amazon have done it here, but you are making incredibly arrogant assumptions about how others enjoy books.
If you have ample free time and few commitments and/or you read very short pop fiction, I could see how you might believe this. But there's a vast world of very long and dense literature, and also...people have kids and a life that gets in the way. Combine the two and...well, I can see why this feature would be useful. :)
I'm an avid reader. I'm reading The Silmarillion right now. There have been countless times where a short summary of an area/character/etc has been helpful. Luckily, in this particular case, there are very good Tolkien fan dictionaries that serve well.
As another example, I read the Aubrey-Maturin series earlier this year. Many times I would have liked a quick summary of a previous voyage or of a political plotline or something.
Even Pynchon's lightest work yet, the newly released Shadow Ticket, has me immediately reading it a 2nd time for many details I missed the first time through. Imagine the arrogance of shaming a Pynchon reader assuming they should never need to check notes or refer back to earlier pages.
It's more telling about the current state of affairs than the person who commented. Forgetting things is part of life, move on, we don't need daddy Bezos sucking 1.21 gigawatts per request to tell you that some side character drank a beer 12 chapters ago so you can enjoy the joke you just missed.
Sure, the _result_ is transformative, but it had to consume the content in the first place to make a transformative work (grey area).
Yes, you _could_ argue that it's a plain review, but you need to prove that it's actually reviewing it rather than just quoting. But as it's the machine doing it, that further muddies the water. Is it the end user who's generating the review? Does the Kindle license actually allow them to do that?
However, the other thing to note is that there is a contract between publishers and Amazon that goes over and above copyright. It will say how and where works can be distributed and how they can be processed. For example, you're not able to distribute the book and then create your own audiobook version of it.
I was imagining a feature that allowed me to search across all my books, which is something that O’Reilly Learning does (actually it gives you answers from their entire range since their model is a license to access all content).
Come to think of it, given how early O’Reilly had this it’s shocking to me that Amazon hasn’t done this sooner.
The O’Reilly Learning search was simultaneously the best and worst of all the early LLM applications. They have tons of high quality content that underpins very useful answers. I’ve also found a bunch of worthwhile books by looking through the sources.
It’s the worst because their template response is extremely unimaginative. I can be asking process questions about managing tech debt and it still gives me a code sample with every response as though I was asking “how do I add this button to my app”.
Who asked for this? I thought Amazon was all about customer obsession, and I'm having a hard time imagining readers saying "You know? This book would SEND ME if it had a chat assistant."
My kids have book reports and stuff. Lately I can use AI to generate non-trivial questions about the books and use it to quiz them without me knowing anything about the books. Been super useful.
> I'm on page 750 of Anathem. Please give me a recap.
> You are currently reading the section of the book where the main characters have been launched into orbit aboard a repurposed military rocket and are preparing to board the alien starship, the Daban Urnud.
Given different printings and formats for books, I’d be very surprised if asking about a specific page number works reliably at all across books. I don’t even know if EPUB has page numbers embedded to keep track, because the number of words on a page of an ereader is entirely arbitrary. My wife has her Kindle in grandma mode or something. Only about 50 words fit on a page.
I would expect much more reliable results from chapter numbers though.
I’m looking forward to this. Especially reading old classics, or catching up on an old series and trying to figure out “is this character the sister or niece of the main protagonist? Outline their character development”
I used to have to read fan wikis to figure this out.
But it will especially be useful for all the textbooks I’ve bought years ago. Being able to ask it questions (to the content itself) is better than asking ChatGPT or Gemini because they don’t have the content (they’re summarizing summaries found on the web)
Seems like a great feature. What I’d really like is a “recap for me till here” for books I started reading then stopped for whatever reason. I was reading Unsong for a bit (great book, very enjoyable) and then lately the baby has wanted a lot more attention so I didn’t get much reading done. I just want to catch up quick so I can continue.
LLMs are great for this, for the plot and character questions, etc.
Authors have nothing to do with it. It’s my device, my book that I bought. It would be like if YouTube banned a screen reader. These are at two different levels of the stack.
It's not that direct a counterexample. We have no idea what underlying data from the Fallout show they gave to the model to summarize. Surely it wasn't the scripts of the episodes. The nature of the error makes me think it might have been given stills of the show to analyze visually. In this case we know it is the text of the book.
Amazon made a video with AI summarizing their own show, and got it broadly wrong. Why would we expect their book analysis to be dramatically better - especially as far fewer human eyes are presumably on the summaries of some random book that sold 500 copies than official marketing pushes for the Fallout show.
For the reason I gave in my answer: it would be answering based on the text of the book. I don't expect it to be particularly great regardless because these features always use cheap models.
I’ve found the complete opposite as recently as last week. When I ask a deep question about a book it will hallucinate whole paragraphs of bogus justification and even invent characters
The LLM just working on its own is just generative intelligence. You have to ground it if you want the real stuff. The Kindle app has the text of the book and I'd want it to put that in the LLM context.