It’s some combination of fair use and raw data not being copyrightable. My understanding is that only the creative expression is copyrighted, not the individual words themselves. So, if you distill out all of the creativity and keep only information about the work, you’re probably fine copyright-wise.
There’s a long tradition of compiling and publishing concordances, which are just indices of every place each word appears in the original text. They’re generally not useful without access to the original, so nobody seems to mind them very much. Google’s index is just a modern form of the same thing.
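To make the idea concrete, here’s a rough sketch of a concordance in Python. The function name `build_concordance` and the sample sentence are made up for illustration, and this is certainly not how Google builds its index. It only shows that the result records where words occur, not the text itself.

```python
import re
from collections import defaultdict

def build_concordance(text):
    """Map each lowercased word to the list of word positions where it occurs."""
    index = defaultdict(list)
    # Split into words (letters and apostrophes only) and record each position.
    for position, word in enumerate(re.findall(r"[a-z']+", text.lower())):
        index[word].append(position)
    return dict(index)

# Hypothetical sample text, just for illustration.
concordance = build_concordance("the cat sat on the mat")
```

Looking up `concordance["the"]` tells you where that word appears, which isn’t much use unless you also have the original to look those places up in.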