Yeah, I've noticed this too.
As far as I can tell, the problem is worse with books published in the 'pre-digital' era. These have to be scanned & then run through a text-recognition (OCR) program. The next step ought to be to run a spell-checker over the result, but I don't reckon that always happens.
If you've ever messed around with any OCR package, you've probably seen what a mess it can make of scanned text: not just outputting garbage, screwing up paragraph breaks & ignoring hyphenation, but also generating perfectly valid words that don't make sense in context... 'arid' is a common result instead of 'and', so is 'nun' instead of 'him'. Spell-checkers will miss those.
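For anyone cleaning up their own scans, here's a rough sketch of one way to catch the valid-but-wrong words a spell-checker lets through. It just flags words from a lookup of known mix-ups so a human can check them in context. The `OCR_CONFUSIONS` table and the `flag_suspects` function are names I've made up for illustration, and the two pairs in the table are the ones mentioned above; it isn't taken from any OCR package.

```python
import re

# Hypothetical lookup: a valid word the OCR may output -> what it probably meant.
# Both pairs are the examples from the post; extend it from your own proofreading.
OCR_CONFUSIONS = {
    "arid": "and",
    "nun": "him",
}

def flag_suspects(text):
    """Return (line_number, word, likely_intended) for each suspicious word."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for word in re.findall(r"[A-Za-z]+", line):
            guess = OCR_CONFUSIONS.get(word.lower())
            if guess is not None:
                hits.append((lineno, word, guess))
    return hits

sample = "He looked at the nun arid smiled.\nThe desert was arid."
for hit in flag_suspects(sample):
    print(hit)
```

Note that the second line of the sample gets flagged too, even though 'arid' is correct there. That's deliberate: this only points a proofreader at likely trouble spots, it doesn't replace the proofreading.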
Getting decent output from OCR means proofreading the document, which I guess many providers of free versions will be loath to do as it means paying someone.
Theoretically, publications typeset in programs like Quark & InDesign should be fine, but I found that generating an EPUB file from InDesign CS4 messed with the formatting & I still had to edit the output to get it into a usable state for my e-reader. Hopefully CS5 is better.
I guess it's the price we have to pay for getting in near the start.
A quick Google for the Harpo book suggests it originally came out in 1961, so it has probably fallen foul of crappy OCR.
Pete.
__________________
Psalm 37:8 ...do not fret, it leads only to evil. Blues Bass Players Club # I-IV-II.
Aria Pro II SB-1000 FrankenFretless, SB-900, TSB-400, ZZB Custom.