Monday, June 20, 2011

Lessons Learned Part 7 - Double-check Your Shit

Currently (Re)Reading - Gotcha! by Christie Craig (E-book)

A lot of published folk I know are putting out their backlists.  The problem for a lot of them is that they don't have a digital copy.  So they're having to scan their old typewritten manuscripts or paperback copies, and in a lot of cases, they need new artwork because of licensing issues concerning the original covers.  What I'm about to say applies to both new authors and old:

(1)  There's no such thing as a perfect OCR program (at least not yet).

(a) For those who aren't computer geeks, OCR stands for Optical Character Recognition.  Most OCR programs 'read' each line of scanned text and try to translate the shapes they 'see' into the corresponding alphanumeric characters in a word processing file.  If the OCR is missing some 'training', it may translate 'K'ehleyr the Klingon' into 'Cpl. Klinger'.

ALWAYS proof-read your text after a scan!  In fact, have a couple of people put eyeballs to the novel.  Then you won't be mocked on various blogs where the bloggers get a thrill out of trashing anyone self-publishing.

(b) Depending on the OCR program, the file may go through a couple of conversions before it gets to your preferred word processing format.  This means you could have invisible formatting characters totally fucking up your file.  I know this sounds like extra work, but I highly recommend using Smashwords founder Mark Coker's nuclear option to make sure you have a clean file.  Yes, it's a pain in the ass to reset indents and re-italicize words, but it beats smacking your head against the wall trying to fix formatting errors on your e-book.
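For the geeks in the audience, here's a rough sketch of what stripping those invisible characters can look like in Python.  This is NOT Mark Coker's method, just an illustration of the same idea.  It assumes you've already saved your manuscript as plain text, and the function name strip_invisible is something I made up for the example.

```python
import unicodedata

# Line breaks and tabs are technically "control" characters, but we want them.
KEEP = {"\n", "\t"}

def strip_invisible(text: str) -> str:
    """Drop invisible control/format characters and normalize odd spaces.

    Hypothetical helper for illustration only; it works on plain text,
    not on .doc or .docx files.
    """
    cleaned = []
    for ch in text:
        if ch in KEEP:
            cleaned.append(ch)
            continue
        category = unicodedata.category(ch)
        # Cc = control characters, Cf = format characters
        # (soft hyphens, zero-width spaces, and similar junk).
        if category in ("Cc", "Cf"):
            continue
        # Zs = space separators, e.g. the non-breaking space.
        if category == "Zs":
            cleaned.append(" ")
            continue
        cleaned.append(ch)
    return "".join(cleaned)

if __name__ == "__main__":
    messy = "Chap\u00adter\u200b One\u00a0\x0c\r\n"
    print(repr(strip_invisible(messy)))
```

You'd still have to redo your indents and italics afterward, same as with the nuclear option, and you'd still need human eyeballs on the result.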

(2) Don't assume public domain art is actually in the public domain.

Sorry to bust your fuzzy bubbles, but people lie on the internet.  ALWAYS double-check rights to a particular piece of art.  An acquaintance got a cease and desist letter from the owner of a particular painting after she used a JPG from a public domain site.  Even better, pay the couple of dollars to license photos or art from a reputable company.  It'll save your ass.

Anybody else have a double-check suggestion?


  1. Smart OCR is one of the latest optical character recognition software programs. It's significantly better than most other programs. Try it here:

  2. Thanks for the heads-up, Nina. I'll pass the info along!