Sunday, March 25, 2007

Scanning, Scanning, and More Scanning

I put in 4.5 hours scanning today (including 3 hours at Scanfest). I am all scanned out. But I'm very pleased with what I've accomplished. I've scanned in all of my friend Bob Postula's articles that I intend to cyber-publish. :-) I've got a total of 11 articles. Some are only a page or two. A couple are 10+ pages.

I've been thinking about what format I want to use to publish the articles. PDF is the easiest for me to do, but not all search engines will index PDF documents. Text (HTML) documents are much better for search engine indexing, but they take much more time for me to prepare. Even with OCR (optical character recognition) scanning, it will take me many hours to clean up and correct the text. Most of Bob's articles include Polish surnames and locations in Poland, which use a lot of diacritical marks. OCR software doesn't like that (at least mine doesn't).

My main objective is to get Bob's name and research out on the 'net and readily available so I think I'm better off going with the text format. It'll just be easier for people to find that way. It'll probably take me as long to cyber-publish his articles as it took him to write them in the first place ;-) But that's OK. He's worth it.

2 comments:

  1. It was fun scanning and chatting with you, Jasia. It's amazing how fast three hours can fly by!

  2. I'm glad you got yours done! Can you publish them in PDF first while you work on the text format? I admit the technology you're talking about is beyond me right now, so I don't know if that doubles the work that you'd need to do.
