scanned documents

Mark Pavlick

List members:

I'd appreciate any suggestions regarding formatting scanned
documents. I need to incorporate scanned documents with various formats
into a book with a unified format. How can I impose an overall order on
these documents? Thanks in advance for any help. - Mark Pavlick
 
Daiya Mitchell

Are you scanning the documents as images or as text that you can edit? Can
you say a little bit more about what you mean by "a unified format"?

What version of Word and OS?
 
Mark Pavlick

Ms. Mitchell,

Sorry for leaving out these details. I'm scanning the older material
as text. The formatting that emerges from this process needs a lot of
work.
I want all of the book chapters to have the same general formatting.
I wonder if this is best done chapter-by-chapter, or somehow all at
once at the end.
Thanks for your help.

Mark Pavlick
 
Mark Pavlick

Hello,

I'm working with Word 2004 for the Mac, with the 11.3 update, on a
Mac running OS 10.4.8. Thanks again. - Mark
 
little_creature

If I understand correctly, I would suggest preparing a template with various
styles such as, let's say:
Body: Book Antiqua 12pt, alignment justified...

1_title: based on Body, and you can add: font: bold, size: 14pt,
bullets-level: 1, page break before

2_title: based on Body, and you can add: font: bold, size:
12pt, bullets-level: 2, space before and space after
....
Then you apply these styles to your scanned document. If you later
change your mind and want Helvetica instead of Book Antiqua, you
just change Body; all other styles based on Body will change as
well, and so the whole document will change automatically.
Is that what you wanted?
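
For anyone who finds the "based on" idea easier to see in code, here is a rough sketch in Python. It is only a toy model of style inheritance, not Word's real object model: the Style class, the resolve method, and the spacing values are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Style:
    """Toy stand-in for a paragraph style (hypothetical, not Word's API)."""
    name: str
    based_on: Optional["Style"] = None
    overrides: dict = field(default_factory=dict)

    def resolve(self) -> dict:
        """Return effective formatting: inherited settings plus own overrides."""
        inherited = self.based_on.resolve() if self.based_on else {}
        return {**inherited, **self.overrides}


# Base style holds the settings every other style should share.
body = Style("Body", overrides={"font": "Book Antiqua", "size": 12,
                                "align": "justified"})
# Heading styles store only what differs from Body.
title1 = Style("1_title", based_on=body,
               overrides={"bold": True, "size": 14, "page_break_before": True})
# Spacing values below are assumed, purely for illustration.
title2 = Style("2_title", based_on=body,
               overrides={"bold": True, "space_before": 12, "space_after": 6})

# Changing the base font once changes every style that inherits it.
body.overrides["font"] = "Helvetica"
print(title1.resolve()["font"])  # Helvetica
print(title2.resolve()["size"])  # 12, inherited from Body
```

The point is that a heading style records only its differences, so one edit to Body reaches everything built on it.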
 
Elliott Roper

A few additional suggestions:

Examine your workflow. Choose your OCR tools carefully. Paste
unformatted.

In more detail:
Since scanning old documents is very time consuming, make sure you do
it once and properly. I'd advise retaining the images and OCR-ing those
rather than scanning direct to text. That way you can avoid re-handling
the originals as you repair the OCR mistakes.

Test your OCR techniques and scanning settings together. For example, here I
use ReadIris OCR and a Canon Lide30 scanner. 300 dpi greyscale with
auto levels, saving to jpg, works well. I train ReadIris on each
document's fonts and layout when there are lots of similar documents to
be OCR'd, and store the settings for later use. I set ReadIris to
ignore the layout and font so it does nothing more than read the text
and detect genuine paragraph endings.

My next step would work well in your workflow. I tell ReadIris to OCR
to the clipboard, then I cmd-tab to Word and paste unformatted. That
means I get the text in the uniform style using similar techniques to
those little_creature suggested. Then, with the jpg on one screen and
the Word doc on the other, I fix the OCR.
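
If your OCR tool cannot be told to ignore layout, something like the following Python sketch might get plain text into a similar state before pasting. It is not what ReadIris does internally; reflow_ocr_text is a hypothetical helper, and it assumes that blank lines in the OCR output mark the real paragraph endings.

```python
import re


def reflow_ocr_text(raw: str) -> list[str]:
    """Split plain OCR text into paragraphs (assumes blank lines end paragraphs)."""
    paragraphs = []
    for block in re.split(r"\n\s*\n", raw.strip()):
        # Rejoin words hyphenated across a line break, e.g. "docu-\nments".
        block = re.sub(r"(\w)-\n(\w)", r"\1\2", block)
        # Collapse remaining line breaks and runs of spaces into single spaces.
        paragraphs.append(re.sub(r"\s+", " ", block).strip())
    return paragraphs


sample = "Scanning old docu-\nments is slow.\nDo it once.\n\nKeep the images."
for paragraph in reflow_ocr_text(sample):
    print(paragraph)
```

Be aware that the hyphen rule also joins genuine compounds that happen to break at a line end (well-known becomes wellknown), so the result still needs proofreading against the image.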

While doing that, if I'm being efficient, I'm scanning more input with
GraphicConverter to add to my pile of JPGs.

If I'm not, I take lots of coffee breaks and chat to people on
newsgroups. It is such a soul-destroying task.
 
Daiya Mitchell

Yeah, Elliott's right. The most important thing is sorting out the OCR
properly. Like he says, I'd save various steps along the way, then paste
unformatted into a custom template like the one Little_Creature described.

For managing lots and lots of text, the general best way is to create your
own template and combine all chapters into one file. Accept that manual
reformatting will be necessary. If you set up a custom
template the way you like it, and use keyboard shortcuts to assign various
styles, this is not as much of a hassle as it seems. Me, I would probably create
the template, format each chapter as a separate doc based on the same
template, then combine.

Most useful articles:
http://word.mvps.org/FAQs/Customization/CreateATemplatePart1.htm
http://word.mvps.org/FAQs/Formatting/NumberingFrontMatter.htm

More info (probably more than you need) on doing a book in Word:
http://daiya.mvps.org/bookword.htm
 
Hugh Watkins

Sounds like you are ending up with transcriptions.

If you use a contrasting typeface, all will be clear.

Hugh W
 
