PDF to Word

Kdesigns

I am really new to Word and I need to convert Mac PDFs into Word.

Acrobat Pro 7
Word 12 (or 2004/X)

Is there a 3rd party piece of software that may do this well?

Thanks,
Kathy
 
Elliott Roper

Kdesigns said:
I am really new to Word and I need to convert Mac PDFs into Word.

Acrobat Pro 7
Word 12 (or 2004/X)

Is there a 3rd party piece of software that may do this well?

Nothing much works better than OCR-ing the document.

Sad but true. Plenty of OCR applications will eat the PDF directly. I
use Readiris.
The trouble is that there is no consistency in the order in which PDF
stuff is slapped on the page. For instance, multi-column stuff might get
merged, or the running headings might all appear first. It depends on
what wrote the PDF. There is not even any guarantee that what looks like
text to you and me actually is text, e.g. if the author did a convert to
paths in Illustrator or FreeHand while building the PDF pages. If you
don't believe this, open a few documents in Preview.app, select the text
tool, then try dragging through lots of text and pasting the result into
Word or any other program that pastes text. But then you already knew
this or you wouldn't be asking here.
 
Phillip Jones

In Acrobat 7.0.8, open the PDF, go down to Save As, and for the file type
choose Word (.doc).

In Acrobat 8, choose to export the file and pick Word (.doc). The next
window gives you a chance to name the file and choose where you want it.
There is also a Customize button; click it to choose what you want to
save and how to save it.

--
------------------------------------------------------------------------
Phillip M. Jones, CET |LIFE MEMBER: VPEA ETA-I, NESDA, ISCET, Sterling
616 Liberty Street |Who's Who. PHONE:276-632-5045, FAX:276-632-0868
Martinsville Va 24112 |[email protected], ICQ11269732, AIM pjonescet
------------------------------------------------------------------------

If it's "fixed", don't "break it"!

mailto:p[email protected]

<http://www.kimbanet.com/~pjones/default.htm>
<http://www.kimbanet.com/~pjones/90th_Birthday/index.htm>
<http://www.kimbanet.com/~pjones/Fulcher/default.html>
<http://www.kimbanet.com/~pjones/Harris/default.htm>
<http://www.kimbanet.com/~pjones/Jones/default.htm>

<http://www.vpea.org>
 
Kurt

Elliott Roper said:
Yeah? Does it actually work? That is the full Acrobat, not the reader?
Adobe used to have an OCR thing built into the PC version, but since I
hate Acrobat with a passion I never let it too close to any machine I
want to keep using, so I have not tried it lately.

Reader is only for reading. That's why it's free.
Acrobat Pro does convert all your text and images into Word, but not
formatted well.
 
Kdesigns

I have Pro 7 and it only gave me 1 line of text on 2 pages. The original
file has a photo background with text from Photoshop. Does Pro 8 do a better
job?
 
John McGhie [MVP - Word and Word Macintosh]

Hi Kathy:

No.

The issue is that the content of a PDF is not necessarily "text". The
content of a PDF is not expressed in a form that can be recognised as
"text".

Even if a PDF contains "characters", as Elliott notes, the characters are
not necessarily in any specific sequence. PDF is a "page description"
language. Content often appears in PDF in the order that the printer will
make up the image for the page.

Because PDF is a page description language, each element (i.e. each piece of
ink on the page) contains a "position" (so many points in and so many points
down the page...)

Which means that content can appear in the file in ANY sequence that suits
the generating application.
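
To make the ordering problem concrete, here is a minimal Python sketch. The fragment tuples, the `reading_order` function and the `line_tolerance` value are all made up for illustration; this is not how Acrobat or any particular converter works. It assumes each fragment carries an x/y position with y measured up from the bottom of the page, and it only copes with a single column of text.

```python
from typing import List, Tuple

# (x, y, text): hypothetical positioned fragments, y measured up from the page bottom.
Fragment = Tuple[float, float, str]


def reading_order(fragments: List[Fragment], line_tolerance: float = 2.0) -> str:
    """Group fragments into lines by y position, then read each line left to right."""
    # Top of the page first (largest y), then left to right.
    ordered = sorted(fragments, key=lambda f: (-f[1], f[0]))
    lines: List[List[Fragment]] = []
    for frag in ordered:
        # A fragment joins the current line if its y is close to that line's first fragment.
        if lines and abs(lines[-1][0][1] - frag[1]) <= line_tolerance:
            lines[-1].append(frag)
        else:
            lines.append([frag])
    return "\n".join(
        " ".join(f[2] for f in sorted(line, key=lambda f: f[0])) for line in lines
    )


# Fragments in the order a generating application might have written them:
# the footer first, then the body text back to front.
page = [
    (300.0, 40.0, "Page 1"),
    (72.0, 700.0, "into Word"),
    (160.0, 720.5, "Mac PDFs"),
    (72.0, 720.0, "Converting"),
]

file_order = " ".join(text for _x, _y, text in page)
sorted_text = reading_order(page)
```

Reading the fragments in file order gives "Page 1 into Word Mac PDFs Converting"; sorting by position recovers the three lines a person would read. With two or more columns, even this position sort merges text across columns, which is the kind of mess Elliott describes.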

Microsoft Word does a bit of that itself. The headers, footers, styles and
graphics are all lumped into containers at the extreme bottom of the file.

And the text normally contains no positioning at all: Word "pours" text onto
the screen, character by character. There is no such thing as a "page" in a
Word file. Word generates page images on the way out to the screen or the
printer; they don't exist within the file.

So you have two completely different paradigms for representing information,
in Word and PDF. If you get the text out at all, consider yourself lucky.
An experienced Word user knows that having done so, it is far, far quicker
to discard any formatting that came with it and start again than to try to
fix the formatting you got from the PDF.

Word naturally formats text using STYLES. Styles are simply named
collections of formatting properties. PDF does not describe styles, only
their individual properties. Far better to strip the formatting from ex-PDF
text and re-apply the correct styles. Not only quicker, but completely
consistent.
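
As a rough illustration of why re-applying styles gives consistent results, here is a small Python sketch. The `STYLES` table, the run tuples and both helper functions are hypothetical; they model the idea only and have nothing to do with Word's actual file format or object model.

```python
from typing import Dict, List, Tuple

# Hypothetical named styles: each is just a named collection of formatting properties.
STYLES: Dict[str, Dict[str, object]] = {
    "Heading 1": {"font": "Arial", "size": 16, "bold": True},
    "Body Text": {"font": "Times New Roman", "size": 12, "bold": False},
}

Run = Tuple[str, Dict[str, object]]


def strip_formatting(runs: List[Run]) -> List[str]:
    """Keep only the text of each (text, properties) run."""
    return [text for text, _props in runs]


def apply_style(text: str, style_name: str) -> Dict[str, object]:
    """Attach the properties of one named style to a piece of text."""
    return {"text": text, "style": style_name, **STYLES[style_name]}


# Runs as they might come out of a PDF: similar paragraphs, inconsistent properties.
ex_pdf: List[Run] = [
    ("First paragraph.", {"font": "TimesNewRomanPSMT", "size": 11.96, "bold": False}),
    ("Second paragraph.", {"font": "Times-Roman", "size": 12.04, "bold": False}),
]

restyled = [apply_style(text, "Body Text") for text in strip_formatting(ex_pdf)]
```

After the round trip both paragraphs carry exactly the same font and size, because the properties come from one named style instead of from whatever the PDF happened to record.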

As Phillip notes, the "Save as a Word document" facility in Acrobat 7 and
later can be useful for retrieving the text from a PDF. But discard the
formatting that comes with it, and do not depend on the text being in the
correct sequence.

Sorry: What you see is about the best you're going to get. There are no
"good" answers.

Which is why I tend to get straight back to people who send me a PDF and ask
them for a usable copy of whatever it is. For my purposes, PDF is "pretty,
but useless" :)

Cheers

--

Please reply to the newsgroup to maintain the thread. Please do not email
me unless I ask you to.

John McGhie <[email protected]>
Microsoft MVP, Word and Word for Macintosh. Business Analyst, Consultant
Technical Writer.
Sydney, Australia +61 (0) 4 1209 1410
 
Kdesigns

Okey Dokey! I'm going to Plan B.



 
Kurt

Kurt said:
Reader is only for reading. That's why it's free.
Acrobat Pro does convert all your text and images into Word, but not
formatted well.

BTW, I'm using Pro 8. I've tried this with a few documents, but I never
have any need to convert to Word. PDF is so much better for keeping fonts
and formatting between platforms.
 
Michel Bintener

Kurt said:
Reader is only for reading. That's why it's free.

Adobe Reader does have an option to save a PDF file as a plain text
document, though.

--
Michel Bintener
Microsoft MVP
Office:Mac (Entourage & Word)

***Always reply to the newsgroup.***
 
Phillip Jones

Yes, it's the full Acrobat Pro 7.0.8.

Yes, it does a respectable job on simple Word documents.

Elliott said:
Yeah? Does it actually work? That is the full Acrobat, not the reader?
Adobe used to have an OCR thing built into the PC version, but since I
hate Acrobat with a passion I never let it too close to any machine I
want to keep using, so I have not tried it lately.

 
Kdesigns

Mine are simple files but most of the text and none of the photo backgrounds
show up in Word. I went with a program called PDF2Office and it works
surprisingly well, much better than Pro 7.0.8
 
Elliott Roper

Kdesigns said:
Mine are simple files but most of the text and none of the photo backgrounds
show up in Word. I went with a program called PDF2Office and it works
surprisingly well, much better than Pro 7.0.8

Thanks. That's good to know. While I OCR protected PDF stuff, it is
tiresome.
 
