Deleting paragraph markers at the end of each line

Stephen Fox

I'm new to this group, so forgive me if this has been asked and answered
before. Using Office X.

I've converted a long (5 MB) PDF file to a Word document. But each line
is now a separate paragraph. Is there a way to delete these line
paragraphs so as to consolidate lines into a single paragraph?

Hope that's clear.

Thanks!

Steve
 
Stephen Fox

Clive said:
Well thanks for nothing, Steve -- I've just seen the other thread, created
long before your new one, and realize I've been completely wasting my time.

Clive Huggan
============

On 26/2/07 2:24 PM, Clive Huggan wrote:

Welcome, Steve!

Click on the pilcrow button (backwards "P" -- paragraph mark) on the toolbar
to make non-printing marks visible if they aren't already. Check whether the
lines end just in a paragraph mark or a space followed by a paragraph mark.

Bring up the "Find & Replace" pane with Command-Shift-h. <==[I think it's
Command-Shift-h in Word X -- I skipped from Word 2001 to Word 2004 so can't
be sure; if not it's Command-h].

Click in the "Find what" box, then hold down the Shift key and type "6" to
give you a caret "^", then follow that with a "p". Together, "^p" stands for
"paragraph mark" in this pane.

In the "Replace with" box, either type a space (if you will need one to stop
the adjacent words from being joined) or if not just click in the box, which
will result in the paragraph mark being replaced with nothing.

Click "Find next" then "Replace" if it does what you want.

Once you're happy, key Command-r to trigger the Replace action. If you
didn't have to watch for the genuine paragraph marks you could key
Command-a to "Replace All" instead.

I usually display the Word document next to the PDF and replace the
"genuine" paragraph marks first with a temporary character -- say, the
Option-z character. Then I key Command-a to "Replace All" (i.e., to replace
all the remaining "non-genuine" paragraph marks). Then I replace the
temporary characters with paragraph marks, again with Command-a.

After the first time, the whole procedure takes less time than reading this
post! ;-)
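
The same join can also be tried outside Word, on text exported from the PDF. What follows is only a rough Python sketch, not part of the Word procedure above: join_lines is a made-up name, and it assumes you already have the lines of one paragraph as a list of strings. It adds a space only where a line did not already end in one, which is the "space or nothing" choice from the "Replace with" step, made line by line rather than once for the whole document.

```python
def join_lines(lines):
    """Join the lines of one paragraph, adding a space only where needed."""
    joined = ""
    for line in lines:
        # If the previous line ended in a bare paragraph mark (no trailing
        # space), add a space so the adjacent words are not run together.
        if joined and not joined.endswith(" "):
            joined += " "
        joined += line
    return joined


# Hypothetical sample: the first line keeps its trailing space, the second does not.
print(join_lines(["each line ", "is now", "a separate paragraph"]))
```

Deciding per line means a document that mixes both kinds of line ending still comes out with single spaces between words.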

You may well know that you can paste text from PDFs into Word via Edit menu
-> Paste Special -> Unformatted Text and the pasted-in text will have the
characteristics of the paragraph in which your insertion point is located --
saves a lot of re-formatting. You can also export PDF text from Acrobat if
that's what you're using.

I do a huge amount of this on occasions (have just checked this newsgroup as
light relief from an hour of it). If you need to do it often, we have a
macro that automates the paste-in; post back if you want details.

Cheers,

Clive Huggan
Canberra, Australia
(My time zone is 5-11 hours different from North America and Europe, so my
follow-on responses to those regions can be delayed)
============================================================
Avoid long delays before your post appears -- use Entourage or newsreader
software -- see http://word.mvps.org/Mac/AccessNewsgroups.html
============================================================



Clive,

I hadn't seen the other thread either, before starting my own. Sorry.
I will give your technique a try. Got nowhere with the suggestions from
the other thread.

BTW, is this a top or bottom posting list?

Steve
 
Daiya Mitchell

Stephen Fox said:
BTW, is this a top or bottom posting list?

General policy--maintain the existing pattern to avoid the worst-case
scenario of back and forth responses. Inline posting with snipping
irrelevant material is always accepted and welcomed. Top posting
completely accepted. Bottom posting--just realize that some people will
not bother to scroll all the way to the bottom to read your post. Some
will.

I personally get kinda furious to be asked to scroll past four screens
of irrelevant material to read a "thanks, all fine now", because if the
conversation is over, it really doesn't matter about the logical flow of
discussion. I generally choose top or bottom posting as seems
appropriate to the context.

See also:
http://word.mvps.org/Mac/AccessNewsgroups.html

Daiya
 
Stephen Fox

Clive,

I got your method to work, somewhat.

I'm going through the nearly 600 pages paragraph by paragraph, using the
Command/Shift/H technique. But that's better than one line at a time.

Steve

Most of the people who respond to a lot of posts in the Microsoft newsgroups
prefer top posting, Steve, because the time taken to scroll down becomes
significant. However, if a bottom-posting trend has already started, we tend
to follow. Otherwise it can become confusing. And of course there is inline
posting. No great dramas either way.

Clive Huggan

 
Stephen Fox

Clive,

I must have missed something, because the problem I ran into is a bit
more complicated than the ones you describe.

The first time I tried the ^p replace thing, it did get rid of them all,
including the "real" paragraph markers. I wound up with a document that
looked like a solid block of text. Bad news.

That left me with the same trick, but omitting the last line of each
paragraph. But going through a 600 page book in that manner is also
formidable.

Like I said, I must have missed something because you make it sound like
Cheez Whiz.

Steve

Clive said:
Thanks for the feedback, Steve.

Yes, those "real" end-of-para paragraph marks will be a pain -- but oh, the
sheer enjoyment when you zap all the remaining, unwanted paragraph marks in
one hit!

It's just occurred to me (sorry, I became very busy when I originally
replied): I should have mentioned that if there are two paragraph marks
between paragraphs, as there often are, the correcting is very quick because
your first step is to replace all instances of ^p^p with a temporary
character (say, the Option-z character), followed by replacing all the
remaining instances of ^p with nothing or a space, then replacing the
temporary character with ^p -- all done in a minute, even with 600 pages
(but on a Saved As copy, just in case ;-)

Though on second thoughts Elliott probably mentioned that.

Cheers,
Clive H
=======

 
Elliott Roper

Stephen Fox said:
Like I said, I must have missed something because you make it sound like
Cheez Whiz.

You did, and it is, except not as cheesy.

The bit you missed is replacing ^p^p (i.e., any sequence of two
paragraph marks) with something not likely to be in your document. It
becomes a temporary placeholder for your real paragraph ends while you
nuke the unwanted ones into a smouldering pile of green glowing space
characters.
Yes, at that stage, your document is one enormous paragraph.

The next step replaces your placeholder with brand new genuine
paragraph marks.
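
For anyone who would rather script it than click through Find & Replace, the three passes can be sketched in Python. Treat this as an illustration under assumptions, not as how Word does it: it assumes the document has been saved as plain text with one "\n" per paragraph mark, and both the function name unwrap_paragraphs and the placeholder string "@@PARA@@" are invented for the example.

```python
def unwrap_paragraphs(text, placeholder="@@PARA@@"):
    """Join hard-wrapped lines, keeping breaks marked by two paragraph marks."""
    if placeholder in text:
        # The placeholder must be something not already in the document.
        raise ValueError("placeholder already occurs in the text; pick another")
    # Pass 1: park every real paragraph end (two marks in a row).
    text = text.replace("\n\n", placeholder)
    # Pass 2: every mark that is left is an unwanted line end; make it a space.
    text = text.replace("\n", " ")
    # Pass 3: turn each placeholder back into a single genuine paragraph mark.
    return text.replace(placeholder, "\n")


sample = "each line is now\na separate paragraph\n\nsecond paragraph\nwraps here"
print(unwrap_paragraphs(sample))
# prints:
# each line is now a separate paragraph
# second paragraph wraps here
```

As in Word, run it on a copy. Runs of three or more marks in a row are not handled specially here and would leave a stray space at the start of the following paragraph.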
 
