replace all ^p 's unless followed by a date

B

Brent Whaling

I am using word to prepare a delimited file to be imported into Excel and am
having trouble getting rid of extra ^p characters.

Basically, starting with the second line of the file I need to remove all ^p
unless they are immediately followed by the date (ex. 04-NOV-2008). I've
figured out the pattern mask (^p^#^#-^$^$^$-20^#^#).

I was hoping to use "Find and Replace" code but I'm wondering how to create
the above exception. Maybe a properly place "if" statement or perhaps there
is a property of "Find" that I'm not aware of.

Any help would be much appreciated.

Until now I've been doing it this way but I keep finding more and more ^p
characters and I would like to implement a more elegant solution.


Dim wdApp As Word.Application, wdDoc As Word.Document

On Error Resume Next
Set wdApp = GetObject(, "Word.Application")
If Err.Number <> 0 Then 'Word isn't already running
Set wdApp = CreateObject("Word.Application")
End If
On Error GoTo 0

Set wdDoc = wdApp.Documents.Open("C:\temp\esi_monitor.dat")

wdApp.Visible = True

With wdDoc.Range.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = ".^p.^p"
.Replacement.Text = ".."
.Execute Replace:=wdReplaceAll
End With
With wdDoc.Range.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "^p)"
.Replacement.Text = ")"
.Execute Replace:=wdReplaceAll
End With
With wdDoc.Range.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "D^p@"
.Replacement.Text = "D@"
.Execute Replace:=wdReplaceAll
End With
With wdDoc.Range.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "\@|"
.Replacement.Text = "\&|"
.Execute Replace:=wdReplaceAll
End With

wdDoc.SaveAs "C:\temp\esi_monitor.dat"
wdDoc.Close
wdApp.Quit
 
J

Jay Freedman

Word's Find/Replace isn't really set up to do exceptions. You could program
a macro that does a Find for each ^p and then tries to figure out whether
the text that follows it matches the rest of your pattern, but that would be
horribly inefficient and slow. A better method is to do three "Replace All"
steps:

1. Use a wildcard search (that is, include .MatchWildcards = True in
the settings of the Find object; see
http://www.gmayor.com/replace_using_wildcards.htm) for
^13([0-9][0-9]-[A-Z]{3}-20[0-9][0-9])
and replace with
~~\1
so each date that was preceded by a paragraph mark is now
preceded by two tildes (the assumption being that two tildes
won't appear anywhere else in the document).

2. Search (without wildcards) for
^p
and replace with nothing (that is, .Replacement.Text = "").

3. Search for
~~
and replace with
^p

--
Regards,
Jay Freedman
Microsoft Word MVP
Email cannot be acknowledged; please post all follow-ups to the newsgroup so
all may benefit.
 
B

Brent Whaling

This worked as long as I took out the ^13. With the ^13 it never found an
instance. I was using the the find/replace without the ^13 but now (as would
be expected) I have examples where it is now inserting ^P's where there
shouldn't be.

Any ideas?

I really appreciate your post, it got me on the right track.

Jay Freedman said:
Word's Find/Replace isn't really set up to do exceptions. You could program
a macro that does a Find for each ^p and then tries to figure out whether
the text that follows it matches the rest of your pattern, but that would be
horribly inefficient and slow. A better method is to do three "Replace All"
steps:

1. Use a wildcard search (that is, include .MatchWildcards = True in
the settings of the Find object; see
http://www.gmayor.com/replace_using_wildcards.htm) for
^13([0-9][0-9]-[A-Z]{3}-20[0-9][0-9])
and replace with
~~\1
so each date that was preceded by a paragraph mark is now
preceded by two tildes (the assumption being that two tildes
won't appear anywhere else in the document).

2. Search (without wildcards) for
^p
and replace with nothing (that is, .Replacement.Text = "").

3. Search for
~~
and replace with
^p

--
Regards,
Jay Freedman
Microsoft Word MVP
Email cannot be acknowledged; please post all follow-ups to the newsgroup so
all may benefit.

Brent said:
I am using word to prepare a delimited file to be imported into Excel
and am having trouble getting rid of extra ^p characters.

Basically, starting with the second line of the file I need to remove
all ^p unless they are immediately followed by the date (ex.
04-NOV-2008). I've figured out the pattern mask
(^p^#^#-^$^$^$-20^#^#).

I was hoping to use "Find and Replace" code but I'm wondering how to
create the above exception. Maybe a properly place "if" statement or
perhaps there is a property of "Find" that I'm not aware of.

Any help would be much appreciated.

Until now I've been doing it this way but I keep finding more and
more ^p characters and I would like to implement a more elegant
solution.


Dim wdApp As Word.Application, wdDoc As Word.Document

On Error Resume Next
Set wdApp = GetObject(, "Word.Application")
If Err.Number <> 0 Then 'Word isn't already running
Set wdApp = CreateObject("Word.Application")
End If
On Error GoTo 0

Set wdDoc = wdApp.Documents.Open("C:\temp\esi_monitor.dat")

wdApp.Visible = True

With wdDoc.Range.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = ".^p.^p"
.Replacement.Text = ".."
.Execute Replace:=wdReplaceAll
End With
With wdDoc.Range.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "^p)"
.Replacement.Text = ")"
.Execute Replace:=wdReplaceAll
End With
With wdDoc.Range.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "D^p@"
.Replacement.Text = "D@"
.Execute Replace:=wdReplaceAll
End With
With wdDoc.Range.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "\@|"
.Replacement.Text = "\&|"
.Execute Replace:=wdReplaceAll
End With

wdDoc.SaveAs "C:\temp\esi_monitor.dat"
wdDoc.Close
wdApp.Quit
 
B

Brent Whaling

I figured it out. I created a macro with the code "msgbox asc(selection)" to
find the actual ascii character being used. I found that the ^p's were
actually ascii char(10)'s. So I did a find/replace ^10 with ^13. Then ran the
find/replace that Jay supplied. It worked great.

Thanks again!

Brent Whaling said:
This worked as long as I took out the ^13. With the ^13 it never found an
instance. I was using the the find/replace without the ^13 but now (as would
be expected) I have examples where it is now inserting ^P's where there
shouldn't be.

Any ideas?

I really appreciate your post, it got me on the right track.

Jay Freedman said:
Word's Find/Replace isn't really set up to do exceptions. You could program
a macro that does a Find for each ^p and then tries to figure out whether
the text that follows it matches the rest of your pattern, but that would be
horribly inefficient and slow. A better method is to do three "Replace All"
steps:

1. Use a wildcard search (that is, include .MatchWildcards = True in
the settings of the Find object; see
http://www.gmayor.com/replace_using_wildcards.htm) for
^13([0-9][0-9]-[A-Z]{3}-20[0-9][0-9])
and replace with
~~\1
so each date that was preceded by a paragraph mark is now
preceded by two tildes (the assumption being that two tildes
won't appear anywhere else in the document).

2. Search (without wildcards) for
^p
and replace with nothing (that is, .Replacement.Text = "").

3. Search for
~~
and replace with
^p

--
Regards,
Jay Freedman
Microsoft Word MVP
Email cannot be acknowledged; please post all follow-ups to the newsgroup so
all may benefit.

Brent said:
I am using word to prepare a delimited file to be imported into Excel
and am having trouble getting rid of extra ^p characters.

Basically, starting with the second line of the file I need to remove
all ^p unless they are immediately followed by the date (ex.
04-NOV-2008). I've figured out the pattern mask
(^p^#^#-^$^$^$-20^#^#).

I was hoping to use "Find and Replace" code but I'm wondering how to
create the above exception. Maybe a properly place "if" statement or
perhaps there is a property of "Find" that I'm not aware of.

Any help would be much appreciated.

Until now I've been doing it this way but I keep finding more and
more ^p characters and I would like to implement a more elegant
solution.


Dim wdApp As Word.Application, wdDoc As Word.Document

On Error Resume Next
Set wdApp = GetObject(, "Word.Application")
If Err.Number <> 0 Then 'Word isn't already running
Set wdApp = CreateObject("Word.Application")
End If
On Error GoTo 0

Set wdDoc = wdApp.Documents.Open("C:\temp\esi_monitor.dat")

wdApp.Visible = True

With wdDoc.Range.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = ".^p.^p"
.Replacement.Text = ".."
.Execute Replace:=wdReplaceAll
End With
With wdDoc.Range.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "^p)"
.Replacement.Text = ")"
.Execute Replace:=wdReplaceAll
End With
With wdDoc.Range.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "D^p@"
.Replacement.Text = "D@"
.Execute Replace:=wdReplaceAll
End With
With wdDoc.Range.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "\@|"
.Replacement.Text = "\&|"
.Execute Replace:=wdReplaceAll
End With

wdDoc.SaveAs "C:\temp\esi_monitor.dat"
wdDoc.Close
wdApp.Quit
 
J

Jay Freedman

The occurrence of ASCII 10 instead of 13 sometimes happens when you import a
text file from other applications, especially ones that are generated by old
mainframe-based programs. It should never happen in a file created in Word
or other Office programs.

I'm glad you figured it out, and thanks for posting the solution.

Brent said:
I figured it out. I created a macro with the code "msgbox
asc(selection)" to find the actual ascii character being used. I
found that the ^p's were actually ascii char(10)'s. So I did a
find/replace ^10 with ^13. Then ran the find/replace that Jay
supplied. It worked great.

Thanks again!

Brent Whaling said:
This worked as long as I took out the ^13. With the ^13 it never
found an instance. I was using the the find/replace without the ^13
but now (as would be expected) I have examples where it is now
inserting ^P's where there shouldn't be.

Any ideas?

I really appreciate your post, it got me on the right track.

Jay Freedman said:
Word's Find/Replace isn't really set up to do exceptions. You could
program a macro that does a Find for each ^p and then tries to
figure out whether the text that follows it matches the rest of
your pattern, but that would be horribly inefficient and slow. A
better method is to do three "Replace All" steps:

1. Use a wildcard search (that is, include .MatchWildcards = True in
the settings of the Find object; see
http://www.gmayor.com/replace_using_wildcards.htm) for
^13([0-9][0-9]-[A-Z]{3}-20[0-9][0-9])
and replace with
~~\1
so each date that was preceded by a paragraph mark is now
preceded by two tildes (the assumption being that two tildes
won't appear anywhere else in the document).

2. Search (without wildcards) for
^p
and replace with nothing (that is, .Replacement.Text = "").

3. Search for
~~
and replace with
^p

--
Regards,
Jay Freedman
Microsoft Word MVP
Email cannot be acknowledged; please post all follow-ups to the
newsgroup so all may benefit.

Brent Whaling wrote:
I am using word to prepare a delimited file to be imported into
Excel and am having trouble getting rid of extra ^p characters.

Basically, starting with the second line of the file I need to
remove all ^p unless they are immediately followed by the date (ex.
04-NOV-2008). I've figured out the pattern mask
(^p^#^#-^$^$^$-20^#^#).

I was hoping to use "Find and Replace" code but I'm wondering how
to create the above exception. Maybe a properly place "if"
statement or perhaps there is a property of "Find" that I'm not
aware of.

Any help would be much appreciated.

Until now I've been doing it this way but I keep finding more and
more ^p characters and I would like to implement a more elegant
solution.


Dim wdApp As Word.Application, wdDoc As Word.Document

On Error Resume Next
Set wdApp = GetObject(, "Word.Application")
If Err.Number <> 0 Then 'Word isn't already running
Set wdApp = CreateObject("Word.Application")
End If
On Error GoTo 0

Set wdDoc = wdApp.Documents.Open("C:\temp\esi_monitor.dat")

wdApp.Visible = True

With wdDoc.Range.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = ".^p.^p"
.Replacement.Text = ".."
.Execute Replace:=wdReplaceAll
End With
With wdDoc.Range.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "^p)"
.Replacement.Text = ")"
.Execute Replace:=wdReplaceAll
End With
With wdDoc.Range.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "D^p@"
.Replacement.Text = "D@"
.Execute Replace:=wdReplaceAll
End With
With wdDoc.Range.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = "\@|"
.Replacement.Text = "\&|"
.Execute Replace:=wdReplaceAll
End With

wdDoc.SaveAs "C:\temp\esi_monitor.dat"
wdDoc.Close
wdApp.Quit
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top