NEWBIE:splitting multi-page word doc into single word doc - thank

patti

Hi,

Environment: Windows XP/ home edition; sp2
Office 2000 [no help installed and no cd to install help :-( ]
experience: newbie to vba, c coding experience

I'm doing this free of charge for a friend who would do anything for
anyone. The hard drive holding his client files was hosed, and he was able to
retrieve the files from another source. Unfortunately, the retrieved files are
not named properly and are incompatible with his business operations.


Problem description/vba coding segment requested:

Over 60 documents named xxxxxnnnn.doc contain client info, delimited by
over 65 page breaks. Each section between page breaks holds one client's info,
which needs to be extracted into its own Word document and named
client_name_date_of_service.doc so it is easy to distinguish.

The relevant client information (including what goes into the final file
name) is contained within each section of the larger document. The last record
in the original file may not end with a page break, but I'd still like to be
able to capture that one as well.

The file name should be of the form client_name_date_of_service.doc (this
information is found within each section).

Objective:
- Cycle through all the Word documents (approx. 60 files) in a given folder
- For each large document file, start the splitting process:
  - open the file
  - for each page break found (between 65 and 80 page breaks per file)
    - capture the first line of the page [this is the client name]
    - search each paragraph for the string "date"
    - generate the file name (e.g. joe_smith_1_24_07.doc) and save it in a
      string variable
    - capture the entire page, including its trailing page break [only if
      absolutely necessary], and copy it into
      client_name_date_of_service.doc (e.g. joe_smith_1_24_07.doc)
  - next [for each page break until there are no more page breaks in this file]
    - note: the final client's info may not have a trailing page break,
      but I'd still like to be able to capture it and store it in its
      proper clientname_date_of_appt.doc
- next [for each file containing client data within the page breaks]
- close/properly dispose of any allocated resources
- error handler to determine the cause of a failure, close/dispose, and
  properly shut down the application.

?? any additional steps that I've neglected to mention.

I enjoy helping people and learning new things. Many thanks to all who
take the time to share their time and talents by responding with code
capable of accomplishing this task.

With much gratitude and appreciation,
Patti
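
[Editor's sketch] The outer loop in the objective list above (cycle through every .doc in a folder, open it, process it, close it, and report any error) might look roughly like the following. This is only a minimal sketch: the folder path `C:\ClientFiles\` and the `SplitOneDocument` placeholder are assumptions made for illustration and do not come from the thread.

```vba
Sub ProcessFolder()
    ' Hypothetical folder holding the large source files.
    Const SRC_FOLDER As String = "C:\ClientFiles\"
    Dim docFile As String
    Dim src As Document

    On Error GoTo Failed
    docFile = Dir(SRC_FOLDER & "*.doc")      ' first matching file
    Do While docFile <> ""
        Set src = Documents.Open(FileName:=SRC_FOLDER & docFile, ReadOnly:=True)
        SplitOneDocument src
        src.Close SaveChanges:=wdDoNotSaveChanges
        Set src = Nothing
        docFile = Dir                        ' next matching file
    Loop
    Exit Sub

Failed:
    ' Report the cause, then release the document that was open.
    MsgBox "Error " & Err.Number & ": " & Err.Description
    If Not src Is Nothing Then src.Close SaveChanges:=wdDoNotSaveChanges
End Sub

Sub SplitOneDocument(ByVal src As Document)
    ' Placeholder (assumption): the per-client splitting would go here.
    Debug.Print "Would split: " & src.Name
End Sub
```

If something like this is used, the split files are probably best saved to a different folder than SRC_FOLDER, so that Dir does not pick up the newly created documents while the loop is still running.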
 
Doug Robbins - Word MVP

Sub splitter()
'
' splitter Macro
' Macro created 16-08-98 by Doug Robbins to save each page of a document
' as a separate file with the name Page#.DOC
'
    Dim Counter As Long, Pages As Long, DocName As String
    Dim Source As Document, Target As Document

    Set Source = ActiveDocument
    ' Start at the top of the document
    Selection.HomeKey Unit:=wdStory
    Pages = Source.BuiltInDocumentProperties(wdPropertyPages)
    Counter = 0
    While Counter < Pages
        Counter = Counter + 1
        DocName = "Page" & Format(Counter)
        ' Cut the page holding the cursor; the following page moves up
        Source.Bookmarks("\Page").Range.Cut
        Set Target = Documents.Add
        Target.Range.Paste
        Target.SaveAs FileName:=DocName
        Target.Close
    Wend
End Sub


--
Hope this helps.

Please reply to the newsgroup unless you wish to avail yourself of my
services on a paid consulting basis.

Doug Robbins - Word MVP


patti

Hi Doug,

Many thanks for your generous offer of code. This does indeed parse the
larger file, breaking it down and writing it to a file formatted as
page[n].doc.

The only downside is that on every pass through the collection of larger
files, it overwrites the page[n].doc files written for the previous file.

I've still quite a bit of work to do with this one though.

The bigger piece for me would be to locate two important pieces of
information, namely:
- the first paragraph or sentence as this contains the client name.
- Then establish a search throughout the page for a paragraph/sentence
starting with the string "date" mentioned in my original post.

These two critical pieces of information form the
client_name_date_of_service.doc file name, which is the naming convention this
person's business uses. Once these pieces are located, I can copy the contents
of the client transaction (the code you posted), and then perform a 'file save
as: client_name_date_of_service.doc'. As an example:
joe_smith_1_24_07.doc [this is the first client located in the larger file]
frank_hood_1_31_06.doc [this is the second client located in the larger
file], etc.

If you, or anyone else, has any ideas on how to gather these two pieces of
information, located between each page break, I'd really appreciate it. This
way, the files will be named properly and in keeping with his requirements.

I so appreciate you sharing this code segment. If you have any additional
suggestions as to how to extract these two important pieces of info, while in
the parsing of each page break, I'd really appreciate it.

Thanks ever so much for your help. With much gratitude and appreciation,
Patti
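
[Editor's sketch] One way to avoid the overwriting described above would be to build each output name from the source document's own name plus the page number, so pages from different source files cannot collide. This is only a sketch: the helper name `PageDocName` and the `_Page001` naming pattern are assumptions for illustration, not something posted in the thread.

```vba
Function PageDocName(ByVal src As Document, ByVal pageNo As Long) As String
    Dim baseName As String
    Dim dotPos As Long

    baseName = src.Name                       ' file name including extension
    dotPos = InStrRev(baseName, ".")          ' position of the last dot, if any
    If dotPos > 1 Then baseName = Left(baseName, dotPos - 1)

    ' Zero-padded page number keeps the files sorted in Explorer.
    PageDocName = baseName & "_Page" & Format(pageNo, "000") & ".doc"
End Function
```

In the splitter macro this could replace the DocName line, for example `Target.SaveAs FileName:=PageDocName(Source, Counter)`.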


Russ

Patti,
What we need to search for are consistent patterns that you say are on each
page. Can you figure out what the patterns are? If not, then show us a few
pages of data so that we can see how it is laid out. You can, of course,
disguise the names, etc., but we need to know where the names and date
formats are in relationship to paragraph marks or other consistent text,
font, color, heading styles, etc.

patti

Hi Russ,

Thanks so much for your interest. I really appreciate it.

Description: Split large files into separate files
- Open each large Word .doc file containing client info.
- Capture two fields:
    Client[n]_name (e.g. Patty Smith)
    Date: (e.g. Date: 9/4/07)
  Filename generated: Patty_Smith_9_4_07
- Client information/requirements are captured in various paragraphs, which
  may extend beyond one page.
- Cut the specific client's information (pages [1-n]) and save as
  client_name_date_as_recorded_in_page1 (e.g. Patty_Foobar1_7_18_07)

Sample included below for reference:

Sample:
Patty Foobar1
Address
Telephone Number
Date: 7/18/07 [this may or may not be located in this area of the file]

Client information/requirement captured here and may extend into multiple
pages.

PAGE 2
Patty Foobar1
Additional requirements may be captured here

--------------------------------- page break ---------------------------------

Rob Foobar
Date: 9/4/07 [ the date field appears somewhere in the client information
header, but the person who input the data was not consistent in their entry
methods, which means it needs to be searched and retrieved]

Client information/requirement captured here and may extend into multiple
pages.
--------------------------------- page break ---------------------------------
Kanga Roo
Date: 9/1/07


Client information/requirement captured here
--------------------------------- page break ---------------------------------

--- Thanks again for any recommendations, and for sharing your time and
talents with me.

With much gratitude,
Patti
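
[Editor's sketch] Because a client record in the sample above can run over several pages, splitting on every page (as the \Page bookmark does) would cut such a record apart. A possible alternative is to split only at manual page breaks, which Word's Find can locate with the code "^m". The sketch below assumes that clients are separated by manual page breaks and that continuation pages come from automatic page flow; the output folder `C:\ClientFiles\Split\` and the `Record001.doc` naming are hypothetical.

```vba
' Hypothetical output folder; it must already exist.
Const OUT_FOLDER As String = "C:\ClientFiles\Split\"

Sub SplitOnManualBreaks()
    Dim src As Document, brk As Range
    Dim startPos As Long, n As Long

    Set src = ActiveDocument
    Set brk = src.Content
    startPos = brk.Start

    ' "^m" is Word's Find code for a manual page break.
    Do While brk.Find.Execute(FindText:="^m", MatchWildcards:=False, _
                              Forward:=True, Wrap:=wdFindStop)
        n = n + 1
        ' Everything from the previous break up to this one is one record.
        SaveRecord src.Range(Start:=startPos, End:=brk.Start), n
        startPos = brk.End
        brk.Collapse Direction:=wdCollapseEnd
    Loop

    ' The last record may have no trailing page break.
    n = n + 1
    SaveRecord src.Range(Start:=startPos, End:=src.Content.End), n
End Sub

Sub SaveRecord(ByVal rec As Range, ByVal n As Long)
    Dim target As Document
    ' Skip ranges that hold nothing but paragraph marks.
    If Len(Replace(rec.Text, vbCr, "")) = 0 Then Exit Sub
    Set target = Documents.Add
    ' FormattedText copies text and formatting without using the clipboard.
    target.Content.FormattedText = rec.FormattedText
    target.SaveAs FileName:=OUT_FOLDER & "Record" & Format(n, "000") & ".doc"
    target.Close
End Sub
```

Unlike the cut-and-paste approach, this leaves the source document unchanged, which may be safer while the naming logic is still being worked out.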


Russ

Hi Patti,
More info please.
So a client's name appears *by itself* (no label) in the first paragraph of
each page (and may repeat on *consecutive pages* if more information is
available for that particular client)? And is consistent in that respect
from page 1 to end of document?
The date you want is always the first date found on the first page of each
client and always formatted month/day/year(two digit year)?
The name and date are always in the main text area and not in header or
footer of page?
The name and date are not formatted differently than the rest of the text?
The date is always preceded by the label Date:?

patti

Hi Russ,

Thanks so much for your response and inquiries. I'll do my best to
address them.

First page:
- Client name is in the very first paragraph, followed by a paragraph mark.
  For example: Patty Foobar1, paragraph mark
- The date string appears somewhere before the page break.
- Date string format:
  Date: -> mm/dd/yy (where -> is some symbol inserted by Word), followed by a
  paragraph mark. For example: Date: -> 6/26/07

Client data may or may not span multiple pages.
If there are multiple pages:
The client name is in the first paragraph, but may be underlined or contain
extraneous information (eg. Patty Foobar1-6/26/07), followed by a paragraph
mark.

New filename (Patty_Foobar1_6_26_07.doc) [named based upon first page]
- should contain everything on page 1 and any subsequent pages, where applicable.
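
A minimal sketch of that naming rule, as a plain string function. BuildClientFileName is a hypothetical helper, and it assumes that anything after a hyphen in the first paragraph is extraneous, so a hyphenated surname would be cut short.

```vba
' Hypothetical helper: builds e.g. "Patty_Foobar1_6_26_07.doc" from the
' first paragraph of a client's first page and the date text found there.
Function BuildClientFileName(ByVal firstPara As String, _
                             ByVal dateText As String) As String
    Dim namePart As String
    Dim i As Long
    namePart = firstPara
    ' Drop a trailing paragraph mark if the caller passed one in.
    If Right$(namePart, 1) = vbCr Then
        namePart = Left$(namePart, Len(namePart) - 1)
    End If
    ' Assumption: text from the first hyphen on (e.g. "-6/26/07") is not
    ' part of the name.
    i = InStr(namePart, "-")
    If i > 0 Then namePart = Left$(namePart, i - 1)
    namePart = Replace(Trim$(namePart), " ", "_")
    ' Slashes are not valid in file names, so they become underscores.
    BuildClientFileName = namePart & "_" & _
        Replace(Trim$(dateText), "/", "_") & ".doc"
End Function
```

For example, BuildClientFileName("Patty Foobar1-6/26/07" & vbCr, "6/26/07") gives Patty_Foobar1_6_26_07.doc.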

Header/Footer questions:
The pieces of information that will end up in the new client_name_date.doc
are not located in the header or footer sections.

Many thanks once again for your help and interest.

Gratefully,
Patti
=====================================================

Russ said:
Hi Patti,
More info please.
So a client's name appears *by itself* (no label) in the first paragraph of
each page (and may repeat on *consecutive pages* if more information is
available for that particular client)? And is consistent in that respect
from page 1 to end of document?
The date you want is always the first date found on the first page of each
client and always formatted month/day/year (two-digit year)?
The name and date are always in the main text area and not in header or
footer of page?
The name and date are not formatted differently than the rest of the text?
The date is always preceded by the label Date:?

patti said:
Hi Russ,

Thanks so much for your interest. I really appreciate it.

Description: Split large files into separate files
- open each large word.doc file containing client info.
capture two fields:
Client[n]_name (eg. Patty Smith)
Date: (eg. Date: 9/4/07)
Filename generated: Patty_Smith_9_4_07
Client information/requirements are captured in various paragraphs, which may
extend beyond one page.
Cut the specific client's information (pages [1-n]) and save it as
client_name_date_as_recorded_in_page1
(eg. Patty_Foober1_7_18_07)

Sample included below for reference:

Sample:
Patty Foober1
Address
Telephone Number
Date: 7/18/07 [ this may or may not be located in this area of the file]

Client information/requirement captured here and may extend into multiple
pages.

PAGE 2
Patty Foobar1
Additional requirements may be captured here

---------------------------------page break-----------------------------------

Rob Foobar
Date: 9/4/07 [ the date field appears somewhere in the client information
header, but the person who input the data was not consistent in their entry
methods, which means it needs to be searched and retrieved]

Client information/requirement captured here and may extend into multiple
pages.
---------------------------------page break-----------------------------------
Kanga Roo
Date: 9/1/07


Client information/requirement captured here
---------------------------------page break-----------------------------------

--- Thanks again for any recommendations, and for sharing your time and
talents with me.

With much gratitude,
Patti

====================================================


Russ said:
Patti,
What we need to search for are consistent patterns that you say are on each
page. Can you figure out what the patterns are? If not, then show us a few
pages of data so that we can see how it is laid out. You can, of course,
disguise the names, etc., but we need to know where the names and date
formats are in relationship to paragraph marks or other consistent text,
font, color, heading styles, etc.

Hi Doug,

Many thanks for your generous offer of code. This does indeed parse the
larger file, breaking it down and writing it to a file formatted as
page[n].doc.

The only downside is that the output names are always page[n].doc, so each
time it runs on the next of the larger files, it overwrites the page[n].doc
files written for the previous one.
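
One way around the overwriting, sketched under the assumption that it is acceptable to prefix each page file with the name of the large file it came from. PageFileName is a hypothetical helper, not part of the posted macro.

```vba
' Hypothetical helper: makes page file names unique per source document,
' e.g. "clients0001.doc", page 7 -> "clients0001_Page007.doc".
Function PageFileName(ByVal sourceName As String, _
                      ByVal pageNo As Long) As String
    Dim base As String
    base = sourceName
    ' Strip the ".doc" extension from the source name, if present.
    If LCase$(Right$(base, 4)) = ".doc" Then
        base = Left$(base, Len(base) - 4)
    End If
    PageFileName = base & "_Page" & Format$(pageNo, "000") & ".doc"
End Function
```

In the macro, the line that builds DocName could then read DocName = PageFileName(Source.Name, Counter).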

I've still quite a bit of work to do with this one though.

The bigger piece for me would be to locate two important pieces of
information, namely:
- the first paragraph or sentence as this contains the client name.
- Then establish a search throughout the page for a paragraph/sentence
starting with the string "date" mentioned in my original post.

These two critical pieces of information form the file name
client_name_date_of_service.doc, which is the naming scheme this person's
business relies on. Once these pieces are located, I can copy the contents of
the client transaction (the code you posted), and then perform a 'file save as:
client_name_date_of_service.doc'. As an example:
joe_smith_1_24_07.doc [this is the first client located in the larger file]
frank_hood_1_31_06.doc [this is the second client located in the larger file],
etc.

If you, or anyone else, has any ideas on how to gather these two pieces of
information, located between each page break, I'd really appreciate it. This
way, the files will be named properly and in keeping with his requirements.

I so appreciate you sharing this code segment. If you have any additional
suggestions as to how to extract these two important pieces of info while
parsing each page break, I'd really appreciate it.

Thanks ever so much for your help. With much gratitude and appreciation,
Patti

Doug Robbins - Word MVP said:

Sub splitter()
'
' splitter Macro
' Macro created 16-08-98 by Doug Robbins to save each page of a document
' as a separate file with the name Page#.DOC
'
Dim Counter As Long, Pages As Long
Dim Source As Document, Target As Document
Dim DocName As String
Set Source = ActiveDocument
Selection.HomeKey Unit:=wdStory
Pages = Source.BuiltInDocumentProperties(wdPropertyPages)
Counter = 0
While Counter < Pages
    Counter = Counter + 1
    DocName = "Page" & Format(Counter)
    ' "\Page" is the predefined bookmark for the page holding the selection
    Source.Bookmarks("\Page").Range.Cut
    Set Target = Documents.Add
    Target.Range.Paste
    Target.SaveAs FileName:=DocName
    Target.Close
Wend
End Sub


--
Hope this helps.

Please reply to the newsgroup unless you wish to avail yourself of my
services on a paid consulting basis.

Doug Robbins - Word MVP

 
Doug Robbins - Word MVP

This is untested, but I think it will do what you want:

Sub splitter()
Dim Counter As Long, Pages As Long
Dim Source As Document, Target As Document
Dim ClientName As Range
Dim FileDate As Range
Dim DocName As String
Set Source = ActiveDocument
Selection.HomeKey Unit:=wdStory
Pages = Source.BuiltInDocumentProperties(wdPropertyPages)
Counter = 0
While Counter < Pages
    Counter = Counter + 1
    Source.Activate
    Source.Bookmarks("\Page").Range.Cut
    Set Target = Documents.Add
    Target.Range.Paste
    ' The first paragraph is the client name; drop its paragraph mark
    Set ClientName = Target.Paragraphs(1).Range
    ClientName.End = ClientName.End - 1
    DocName = Replace(Trim(ClientName.Text), " ", "_")
    ' Look for the first m/d/yy date that ends a paragraph
    Target.Activate
    Selection.HomeKey wdStory
    Selection.Find.ClearFormatting
    With Selection.Find
        .Text = "[0-9]{1,2}\/[0-9]{1,2}\/[0-9]{2}^13"
        .Forward = True
        .Wrap = wdFindStop
        .MatchWildcards = True
    End With
    If Selection.Find.Execute Then
        Set FileDate = Selection.Range
        FileDate.End = FileDate.End - 1
        ' Slashes are not allowed in file names, so 6/26/07 becomes 6_26_07
        DocName = DocName & "_" & Replace(FileDate.Text, "/", "_")
    End If
    Target.SaveAs FileName:=DocName
    Target.Close
Wend
End Sub


--
Hope this helps.

Please reply to the newsgroup unless you wish to avail yourself of my
services on a paid consulting basis.

Doug Robbins - Word MVP

 
patti

Hi Doug,

This looks like it fits the bill exactly. I particularly like the approach
you took to locate the date (With Selection.Find
.Text = "[0-9]{1,2}\/[0-9]{1,2}\/[0-9]{2}^13")

Now that's neat! Many thanks once again to all for all your help.
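
For anyone reading along, the pattern can be tried on its own. In a Word wildcard search, [0-9]{1,2} matches one or two digits, the backslash escapes the slash, and ^13 matches the paragraph mark that follows the date. The function below is only a sketch: FirstDateIn is a hypothetical name, and it leaves out the ^13 so that the returned text is just the date.

```vba
' Hypothetical helper: returns the first m/d/yy date found in a document,
' or an empty string if there is none.
Function FirstDateIn(ByVal doc As Document) As String
    Dim rng As Range
    Set rng = doc.Content
    With rng.Find
        .ClearFormatting
        .Text = "[0-9]{1,2}\/[0-9]{1,2}\/[0-9]{2}"
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindStop
        ' On success the range is redefined to cover the matched text.
        If .Execute Then FirstDateIn = rng.Text
    End With
End Function
```

Typing ?FirstDateIn(ActiveDocument) in the Immediate window is a quick way to check what the pattern picks up on a given page. Searching with a Range rather than the Selection also means the document does not have to be activated first.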

Gratefully,
Patti




 
Russ

Patti,
Sorry, I didn't have time to work more on this until this weekend.
However, starting with Doug's basic premise, I tried to add the check for
the same client on multiple pages.

Sub Splitter()
' Saves each client's pages from the active document as one file named
' Client_Name_m_d_yy; consecutive pages for the same client stay together.

Dim counter As Long, Source As Document, Target As Document
Dim blnFirstPage As Boolean, strClient As String
Dim lngPages As Long, strDocName As String, strClient2 As String
Dim myRange As Word.Range

Set Source = ActiveDocument
blnFirstPage = True
Selection.HomeKey Unit:=wdStory
lngPages = Source.BuiltInDocumentProperties(wdPropertyPages)
counter = 0
Application.ScreenUpdating = False
While counter < lngPages
    counter = counter + 1
    Source.Activate
    Source.Bookmarks("\Page").Range.Cut
    If blnFirstPage Then
        ' First page for this client: start a new document and build its name
        blnFirstPage = False
        Set Target = Documents.Add
        Target.Range.PasteAndFormat (wdFormatOriginalFormatting)
        strClient = Target.Paragraphs(1).Range.Text
        strClient = Left(strClient, Len(strClient) - 1)
        strClient2 = Replace(strClient, " ", "_")
        Set myRange = Target.Range.Duplicate
        With myRange.Find
            .Text = "[0-9]{1,2}/[0-9]{1,2}/[0-9]{2}"
            .MatchWildcards = True
            .Execute
            If .Found Then
                strDocName = strClient2 & "_" & _
                    Replace(myRange.Text, "/", "_")
            Else
                MsgBox "Date not found for client: " & strClient
                ' Put the page back, discard the new document, and restore
                ' screen updating before stopping
                Source.Undo
                Target.Close SaveChanges:=wdDoNotSaveChanges
                Application.ScreenUpdating = True
                Exit Sub
            End If
        End With
    Else
        ' Same client continues: append this page to the open target
        myRange.Start = Target.Range.End
        myRange.PasteAndFormat (wdFormatOriginalFormatting)
    End If
    ' Save when the source is used up or the next page starts another client
    If counter = lngPages Or InStr(Source.Paragraphs(1).Range.Text, _
        strClient) = 0 Then
        ' Trim empty trailing paragraphs; the count check avoids looping
        ' forever on a document that is a single empty paragraph
        Do While Target.Paragraphs.Count > 1 And _
            Target.Paragraphs.Last.Range.Characters.Count = 1
            Target.Paragraphs.Last.Range.Delete
        Loop
        Target.SaveAs FileName:=strDocName
        blnFirstPage = True
        Target.Close
    End If
Wend
Application.ScreenUpdating = True
End Sub
Hi Doug,

This looks like it fits the bill exactly. I particularly like the approach
you took to locate the date (With Selection.Find
.Text = "[0-9]{1,2}\/[0-9]{1,2}\/[0-9]{2}^13")

Now that's neat! Many thanks once again to all for all your help.

Gratefully,
Patti




Doug Robbins - Word MVP said:
This is untested, but I think it will do what you want:

Dim Counter As Long
Dim Source As Document, Target As Document
Dim ClientName As Range
Dim FileDate As Range
Dim DocName As String
Set Source = ActiveDocument
Selection.HomeKey Unit:=wdStory
Pages = Source.BuiltInDocumentProperties(wdPropertyPages)
Counter = 0
While Counter < Pages
Counter = Counter + 1
Source.Bookmarks("\Page").Range.Cut
Set Target = Documents.Add
Target.Range.Paste
Set ClientName = Target.Paragraphs(1).Range
ClientName.End = ClientName.End - 1
DocName = Left(ClientName.Text, InStr(ClientName.Text, " ") - 1) & "_" &
Mid(ClientName, InStr(ClientName.Text, " ") + 1)
Target.Activate
Selection.HomeKey wdStory
Selection.Find.ClearFormatting
With Selection.Find
.Text = "[0-9]{1,2}\/[0-9]{1,2}\/[0-9]{2}^13"
.Forward = True
.Wrap = wdFindStop
.MatchWildcards = True
End With
Selection.Find.Execute
Set FileDate = Selection.Range
FileDate.End = FileDate.End - 1
DocName = DocName & Format(FileDate.Text, "dd_MM_yy")
Target.SaveAs FileName:=DocName
Target.Close
Wend


--
Hope this helps.

Please reply to the newsgroup unless you wish to avail yourself of my
services on a paid consulting basis.

Doug Robbins - Word MVP

patti said:
Hi Russ,

Thanks so much for your response and inquiries. I'll do my best to
address them.

First page :
-Client Name is in the very first paragraph, followed by paragraph symbol
For example: Patty Foobar1, paragraph symbol
- date string is contained somewhere prior to the page break
- date string format:
Date: -> mm/dd/yy (where -> is some Microsoft inserted symbol), followed
by
paragraph symbol. For example: Date: -> 6/26/07

Client data may, or may not span multiple pages.
If there are multiple pages:
Client Name is in the first paragraph, but may be underlined, contain
extraneous information (eg. Patty Foobar1-6/26/07, followed by paragraph
symbol

New Filename (Patty_Foobar1_6_26_07.doc) [named based upon first page]
- should contain everything in page 1 and subsequent pages, where
applicable.

Header/Footer questions:
The pieces of information that will end up in the new client_name_date.doc
are
not located in the header or footer sections.

Many thanks once again for your help and interest.

Gratefully,
Patti
=====================================================

:

Hi Patti,
More info please.
So a clients name appears *by itself* (no label) in the first paragraph
of
each page (and may repeat on *consecutive pages* if more information is
available for that particular client)? And is consistent in that respect
from page 1 to end of document?
The date you want is always the first date found on the first page of
each
client and always formatted month/day/year(two digit year)?
The name and date are always in the main text area and not in header or
footer of page?
The name and date are not formatted differently than the rest of the
text?
The date is always preceded by the label Date:?

Hi Russ,

Thanks so much in your interest. I really appreciate it.

Description: Split large files into separate files
- open each large word.doc file containing client info.
capture two fields:
Client[n]_name (eg. Patty Smith)
Date: (eg. Date: 9/4/07)
Filename generated: Patty_Smith_9_4_07
Client information/requirements are captured in various paragraphs which may
extend beyond one page.
Cut specific client information (pages [1-n]) and save as
client_name_date_as_recorded_in_page1
(eg. Patty_Foober1_7_18_07)

Sample included below for reference:

Sample:
Patty Foober1
Address
Telephone Number
Date: 7/18/07 [this may or may not be located in this area of the file]

Client information/requirement captured here and may extend into multiple
pages.

PAGE 2
Patty Foobar1
Additional requirements may be captured here

--------------------------- page break ---------------------------

Rob Foobar
Date: 9/4/07 [the date field appears somewhere in the client information
header, but the person who input the data was not consistent in their entry
methods, which means it needs to be searched for and retrieved]

Client information/requirement captured here and may extend into multiple
pages.
--------------------------- page break ---------------------------
Kanga Roo
Date: 9/1/07


Client information/requirement captured here
--------------------------- page break ---------------------------

--- Thanks again for any recommendations, and for sharing your time and
talents with me.

With much gratitude,
Patti

====================================================


:

Patti,
What we need to search for are consistent patterns that you say are on each
page. Can you figure out what the patterns are? If not, then show us a few
pages of data so that we can see how it is laid out. You can, of course,
disguise the names, etc., but we need to know where the names and date
formats are in relationship to paragraph marks or other consistent text,
font, color, heading styles, etc.

Hi Doug,

Many thanks for your generous offer of code. This does indeed parse the
larger file, breaking it down and writing it to a file formatted as
page[n].doc.

The only downside is that for every iteration through the collection of
larger files, it overwrites the contents of the previous page[n].doc.

I've still quite a bit of work to do with this one though.

The bigger piece for me would be to locate two important pieces of
information, namely:
- the first paragraph or sentence, as this contains the client name.
- a paragraph/sentence starting with the string "date" (mentioned in my
original post), found by searching throughout the page.

These two critical pieces of information form the file name
client_name_date_of_service.doc, which is the naming scheme this person's
business uses. Once these pieces are located, I can copy the contents of the
client transaction (the code you posted), and then perform a 'file save as:
client_name_date_of_service.doc'. As an example:
joe_smith_1_24_07.doc [this is the first client located in the larger file]
frank_hood_1_31_06.doc [this is the second client located in the larger
file], etc.

If you, or anyone else, has any ideas on how to gather these two pieces of
information, located between each page break, I'd really appreciate it. This
way, the files will be named properly and in keeping with his requirements.

I so appreciate you sharing this code segment. If you have any additional
suggestions as to how to extract these two important pieces of info while
parsing each page break, I'd really appreciate it.

Thanks ever so much for your help. With much gratitude and appreciation,
Patti
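
One possible way to gather the second piece (the date) is to walk the paragraphs of a page and look for the one that starts with the label. This is only a sketch, assuming the label is always spelled "Date", is optionally followed by a colon and tabs or spaces, and that the date is the first token after it. The names ExtractDateText and FindDateInDocument are made up for illustration and do not come from any post in this thread.

```vba
' Returns the text after a leading "Date" label, or "" if the paragraph
' does not start with that label. Pure string handling, no Word objects.
Function ExtractDateText(ByVal paraText As String) As String
    Dim s As String
    s = Replace(paraText, vbCr, "")      ' drop the paragraph mark
    s = Replace(s, vbTab, " ")           ' treat tabs as spaces
    s = Trim$(s)
    If LCase$(Left$(s, 4)) <> "date" Then Exit Function
    s = Trim$(Mid$(s, 5))                ' text after "Date"
    If Left$(s, 1) = ":" Then s = Trim$(Mid$(s, 2))
    ' Keep only the first token, e.g. "7/18/07" from "7/18/07 [note]"
    If InStr(s, " ") > 0 Then s = Left$(s, InStr(s, " ") - 1)
    ExtractDateText = s
End Function

' Returns the first labelled date found in doc, or "" if there is none.
Function FindDateInDocument(ByVal doc As Document) As String
    Dim para As Paragraph
    For Each para In doc.Paragraphs
        FindDateInDocument = ExtractDateText(para.Range.Text)
        If Len(FindDateInDocument) > 0 Then Exit Function
    Next para
End Function
```

A paragraph that merely begins with a word like "Dated" would also match, so a stricter check may be needed on real data.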

:

Sub splitter()
'
' splitter Macro
' Macro created 16-08-98 by Doug Robbins to save each page of a document
' as a separate file with the name Page#.DOC
'
Dim Counter As Long, Pages As Long, DocName As String
Dim Source As Document, Target As Document

Set Source = ActiveDocument
Selection.HomeKey Unit:=wdStory
Pages = Source.BuiltInDocumentProperties(wdPropertyPages)
Counter = 0
While Counter < Pages
    Counter = Counter + 1
    DocName = "Page" & Format(Counter)
    ' "\Page" is the predefined bookmark for the page holding the selection
    Source.Bookmarks("\Page").Range.Cut
    Set Target = Documents.Add
    Target.Range.Paste
    Target.SaveAs FileName:=DocName
    Target.Close
Wend
End Sub
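
Because the macro always writes Page1, Page2, and so on, running it on a second source document overwrites the files from the first. One hedged way around that is to prefix each output name with the source document's name. The helpers SourceBaseName and PageDocName below are hypothetical names, not part of the original macro.

```vba
' Strips the extension from a file name, e.g. "clients0001.doc" -> "clients0001".
Function SourceBaseName(ByVal fileName As String) As String
    Dim dotPos As Long
    dotPos = InStrRev(fileName, ".")
    If dotPos > 1 Then
        SourceBaseName = Left$(fileName, dotPos - 1)
    Else
        SourceBaseName = fileName
    End If
End Function

' Builds "<source base name>_Page<n>" so that pages cut from different
' source documents get different file names.
Function PageDocName(ByVal sourceName As String, ByVal pageNo As Long) As String
    PageDocName = SourceBaseName(sourceName) & "_Page" & CStr(pageNo)
End Function
```

In the macro, the line that sets DocName could then read DocName = PageDocName(Source.Name, Counter).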


--
Hope this helps.

Please reply to the newsgroup unless you wish to avail yourself of
my
services on a paid consulting basis.

Doug Robbins - Word MVP

Russ

Patti,
Also, if you want, you could add the three/four lines in message below to
unwind (undo) the original source document back to the beginning after it is
successfully split.
Patti,
Sorry, I didn't have time to work more on this until this weekend.
However, starting with Doug's basic premise, I tried to add the check for
the same client on multiple pages.

Sub Splitter()

Dim counter As Long, Source As Document, Target As Document
Dim blnFirstPage As Boolean, strClient As String
Dim lngPages As Long, strDocName As String, strClient2 As String
Dim myRange As Word.Range

Set Source = ActiveDocument
blnFirstPage = True
Selection.HomeKey Unit:=wdStory
lngPages = Source.BuiltInDocumentProperties(wdPropertyPages)
counter = 0
Application.ScreenUpdating = False
While counter < lngPages
    counter = counter + 1
    Source.Activate
    Source.Bookmarks("\Page").Range.Cut
    If blnFirstPage Then
        ' First page of a client: start a new document and build its name.
        ' Note: PasteAndFormat needs Word 2002 or later, as far as I know;
        ' under Word 2000 use .Paste instead.
        blnFirstPage = False
        Set Target = Documents.Add
        Target.Range.PasteAndFormat (wdFormatOriginalFormatting)
        strClient = Target.Paragraphs(1).Range.Text
        strClient = Left(strClient, Len(strClient) - 1) ' drop paragraph mark
        strClient2 = Replace(strClient, " ", "_")
        Set myRange = Target.Range.Duplicate
        With myRange.Find
            .Text = "[0-9]{1,2}/[0-9]{1,2}/[0-9]{2}"
            .MatchWildcards = True
            .Execute
            If .Found Then
                strDocName = strClient2 & "_" & _
                    Replace(myRange.Text, "/", "_")
            Else
                MsgBox "Date not found for client: " & strClient
                Source.Undo
                ' Discard the half-built target and restore screen updating
                Target.Close SaveChanges:=wdDoNotSaveChanges
                Application.ScreenUpdating = True
                Exit Sub
            End If
        End With
    Else
        ' Continuation page of the same client: append to the open target
        myRange.Start = Target.Range.End
        myRange.PasteAndFormat (wdFormatOriginalFormatting)
    End If
    ' Save when the source is used up or the next page starts a new client
    If counter = lngPages Or InStr(Source.Paragraphs(1).Range.Text, _
        strClient) = 0 Then
        Do While Target.Paragraphs.Last.Range.Characters.Count = 1
            Target.Paragraphs.Last.Range.Delete
        Loop
        Target.SaveAs FileName:=strDocName
        blnFirstPage = True
        Target.Close
    End If
Wend
''''''''''''''''''''''''
' Unwind every cut so the source document returns to its original state
Do While Source.Undo
Loop
Source.UndoClear
'Source.Saved = True 'uncomment to allow file to close without save prompt
''''''''''''''''''''''''
Application.ScreenUpdating = True
End Sub

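
The first paragraph on a continuation page can carry extra text such as "Patty Foobar1-6/26/07", and characters like "/" are not allowed in Windows file names. A small helper along these lines could be applied to the name before saving. It is a sketch only, and CleanFileName is a made-up name rather than anything from the macro above.

```vba
' Replaces characters that Windows does not allow in file names
' (\ / : * ? " < > |), plus tabs, paragraph marks and spaces, with "_".
Function CleanFileName(ByVal rawName As String) As String
    Dim badChars As String, i As Long
    badChars = "\/:*?""<>|" & vbTab & vbCr
    CleanFileName = Trim$(rawName)
    For i = 1 To Len(badChars)
        CleanFileName = Replace(CleanFileName, Mid$(badChars, i, 1), "_")
    Next i
    CleanFileName = Replace(CleanFileName, " ", "_")
End Function
```

For example, the save line could become Target.SaveAs FileName:=CleanFileName(strDocName).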
Hi Doug,

This looks like it fits the bill exactly. I particularly like the approach
you took to locate the date (With Selection.Find
.Text = "[0-9]{1,2}\/[0-9]{1,2}\/[0-9]{2}^13")

Now that's neat! Many thanks once again to all for all your help.

Gratefully,
Patti




Doug Robbins - Word MVP said:
This is untested, but I think it will do what you want:

Sub SplitByClient() ' the Sub name is arbitrary; only the body was posted
Dim Counter As Long, Pages As Long
Dim Source As Document, Target As Document
Dim ClientName As Range
Dim FileDate As Range
Dim DocName As String
Set Source = ActiveDocument
Selection.HomeKey Unit:=wdStory
Pages = Source.BuiltInDocumentProperties(wdPropertyPages)
Counter = 0
While Counter < Pages
    Counter = Counter + 1
    Source.Bookmarks("\Page").Range.Cut
    Set Target = Documents.Add
    Target.Range.Paste
    ' Client name = first paragraph without its paragraph mark
    Set ClientName = Target.Paragraphs(1).Range
    ClientName.End = ClientName.End - 1
    DocName = Replace(ClientName.Text, " ", "_")
    Target.Activate
    Selection.HomeKey wdStory
    Selection.Find.ClearFormatting
    With Selection.Find
        .Text = "[0-9]{1,2}\/[0-9]{1,2}\/[0-9]{2}^13"
        .Forward = True
        .Wrap = wdFindStop
        .MatchWildcards = True
    End With
    If Selection.Find.Execute Then
        Set FileDate = Selection.Range
        FileDate.End = FileDate.End - 1 ' drop the paragraph mark
        ' m/d/yy -> m_d_yy, matching names such as joe_smith_1_24_07
        DocName = DocName & "_" & Replace(FileDate.Text, "/", "_")
    End If
    Target.SaveAs FileName:=DocName
    Target.Close
Wend
End Sub
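
The same date search can also be done on a Range instead of the Selection, which avoids having to activate the target document first. This is a sketch under the assumption that the first m/d/yy pattern in the document is the date wanted; FindFirstDate is a hypothetical name.

```vba
' Returns the first m/d/yy style date found in doc, or "" if none is found.
Function FindFirstDate(ByVal doc As Document) As String
    Dim rng As Range
    Set rng = doc.Content
    With rng.Find
        .ClearFormatting
        .Text = "[0-9]{1,2}/[0-9]{1,2}/[0-9]{2}"
        .MatchWildcards = True
        .Forward = True
        .Wrap = wdFindStop
        ' On success the range shrinks to the matched text
        If .Execute Then FindFirstDate = rng.Text
    End With
End Function
```

A caller could then build the name with something like DocName & "_" & Replace(FindFirstDate(Target), "/", "_") after checking that the result is not empty.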



patti

Hi,

You and Doug have been invaluable resources. I am so thrilled to have had
the opportunity to learn something new, and I very much appreciate all the
assistance.

Thanks once again for participating in this conference and for sharing your
time and talents with me.

With much appreciation and gratitude for the help,
Patti

aq4word

Hi Doug
Have just used your macro for splitting large files. Excellent. Thank you
for that. I have split a 240k file (a Scrabble dictionary that has about 50%
erroneous spellings per MS Spellcheck) into 43 smaller files (page 1, page 2,
etc.). I have then applied Greg Maxey's macro for deleting wrongly spelled
words (Thank you Greg) on just one file, (i.e. Page 1). Works fine, took
about 3 hours.
Question - Is it possible to batch process the other 42 files with Greg's
macro instead of doing them one by one?

Doug Robbins - Word MVP

You should be able to modify the code in the following article so that it
incorporates Greg's routine.

See the article "Find & ReplaceAll on a batch of documents in the same
folder" at:

http://www.word.mvps.org/FAQs/MacrosVBA/BatchFR.htm
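
For readers without access to the article, a generic folder loop of the kind the question calls for might look like the sketch below. It is not the article's code. The folder path is a made-up example, and the commented-out call stands in for whatever routine should run on each document.

```vba
Sub ProcessFolder()
    Const FOLDER_PATH As String = "C:\Clients\"   ' hypothetical folder
    Dim docFile As String
    Dim fileList() As String, fileCount As Long, i As Long
    Dim doc As Document

    ' Collect the names first, so that files saved into the folder
    ' while processing are not picked up by Dir.
    docFile = Dir$(FOLDER_PATH & "*.doc")
    Do While Len(docFile) > 0
        ReDim Preserve fileList(fileCount)
        fileList(fileCount) = docFile
        fileCount = fileCount + 1
        docFile = Dir$()
    Loop

    For i = 0 To fileCount - 1
        Set doc = Documents.Open(FileName:=FOLDER_PATH & fileList(i))
        doc.Activate
        ' Call the routine to run on each document here, for example:
        ' Splitter
        doc.Close SaveChanges:=wdDoNotSaveChanges
    Next i
End Sub
```

Closing without saving leaves each source file untouched. If the per-document routine is meant to change the file itself, as a spelling clean-up would, use wdSaveChanges instead.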



aq4word

Thanks for your timely response Doug. Much appreciated. I'm working on it.
(I'm a Newbie!)

Regards Brian

Doug Robbins - Word MVP said:
You should be able to modify the code in the following article so that it
incorporates Greg's routine

See the article "Find & ReplaceAll on a batch of documents in the same
folder" at:

http://www.word.mvps.org/FAQs/MacrosVBA/BatchFR.htm


--
Hope this helps.

Please reply to the newsgroup unless you wish to avail yourself of my
services on a paid consulting basis.

Doug Robbins - Word MVP

aq4word said:
Hi Doug
Have just used your macro for splitting large files. Excellent. Thank you
for that. I have split a 240k file (a Scrabble dictionary that has about 50%
erroneous spellings per MS Spellcheck) into 43 smaller files (page 1, page 2,
etc.). I have then applied Greg Maxey's macro for deleting wrongly spelled
words (Thank you Greg) on just one file, (i.e. Page 1). Works fine, took
about 3 hours.
Question - Is it possible to batch process the other 42 files with Greg's
macro instead of doing them one by one?

Doug Robbins - Word MVP said:
Sub splitter()
'
' splitter Macro
' Macro created 16-08-98 by Doug Robbins to save each page of a document
' as a separate file with the name Page#.DOC
'
Dim Counter As Long, Pages As Long, DocName As String
Dim Source As Document, Target As Document
Set Source = ActiveDocument
' Start at the top so that "\Page" refers to the first page
Selection.HomeKey Unit:=wdStory
' Count the pages once, before any of them are cut out of the source
Pages = Source.BuiltInDocumentProperties(wdPropertyPages)
Counter = 0
While Counter < Pages
    Counter = Counter + 1
    DocName = "Page" & Format(Counter)
    ' Cut the page that contains the insertion point; the text that
    ' followed it moves up to take its place
    Source.Bookmarks("\Page").Range.Cut
    Set Target = Documents.Add
    Target.Range.Paste
    Target.SaveAs FileName:=DocName
    Target.Close
Wend
End Sub
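
For files where the records are separated by manual page breaks and the last record may have no trailing break, a variation along the following lines might be closer to what is needed. It is only a sketch under assumptions: the names SplitAtManualPageBreaks and SavePiece and the "Part#" file names are made up for illustration. It copies each block with FormattedText rather than cutting through the clipboard, so the source document is left unchanged.

```vba
Sub SplitAtManualPageBreaks()
    ' Sketch only: saves each block ending in a manual page break
    ' as Part1, Part2, ... in the current default folder.
    Dim Source As Document
    Dim Hit As Range
    Dim StartPos As Long, Counter As Long

    Set Source = ActiveDocument
    Set Hit = Source.Range
    StartPos = 0

    With Hit.Find
        .ClearFormatting
        .Text = "^m"              ' manual page break
        .Forward = True
        .Wrap = wdFindStop        ' stop at the end of the document
        .MatchWildcards = False
        Do While .Execute
            ' Hit now covers the break; copy from the previous
            ' break up to and including this one
            Counter = Counter + 1
            SavePiece Source.Range(StartPos, Hit.End), "Part" & Counter
            StartPos = Hit.End
            Hit.Collapse Direction:=wdCollapseEnd
        Loop
    End With

    ' Text after the last break is the final record, which may
    ' have no trailing page break of its own
    If StartPos < Source.Range.End - 1 Then
        Counter = Counter + 1
        SavePiece Source.Range(StartPos, Source.Range.End), "Part" & Counter
    End If
End Sub

Sub SavePiece(ByVal Piece As Range, ByVal DocName As String)
    Dim Target As Document
    Set Target = Documents.Add
    ' Copy text and formatting without using the clipboard
    Target.Range.FormattedText = Piece.FormattedText
    Target.SaveAs FileName:=DocName
    Target.Close SaveChanges:=wdDoNotSaveChanges
End Sub
```

Because the trailing page break is copied along with each record, every new file ends with an empty second page; if that is unwanted, the range passed to SavePiece could stop one character earlier.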



