Split Word doc into several files (continued)

Pesach Shelnitz

Hi,

A week ago, I responded to a thread of a similar name that was started by
(e-mail address removed). I ended up posting several versions of a macro that
accomplished the task almost perfectly, but even the last revision created an
extra blank page at the end of the files created or didn't correctly format
the last paragraph. Splitting a doc at a page break with preservation of all
formatting in the new files created is complicated by the fact that the page
break can be at a point exactly between two paragraphs (the simplest case),
at a point in the middle of a paragraph, at a point within a table, or at
some other tricky point. I worked on the macro a little more, and this
revision can now handle the first two types of page breaks much better, but
it should not be used in cases in which a table spans a page break where the
original file should be split.

This macro may be of interest to anyone who needs to copy formatted portions
of a doc into another doc.

Sub SplitDocByPagesAndSaveParts()
    ' Splits the active .docx file into parts of up to 10 pages each and
    ' saves every part as <source name>_Part<n> in the same folder.
    Dim myRange As Range
    Dim doc As Document
    Dim name As String
    Dim i As Integer
    Dim j As Integer
    Dim k As Integer
    Dim pSize As WdPaperSize
    Dim pWidth As Integer
    Dim pHeight As Integer
    Dim hdDist As Integer
    Dim ftDist As Integer
    Dim lMargin As Integer
    Dim rMargin As Integer
    Dim tMargin As Integer
    Dim bMargin As Integer
    Dim pos1 As Long
    Dim endFound As Boolean

    endFound = False
    name = ActiveDocument.FullName
    i = 1
    k = InStr(1, name, ".docx")
    If k = 0 Then
        MsgBox "The name of the source file must have the .docx extension."
        Exit Sub
    End If
    ' Strip the extension so that "_Part<n>" can be appended later.
    name = Left(name, k - 1)

    ' Remember the page setup of the source file for the new files.
    pSize = ActiveDocument.PageSetup.PaperSize
    pHeight = ActiveDocument.PageSetup.PageHeight
    pWidth = ActiveDocument.PageSetup.PageWidth
    hdDist = ActiveDocument.PageSetup.HeaderDistance
    ftDist = ActiveDocument.PageSetup.FooterDistance
    lMargin = ActiveDocument.PageSetup.LeftMargin
    rMargin = ActiveDocument.PageSetup.RightMargin
    tMargin = ActiveDocument.PageSetup.TopMargin
    bMargin = ActiveDocument.PageSetup.BottomMargin

    Selection.HomeKey wdStory
    Do While endFound = False
        pos1 = Selection.Start

        ' Move forward by up to 10 pages, stopping at the end of the doc.
        For j = 1 To 10
            ActiveDocument.Bookmarks("\Page").Select
            If ActiveDocument.Bookmarks("\Page").Range.End _
                    <> ActiveDocument.Bookmarks("\EndOfDoc").Range.Start Then
                Selection.MoveRight Unit:=wdCharacter, Count:=1
            Else
                endFound = True
                Selection.Collapse Direction:=wdCollapseEnd
            End If
        Next

        If endFound = False Then
            ' Insert a temporary paragraph mark at the split point so that
            ' the last paragraph keeps its formatting, copy, then remove it.
            Selection.MoveLeft Unit:=wdCharacter, Count:=1
            Selection.TypeParagraph
            Set myRange = ActiveDocument.Range(Start:=pos1, _
                    End:=Selection.Start)
            myRange.Copy
            Selection.Delete Unit:=wdCharacter, Count:=-1
            Selection.MoveRight Unit:=wdCharacter, Count:=1
        Else
            Set myRange = ActiveDocument.Range(Start:=pos1, _
                    End:=Selection.Start)
            myRange.Copy
        End If

        ' Create the new file from the same template and paste the part.
        Set doc = Documents.Add(ActiveDocument.AttachedTemplate.FullName)
        Selection.Paste
        If endFound = False Then
            ' Remove the temporary paragraph mark that was copied along.
            Selection.Delete Unit:=wdCharacter, Count:=-1
        End If

        doc.PageSetup.PaperSize = pSize
        doc.PageSetup.PageHeight = pHeight
        doc.PageSetup.PageWidth = pWidth
        doc.PageSetup.HeaderDistance = hdDist
        doc.PageSetup.FooterDistance = ftDist
        doc.PageSetup.LeftMargin = lMargin
        doc.PageSetup.RightMargin = rMargin
        doc.PageSetup.TopMargin = tMargin
        doc.PageSetup.BottomMargin = bMargin
        doc.SaveAs fileName:=name & "_Part" & CStr(i), _
                FileFormat:=wdFormatDocumentDefault
        doc.Close
        i = i + 1
    Loop

    Set doc = Nothing
    Set myRange = Nothing
End Sub

Comments are welcome.

Thanks,
Pesach Shelnitz
 
Jean-Guy Marcil

Here is a little something to help you with tables...

Dim docNew As Document
Dim rgeSource As Range

Set rgeSource = Selection.Range.Bookmarks("\Page").Range

Set docNew = Documents.Add

docNew.Range.FormattedText = rgeSource.FormattedText

Try to use range assignments like above instead of using .Copy, which may
destroy something the user was saving in the clipboard.
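As a rough sketch of how that range assignment could sit inside a complete routine, the snippet below copies the page containing the selection into a new document based on the same template. The Sub name CopyCurrentPageToNewDoc and the variable docSource are made up for illustration, and this has not been checked against pages that end inside a table.

```vba
Sub CopyCurrentPageToNewDoc()
    ' Hypothetical helper for illustration; not the macro from this thread.
    Dim docSource As Document
    Dim docNew As Document
    Dim rgeSource As Range

    Set docSource = ActiveDocument

    ' "\Page" is Word's predefined bookmark for the page that
    ' contains the selection.
    Set rgeSource = Selection.Range.Bookmarks("\Page").Range

    ' Base the new document on the template of the source document.
    Set docNew = Documents.Add( _
        Template:=docSource.AttachedTemplate.FullName)

    ' Transfer text and formatting directly between ranges.
    ' Nothing passes through the clipboard.
    docNew.Range.FormattedText = rgeSource.FormattedText

    Set rgeSource = Nothing
    Set docNew = Nothing
    Set docSource = Nothing
End Sub
```

Because the source range is captured before Documents.Add changes the active document, the assignment still refers to the original file.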

Try to use the Long type instead of the Integer type, which is deprecated.

Do use With blocks to make your code easier to read and marginally faster to
run. For instance:

With ActiveDocument.PageSetup
    pSize = .PaperSize
    pHeight = .PageHeight
    pWidth = .PageWidth
    hdDist = .HeaderDistance
    ftDist = .FooterDistance
    lMargin = .LeftMargin
    rMargin = .RightMargin
    tMargin = .TopMargin
    bMargin = .BottomMargin
End With


And:

With doc.PageSetup
    .PaperSize = pSize
    .PageHeight = pHeight
    .PageWidth = pWidth
    .HeaderDistance = hdDist
    .FooterDistance = ftDist
    .LeftMargin = lMargin
    .RightMargin = rMargin
    .TopMargin = tMargin
    .BottomMargin = bMargin
End With
 
Pesach Shelnitz

Hi Jean-Guy,

Thank you for taking the time to look over my code and offer your suggestions.

From the start, I tried simply copying the contents of the FormattedText
property instead of using the Copy and Paste methods, but I was losing the
formatting of the last paragraph. Now that my macro is correctly splitting
paragraphs at the page breaks where the document should be split, I could use
the preferable technique that you suggested for files that don't need to be
split at a page break within a table. In such cases, even if I split the
table at the page break, the macro crashes when it tries to save the new file
after such a break. With Copy and Paste I can presently generate the split
files even when splitting is done at a page break within a table. I intend to
continue working on this.

I don't understand why you say that the Integer type should be replaced by
Long in my macro. According to the MSDN documentation for InStr
(http://msdn.microsoft.com/en-us/library/8460tsh1.aspx ), the return value of
this function is an Integer, and Integer is still listed as a data type for
Visual Basic (http://msdn.microsoft.com/en-us/library/47zceaw7(VS.100).aspx).
I thus don't see any reason not to use the Integer type for the return value
of InStr (k), integers from 1 to 10 (j), and the number of files generated by
my macro (i).

On the other hand, the MSDN documentation for the PageSetup.FooterDistance
property (http://msdn.microsoft.com/en-us/library/bb210941.aspx) and other
properties of this object, states that each of these properties holds a
Single. Both a Single and an Integer are 32-bit numbers, but since a Single
supports a floating decimal point, I should use the Single type for these
properties. The Long data type offers no advantage over the Integer type in
this case because it uses more memory (8 bytes instead of 4) and does not
support a floating decimal point.
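
One caveat for readers: the MSDN pages cited above appear to describe Visual Basic .NET. In VBA, which is what Word macros run, an Integer is 2 bytes, while a Long and a Single are 4 bytes each. The point about Single still holds, though. A minimal sketch of what happens when a fractional point value is stored in each type (the value 70.9 and the names ComparePointStorage, asInteger and asSingle are just examples):

```vba
Sub ComparePointStorage()
    ' Illustrative only: shows why point measurements need a Single.
    Dim asInteger As Integer
    Dim asSingle As Single

    ' 70.9 points is roughly a 2.5 cm margin.
    asInteger = 70.9    ' implicit conversion rounds to a whole number
    asSingle = 70.9     ' the fractional part is kept

    Debug.Print asInteger   ' prints 71
    Debug.Print asSingle    ' prints 70.9
End Sub
```

So a margin read into an Integer and written back to the new file can come out slightly different from the original.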

Thank you also for reminding me to use the With statement more frequently. I
agree 100%, and I'll pay more attention to this in the future.

Pesach
 
Jean-Guy Marcil

Pesach Shelnitz was telling us:

> Hi Jean-Guy,
>
> Thank you for taking the time to look over my code and offer your
> suggestions.
>
> From the start, I tried simply copying the contents of the
> FormattedText property instead of using the Copy and Paste methods,
> but I was losing the formatting of the last paragraph.

This probably means that there was a problem with the manner in which you
were manipulating your ranges.

> Now that my
> macro is correctly splitting paragraphs at the page breaks where the
> document should be split, I could use the preferable technique that
> you suggested for files that don't need to be split at a page break
> within a table. In such cases, even if I split the table at the page
> break, the macro crashes when it tries to save the new file after
> such a break.

Then you were doing something wrong?
:p

> With Copy and Paste I can presently generate the split
> files even when splitting is done at a page break within a table. I
> intend to continue working on this.
>
> I don't understand why you say that the Integer type should be
> replaced by Long in my macro. According to the MSDN documentation for
> InStr (http://msdn.microsoft.com/en-us/library/8460tsh1.aspx ), the
> return value of this function is an Integer, and Integer is still
> listed as a data type for Visual Basic
> (http://msdn.microsoft.com/en-us/library/47zceaw7(VS.100).aspx).

Probably for backward compatibility.

> I thus don't see any reason not to use the Integer type for the return
> value of InStr (k), integers from 1 to 10 (j), and the number of
> files generated by my macro (i).
>
> On the other hand, the MSDN documentation for the
> PageSetup.FooterDistance property
> (http://msdn.microsoft.com/en-us/library/bb210941.aspx) and other
> properties of this object, states that each of these properties holds
> a Single. Both a Single and an Integer are 32-bit numbers, but since
> a Single supports a floating decimal point, I should use the Single
> type for these properties. The Long data type offers no advantage
> over the Integer type in this case because it uses more memory (8
> bytes instead of 4) and does not support a floating decimal point.

Some time ago I read that the Integer, being so small, cannot correspond to
an actual memory address, so the compiler has to convert it so that it can
actually fit in the smallest physical unit that is available, which, I was
told, was a Long.

So, if you have a lot of integer manipulation in a macro, it might slow
down a bit because the compiler has to convert the Integers into Longs to
actually place them in the smallest physical memory unit available. However,
this was a while ago, and I may remember incorrectly.

So, since I read about this detail, I always use Long.
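
Leaving the performance question aside, there is a more concrete difference between the two types in VBA: an Integer is 16 bits wide and tops out at 32767, while a Long is 32 bits wide. The sketch below (the Sub and variable names are made up for illustration) shows a counter of each type being pushed past that limit:

```vba
Sub CompareIntegerAndLong()
    ' Illustrative only: a VBA Integer overflows where a Long does not.
    Dim smallCount As Integer
    Dim bigCount As Long

    bigCount = 32767
    bigCount = bigCount + 1         ' fine, a Long has room for 32768
    Debug.Print bigCount            ' prints 32768

    smallCount = 32767
    On Error Resume Next
    smallCount = smallCount + 1     ' raises run-time error 6 (Overflow)
    If Err.Number = 6 Then
        Debug.Print "Integer overflowed past 32767"
    End If
    On Error GoTo 0
End Sub
```

For a loop counter from 1 to 10 this never matters, but for character positions in a long document (such as pos1 in the macro) a Long is the safer choice.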
 
