Greg Maxey
Fellow MVP Helmut Weber and I were comparing the efficiency of our macros for
deleting duplicate paragraphs from a document.
In this discussion (and tests) I have learned that: 1) my macro is
considerably faster than his in Word 2003; 2) Helmut's is considerably
faster than mine in Word 2007; 3) both are considerably faster in
Word 2003 than in Word 2007.
Assuming Word 2007 is a better product than its predecessor, it seems that
at the very least both would be faster in Word 2007! I am interested to hear
whether anyone knows of changes to the object model that would account for
the degraded performance of both methods, and for the flip-flop in the
result with my method in Word 2003 compared to Word 2007.
Jezebel or Steve Hudson, I don't remember which, once explained to me how
using Do ... Loop Until x Is Nothing was faster than a For Each ... Next x.
The results in our tests bear that out in Word 2003, but it seems the Do ...
Loop Until x Is Nothing method has taken the slow road in Word 2007.
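To separate the cost of the loop construct from the cost of comparing and deleting text, a minimal sketch along the following lines might help. It only walks the paragraphs with each loop style and prints the elapsed time to the Immediate window. The Sub name and variable names are invented for illustration, and nothing here is claimed to reproduce the timings reported in this post.

```vba
Sub CompareParagraphIteration()
    ' Hypothetical helper: times two ways of visiting every paragraph
    ' without comparing or deleting anything.
    Dim oPar As Paragraph
    Dim lngCount As Long
    Dim sngStart As Single

    ' Style 1: For Each over the Paragraphs collection.
    sngStart = Timer
    lngCount = 0
    For Each oPar In ActiveDocument.Paragraphs
        lngCount = lngCount + 1
    Next oPar
    Debug.Print "For Each: " & lngCount & " paragraphs, " _
        & Format(Timer - sngStart, "0.000") & " sec"

    ' Style 2: Do ... Loop, stepping with Paragraph.Next until Nothing.
    sngStart = Timer
    lngCount = 0
    Set oPar = ActiveDocument.Paragraphs(1)
    Do
        lngCount = lngCount + 1
        Set oPar = oPar.Next
    Loop Until oPar Is Nothing
    Debug.Print "Do Loop:  " & lngCount & " paragraphs, " _
        & Format(Timer - sngStart, "0.000") & " sec"
End Sub
```

Running this in both versions of Word against the same test document would show whether the slowdown lies in the traversal itself or in the Range comparison and Delete calls.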
Here is a sample of results observed using 201 and 801 paragraphs. The
first 200/800 paragraphs are unique. The last paragraph (201/801) is a
duplicate of the 200/800th paragraph.
Word 2003          200 unique    800 unique
My Method          4.375 sec     68.68 sec
Helmut's Method    7.48 sec      120.81 sec

Word 2007          200 unique    800 unique
My Method          14.42 sec     9.98 sec
Helmut's Method    229.56 sec    158.44 sec
If you are interested in running your own comparisons, here is the code for
generating the test paragraphs, my code, and Helmut's code.
Sub BuildTestParagraphs()
    Dim oRng As Word.Range
    Dim i As Long
    Set oRng = ActiveDocument.Range
    oRng.Delete
    For i = 1 To 200
        If i = 1 Then
            oRng.InsertAfter "The quick brown fox jumped over the lazy dog." & vbCr
        Else
            oRng.InsertAfter "The quick and extremely agile brown fox" _
                & " jumped over " & i & " the lazy dogs." & vbCr
        End If
    Next i
    oRng.InsertAfter "The quick and extremely agile brown fox" _
        & " jumped over " & i - 1 & " the lazy dogs."
End Sub
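The builder above is fixed at 200 unique paragraphs, while the results also cover 800. A parameterized variant could look like the sketch below. The names BuildTestParagraphsN, lngUnique, and Build801 are hypothetical and not part of the original macros; the sketch assumes at least two unique paragraphs so that the final line really is a duplicate.

```vba
Sub BuildTestParagraphsN(ByVal lngUnique As Long)
    ' Hypothetical variant of BuildTestParagraphs: writes lngUnique unique
    ' paragraphs followed by one duplicate of the last unique paragraph.
    Dim oRng As Word.Range
    Dim i As Long
    If lngUnique < 2 Then Exit Sub
    Set oRng = ActiveDocument.Range
    oRng.Delete
    oRng.InsertAfter "The quick brown fox jumped over the lazy dog." & vbCr
    For i = 2 To lngUnique
        oRng.InsertAfter "The quick and extremely agile brown fox" _
            & " jumped over " & i & " the lazy dogs." & vbCr
    Next i
    ' Repeat paragraph number lngUnique, giving lngUnique + 1 paragraphs.
    oRng.InsertAfter "The quick and extremely agile brown fox" _
        & " jumped over " & lngUnique & " the lazy dogs."
End Sub

Sub Build801()
    ' A Sub with parameters is not listed in the Macros dialog,
    ' so a parameterless wrapper is used to run it.
    BuildTestParagraphsN 800
End Sub
```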
Sub GregsKillDuplicates()
    Dim eTime As Single
    Dim oParRef As Paragraph   'Reference paragraph
    Dim oParChk As Paragraph   'Paragraph being checked against the reference
    eTime = Timer
    Set oParRef = ActiveDocument.Range.Paragraphs(1)
    Set oParChk = ActiveDocument.Range.Paragraphs(2)
    Do
        Do
            'An empty last paragraph may throw an error on the last loop.
            On Error GoTo Err_Exit
            'Range's default property is Text, so this compares the text.
            If oParRef.Range = oParChk.Range Then
                oParChk.Range.Delete
            Else
                Set oParChk = oParChk.Next
            End If
        Loop Until oParChk Is Nothing
        'Advance the reference and restart the check at the paragraph after it.
        Set oParRef = oParRef.Next
        On Error Resume Next
        Set oParChk = oParRef.Next
        On Error GoTo 0
    Loop Until oParRef Is Nothing
Err_Exit:
    MsgBox Timer - eTime
End Sub
Sub HelmutsKillParagraphs()
    'AKA Makro6x
    Dim t As Single
    t = Timer
    Dim prg1 As Paragraph
    Dim prg2 As Paragraph
    For Each prg1 In ActiveDocument.Range.Paragraphs
        For Each prg2 In ActiveDocument.Range.Paragraphs
            If prg1.Range.Text = prg2.Range.Text Then
                If prg1.Range.Start <> prg2.Range.Start Then
                    prg2.Range.Delete
                End If
            End If
        Next
    Next
    MsgBox Timer - t
End Sub