Looping through every word in doc

H

hafabee

Hi.

I have a routine that loops through every word in the active document,
sentence by sentence. Here is the code outline:

For Each rngRange In ActiveDocument.StoryRanges
For Each rngSentence In rngRange.Sentences
For Each rngWord In rngSentence.Words
MyWordProcess(rngWord)
Next rngWord
Next rngSentence
Next rngRange

For Each shpShape In ActiveDocument.Shapes
If shpShape.TextFrame.HasText Then
rngRange = shpShape.TextFrame.TextRange
For Each rngSentence In rngRange.Sentences
For Each rngWord In rngSentence.Words
MyWordProcess(rngWord)
Next rngWord
Next rngSentence
End If
Next shpShape
End If

This code always goes through the doc from the beginning. I would like
to modify it to instated start where the cursor currently is in the
doc, but I cannot figure out how to do this given the two separate
loops. Is there a cleaner way to loop through all the words in a
document? I do not need to include the page headers/footers, but I do
need to include the text boxes. In processing each word I need to be
able to show the sentence it is in for context.

Thanks.
 
G

Graham Mayor

You can set the start of story range containing the cursor to the cursor
position

For Each rngRange In ActiveDocument.StoryRanges
If Selection.InRange(rngRange) Then
rngRange.Start = Selection.Range.Start
End If
'etc

and this will then ignore the words before the cursor in that particular
story range, but Word documents are not laid out logically as you see them
on the page and cannot be processed as if they were. Instead they comprise
of various layers - imagine a pile of transparencies each with a bit of
information that when stacked provide the whole view.

--
<>>< ><<> ><<> <>>< ><<> <>>< <>><<>
Graham Mayor - Word MVP


<>>< ><<> ><<> <>>< ><<> <>>< <>><<>
 
H

hafabee

Thanks Graham. You confirm what I had deduced about the lack of a
single structure with all of the words. I liked your transparency
analogy.

On a related note, I am seeing an anomaly in how Word treats
textboxes. I am using Word 2002 and am wondering if this is true of
other versions as well. My code to loop through all words in a doc now
looks at all StoryRanges, all msoTextBox Shapes, and all msoTextBox
items on each Shapes' Canvas. The weird thing is, the first (first
being the first one created in the doc regardless of its layout
location) textbox (either as a Shape or as a Canvas item), shows up in
the StoryRanges as well as being in the shapes/canvas collection. I
need to look at each word only once but cannot figure out a simple way
to determine if a given range in both within a StoryRange and within a
shape.

Here is some code I ginned up to mark each sentence in a doc so I can
see what was going on. When I run it in Word 2002 on a doc that has at
least one text box, the sentences in the first textbox get marked both
in the story loop and the shape loop. Any insight would certainly be
appreciated.

Sub MarkAllSentences()

Dim rngRange As Range
Dim rngSentence As Range
Dim rngWord As Range
Dim strText As String
Dim i As Integer, j As Integer
Dim shpShape As Shape
Dim shpCanvasShape As Shape

i = 0
For Each rngRange In ActiveDocument.StoryRanges
i = i + 1
j = 0
For Each rngSentence In rngRange.Sentences
j = j + 1
strText = "[T" & i & "S" & j & ":" & rngSentence.Start & "-" &
rngSentence.End & "]"
rngSentence.InsertBefore strText
For Each rngWord In rngSentence.Words
Next rngWord
Next rngSentence
Next rngRange

For Each shpShape In ActiveDocument.Shapes
If shpShape.Type = msoTextBox Then
i = i + 1
j = 0
For Each rngSentence In shpShape.TextFrame.TextRange.Sentences
j = j + 1
strText = "[H" & i & "S" & j & ":" & rngSentence.Start &
"-" & rngSentence.End & "]"
rngSentence.InsertBefore strText
For Each rngWord In rngSentence.Words
Next rngWord
Next rngSentence
Else
If shpShape.CanvasItems.Count > 0 Then
For Each shpCanvasShape In shpShape.CanvasItems
If shpCanvasShape.Type = msoTextBox Then
i = i + 1
j = 0
For Each rngSentence In
shpCanvasShape.TextFrame.TextRange.Sentences
j = j + 1
strText = "[C" & i & "S" & j & ":" &
rngSentence.Start & "-" & rngSentence.End & "]"
rngSentence.InsertBefore strText
For Each rngWord In rngSentence.Words
Next rngWord
Next rngSentence
End If
Next shpCanvasShape
End If
End If
Next shpShape

End Sub
 
G

Greg Maxey

I am not sure what you are trying to do exactly. See if this helps any:

Public Sub LoopThroughRanges()
Dim rngStory As Word.Range
Dim rngSentence As Word.Range
Dim rngWord As Word.Range
For Each rngStory In ActiveDocument.StoryRanges
Select Case rngStory.StoryType
Case wdMainTextStory, wdTextFrameStory
If Selection.Range.InRange(rngStory) Then
rngStory.Start = Selection.Range.Start
End If
Do
For Each rngSentence In rngStory.Sentences
For Each rngWord In rngSentence.Words
MyWordProcess rngWord
Next rngWord
Next rngSentence
Set rngStory = rngStory.NextStoryRange
Loop Until rngStory Is Nothing
Case Else
'Do Nothing (skip all but main text and textframe storyranges)
End Select
Next rngStory
End Sub
Sub MyWordProcess(ByRef oWord As Range)
MsgBox oWord
End Sub
 
H

hafabee

Thanks Greg.

The StoryType property is the hook I needed.

What i am trying to do is loop through *every* sentence in the
document. Your code, like my initial code when I started, loops
through the StoryRanges object.

However, I since discovered, to my surprise and dismay, that there can
be text in a document that is *not* in any of the StoryRanges. Namely,
in TextBoxes.

From my testing, using Word XP/2002, only the first TextBox is
included in StoryRanges. The second and subsequent TextBoxes are not
there. To further complicate things, a TextBox can be either a Shape
of Type msoTextBox, or a CanvasItemShape of Type msoTextBox within one
of the Shapes in the document.

If anybody wants to play around with this, I created a sample doc:
http://www.mediafire.com/file/y2nzznxkjm4/Sample_Text.doc

If you run the below sub HighlightAllStoryWords (a simple loop through
StoryRanges) on this doc, you see that it misses Box B and Box C.

If you run the below sub HighlightAllDocWords (my current, annoyingly
complex code) on this doc, you see that it gets all boxes. I would
like to have this simpler in order to more easily implement the
original problem in this thread: being able to loop through all
sentences/words in a doc starting at the current cursor location with
wrap-around, rather than always at the beginning of the doc.

Any way, thanks again Greg and Graham for your insight.

' ------------------------------------------------
Public Sub HighlightAllStoryWords()

Dim rngStory As Range
Dim rngSentence As Range
Dim rngWord As Range

For Each rngStory In ActiveDocument.StoryRanges
For Each rngSentence In rngStory.Sentences
For Each rngWord In rngSentence.Words
rngWord.HighlightColorIndex = wdYellow
Next rngWord
Next rngSentence
Next rngStory

End Sub
' ------------------------------------------------
Public Sub HighlightAllDocWords()

Dim rngStory As Range
Dim rngSentence As Range
Dim rngWord As Range
Dim shpShape As Shape
Dim shpCanvasShape As Shape

For Each rngStory In ActiveDocument.StoryRanges
If rngStory.StoryType <> wdTextFrameStory Then
For Each rngSentence In rngStory.Sentences
For Each rngWord In rngSentence.Words
rngWord.HighlightColorIndex = wdYellow
Next rngWord
Next rngSentence
End If
Next rngStory
For Each shpShape In ActiveDocument.Shapes
If shpShape.Type = msoTextBox Then
For Each rngSentence In shpShape.TextFrame.TextRange.Sentences
For Each rngWord In rngSentence.Words
rngWord.HighlightColorIndex = wdPink
Next rngWord
Next rngSentence
Else
If shpShape.CanvasItems.Count > 0 Then
For Each shpCanvasShape In shpShape.CanvasItems
If shpCanvasShape.Type = msoTextBox Then
For Each rngSentence In
shpCanvasShape.TextFrame.TextRange.Sentences
For Each rngWord In rngSentence.Words
rngWord.HighlightColorIndex =
wdBrightGreen
Next rngWord
Next rngSentence
End If
Next shpCanvasShape
End If
End If
Next shpShape

End Sub
 
H

hafabee

Thanks Greg., the StoryType property is the hook I needed.

What i am trying to do is loop through *every* sentence/word in the
document. Your code, like my initial code when I started, loops
through the StoryRanges object.

However, I since discovered, to my surprise and dismay, that there can
be text in a document that is *not* in any of the StoryRanges, in
TextBoxes.

From my testing, using Word XP/2002, only the first TextBox is
included in StoryRanges. The second and subsequent TextBoxes are not
there. To further complicate things, a TextBox can be either a Shape
of Type msoTextBox, or a CanvasItemShape of Type msoTextBox within one
of the Shapes in the document.

If anybody wants to play around with this, I created a sample doc:
http://www.mediafire.com/file/y2nzznxkjm4/Sample_Text.doc

If you run the below sub HighlightAllStoryWords (a simple loop through
StoryRanges) on this doc, you see that it misses Box B and Box C.

If you run the below sub HighlightAllDocWords (my current, annoyingly
complex code) on this doc, you see that it gets all boxes. I would
like to have this simpler in order to more easily implement the
original problem in this thread: being able to loop through all
sentences/words in a doc starting at the current cursor location with
wrap-around, rather than always at the beginning of the doc.

Any way, thanks again Greg and Graham for your insight.

' ------------------------------------------------
Public Sub HighlightAllStoryWords()

Dim rngStory As Range
Dim rngSentence As Range
Dim rngWord As Range

For Each rngStory In ActiveDocument.StoryRanges
For Each rngSentence In rngStory.Sentences
For Each rngWord In rngSentence.Words
rngWord.HighlightColorIndex = wdYellow
Next rngWord
Next rngSentence
Next rngStory

End Sub
' ------------------------------------------------
Public Sub HighlightAllDocWords()

Dim rngStory As Range
Dim rngSentence As Range
Dim rngWord As Range
Dim shpShape As Shape
Dim shpCanvasShape As Shape

For Each rngStory In ActiveDocument.StoryRanges
If rngStory.StoryType <> wdTextFrameStory Then
For Each rngSentence In rngStory.Sentences
For Each rngWord In rngSentence.Words
rngWord.HighlightColorIndex = wdYellow
Next rngWord
Next rngSentence
End If
Next rngStory
For Each shpShape In ActiveDocument.Shapes
If shpShape.Type = msoTextBox Then
For Each rngSentence In shpShape.TextFrame.TextRange.Sentences
For Each rngWord In rngSentence.Words
rngWord.HighlightColorIndex = wdPink
Next rngWord
Next rngSentence
Else
If shpShape.CanvasItems.Count > 0 Then
For Each shpCanvasShape In shpShape.CanvasItems
If shpCanvasShape.Type = msoTextBox Then
For Each rngSentence In
shpCanvasShape.TextFrame.TextRange.Sentences
For Each rngWord In rngSentence.Words
rngWord.HighlightColorIndex =
wdBrightGreen
Next rngWord
Next rngSentence
End If
Next shpCanvasShape
End If
End If
Next shpShape

End Sub
 
G

Greg Maxey

hafabee,

I tried unsuccessfully to download your document.

I think the "hook" you need and are still missing is:

".NextStoryRange"

and the associated Do ... Loop shown below.

Public Sub HighlightAllStoryWords()
Dim rngStory As Range
Dim rngSentence As Range
Dim rngWord As Range
For Each rngStory In ActiveDocument.StoryRanges
Do
For Each rngSentence In rngStory.Sentences
For Each rngWord In rngSentence.Words
rngWord.HighlightColorIndex = wdYellow
Next rngWord
Next rngSentence
Set rngStory = rngStory.NextStoryRange
Loop Until rngStory Is Nothing
Next rngStory
End Sub

This code will certainly pick up Textboxes A, B, and C. As far as I can
tell it picks up textboxes and the textrange of any shape in a canvass or
not.
 
G

Greg Maxey

I just did manage to download your sample doc. The code I posted earlier
does highlight every word.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top