How to loop through all the different "things" in a Word doc?

  • Thread starter Margaret Bartley

Margaret Bartley

If I want to do a For Each....Next loop to catch all the objects in a Word
document, how do I handle it?

The things I would expect to find would be
First- a Table of Contents
Second - possibly a paragraph or more or not
Third - a table

Then, possibly another paragraph, followed by another table.

So, I want to just start at the beginning, skip all the TOC stuff, and then
find out if the next thing is a paragraph, that I have to save to a
variable, or a table, that I have to loop through the rows and cells of.

Maybe all I need to know is the generic word that describes all the
different objects in the body of a document (no header or footer or
footnotes, etc)

Thanks
 

Doug Robbins - Word MVP

The usual way is to declare a counter variable such as i As Long and then
loop over the collection by index:

Dim i As Long
With ActiveDocument
    For i = 1 To .[objects].Count
        ' do something with .[objects](i)
    Next i
End With

However, if the loop destroys or deletes the objects as it goes, you should
count backwards so that the indexes of the remaining items stay valid:

With ActiveDocument
    For i = .[objects].Count To 1 Step -1
        ' do something with .[objects](i)
    Next i
End With
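
As a concrete instance of the pattern above, here is a minimal sketch that puts .Tables in place of the [objects] placeholder. The Sub name and the choice to print the cell count are only for illustration; any other single collection (.Paragraphs, .Shapes and so on) could be substituted in the same way.

```vba
Sub CountCellsInEachTable()
    Dim i As Long

    With ActiveDocument
        ' Forward loop: nothing is deleted, so counting up is safe.
        For i = 1 To .Tables.Count
            ' Range.Cells covers every cell in the table.
            Debug.Print "Table " & i & ": " & .Tables(i).Range.Cells.Count & " cells"
        Next i
    End With
End Sub
```

The output goes to the Immediate window of the VBA editor (Ctrl+G).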


--
Hope this helps.

Please reply to the newsgroup unless you wish to avail yourself of my
services on a paid consulting basis.

Doug Robbins - Word MVP, originally posted via msnews.microsoft.com
 

Jay Freedman

Margaret,

What Doug has shown you is true, but it's true for only one kind of object
at a time -- for example, .[objects] might be .Tables or .Paragraphs or
.Shapes. But there is _no_ generic [objects] collection for all the "things"
in a Word document, and no way to step through the "things" in the order
they appear on the page.

That's because the "things" you see on the page are often put there out of
order: The paragraphs of text, non-floating tables, and non-floating
graphics (InlineShapes) are positioned; then floating tables and graphics
(Shapes) are inserted relative to their anchors in the text, which may push
following text out of the way; then headers, footers, and footnotes come in;
etc. It's not at all a linear stream of objects.

If you tell us the end result you're trying to accomplish, rather than how
you want to do it, maybe we can suggest a way that has at least a chance of
working. The current plan is a non-starter.

--
Regards,
Jay Freedman
Microsoft Word MVP
Email cannot be acknowledged; please post all follow-ups to the newsgroup so
all may benefit.
 

Margaret Bartley

OK. I've been trying to figure this out for a long time, and keep coming to
a dead end, probably because, as you said, the plan is a "non-starter"

What I have are dozens of Word documents that I've used to save information
from my internet research.

The documents are named "Health.doc", "Travel.doc", "Internet.doc", and so on.
Each document has several tables, each preceded by a title, which may be in
the Heading 1 or Heading 2 style.

At the beginning of each document, I've put in a TOC listing all the tables.

I now want to export all of these to Excel or Access (there are thousands
of links).
Each row or record will have the heading level(s), the title, the URL, and
the comments.

So, I need to go through the document. First, skip the TOC (it may or may
not be accurate).
If I see a paragraph, I need to identify its style, then save the text, and
put those two in variables to use when I write the row in Excel or the record
in Access.

So a document called "Health.doc" has the following contents:

Facilities..............1 [this is the TOC]
Gyms................1
Suppliers 1


Facilities [Heading 1 style]
www.grouphealth.com login: MargaretBartley pw: password

Gyms [Heading 2 style]
www.localgym.com Haven't been there yet. Jim recommends 6/6/08

Buy [Heading 1 style]
www.buyonline.com login: MargaretBartley pw password, buy taurine

and will create the following dataset:

Doc     Heading           Text        URL                  Comment
==========================================================================================
Health  H1                Facilities  www.grouphealth.com  login: MyName pw: Password
Health  H1 Facilities H2  Gyms        www.localgym.com     haven't been there yet. Jim recommends 6/6/08
Health  H1                Buy         www.buyonline.com    login: MyName pw PassWord, buy taurine


Any help you can give would be great. This is totally getting out of hand!




 

Jay Freedman

This looks to be much simpler than the "general case" that I described when I
didn't know what was in your document.

Two questions, though: When you say the document has "tables", are they really
just paragraphs of text the way you showed them here, or are they real Word
tables (as in Table > Insert Table)? And is the TOC just plain text, or is it a
Word table of contents (as in Insert > Reference > Index and Tables)?

If my assumption is correct that these are just paragraphs, then you can use a
loop like

Dim oPara As Paragraph
For Each oPara In ActiveDocument.Paragraphs
    ' process the paragraph, its style, and the following data
Next oPara
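
To make the loop above a bit more concrete, here is a rough sketch of how it might be filled in. It rests on several assumptions that are not confirmed in this thread: the titles use the built-in Heading 1 and Heading 2 styles (detected here through their outline level), the TOC is a real Word TOC field, and the data to export is text inside Word tables, whose cell paragraphs also belong to the Paragraphs collection. The Sub and variable names are made up for illustration, and the sketch only prints tab-separated lines to the Immediate window rather than writing to Excel or Access.

```vba
Sub WalkBodyInOrder()
    Dim oPara As Paragraph
    Dim rngTOC As Range
    Dim bInTOC As Boolean
    Dim sText As String
    Dim sH1 As String, sH2 As String

    ' Remember where the first TOC field sits, if the document has one.
    If ActiveDocument.TablesOfContents.Count > 0 Then
        Set rngTOC = ActiveDocument.TablesOfContents(1).Range
    End If

    For Each oPara In ActiveDocument.Paragraphs
        ' A paragraph that starts inside the TOC range is skipped.
        bInTOC = False
        If Not rngTOC Is Nothing Then
            bInTOC = (oPara.Range.Start >= rngTOC.Start _
                      And oPara.Range.Start < rngTOC.End)
        End If

        ' Strip the paragraph mark and any end-of-cell marker.
        sText = Replace(Replace(oPara.Range.Text, Chr(7), ""), vbCr, "")

        If Not bInTOC And Len(sText) > 0 Then
            If oPara.OutlineLevel = wdOutlineLevel1 Then
                sH1 = sText      ' new top-level title
                sH2 = ""         ' any previous sub-title no longer applies
            ElseIf oPara.OutlineLevel = wdOutlineLevel2 Then
                sH2 = sText
            ElseIf oPara.Range.Information(wdWithInTable) Then
                ' Text inside a table cell, tagged with the current titles.
                Debug.Print sH1 & vbTab & sH2 & vbTab & sText
            End If
        End If
    Next oPara
End Sub
```

Body paragraphs that are neither headings nor inside a table are simply ignored here; splitting each cell's text into URL and comment is left out.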


 

Margaret Bartley

These are actually tables, and the TOC was inserted using
Insert > Reference > Index and Tables.

I guess I could do a global Convert Table to Text for each table in the
document at the beginning?
Is there a VBA command for that?



 

Doug Robbins - Word MVP

The following will convert all of the tables in a document to text:

Dim i As Long
With ActiveDocument
    For i = .Tables.Count To 1 Step -1
        .Tables(i).ConvertToText
    Next i
End With
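
If changing the documents is not wanted, a possible alternative (not something proposed in this thread) is to leave the tables in place and read the cells directly. The sketch below assumes simple top-level tables; the Sub name and the printed row,column layout are made up for illustration.

```vba
Sub DumpTableCells()
    Dim oTbl As Table
    Dim oCell As Cell
    Dim sText As String

    For Each oTbl In ActiveDocument.Tables
        ' Range.Cells visits every cell of the table.
        For Each oCell In oTbl.Range.Cells
            ' Cell text ends with a two-character end-of-cell marker; drop it.
            sText = oCell.Range.Text
            sText = Left$(sText, Len(sText) - 2)
            Debug.Print oCell.RowIndex & "," & oCell.ColumnIndex & ": " & sText
        Next oCell
    Next oTbl
End Sub
```

Going through Range.Cells rather than Rows is a deliberate choice here, since accessing individual rows can fail when a table contains vertically merged cells.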


 
