frames to plain text

Yvonne_G

Hi all,

I scanned an article containing fairly complicated tables into Word so that I
can put the numbers in Excel later. So now I have a Word document containing (a
lot of) frames (I think they are neither textboxes nor tables, because I can't
convert them to text). I need to convert the frames (preferably all frames
at once, of course) to the normal Word layer. I think I need VBA for that.
Can anybody give me the code? I use XP SP2 and W2k.

TIA

Jack Sons
The Netherlands
 
Jezebel

Frames are already in the normal Word layer. There is no conversion
involved. A frame is simply an instruction to position the paragraph
somewhere unusual.

If they are frames, you can set the text back to normal very simply: Ctrl-A,
then Ctrl-Q. This resets every paragraph to the formatting of its style.

However, I think they are more likely to be textboxes. Right-click on the
edge of one: does the popup menu offer 'format frame' or 'format textbox' ?
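
The same check can also be made from VBA. This is only a minimal sketch: the macro name CountFramesAndTextboxes is made up for illustration, and it assumes the objects of interest are in the body of the document, since ActiveDocument.Shapes does not cover headers and footers.

```vba
Sub CountFramesAndTextboxes()
    'Hypothetical helper: reports how many frames and textboxes
    'the active document contains
    Dim oShp As Shape
    Dim nTextboxes As Long

    'Textboxes are floating shapes of type msoTextBox
    For Each oShp In ActiveDocument.Shapes
        If oShp.Type = msoTextBox Then nTextboxes = nTextboxes + 1
    Next oShp

    'Frames have their own collection
    MsgBox "Frames: " & ActiveDocument.Frames.Count & vbCr & _
           "Textboxes: " & nTextboxes
End Sub
```

A frame count above zero together with a textbox count of zero would suggest the objects really are frames.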
 
Jay Freedman

Unfortunately, VBA will make a worse mess than you already have.

The problem is that each frame is "anchored" to a text paragraph in the body
of the document; most OCR programs put the anchors for all frames in the
first paragraph on the page. When you right-click a frame and choose Format
Frame and then the "Remove frame" button, the frame's text is dropped at the
anchor position, not at the point on the page where it appears to be. With
two or more frames, the texts can even appear out of order -- text from a
frame near the bottom of the page can be dropped before text from a frame at
the top.

When you use VBA to do the equivalent, you have no control over where the
text is dropped, so you can't prevent the scrambling that results. Then
you'll spend hours trying to figure out which piece belongs where.

To get the text where it belongs, it will actually be quicker to select the
contents of each frame manually, cut and paste it into the regular text at
the proper position, and then delete the frame. Take the word of someone
who's tried it. :-(

Before you do this, you might try scanning the tables again, this time
turning off any feature in the OCR program that attempts to preserve
formatting. If you're lucky, you can get a plain-text file that you can
import directly into Excel.

--
Regards,
Jay Freedman
Microsoft Word MVP
Email cannot be acknowledged; please post all follow-ups to the newsgroup so
all may benefit.
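
For the manual cut-and-paste described above, it may help to see where each frame sits before touching anything. The following is a rough sketch, not a fix: it only lists the frames in the Immediate window and moves nothing. The macro name ListFramePositions is hypothetical, and it assumes that Range.Information reports the displayed position of the framed text, which is worth checking on a test page first.

```vba
Sub ListFramePositions()
    'Hypothetical helper: prints page, vertical position and the
    'first few characters of every frame; changes nothing
    Dim i As Long
    Dim oRng As Range

    For i = 1 To ActiveDocument.Frames.Count
        Set oRng = ActiveDocument.Frames(i).Range
        'The vertical position is measured in points from the top of the page
        Debug.Print "Frame " & i & _
            ", page " & oRng.Information(wdActiveEndPageNumber) & _
            ", top " & oRng.Information(wdVerticalPositionRelativeToPage) & _
            " pt: " & Left$(oRng.Text, 30)
    Next i
End Sub
```

Sorting that list by page and then by position gives a reading order to follow while pasting.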
 
Yvonne_G

Jay,

Maybe I can do what I want if I have code that can transform frames into
textboxes (I'll put them in a separate document) and also code that
transforms textboxes into a plain Word document. A number of years ago I
had this kind of problem too, and some good soul gave me the code that I
needed (it worked nicely), but I lost it.
Please help.

Jack.
 
Jezebel

You can iterate the textboxes and retrieve all the text; but you'll still
have the positioning problem that Jay mentions.

Sub TextboxTextToBody()
    Dim pRange As Word.Range

    'Get the first textbox contents
    '(this line raises an error if the document has no textboxes)
    Set pRange = ActiveDocument.StoryRanges(wdTextFrameStory)
    Do
        'Put the textbox text in the body of the document, eg
        ActiveDocument.Content.InsertAfter pRange.Text

        'Get the next textbox
        Set pRange = pRange.NextStoryRange

    Loop Until pRange Is Nothing
End Sub
 
Jack Sons

Jezebel,

What about this?

Jack.
-------------------------------------------------------------------------------------------------------------
Sub TextboxesToText()
    'Convert textbox text to plain text
    Dim i As Long

    'Loop backwards by index: ConvertToFrame takes the shape out of the
    'Shapes collection, so a For Each loop could skip textboxes
    For i = ActiveDocument.Shapes.Count To 1 Step -1
        With ActiveDocument.Shapes(i)
            If .Type = msoTextBox Then .ConvertToFrame
        End With
    Next i

    'Strip borders and shading from each frame, then remove the frame
    For i = ActiveDocument.Frames.Count To 1 Step -1
        With ActiveDocument.Frames(i)
            .Borders.Enable = False
            With .Shading
                .Texture = wdTextureNone
                .ForegroundPatternColor = wdColorAutomatic
                .BackgroundPatternColor = wdColorAutomatic
            End With
            .Delete
        End With
    Next i
End Sub
----------------------------------------------------------------------------------------------------------------
 
Yvonne_G

Jezebel,

Your code halts at the first line
Set pRange = ActiveDocument.StoryRanges(wdTextFrameStory)

and I get a message saying (translated from Dutch) "the member of the
collection that you asked for doesn't exist".

In my VBA editor the screen has three parts: on the left is the project, in
the middle the classes and on the right the members of StoryRanges. If I
select StoryRanges in the middle list, the right list shows only 5 members:
Application, Count, Creator, Item and Parent. _NewEnum is greyed out.
The middle list contains all kinds of things (called classes), including the
Word constants. But there is no constant wdTextFrameStory. The list jumps from
WdTextFormFieldType to WdTextOrientation, so no WdTextFrame constants at
all.

Can you explain what goes wrong?

Jack.
 
Jezebel

The message you're getting means that your document has NO textboxes. (Test
it by creating a new document, adding a textbox, and running the code.)

The five 'members' you refer to are actually the properties and methods of
the StoryRanges collection (same as for any collection). The members of the
collection itself are accessed via the Item() method. (Since Item is the
default method, you don't need to use the word 'item'). Try this: in the VBA
immediate window, type

? activedocument.StoryRanges(

.... you'll see a list of all the possible storyranges. Every document has
wdMainTextStory -- this is the body of the document. The other members
(headers, footers, textboxes, footnotes, etc) are present only if you add
them to the document.


In your case, if there are no textboxes, the things you're seeing are indeed
frames, as you originally suggested. Frames are part of the main story. The
framing is handled as part of the paragraph format.
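
To see which stories a given document actually contains, something like the following could be run. It is a minimal sketch and the macro name ListStoryRanges is made up for illustration. StoryType is printed as a number; in the WdStoryType enumeration, 1 is wdMainTextStory and 5 is wdTextFrameStory.

```vba
Sub ListStoryRanges()
    'Hypothetical helper: lists the stories present in the active
    'document and counts its frames
    Dim oRng As Range

    For Each oRng In ActiveDocument.StoryRanges
        'Print the story type number and the length of its text
        Debug.Print "Story type " & oRng.StoryType & _
            ", characters: " & Len(oRng.Text)
    Next oRng

    'Frames belong to the main story, so they are counted separately
    Debug.Print "Frames: " & ActiveDocument.Frames.Count
End Sub
```

If no line with story type 5 appears in the Immediate window, the document has no textboxes, which matches the error message reported above.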
 
