How to extract strings from the code inside a field?

E

Edward Mendelson

I'm writing a macro that needs to test certain strings inside a field code
and make decisions on that basis, and I hope someone can suggest how to test
for the strings inside the code.

Here is the problem:

In a document that contains Symbol fields, I toggle field codes visible,
then I search for the string

^19symbol

This takes me to the first symbol field. The content of the field looks
something like this:

{symbol 65 \f "WP TypographicSymbols" \s 12}

(the curly braces are of course really the opening and closing field codes)

65 is the symbol number, "WP TypographicSymbols" is the font name.

What I want to do is put the symbol number and font name into variables, so
that I can run a Select Case test, first on the font name, then on the
symbol number, to determine what to do next.

Very conveniently, Word will replace these fields even if you do not specify
the full text of the field, so the macro can find

^19symbol 65 \f "WP Typ

and replace it with a quotation mark, and the find-replace will succeed.

Many thanks for any help in extracting those two variables from the field
codes.

Edward Mendelson
 
H

Helmut Weber

Hi Edward,
here comes another one:
Sub Resetsearch()
With Selection.Find
.ClearFormatting
.Replacement.ClearFormatting
.Text = ""
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
.Execute
End With
End Sub
---
Sub test444()
Dim rDcm As Range
Dim rFnd As Range
Set rDcm = ActiveDocument.Range
Resetsearch
With rDcm.Find
.Text = "^19 symbol"
While .Execute
Set rFnd = rDcm.Duplicate
With rFnd.Find
'.Text = " [0-9]{1;}" ' German
.Text = " [0-9]{1,}" ' English
.MatchWildcards = True
If .Execute Then
MsgBox Trim(rFnd) ' var 1
End If
End With
Set rFnd = rDcm.Duplicate
With rFnd.Find
.Text = "^34*^34"
.MatchWildcards = True
If .Execute Then
MsgBox Trim(rFnd) 'var 2
End If
End With
Wend
End With
Resetsearch
End Sub
Note that there are again differences according to language.
This time it is the listseperator in the wildcard search:
German ";", English ",".
I wonder, whether "resetsearch" is required at that position,
or required at all. At least, it doesn't do any harm. Even
"if .execute" may not be required, if you absolutely sure about
the contents of the fields.
 
H

Helmut Weber

Hi Edward,
maybe instead of passing a range to the messagebox
it would be better, to pass the range.text.
Like this:
1. MsgBox Trim(rFnd.Text)
2. MsgBox rFnd.Text
I removed "trim" in 2., as it was a result of copying and pasting.
 
J

Jay Freedman

Hi Edward,

Helmut gave you a good search algorithm. Just to prove there's usually
another way, this one loops through all the fields (in the document
body only; you need the usual StoryRange loops if the symbols may be
in headers/footers, textboxes, etc.) and extracts stuff from each
Symbol field.

Sub GetSymbols()
Dim oFld As Field
Dim strCode As String
Dim nAsc As Long
Dim strFont As String
Dim strTemp As String
Dim nPos As Integer

For Each oFld In ActiveDocument.Fields
With oFld
' work only on Symbol fields
If .Type = wdFieldSymbol Then
' extract the code
' (you don't have to display field codes)
strCode = .Code

' remove the symbol keyword
strCode = Mid$(strCode, Len(" symbol ") + 1)

' extract the character number
Do While IsNumeric(Left$(strCode, 1))
strTemp = strTemp & Left$(strCode, 1)
strCode = Mid$(strCode, 2)
Loop
nAsc = Val(strTemp)

' extract the font name
' starting after the \f switch and
' ending before the next switch if any
nPos = InStr(strCode, "\f ")
strCode = Mid$(strCode, nPos + 3)
nPos = InStr(strCode, " \")
If nPos Then
strFont = Left$(strCode, nPos + 1)
Else
strFont = strCode
End If
' remove quotes if any
strFont = Replace(strFont, """", "")

MsgBox "Character " & nAsc & vbCr & strFont
End If
End With
Next oFld
End Sub
 
E

Edward Mendelson

Hi Jay,

I think this is getting very close to what I need, because it should let me
branch off and make a global search replace after each match without
disturbing the sequence of jumping from one character to the next.

I'm going to beg your help in fixing the original code (included below)
slightly. Right now if the macro encounters these field strings:

symbol 65 \ f "WP TypographicSymbols" \s 12
symbol 64 \f "WP Typographic Symbols" \s 12
symbol 67 \f "WP Typographic Symbols \s 12

The message boxes read:

Character 5
WP TypographicSymbols \
(only the second digit in 65 gets returned, and there's an
extraneous \)

Character 54
WP TypographicSymbols \
(only the second digit gets returned, and is appended to the 5
returned earlier)

Character 547
WP TypographicSymbols \
(again, only the second digit gets returned)

The appending is easy to fix with a strTemp = "" before you extract the
character number, but I can't figure out how to get the full numbers (you
return the second and third digits in three-digit numbers, so the first
digit is always the missing one, I think).

Also, what needs to be fixed to remove the " \" from the end of the font
string?

Many thanks for this, and my apologies for coming back for more!

Edward Mendelson
 
G

Greg Maxey

Edward,

With the code you posted, I was getting

Character 65 ...
Character 6564 ...
Character 656467 ...


I fixed that with:
strTemp = "" (above to start of the Do ... Loop)

This is working here:
Sub Test()

Dim oFld As Field, strCode As String, strTemp As String, nAsc As Long, nPos
As String, strFont As String
For Each oFld In ActiveDocument.Fields
With oFld
' work only on Symbol fields
If .Type = wdFieldSymbol Then
' extract the code
' (you don't have to display field codes)
strCode = .Code

' remove the symbol keyword
strCode = Mid$(strCode, Len(" symbol ") + 1)

' extract the character number
strTemp = ""
Do While IsNumeric(Left$(strCode, 1))
strTemp = strTemp & Left$(strCode, 1)
strCode = Mid$(strCode, 2)
Loop
nAsc = Val(strTemp)

' extract the font name
' starting after the \f switch and
' ending before the next switch if any
nPos = InStr(strCode, "\f ")
strCode = Mid$(strCode, nPos + 3)
nPos = InStr(strCode, " \")
If nPos Then
strFont = Left$(strCode, nPos + 1)
Else
strFont = strCode
End If
' remove quotes if any
strFont = Replace(strFont, """", "")
MsgBox "Character " & nAsc & vbCr & strFont
End If
End With
Next oFld
End Sub
 
E

Edward Mendelson

Greg and Jay,

Many thanks to both: I see what was going wrong earlier when the code left
the first digit trimmed.

As Greg says, the original code (with the addition of the line strTemp="")
works perfectly with Symbol fields that you create by your yourself with the
Insert / Field dialog.

However, when the Word WP import filter creates symbol fields automatically,
they are formatted different from the way the Insert Field formats them.

Insert Field produces this string. For clarity, I've added hyphens (-) where
Word inserts spaces:

{--SYMBOL--65-\f-"WP TypographicSymbols" ....

But the WP import filter creates fields that look like this:

{symbol-65-\f- "WP TypograhicSymbols" ....

The extra space after "symbol" in the Insert/Field version was the problem
here.

Is there a way to write the string handling code so that it works correctly
no matter how many spaces there are before or after the "symbol" string??

Thanks again for all this very generous help.

Edward Mendelson

==============================================
' working code for fields created by the WP import filter
' and exposed by saving a file in WinWord 2.0 format
' and then opening it again

Sub GetDataFromFieldsCreatedByWord()

Dim oFld As Field
Dim strCode As String
Dim strTemp As String
Dim nAsc As Long
Dim nPos As String
Dim strFont As String

For Each oFld In ActiveDocument.Fields
With oFld
' work only on Symbol fields
If .Type = wdFieldSymbol Then
' extract the code
' (you don't have to display field codes)
strCode = .Code

' remove the symbol keyword
'''strCode = Mid$(strCode, Len(" symbol ") + 1)
'''old version removes first digit
strCode = Mid$(strCode, Len(" symbol "))
' extract the character number
strTemp = ""
Do While IsNumeric(Left$(strCode, 1))
strTemp = strTemp & Left$(strCode, 1)
strCode = Mid$(strCode, 2)
Loop
nAsc = Val(strTemp)

' extract the font name
' starting after the \f switch and
' ending before the next switch if any
nPos = InStr(strCode, "\f ")
strCode = Mid$(strCode, nPos + 3)
nPos = InStr(strCode, " \")
If nPos Then
'''strFont = Left$(strCode, nPos + 1)
'''old version includes /f after name of font
strFont = Left$(strCode, nPos - 1)
Else
strFont = strCode
End If
' remove quotes if any
strFont = Replace(strFont, """", "")
msgBox "Character " & nAsc & vbCr & strFont
End If
End With
Next oFld
End Sub
 
G

Greg Maxey

Edward,

Try adding

Do
strCode = Mid$(strCode, 2)
Loop Until IsNumeric(Left$(strCode, 1))

immediately after the strTemp="" that you added earlier.

This will keep moving the start of the strCode one character to the right
until the first number is reached following Symbol.
 
J

Jay Freedman

Hi Edward,

Sorry about the problems, though you and Greg seem to have worked
through most of them.

To make the macro handle various numbers of spaces around the keyword,
replace the lines
' remove the symbol keyword
'''strCode = Mid$(strCode, Len(" symbol ") + 1)
'''old version removes first digit
strCode = Mid$(strCode, Len(" symbol "))

with these:

' remove the symbol keyword and any surrounding spaces
Do While Left$(strCode, 1) = " "
strCode = Mid$(strCode, 2)
Loop
strCode = Mid$(strCode, Len("symbol") + 1)
Do While Left$(strCode, 1) = " "
strCode = Mid$(strCode, 2)
Loop
 
E

Edward Mendelson

Greg,

That's perfect - many thanks. I'll post another question about string
handling in a different thread.

Edward Mendelson
 
G

Greg Maxey

Edward,

Glad that I could help. I hope you are back to this thread to see Jay's
reply. It is a better solution. You see, while I may occassionally see a
tree, Jay sees the forrest :)
 
H

Helmut Weber

Hi Edward,
I played for ours with your and my code
and could not get to work it properly.
Ranges and duplicates of duplicates of ranges
left me totally confused. So I thought I'd start
from scratch and arrived at Jay's solution. Just
one odd thing, and that's why I post this here:
If I open a WP file, that was converted to Word 2.0,
that comes from an englisch system, the fields
look like:
{symbol 65 \f "WP TypographicSymbols" \s 12}
However, after saving and reloading, the fields
have changed to, because of my german version:
{sonderzeichen 65 \f "WP TypographicSymbols" \s 12}
 
E

Edward Mendelson

Helmut Weber said:
Hi Edward,
I played for ours with your and my code
and could not get to work it properly.
Ranges and duplicates of duplicates of ranges
left me totally confused. So I thought I'd start
from scratch and arrived at Jay's solution. Just
one odd thing, and that's why I post this here:
If I open a WP file, that was converted to Word 2.0,
that comes from an englisch system, the fields
look like:
{symbol 65 \f "WP TypographicSymbols" \s 12}
However, after saving and reloading, the fields
have changed to, because of my german version:
{sonderzeichen 65 \f "WP TypographicSymbols" \s 12}

Hi Helmut,

Many thanks for working more on this. Thanks to the help of everyone here,
who provided the intelligent parts of the code, I now have what I think is a
working macro for solving the problem of WP-created files that open in Word
with A and @ instead of quotation marks. I had to spend somet time debugging
the look-in-all-ranges code, but I think it works and I think it's
efficient.

It's cluttered with comments, and I haven't combined the
save-as-Word20-and-reopen macro with the replace-symbol-fields macro, but
it's good enough for testing.

Now the question is how to handle non-English versions. It's easy enough to
call a procedure like this at the beginning, and pop up an error message for
any language not listed -

Private Sub SymbolString (sStr As String)
Dim iLng As Integer ' application language
iLng = Application.Language
Select Case iLng
Case 1031: sStr = "sonderzeichen" ' German
Case 1033: sStr = "symbol" ' English
Case Else: sStr = "NoMatch" ' string to compare in calling sub
End Select
'msgBox (sStr)
End Sub

But what to do in cases like yours where the document may have been created
in another language system? I don't see anything in the help that lets me
identify the language in which a document was saved, but I assume it is
there somewhere. Perhaps it would be possible to test for the language in
which the document was saved, and set the "symbol" or "sonderzeichen" string
accordingly?

Best wishes from New York,

Edward Mendelson
 
H

Helmut Weber

Hi Edward,
I wouldn't overcomplicate this. Maybe just saving the doc, opening it
again, display the fields and ask the user for the string in the
fields with the WP-Fonts.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top