Parsing Text into fields (John Vinson?)

Sharon

Because I am a visual type person and it helps to be able to "spell" out the
code in words I understand, I have attempted to follow up on my 10/09/06 post
about parsing code. (BTW John:) I am attempting to parse these fields
because these documents are already entered into the Word program as shown
below and, of course, they would be better suited to having separate fields.
Through trial and error (and there was a lot), this is what I have come up
with (although I do have a wonderful color coded document for myself (which I
can't seem to send to y'all) (yes I am from Texas)). Boy, you sure have to
watch those parentheses.

Anyway, I have a question at the very end that has puzzled me to no end and
if anyone has a chance to answer it, I would be grateful. Thanks.

This is the [Cite] field:

Abaza and Atassi, "Effects of Amino Acid Substitutions Outside an Antigenic
Site on Protein Binding to Monoclonal Antibodies of Predetermined Specificity
Obtained by Peptide Immunization: Demonstration with Region 94-100
(Antigenic Site 3) of Myoglobin", J. Protein Chemistry, 11(5):433-444, 1992.

This is the code:

Title: Mid([Cite],InStr(1,[Cite],", """)+3,InStr(InStr(1,[Cite],""", ")-1,[Cite],",")-InStr(1,[Cite],",")-4)

This is my interpretation of the code in English:

1. In the middle of the string, starting at character 1 of the field
[Cite], and searching until it finds the characters: , "

2. Then deletes from the first of said characters + three spaces to the
right.

3. In the middle of the string, starting at character 1 of the field
[Cite], and searching until it finds the characters: ",

Then deletes (starting at the comma) to the end of the field [Cite] minus
four spaces (4)

*Not sure why -4, but it appears as if it starts at the text and then
subtracts four spaces to the left?
 
BruceM

I'll make a few observations while waiting to see if John jumps in. When
parsing strings I find it helpful to break the expression into its component
parts, so a good place to start is the first InStr.

First of all, I will shorten the citation so that the positions are easier
to count:
Abaza and Atassi, "Effects of Amino Acid", J. Protein Chemistry, 11(5):433-444, 1992

InStr starts at the first position by default (so you can drop the 1, from
the expression) and returns the position of the first occurrence of the text
you search for, so you could just look for the quote. The expression:

TestOne: InStr([Cite],"""")

will return 19, if I have counted correctly. The double quote is in the 19th
position in the string.

TestOne: InStr([Cite],"""") + 1

will return 20. You could use:

TestOne: InStr([Cite],", """) + 3

but it makes it more complex than it needs to be.
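
If you want to double-check the counting outside Access, here is a small Python sketch. It is not Access code: instr is a hypothetical helper written to imitate InStr's 1-based result (0 when nothing is found), and its optional start argument comes last, whereas InStr takes it first.

```python
# Shortened citation from the example above.
cite = 'Abaza and Atassi, "Effects of Amino Acid", J. Protein Chemistry, 11(5):433-444, 1992'

def instr(text, find, start=1):
    """Hypothetical stand-in for Access InStr: 1-based position, 0 if not found."""
    return text.find(find, start - 1) + 1

first_quote = instr(cite, '"')        # like InStr([Cite],"""")
after_quote = first_quote + 1         # like InStr([Cite],"""") + 1
via_comma = instr(cite, ', "') + 3    # like InStr([Cite],", """) + 3

print(first_quote, after_quote, via_comma)
```

Both routes land on the same starting position, which is why searching for the quote alone is enough here.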

TestTwo: Mid([Cite],20)

will return everything from the twentieth character, so it should leave you
with the string starting from the first letter of "Effects":
Effects of Amino Acid", J. Protein Chemistry, 11(5):433-444, 1992

If you insert the expression that returned 20 in place of the number 20:

TestThree: Mid([Cite],InStr([Cite],"""") + 1)

you will get the same result as in TestTwo. Note that you need to add or
subtract from InStr, which returns a number, rather than from Mid, which
returns a text string.

TestFour: InStr([TestThree],"""")-1

will return the position of the character preceding the quote mark in the
reduced-length string produced by the TestThree expression, or 21.

TestFive: Left([TestThree],21)

will return:

Effects of Amino Acid

Again, substitute the expression for the number:

TestSix: Left([TestThree],InStr([TestThree],"""")-1)
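
The TestThree to TestSix steps can be traced the same way in a short Python sketch. The helpers instr, mid and left are hypothetical stand-ins that imitate the 1-based behaviour of the Access functions; they are not part of Access or of any library.

```python
# Shortened citation from the example above.
cite = 'Abaza and Atassi, "Effects of Amino Acid", J. Protein Chemistry, 11(5):433-444, 1992'

def instr(text, find, start=1):
    """Stand-in for InStr: 1-based position, 0 if not found."""
    return text.find(find, start - 1) + 1

def mid(text, start, length=None):
    """Stand-in for Mid: substring from a 1-based start, to the end if no length."""
    end = None if length is None else start - 1 + length
    return text[start - 1:end]

def left(text, count):
    """Stand-in for Left: the first count characters."""
    return text[:count]

test_three = mid(cite, instr(cite, '"') + 1)   # everything after the opening quote
test_four = instr(test_three, '"') - 1         # character just before the closing quote
test_six = left(test_three, test_four)         # the title on its own

print(test_three)
print(test_four)
print(test_six)
```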

Finally, substitute the TestThree expression for the field name:

Left(Mid([Cite],InStr([Cite],"""") + 1),InStr(Mid([Cite],InStr([Cite],"""") + 1),"""")-1)
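
As a cross-check, the Python sketch below runs both the nested Left/Mid expression and the original Title expression against the full citation from the question. The helpers are hypothetical stand-ins for the Access functions, and the reading of the -4 in the comments is my own interpretation of the arithmetic, assuming the title itself contains no comma and no double quote.

```python
title = ("Effects of Amino Acid Substitutions Outside an Antigenic Site on "
         "Protein Binding to Monoclonal Antibodies of Predetermined Specificity "
         "Obtained by Peptide Immunization: Demonstration with Region 94-100 "
         "(Antigenic Site 3) of Myoglobin")
cite = 'Abaza and Atassi, "' + title + '", J. Protein Chemistry, 11(5):433-444, 1992.'

def instr(text, find, start=1):
    """Stand-in for InStr: 1-based position, 0 if not found (start comes last here)."""
    return text.find(find, start - 1) + 1

def mid(text, start, length=None):
    """Stand-in for Mid: substring from a 1-based start, to the end if no length."""
    end = None if length is None else start - 1 + length
    return text[start - 1:end]

def left(text, count):
    """Stand-in for Left: the first count characters."""
    return text[:count]

# Nested Left/Mid expression: cut at the opening quote, then at the closing quote.
rest = mid(cite, instr(cite, '"') + 1)
nested = left(rest, instr(rest, '"') - 1)

# Original Title expression: Mid(start, length) with the length taken from two commas.
start = instr(cite, ', "') + 3
second_comma = instr(cite, ",", instr(cite, '", ') - 1)   # comma after the closing quote
first_comma = instr(cite, ",")                            # comma after the authors
gap = second_comma - first_comma
original = mid(cite, start, gap - 4)

# The gap between the commas is the title plus four positions: the space, the
# opening quote, the closing quote, and the step onto the second comma itself.
print(gap - len(title))
print(nested == title, original == title)
```

Read this way, Mid does not delete anything: it returns the piece that starts at the given position and runs for the given length, and the -4 trims the comma-to-comma distance down to the length of the title alone.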

