nitm
hi everyone,
i need to parse a word document with special tags.
the documents always have the same structure:
each document is divided into sections. each section starts with a <section>
tag, followed by any number of optional tags (e.g. <title>i am a title<title>,
<author>paul auster<author>, <date>06/06/2007<date>, and so on), and after
these tags comes the section text.
i need to go over the entire document and break these sections apart (i have
a special object for a section).
i use C# and i can't find anything on the web that can help me with this...
here's what i have so far:

Microsoft.Office.Interop.Word.Range currentRange, baseRange;
currentRange = wordDoc.Content;
baseRange = currentRange.Duplicate;
bool ans = currentRange.Find.Execute(
    ref searchFor, ref falseObj, ref falseObj, ref trueObj, ref falseObj,
    ref falseObj, ref trueObj, ref wrap, ref falseObj, ref missing,
    ref falseObj, ref falseObj, ref falseObj, ref falseObj, ref falseObj);
if (!ans) {
    addUserMessage("Error: document does not seem to be in a valid format");
}
else while (ans) {
    baseRange.Start = currentRange.End + 1;
    ans = currentRange.Find.Execute(
        ref searchFor, ref falseObj, ref falseObj, ref trueObj, ref falseObj,
        ref falseObj, ref trueObj, ref wrap, ref falseObj, ref missing,
        ref falseObj, ref falseObj, ref falseObj, ref falseObj, ref falseObj);
    baseRange.End = currentRange.Start - 1;
    addUserMessage(baseRange.Text);
}

this works great except that after the last match the search goes back to the
start of the document instead of returning false to ans, so the loop never
stops.
what's wrong with my code? and if anyone thinks there's a better way to do
what i'm trying to do, i'll be happy to know about it...
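
for example, one alternative i could imagine (a rough sketch only, not tested against real documents) is to read the whole text once and split it with plain string operations instead of Find. the Section class, the SectionSplitter class and the regex below are made-up names for illustration, and the sketch assumes the tags are plain text that look exactly like the examples above (the same tag name opens and closes a value):

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the "special object for a section".
public sealed class Section
{
    // Optional tags such as title, author, date, keyed by tag name.
    public Dictionary<string, string> Tags { get; } = new Dictionary<string, string>();
    public string Text { get; set; } = "";
}

public static class SectionSplitter
{
    // Matches one tag pair at the current position only (\G):
    // <name>value<name>, and also <name>value</name> in case a slash is used.
    private static readonly Regex LeadingTag =
        new Regex(@"\G\s*<(\w+)>(.*?)</?\1>", RegexOptions.Singleline);

    public static List<Section> Split(string documentText)
    {
        var sections = new List<Section>();

        // Assumption: anything before the first <section> marker is ignored.
        string[] chunks = documentText.Split(new[] { "<section>" }, StringSplitOptions.None);

        for (int i = 1; i < chunks.Length; i++)
        {
            string chunk = chunks[i];
            var section = new Section();

            // Consume the optional tags that directly follow <section>.
            int pos = 0;
            Match m = LeadingTag.Match(chunk, pos);
            while (m.Success)
            {
                section.Tags[m.Groups[1].Value] = m.Groups[2].Value.Trim();
                pos = m.Index + m.Length;
                m = LeadingTag.Match(chunk, pos);
            }

            // Whatever is left after the leading tags is the section text.
            section.Text = chunk.Substring(pos).Trim();
            sections.Add(section);
        }
        return sections;
    }
}
```

it would be called as SectionSplitter.Split(wordDoc.Content.Text). the obvious downside is that it only sees plain text, so any formatting inside a section is lost, which is why i'd still like to understand the Find loop.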
thanks a lot, nitzan