vasdeep
I have the following requirement.
1. The user chooses to download the content captured in our system as a Word document. I have written an XSLT that converts the system data (in XML format) into WordML. To identify the contents of this document, I also create hidden tags as part of this transformation. I add these tags using the Word property "w:visible". For example, a tag pair could look like <vasuclause> </vasuclause>.
These tags are not visible to the end user when the document is viewed in MS Word 2003. The content between the tags is the text that the user sees, and it represents one entity in our system. This pattern occurs many times, once for each entity in the system.
However, when Word 2003 opens WordML created this way, it tries to process the tag "<vasuclause>" as markup and reports an error. To avoid this, my transformation escapes the "<" and ">" characters, resulting in "&lt;vasuclause&gt;" and "&lt;/vasuclause&gt;".
Question 1: Is this approach right?
The user can now open the document in Word 2003.
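To make the escaping step concrete, here is a minimal Python sketch of the idea. The real transformation is an XSLT, so this is only an illustration. The helper name `hidden_marker_run` is hypothetical, and the sketch assumes the hidden text is expressed with the WordML run property `w:vanish`, which may differ from what the actual stylesheet emits.

```python
from xml.sax.saxutils import escape


def hidden_marker_run(tag_text: str) -> str:
    """Return a WordML run that carries tag_text as escaped, hidden text.

    Hypothetical helper for illustration; assumes hidden text is
    marked with <w:vanish/> in the run properties.
    """
    # escape() rewrites &, < and > as &amp;, &lt; and &gt;, so Word
    # treats the marker as literal text rather than as an XML element.
    return (
        "<w:r><w:rPr><w:vanish/></w:rPr>"
        f"<w:t>{escape(tag_text)}</w:t></w:r>"
    )


print(hidden_marker_run("<vasuclause>"))
print(hidden_marker_run("</vasuclause>"))
```

Each marker ends up as ordinary character data inside a `w:t` element, which is why Word can open the file, and also why the markers come back escaped when the document is processed later.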
2. The user can make changes to the document and upload it back to the system.
I have written another XSLT that reads the contents of the document and extracts the text together with its formatting. That is, if the user has used bold, italics, underline, lists, paragraphs, etc., all of that information is captured by this transformation. The XSLT has templates that convert the run properties for "bold", "italics", "lists", etc. to their HTML equivalents. It outputs an XML document containing the text with the relevant formatting instructions.
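The run-property mapping described above can be sketched in Python as follows. This is not the XSLT itself, only an illustration of the same idea. The function names and the sample paragraph are hypothetical, and the sketch assumes the Word 2003 WordML namespace and treats the mere presence of `w:b`, `w:i`, or `w:u` as "on" (it ignores any `w:val` attribute).

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Word 2003 WordML namespace (assumed to be the one the document uses).
W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}

# Run property element -> HTML element (simplified, hypothetical mapping).
RUN_PROP_TO_HTML = {"b": "b", "i": "i", "u": "u"}


def run_to_html(run):
    """Convert one w:r element to an HTML fragment."""
    text = "".join(t.text or "" for t in run.findall("w:t", NS))
    html = escape(text)
    props = run.find("w:rPr", NS)
    if props is not None:
        for prop, tag in RUN_PROP_TO_HTML.items():
            # Presence of the property element counts as enabled here.
            if props.find(f"w:{prop}", NS) is not None:
                html = f"<{tag}>{html}</{tag}>"
    return html


def paragraph_to_html(paragraph):
    """Convert one w:p element to an HTML paragraph."""
    runs = paragraph.findall("w:r", NS)
    return "<p>" + "".join(run_to_html(r) for r in runs) + "</p>"


sample = (
    f'<w:p xmlns:w="{W}">'
    "<w:r><w:t>This is </w:t></w:r>"
    "<w:r><w:rPr><w:b/></w:rPr><w:t>bold</w:t></w:r>"
    "</w:p>"
)
print(paragraph_to_html(ET.fromstring(sample)))  # <p>This is <b>bold</b></p>
```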
As part of this, I need to read the "vasuclause" tags that I inserted and use them to identify each entity.
Question 2: The XML output should be something like
<vasuclause>This is the first paragraph with <b> bold </b>
</vasuclause>
The second stage of this process uses a SAX parser to read the XML and insert the data into the database. But the XML output is not correct: the tags are still escaped. Is there a way to resolve this?
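One possible way to deal with the escaped markers, sketched here in Python rather than in XSLT, is a small pre-processing step that turns only the known marker back into real tags before the SAX stage runs. Everything below is an assumption for illustration: the `<doc>` wrapper element, the function and class names, and above all the premise that each marker reaches the output intact as "&lt;vasuclause&gt;" (Word may split or alter hidden text during editing).

```python
import re
import xml.sax

MARKER = "vasuclause"  # marker name used in the post

# Matches only the escaped opening or closing marker, nothing else.
_ESCAPED_MARKER = re.compile(r"&lt;(/?)" + re.escape(MARKER) + r"&gt;")


def restore_markers(xml_text: str) -> str:
    """Turn &lt;vasuclause&gt; and &lt;/vasuclause&gt; back into tags."""
    return _ESCAPED_MARKER.sub(lambda m: f"<{m.group(1)}{MARKER}>", xml_text)


class ClauseHandler(xml.sax.ContentHandler):
    """Collect the content of each marker element as a string."""

    def __init__(self):
        super().__init__()
        self.clauses = []
        self._buf = None  # None while outside a marker element

    def startElement(self, name, attrs):
        if name == MARKER:
            self._buf = []
        elif self._buf is not None:
            self._buf.append(f"<{name}>")  # keep formatting tags

    def endElement(self, name):
        if name == MARKER:
            self.clauses.append("".join(self._buf))
            self._buf = None
        elif self._buf is not None:
            self._buf.append(f"</{name}>")

    def characters(self, content):
        if self._buf is not None:
            self._buf.append(content)


escaped = (
    "<doc>&lt;vasuclause&gt;This is the first paragraph with "
    "<b> bold </b>&lt;/vasuclause&gt;</doc>"
)
handler = ClauseHandler()
xml.sax.parseString(restore_markers(escaped).encode("utf-8"), handler)
print(handler.clauses)
```

Because only the exact marker pattern is unescaped, any other "&lt;" that the user typed stays escaped and the result remains well-formed, provided the opening and closing markers are still balanced. The collected strings are where the database insert from the second stage would go.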
The examples available on MSDN talk about using an XSD to define the data definition for the Word document and to define blocks where the user can provide input. The user can then save the file by choosing the "Save Data Only" option. But this does not work for my requirement: doing so saves only the data (text) and loses the formatting. I need to capture both the data and the formatting.
Also, these examples assume that the user provides data in specific input blocks. In my case, the user can add new paragraphs of text. Hence, I have not been able to use the suggested solution. Is there something that I'm missing?
Is there a way to achieve what I wish to accomplish? Will the procedure I have explained above work? Is there a better way?