I have to agree with you, John. But then again, as the original poster
mentioned, it won't help him out a bit in his application since the
users probably won't be saving their documents as WML files. And
unless they're using a version above Word 2000, they won't be saving
WML documents at all!
So, what's the point? Everyone needs to upgrade? <sigh> And what if
they _are_ using Word 2003? How's that going to help Neil out in
getting the data into an Access table?
John Nurick wrote:
I don't quite agree. I get the impression that the docx format will make
parsing of unstructured documents easier, if only by making it easier to
bring a heavy-duty regex engine to bear. That said, "easier" may just
mean the difference between impossible and not-quite-so-impossible<g>.
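
To make the regex idea concrete, here is a minimal Python sketch. It assumes the docx package layout as it was eventually standardized (a zip archive holding word/document.xml, with the 2006 WordprocessingML namespace), which may differ from what was available when this thread was written. The function names and the "Invoice No." pattern are hypothetical, chosen only for illustration.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the standardized docx format (assumption:
# the files being parsed follow that standard).
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

# Hypothetical pattern: whatever marks the data you want to pull out.
INVOICE_RE = re.compile(r"Invoice\s+No\.?\s*(\d+)", re.IGNORECASE)


def paragraphs_from_docx(docx_file):
    """Return the plain text of each paragraph in a docx path or file object."""
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Word splits a paragraph into runs; join the text of every run so the
        # regex sees the paragraph as one string.
        paragraphs.append("".join(t.text or "" for t in para.iter(W_NS + "t")))
    return paragraphs


def find_invoice_numbers(docx_file):
    """Apply the hypothetical pattern to every paragraph and collect matches."""
    return [
        match.group(1)
        for text in paragraphs_from_docx(docx_file)
        for match in INVOICE_RE.finditer(text)
    ]
```

Joining the runs first matters because Word often breaks one visible phrase across several runs, so a regex applied to the raw XML can miss text that looks contiguous on the page.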
On Wed, 29 Jun 2005 19:44:06 +0100, "Richard P" wrote:
I reckon it depends on your Word documents. If they are highly structured
then XML could help. If they are basically unstructured then XML will not
help.
The RSS schema is a useful example. I sometimes create RSS files in Word.
RSS files can be fairly weakly structured if long passages of text are
embedded between <description></description> tags. XML is still useful for
me because the description tag corresponds one-to-one with a column in my
database.
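
As a rough illustration of that one-to-one mapping, the Python sketch below reads each RSS item and writes its description into a single table column. It uses an SQLite database as a stand-in for Access (an assumption made only to keep the example self-contained), and the table name, column names, and function name are hypothetical. Note that RSS 2.0 element names are lowercase.

```python
import sqlite3
import xml.etree.ElementTree as ET


def load_rss_items(rss_text, conn):
    """Insert one row per <item>, mapping <description> to one column."""
    root = ET.fromstring(rss_text)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items (title TEXT, description TEXT)"
    )
    rows = [
        (
            item.findtext("title", default=""),
            # However long or loosely structured the passage is, it still
            # lands in exactly one column.
            item.findtext("description", default=""),
        )
        for item in root.iter("item")
    ]
    conn.executemany(
        "INSERT INTO items (title, description) VALUES (?, ?)", rows
    )
    conn.commit()
    return len(rows)
```

The parameterized INSERT keeps quotes and other awkward characters in the description text from breaking the SQL.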
Assuming your documents pass the structure test, the key thing is whether
you can control the document creation process. If you can get the authors to
create their documents in XML, parsing them is much easier and more robust
than parsing regular text. You can use XML Schema to enforce validity and
well-formedness; you can use types from the System.Xml namespace in the
.NET Framework class library; and you can use XSLT to transform from one
format to another.
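
A small Python sketch of the kind of checking described above. Python's standard library only checks well-formedness, so the required-element test here is a simple stand-in for real XML Schema validation, which would need a schema-aware library. The element names and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fields every authored document is expected to contain.
REQUIRED_FIELDS = ("title", "author", "body")


def check_document(xml_text, required=REQUIRED_FIELDS):
    """Return a list of problems; an empty list means the checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # The parser rejects anything that is not well-formed XML.
        return [f"not well-formed: {exc}"]
    # Stand-in for schema validation: each required child must be present.
    return [
        f"missing <{name}>" for name in required if root.find(name) is None
    ]
```

Rejecting a document at this stage is what makes the later parsing robust: by the time the data reaches the database code, its shape is already known.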
An article at
http://news.com.com/2100-1012-991694.html?tag=fd_top
states:
"XML [in Office 2003] would allow easier interchange of data generated in
Office documents with back-end systems or existing Web services."
As part of an Access 2000 application, I have to continually parse Word
documents and store the results in Access tables, using Automation to
control Word and parse each document. Is there a way that XML would help
with that?