Importing Word and Excel using automation

J

Jo

I need to be able to import Excel and Word documents. I was told that
automation would be the best way of handling that. Can anyone recommend
a good book or any other resources to start me off in this? The last
time I tried this on my own I was lost in the enormous number of
classes, and didn't have enough information.
If there are commercial libraries I could purchase, however, that
might work out too.
 
C

Cindy Meister

Hard to say, without understanding what you mean by "import", or knowing
which programming environment you have to work from?

Generally, there's very little good literature around that has to do with
interoperability (automating the applications from outside the application
environment). But if what you really mean is, you need to better understand
the object model, so that you can work with it, then you might try the
O'Reilly "Writing Word Macros" book, or the Word 2000 Programmer's Reference
from WROX. Those both give a pretty good overview of working with Word's
object model.

-- Cindy
 
J

Jo

Cindy said:
Hard to say, without understanding what you mean by "import", or knowing
which programming environment you have to work from?

I probably should have said that I'm running VS6 and writing a C++
app. I just need to be able to import the data that would be in your
basic Excel worksheet.
Generally, there's very little good literature around that has to do with
interoperability (automating the applications from outside the application
environment). But if what you really mean is, you need to better understand
the object model, so that you can work with it, then you might try the
O'Reilly "Writing Word Macros" book, or the Word 2000 Programmer's Reference
from WROX. Those both give a pretty good overview of working with Word's
object model.

I don't know about understanding the object model, but when I try
importing automation classes, I'm inundated with all these highly
undocumented classes. The MSDN has examples, but they aren't even
compatible with the classes I have here. For instance,
Workbooks::Open() doesn't even have the same number of parameters,
which hardly matters, because neither are documented, so I have over a
dozen function parameters to figure out for myself.
I've found a few books that deal w/OLE automation, but they are
several years out of date. I've been recommended to try using .NET but
so far I haven't been able to see a connection to automation.
 
C

Cindy Meister

Hi Jo

Well, I haven't much knowledge about C++, however...

the documentation for the object models can be found in the VBA Help files.
Start up Excel, Alt+F11 to open the VB Editor, then Help/Content. Or type the
object, property or method in which you're interested in a module window,
then press F1 for context sensitive help. The Help is VBA, but all the
parameters, etc. are listed.

FWIW, if the worksheet content is in a flattable format (as opposed to being
scattered all over the sheet), or is marked by named ranges, you should be
able to retrieve it using DAO or ADO on the closed workbook. That might be
easier for you?

..NET isn't any more suited to automating Office, really, than C++ (in
certain respects less, since it's not COM). Those who propose it probably are
aware that there's more documentation on automating Office than for C++. The
number of parameters for a particular method may vary according to the
version, which is probably why you noticed a discrepancy in the documentation
you were able to find.

-- Cindy
 
J

Jo

Cindy said:
Hi Jo

Well, I haven't much knowledge about C++, however...

the documentation for the object models can be found in the VBA Help files.
Start up Excel, Alt+F11 to open the VB Editor, then Help/Content. Or type the
object, property or method in which you're interested in a module window,
then press F1 for context sensitive help. The Help is VBA, but all the
parameters, etc. are listed.

I don't know VB, but I looked at the method "Open" for "Workbook"
and there seems to be only 1 parameter--definitely not a guide for
something more specific.
FWIW, if the worksheet content is in a flattable format (as opposed to being
scattered all over the sheet), or is marked by named ranges, you should be
able to retrieve it using DAO or ADO on the closed workbook. That might be
easier for you?

Would that involve the same sort of work as creating an ODBC
connection? The MSDN seems to indicate as much. I've been told that
expecting users to be able to create an ODBC source out of Excel is
asking too much.
.NET isn't any more suited to automating Office, really, than C++ (in
certain respects less, since it's not COM). Those who propose it probably are

Then I can't imagine why it was recommended.
aware that there's more documentation on automating Office than for C++. The
number of parameters for a particular method may vary according to the
version, which is probably why you noticed a discrepancy in the documentation
you were able to find.

That's what I thought. What I find so very hard to understand is how
there could be so much code available, with no documentation
whatsoever.
 
C

Cindy Meister

Hi Jo

<<What I find so very hard to understand is how
there could be so much code available, with no documentation
whatsoever.>>

I'm not sure to what you're referring, here? The automation interface for
the Office applications was created primarily for automating the application
from within the application. Secondarily for automating the application from
"outside". The documentation for the first is certainly quite adequate, and
for the second can usually be derived. At least, that's my experience, but
then I come from the Office side of things :)
I don't know VB, but I looked at the method "Open" for "Workbook"
and there seems to be only 1 parameter--definitely not a guide for
something more specific.
OK, I just checked, and I see that for some odd reason, the context
sensitive Help takes you to the Open EVENT (rather than the Open method). In
that case, try accessing the Help (*.chm) file directly. Then you can go into
the Contents, open the list of methods, go to "O" and click on "Open". The
Excel VBA Help file is named something like VBAXL10.chm (depends on the
version).
retrieve [data] using DAO or ADO on the closed workbook. That might be
easier for you?

Would that involve the same sort of work as creating an ODBC
connection? The MSDN seems to indicate as much. I've been told that
expecting users to be able to create an ODBC source out of Excel is
asking too much.
Well, maybe you need to tell us exactly what you're supposed to deliver? Or
what it is that the user needs? As far as I'm concerned, when I make such a
suggestion to a developer, I expect the developer to take care of the
connections and deliver an end result. The impression I got from your
original post is that you need to pull data from Excel workbooks. Excel has a
database interface; you can pull data from it the same way as from any other
database, using DAO (the Jet engine's native language) or OLE DB (we usually
use ADO to access OLE DB, but maybe you have something different for that in
C++). The user doesn't come into the equation, here.

-- Cindy
 
J

Jo

Cindy said:
Hi Jo

<<What I find so very hard to understand is how
there could be so much code available, with no documentation
whatsoever.>>

I'm not sure to what you're referring, here? The automation interface for
the Office applications was created primarily for automating the application
from within the application. Secondarily for automating the application from

Ok, so I've been given bad advice, several times, to use automation.
"outside". The documentation for the first is certainly quite adequate, and
for the second can usually be derived. At least, that's my experience, but
then I come from the Office side of things :)

Or at least a non-C++ side of things. I'm not sure which "side"
Office would be on, that makes it distinct from a programming
environment.
OK, I just checked, and I see that for some odd reason, the context
sensitive Help takes you to the Open EVENT (rather than the Open method). In
that case, try accessing the Help (*.chm) file directly. Then you can go into
the Contents, open the list of methods, go to "O" and click on "Open". The
Excel VBA Help file is named something like VBAXL10.chm (depends on the
version).

Ok, I found "Open method as it applies to Workbooks" which looks
like it has all the parameters I'm concered w/, so if I make another
try at automation, I might get lucky.
retrieve [data] using DAO or ADO on the closed workbook. That might be
easier for you?

Would that involve the same sort of work as creating an ODBC
connection? The MSDN seems to indicate as much. I've been told that
expecting users to be able to create an ODBC source out of Excel is
asking too much.
Well, maybe you need to tell us exactly what you're supposed to deliver? Or
what it is that the user needs? As far as I'm concerned, when I make such a

I'm not sure how to describe it anymore than I have already. I'm
pulling database data from various types of sources, comma-delimited,
XML, ODBC, and Excel. Even though I'm supposed to make ODBC available,
I'm not supposed to force users to use ODBC to import the Excel data.
suggestion to a developer, I expect the developer to take care of the
connections and deliver an end result. The impression I got from your
original post is that you need to pull data from Excel workbooks. Excel has a
database interface; you can pull data from it the same way as from any other
database, using DAO (the Jet engine's native language) or OLE DB (we usually
use ADO to access OLE DB, but maybe you have something different for that in
C++). The user doesn't come into the equation, here.

In my case it did, because I was having to request that the users
select data sources, and work with the user/password issue.
Fortunately, I've found a way to create a user DSN without any input
from the user. I'm not sure why I was forbidden to use ODBC (and I'm
not sure what a system has to have to support this programaticly
created DSN) but if I can persuade the powers that be that this is
feasible, then I'll probably go with that solution.

But I looked into your recommendation about ADO, and the MSDN's
documentation seems to confirm that working with ODBC is required. If
you look to
http://msdn.microsoft.com/library/en-us/ado270/htm/mdmscexampleadowithextensions.asp
(just look for the first occurance of "select" on that page, if you
don't want to wade through the C++) you'll see that a DSN is
referenced. I don't believe this is another way to open a file.
 
C

Cindy Meister

Hi Jo

RE "bad advice...to use Office automation"

I can't judge whether the advice in any particular circumstance was good or
bad. If you need to manipulate the app environment for the user (or to
extract data from a Word doc), then automation is the way to go about it.
What I was referring to, mainly, when talking about the automation interface
is that it wasn't designed primarily with developers in mind. The logic is
"strange" because it derives primarily from the application interface. People
who don't come from within the application tend to think the other way
around: the interface derives from "logical code" :) What I was trying to
say is, that automating Office apps requires a paradigm shift in your
thinking so that you can get your head around it without going crazy.

<<I'm
pulling database data from various types of sources, comma-delimited,
XML, ODBC, and Excel. Even though I'm supposed to make ODBC available,
I'm not supposed to force users to use ODBC to import the Excel data.>>

OK, that's more than I knew before :) I can see from the list that you've
posted a number of times, but this is the only message thread I've actually
READ. (Busy, busy, busy).

<<But I looked into your recommendation about ADO, and the MSDN's
documentation seems to confirm that working with ODBC is required. If
you look to
http://msdn.microsoft.com/library/en-us/ado270/htm/mdmscexampleadowithextensions.asp
(just look for the first occurance of "select" on that page, if you
don't want to wade through the C++)>>

OK, I see it. You might check out the following link and see if any of the
connection string methods listed will work in place of the one in the sample
with "dsn". To the best of my recollection OLE DB via an ADO connection
shouldn't require a dsn, unless it's a "general OLE DB provider for ODBC".
You can use that with Excel, but you should also be able to use the JET OLE
DB provider. Example:

"Provider=Microsoft.Jet.OLEDB.4.0;Data
Source=c:\somepath\mySpreadsheet.xls;"Extended Properties=""Excel
8.0;HDR=Yes"""

http://www.carlprothman.net/Default.aspx?tabid=81

I think there are some caveats about using this type of connection, but that
they mostly apply to writing information back to Excel, and not pulling data
from it. If you run into problems, you should try asking in an EXCEL
newsgroup and/or a data/ado group.

-- Cindy

Jo said:
Cindy said:
Hi Jo

<<What I find so very hard to understand is how
there could be so much code available, with no documentation
whatsoever.>>

I'm not sure to what you're referring, here? The automation interface for
the Office applications was created primarily for automating the application
from within the application. Secondarily for automating the application from

Ok, so I've been given bad advice, several times, to use automation.
"outside". The documentation for the first is certainly quite adequate, and
for the second can usually be derived. At least, that's my experience, but
then I come from the Office side of things :)

Or at least a non-C++ side of things. I'm not sure which "side"
Office would be on, that makes it distinct from a programming
environment.
OK, I just checked, and I see that for some odd reason, the context
sensitive Help takes you to the Open EVENT (rather than the Open method). In
that case, try accessing the Help (*.chm) file directly. Then you can go into
the Contents, open the list of methods, go to "O" and click on "Open". The
Excel VBA Help file is named something like VBAXL10.chm (depends on the
version).

Ok, I found "Open method as it applies to Workbooks" which looks
like it has all the parameters I'm concered w/, so if I make another
try at automation, I might get lucky.
retrieve [data] using DAO or ADO on the closed workbook. That might be
easier for you?

Would that involve the same sort of work as creating an ODBC
connection? The MSDN seems to indicate as much. I've been told that
expecting users to be able to create an ODBC source out of Excel is
asking too much.
Well, maybe you need to tell us exactly what you're supposed to deliver? Or
what it is that the user needs? As far as I'm concerned, when I make such a

I'm not sure how to describe it anymore than I have already. I'm
pulling database data from various types of sources, comma-delimited,
XML, ODBC, and Excel. Even though I'm supposed to make ODBC available,
I'm not supposed to force users to use ODBC to import the Excel data.
suggestion to a developer, I expect the developer to take care of the
connections and deliver an end result. The impression I got from your
original post is that you need to pull data from Excel workbooks. Excel has a
database interface; you can pull data from it the same way as from any other
database, using DAO (the Jet engine's native language) or OLE DB (we usually
use ADO to access OLE DB, but maybe you have something different for that in
C++). The user doesn't come into the equation, here.

In my case it did, because I was having to request that the users
select data sources, and work with the user/password issue.
Fortunately, I've found a way to create a user DSN without any input
from the user. I'm not sure why I was forbidden to use ODBC (and I'm
not sure what a system has to have to support this programaticly
created DSN) but if I can persuade the powers that be that this is
feasible, then I'll probably go with that solution.

But I looked into your recommendation about ADO, and the MSDN's
documentation seems to confirm that working with ODBC is required. If
you look to
http://msdn.microsoft.com/library/en-us/ado270/htm/mdmscexampleadowithextensions.asp
(just look for the first occurance of "select" on that page, if you
don't want to wade through the C++) you'll see that a DSN is
referenced. I don't believe this is another way to open a file.
 
J

Jo

Cindy said:
OK, I see it. You might check out the following link and see if any of the
connection string methods listed will work in place of the one in the sample
with "dsn". To the best of my recollection OLE DB via an ADO connection
shouldn't require a dsn, unless it's a "general OLE DB provider for ODBC".
You can use that with Excel, but you should also be able to use the JET OLE
DB provider. Example:

I think that this is far more complicated than it needs to be.
But it looks like I'm just going to be able to go with an ODBC
solution anyway. I have been able to create a user DSN from w/in my
app, so it looks like this solution will be quite smooth. Thanks anyway
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top