Reading DOCX-files from Access table

P

Peter Karlström

Hi

I'm working on securing/converting an original Office 2003 COM Addin made in
VB 6 to Office 2007.
I'm stuck with a function that used to load a DOC-file from a table column
of ActiveX/OLE-type.
Now I've converted the original DOC-file to a new DOCX-file, and the file
works fine to open and save.
I loaded the file with the AppendChunk DAO-funtion in Access as I have done
earlier with DOC-files and pictures.
Now when I read the data and save it to a file with ADODBs GetChunk
function, it won't open. I get an 5792 error telling me the file is damaged.
The same code worked as a charm in Office 2003 with DOC-files.
What can be wrong?

Thanks in advance
 
J

Jialiang Ge [MSFT]

Hello Peter,

What's the Access version where the VB6 COM Add-in runs for the docx files?

If it is Access 2003, as far as I know, Access 2003 does not support Office
2007 file types in its field object. Docx is likely to be regarded as doc
file based on my tests, and an error may be popped up when users open the
file.

If it is Access 2007, have you tried the new Field object: Field2?
http://msdn.microsoft.com/en-us/library/bb257445.aspx
Access 2007 should support docx files in a good manner.

Please let me know if there's any other questions or concerns.

Regards,
Jialiang Ge ([email protected], remove 'online.')
Microsoft Online Community Support

=================================================
Delighting our customers is our #1 priority. We welcome your comments and
suggestions about how we can improve the support we provide to you. Please
feel free to let my manager know what you think of the level of service
provided. You can send feedback directly to my manager at:
(e-mail address removed).

This posting is provided "AS IS" with no warranties, and confers no rights.
=================================================
 
P

Peter Karlström

Hello Jialiang Ge

Thank you for your quick reply.

I have been trying to fully convert everything, so;
Access 2003 (mdb) has been converted to Access 2007 (accdb)
The Word document have been converted from DOC to DOCX, and the
document opens OK from Windows Explorer.
The code for loading the document into the database is located in a module
in the database itself, and looks like this:
+++++ Start code +++++
Private Sub SaveToDB()
Dim db As Database
Dim rs As Recordset2
Dim bytBLOB() As Byte
Dim strImagePath As String
Dim intNum As Integer
Set db = CurrentDb
Set rs = db.OpenRecordset("SELECT fldReportfile FROM tblSystem")
If rs.RecordCount > 0 Then
strImagePath = "C:\Temp\Report.docx"
With rs
intNum = FreeFile
Open strImagePath For Binary As #intNum
ReDim bytBLOB(FileLen(strImagePath))
Get #intNum, , bytBLOB
Close #1
.Edit
.Fields(0).AppendChunk bytBLOB
.Update
End With
End If
rs.Close
Set db = Nothing
End Sub
+++++ End code +++++
References in the database is to ADODB version 2.8 and Access 12 Object
library along with some others non database references.

The code to save the document to a file from the database in the VB6 COM
Addin looks like this:
+++++ Start code +++++
BlockSize = 50
cm.CommandText = "SELECT fldReportfile FROM tblSystem"
rs.Open cm, , adOpenDynamic, adLockReadOnly
If Not rs.BOF And Not rs.EOF Then
If rs("fldReportfile").ActualSize > 0 Then
file_name = repPath & "\Office Installationreport.docx"
On Error Resume Next
Kill file_name
On Error GoTo CreateReportError
file_num = FreeFile
Open file_name For Binary As #file_num
file_length = rs("fldReportfile").ActualSize
If file_length > 0 Then
num_blocks = file_length / BlockSize
left_over = file_length Mod BlockSize
For block_num = 1 To num_blocks
bytes() = rs("fldReportfile").GetChunk(BlockSize)
Put #file_num, , bytes()
Next block_num
On Error Resume Next
If left_over > 0 Then
bytes() = rs("fldReportfile").GetChunk(left_over)
Put #file_num, , bytes()
End If
Else
MsgBox "Error loading report. Template is missing in the database",
vbSystemModal + vbOKOnly, App.ProductName & " Ver: " & App.Major & "." &
App.Minor & "." & App.Revision
GoTo CreateReportExit
End If
Close #file_num
DoEvents
End If
End If
rs.Close
+++++ End code +++++
Referenses int the project is to ADODB 2.8, DAO 3.6 and to Word 12 Object
Library along with some others.

The Field2 object is not availible in DAO 3.6 or ADODB 2.8, since the
reading from the table is made outside the Access database.

Hope this will get you a complete picture of the situation.

Best Regards

--
Peter Karlström
Midrange AB
Sweden


"Jialiang Ge [MSFT]" said:
Hello Peter,

What's the Access version where the VB6 COM Add-in runs for the docx files?

If it is Access 2003, as far as I know, Access 2003 does not support Office
2007 file types in its field object. Docx is likely to be regarded as doc
file based on my tests, and an error may be popped up when users open the
file.

If it is Access 2007, have you tried the new Field object: Field2?
http://msdn.microsoft.com/en-us/library/bb257445.aspx
Access 2007 should support docx files in a good manner.

Please let me know if there's any other questions or concerns.

Regards,
Jialiang Ge ([email protected], remove 'online.')
Microsoft Online Community Support

=================================================
Delighting our customers is our #1 priority. We welcome your comments and
suggestions about how we can improve the support we provide to you. Please
feel free to let my manager know what you think of the level of service
provided. You can send feedback directly to my manager at:
(e-mail address removed).

This posting is provided "AS IS" with no warranties, and confers no rights.
=================================================
 
J

Jialiang Ge [MSFT]

Hello Peter,

Thank you very much for the codes. I¡¯ve got the issue reproduced on my
side. Here is my test result:

1. Access 2003 + doc environment
I edited your code accordingly to fit the Access 2003 + doc environment. The
generated doc file is identical to the source file, and can be successfully
opened by Word 2003.

2. Access 2007 + docx environment
The generated docx file cannot be opened by Word 2007. It says ¡°The Office
Open XML file cannot be opened because there are problems with the
contents.¡±. Word 2007 provides options to repair the file. When I click on
¡°OK¡± to do the repair, the docx is opened. In order to find out the
difference between the two, I compare them with Windiff.exe. Windiff reports
the only difference lies in the end of the files (shown in binary). Because
docx is, in fact, zip files (see MSDN article ¡°Introducing the Office
(2007) Open XML File Formats¡±
http://msdn.microsoft.com/en-us/library/aa338205.aspx), I rename the output¡¯s
extension to zip, and extract it with winzip, to my surprise, the file can
be extracted successfully, and if I package it and rename it to docx, the
file will be opened! From this point, I get a workaround that is, as you may
say, far from satisfactory: we can programmatically un-package and package
the output docx with some zip libraries to get the file fixed. Peter, please
give me some time, I¡¯m looking into the reason for the subtle difference,
and looking for a better solution for you.

Regards,
Jialiang Ge ([email protected], remove 'online.')
Microsoft Online Community Support

=================================================
Delighting our customers is our #1 priority. We welcome your comments and
suggestions about how we can improve the support we provide to you. Please
feel free to let my manager know what you think of the level of service
provided. You can send feedback directly to my manager at:
(e-mail address removed).

This posting is provided "AS IS" with no warranties, and confers no rights.
=================================================

Peter Karlstr?m said:
Hello Jialiang Ge

Thank you for your quick reply.

I have been trying to fully convert everything, so;
Access 2003 (mdb) has been converted to Access 2007 (accdb)
The Word document have been converted from DOC to DOCX, and the
document opens OK from Windows Explorer.
The code for loading the document into the database is located in a module
in the database itself, and looks like this:
+++++ Start code +++++
Private Sub SaveToDB()
Dim db As Database
Dim rs As Recordset2
Dim bytBLOB() As Byte
Dim strImagePath As String
Dim intNum As Integer
Set db = CurrentDb
Set rs = db.OpenRecordset("SELECT fldReportfile FROM tblSystem")
If rs.RecordCount > 0 Then
strImagePath = "C:\Temp\Report.docx"
With rs
intNum = FreeFile
Open strImagePath For Binary As #intNum
ReDim bytBLOB(FileLen(strImagePath))
Get #intNum, , bytBLOB
Close #1
.Edit
.Fields(0).AppendChunk bytBLOB
.Update
End With
End If
rs.Close
Set db = Nothing
End Sub
+++++ End code +++++
References in the database is to ADODB version 2.8 and Access 12 Object
library along with some others non database references.

The code to save the document to a file from the database in the VB6 COM
Addin looks like this:
+++++ Start code +++++
BlockSize = 50
cm.CommandText = "SELECT fldReportfile FROM tblSystem"
rs.Open cm, , adOpenDynamic, adLockReadOnly
If Not rs.BOF And Not rs.EOF Then
If rs("fldReportfile").ActualSize > 0 Then
file_name = repPath & "\Office Installationreport.docx"
On Error Resume Next
Kill file_name
On Error GoTo CreateReportError
file_num = FreeFile
Open file_name For Binary As #file_num
file_length = rs("fldReportfile").ActualSize
If file_length > 0 Then
num_blocks = file_length / BlockSize
left_over = file_length Mod BlockSize
For block_num = 1 To num_blocks
bytes() = rs("fldReportfile").GetChunk(BlockSize)
Put #file_num, , bytes()
Next block_num
On Error Resume Next
If left_over > 0 Then
bytes() = rs("fldReportfile").GetChunk(left_over)
Put #file_num, , bytes()
End If
Else
MsgBox "Error loading report. Template is missing in the database",
vbSystemModal + vbOKOnly, App.ProductName & " Ver: " & App.Major & "." &
App.Minor & "." & App.Revision
GoTo CreateReportExit
End If
Close #file_num
DoEvents
End If
End If
rs.Close
+++++ End code +++++
Referenses int the project is to ADODB 2.8, DAO 3.6 and to Word 12 Object
Library along with some others.

The Field2 object is not availible in DAO 3.6 or ADODB 2.8, since the
reading from the table is made outside the Access database.

Hope this will get you a complete picture of the situation.

Best Regards

--
Peter Karlstr?m
Midrange AB
Sweden


"Jialiang Ge [MSFT]" said:
Hello Peter,

What's the Access version where the VB6 COM Add-in runs for the docx
files?

If it is Access 2003, as far as I know, Access 2003 does not support
Office
2007 file types in its field object. Docx is likely to be regarded as doc
file based on my tests, and an error may be popped up when users open the
file.

If it is Access 2007, have you tried the new Field object: Field2?
http://msdn.microsoft.com/en-us/library/bb257445.aspx
Access 2007 should support docx files in a good manner.

Please let me know if there's any other questions or concerns.

Regards,
Jialiang Ge ([email protected], remove 'online.')
Microsoft Online Community Support

=================================================
Delighting our customers is our #1 priority. We welcome your comments and
suggestions about how we can improve the support we provide to you.
Please
feel free to let my manager know what you think of the level of service
provided. You can send feedback directly to my manager at:
(e-mail address removed).

This posting is provided "AS IS" with no warranties, and confers no
rights.
=================================================
 
P

Peter Karlström

Hi Jialiang Ge

OK, I'll wait. This is not time-critical.

I have been looking around a bit and found that there are a new column
data-type in Access 2007 called Attachment. I have tried to get this to work
in my VB6 COM Addin but without success. I can't seem to find the proper
database object library to read this data from the table.

Regards
--
Peter Karlström
Midrange AB
Sweden


Jialiang Ge said:
Hello Peter,

Thank you very much for the codes. I¡¯ve got the issue reproduced on my
side. Here is my test result:

1. Access 2003 + doc environment
I edited your code accordingly to fit the Access 2003 + doc environment. The
generated doc file is identical to the source file, and can be successfully
opened by Word 2003.

2. Access 2007 + docx environment
The generated docx file cannot be opened by Word 2007. It says ¡°The Office
Open XML file cannot be opened because there are problems with the
contents.¡±. Word 2007 provides options to repair the file. When I click on
¡°OK¡± to do the repair, the docx is opened. In order to find out the
difference between the two, I compare them with Windiff.exe. Windiff reports
the only difference lies in the end of the files (shown in binary). Because
docx is, in fact, zip files (see MSDN article ¡°Introducing the Office
(2007) Open XML File Formats¡±
http://msdn.microsoft.com/en-us/library/aa338205.aspx), I rename the output¡¯s
extension to zip, and extract it with winzip, to my surprise, the file can
be extracted successfully, and if I package it and rename it to docx, the
file will be opened! From this point, I get a workaround that is, as you may
say, far from satisfactory: we can programmatically un-package and package
the output docx with some zip libraries to get the file fixed. Peter, please
give me some time, I¡¯m looking into the reason for the subtle difference,
and looking for a better solution for you.

Regards,
Jialiang Ge ([email protected], remove 'online.')
Microsoft Online Community Support

=================================================
Delighting our customers is our #1 priority. We welcome your comments and
suggestions about how we can improve the support we provide to you. Please
feel free to let my manager know what you think of the level of service
provided. You can send feedback directly to my manager at:
(e-mail address removed).

This posting is provided "AS IS" with no warranties, and confers no rights.
=================================================

Peter Karlstr?m said:
Hello Jialiang Ge

Thank you for your quick reply.

I have been trying to fully convert everything, so;
Access 2003 (mdb) has been converted to Access 2007 (accdb)
The Word document have been converted from DOC to DOCX, and the
document opens OK from Windows Explorer.
The code for loading the document into the database is located in a module
in the database itself, and looks like this:
+++++ Start code +++++
Private Sub SaveToDB()
Dim db As Database
Dim rs As Recordset2
Dim bytBLOB() As Byte
Dim strImagePath As String
Dim intNum As Integer
Set db = CurrentDb
Set rs = db.OpenRecordset("SELECT fldReportfile FROM tblSystem")
If rs.RecordCount > 0 Then
strImagePath = "C:\Temp\Report.docx"
With rs
intNum = FreeFile
Open strImagePath For Binary As #intNum
ReDim bytBLOB(FileLen(strImagePath))
Get #intNum, , bytBLOB
Close #1
.Edit
.Fields(0).AppendChunk bytBLOB
.Update
End With
End If
rs.Close
Set db = Nothing
End Sub
+++++ End code +++++
References in the database is to ADODB version 2.8 and Access 12 Object
library along with some others non database references.

The code to save the document to a file from the database in the VB6 COM
Addin looks like this:
+++++ Start code +++++
BlockSize = 50
cm.CommandText = "SELECT fldReportfile FROM tblSystem"
rs.Open cm, , adOpenDynamic, adLockReadOnly
If Not rs.BOF And Not rs.EOF Then
If rs("fldReportfile").ActualSize > 0 Then
file_name = repPath & "\Office Installationreport.docx"
On Error Resume Next
Kill file_name
On Error GoTo CreateReportError
file_num = FreeFile
Open file_name For Binary As #file_num
file_length = rs("fldReportfile").ActualSize
If file_length > 0 Then
num_blocks = file_length / BlockSize
left_over = file_length Mod BlockSize
For block_num = 1 To num_blocks
bytes() = rs("fldReportfile").GetChunk(BlockSize)
Put #file_num, , bytes()
Next block_num
On Error Resume Next
If left_over > 0 Then
bytes() = rs("fldReportfile").GetChunk(left_over)
Put #file_num, , bytes()
End If
Else
MsgBox "Error loading report. Template is missing in the database",
vbSystemModal + vbOKOnly, App.ProductName & " Ver: " & App.Major & "." &
App.Minor & "." & App.Revision
GoTo CreateReportExit
End If
Close #file_num
DoEvents
End If
End If
rs.Close
+++++ End code +++++
Referenses int the project is to ADODB 2.8, DAO 3.6 and to Word 12 Object
Library along with some others.

The Field2 object is not availible in DAO 3.6 or ADODB 2.8, since the
reading from the table is made outside the Access database.

Hope this will get you a complete picture of the situation.

Best Regards

--
Peter Karlstr?m
Midrange AB
Sweden


"Jialiang Ge [MSFT]" said:
Hello Peter,

What's the Access version where the VB6 COM Add-in runs for the docx
files?

If it is Access 2003, as far as I know, Access 2003 does not support
Office
2007 file types in its field object. Docx is likely to be regarded as doc
file based on my tests, and an error may be popped up when users open the
file.

If it is Access 2007, have you tried the new Field object: Field2?
http://msdn.microsoft.com/en-us/library/bb257445.aspx
Access 2007 should support docx files in a good manner.

Please let me know if there's any other questions or concerns.

Regards,
Jialiang Ge ([email protected], remove 'online.')
Microsoft Online Community Support

=================================================
Delighting our customers is our #1 priority. We welcome your comments and
suggestions about how we can improve the support we provide to you.
Please
feel free to let my manager know what you think of the level of service
provided. You can send feedback directly to my manager at:
(e-mail address removed).

This posting is provided "AS IS" with no warranties, and confers no
rights.
=================================================
 
J

Jialiang Ge [MSFT]

Hello Peter,

I think I'm back with some good news for this issue. :). The problem lies
in the macro code:

ReDim bytBLOB(FileLen(strImagePath))

Changing it to

ReDim bytBLOB(FileLen(strImagePath)-1)

can fix it.

In the grammar of vb6 or vba, ReDim abArray(n) defines an array sized n+1.
To allocate an array with n elements, we need to use ReDim abArray(n-1)

You may wonder how I found out the problem. I used your code to import &
export a docx file. The resulting docx cannot be opened, thus I used a
binary editor to compare the two docx. To my surprise, the resulting one
always has an extra byte (00) in the end. If I remove the byte, it opens
fine. That said, if we can determine where the extra byte is getting added
(import or export), we should be set. I started with the import procedure.
The source docx for test contained 1069bytes (it's deteced from my binary
editor). I saw the line "FileLen(strImagePath)" return 1069 as expected,
however, after the Get sentence, I detected that bytBLOB contained 1070
elements. Then I know where the problem is.

Regards,
Jialiang Ge ([email protected], remove 'online.')
Microsoft Online Community Support

=================================================
Delighting our customers is our #1 priority. We welcome your comments and
suggestions about how we can improve the support we provide to you. Please
feel free to let my manager know what you think of the level of service
provided. You can send feedback directly to my manager at:
(e-mail address removed).

This posting is provided "AS IS" with no warranties, and confers no rights.
=================================================
 
P

Peter Karlström

Hi Jialiang Ge

Thank you VERY much. This works very well.
Funny thing though that Access 2003 handles this "error" incorrectly.

Again, thanks

Best Regards
 
J

Jialiang Ge [MSFT]

Hello Peter,

It is not Access 2003 that handles the "error" incorrectly. Access does the
right thing according to our vb/vba code to import&export the binary. It is
Office Word that overlooks the extra byte when it is a doc file, thus no
error was popped up in the past. But for the new docx file format, it's
mostly likely that the mismatch is seen between the purported file comment
length and the actual remaining bytes in the file.

Regards,
Jialiang Ge ([email protected], remove 'online.')
Microsoft Online Community Support

=================================================
Delighting our customers is our #1 priority. We welcome your comments and
suggestions about how we can improve the support we provide to you. Please
feel free to let my manager know what you think of the level of service
provided. You can send feedback directly to my manager at:
(e-mail address removed).

This posting is provided "AS IS" with no warranties, and confers no rights.
=================================================
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top