Handling foreign character sets in VBA - Grrrr

r4qm4n

Hi All,

This relates to an earlier question I posted about copying files with
foreign character sets. I worked round the filecopy issue using Robocopy to
handle the recursive copy (and thus the characters)

Now I need to do other operations on files and folders with foreign character
sets, but as soon as the file name is passed to VB, it changes the
characters.

For example, the filename 'języka.txt' would become 'jezyka.txt', by which
time it's useless for me to pass on to another process because I've lost the
real filename.

Can anyone advise how this is handled in VB/VBA ?

thanks
 
Tony Jollans

I can imagine this might be a problem but can't quickly reproduce anything
like it - can you post the code that's causing it.
 
Karl E. Peterson

r4qm4n said:
Now I need to do other operations on files and folders with foreign
character sets, but as soon as the file name is passed to VB, it
changes the characters.

By what means is the filename "passed to VB"? Is there some way you can
provide us with steps to reproduce this? Maybe just by answering that first
question, and giving the character codes that represent the actual filename.
 
r4qm4n

Well specifically, I have a routine that recurses through the folders, and
(as well as copying) am passing on the filenames to a procedure to sort out
file ownership and permissions.

Slight risk of going off-topic here, but this is what I'm doing:

The source file permissions are in a mess, I need to recurse the folders and
fix them. Unfortunately the permissions are in such a mess that cacls.exe,
xcacls.exe, xcacls.vbs and fileacl.exe all seem unable to take ownership,
recurse through the folders and force the permissions and ownership I want.
The only way I've managed to achieve this is to recurse in VBA and pass off
the individual files/folders to fileacl.exe......

......and this works *great*....until I come across filenames with (what look
like) Polish characters, where by the time I pass the string onto a shell
statement the accented characters have had their accents removed, so the
filename no longer matches the original.

I can manually run fileacl.exe on the original file and it works fine, the
problem is when the filename is passed through a string in VBA.

I'll try to dig out some code and some example filenames, if I can find a
way to post them without the filenames being kludged in the same way.

thanks for the prompt response BTW.

:)
 
Karl E. Peterson

All well and good, I'm sure, but you really didn't help me/us understand
what you originally meant by "passed to VB". If we can't get to that basic
fact, there's very little we can offer to help. I mean, really, we need to
get past "doesn't work", eh? ;-)

Here's a more specific question... What's *this* mean?
"...is to recurse in VBA and pass off..."

"Show me the code!" Or, alternatively, try the code at
http://vb.mvps.org/samples/DirDrill and see if it gives you the same munged
results.
 
r4qm4n

Some code below as requested by Karl to demonstrate the problem.

I can't post the problem filenames here, cos they'll get munged in the same
way, so you need to grab some Polish text
(http://www.notam02.no/~hcholm/altlang/ht/Polish.html for example), paste it
into a filename and save the file in c:\test.

Press the button and see how the special characters have been changed.

Hope that makes sense, let me know if you need any more info.

cheers

====================================
Private Sub CommandButton1_Click()
FixPerms "c:\test"
End Sub
====================================
Private Sub FixPerms(ByVal pstrSourcePath As String)

    Dim strDirResult As String
    Dim colSubdirectories As New Collection
    Dim varSubdirectoryName As Variant

    'Verify that the path parameter is not empty
    If Len(pstrSourcePath) = 0 Then
        Exit Sub
    End If

    'Add terminating "\" if none exists
    If Right$(pstrSourcePath, 1) <> "\" Then
        pstrSourcePath = pstrSourcePath & "\"
    End If

    On Error Resume Next
    'Verify that the parameter is a valid path
    strDirResult = Dir(pstrSourcePath & "*.*")
    If Err Then
        MsgBox Err
    End If

    'Process all files/folders
    strDirResult = Dir(pstrSourcePath & "*.*", vbNormal Or vbReadOnly Or _
        vbHidden Or vbSystem Or vbDirectory Or vbArchive)

    Do Until strDirResult = ""
        If strDirResult = "." Or strDirResult = ".." Then
            'Skip the current and parent directory entries
        'If strDirResult is a directory
        ElseIf (GetAttr(pstrSourcePath & strDirResult) And vbDirectory) = _
            vbDirectory Then
            MsgBox pstrSourcePath & strDirResult
            colSubdirectories.Add strDirResult
        'Else strDirResult is a file
        Else
            MsgBox pstrSourcePath & strDirResult
        End If
        DoEvents
        strDirResult = Dir
    Loop

    'Recursively process subfolders (after the Dir loop has finished)
    For Each varSubdirectoryName In colSubdirectories
        'Here's the cool recursive call!
        Call FixPerms(pstrSourcePath & CStr(varSubdirectoryName))
    Next

End Sub

====================================
 
r4qm4n

Thanks Karl, just tried that DirDrill app, but that doesn't work either...
Error 53 (cos the filenames are wrong)

Having migrated home and profile folders for ~15,000 user accounts, I've
just got 200 or so folders left with these awkward filenames, and I really
don't want to do them manually.

Any ideas ?

thanks
 
Karl E. Peterson

r4qm4n said:
Some code below as requested by Karl to demonstrate the problem.

Yeah, but it doesn't really. To repeat, here's what I asked...

What you've shown is a filename typed into the VB IDE. Presumably, that's
not how you're attempting to do it, right? Maybe using a Dir() loop,
instead?

That said, your reference to the page with Polish characters triggered
enough of a hint to see what's probably going on here. The filenames are
composed with Unicode characters, it seems. For example, if I name a file
using the Polish alphabet (aąbcćdeęfghijklłmnńóprsśtuwyzźż.txt), I get the
following AscW character codes:

97
261
98
99
263
100
101
281
102
103
104
105
106
107
108
322
109
110
324
243
112
114
115
347
116
117
119
121
122
378
380
46
116
120
116
0

But I only get those if I use Unicode API calls. While VB strings are
Unicode internally, they seem constitutionally unable to deal with getting
those in or out. IOW, all I can really do with them is to look at their
coded values, or pass them to other APIs that expect Unicode.

If it helps, here's the code I used to get those:

Private Declare Function FindFirstFileW Lib "kernel32" _
    (ByVal lpFileName As Long, lpFindFileData As Any) As Long
Private Declare Function FindNextFileW Lib "kernel32" _
    (ByVal hFindFile As Long, lpFindFileData As Any) As Long
Private Declare Function FindClose Lib "kernel32" _
    (ByVal hFindFile As Long) As Long

Private Const MAX_PATH = 260
Private Const INVALID_HANDLE_VALUE = -1
Private Const ERROR_NO_MORE_FILES = 18&

Private Type FILETIME
    dwLowDateTime As Long
    dwHighDateTime As Long
End Type

Private Type WIN32_FIND_DATA_W
    dwFileAttributes As Long
    ftCreationTime As FILETIME
    ftLastAccessTime As FILETIME
    ftLastWriteTime As FILETIME
    nFileSizeHigh As Long
    nFileSizeLow As Long
    dwReserved0 As Long
    dwReserved1 As Long
    ' Both name fields hold 2-byte characters in the W version
    cFileName(1 To MAX_PATH * 2) As Byte
    cAlternate(1 To 14 * 2) As Byte
End Type

Public Sub DumpFilesW(ByVal Path As String)
    Dim hSearch As Long
    Dim wfd As WIN32_FIND_DATA_W
    Dim nRet As Long

    hSearch = FindFirstFileW(StrPtr(Path), wfd)
    If hSearch <> INVALID_HANDLE_VALUE Then
        Do
            Debug.Print TrimNull(wfd.cFileName)
            nRet = FindNextFileW(hSearch, wfd)
        Loop While nRet
        Call FindClose(hSearch)
    End If
End Sub

Private Function TrimNull(ByVal Whatever As String) As String
    Dim nPos As Long
    nPos = InStr(Whatever, vbNullChar)
    Select Case nPos
        Case 0
            TrimNull = Whatever
        Case 1
            TrimNull = ""
        Case Else
            ' Keep everything before the first null character
            TrimNull = Left$(Whatever, nPos - 1)
    End Select
End Function
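
The routine above prints the names themselves rather than their character codes. As a minimal sketch of how a code listing like the one earlier in this post could be produced, the helper below prints the AscW value of each character in a string. The name DumpCharCodes is hypothetical and does not come from the thread.

```vba
' Hypothetical helper: print the UTF-16 code of each character in Text.
Public Sub DumpCharCodes(ByVal Text As String)
    Dim i As Long
    For i = 1 To Len(Text)
        ' AscW returns a signed Integer; masking with &HFFFF& gives 0..65535
        Debug.Print AscW(Mid$(Text, i, 1)) And &HFFFF&
    Next i
End Sub
```

It could be called as DumpCharCodes TrimNull(wfd.cFileName) inside the Do loop of DumpFilesW.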

r4qm4n said:
Having migrated home and profile folders for ~15,000 user accounts,
I've just got 200 or so folders left with these awkward filenames,
and I really don't want to do them manually.

You'll probably find it easier to just dig in and do it manually, if that's
"all" you need to do. If you're adventurous, and want to explore the world
of Unicode, that would seem to be the ultimate answer.

Good luck... Karl
 
r4qm4n

Hi Karl,

Many thanks for your reply, and I can see that this unicode problem is more
trouble than it's worth, I'll have to sort it manually (this time).

However, I really don't understand how you can say that the code I supplied
'doesn't really' demonstrate the problem. The code is lifted directly from
my app, and clearly demonstrates how any filenames with 'unicode' characters
in them that sit in c:\test get processed and displayed in a messagebox
(with the special characters modified)....what more could you possibly want
?

-->By what means is the filename "passed to VB"?

Well the procedure clearly places the filenames that reside in c:\test into
the strDirResult string variable.

--> What you've shown is a filename typed into the VB IDE.

err..No I haven't, you haven't read the whole post.

--> Presumably, that's not how you're attempting to do it, right?

You're right it's not......you've not read my post properly.

Thanks anyway, I'm grateful for the info re working with unicode, I'll get
my head round that as part of a longer term project.

cheers
 
Karl E. Peterson

r4qm4n said:
Many thanks for your reply, and I can see that this unicode problem is
more trouble than it's worth, I'll have to sort it manually (this
time).

Probably. If this were an ongoing issue, it'd be worth it, but...

r4qm4n said:
However, I really don't understand how you can say that the code I
supplied 'doesn't really' demonstrate the problem. The code is lifted
directly from my app, and clearly demonstrates how any filenames with
'unicode' characters in them that sit in c:\test get processed and
displayed in a messagebox (with the special characters
modified)....what more could you possibly want ?

Jeez, I'm sorry. I did look way too quick. To me, it seemed you were using
"c:\test" as the filename, not the path, and, well, my mistake. Luckily(?),
I did gauge your intent appropriately, I think. Weird. Subliminal, maybe?

r4qm4n said:
-->By what means is the filename "passed to VB"?

Well the procedure clearly places the filenames that reside in
c:\test into the strDirResult string variable.

--> What you've shown is a filename typed into the VB IDE.

err..No I haven't, you haven't read the whole post.

Yep, mea culpa.

r4qm4n said:
--> Presumably, that's not how you're attempting to do it, right?

You're right it's not......you've not read my post properly.

Thanks anyway, I'm grateful for the info re working with unicode,
I'll get my head round that as part of a longer term project.

Good luck with it!
 
Tony Jollans

Maybe I've missed something obvious but I can't see what you're doing with
the filenames except displaying them.

Dir returns an ANSI string. MsgBox only processes ANSI strings (it calls
MessageBoxA).
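
One way to spot names that will not survive that ANSI conversion is to round-trip them through the system code page with StrConv and compare the result with the original. This is only a sketch: the function name SurvivesAnsi is made up for illustration, and it assumes the name came from a Unicode-aware source in the first place (a name returned by Dir has already been converted).

```vba
' Hypothetical helper: True if Text is unchanged after a
' Unicode -> ANSI -> Unicode round trip through the system code page.
Public Function SurvivesAnsi(ByVal Text As String) As Boolean
    Dim roundTrip As String
    roundTrip = StrConv(StrConv(Text, vbFromUnicode), vbUnicode)
    ' Binary comparison, so Option Compare Text cannot hide a difference
    SurvivesAnsi = (StrComp(Text, roundTrip, vbBinaryCompare) = 0)
End Function
```

A plain ASCII name should pass. A name such as "j" & ChrW$(281) & "zyka.txt" would be expected to fail on a system whose ANSI code page has no such character, though it may pass on a machine set to a Central European code page.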

You can use Application.FileSearch to get the unicode filenames. You can use
an API to use MessageBoxW to display the names.

Private Declare Function APIMsgBox _
    Lib "User32" Alias "MessageBoxW" _
    (Optional ByVal hWnd As Long, _
    Optional ByVal Prompt As Long, _
    Optional ByVal Title As String, _
    Optional ByVal Buttons As Long) _
    As Long

Sub UnicodeMessage()
    With Application.FileSearch
        .NewSearch
        .LookIn = "c:\test"
        .SearchSubFolders = True
        .FileName = "*"
        .MatchTextExactly = False
        .FileType = msoFileTypeAllFiles
        .Execute
        For Each f In .FoundFiles
            APIMsgBox Prompt:=StrPtr(f)
        Next
    End With
End Sub
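
For the goal stated earlier in the thread (recursing the tree and handing each name to an external tool), a Unicode-aware walk might look like the sketch below. It is built on assumptions, not on anything confirmed in the thread: it assumes Scripting.FileSystemObject and WScript.Shell are available and that both pass strings through without an ANSI conversion. The routine names and the ToolCommand parameter are hypothetical. The thread never shows the fileacl.exe arguments, so none are filled in; the caller supplies the command prefix, and the path is simply appended in quotes, which may not match the tool's real syntax.

```vba
' Sketch only, late bound so no project references are needed.
' ToolCommand is a caller-supplied command prefix (hypothetical parameter).
Public Sub WalkTreeUnicode(ByVal RootPath As String, ByVal ToolCommand As String)
    Dim fso As Object
    Dim sh As Object
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set sh = CreateObject("WScript.Shell")
    WalkFolder fso.GetFolder(RootPath), sh, ToolCommand
End Sub

Private Sub WalkFolder(ByVal Folder As Object, ByVal sh As Object, _
                       ByVal ToolCommand As String)
    Dim Item As Object

    ' Files in this folder
    For Each Item In Folder.Files
        RunTool sh, ToolCommand, Item.Path
    Next Item

    ' Subfolders: process the folder itself, then recurse into it
    For Each Item In Folder.SubFolders
        RunTool sh, ToolCommand, Item.Path
        WalkFolder Item, sh, ToolCommand
    Next Item
End Sub

Private Sub RunTool(ByVal sh As Object, ByVal ToolCommand As String, _
                    ByVal FullPath As String)
    ' Append the quoted path; 0 = hidden window, True = wait for completion
    sh.Run ToolCommand & " """ & FullPath & """", 0, True
End Sub
```

Trying this on a copy of c:\test first would show whether the names reach the tool intact on a given machine.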

What else do you want to do?

--
Enjoy,
Tony


 
