Help with VBA for document formatting.

M

macropod

Hi Joe,

When doing the Find/Replace operation (in Word), did you have the 'use wildcards' option checked?

--
Cheers
macropod
[Microsoft MVP - Word]


HiJRC,

If you're unable to get the macros working, you could use a series of Find/Replace operations in Word. With the 'use wildcards'
option checked, you could have the:
1. first set of Find/Replace parameters as
Find (^13)([!.]{1,}.[0-9]{3})
Replace \1^t\2
2. second set of Find/Replace parameters as
Find (^13)([!.]{1,}.[0-9]{3}.[0-9]{3})
Replace \1^t\2
3. third set of Find/Replace parameters as
Find (^13)([!.]{1,}.[0-9]{3}.[0-9]{3}.[0-9]{3})
Replace \1^t\2
and so on, simply increasing the number of '.[0-9]{3}' Find variables until all are done. The data you posted suggest only 5
levels,
but you can go to 6 levels with this approach.

Hi, macropod.

This is the kind of simple solution I am looking for. Between what you
suggest here and John's subroutines I feel confident I may finally
(after weeks of frustrating failed attempts) resolve this issue.

I tried running the Find and Replace command with the text just as you
suggested above. The first time I did Word just froze and I had to do
a "Force Quit". The subsequent times (I tried a few more times) it
simply didn't work and gave a message that it couldn't find what I was
looking for.

Let me more specific:

When I push the "Find Next" button this is the message I get:

Word has finished searching the document. The search item was not
found.

Could it be something I am doing wrong ?

Thanks again for your help and really helpful suggestions.

Joe
 
R

Rob Schneider

You are doing too much research! :)

The version of Python that is provided as standard in OS X (2.61 on my
box) is just fine. It is not a "simple" version as there is no such
thing. You've been mislead by internet pages. Sorry. Don't install
anything else. Use what's there.

I know that you are not familiar with the Python code I gave you, but
those few lines of code do what you said you need done. It's fit for
purpose and does the job. It took about 5 minutes to write the code and
running on your 50,000 line text file (my recollection) should take just
a few more minutes.

While I understand the need to stay with familiar tools thinking that is
the best course, I've learned in my old age that it's often more
important and valuable to learn when to reject the familiar for the
tools which are more fit for purpose. "When all one has is a hammer,
then the hammer is appropriate for every job."

--rms

www.rmschneider.com
 
J

JRC

Hi JRC:

Works here, in Excel 2004.  You ARE aware that this is Excel code, right?
No ³cells² in Word, this code won¹t run in Word.

It¹s not my code, but there¹s nothing wrong with it.

Did you notice the space, underscore between the two lines?  That's a
continuation character.  Delete the line-breaks until it joins up with the
line above ending in an ampersand, and it will work fine.  Here¹s thecode
pasted out of my copy of Excel after successfully running it.

As soon as you begin programming, you have to set your mind to
character-for-character accuracy and attention to detail :)

 Public Sub TabIndentText()
    Dim sTabs As String
    Dim sTemp As String
    Dim rCell As Range
    Dim nCount As Long
    Dim nSemi As Long
    sTabs = String(10, vbTab)
    For Each rCell In Selection
       With rCell
         nCount = Len(.Text) - _
                        Len(Application.Substitute(.Text, ".", ""))
         nSemi = InStr(1, .Text, ";")
         .Value = Left(sTabs, nCount) & Trim(Left(.Text, nSemi - 1)) & _
                " [" & Trim(Mid(.Text, nSemi + 1)) & "]"
       End With
    Next rCell
 End Sub

Hello, John.
Thank you again for your help and detailed explanation.
I've tried everything just as you suggested and keep getting this
error message when I try to run VBA. I've tried both copying and
pasting into TextEdit and then Excel as you described in a previous
post and also tried typing the entire subroutine myself. In the end
the result is the same.
This is the subroutine you posted on the Excel forum:
One way (with tabs)
 Public Sub TabIndentText()
    Dim sTabs As String
    Dim sTemp As String
    Dim rCell As Range
    Dim nCount As Long
    Dim nSemi As Long
    sTabs = String(10, vbTab)
    For Each rCell In Selection
       With rCell
         nCount = Len(.Text) - _
                        Len(Application.Substitute(.Text, ".", ""))
         nSemi = InStr(1, .Text, ";")
         .Value = Left(sTabs, nCount) & Trim(Left(.Text, nSemi - 1)) &
_
                " [" & Trim(Mid(.Text, nSemi + 1)) & "]"
       End With
    Next rCell
 End Sub
This is the line where I keep getting the error message as it is now
in VBA:
.Value = Left(sTabs, nCount) & Trim(Left(.Text, nSemi - 1)) & "[" &
Trim(Mid(.Text, nSemi + 1)) & "]"
I've also tried typing the entire subroutine with and without indents
but it doesn't seem to make a difference.
I remember in one of your posts that you mentioned the "Split" command
not being a part of VBA in Office Mac 2004. Could the same be the case
for any of the commands in this line ?
This is really a very important project I need to finish and I didn't
anticipate it taking this long and being this complicated. Any help
you can offer in addition to all that you already have will be very
appreciated.
Thank you,

 --

The email below is my business email -- Please do not email me about forum
matters unless I ask you to; or unless you intend to pay!

John McGhie, Microsoft MVP (Word, Mac Word), Consultant Technical Writer,
McGhie Information Engineering Pty Ltd
Sydney, Australia. | Ph: +61 (0)4 1209 1410
+61 4 1209 1410, mailto:[email protected]

John:

It still won't work. I believe I have done everything right so I am
puzzled as to why this is happening.

Is there any way I can send you a small image of a partial screen
capture so that you can see exactly what I am looking at ?

Please let me know.

Thank you,

Joe
 
J

John McGhie

Hi Joe:

Email me the Excel Workbook you are working on (address in the .sig).

The code runs fine here: I am wondering whether it is "running" fine there
but that you have not selected the correct cells before running it, or if
the data does not follow the format the code expects, or that you may have
put the macro into the wrong workbook.

Cheers


John:

It still won't work. I believe I have done everything right so I am
puzzled as to why this is happening.

Is there any way I can send you a small image of a partial screen
capture so that you can see exactly what I am looking at ?

Please let me know.

Thank you,

Joe

--

The email below is my business email -- Please do not email me about forum
matters unless I ask you to; or unless you intend to pay!

John McGhie, Microsoft MVP (Word, Mac Word), Consultant Technical Writer,
McGhie Information Engineering Pty Ltd
Sydney, Australia. | Ph: +61 (0)4 1209 1410
+61 4 1209 1410, mailto:[email protected]
 
J

JRC

Hi Joe:

Email me the Excel Workbook you are working on (address in the .sig).

The code runs fine here: I am wondering whether it is "running" fine there
but that you have not selected the correct cells before running it, or if
the data does not follow the format the code expects, or that you may have
put the macro into the wrong workbook.

Cheers








 --

The email below is my business email -- Please do not email me about forum
matters unless I ask you to; or unless you intend to pay!

John McGhie, Microsoft MVP (Word, Mac Word), Consultant Technical Writer,
McGhie Information Engineering Pty Ltd
Sydney, Australia. | Ph: +61 (0)4 1209 1410
+61 4 1209 1410, mailto:[email protected]

Hi, John.

Thank you very much for your kind offer.

I've e-mailed you the excel worksheet I have been working on along
with two small screen captures showing the error messages I have been
getting when trying to run the code.

It is probably something I haven't done right but I just can't figure
it out.

Thanks again for your help.

Joe
 
J

JRC

Oh.  My fault, I guess. Apologies to all. I prefer 5 minute solutions to
working on things for days on end for no good reason.

The full instructions for using Python already written and easily
available.  I do not have the energy or time to re-write and re-publish
  especially on an un-billed basis.

Please put a link on your web site to:http://www.python.org/download/mac/

My understanding is that Python installed by standard as part of OS X. I
have no recollection of having to install it.  It's just "there."

(to run the script I gave them:

1. Save the script to a file named "processtext.py" (without the
quotation marks) with a text editor.
2. with the text editor change the name of the input/output file
3. go to the Mac OSX "terminal" program type command "python
processtext.py" (without quotation marks)

The program will process the input file as prescribed, making the output
file. Done.

--rms

www.rmschneider.com



John said:
Because you didn't tell him how to install Python, initialise the
environment and how to use the script.
If you want to take our users out of the GUI, you have to start from the GUI
and tell them how to bring up Terminal and login, step-by-step.
And then they're not going to do it :)  We have to use detailed
instructions for getting VBA macros to run.
If you have similarly-detailed step-by-step instructions for Python,
starting with a running GUI and the install DVDs for OS 10.6.0, send them
along: I am the webmaster, address is in the .sig :)

On 12/12/09 5:09 PM, in article (e-mail address removed), "Rob
JRC,
I'm curious why you didn't try the little Python program I gave you?
Saving those few lines into a file as "convert.py" or something, editing
the input/output file names, and you'd be done in minutes. And you'd
have something which will allow the automatation (for future file
conversions) which you wanted.  Just curious.
Python is provided in Mac OS X ... they this because it's useful for
many things, including exactly the sort of think you want to do.
--rms
www.rmschneider.com
JRCwrote:
When you paste from a browser, it converts "spaces" to non-breaking spaces.
That is correct for HTML, but it is fatal in the compiler, where a
non-breaking space is a literal character, not a "blank"
Instead, paste the code into TextEdit.
In TextEdit, use the menu to "Make plain text".
Then copy in TextEdit and paste THAT into the VBA Editor.
Then it will compile and run :)
Cheers
On 10/12/09 7:22 PM, in article
(e-mail address removed), "JRC"
HiJRC,
I already answered this some days ago - in the Excel context in
microsoft.public.mac.office.excel, but you've given no feedback
there to the solution I posted.
--
Cheers
macropod
[Microsoft MVP - Word]
messagecom
...
Dear all:
I am currently working on a rather large document that I need to
format a certain specific way. The document includes nearly 57,000
lines of plain text that need to be indented according to a simple
rule so that it can then be imported into an application as a tab-
delimited file. A member from another forum suggested this might be
more easily done with Word instead of Excel so I decided to post this
questions here.
My system is a Mac G5 running the latest version of Leopard (not Snow
Leopard) which is 10.5.8 and I have Microsoft Office 2004 installed
(haven't upgraded to 2008 and will probably wait until next version to
upgrade).
Below is an example of the type of file I now have:
Body Regions;A01
Abdomen;A01.047
Abdominal Cavity;A01.047.025
Peritoneum;A01.047.025.600
Douglas' Pouch;A01.047.025.600.225
Mesentery;A01.047.025.600.451
Mesocolon;A01.047.025.600.451.535
Omentum;A01.047.025.600.573
Peritoneal Cavity;A01.047.025.600.678
Peritoneal Stomata;A01.047.025.600.700
Retroperitoneal Space;A01.047.025.750
Abdominal Wall;A01.047.050
Groin;A01.047.365
Inguinal Canal;A01.047.412
Umbilicus;A01.047.849
Back;A01.176
Lumbosacral Region;A01.176.519
Sacrococcygeal Region;A01.176.780
Breast;A01.236
Mammary Glands, Human;A01.236.249
This is just a small portion of the file I am working with. This is a
"flat" text file but what I need is a "hierarchical" or "tree
structure" file where each of these lines is indented with a number of
"tab keystrokes" corresponding to its level as depicted by the
alphanumeric code following the term. Example:
Body Regions;A01
    Abdomen;A01.047
         Abdominal Cavity;A01.047.025
              Peritoneum;A01.047.025.600
                   Douglas' Pouch;A01.047.025..600..225
                   Mesentery;A01.047.025.600.451
                        Mesocolon;A01.047..025..600.451.535
                   Omentum;A01.047.025.600.573
                        Peritoneal Cavity;A01..047.025.600.678
                   Peritoneal Stomata;A01.047..025..600.700
              Retroperitoneal Space;A01.047.025.750
         Abdominal Wall;A01.047.050
         Groin;A01.047.365
         Inguinal Canal;A01.047.412
         Umbilicus;A01.047.849
    Back;A01.176
         Lumbosacral Region;A01.176.519
         Sacrococcygeal Region;A01.176.780
    Breast;A01.236
         Mammary Glands, Human;A01.236.249
As you will notice each line has been indented using the tab key a
number of times corresponding to the size of the alphanumeric code.
Lines with an alphanumeric code containing only 3 characters stayon
the left most position (no tabs). Example:
Body Regions;A01
Lines with an alphanumeric code containing 7 characters (example:
A01.047) are indented with one single tab keystroke. Example:
    Abdomen;A01.047
Lines with an alphanumeric code containing 11 characters (example:
A01.047.025) are indented with two tab keystrokes. Example:
         Abdominal Cavity;A01.047.025
Lines with an alphanumeric code containing 15 characters (example:
A01.047.025.600) are indented with three tab keystrokes. Example:
              Peritoneum;A01.047.025.600
And so on, and so forth.
Once this step is finished and the file has been edited from a flat
text to one with a hierarchical or tree structure using tab entries
(to derive a tab-delimited file) the second part of the project
involves the removal of certain characters and replacement with
others. Basically I need the format to go from the existing:
Abdominal Cavity;A01.047.025
Where the term is followed by a semicolon and alphanumeric code.
To the following:
Abdominal Cavity [A01.047.025]
Where a space is added after the term, the semicolon is removed and
brackets are placed at the beginning and end of the alphanumeric
code.
Considering the size of my document as I stated (little less than
57,000 entries) I would like to find a way to automate the process.
How can these two processes be accomplished with the use of a macro ?
I have never worked with macros before and any help would be extremely
valuable and appreciated. I have both Excel 04 and Word 04 as part of
the Office 2004 package in case the information is relevant. I amnot
sure which of the applications would be better suited for this task
but am open to your suggestions.
Also I would appreciate suggestions on textbooks that would coverVBA
for Excel and Word. I would like to find books that would cover
everything  from A to Z but with an approach that makes it easyfor a
beginner to understand and work with VBA on Microsoft Office for the
Mac. Any recommendations will be appreciated.
Thank you in advance for your help.
JRC
Hi Macropod.
I am sorry as I thought I had posted a reply to your post on the Excel
forum.
As I mentioned on some of my posts in that forum I tried your
subroutine as well as a few others that were posted but kept (and
still keep) getting error messages as I try to run them.
After copying and pasting your subroutine (and the others as well) I
tried to run them and Excel gave me an error message. The first line
appears highlighted in yellow and an arrow appears to the left of that
first line.
Do you know why this might be happening ?
I tried looking at a VBA book at Barnes and Noble but couldn't find
one specific for the Mac. There are several for Windows but I am not
sure what version of VBA on Windows would be equivalent to the one I
am running on the Mac (Office Mac 2004). Which of the Windows versions
of Excel with VBA would be equivalent to Office Mac 2004: the 2003 or
2007 version ?
Thanks,
JRC
 --

...

read more »

Hello, Rob.

I just tried your Python routine but got an error message stating the
following:

/System/Library/Frameworks/Python.framework/Versions/2.5/Resources/
Python.app/Contents/MacOS/Python: can't open file 'processtext.py':
[Errno 2] No such file or directory

At first I thought it had something to do with the fact that it might
be looking for the file 'processtext.py' in the main directory while I
had it on my desktop. So I transferred the files to the main directory
but nothing changed - the result was the same error message.

Is there something I am missing ?

Thank you,

Joe
 
R

Rob Schneider

Hello, Rob.

I just tried your Python routine but got an error message stating the
following:

/System/Library/Frameworks/Python.framework/Versions/2.5/Resources/
Python.app/Contents/MacOS/Python: can't open file 'processtext.py':
[Errno 2] No such file or directory

At first I thought it had something to do with the fact that it might
be looking for the file 'processtext.py' in the main directory while I
had it on my desktop. So I transferred the files to the main directory
but nothing changed - the result was the same error message.

Is there something I am missing ?

Thank you,

Joe

Joe,

It means that you launched Python and asked it to load the
'processtext.py' file and it can't find it. Probably because when you
launched Python the file processtext.py and the input file was not in
the current directory. It has to be in the 'current directory' simply
because you did not give a full path to the actual file location. The
computer has to be told what to do. It cannot read the user's mind.

I don't know what you mean by "main directory".

'processtext.py' is the name I suggested you give to the little
programme file. Is this the file name you gave it?

The error message suggests that it can't find the file; so on that that
basis let's go through the steps to take. I will be *very* explicit.
While there are a lot of words here, it is simple.

1. Launch the Mac program "TextEdit" using Finder and from the
Applications Folder.

2. Copy/paste the program I provided previously into TextEdit. Edit
the name of the input file name and output file name you will use. I
suggest "input.txt" and "output.txt". Use whatever you like. Ensure
that the indented portions of the program are, as mentioned previously,
all four spaces (or if not four then must all be the same, e.g. three).
Do not use tabs, do not use inconsistent indents.

3. Save the file into the "Documents" folder as 'processtext.py'. (You
can put it in any folder you like, including Desktop ... I just say
Documents just so that I can write this procedure and pick an existing
folder to avoid creating a new folder which adds more steps to the
procedure).

4. Copy the file which holds your text that you wish to transform into
the "Documents" folder. Use Finder.

5. Open the Mac "Terminal" program. You'll probably find it on the
"Dock" but not there, launch it via Finder in the Applications/Utilities
folder (help at Google "launch mac terminal")

6. Unless you changed something, the Mac will put you into terminal
mode in the /Users/[youruserid] folder. Change to the Documents folder
by issuing the command (without the quote marks), "cd Documents" then
press Return Key.

7.The "current directory" will be /Users/[youruserid/Documents". Check
this by issuing the command "pwd" to see the *p*present *w*working
*d*iretory.

8.Confirm that your "processtext.py is there by issuing the command "ls
passwordtext.py" and if it is the Mac will respond by by showing you the
file name. If the file is not present, go back to steps 1, 2, and 3.
This file must be in this folder simply because we keep it simple by not
putting it elsewhere which would require giving a full path name when
launching the program.

9.Confirm that your input file (called "input.txt"???) is in that folder
by issuing the "ls input.txt" file. If it is not there, then go to step
4 above. This file must be in this folder simply because we keep it
simple by not putting it elsewhere which would require giving a full
path name in the programm.

10. Issue the command "python processtext.py" which will a. launch
Python, b. load the "processtext.py" file from the current directory
(Documents), then run that program. It should create the file
output.text (or whatever you called it in step 2 above) into the current
folder (Documents). It should only take a few seconds (depending on
the size of the input file).

11. With Word or TextEdit, look at the output.txt file and see if it
did what you wanted.

12. If it did not, then you need to adjust the Python program to do
what you actually want if you want to use Python. You may also find
that the input data format/pattern is not exactly as expected and
therefore the program may not work. You can then chose to change the
data format/pattern, or make the program more sophisticated to enable
transforming the unexpected format/patter. That's just life when
munging data.


--rms

www.rmschneider.com
 
J

John McGhie

Hi Joe:

OK, I'll let you off this time :)

There was nothing wrong with your code, it's your data that's the problem.

The code depends on the presence of a semicolon somewhere after the first
character in the cell to do its thing. It was blowing up on the first cell
that did not have one, because there was no error-checking in the code.

Error-checking is the sort of "nice to have" that you get in
commercially-written code that you pay for -- you can't expect someone who
does you a favour by running up a few lines in a newsgroup to make the code
robust for you :)

Here's a new version, this one will turn red any cell that does not satisfy
the input requirements, so you can find them easily.

You should be aware that this code is NOT "restartable". You need to remove
the rows it has processed before re-running it, or the already-processed
rows will turn red.

Hope this helps

Public Sub TabIndentText()
Dim sTabs As String
Dim sTemp As String
Dim rCell As Range
Dim nCount As Long
Dim nSemi As Long

sTabs = String(10, vbTab)
For Each rCell In Selection
With rCell
nCount = Len(.Text) - Len(Application.Substitute(.Text, ".", ""))
nSemi = InStr(1, .Text, ";")
If nCount > 0 And nSemi > 0 Then
.Value = Left(sTabs, nCount) & Trim(Left(.Text, nSemi - 1)) & "
[" & Trim(Mid(.Text, nSemi + 1)) & "]"
Else
rCell.Interior.ColorIndex = 3
End If
End With
Next rCell
End Sub



Hi, John.

Thank you very much for your kind offer.

I've e-mailed you the excel worksheet I have been working on along
with two small screen captures showing the error messages I have been
getting when trying to run the code.

It is probably something I haven't done right but I just can't figure
it out.

Thanks again for your help.

Joe

This email is my business email -- Please do not email me about forum
matters unless you intend to pay!

--

John McGhie, Microsoft MVP (Word, Mac Word), Consultant Technical Writer,
McGhie Information Engineering Pty Ltd
Sydney, Australia. | Ph: +61 (0)4 1209 1410
+61 4 1209 1410, mailto:[email protected]
 
J

JRC

Hi Joe:

OK, I'll let you off this time :)

There was nothing wrong with your code, it's your data that's the problem..

The code depends on the presence of a semicolon somewhere after the first
character in the cell to do its thing.  It was blowing up on the first cell
that did not have one, because there was no error-checking in the code.

Error-checking is the sort of "nice to have" that you get in
commercially-written code that you pay for -- you can't expect someone who
does you a favour by running up a few lines in a newsgroup to make the code
robust for you :)

Here's a new version, this one will turn red any cell that does not satisfy
the input requirements, so you can find them easily.

You should be aware that this code is NOT "restartable".  You need to remove
the rows it has processed before re-running it, or the already-processed
rows will turn red.

Hope this helps

Public Sub TabIndentText()
Dim sTabs As String
Dim sTemp As String
Dim rCell As Range
Dim nCount As Long
Dim nSemi As Long

sTabs = String(10, vbTab)
For Each rCell In Selection
    With rCell
        nCount = Len(.Text) - Len(Application.Substitute(.Text,".", ""))
        nSemi = InStr(1, .Text, ";")
        If nCount > 0 And nSemi > 0 Then
            .Value = Left(sTabs, nCount) & Trim(Left(.Text,nSemi - 1)) & "
[" & Trim(Mid(.Text, nSemi + 1)) & "]"
            Else
            rCell.Interior.ColorIndex = 3
        End If
    End With
Next rCell
End Sub

Hi, John.
Thank you very much for your kind offer.
I've e-mailed you the excel worksheet I have been working on along
with two small screen captures showing the error messages I have been
getting when trying to run the code.
It is probably something I haven't done right but I just can't figure
it out.
Thanks again for your help.

This email is my business email -- Please do not email me about forum
matters unless you intend to pay!

 --

John McGhie, Microsoft MVP (Word, Mac Word), Consultant Technical Writer,
McGhie Information Engineering Pty Ltd
Sydney, Australia. | Ph: +61 (0)4 1209 1410
+61 4 1209 1410, mailto:[email protected]

Hi, John.

I just tried running this latest subroutine you provided and this time
it ran without any error messages.

The problem is the fact that (1) all records were highlighted in red
which made the entire column A red from the first record all the way
down to the last one, and (2) the data didn't move to the
corresponding cell according to the size of the alphanumeric character
(the number of tabs).

The file generated after the subroutine or macro in Excel will have to
be exported as a tab-delimited text file in order to be used
elsewhere. I chose to use Excel because it is cell-based and it would
make it visually easier to see the number of tabs by looking at what
column the line had been moved to. The subroutine you provided seems
to have applied tabs to the lines while they all remained in the first
column. In case I export this document as a tab-delimited file the
formatting won't be right and I believe the tabs will disappear.

Perhaps I wasn't clear in my initial posts and I apologize if that was
the case. My understanding was that in working with text files
imported into Excel as tab-delimited files tabs applied to the text
would imply the line content being shifted one or more cells to the
right. Is this not the case ?

What command can I use so that instead of applying tabs the text will
actually be moved a number of cells to the right corresponding to the
lenght of the alphanumeric string contained in the line ?

Thank you very much again for all your help with this issue.

Joe
 
M

macropod

Hi JRC,

Now that you know how to get vba code from newsgroups to run, perhaps it's time to reconsider something based on my earlier
suggestions for Excel:
Sub Indenter()
Dim i As Integer, j As Integer
Application.ScreenUpdating = False
With ActiveSheet
For i = 1 To .Cells.SpecialCells(xlCellTypeLastCell).Row
For j = 1 To Len(.Cells(i, 1).Value) - Len(Replace(.Cells(i, 1).Value, ".", ""))
.Cells(i, 1).Insert Shift:=xlToRight
Next
Next
End With
Application.ScreenUpdating = True
End Sub

or

Sub Indenter()
Dim i As Integer, j As Integer
Application.ScreenUpdating = False
With ActiveSheet
For i = 1 To .Cells.SpecialCells(xlCellTypeLastCell).Row
For j = 1 To Len(.Cells(i, 1).Value) - Len(Application.Substitute(.Cells(i, 1).Value, ".", ""))
.Cells(i, 1).Insert Shift:=xlToRight
Next
Next
End With
Application.ScreenUpdating = True
End Sub

Very quick and will not affect any records that have already been indented.

--
Cheers
macropod
[Microsoft MVP - Word]


Hi Joe:

OK, I'll let you off this time :)

There was nothing wrong with your code, it's your data that's the problem.

The code depends on the presence of a semicolon somewhere after the first
character in the cell to do its thing. It was blowing up on the first cell
that did not have one, because there was no error-checking in the code.

Error-checking is the sort of "nice to have" that you get in
commercially-written code that you pay for -- you can't expect someone who
does you a favour by running up a few lines in a newsgroup to make the code
robust for you :)

Here's a new version, this one will turn red any cell that does not satisfy
the input requirements, so you can find them easily.

You should be aware that this code is NOT "restartable". You need to remove
the rows it has processed before re-running it, or the already-processed
rows will turn red.

Hope this helps

Public Sub TabIndentText()
Dim sTabs As String
Dim sTemp As String
Dim rCell As Range
Dim nCount As Long
Dim nSemi As Long

sTabs = String(10, vbTab)
For Each rCell In Selection
With rCell
nCount = Len(.Text) - Len(Application.Substitute(.Text, ".", ""))
nSemi = InStr(1, .Text, ";")
If nCount > 0 And nSemi > 0 Then
.Value = Left(sTabs, nCount) & Trim(Left(.Text, nSemi - 1)) & "
[" & Trim(Mid(.Text, nSemi + 1)) & "]"
Else
rCell.Interior.ColorIndex = 3
End If
End With
Next rCell
End Sub

Hi, John.
Thank you very much for your kind offer.
I've e-mailed you the excel worksheet I have been working on along
with two small screen captures showing the error messages I have been
getting when trying to run the code.
It is probably something I haven't done right but I just can't figure
it out.
Thanks again for your help.

This email is my business email -- Please do not email me about forum
matters unless you intend to pay!

--

John McGhie, Microsoft MVP (Word, Mac Word), Consultant Technical Writer,
McGhie Information Engineering Pty Ltd
Sydney, Australia. | Ph: +61 (0)4 1209 1410
+61 4 1209 1410, mailto:[email protected]

Hi, John.

I just tried running this latest subroutine you provided and this time
it ran without any error messages.

The problem is the fact that (1) all records were highlighted in red
which made the entire column A red from the first record all the way
down to the last one, and (2) the data didn't move to the
corresponding cell according to the size of the alphanumeric character
(the number of tabs).

The file generated after the subroutine or macro in Excel will have to
be exported as a tab-delimited text file in order to be used
elsewhere. I chose to use Excel because it is cell-based and it would
make it visually easier to see the number of tabs by looking at what
column the line had been moved to. The subroutine you provided seems
to have applied tabs to the lines while they all remained in the first
column. In case I export this document as a tab-delimited file the
formatting won't be right and I believe the tabs will disappear.

Perhaps I wasn't clear in my initial posts and I apologize if that was
the case. My understanding was that in working with text files
imported into Excel as tab-delimited files tabs applied to the text
would imply the line content being shifted one or more cells to the
right. Is this not the case ?

What command can I use so that instead of applying tabs the text will
actually be moved a number of cells to the right corresponding to the
lenght of the alphanumeric string contained in the line ?

Thank you very much again for all your help with this issue.

Joe
 
J

JRC

Hello, Rob.
I just tried your Python routine but got an error message stating the
following:
/System/Library/Frameworks/Python.framework/Versions/2.5/Resources/
Python.app/Contents/MacOS/Python: can't open file 'processtext.py':
[Errno 2] No such file or directory
At first I thought it had something to do with the fact that it might
be looking for the file 'processtext.py' in the main directory while I
had it on my desktop. So I transferred the files to the main directory
but nothing changed - the result was the same error message.
Is there something I am missing ?
Thank you,

Joe,

It means that you launched Python and asked it to load the
'processtext.py' file and it can't find it. Probably because when you
launched Python the file processtext.py and the input file was not in
the current directory.  It has to be in the 'current directory' simply
because you did not give a full path to the actual file location. The
computer has to be told what to do.  It cannot read the user's mind.

I don't know what you mean by "main directory".

'processtext.py' is the  name I suggested you give to the little
programme file.  Is this the file name you gave it?

The error message suggests that it can't find the file; so on that that
basis let's go through the steps to take.  I will be *very* explicit.
While there are a lot of words here, it is simple.

1.  Launch the Mac program "TextEdit" using Finder and from the
Applications Folder.

2.  Copy/paste the program I provided previously into TextEdit.  Edit
the name of the input file name and output file name you will use.  I
suggest "input.txt" and "output.txt".  Use whatever you like. Ensure
that the indented portions of the program are, as mentioned previously,
all four spaces (or if not four then must all be the same, e.g. three).
Do not use tabs, do not use inconsistent indents.

3.  Save the file into the "Documents" folder as 'processtext.py'. (You
can put it in any folder you like, including Desktop ... I just say
Documents just so that I can write this procedure and pick an existing
folder to avoid creating a new folder which adds more steps to the
procedure).

4.  Copy the file which holds your text that you wish to transform into
the "Documents" folder.  Use Finder.

5.  Open the Mac "Terminal" program.  You'll probably find it on the
"Dock" but not there, launch it via Finder in the Applications/Utilities
folder (help at Google "launch mac terminal")

6.  Unless you changed something, the Mac will put you into terminal
mode in the /Users/[youruserid] folder.  Change to the Documents folder
by issuing the command (without the quote marks), "cd Documents" then
press Return Key.

7.The "current directory" will be /Users/[youruserid/Documents".  Check
this by issuing the command "pwd" to see the *p*present *w*working
*d*iretory.

8.Confirm that your "processtext.py is there by issuing the command "ls
passwordtext.py" and if it is the Mac will respond by by showing you the
file name. If the file is not present, go back to steps 1, 2, and 3.
This file must be in this folder simply because we keep it simple by not
  putting it elsewhere which would require giving a full path name when
launching the program.

9.Confirm that your input file (called "input.txt"???) is in that folder
by issuing the "ls input.txt" file.  If it is not there, then go to step
4 above.  This file must be in this folder simply because we keep it
simple by not putting it elsewhere which would require giving a full
path name in the programm.

10. Issue the command "python processtext.py" which will a. launch
Python, b. load the "processtext.py" file from the current directory
(Documents), then run that program.  It should create the file
output.text (or whatever you called it in step 2 above) into the current
  folder (Documents).  It should only take a few seconds (depending on
the size of the input file).

11.  With Word or TextEdit, look at the output.txt file and see if it
did what you wanted.

12.  If it did not, then you need to adjust the Python program to do
what you actually want if you want to use Python.  You may also find
that the input data format/pattern is not exactly as expected and
therefore the program may not work.  You can then chose to change the
data format/pattern, or make the program more sophisticated to enable
transforming the unexpected format/patter.  That's just life when
munging data.

--rms

www.rmschneider.com

Hi, Rob.

I just tried the Python program as you suggested. Thank you for the
detailed instructions that made every step easy to follow.

I am note sure if the program runs or doesn't. After I press enter
Terminal immediately displays the next line waiting for input and
doesn't show any error message. However, as I go into the Documents
folder I simply can't find the "output.txt" file that the program
should have saved. I also tried a spotlight search but couldn't locate
any file in my system with the name "output.txt" (I did this just to
make sure it had not saved the program to a different location
although I can see that Terminal is looking at the "right" folder).

What do you think may have happened ?

Thank you very much for your help and for the detailed explanation and
instructions.

Joe
 
M

macropod

And for a complete solution:

Sub Indenter()
Dim i, j As Integer
Application.ScreenUpdating = False
With ActiveSheet
For i = 1 To .Cells.SpecialCells(xlCellTypeLastCell).Row
If InStr(.Cells(i, 1).Value, ";") > 0 Then _
.Cells(i, 1).Value = Replace(.Cells(i, 1).Value, ";", " [") & "]"
For j = 1 To Len(.Cells(i, 1).Value) - Len(Replace(.Cells(i, 1).Value, ".", ""))
.Cells(i, 1).Insert Shift:=xlToRight
Next
Next
End With
Application.ScreenUpdating = True
End Sub

or

Sub Indenter()
Dim i, j As Integer
Application.ScreenUpdating = False
With ActiveSheet
For i = 1 To .Cells.SpecialCells(xlCellTypeLastCell).Row
If InStr(.Cells(i, 1).Value, ";") > 0 Then _
.Cells(i, 1).Value = Application.Substitute(.Cells(i, 1).Value, ";", " [") & "]"
For j = 1 To Len(.Cells(i, 1).Value) - Len(Application.Substitute(.Cells(i, 1).Value, ".", ""))
.Cells(i, 1).Insert Shift:=xlToRight
Next
Next
End With
Application.ScreenUpdating = True
End Sub

--
Cheers
macropod
[Microsoft MVP - Word]


Hi Joe:

OK, I'll let you off this time :)

There was nothing wrong with your code, it's your data that's the problem.

The code depends on the presence of a semicolon somewhere after the first
character in the cell to do its thing. It was blowing up on the first cell
that did not have one, because there was no error-checking in the code.

Error-checking is the sort of "nice to have" that you get in
commercially-written code that you pay for -- you can't expect someone who
does you a favour by running up a few lines in a newsgroup to make the code
robust for you :)

Here's a new version, this one will turn red any cell that does not satisfy
the input requirements, so you can find them easily.

You should be aware that this code is NOT "restartable". You need to remove
the rows it has processed before re-running it, or the already-processed
rows will turn red.

Hope this helps

Public Sub TabIndentText()
Dim sTabs As String
Dim sTemp As String
Dim rCell As Range
Dim nCount As Long
Dim nSemi As Long

sTabs = String(10, vbTab)
For Each rCell In Selection
With rCell
nCount = Len(.Text) - Len(Application.Substitute(.Text, ".", ""))
nSemi = InStr(1, .Text, ";")
If nCount > 0 And nSemi > 0 Then
.Value = Left(sTabs, nCount) & Trim(Left(.Text, nSemi - 1)) & "
[" & Trim(Mid(.Text, nSemi + 1)) & "]"
Else
rCell.Interior.ColorIndex = 3
End If
End With
Next rCell
End Sub

Hi, John.
Thank you very much for your kind offer.
I've e-mailed you the excel worksheet I have been working on along
with two small screen captures showing the error messages I have been
getting when trying to run the code.
It is probably something I haven't done right but I just can't figure
it out.
Thanks again for your help.

This email is my business email -- Please do not email me about forum
matters unless you intend to pay!

--

John McGhie, Microsoft MVP (Word, Mac Word), Consultant Technical Writer,
McGhie Information Engineering Pty Ltd
Sydney, Australia. | Ph: +61 (0)4 1209 1410
+61 4 1209 1410, mailto:[email protected]

Hi, John.

I just tried running this latest subroutine you provided and this time
it ran without any error messages.

The problem is the fact that (1) all records were highlighted in red
which made the entire column A red from the first record all the way
down to the last one, and (2) the data didn't move to the
corresponding cell according to the size of the alphanumeric character
(the number of tabs).

The file generated after the subroutine or macro in Excel will have to
be exported as a tab-delimited text file in order to be used
elsewhere. I chose to use Excel because it is cell-based and it would
make it visually easier to see the number of tabs by looking at what
column the line had been moved to. The subroutine you provided seems
to have applied tabs to the lines while they all remained in the first
column. In case I export this document as a tab-delimited file the
formatting won't be right and I believe the tabs will disappear.

Perhaps I wasn't clear in my initial posts and I apologize if that was
the case. My understanding was that in working with text files
imported into Excel as tab-delimited files tabs applied to the text
would imply the line content being shifted one or more cells to the
right. Is this not the case ?

What command can I use so that instead of applying tabs the text will
actually be moved a number of cells to the right corresponding to the
lenght of the alphanumeric string contained in the line ?

Thank you very much again for all your help with this issue.

Joe
 
R

Rob Schneider

Hello, Rob.
I just tried your Python routine but got an error message stating the
following:
/System/Library/Frameworks/Python.framework/Versions/2.5/Resources/
Python.app/Contents/MacOS/Python: can't open file 'processtext.py':
[Errno 2] No such file or directory
At first I thought it had something to do with the fact that it might
be looking for the file 'processtext.py' in the main directory while I
had it on my desktop. So I transferred the files to the main directory
but nothing changed - the result was the same error message.
Is there something I am missing ?
Thank you,

Joe,

It means that you launched Python and asked it to load the
'processtext.py' file and it can't find it. Probably because when you
launched Python the file processtext.py and the input file was not in
the current directory. It has to be in the 'current directory' simply
because you did not give a full path to the actual file location. The
computer has to be told what to do. It cannot read the user's mind.

I don't know what you mean by "main directory".

'processtext.py' is the name I suggested you give to the little
programme file. Is this the file name you gave it?

The error message suggests that it can't find the file; so on that that
basis let's go through the steps to take. I will be *very* explicit.
While there are a lot of words here, it is simple.

1. Launch the Mac program "TextEdit" using Finder and from the
Applications Folder.

2. Copy/paste the program I provided previously into TextEdit. Edit
the name of the input file name and output file name you will use. I
suggest "input.txt" and "output.txt". Use whatever you like. Ensure
that the indented portions of the program are, as mentioned previously,
all four spaces (or if not four then must all be the same, e.g. three).
Do not use tabs, do not use inconsistent indents.

3. Save the file into the "Documents" folder as 'processtext.py'. (You
can put it in any folder you like, including Desktop ... I just say
Documents just so that I can write this procedure and pick an existing
folder to avoid creating a new folder which adds more steps to the
procedure).

4. Copy the file which holds your text that you wish to transform into
the "Documents" folder. Use Finder.

5. Open the Mac "Terminal" program. You'll probably find it on the
"Dock" but not there, launch it via Finder in the Applications/Utilities
folder (help at Google "launch mac terminal")

6. Unless you changed something, the Mac will put you into terminal
mode in the /Users/[youruserid] folder. Change to the Documents folder
by issuing the command (without the quote marks), "cd Documents" then
press Return Key.

7.The "current directory" will be /Users/[youruserid/Documents". Check
this by issuing the command "pwd" to see the *p*present *w*working
*d*iretory.

8.Confirm that your "processtext.py is there by issuing the command "ls
passwordtext.py" and if it is the Mac will respond by by showing you the
file name. If the file is not present, go back to steps 1, 2, and 3.
This file must be in this folder simply because we keep it simple by not
putting it elsewhere which would require giving a full path name when
launching the program.

9.Confirm that your input file (called "input.txt"???) is in that folder
by issuing the "ls input.txt" file. If it is not there, then go to step
4 above. This file must be in this folder simply because we keep it
simple by not putting it elsewhere which would require giving a full
path name in the programm.

10. Issue the command "python processtext.py" which will a. launch
Python, b. load the "processtext.py" file from the current directory
(Documents), then run that program. It should create the file
output.text (or whatever you called it in step 2 above) into the current
folder (Documents). It should only take a few seconds (depending on
the size of the input file).

11. With Word or TextEdit, look at the output.txt file and see if it
did what you wanted.

12. If it did not, then you need to adjust the Python program to do
what you actually want if you want to use Python. You may also find
that the input data format/pattern is not exactly as expected and
therefore the program may not work. You can then chose to change the
data format/pattern, or make the program more sophisticated to enable
transforming the unexpected format/patter. That's just life when
munging data.

--rms

www.rmschneider.com

Hi, Rob.

I just tried the Python program as you suggested. Thank you for the
detailed instructions that made every step easy to follow.

I am note sure if the program runs or doesn't. After I press enter
Terminal immediately displays the next line waiting for input and
doesn't show any error message. However, as I go into the Documents
folder I simply can't find the "output.txt" file that the program
should have saved. I also tried a spotlight search but couldn't locate
any file in my system with the name "output.txt" (I did this just to
make sure it had not saved the program to a different location
although I can see that Terminal is looking at the "right" folder).

What do you think may have happened ?

Thank you very much for your help and for the detailed explanation and
instructions.

Joe

Joe,

I haven't a clue what happened on your computer and why this is so hard
for you. What do you think is wrong?

I don't have a clue if Spotlight will find a recently-created file. Why
do you think the program will have chosen on it's own to save it
somewhere other than the current directory. It will only do that if you
told it to. You don't say that you ran the program from the Documents
folder. The program, as provided, will put the output file in the same
folder as the input file.

As you can see in the original source code I put in no statements to
show progress of work. This is something that you could have done to
debug it. From the symptoms you provide the program did work and there
were no errors and the output file should have been created in the
current directory. But I don't know where you told the program to put
the file.

Here is a *more* sophisticated version of the program (with a *lot* of
additional exception checking and output statements to show it's
working. It's the same program just with added extras (watch that lines
don't get garbled. remember the four space indent rule):

=====================
#!/usr/bin/env python
# encoding: utf-8

# change these if you want. without folder names,
# the current directory assumed
ifn="input.txt" # inputfile name
ofn="output.txt" # outputfile name

try:
ifh=open("input.txt",'r')
except:
print "EXCEPTION: can't open input file."
try:
ofh=open("output.txt","w")
except:
print "EXCEPTION: can't open output file."
try:
inputtext=ifh.readlines()
except:
print "EXCEPTION: can't read input file."
ifh.close

tab='\t'
for line in inputtext:
t1=line.split(';')
if len(t1) < 2:
print "ERROR: no ';' in input line. Skipping.",t1
else:
title=t1[0]
data=t1[1][:-1]
cnt=len(t1[1].split('.'))
print "...will put %i tab stops in %s;%s" % (cnt,title,data)
try:
ofh.write(tab*cnt+title+' ['+data+']\n')
except:
print "EXCEPTION: can't write output file."
try:
ofh.close()
except:
print "EXCEPTION. Can't close output file."
print "\nDone. input from:%s output to: %s." % (ifn,ofn)
print "Check with the command: 'ls %s'" % (ofn)

===============

Here is the input file I used to test (with one error introduced to
check the error checking):

Body Regions;A01
Abdomen;A01.047
Abdominal Cavity;A01.047.025
Mesocolon;A01.047.025.600.451.535
Omentum;A01.047.025.600.573
Peritoneal Cavity;A01.047.025.600.678
Peritoneal Stomata;A01.047.025.600.700
Retroperitoneal Space;A01.047.025.750
Back;A01.176
Lumbosacral Region;A01.176.519
Sacrococcygeal Region;A01.176.780
BreastA01.236
Mammary Glands, Human;A01.236.249
================

Here is the output printed to the screen for this test:
================
MacBook:documents rmschne$ python transform.py
....will put 1 tab stops in Body Regions;A01
....will put 2 tab stops in Abdomen;A01.047
....will put 3 tab stops in Abdominal Cavity;A01.047.025
....will put 6 tab stops in Mesocolon;A01.047.025.600.451.535
....will put 5 tab stops in Omentum;A01.047.025.600.573
....will put 5 tab stops in Peritoneal Cavity;A01.047.025.600.678
....will put 5 tab stops in Peritoneal Stomata;A01.047.025.600.700
....will put 4 tab stops in Retroperitoneal Space;A01.047.025.750
....will put 2 tab stops in Back;A01.176
....will put 3 tab stops in Lumbosacral Region;A01.176.519
....will put 3 tab stops in Sacrococcygeal Region;A01.176.780
ERROR: no ';' in input line. Skipping. ['BreastA01.236\n']
....will put 3 tab stops in Mammary Glands, Human;A01.236.24

Done. input from:input.txt output to: output.txt.
Check with the command: 'ls -l output.txt'
=================

Here is the result of the ls command:
=================
MacBook:documents rmschne$ ls -l output.txt
-rw-r--r-- 1 rmschne staff 416 18 Dec 08:09 output.txt
=================
 
J

JRC

And for a complete solution:

Sub Indenter()
Dim i, j As Integer
Application.ScreenUpdating = False
With ActiveSheet
  For i = 1 To .Cells.SpecialCells(xlCellTypeLastCell).Row
    If InStr(.Cells(i, 1).Value, ";") > 0 Then _
      .Cells(i, 1).Value = Replace(.Cells(i, 1).Value, ";", " [")& "]"
    For j = 1 To Len(.Cells(i, 1).Value) - Len(Replace(.Cells(i, 1)..Value, ".", ""))
      .Cells(i, 1).Insert Shift:=xlToRight
    Next
  Next
End With
Application.ScreenUpdating = True
End Sub

or

Sub Indenter()
Dim i, j As Integer
Application.ScreenUpdating = False
With ActiveSheet
  For i = 1 To .Cells.SpecialCells(xlCellTypeLastCell).Row
    If InStr(.Cells(i, 1).Value, ";") > 0 Then _
      .Cells(i, 1).Value = Application.Substitute(.Cells(i, 1).Value, ";", " [") & "]"
    For j = 1 To Len(.Cells(i, 1).Value) - Len(Application.Substitute(.Cells(i, 1).Value, ".", ""))
      .Cells(i, 1).Insert Shift:=xlToRight
    Next
  Next
End With
Application.ScreenUpdating = True
End Sub

Hi, macropod.

That did it. It worked and I was able to make the formatting changes I
was trying to make.

I noticed that Excel took quite a long time to run the program. It
wasn't as fast as I expected it would be. Also I tried the first
program you submitted but it would run. The second program you labeled
"And for a complete solution:" was the one that did the job. It ran
the first time without a problem.

I have one more question to ask: How can I make a slight modified
version of the file ? I just noticed as I looked at the application
where I will use this file that I need yet another version of this
file in a different format. This second format needs to be indented
just as the first but the portion after the label (name) needs to be
removed from the line (in other words the alphanumeric portion needs
to be removed). Let me be more specific:

This is what we started with:

Body Regions;A01
Abdomen;A01.047
Abdominal Cavity;A01.047.025

This is what your program did:

Body Regions [A01]
Abdomen [A01.047]
Abdominal Cavity [A01.047.025]

The second version of the file I needs to be formatted like this:

Body Regions
Abdomen
Abdominal Cavity

How could I modify your program so that it will perform this task ?
Can you point me in the right direction ? Perhaps if I understand what
the proper algorightm must be I can then try to research the proper
functions and make the necessary changes to your program.

Thank you very much for all your help and for taking the time to work
with me on this project.

Joe
 
M

macropod

Hi JRC,

Try:
Sub CleanUp_Indenter()
Dim i, j, k As Long
Application.ScreenUpdating = False
With ActiveSheet
For i = 1 To .Cells.SpecialCells(xlCellTypeLastCell).Row
k = Len(.Cells(i, 1).Value) - Len(Replace(.Cells(i, 1).Value, ".", ""))
If InStr(.Cells(i, 1).Value, ";") > 0 Then _
.Cells(i, 1).Value = Left(.Cells(i, 1).Value, InStr(.Cells(i, 1).Value, ";") - 1)
For j = 1 To k
.Cells(i, 1).Insert Shift:=xlToRight
Next
Next
End With
Application.ScreenUpdating = True
End Sub

or, for progress feeback via the status bar:
Sub CleanUp_Indenter()
Dim i, j, k, l As Long
Dim SBar As Boolean
SBar = Application.DisplayStatusBar
Application.DisplayStatusBar = True
Application.ScreenUpdating = False
With ActiveSheet
l = .Cells.SpecialCells(xlCellTypeLastCell).Row
For i = 1 To l
k = Len(.Cells(i, 1).Value) - Len(Replace(.Cells(i, 1).Value, ".", ""))
If InStr(.Cells(i, 1).Value, ";") > 0 Then _
.Cells(i, 1).Value = Left(.Cells(i, 1).Value, InStr(.Cells(i, 1).Value, ";") - 1)
For j = 1 To k
.Cells(i, 1).Insert Shift:=xlToRight
Next
Application.StatusBar = i & " of " & l "records processed."
Next
End With
Application.StatusBar = False
Application.DisplayStatusBar = SBar
Application.ScreenUpdating = True
End Sub

Note: the progress report will slow things down slightly.

--
Cheers
macropod
[Microsoft MVP - Word]


And for a complete solution:

Sub Indenter()
Dim i, j As Integer
Application.ScreenUpdating = False
With ActiveSheet
For i = 1 To .Cells.SpecialCells(xlCellTypeLastCell).Row
If InStr(.Cells(i, 1).Value, ";") > 0 Then _
.Cells(i, 1).Value = Replace(.Cells(i, 1).Value, ";", " [") & "]"
For j = 1 To Len(.Cells(i, 1).Value) - Len(Replace(.Cells(i, 1).Value, ".", ""))
.Cells(i, 1).Insert Shift:=xlToRight
Next
Next
End With
Application.ScreenUpdating = True
End Sub

or

Sub Indenter()
Dim i, j As Integer
Application.ScreenUpdating = False
With ActiveSheet
For i = 1 To .Cells.SpecialCells(xlCellTypeLastCell).Row
If InStr(.Cells(i, 1).Value, ";") > 0 Then _
.Cells(i, 1).Value = Application.Substitute(.Cells(i, 1).Value, ";", " [") & "]"
For j = 1 To Len(.Cells(i, 1).Value) - Len(Application.Substitute(.Cells(i, 1).Value, ".", ""))
.Cells(i, 1).Insert Shift:=xlToRight
Next
Next
End With
Application.ScreenUpdating = True
End Sub

Hi, macropod.

That did it. It worked and I was able to make the formatting changes I
was trying to make.

I noticed that Excel took quite a long time to run the program. It
wasn't as fast as I expected it would be. Also I tried the first
program you submitted but it would run. The second program you labeled
"And for a complete solution:" was the one that did the job. It ran
the first time without a problem.

I have one more question to ask: How can I make a slight modified
version of the file ? I just noticed as I looked at the application
where I will use this file that I need yet another version of this
file in a different format. This second format needs to be indented
just as the first but the portion after the label (name) needs to be
removed from the line (in other words the alphanumeric portion needs
to be removed). Let me be more specific:

This is what we started with:

Body Regions;A01
Abdomen;A01.047
Abdominal Cavity;A01.047.025

This is what your program did:

Body Regions [A01]
Abdomen [A01.047]
Abdominal Cavity [A01.047.025]

The second version of the file I needs to be formatted like this:

Body Regions
Abdomen
Abdominal Cavity

How could I modify your program so that it will perform this task ?
Can you point me in the right direction ? Perhaps if I understand what
the proper algorightm must be I can then try to research the proper
functions and make the necessary changes to your program.

Thank you very much for all your help and for taking the time to work
with me on this project.

Joe
 
R

Rob Schneider

Re your comment on speed, the Python program I provided should be
several orders of magnitude faster, if not faster.
 
R

Rob Schneider

email me and we'll setup a web conference and I'll show you how to do this.


--rms

www.rmschneider.com




Hello, Rob.
I just tried your Python routine but got an error message stating the
following:
/System/Library/Frameworks/Python.framework/Versions/2.5/Resources/
Python.app/Contents/MacOS/Python: can't open file 'processtext.py':
[Errno 2] No such file or directory
At first I thought it had something to do with the fact that it might
be looking for the file 'processtext.py' in the main directory while I
had it on my desktop. So I transferred the files to the main directory
but nothing changed - the result was the same error message.
Is there something I am missing ?
Thank you,

Joe,

It means that you launched Python and asked it to load the
'processtext.py' file and it can't find it. Probably because when you
launched Python the file processtext.py and the input file was not in
the current directory. It has to be in the 'current directory' simply
because you did not give a full path to the actual file location. The
computer has to be told what to do. It cannot read the user's mind.

I don't know what you mean by "main directory".

'processtext.py' is the name I suggested you give to the little
programme file. Is this the file name you gave it?

The error message suggests that it can't find the file; so on that that
basis let's go through the steps to take. I will be *very* explicit.
While there are a lot of words here, it is simple.

1. Launch the Mac program "TextEdit" using Finder and from the
Applications Folder.

2. Copy/paste the program I provided previously into TextEdit. Edit
the name of the input file name and output file name you will use. I
suggest "input.txt" and "output.txt". Use whatever you like. Ensure
that the indented portions of the program are, as mentioned previously,
all four spaces (or if not four then must all be the same, e.g. three).
Do not use tabs, do not use inconsistent indents.

3. Save the file into the "Documents" folder as 'processtext.py'. (You
can put it in any folder you like, including Desktop ... I just say
Documents just so that I can write this procedure and pick an existing
folder to avoid creating a new folder which adds more steps to the
procedure).

4. Copy the file which holds your text that you wish to transform into
the "Documents" folder. Use Finder.

5. Open the Mac "Terminal" program. You'll probably find it on the
"Dock" but not there, launch it via Finder in the Applications/Utilities
folder (help at Google "launch mac terminal")

6. Unless you changed something, the Mac will put you into terminal
mode in the /Users/[youruserid] folder. Change to the Documents folder
by issuing the command (without the quote marks), "cd Documents" then
press Return Key.

7.The "current directory" will be /Users/[youruserid/Documents". Check
this by issuing the command "pwd" to see the *p*present *w*working
*d*iretory.

8.Confirm that your "processtext.py is there by issuing the command "ls
passwordtext.py" and if it is the Mac will respond by by showing you the
file name. If the file is not present, go back to steps 1, 2, and 3.
This file must be in this folder simply because we keep it simple by not
putting it elsewhere which would require giving a full path name when
launching the program.

9.Confirm that your input file (called "input.txt"???) is in that folder
by issuing the "ls input.txt" file. If it is not there, then go to step
4 above. This file must be in this folder simply because we keep it
simple by not putting it elsewhere which would require giving a full
path name in the programm.

10. Issue the command "python processtext.py" which will a. launch
Python, b. load the "processtext.py" file from the current directory
(Documents), then run that program. It should create the file
output.text (or whatever you called it in step 2 above) into the current
folder (Documents). It should only take a few seconds (depending on
the size of the input file).

11. With Word or TextEdit, look at the output.txt file and see if it
did what you wanted.

12. If it did not, then you need to adjust the Python program to do
what you actually want if you want to use Python. You may also find
that the input data format/pattern is not exactly as expected and
therefore the program may not work. You can then chose to change the
data format/pattern, or make the program more sophisticated to enable
transforming the unexpected format/patter. That's just life when
munging data.

--rms

www.rmschneider.com

Hi, Rob.

I just tried the Python program as you suggested. Thank you for the
detailed instructions that made every step easy to follow.

I am note sure if the program runs or doesn't. After I press enter
Terminal immediately displays the next line waiting for input and
doesn't show any error message. However, as I go into the Documents
folder I simply can't find the "output.txt" file that the program
should have saved. I also tried a spotlight search but couldn't locate
any file in my system with the name "output.txt" (I did this just to
make sure it had not saved the program to a different location
although I can see that Terminal is looking at the "right" folder).

What do you think may have happened ?

Thank you very much for your help and for the detailed explanation and
instructions.

Joe
 
J

JRC

Hi JRC,

Try:
Sub CleanUp_Indenter()
Dim i, j, k As Long
Application.ScreenUpdating = False
With ActiveSheet
  For i = 1 To .Cells.SpecialCells(xlCellTypeLastCell).Row
    k = Len(.Cells(i, 1).Value) - Len(Replace(.Cells(i, 1).Value, "..", ""))
    If InStr(.Cells(i, 1).Value, ";") > 0 Then _
      .Cells(i, 1).Value = Left(.Cells(i, 1).Value, InStr(.Cells(i, 1).Value, ";") - 1)
    For j = 1 To k
      .Cells(i, 1).Insert Shift:=xlToRight
    Next
  Next
End With
Application.ScreenUpdating = True
End Sub

or, for progress feeback via the status bar:
Sub CleanUp_Indenter()
Dim i, j, k, l As Long
Dim SBar As Boolean
SBar = Application.DisplayStatusBar
Application.DisplayStatusBar = True
Application.ScreenUpdating = False
With ActiveSheet
  l = .Cells.SpecialCells(xlCellTypeLastCell).Row
  For i = 1 To l
    k = Len(.Cells(i, 1).Value) - Len(Replace(.Cells(i, 1).Value, "..", ""))
    If InStr(.Cells(i, 1).Value, ";") > 0 Then _
      .Cells(i, 1).Value = Left(.Cells(i, 1).Value, InStr(.Cells(i, 1).Value, ";") - 1)
    For j = 1 To k
      .Cells(i, 1).Insert Shift:=xlToRight
    Next
    Application.StatusBar = i & " of " &  l "records processed."
  Next
End With
Application.StatusBar = False
Application.DisplayStatusBar = SBar
Application.ScreenUpdating = True
End Sub

Note: the progress report will slow things down slightly.

Hi, macropod.

I tried the both subroutines above and this time I got an error
message. With both the Replace function becomes highlighted when I try
to run it and I get a message that says the highlighted function is
not defined. I checked the spelling to make sure I had not mistyped
any of the commands but it all appears to be just as you wrote it.

Any ideas of what might be happening ?

Thanks again for all your help.

Joe
 
J

JRC

Hello, Rob.
I just tried your Python routine but got an error message stating the
following:
/System/Library/Frameworks/Python.framework/Versions/2.5/Resources/
Python.app/Contents/MacOS/Python: can't open file 'processtext.py':
[Errno 2] No such file or directory
At first I thought it had something to do with the fact that it might
be looking for the file 'processtext.py' in the main directory while I
had it on my desktop. So I transferred the files to the main directory
but nothing changed - the result was the same error message.
Is there something I am missing ?
Thank you,
Joe
Joe,
It means that you launched Python and asked it to load the
'processtext.py' file and it can't find it. Probably because when you
launched Python the file processtext.py and the input file was not in
the current directory.  It has to be in the 'current directory' simply
because you did not give a full path to the actual file location. The
computer has to be told what to do.  It cannot read the user's mind.
I don't know what you mean by "main directory".
'processtext.py' is the  name I suggested you give to the little
programme file.  Is this the file name you gave it?
The error message suggests that it can't find the file; so on that that
basis let's go through the steps to take.  I will be *very* explicit.
While there are a lot of words here, it is simple.
1.  Launch the Mac program "TextEdit" using Finder and from the
Applications Folder.
2.  Copy/paste the program I provided previously into TextEdit.  Edit
the name of the input file name and output file name you will use.  I
suggest "input.txt" and "output.txt".  Use whatever you like. Ensure
that the indented portions of the program are, as mentioned previously,
all four spaces (or if not four then must all be the same, e.g. three)..
Do not use tabs, do not use inconsistent indents.
3.  Save the file into the "Documents" folder as 'processtext.py'.. (You
can put it in any folder you like, including Desktop ... I just say
Documents just so that I can write this procedure and pick an existing
folder to avoid creating a new folder which adds more steps to the
procedure).
4.  Copy the file which holds your text that you wish to transform into
the "Documents" folder.  Use Finder.
5.  Open the Mac "Terminal" program.  You'll probably find it on the
"Dock" but not there, launch it via Finder in the Applications/Utilities
folder (help at Google "launch mac terminal")
6.  Unless you changed something, the Mac will put you into terminal
mode in the /Users/[youruserid] folder.  Change to the Documents folder
by issuing the command (without the quote marks), "cd Documents" then
press Return Key.
7.The "current directory" will be /Users/[youruserid/Documents".  Check
this by issuing the command "pwd" to see the *p*present *w*working
*d*iretory.
8.Confirm that your "processtext.py is there by issuing the command "ls
passwordtext.py" and if it is the Mac will respond by by showing you the
file name. If the file is not present, go back to steps 1, 2, and 3.
This file must be in this folder simply because we keep it simple by not
   putting it elsewhere which would require giving a full path name when
launching the program.
9.Confirm that your input file (called "input.txt"???) is in that folder
by issuing the "ls input.txt" file.  If it is not there, then go to step
4 above.  This file must be in this folder simply because we keepit
simple by not putting it elsewhere which would require giving a full
path name in the programm.
10. Issue the command "python processtext.py" which will a. launch
Python, b. load the "processtext.py" file from the current directory
(Documents), then run that program.  It should create the file
output.text (or whatever you called it in step 2 above) into the current
   folder (Documents).  It should only take a few seconds (depending on
the size of the input file).
11.  With Word or TextEdit, look at the output.txt file and see if it
did what you wanted.
12.  If it did not, then you need to adjust the Python program todo
what you actually want if you want to use Python.  You may also find
that the input data format/pattern is not exactly as expected and
therefore the program may not work.  You can then chose to changethe
data format/pattern, or make the program more sophisticated to enable
transforming the unexpected format/patter.  That's just life when
munging data.
--rms
www.rmschneider.com
I just tried the Python program as you suggested. Thank you for the
detailed instructions that made every step easy to follow.
I am note sure if the program runs or doesn't. After I press enter
Terminal immediately displays the next line waiting for input and
doesn't show any error message. However, as I go into the Documents
folder I simply can't find the "output.txt" file that the program
should have saved. I also tried a spotlight search but couldn't locate
any file in my system with the name "output.txt" (I did this just to
make sure it had not saved the program to a different location
although I can see that Terminal is looking at the "right" folder).
What do you think may have happened ?
Thank you very much for your help and for the detailed explanation and
instructions.

Joe,

I haven't a clue what happened on your computer and why this is so hard
for you. What do you think is wrong?

I don't have a clue if Spotlight will find a recently-created file. Why
do you think the program will have chosen on it's own to save it
somewhere other than the current directory. It will only do that if you
told it to.  You don't say that you ran the program from the Documents
folder.  The program, as provided, will put the output file in the same
folder as the input file.

As you can see in the original source code I put in no statements to
show progress of work.  This is something that you could have done to
debug it.  From the symptoms you provide the program did work and there
were no errors and the output file should have been created in the
current directory.  But I don't know where you told the program to put
the file.

Here is a *more* sophisticated version of the program (with a *lot* of
additional exception checking and output statements to show it's
working.  It's the same program just with added extras (watch that lines
don't get garbled. remember the four space indent rule):

=====================
#!/usr/bin/env python
# encoding: utf-8

# change these if you want. without folder names,
# the current directory assumed
ifn="input.txt"   # inputfile name
ofn="output.txt"  # outputfile name

try:
     ifh=open("input.txt",'r')
except:
     print "EXCEPTION: can't open input file."
try:
     ofh=open("output.txt","w")
except:
     print "EXCEPTION: can't open output file."
try:
     inputtext=ifh.readlines()
except:
     print "EXCEPTION: can't read input file."
ifh.close

tab='\t'
for line in inputtext:
     t1=line.split(';')
     if len(t1) < 2:
         print "ERROR: no ';' in input line. Skipping.",t1
     else:
         title=t1[0]
         data=t1[1][:-1]
         cnt=len(t1[1].split('.'))
         print "...will put %i tab stops in %s;%s" % (cnt,title,data)
         try:
             ofh.write(tab*cnt+title+'['+data+']\n')
         except:
             print "EXCEPTION: can't write output file."
try:
     ofh.close()
except:
     print "EXCEPTION.  Can't close output file."
print "\nDone. input from:%s output to: %s." % (ifn,ofn)
print "Check with the command: 'ls %s'" % (ofn)

===============

Here is the input file I used to test (with one error introduced to
check the error checking):

Body Regions;A01
Abdomen;A01.047
Abdominal Cavity;A01.047.025
Mesocolon;A01.047.025.600.451.535
Omentum;A01.047.025.600.573
Peritoneal Cavity;A01.047.025.600.678
Peritoneal Stomata;A01.047.025.600.700
Retroperitoneal Space;A01.047.025.750
Back;A01.176
Lumbosacral Region;A01.176.519
Sacrococcygeal Region;A01.176.780
BreastA01.236
Mammary Glands, Human;A01.236.249
================

Here is the output printed to the screen for this test:
================
MacBook:documents rmschne$ python transform.py
...will put 1 tab stops in Body Regions;A01
...will put 2 tab stops in Abdomen;A01.047
...will put 3 tab stops in Abdominal Cavity;A01.047.025
...will put 6 tab stops in Mesocolon;A01.047.025.600.451.535
...will put 5 tab stops in Omentum;A01.047.025.600.573
...will put 5 tab stops in Peritoneal Cavity;A01.047.025.600.678
...will put 5 tab stops in Peritoneal Stomata;A01.047.025.600.700
...will put 4 tab stops in Retroperitoneal Space;A01.047.025.750
...will put 2 tab stops in Back;A01.176
...will put 3 tab stops in Lumbosacral Region;A01.176.519
...will put 3 tab stops in Sacrococcygeal Region;A01.176.780
ERROR: no ';' in input line. Skipping. ['BreastA01.236\n']
...will put 3 tab stops in Mammary Glands, Human;A01.236.24

Done. input from:input.txt output to: output.txt.
Check with the command: 'ls -l output.txt'
=================

Here is the result of the ls command:
=================
MacBook:documents rmschne$ ls -l output.txt
-rw-r--r--  1 rmschne  staff  416 18 Dec 08:09 output.txt
=================

Hi, Rob.

I just tried to run this second version of the Python program you
provided and it gave me an error message. This is what I got on the
Terminal window:

File "processtext.py", line 3
try: 
     ifh=open("input.txt",'r') 
except: 
     print
"EXCEPTION: can't open input file." 
try: 
     ofh=open
("output.txt","w") 
except: 
     print "EXCEPTION: can't open output
file." 
try: 
     inputtext=ifh.readlines() 
except: 
     print
"EXCEPTION: can't read input file." 
ifh.close
^
SyntaxError: invalid syntax

The file was saved using TextEdit using UTF-8 encoding. It seems
Python couldn't open the file for some reason. Could it possibly have
something to do with the formatting ?

I followed your instructions to the letter. Changed from the root
directory to the Documents folder to make it the working directory,
issued the ls command and verified that both files were indeed in the
Documents directory. Everything seems to be in the right place
following your instructions.

What should I do ?

Thank you,

Joe
 
J

JRC

Re your comment on speed, the Python program I provided should be
several orders of magnitude faster, if not faster.

Hi, Rob.

On a separate note I was at Barnes and Noble yesterday to look at some
books. I decided to look at some on programming and chose to look at
Python first. I must be honest and tell you I was a little discouraged
to see how complex Python is. One of the books from O'Reilly media on
Python is about the size of a large Phone directory (yellow pages
even). I also read last night while looking at Mailman that the whole
Mailman application was written with Python (and some C added for
security, as the developers state on their own page). This seems like
a very robust language one would need sometime to learn. No ???

Thanks,

Joe
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top