Split astring using Capital letter as identifier

R

Raj

Hi,

I am looking for a formula/function/UDF that would split
VasantRamPatil into Vasant Ram Patil ( ie split the string into three
words using the capital letter as the initial letter of each word).
The split can be either by inserting spaces inside the string itself
or splitting the string into three strings).

Thanks in advance for the Help.

Regards,
Raj
 
J

Jacob Skaria

Try the below UDF

'Usage
=SplitWords(A1)

Function SplitWords(strData) As String
Dim intTemp As Integer
SplitWords = Left(strData, 1)
For intTemp = 2 To Len(Trim(strData))
If Mid(strData, intTemp, 1) = UCase(Mid(strData, _
intTemp, 1)) Then SplitWords = SplitWords & " "
SplitWords = SplitWords & Mid(strData, intTemp, 1)
Next
End Function

If this post helps click Yes
 
S

Stefi

Try this UDF:

Function splitcap(wholestr)
splitstr = ""
For i = 1 To Len(wholestr)
currchr = Mid(wholestr, i, 1)
splitstr = splitstr & IIf(Asc(currchr) < 91 And i > 1, " ", "") &
currchr
Next i
splitcap = splitstr
End Function

Regards,
Stefi


„Raj†ezt írta:
 
R

Rick Rothstein

I think this UDF does what you want..

Function SplitOnCaps(S As String) As String
Dim X As Long
SplitOnCaps = S
For X = Len(SplitOnCaps) To 2 Step -1
If Mid(SplitOnCaps, X, 1) Like "[A-Z]" Then
SplitOnCaps = Left(SplitOnCaps, X - 1) & " " & Mid(SplitOnCaps, X)
End If
Next
End Function
 
R

Rick Rothstein

You might need to expand your test... just testing for <91 makes your code
treat most punctuation marks and digits as if they were upper case letters.
Think names like (FOX News host) BillO'Reilly or (when she was married to
Lee Majors) FarrahFawcett-Majors.

Just as an aside, concatenating *each* letter to make the string value is
not as efficient as just handling the concatenations at the required break
points (see the other posting in this thread for examples). If the UDF is to
be used a great many times in the OP's spreadsheet, the extra overhead in
your method might begin to show up as a slower spreadsheet recalculations.
 
R

Ron Rosenfeld

Hi,

I am looking for a formula/function/UDF that would split
VasantRamPatil into Vasant Ram Patil ( ie split the string into three
words using the capital letter as the initial letter of each word).
The split can be either by inserting spaces inside the string itself
or splitting the string into three strings).

Thanks in advance for the Help.

Regards,
Raj

Here is a UDF that inserts spaces.

To enter this User Defined Function (UDF), <alt-F11> opens the Visual Basic
Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.

To use this User Defined Function (UDF), enter a formula like

=SplitOnCaps(A1)

in some cell.

=====================================
Option Explicit
Function SplitOnCaps(s As String) As String
Dim re As Object
Set re = CreateObject("vbscript.regexp")
re.Global = True
re.Pattern = "([a-z])([A-Z])"
SplitOnCaps = re.Replace(s, "$1 $2")
End Function
======================================
--ron
 
R

Rajendra

I am looking for a formula/function/UDF that would split
VasantRamPatil into Vasant Ram Patil ( ie split the string into three
words using the capital letter as the initial letter of each word).
The split can be either by inserting spaces inside the string itself
or splitting the string into three strings).
Thanks in advance for the Help.
Regards,
Raj

Here is a UDF that inserts spaces.

To enter this User Defined Function (UDF), <alt-F11> opens the Visual Basic
Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.

To use this User Defined Function (UDF), enter a formula like

        =SplitOnCaps(A1)

in some cell.

=====================================
Option Explicit
Function SplitOnCaps(s As String) As String
Dim re As Object
    Set re = CreateObject("vbscript.regexp")
        re.Global = True
        re.Pattern = "([a-z])([A-Z])"
SplitOnCaps = re.Replace(s, "$1 $2")
End Function
======================================
--ron

Thanks to All for immediately solving the problem as well as for the
learning that there are many different ways of doing things. The
knowledge will be useful for in solving future problems.

Regards,
Raj
 
R

Rick Rothstein

I am looking for a formula/function/UDF that would split
VasantRamPatil into Vasant Ram Patil ( ie split the string into three
words using the capital letter as the initial letter of each word).
The split can be either by inserting spaces inside the string itself
or splitting the string into three strings).

Thanks in advance for the Help.

Regards,
Raj

Here is a UDF that inserts spaces.

To enter this User Defined Function (UDF), <alt-F11> opens the Visual
Basic
Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.

To use this User Defined Function (UDF), enter a formula like

=SplitOnCaps(A1)

in some cell.

=====================================
Option Explicit
Function SplitOnCaps(s As String) As String
Dim re As Object
Set re = CreateObject("vbscript.regexp")
re.Global = True
re.Pattern = "([a-z])([A-Z])"
SplitOnCaps = re.Replace(s, "$1 $2")
End Function
======================================

Unlike the other solutions provided, the pattern you used will not split the
text at consecutive upper case letters (which might, of course, be what the
OP wants as he didn't say in his posting). I was thinking of a first name,
middle initial, last name construction like JohnQPublic or even a first
initial, middle initial, last name construction like JQPublic. Of course,
this would not be the desired treatment for a company name like
IBMCorporation though.
 
A

Ashish Mathur

Hi,

Please find attached the workaround (please array enter this formula -
Ctrl+Shift+Enter). I had worked on this ages ago and have not seen this
solution since then, thereofore I am sure there is a way to shorten this.

=MID(A10,MATCH(TRUE,EXACT(MID(A10,ROW($1:$28),1),PROPER(MID(A10,ROW($1:$28),1))),0),MATCH(TRUE,EXACT(MID(A10,ROW($2:$28),1),PROPER(MID(A10,ROW($2:$28),1))),0))&"
"&MID(A10,MATCH(TRUE,EXACT(MID(A10,ROW($2:$28),1),PROPER(MID(A10,ROW($2:$28),1))),0)+1,255)

--
Regards,

Ashish Mathur
Microsoft Excel MVP
www.ashishmathur.com
 
R

Rick Rothstein

Your formula has the same problem I pointed out in Ron's solution (which I
noted may not really be a problem depending on the OP's actual data);
namely, consecutive upper case letters do not get split apart. The examples
of this problem that I posted to Ron's message were constructions like
these... a first name, middle initial, last name construction like
JohnQPublic or a first initial, middle initial, last name construction like
JQPublic. I then noted this to Ron... of course, this would not be the
desired treatment for a company name like IBMCorporation though. The OP will
have to decide based on his knowledge of his own data whether this is
actually a problem to him or not.
 
R

Ron Rosenfeld

Unlike the other solutions provided, the pattern you used will not split the
text at consecutive upper case letters (which might, of course, be what the
OP wants as he didn't say in his posting). I was thinking of a first name,
middle initial, last name construction like JohnQPublic or even a first
initial, middle initial, last name construction like JQPublic. Of course,
this would not be the desired treatment for a company name like
IBMCorporation though.

As is frequently the case, any solution offered must depend critically on our
interpretation of the request, given incomplete information.

For example, your solution adds an inappropriate space here:

BillO'Reilly Bill O' Reilly

And if the original happens to have a space, it adds an extra:

Bill O'Reilly Bill O' Reilly

A slight change in my regex solution (both the regex and the replacement
string) can also handle the problem of JQPublic, but, of course, it doesn't
also deal properly with IBM Corporation. I'm not sure how to programmatically
differentiate between a person's name and a corporation, in this context.

========================================
Option Explicit
Function SplitOnCaps(s As String) As String
Dim re As Object
Set re = CreateObject("vbscript.regexp")
re.Global = True
re.Pattern = "([a-zA-Z])(?=[A-Z])"
SplitOnCaps = re.Replace(s, "$1 ")
End Function
=======================
--ron
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top