Deleting non-alphanumeric characters

M

Mark

Hello, I have code that extracts certain things from a text file
containing source code for a web site. One of the things I am
extracting is the web page name. For some reason it extracts it with
a LOT of carriage returns. Like 7-8 of those rectangular characters.
I have in my code the following:

Cells.Replace What:=Chr(13), Replacement:="", LookAt:=xlPart,
SearchOrder:=xlByRows, MatchCase:=False
Cells.Replace What:=Chr(10), Replacement:="", LookAt:=xlPart,
SearchOrde:=xlByRows, MatchCase:=False

But it only seem to delete like the first and last 2 of the
characters...I can't figure out how to get rid of them. I figured if
I also put something in the code along the lines of "If not
alphanumeric then delete character (not entire cell)"

Any suggestions??
Thanks!
 
J

Jim Cone

Try replacing Chr(160) with " " or ""
--
Jim Cone
Portland, Oregon USA
http://www.realezsites.com/bus/primitivesoftware
(Excel Add-ins / Excel Programming)


"Mark"
wrote in message
Hello, I have code that extracts certain things from a text file
containing source code for a web site. One of the things I am
extracting is the web page name. For some reason it extracts it with
a LOT of carriage returns. Like 7-8 of those rectangular characters.
I have in my code the following:

Cells.Replace What:=Chr(13), Replacement:="", LookAt:=xlPart,
SearchOrder:=xlByRows, MatchCase:=False
Cells.Replace What:=Chr(10), Replacement:="", LookAt:=xlPart,
SearchOrde:=xlByRows, MatchCase:=False

But it only seem to delete like the first and last 2 of the
characters...I can't figure out how to get rid of them. I figured if
I also put something in the code along the lines of "If not
alphanumeric then delete character (not entire cell)"
Any suggestions??
Thanks!
 
M

Mark

Try replacing Chr(160) with  " "  or  ""
--
Jim Cone
Portland, Oregon  USAhttp://www.realezsites.com/bus/primitivesoftware
(Excel Add-ins / Excel Programming)

"Mark"
wrote in message
Hello, I have code that extracts certain things from a text file
containing source code for a web site.  One of the things I am
extracting is the web page name.  For some reason it extracts it with
a LOT of carriage returns.  Like 7-8 of those rectangular characters.
I have in my code the following:

    Cells.Replace What:=Chr(13), Replacement:="", LookAt:=xlPart,
SearchOrder:=xlByRows, MatchCase:=False
    Cells.Replace What:=Chr(10), Replacement:="", LookAt:=xlPart,
SearchOrde:=xlByRows, MatchCase:=False

But it only seem to delete like the first and last 2 of the
characters...I can't figure out how to get rid of them.  I figured if
I also put something in the code along the lines of "If not
alphanumeric then delete character (not entire cell)"
Any suggestions??
Thanks!

Yea for some odd reason they still won't delete. I have no idea why.
Is there a way to seach for ASCII characters? I looked it up and its
0013 I think...

Thanks again
 
R

Rick Rothstein \(MVP - VB\)

Yea for some odd reason they still won't delete. I have no idea why.
Is there a way to seach for ASCII characters? I looked it up and its
0013 I think...

Let me preface this by saying I have zero experience with international
coding issues (all my programming efforts have been US based), so this is
only a guess. That 0013 may be a Unicode value... try using ChrW instead of
Chr and see if that finds them.

Rick
 
J

Jim Cone

Run the Excel "Clean" function on each cell.
Either on the worksheet in a helper column or in code using Application.Clean( )
--
Jim Cone
Portland, Oregon USA
http://www.realezsites.com/bus/primitivesoftware
(Excel Add-ins / Excel Programming)


"Mark"
<[email protected]>
wrote in message
Yea for some odd reason they still won't delete. I have no idea why.
Is there a way to seach for ASCII characters? I looked it up and its
0013 I think...

Thanks again
 
R

Ron Rosenfeld

Hello, I have code that extracts certain things from a text file
containing source code for a web site. One of the things I am
extracting is the web page name. For some reason it extracts it with
a LOT of carriage returns. Like 7-8 of those rectangular characters.
I have in my code the following:

Cells.Replace What:=Chr(13), Replacement:="", LookAt:=xlPart,
SearchOrder:=xlByRows, MatchCase:=False
Cells.Replace What:=Chr(10), Replacement:="", LookAt:=xlPart,
SearchOrde:=xlByRows, MatchCase:=False

But it only seem to delete like the first and last 2 of the
characters...I can't figure out how to get rid of them. I figured if
I also put something in the code along the lines of "If not
alphanumeric then delete character (not entire cell)"

Any suggestions??
Thanks!

Is the text file coded in Unicode? If *not*, then you could do something like:

=======================
Dim c As Range
Dim re As Object
Set re = CreateObject("vbscript.regexp")
re.IgnoreCase = True
re.Global = True
re.Pattern = "[^ -~]+"
For Each c In Selection
c.Value = re.Replace(c.Value, "")
Next c
=========================

The regular expression selects any character that is not in the class (range)
of the <space> character (ASCII code 32) to the tilde (ASCII code 126) and
replaces it with nothing.
--ron
 
M

Mark

Hello, I have code that extracts certain things from a text file
containing source code for a web site.  One of the things I am
extracting is the web page name.  For some reason it extracts it with
a LOT of carriage returns.  Like 7-8 of those rectangular characters.
I have in my code the following:
   Cells.Replace What:=Chr(13), Replacement:="", LookAt:=xlPart,
SearchOrder:=xlByRows, MatchCase:=False
   Cells.Replace What:=Chr(10), Replacement:="", LookAt:=xlPart,
SearchOrde:=xlByRows, MatchCase:=False
But it only seem to delete like the first and last 2 of the
characters...I can't figure out how to get rid of them.  I figured if
I also put something in the code along the lines of "If not
alphanumeric then delete character (not entire cell)"
Any suggestions??
Thanks!

Is the text file coded in Unicode?  If *not*, then you could do something like:

=======================
Dim c As Range
Dim re As Object
Set re = CreateObject("vbscript.regexp")
    re.IgnoreCase = True
    re.Global = True
    re.Pattern = "[^ -~]+"
For Each c In Selection
    c.Value = re.Replace(c.Value, "")
Next c
=========================

The regular expression selects any character that is not in the class (range)
of the <space> character (ASCII code 32) to the tilde (ASCII code 126) and
replaces it with nothing.
--ron- Hide quoted text -

- Show quoted text -

Thanks Ron

That worked perfectly, nice and clean without any boxes!
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top