Counting and deleting

K

KevinUK

I have a spreadsheet of over 40,000 lines for analysis. The first
column contains phrases from the British National Corpus. The other
columns contain various categorical descriptions of each phrase. This
is what I need to do (but don't know how!):

1. How to count the number of words in each phrase and write the
result to a new column.

2. How to count the number of letters in a single-word phrase and
write the result to a new column.

3. How to delete every line whose phrase contains more than one word.

4. How to delete every line whose phrase contains one word of less
than x letters.

5. How to delete lines whose phrases consist of one specific word
(e.g. 'it')

I would be very grateful for any guidance anyone can give. Many
thanks.

Kevin Glover
 
P

Pete_UK

1. The trick here is to count the number of spaces and add one. This
formula will do so, say in C2:

=LEN(A2) - LEN(SUBSTITUTE(A2," ","")) + 1

assuming your string is in A2. You might like to put Num_words in C1
as a heading.

2. LEN returns the number of characters, so you could put this formula
in, say, D2:

=IF(C2=1,LEN(A2),"")

which only returns a value for single-word entries in column A. Put
Num_letters as the heading in D1.

3. A formula cannot delete a row. You can use a formula to mark a row
for deletion, and then delete all marked rows later in one operation
using a filter (or you could use a macro to do it for you). You could
put this formula in E2:

=IF(C2>1,"Delete","OK")

4. Put this formula in F2:

=IF(AND(D2<>"",D2<5),"Delete","ok")

I've assumed 5 letters for your "x".

5. Put this formula in G2:

=IF(AND(C2=1,A2="it"),"Delete","ok")

All these formulae can be copied down for as many phrases as you have.
Then you can apply autofilter (Data | Filter | Autofilter), and then
you can use the filter drop-down in column E and select Delete from
the list. Then highlight all the visible rows and click on Edit |
Delete and then select All in that filter. Do the same for columns F
and for G.

Hope this helps.

Pete
 
D

Don Guillett Excel MVP

I have a spreadsheet of over 40,000 lines for analysis. The first
column contains phrases from the British National Corpus. The other
columns contain various categorical descriptions of each phrase. This
is what I need to do (but don't know how!):

1. How to count the number of words in each phrase and write the
result to a new column.

2. How to count the number of letters in a single-word phrase and
write the result to a new column.

3. How to delete every line whose phrase contains more than one word.

4. How to delete every line whose phrase contains one word of less
than x letters.

5. How to delete lines whose phrases consist of one specific word
(e.g. 'it')

I would be very grateful for any guidance anyone can give. Many
thanks.

Kevin Glover

I would be willing to try to provide a one click macro solution. A
very clear explanation and

"If desired, send your file to dguillett @gmail.com I will only look
if:
1. You send a copy of this message on an inserted sheet
2. You give me the newsgroup and the subject line
3. You send a clear explanation of what you want
4. You send before/after examples and expected results."
 
R

Rick Rothstein

Pete, I have a sneaking suspicion that you just did Kevin's homework
assignment for him (I could be wrong about this though).

Rick Rothstein (MVP - Excel)




"Pete_UK" wrote in message

1. The trick here is to count the number of spaces and add one. This
formula will do so, say in C2:

=LEN(A2) - LEN(SUBSTITUTE(A2," ","")) + 1

assuming your string is in A2. You might like to put Num_words in C1
as a heading.

2. LEN returns the number of characters, so you could put this formula
in, say, D2:

=IF(C2=1,LEN(A2),"")

which only returns a value for single-word entries in column A. Put
Num_letters as the heading in D1.

3. A formula cannot delete a row. You can use a formula to mark a row
for deletion, and then delete all marked rows later in one operation
using a filter (or you could use a macro to do it for you). You could
put this formula in E2:

=IF(C2>1,"Delete","OK")

4. Put this formula in F2:

=IF(AND(D2<>"",D2<5),"Delete","ok")

I've assumed 5 letters for your "x".

5. Put this formula in G2:

=IF(AND(C2=1,A2="it"),"Delete","ok")

All these formulae can be copied down for as many phrases as you have.
Then you can apply autofilter (Data | Filter | Autofilter), and then
you can use the filter drop-down in column E and select Delete from
the list. Then highlight all the visible rows and click on Edit |
Delete and then select All in that filter. Do the same for columns F
and for G.

Hope this helps.

Pete
 
K

KevinUK

Pete,

You are a star. Thanks so much for the guidance. I was amused by the
suggestion that you might have done my homework. At age 60, it's been
a while since I did homework, so I'm afraid I won't be getting that A!
Thanks again.

Kevin
 
P

Pete_UK

You are welcome, Kevin - thanks for feeding back.

When you said you had 40,000 records I thought that would be a bit
much for a homework assignment, so I'm glad I was able to help you in
your task.

Pete
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top