Sub to extract uniques from 200k data in xl03

M

Max

Alan, sorry, I just posted a response to your earlier reply which crossed
this. It **worked** and took only 15 sec to spit out the results !! Superb.
 
M

Max

James, thanks. Tried running your sub on smaller test set of data and it
works fine. For my needs here I'll be going with Alan's offering in the
other branch which also works as well and is very fast. Nonetheless, I do
appreciate your response/sub.
 
A

Alan Beban

Alan said:
My last email assumed your data was the same as my test file. In your
file, when you mouse over x it should be equal to the number of unique
elements in your data; and y should be the integral part of x/65536,
i.e., 0,1,2,3 or 4.

Alan Beban

Whoops! Must have been too close to my bedtime when I wrote the above. y
is, of course, the number of elements in the rightmost column of the
unique output.

Alan Beban
 
T

Tom Ogilvy

It wasn't mentioned because it isn't necessary.

It isn't necessary to check scripting runtime because it uses the builtin
collection rather than starting up another DLL.

==
Regards,
Tom Ogilvy
 
A

Alan Beban

Max said:
Alan, sorry, I just posted a response to your earlier reply which crossed
this. It **worked** and took only 15 sec to spit out the results !! Superb.

Just to check on how much the array functions from my library slowed
down the process, I coded the following. I didn't time it, but there
doesn't seem to be much difference in execution time. Most of the time
is taken up by loading the dictionary, whether that's directly or within
the ArrayUniques function. And the ArrayTranspose function, which is not
required in the code below, does not seem to add appreciable execution time.

FWIW,
Alan Beban

Sub abtest3()
Dim arr1(), arr2(), arrA(), arrB(), arrC(), arrD()
Dim rng As Range
Dim rngA As Range, rngB As Range, rngC As Range, rngD As Range
Set rng = Sheets(1).Range("A:D")
Set rngA = Sheets(2).Range("A:A")
Set rngB = Sheets(2).Range("B:B")
Set rngC = Sheets(2).Range("C:C")
Set rngD = Sheets(2).Range("D:D")
arr1 = rng
'Load the unique elements into a Dictionary Object
Set x = New dictionary
On Error Resume Next
For Each Elem In arr1
x.Add Item:=Elem, key:=CStr(Elem)
Next
x.Remove ("")
On Error GoTo 0

'Load a 0-based horizontal array with the unique
'elements from the Dictionary Object
arr2 = x.Items
x = UBound(arr2) - LBound(arr2) + 1
z = 65536
y = x - (x \ z) * z
Select Case x \ z
Case 0
Sheets(2).Range("A1:A" & y).Value = arr2
Case 1
ReDim arrA(1 To 65536, 1 To 1)
ReDim arrB(1 To y, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i - 1)
Next
For i = 1 To y
arrB(i, 1) = arr2(i - 1 + z)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & y).Value = arrB
Case 2
ReDim arrA(1 To z, 1 To 1)
ReDim arrB(1 To z, 1 To 1)
ReDim arrC(1 To y, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i - 1)
arrB(i, 1) = arr2(i - 1 + z)
Next
For i = 1 To y
arrC(i, 1) = arr2(i - 1 + 2 * z)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & z).Value = arrB
Sheets(2).Range("C1:C" & y).Value = arrC
Case 3
ReDim arrA(1 To z, 1 To 1)
ReDim arrB(1 To z, 1 To 1)
ReDim arrC(1 To z, 1 To 1)
ReDim arrD(1 To y, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i - 1)
arrB(i, 1) = arr2(i - 1 + z)
arrC(i, 1) = arr2(i + 1 + 2 * z)
Next
For i = 1 To y
arrD(i, 1) = arr2(i + 3 * z)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & z).Value = arrB
Sheets(2).Range("C1:C" & z).Value = arrC
Sheets(2).Range("A1:D" & y).Value = arrD
Case 4
Sheets(2).Range("A1:D" & z).Value = _
Sheets(1).Range("A1:D" & z).Value
End Select
End Sub
 
A

Alan Beban

Tom said:
It wasn't mentioned because it isn't necessary.

It isn't necessary to check scripting runtime because it uses the builtin
collection rather than starting up another DLL.

==
Regards,
Tom Ogilvy

Thanks. I didn't notice the use of the built-in collection in the
Developer Tip. It raises an interesting question whether the use of a
Dictionary, which could avoid the looping to fill the list box, might be
faster because of the NoDupes.Items property of the Dictionary.

Alan Beban
 
A

Alan Beban

Max said:
Alan, sorry, I just posted a response to your earlier reply which crossed
this. It **worked** and took only 15 sec to spit out the results !! Superb.

There's a bug in the SubProcedure abtest1 that you are using. If the
number of unique elements is an exact multiple of 65536, then in the
Select Case section of the code, y will equal 0 and the code will throw
an error. You need to modify the code to provide for this case.

Alan Beban
 
A

Alan Beban

Max said:
I won't know how to modify the code, Alan
Could you assist on this? Thanks

I tested only for the case when the number of unique elements is 131072
(2 full columns). I leave it to you to arrange the basic data so that it
has 65536 unique items and 196608 (i.e., 1 full column and 3 full
columns), and test those two cases.

Alan Beban

Sub abtest1()
Dim arr1(), arr2(), arrA(), arrB(), arrC(), arrD()
Dim rng As Range
Dim rngA As Range, rngB As Range, rngC As Range, rngD As Range
Set rng = Sheets(1).Range("A:D")
Set rngA = Sheets(2).Range("A:A")
Set rngB = Sheets(2).Range("B:B")
Set rngC = Sheets(2).Range("C:C")
Set rngD = Sheets(2).Range("D:D")
arr1 = rng
arr2 = ArrayUniques(arr1)
q = ArrayCount(arr2)
z = 65536
y = q - (q \ z) * z
Select Case q \ z
Case 0
If y = 0 Then
MsgBox rng.Address & "has no data."
Else
Sheets(2).Range("A1:A" & y).Value = arr2
End If
Case 1
If y = 0 Then
ReDim arrA(1 To 65536, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i, 1)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Else
ReDim arrA(1 To 65536, 1 To 1)
ReDim arrB(1 To y, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i, 1)
Next
For i = 1 To y
arrB(i, 1) = arr2(i + z, 1)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & y).Value = arrB
End If
Case 2
If y = 0 Then
ReDim arrA(1 To z, 1 To 1)
ReDim arrB(1 To z, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i, 1)
arrB(i, 1) = arr2(i + z, 1)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & z).Value = arrB
Else
ReDim arrA(1 To z, 1 To 1)
ReDim arrB(1 To z, 1 To 1)
ReDim arrC(1 To y, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i, 1)
arrB(i, 1) = arr2(i + z, 1)
Next
For i = 1 To y
arrC(i, 1) = arr2(i + 2 * z, 1)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & z).Value = arrB
Sheets(2).Range("C1:C" & y).Value = arrC
End If
Case 3
If y = 0 Then
ReDim arrA(1 To z, 1 To 1)
ReDim arrB(1 To z, 1 To 1)
ReDim arrC(1 To z, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i, 1)
arrB(i, 1) = arr2(i + z, 1)
arrC(i, 1) = arr2(i + 2 * z, 1)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & z).Value = arrB
Sheets(2).Range("C1:C" & z).Value = arrC
Else
ReDim arrA(1 To z, 1 To 1)
ReDim arrB(1 To z, 1 To 1)
ReDim arrC(1 To z, 1 To 1)
ReDim arrD(1 To y, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i, 1)
arrB(i, 1) = arr2(i + z, 1)
arrC(i, 1) = arr2(i + 2 * z, 1)
Next
For i = 1 To y
arrD(i, 1) = arr2(i + 3 * z, 1)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & z).Value = arrB
Sheets(2).Range("C1:C" & z).Value = arrC
Sheets(2).Range("A1:D" & y).Value = arrD
End If
Case 4
Sheets(2).Range("A1:D" & z).Value = _
Sheets(1).Range("A1:D" & z).Value
End Select
End Sub
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top