Sorting

J

Jake

Hi,
I'm working on table design for a primary key that allows sorting a text
field. The field contains values such as 9.21, 9.21(a), 9.21(a)(ii),
9.21(a)(iii), 9.21(a)(iv) etc. I have converted these to integer aliases to
sort, but I was wondering if there is a better way?
Thanks,
Jake
 
J

Jake

Sorry, I wasn't clear; the field contains values such as 9.1001(a), 9.101(a)
which sort 9.1001 before 9.101.
Jake
 
D

Duane Hookom

I would either
-go back and split the single field into two or three
-create a function that takes the value and returns a sortable value
 
A

Allen Browne

Jake, that's really messy data, particularly with the alpha characters,
Roman numerals, brackets, et al.

The function below illustrates how to sort with numeric data of this kind.
If the field is named ID, you would end up with a query that ends:
ORDER BY PointSort([ID])

But in your case, you will need to use Replace to convert the opening
brackets to periods and to drop the closing brackets, convert the
alpha/roman to real numbers before you pass the string into the PointSort()
function. The function also assumes no points larger than 9999.

HTH.

Public Function PointSort(varInput As Variant, Optional strDelim As String =
".") As Variant
'Purpose: Sort on a field such as "5.11.2"
Dim strOut As String
Dim i As Integer
Dim varArray As Variant

PointSort = Null 'Initialize to null

If Not (IsError(varInput) Or strDelim = vbNullString) Then
If Not IsNull(varInput) Then
varArray = Split(varInput, strDelim)
For i = LBound(varArray) To UBound(varArray)
strOut = strOut & Format(varArray(i), "0000")
Next
If strOut <> vbNullString Then
PointSort = strOut
End If
End If
End If
End Function
 
J

Jake

Thanks very much Allen and Duane for your kind advice. I had considered
creating a field, a surrogate, for lack of a better term, that would convert
from a text field to an integer, substituting integers for the
sub-paragrahs. For example convert 9.1001(a)(i) to 9100111 etc. I believe
Allen's function accomplishes that. This field is a primary key in the main
table and a foreign key in most of the other tables in the database. I can
display the 9.1001(a)(i) to the user but use the surrogate to relate to
other tables and as master child for sub forms and reports. When the user
adds new records he/she would need to enter the surrogate key however. But
would this be inefficient? Would I be better to use the PointSort function
and not establish a surrogate?
thanks
Jake
Allen Browne said:
Jake, that's really messy data, particularly with the alpha characters,
Roman numerals, brackets, et al.

The function below illustrates how to sort with numeric data of this kind.
If the field is named ID, you would end up with a query that ends:
ORDER BY PointSort([ID])

But in your case, you will need to use Replace to convert the opening
brackets to periods and to drop the closing brackets, convert the
alpha/roman to real numbers before you pass the string into the
PointSort() function. The function also assumes no points larger than
9999.

HTH.

Public Function PointSort(varInput As Variant, Optional strDelim As String
= ".") As Variant
'Purpose: Sort on a field such as "5.11.2"
Dim strOut As String
Dim i As Integer
Dim varArray As Variant

PointSort = Null 'Initialize to null

If Not (IsError(varInput) Or strDelim = vbNullString) Then
If Not IsNull(varInput) Then
varArray = Split(varInput, strDelim)
For i = LBound(varArray) To UBound(varArray)
strOut = strOut & Format(varArray(i), "0000")
Next
If strOut <> vbNullString Then
PointSort = strOut
End If
End If
End If
End Function

--
Allen Browne - Microsoft MVP. Perth, Western Australia.

Reply to group, rather than allenbrowne at mvps dot org.

Jake said:
Sorry, I wasn't clear; the field contains values such as 9.1001(a),
9.101(a) which sort 9.1001 before 9.101.
Jake
 
T

TC

Stuffin' google! I hit the posting limit & coulnd't reply.

I'd create an extra field that was normalized to allow proper sorting.
(I'm using "normalized" here to mean, translated into a standard form.)
Eg:

original extra
9.1001(a) 9001001a
9.101(a) 9000101a
etc.

You'd just need to keep that field updated, if & when you updated the
original field.

HTH,
TC
 
A

Allen Browne

You probably know that storing dependent data is a no-no, but if you have
lots of data (tens of thousands of records) and you are constantly
seeking/parsing/sorting on this field, it might be worth the trouble.

--
Allen Browne - Microsoft MVP. Perth, Western Australia.

Reply to group, rather than allenbrowne at mvps dot org.

Jake said:
Thanks very much Allen and Duane for your kind advice. I had considered
creating a field, a surrogate, for lack of a better term, that would
convert from a text field to an integer, substituting integers for the
sub-paragrahs. For example convert 9.1001(a)(i) to 9100111 etc. I believe
Allen's function accomplishes that. This field is a primary key in the
main table and a foreign key in most of the other tables in the database.
I can display the 9.1001(a)(i) to the user but use the surrogate to relate
to other tables and as master child for sub forms and reports. When the
user adds new records he/she would need to enter the surrogate key
however. But would this be inefficient? Would I be better to use the
PointSort function and not establish a surrogate?
thanks
Jake
Allen Browne said:
Jake, that's really messy data, particularly with the alpha characters,
Roman numerals, brackets, et al.

The function below illustrates how to sort with numeric data of this
kind. If the field is named ID, you would end up with a query that ends:
ORDER BY PointSort([ID])

But in your case, you will need to use Replace to convert the opening
brackets to periods and to drop the closing brackets, convert the
alpha/roman to real numbers before you pass the string into the
PointSort() function. The function also assumes no points larger than
9999.

HTH.

Public Function PointSort(varInput As Variant, Optional strDelim As
String = ".") As Variant
'Purpose: Sort on a field such as "5.11.2"
Dim strOut As String
Dim i As Integer
Dim varArray As Variant

PointSort = Null 'Initialize to null

If Not (IsError(varInput) Or strDelim = vbNullString) Then
If Not IsNull(varInput) Then
varArray = Split(varInput, strDelim)
For i = LBound(varArray) To UBound(varArray)
strOut = strOut & Format(varArray(i), "0000")
Next
If strOut <> vbNullString Then
PointSort = strOut
End If
End If
End If
End Function
 
T

TC

I think this is one of those cases. Personally I'd rather carry the
overhead of the extra field (and attendan risk of letting it get out of
synch), rather than a repeated runtime overhead of recalculating the
sortable values everey time I wanted to do the sort.

Cheers,
TC
 
T

TC

Or, here's another (and possibly better) idea. Say your numbering
scheme is currently limited to a maximum of three levels:

1. the main par number (7, 8, 9 etc.)
2. the sub-par number (101, 1001, etc.)
3. the par letter (a, b, c etc.)

Maybe, if you are confident that you are unlikely to ever need more
than three level, you could define a sperate field for each level.
Those three fields together would form the composite primary key for
the table (if your current single field is currently the primary key).
Sorts woiuld be simple (because now you have three seperate fields).
And you could concatenate those fields at runtime (perhaps in a query)
to display on forms & reports. If the composite primary key became
unweildy, you could have an autonumber PK, and make the three fields
no-key attributes.

Yes? No?

HTH,
TC
 
T

Tim Ferguson

You probably know that storing dependent data is a no-no, but if you
have lots of data (tens of thousands of records) and you are
constantly seeking/parsing/sorting on this field, it might be worth
the trouble.

In terms of longterm robustness, it would be worth thinking about
changing the design:

"translate" all these messy string identifiers into a rational scheme,
for example a long integer or a properly formatted string

store the sensible key, so that it will sort etc properly

parse it back into the brackets and things for display


If you start with a record 9.1001(a)(i), rewrite it as either 9100111 or
"9100111" [1] and update all the FKs in all the other tables. Then use a
format or a public function to traslate it back into "9.1001(a)(i)".

[1] The difference lies in how you manage depth of nesting. If you
suddenly decide you need a fifth level, say 9.1001(a)(i)(1), then you'll
have to multiply all the numbers by 10, whereas the string version uses
left-to-right character compares and will cope quite happily with and
extra bit on the end.

Whatever you decide, don't do anything till you have had a long and hard
think about your design needs. You really don't want to have to
restructure an important PK in a production database more than once!

All the best


Tim F
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top