VeeDub

Possible bug - _ArraySort

17 posts in this topic

Hi,

I have a script that works with files and uses _ArraySort to identify the latest file.

Recently the script had some unexpected results and after examining the latest filename and the results from _ArraySort I can see what is happening.

Here is an example script

#include <Array.au3>

Dim $FILES[10000]

For $Count = 1 To 1000

    ; Zero-pad the counter to three digits (001 .. 999); 1000 keeps its four digits.
    $Incremental = String($Count)
    While StringLen($Incremental) < 3
        $Incremental = "0" & $Incremental
    WEnd

    $FILES[$Count] = "C_VOL-b001-i" & $Incremental & ".spi"

Next

; Sort in descending order, then show the result.
_ArraySort($FILES, 1)
_ArrayDisplay($FILES)

When $Incremental reaches 1000, rather than appearing at the top of the array (as I am sorting in descending order), the entry appears between 100 and 101 in the array (row 899 rather than row 0).

No doubt I can program around this, but I think it's a bug and I would like a second opinion before I formally log it.

Thanks

VW




The sort works fine: as far as I can tell from your array, it is doing a proper alphanumeric sort.

You need to put enough zeros in front of the actual number for it to sort in numeric order.

This will work properly up to 9999:

#include <Array.au3>

Dim $FILES[10000]

For $Count = 1 To 1002
    $Incremental = String($Count)
    $Incremental = StringRight("000" & $Incremental,4)
    $FILES[$Count] = "C_VOL-b001-i" & $Incremental & ".spi"

Next

_ArraySort($FILES, 1)
_ArrayDisplay($FILES)

Jos




@Jos

Thanks for taking a look at this.

I agree that adding the leading zeros makes the sort work as expected.

However, once I do that I am no longer working with the real filename, so I have to convert each entry back after the sort before continuing the processing. That is not the end of the world, but in practice the filenames can contain more than the simple pattern I used to demonstrate the sort behaviour, so it would be much simpler if I could avoid the conversion.

It's interesting that Windows Explorer displays the filenames in the correct order, without the leading zero being present.

VW


As it stands, I think the simplest solution is a 2D array, with one dimension being used to get the right sort result and the second dimension being used to store the real filename.

But to me this approach ought not to be necessary; really it's a work-around.
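For anyone following along, here is a minimal sketch of that 2D-array idea. It assumes the filenames follow the "-i<number>.spi" pattern from the first post; the sample names, the variable names and the 8-digit key width are made up for illustration. Column 0 holds a zero-padded sort key and column 1 holds the untouched filename, so nothing has to be converted back after the sort.

```autoit
#include <Array.au3>

; Hypothetical sample data in the pattern used in this thread.
Local $aNames[5] = ["C_VOL-b001-i099.spi", "C_VOL-b001-i1000.spi", "C_VOL-b001-i100.spi", "C_VOL-b001-i101.spi", "C_VOL-b001-i002.spi"]

; Column 0 = sort key, column 1 = the real filename.
Local $aFiles[UBound($aNames)][2]
Local $aMatch

For $i = 0 To UBound($aNames) - 1
    ; Extract the digits between "-i" and ".spi" (assumed naming pattern).
    $aMatch = StringRegExp($aNames[$i], "-i(\d+)\.spi$", 1)
    If @error Then
        ; No number found: an empty key ends up last in a descending sort.
        $aFiles[$i][0] = ""
    Else
        ; Left-pad to 8 digits so that string order matches numeric order.
        $aFiles[$i][0] = StringRight("00000000" & $aMatch[0], 8)
    EndIf
    $aFiles[$i][1] = $aNames[$i]
Next

; Descending sort on column 0 (the key); column 1 is carried along with it.
_ArraySort($aFiles, 1, 0, 0, 0)

ConsoleWrite("Latest file: " & $aFiles[0][1] & @CRLF)
```

With the sample data above, row 0 should hold C_VOL-b001-i1000.spi, and the real name is read straight from column 1.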


Try _ArraySortClib on this page

http://dundats.mvps.org/autoit/udf_code.aspx?udf=arrayx

Edited by GEOSoft

George


@GEOSoft

Thanks for your suggestion.

Unfortunately _ArraySortClib has the same behaviour as _ArraySort

VW


If you search Example Scripts I'm pretty sure SmOke_N posted a numeric sort. I have the function but I'm not sure where at the moment.


George


@GEOSoft,

Thanks for your suggestion. I spent some time searching and found some old threads and some old scripts (some of which no longer work, no doubt because of changes to AutoIt), but could not find any thread from SmOke_N. I haven't been on the forums much recently; maybe some of the older material was archived when the forums switched to this new format.

Anyway, it doesn't matter. I'm going to do what I said earlier and use a 2D array. It's a kludge really, but thanks to Jos's feedback the work-around to achieve the correct sort order is no big deal. The performance will be perfectly acceptable for my needs, and really I'm after a low-effort fix to my current script, which I expect this will be.

Cheers

VW


I do not know why your Explorer shows the correct sequence (maybe it is sorted on creation date?), but you have to agree that in an alphanumeric sort the filename that contains 1000 is shown at the correct position in your example:

[897]|C_VOL-b001-i102.spi
[898]|C_VOL-b001-i101.spi
[899]|C_VOL-b001-i1000.spi
[900]|C_VOL-b001-i100.spi
[901]|C_VOL-b001-i099.spi

Agree that adding a second column in your array with a key to sort on is probably the best approach.

Jos



This is nothing new. Lexicographic collation is a different beast from numeric string collation: for a lexicographic collation, '100' < '11' and '9' > '10'. There is no definitive solution to this problem: _you_ have to add enough leading zeroes to every numeric part to have them sorted consistently by a lexicographic sort. The numeric part could be anywhere: leading, in the middle, at the end. How could a general-purpose sort determine this? And what would happen to a bunch of items consisting of, say, random hex values?
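A quick way to see the two orderings side by side, as a small sketch using only StringCompare and Number. The exact return values are not shown because only the sign matters here.

```autoit
; Compared as strings, character by character from the left:
; "100" vs "11": "1" = "1", then "0" < "1", so "100" sorts first.
ConsoleWrite("100 vs 11 as strings: " & StringCompare("100", "11") & @CRLF) ; negative
; "9" vs "10": "9" > "1", so "9" sorts last.
ConsoleWrite("9 vs 10 as strings:   " & StringCompare("9", "10") & @CRLF)   ; positive

; Converted to numbers, the same values compare by magnitude.
ConsoleWrite("100 < 11 as numbers:  " & (Number("100") < Number("11")) & @CRLF) ; False
ConsoleWrite("9 > 10 as numbers:    " & (Number("9") > Number("10")) & @CRLF)   ; False

; The filenames from this thread: after the common prefix "i10", the next
; characters are "0", "0" and "1", and at the following position "." sorts
; before "0", so the string order is i100 < i1000 < i101.
ConsoleWrite(StringCompare("C_VOL-b001-i100.spi", "C_VOL-b001-i1000.spi") & @CRLF) ; negative
ConsoleWrite(StringCompare("C_VOL-b001-i1000.spi", "C_VOL-b001-i101.spi") & @CRLF) ; negative
```

That is exactly the order reported in the first post, just reversed by the descending sort.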

Unicode makes things even worse: it would be consistent for a numeric-aware string sort to treat decimal digits as equivalent, irrespective of the character (Unicode codepoint) that their own language uses to represent them. I mean that Devanagari digits (Unicode codepoints 0x000966 to 0x00096F) actually have the exact same numeric string sorting value as ASCII '0' to '9'. A numerical string sort is sorting a concept that relies (in the most common case) on a decimal representation and the positional convention (increasing powers of ten from right to left). In this view, '¹²³' = '123' = '१२३' (Devanagari digits) = '൧൨൩' (Malayalam digits) ...

BTW, I have written an SQLite extension for dealing with non-latin text and one of the collations found there handles this correctly for all known stable Unicode decimal digits representations (AFAIK there are exactly 40 of them as per Unicode v5.1). Non-decimal and/or non-positional numeric systems offer another challenge.



The algorithm you want is called natural sort. I found one from wraithdu:

http://www.autoitscript.com/forum/index.php?showtopic=83626



Nice one .. never saw that before, and it seems to work very nicely on this example.

The only thing is that it's slower, but that doesn't have to be a real issue.

Jos



I switched to using _ArraySortClib() by Siao in SMF, maybe give that one a try too ;)...

... I am in a 64Bit OS and this doesn't seem to work for it:

; Return Value(s): Success = Returns 1
;                  Failure = Returns 0 and sets error:
;                      @error 1 = invalid array
;                      @error 2 = invalid param
;                      @error 3 = dll error
;                      @error 64 = 64-bit AutoIt unsupported



You're right, 32bit only. I'm running it in 32bit mode anyhow because of all the necessary dll-includes (Mediainfo, 7zip & TrIDLib) ;) ...


Well, "natural" is completely domain- and culture-dependent; that's its problem.

BTW, the linked implementation seems terrible to me. I believe run times can be much shorter by using a single regexp to split the parts (I guess that up to 9 parts should fit "most" needs).
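To give an idea of what a regexp-based approach might look like, here is a rough sketch. It is not the linked natural sort, and it is not quite the single split regexp described above either: it swaps in a related trick, where two StringRegExpReplace calls zero-pad every run of digits and the padded string is then used as the key for a plain _ArraySort. The helper name _PadDigitRuns, the sample names and the 10-digit width are assumptions for illustration; digit runs longer than 10 digits are not handled, and only ASCII digits are in mind here.

```autoit
#include <Array.au3>

; Hypothetical sample data in the pattern used in this thread.
Local $aNames[4] = ["C_VOL-b001-i100.spi", "C_VOL-b001-i1000.spi", "C_VOL-b001-i099.spi", "C_VOL-b001-i101.spi"]

; Column 0 = padded sort key, column 1 = the real filename.
Local $aSort[UBound($aNames)][2]
For $i = 0 To UBound($aNames) - 1
    $aSort[$i][0] = _PadDigitRuns($aNames[$i])
    $aSort[$i][1] = $aNames[$i]
Next

; Ordinary descending string sort on the key column.
_ArraySort($aSort, 1)

For $i = 0 To UBound($aSort) - 1
    ConsoleWrite($aSort[$i][1] & @CRLF)
Next

; Hypothetical helper: left-pad every run of digits in $sText to 10 digits.
; Assumption: no run is longer than 10 digits.
Func _PadDigitRuns($sText)
    ; Step 1: put ten zeros in front of each digit run.
    Local $sKey = StringRegExpReplace($sText, "(\d+)", "0000000000$1")
    ; Step 2: strip surplus leading zeros so each run keeps its last 10 digits.
    Return StringRegExpReplace($sKey, "0*(\d{10})", "$1")
EndFunc
```

With the sample data this should list i1000 first, then i101, i100 and i099. The numeric part can sit anywhere in the name, but the culture and Unicode caveats raised earlier in the thread still apply.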

Another (awfully closely related) issue is the fate of non-ASCII characters. There you enter the unsolvable question of worldwide collation. This isn't a technical issue, it actually is a political one.



Here is the link for SmOke_N's number sort:

http://www.autoitscript.com/forum/index.php?showtopic=95383&view=findpost&p=685701


George
