array search optimization


frank10
How do you think regexp works? By magic? Really.

Both approaches do pretty much the same thing: they check each character. The difference is that a regexp engine can sometimes skip a few characters. Contrary to popular belief, regexp isn't science fiction, it's just awesome.



EDIT:

The same was true for StringInStr(): that string was near the start of the variable, so it was also very fast!

If the matching string is near the beginning of the variable, StringInStr is even slightly faster than RegExp.

This was bothering me when I got pulled away. If you are doing a char-by-char match, StringInStr should be just as fast as an SRE (I think...).

 

I want to clarify it a bit.

When I said StringInStr was faster than RegExp, I was referring to point 1) of my post 11.

In that case I made the same mistake of reading only the leading digits: the value was actually in e-005 notation, so the 3.2 s slowdown never happened.

But apart from that misunderstanding, I can confirm that with RegExp, when the matching string is anywhere other than the beginning of the list, I get a search that is roughly 40-80x faster.

 

Using this:

Local $sSearchStr = "mysearchID..."

; Escape every non-word character so the search string is matched literally
Local $st = TimerInit()
Local $iMatch = StringRegExp($myList, StringRegExpReplace($sSearchStr, "[\W]", "\\$0"))
ConsoleWrite("StringRegExp search sec:" & Round(TimerDiff($st) / 1000, 4) & @CRLF)

; Plain substring search on the same list ($myList is defined earlier in the script)
$st = TimerInit()
Local $result = StringInStr($myList, $sSearchStr)
ConsoleWrite("StringInStr      search sec:" & Round(TimerDiff($st) / 1000, 4) & @CRLF)

For example,

searching a string at the beginning of the list, I get: (only here I used more decimal digits...)

StringRegExp search sec: 0.001147

StringInStr      search sec: 0.000028

in the middle of the list, I get:

StringRegExp search sec: 0.0012

StringInStr      search sec: 0.0506

by the end of the list, I get:

StringRegExp search sec: 0.002

StringInStr      search sec: 0.1572

StringInStr is faster only when the match is at the very beginning of the list, let's say up to ID n° 100 out of a total of 35000 IDs.

(This corresponds to about 3300 chars.)

For all IDs > 100, RegExp is faster, a lot faster, and quite consistent: always about 0.001 to 0.002 seconds.
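One thing that may be worth checking before reading too much into these timings: StringInStr defaults to a case-insensitive comparison (casesense = 0), while a StringRegExp pattern is case-sensitive unless it carries (?i), so the two calls above are not doing quite the same work. The sketch below times all three variants on a synthetic list. The list contents, the ".ext" suffix and the variable names are assumptions made up for illustration, not the real 35000-ID list.

```autoit
; Sketch only: synthetic list, not the data from this thread
Local $sList = "|"
For $i = 1 To 35000
    $sList &= "ID" & $i & ".ext|"
Next
Local $sSearch = "|ID30000.ext|" ; an entry near the end of the list

; Case-sensitive regexp, search string quoted literally with \Q...\E
Local $hTimer = TimerInit()
Local $iRegExp = StringRegExp($sList, "\Q" & $sSearch & "\E")
ConsoleWrite("StringRegExp             sec: " & Round(TimerDiff($hTimer) / 1000, 6) & "  result=" & $iRegExp & @CRLF)

; StringInStr with the default casesense = 0 (case-insensitive)
$hTimer = TimerInit()
Local $iDefault = StringInStr($sList, $sSearch)
ConsoleWrite("StringInStr (default)    sec: " & Round(TimerDiff($hTimer) / 1000, 6) & "  result=" & $iDefault & @CRLF)

; StringInStr with casesense = 1 (case-sensitive, comparable to the regexp above)
$hTimer = TimerInit()
Local $iCase = StringInStr($sList, $sSearch, 1)
ConsoleWrite("StringInStr (casesense 1) sec: " & Round(TimerDiff($hTimer) / 1000, 6) & "  result=" & $iCase & @CRLF)
```

If the gap shrinks a lot with casesense = 1, then most of the difference is the case-insensitive comparison rather than the search method itself. Run it several times, since a single timing can be noisy.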


czardas,

To deal with special chars you can use \Q...\E

Local $iMatch = StringRegExp("|" & $sDelimString & "|", "\Q|" & $sSearchStr & "|\E")
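A small sketch of the two ways of neutralising metacharacters discussed in this thread, side by side: quoting with \Q...\E and backslash-escaping every non-word character with StringRegExpReplace. The file names and the $sPatternQE / $sPatternEsc variable names are hypothetical, chosen only because they contain characters that are special in a pattern.

```autoit
; Hypothetical data, for illustration only
Local $sDelimString = "a+b.txt|notes(1).txt|readme.md"
Local $sSearchStr = "notes(1).txt"

; Option 1: quote the whole literal with \Q...\E
Local $sPatternQE = "\Q|" & $sSearchStr & "|\E"

; Option 2: put a backslash in front of every non-word character
; ("\\" in the replacement is a literal backslash, $0 is the matched character)
Local $sPatternEsc = StringRegExpReplace("|" & $sSearchStr & "|", "\W", "\\$0")

Local $sSubject = "|" & $sDelimString & "|"
ConsoleWrite("\Q...\E pattern: " & $sPatternQE & "  match=" & StringRegExp($sSubject, $sPatternQE) & @CRLF)
ConsoleWrite("escaped pattern: " & $sPatternEsc & "  match=" & StringRegExp($sSubject, $sPatternEsc) & @CRLF)
```

Both patterns should treat the name as plain text. One caveat with \Q...\E: if the search string itself contains \E, the quoting ends early. That should not happen with Windows file names, which cannot contain a backslash, but it can with arbitrary input.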

 

Oh yeah - 'disable metacharacters'. I thought there would be a shortcut to this. Thx :).

Edited by czardas

@czardas

But your StringRegExpReplace escaping inside RegExp is faster than \Q...\E.

With \Q...\E the time goes from 0.0019 to 0.0032 seconds.

 

That's interesting! I'll have to run some tests this weekend. Varied input can sometimes produce results that are difficult to interpret. Longer and more random search patterns might have the opposite effect. Here the search patterns are typical file names which are quite short and also mainly alphanumeric. Not all data is so easy to parse or manipulate. I'm guessing there's some truth to this - further testing is required.

Edit

One more thing you need to be aware of: you should be using the case-insensitive version of the pattern (in post 16), because file names are not case sensitive. Also, both the search string pattern and the string $myList in post 23 must start and end with the delimiter "|". Look again at the examples in this thread. You need to match an exact file name, not part of a name (by accident).
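To make both points concrete, here is a minimal sketch. It assumes the case-insensitive pattern is simply the \Q...\E pattern with a (?i) prefix (the pattern from post 16 is not reproduced here), and the list contents are hypothetical.

```autoit
; Hypothetical list that already starts and ends with the "|" delimiter
Local $myList = "|Report.TXT|report_old.txt|summary.doc|"
Local $sSearchStr = "report.txt"

; (?i) makes the match case-insensitive; the "|" on both sides forces a whole-name match
Local $sPattern = "(?i)\Q|" & $sSearchStr & "|\E"
ConsoleWrite("exact name, delimited:   " & StringRegExp($myList, $sPattern) & @CRLF)

; "report" is only part of a name, so the delimited pattern should not find it
ConsoleWrite("partial name, delimited: " & StringRegExp($myList, "(?i)\Q|report|\E") & @CRLF)

; Without the delimiters the partial name matches by accident
ConsoleWrite("partial name, bare:      " & StringRegExp($myList, "(?i)\Qreport\E") & @CRLF)
```

The last call is the accidental match described above: without the surrounding "|", any name that merely contains the search text counts as a hit.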

Edited by czardas