New option for StringInStr?


43 replies to this topic

#21 KJohn

KJohn

    Universalist

  • Active Members
  • 461 posts

Posted 07 January 2008 - 02:44 PM

It's a decent suggestion, I'll add it to my list.



Ummmmm...... Why is this in the rejected subforum then???
Edit: Has since been moved into the 'AutoIt Feature Requests (Possible)' subforum.

Edited by Koshy John, 12 January 2008 - 03:51 PM.








#22 Valik

Valik

    Former developer.

  • Active Members
  • 18,879 posts

Posted 11 January 2008 - 11:47 PM

It's a decent suggestion, I'll add it to my list.

Allow me to translate what Jon said:
"It's a decent suggestion, I'll add it to my list of things I'll do eventually but probably not get around to." :)

#23 KJohn

KJohn

    Universalist

  • Active Members
  • 461 posts

Posted 12 January 2008 - 03:47 PM

Allow me to translate what Jon said:
"It's a decent suggestion, I'll add it to my list of things I'll do eventually but probably not get around to." :P


:D was that a joke on how long this has been ignored??
I've really been waiting patiently for ages. I really don't see why this would take more than 5 minutes to implement (excluding documentation), since even internally there are unlikely to be dependencies that will break by the addition of 2 optional parameters. Of course, I've not seen the actual AutoIt code, but I can say so from my experience with C++...

Anyway, I hope that was a joke... considering that AutoIt will only become more powerful by implementing this, allowing scripters to write more efficient scripts...

#24 Jon

Jon

    Up all night to get lucky

  • Administrators
  • 10,630 posts

Posted 12 January 2008 - 04:46 PM

Other changes have been more important, and given the limited time I have to code I have to work on things I think more important first. Sorry.

#25 Valik

Valik

    Former developer.

  • Active Members
  • 18,879 posts

Posted 12 January 2008 - 04:50 PM

It was just my way of prodding Jon, "Get this shit done so we can stop using this stupid forum thing". Every once in a while I have to prod Jon along with incessant nagging (my specialty).

#26 KJohn

KJohn

    Universalist

  • Active Members
  • 461 posts

Posted 23 March 2008 - 03:34 PM

Is this in limbo because I haven't added it to the trac system? This has been in-waiting for so so so so long... 8 months! :)

#27 Jon

Jon

    Up all night to get lucky

  • Administrators
  • 10,630 posts

Posted 23 March 2008 - 06:06 PM

It's still on my personal todo list, regardless of whether it's in trac.

I just reread the original post though, and I am baffled by the "limit" parameter - I just don't see why it's there.

"start" is simply the character position in the string to begin the search, yes? If so I can probably add that quickly.

Edit: I've done the "start" bit.

#28 SmOke_N

SmOke_N

    It's not what you know ... It's what you can prove!

  • Moderators
  • 16,014 posts

Posted 23 March 2008 - 10:29 PM

It's still on my personal todo list, regardless of whether it's in trac.

I just reread the original post though, and I am baffled by the "limit" parameter - I just don't see why it's there.

"start" is simply the character position in the string to begin the search, yes? If so I can probably add that quickly.

Edit: I've done the "start" bit.

Cool about the "start", I'm sure the "limit" was for "end character position". Default being end of string, and "limit" being chars from start (Sort of a StringMid type I would suppose).

Edited by SmOke_N, 23 March 2008 - 10:30 PM.

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.


#29 SmOke_N

SmOke_N

    It's not what you know ... It's what you can prove!

  • Moderators
  • 16,014 posts

Posted 23 March 2008 - 11:50 PM

This is what I'm assuming he's looking for or hoping for:
AutoIt
Global $sString1 = "abcdefghijklmnopqrstuvwxyz", $sString2 = "efghijkl"
Global $bExam[4]
$bExam[0] = _StringInStrEx($sString1, $sString2);Beginning to end search not case sensitive
$bExam[1] = _StringInStrEx($sString1, $sString2, 0, 1, 5, -1);5 chars from beginning to end of string search
$bExam[2] = _StringInStrEx($sString1, $sString2, 0, 1, 5, 8); 5 chars from beginning of string to 13 chars from beginning of string
$bExam[3] = _StringInStrEx($sString1, $sString2, 0, 1, 6);start search from sixth char of string (should return 0/False as no match in this example)
For $i = 0 To 3
    If $bExam[$i] > 0 Then $bExam[$i] = 1
    MsgBox(64, "bExam[" & $i & "]", ($bExam[$i] = 1))
Next

Func _StringInStrEx($sString1, $sString2, $iCase = 0, $iOccurence = 1, $nStart = 1, $nEnd = -1)
    ;Do we only need a regular StringInStr func
    If ($nStart = 1 Or $nStart = Default) And ($nEnd = -1 Or $nEnd = Default) Then
        Return StringInStr($sString1, $sString2, $iCase, $iOccurence)
    EndIf

    ;Configure start/end expression
    Local $sEnd = "(.{" & $nEnd & "}).*?(?m:$)", $sStart = ".{" & ($nStart -1) & "}"
    If ($nEnd = Default Or $nEnd = -1) Then $sEnd = "(.*?)(?m:$)"
    If ($nStart = 1 Or $nStart = Default) Then $sStart = "^"

    ;Pull out start to end string for string one
    Local $aSRE = StringRegExp($sString1, "(?s)" & $sStart & $sEnd, 1)
    If IsArray($aSRE) = 0 Then Return SetError(1, 0, 0)
    Return StringInStr($aSRE[0], $sString2, $iCase, $iOccurence)
EndFunc
The only one that should fail should be the last one [3]



#30 KJohn

KJohn

    Universalist

  • Active Members
  • 461 posts

Posted 24 March 2008 - 04:01 AM

A detailed explanation for the limit parameter is in post number 6 in this thread. Here's the link to quickly read it: http://www.autoitscript.com/forum/index.ph...mp;#entry395410
Limit is important because it will limit the number of comparisons - translates to faster performance. If it reduces the confusion, Zedna had once suggested calling the "limit" parameter "end".

Smoke_N seems to get the idea - implementing it as a UDF is easy but defeats the point, which is performance. The whole argument for pushing it into the AutoIt binary (as detailed in the various posts in this thread) is the massive increase in performance when it is coded directly in C++.

If it makes your work any easier, I can put up how I would implement it in C++ and you can just make changes as it suits you. I can also draft documentation for the help file.

Anything to get this into a release in the near future!

Edited by Koshy John, 24 March 2008 - 04:03 AM.


#31 SmOke_N

SmOke_N

    It's not what you know ... It's what you can prove!

  • Moderators
  • 16,014 posts

Posted 24 March 2008 - 04:11 AM

Sounds more like a chunk comparison to me then... guess I didn't really get it after all.

Edit for clarity:
I didn't understand the "limit/end" part, I was assuming it was characters after the starting position, not the number of chunks to loop through.

Edited by SmOke_N, 24 March 2008 - 04:16 AM.



#32 Valik

Valik

    Former developer.

  • Active Members
  • 18,879 posts

Posted 24 March 2008 - 04:31 AM

If I understand you correctly, you're saying "I have a string 100 characters in length. I want to search for "ab" in the range 70 - 80 of this string." Okay, if that's what you want, then extract that character range with StringMid() and use StringInStr() on that shortened string. Why must StringInStr() do 100% of the work when there are other tools to share the workload? I realize your issue is performance. But we can't bloat up every function just for the sake of performance. I haven't looked at the implementation of StringInStr() but unless it's stupidly simple to add a limit parameter, IMO, you should be using StringMid() to pluck out the range and then StringInStr() on the new string.
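
The StringMid() then StringInStr() approach described above can be sketched roughly as follows. This is only an illustration: the helper name _InStrRange and its parameters are made up for this example, and it assumes 1-based, inclusive start and end positions. Note that the position found in the extracted slice has to be shifted back to refer to the full string.

```autoit
; Hypothetical helper (not part of AutoIt): search for $sFind only within
; characters $iStart..$iEnd (1-based, inclusive) of $sText.
; Returns the position in the FULL string, or 0 if not found in that range.
Func _InStrRange($sText, $sFind, $iStart, $iEnd)
    ; Pluck out the range first...
    Local $sSlice = StringMid($sText, $iStart, $iEnd - $iStart + 1)
    ; ...then run an ordinary (case-insensitive) search on the shorter string.
    Local $iPos = StringInStr($sSlice, $sFind)
    If $iPos = 0 Then Return 0
    ; Convert the slice-relative position back to a full-string position.
    Return $iPos + $iStart - 1
EndFunc

; "fg" starts at character 6 of the full string, which lies inside 5..9.
ConsoleWrite(_InStrRange("abcdefghij", "fg", 5, 9) & @CRLF) ; 6
; "ab" exists in the string, but not inside the range 5..9.
ConsoleWrite(_InStrRange("abcdefghij", "ab", 5, 9) & @CRLF) ; 0
```

The cost of this version is the temporary copy made by StringMid(), which is the overhead the native parameters would avoid.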

#33 KJohn

KJohn

    Universalist

  • Active Members
  • 461 posts

Posted 24 March 2008 - 10:00 AM

If I understand you correctly, you're saying "I have a string 100 characters in length. I want to search for "ab" in the range 70 - 80 of this string." Okay, if that's what you want, then extract that character range with StringMid() and use StringInStr() on that shortened string. Why must StringInStr() do 100% of the work when there are other tools to share the workload? I realize your issue is performance. But we can't bloat up every function just for the sake of performance. I haven't looked at the implementation of StringInStr() but unless it's stupidly simple to add a limit parameter, IMO, you should be using StringMid() to pluck out the range and then StringInStr() on the new string.


I perfectly understand your hesitance to bloat functions. But I have pondered a lot over a lot of things like backward compatibility and scope of use before even suggesting it. I'm sure it would be invaluable for a lot of AutoIt scripters since StringInStr is a very basic function and any improvement performance-wise can only be beneficial.

My suggestions were put forward after carefully analyzing what would be useful to the widest range of scripters. What I really want from my suggestions being implemented is only a small fraction of that:

Assume the string "supercalifragilisticexpialidocious", and that the substring I'm searching for is "viva". All I want is for StringInStr to tell me whether the word starts with "viva", and to do so as soon as possible (I know I should use StringMid for this, but surprisingly StringInStr is much faster even with more comparisons). What StringInStr does now is perform about StringLen("supercali...") - StringLen("viva") = (34 - 4) = 30 comparisons before telling me that the word is not in the "super"-string. With the new parameters, I can do it in 1 comparison.
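
A rough sketch of the "starts with" check being asked for, under a few assumptions: the helper names _StartsWith and _StartsWithLeft are invented for this example, the start/count form of StringInStr() is the one discussed in this thread (so it needs a build that has those two extra parameters), and an empty prefix is not handled specially.

```autoit
; Hypothetical helper: True if $sText begins with $sPrefix (case-insensitive).
; Uses the proposed start/count parameters so that only the first
; StringLen($sPrefix) characters are ever examined.
Func _StartsWith($sText, $sPrefix)
    Return StringInStr($sText, $sPrefix, 0, 1, 1, StringLen($sPrefix)) = 1
EndFunc

; The same check without the extra parameters: copy the leading characters
; with StringLeft() and compare (the = operator is case-insensitive for strings).
Func _StartsWithLeft($sText, $sPrefix)
    Return StringLeft($sText, StringLen($sPrefix)) = $sPrefix
EndFunc

Local $sWord = "supercalifragilisticexpialidocious"
ConsoleWrite(_StartsWith($sWord, "viva") & @CRLF)      ; False
ConsoleWrite(_StartsWith($sWord, "super") & @CRLF)     ; True
ConsoleWrite(_StartsWithLeft($sWord, "viva") & @CRLF)  ; False
```

Both versions look at no more than the first four characters for "viva"; the difference is whether a temporary string is created along the way.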

Now take into account that I have, lets say, 5000 superstrings to consider. You do the math. The performance implications are massive.

All these arguments and counter-arguments have been considered in the posts leading up to this one and they are being brought up again only because everyone's forgotten what they were. I am repeating it again because I really really want to see this included, not only for my selfish ends, but also coz I believe it will be useful for the community of AutoIt scripters.

I finally understand that the decision rests entirely with Jon and the AutoIt team; and I hope you agree to my proposal. Jon had approved this a long time back but I'm restating the above so that this is not taken back to the decision table.

#34 Jon

Jon

    Up all night to get lucky

  • Administrators
  • 10,630 posts

Posted 24 March 2008 - 10:02 AM

Sounds more like a chunk comparison to me then... guess I didn't really get it after all.

Edit for clarity:
I didn't understand the "limit/end" part, I was assuming it was characters after the starting position, not the number of chunks to loop through.

I _think_ it's the same thing. The number of "chunks" is effectively the "length" which is effectively end-start.

#35 Jon

Jon

    Up all night to get lucky

  • Administrators
  • 10,630 posts

Posted 24 March 2008 - 10:09 AM

If it makes your work any easier, I can put up how I would implement it in C++ and you can just make changes as it suits you. I can also draft documentation for the help file.

Not really :)

Not unless your code takes Unicode and case sensitivity across locales into account (this code is not a simple "if string[i] = search[j]"). It uses different APIs and runtime comparison functions depending on the setup :)

I've not got a problem with the implementation, just struggling to understand exactly what you want for the parameters and why. I am thinking along the same lines as Valik about StringMid/StringLeft and coming to terms with the fact this is a pure performance related request and not anything to do with missing functionality.

#36 Jon

Jon

    Up all night to get lucky

  • Administrators
  • 10,630 posts

Posted 24 March 2008 - 10:15 AM

The code is pretty trivial to change to add a start pos (already done) and I can add a "count" parameter (to be consistent with other string functions). Once I've done that it should be fairly easy to write an example to compare performance.

#37 Jon

Jon

    Up all night to get lucky

  • Administrators
  • 10,630 posts

Posted 24 March 2008 - 11:54 AM

The difference in performance is fairly small:

AutoIt
Sleep(100)

$time = TimerInit()
For $i = 1 to 200000
    StringInStr("supercalifragilisticexpialidocious", "viva", 0, 1)
Next
$time1diff = Round(TimerDiff($time) / 1000, 2)

$time = TimerInit()
For $i = 1 to 200000
    StringInStr( StringMid("supercalifragilisticexpialidocious", 1, 4), "viva", 0, 1)
Next
$time2diff = Round(TimerDiff($time) / 1000, 2)

$time = TimerInit()
For $i = 1 to 200000
    StringInStr("supercalifragilisticexpialidocious", "viva", 0, 1, 1, 4)
Next
$time3diff = Round(TimerDiff($time) / 1000, 2)

$msg = "No limits: " & $time1diff & @LF
$msg &= "StringMid: " & $time2diff & @LF
$msg &= "Limits: " & $time3diff
MsgBox(0, @AutoItVersion, $msg)


No limits: 1.51
StringMid: 0.8
Limits: 0.62


#38 Jon

Jon

    Up all night to get lucky

  • Administrators
  • 10,630 posts

Posted 24 March 2008 - 12:50 PM

Edit: With a string size of 16KB the difference between a native implementation and the StringMid() one is still constant (0.18 secs). Not exactly setting my world alight with performance boosts.

#39 KJohn

KJohn

    Universalist

  • Active Members
  • 461 posts

Posted 24 March 2008 - 01:31 PM

The difference in performance is fairly small:



No limits: 1.51 StringMid: 0.8 Limits: 0.62

0.62 is only 40% of 1.51. That is quite a bang: the new StringInStr beats the old one hands down.
0.62 is 77% of 0.8. Not as dramatic as the previous figure, but still noteworthy.
0.8 is 52% of 1.51.

First of all, you seem to have quite a fast processor setup. I've a 2GHz Core 2 Duo (with 4MB L2 cache) laptop with 2GB of RAM, not a sloppy system by any standards, and my results of the first 2 tests in your code give me (OS: Vista):
No limits: 2.23 StringMid: 1.59

(my installed autoit version is 3.2.10.0)

1.59 is 71% of 2.23, which contrasts against the 52% you achieved. Considering that, I cannot venture what my results with the 'limits' might have been. Either the test is flawed or something else is.

My point is that 0.62 against a (questionable) 0.8 is trivial, but you do have a seriously fast system. When you run the same code on more commonly used processor setups, the performance difference may become much more pronounced.

I also ran my system in power saver mode where the processor works at roughly half the clock speed and the results were:
No limits: 4.67 StringMid: 3.26

3.26 is 70% of 4.67, which is close to the 71% in high performance mode.

If my arguments have no ground, I'm sorry for wasting your time.

Edited by Koshy John, 24 March 2008 - 01:33 PM.


#40 Jon

Jon

    Up all night to get lucky

  • Administrators
  • 10,630 posts

Posted 24 March 2008 - 03:32 PM

It's in the beta now, so you can post results from your setup.

As an aside, 3.2.11.5 is almost 10% faster than 3.2.10.0 anyway :)



