Italiano

StringInStr - can this script be faster? (texts files included)

37 posts in this topic

#1 ·  Posted (edited)

Hello,

As always, sorry for my bad English.

Here is the code I have:

#include <File.au3>
#include <String.au3>

$file1 = "d:\doppioniautoit\international.txt"
FileOpen($file1, 0)

$file2 = "d:\doppioniautoit\standard.txt"
FileOpen($file2, 0)

For $i = 1 to _FileCountLines($file1)
   $line = FileReadLine($file1, $i)
   $aExtract = _StringBetween($line, "(", ")")

;MsgBox(0, $line, $aExtract[0])
$itime = TimerInit()
      For $x = 1 to _FileCountLines($file2)
         $line2 = FileReadLine($file2, $x)
         Local $iPosition = StringInStr($line2, $aExtract[0], 1)
         ;Local $iPosition = StringRegExp($line2,$aExtract[0], 0)
         if $iPosition <> 0 then
            ;MsgBox(0, "Trovato", $aExtract & " " & $line2)
         endif
         ConsoleWrite($line2  & @CRLF)
      Next

  ConsoleWrite(@TAB&'Str='&TimerDiff($itime)&' ms'&@lf)
  MsgBox(0, "TIME", @TAB&'Str='&TimerDiff($itime)&' ms'&@lf)
Next
FileClose($file1)

So, what do I want to do? I'll try to explain with my poor English :) Basically, I have two text files (see the attachments below). Both contain movie titles with director and year, in this form:

Movie Title (Director, Year)

"standard.txt" contains mostly Italian titles. "international.txt", as you can imagine, contains the international ones. With the script I would like to search "standard.txt" for the "Director, Year" part of each line of "international.txt".

For example, the first row of "international.txt" is "¡Atraco! (Cortés, 2012)". The script takes just "Cortés, 2012" and searches for it in the standard.txt file.
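To make the extraction step concrete, here is a minimal sketch of pulling the "Director, Year" part out of one line. It assumes the bracketed part is the last thing on the line; _ExtractKey is a hypothetical helper name, not something from the script above.

```autoit
; Sketch only: _ExtractKey is a made-up helper name.
; Returns the text inside the last "(...)" of a line, or "" with @error = 1.
Func _ExtractKey($sLine)
    ; Flag 1 returns an array holding the captured groups of the first match.
    Local $aMatch = StringRegExp($sLine, "\(([^()]*)\)\s*$", 1)
    If @error Then Return SetError(1, 0, "")
    Return $aMatch[0]
EndFunc   ;==>_ExtractKey

ConsoleWrite(_ExtractKey("Movie Title (Director, 2012)") & @CRLF)
```

Unlike _StringBetween, this returns a plain string, so a line without brackets can be detected through @error instead of failing on an array index.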

The simple code I wrote works. I tried both StringInStr and StringRegExp; each needs about 2 minutes and 30 seconds to process one row (StringInStr is a little faster).

I was wondering: is there any other method to make it faster using AutoIt? Any help would be much appreciated, thanks!

 

 

standard.txt

international.txt

Edited by Italiano
typo




#2 ·  Posted (edited)

Hi, I would make some changes. For example, I wouldn't re-declare Local $iPosition inside the loop like that, and I'd move the line count to the top, like so:

#include <File.au3>
#include <String.au3>
Local $iPosition

$file1 = "d:\doppioniautoit\international.txt"
FileOpen($file1, 0)
Local $LinesC1 = _FileCountLines($file1)

$file2 = "d:\doppioniautoit\standard.txt"
FileOpen($file2, 0)
Local $LinesC2 = _FileCountLines($file2)

For $i = 1 to $LinesC1
   $line = FileReadLine($file1, $i)
   $aExtract = _StringBetween($line, "(", ")")
$itime = TimerInit()
      For $x = 1 to $LinesC2
         $line2 = FileReadLine($file2, $x)
         $iPosition = StringInStr($line2, $aExtract[0], 1)
         if $iPosition <> 0 then
            ConsoleWrite($aExtract[0] & " " & $line2) ; $aExtract is an array, so print its first element
         endif
         ConsoleWrite($line2  & @CRLF)
      Next
  ConsoleWrite(@TAB&'Str='&TimerDiff($itime)&' ms'&@lf)
  MsgBox(0, "TIME", @TAB&'Str='&TimerDiff($itime)&' ms'&@lf)
Next
FileClose($file1)
FileClose($file2)

Oh, and I'd change the timers to @MIN & ':' & @MSEC in the MsgBox and ConsoleWrite.

Edited by careca

Spoiler

Paster - Main function is to paste text, but has more functions.

OpenW - Open With... alternative, Open any file with any application, set it's icon, set application as default.

Renamer - Rename files and folders, remove portions of text from the filename etc.

BeatsPlayer - Music player.

Params Tool - Right click an exe to see it's parameters or execute them.

Regedit Control - Registry browsing history, quickly jump into any saved key.

Time4Shutdown - Write the time for shutdown in minutes.

Power Profiles Tool - Set a profile as active, delete, duplicate, export and import.

Firefox Profile Backup - Backup/restore previously saved profile.

Finished Task Shutdown - Shuts down pc when specified window/Wndl/process closes.

NetworkSpeedShutdown - Shuts down pc if download speed goes under "X" Kb/s.

IUIAutomation - Topic with framework and examples


#3 ·  Posted (edited)

Weird, but with those changes it is even slower :( 157657 milliseconds against 151279. Thanks for the effort, though.

Any other ideas to make it faster? Anyone? :)

 

Edited by Italiano


Something along these lines.

#include <File.au3>
#include <String.au3>

Local $aFile1
Local $aFile2
_FileReadToArray( @ScriptDir & "\international.txt", $aFile1)
_FileReadToArray(@Scriptdir & "\standard.txt", $aFile2)

For $i = 1 To $aFile1[0]
    $aExtract = _StringBetween($aFile1[$i], "(", ")")
    If Not IsArray($aExtract) Then ContinueLoop ; no "(...)" on this line

    $itime = TimerInit()
    For $i2 = 1 To $aFile2[0]
        Local $iPosition = StringInStr($aFile2[$i2], $aExtract[0], 1)
        ;Local $iPosition = StringRegExp($aFile2[$i2], $aExtract[0], 0)
        If $iPosition <> 0 Then
            ;MsgBox(0, "Trovato", $aExtract[0] & " " & $aFile2[$i2])
        EndIf
        ConsoleWrite($aFile2[$i2] & @CRLF)
    Next

    ConsoleWrite(@TAB & 'Str=' & TimerDiff($itime) & ' ms' & @LF)
    MsgBox(0, "TIME", @TAB & 'Str=' & TimerDiff($itime) & ' ms' & @LF)
Next

Not perfect, just the foundation. 


#5 ·  Posted (edited)

ViciousXUSMC was faster than me!

#include <File.au3>
#include <String.au3>
Opt("TrayAutoPause", 0)

Global $sFile1 = @ScriptDir & "\international.txt"
Global $sFile2 = @ScriptDir & "\standard.txt"

Global $aContentFile1 = FileReadToArray($sFile1)
Global $aContentFile2 = FileReadToArray($sFile2)
Local $iExtract, $pTime, $iPosition, $sOut, $iLine
For $i = 0 To UBound($aContentFile1) - 1
    $iExtract = _StringBetween($aContentFile1[$i], "(", ")")
    If Not IsArray($iExtract) Then ContinueLoop
    $pTime = TimerInit()
    $iLine = "-> " & "LineInFile1 [" & $i & "] : " & $aContentFile1[$i] & @CRLF & "-> TextFind: " & $iExtract[0] & @CRLF
    For $x = 0 To UBound($aContentFile2) - 1
        $iPosition = StringInStr($aContentFile2[$x], $iExtract[0], 1)
        If $iPosition <> 0 Then $sOut &= "+> LineInFile2: [" & $x & "] : " & $aContentFile2[$x] & @CRLF
        ;ConsoleWrite($aContentFile2[$x] & @CRLF)
    Next
    If $sOut <> "" Then
        ConsoleWrite('Time finded: ' & TimerDiff($pTime) & ' ms' & @CRLF & "!" & $iLine & $sOut)
        MsgBox(0, "Time finded: " & TimerDiff($pTime) & ' ms', $iLine & @CRLF & $sOut)
    Else
        ConsoleWrite('Time finded: ' & TimerDiff($pTime) & ' ms' & @CRLF)
    EndIf
    $sOut = ""
Next

 

Edited by Trong

Regards,
 


#6 ·  Posted (edited)

Hi,

I am not great at regex, but your problem cries out for it...

#include <array.au3>
#include <string.au3>  
;
$international = FileRead("international.txt")                         ;files
$standard = FileRead("standard.txt")

$searcharray1 = StringRegExp($international, "(?m)^.*\((.*)\).*$", 3)  ;find all authors between ( and ) and write them into an array
;or (this second assignment replaces the result of the line above)
$searcharray1 = _StringBetween($international, "(", ")")
_ArrayDisplay($searcharray1)

For $author In $searcharray1                                           ;every author in the array
    ConsoleWrite('@@ Debug(' & @ScriptLineNumber & ') : $author = ' & $author & @CRLF & '>Error code: ' & @error & @CRLF) ;### Debug Console

    $searcharray2 = StringRegExp($standard, "(?m)(^.*" & $author & ".*$)", 3)
    If Not IsArray($searcharray2) Then ContinueLoop                    ;no match for this author
    $searcharray2 = _ArrayUnique($searcharray2)                        ;does not always work because of whitespace differences; the regex still needs some tuning ;)
    _ArrayDisplay($searcharray2)
Next

This could also be done with StringInStr(), but it would take more code.

If you want an explanation of what the regex does, take a look at https://regex101.com

Edited by AndyG


I'm slow!

For me it was processing about 2x faster, scrolling so fast that it was skipping letters in the console output.

The problem here is that we need to find a better way to do what you want, not so much speed up the way you're doing it.

Kind of like pivot-table logic.


#8 ·  Posted (edited)

Only a few seconds to find all authors in standard.txt:

#include <array.au3>
#include <string.au3>
;

$authors_in_standard = ""

$international = FileRead("international.txt")        ;files
$standard = FileRead("standard.txt")

$searcharray1 = StringRegExp($international, "(?m)^.*\((.*)\).*$", 3) ;find all authors between ( and ) and write them into an array
;or (this second assignment replaces the result of the line above)
$searcharray1 = _StringBetween($international, "(", ")") ;takes twice as long as the RegEx!
_ArrayDisplay($searcharray1)

$t = TimerInit()                                      ;timer start
For $author In $searcharray1                          ;every author in the array
    ; ConsoleWrite('@@ Debug(' & @ScriptLineNumber & ') : $author = ' & $author & @CRLF & '>Error code: ' & @error & @CRLF) ;### Debug Console

    If StringInStr($standard, $author, 1) Then        ;is faster than RegEx
        $author = StringReplace($author, "|", "/")
        $searcharray2 = StringRegExp($standard, "(?m)^(.*" & $author & ".*)$", 3);finds line
        If IsArray($searcharray2) Then                ;only if matched
            $searcharray2 = _ArrayUnique($searcharray2);number of arrayitems, eliminate equals
            For $i = 1 To UBound($searcharray2) - 1
                $authors_in_standard &= $searcharray2[$i] & @CRLF;sum all
            Next
            ; _ArrayDisplay($searcharray2)

        EndIf
    EndIf
Next

$time = TimerDiff($t)
ConsoleWrite('@@ Debug(' & @ScriptLineNumber & ') : $time = ' & $time & @CRLF & '>Error code: ' & @error & @CRLF) ;### Debug Console

ConsoleWrite('@@ Debug(' & @ScriptLineNumber & ') : $authors_in_standard = ' & $authors_in_standard & @CRLF & '>Error code: ' & @error & @CRLF) ;### Debug Console

There are some "errors" in the files: sometimes the comma between the author and the year is missing, sometimes there is a pipe | (which means OR in a regex), and so on. The result is double or multiple matches.

The source files have to be edited to get better results.
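If editing the source files is not an option, one possible workaround for the pipe problem is to quote the author text before splicing it into the pattern. The sketch below relies on PCRE's \Q...\E literal quoting; _FindLines and the sample text are made up for illustration.

```autoit
; Sketch only: _FindLines is a hypothetical helper, the sample text is invented.
; Returns an array of whole lines containing $sLiteral, or 0 with @error = 1.
Func _FindLines($sText, $sLiteral)
    ; Everything between \Q and \E is taken literally, so "|" no longer means OR.
    ; A literal "\E" in the search text would end the quoting early, so it is split out.
    Local $sQuoted = "\Q" & StringReplace($sLiteral, "\E", "\E\\E\Q", 0, 1) & "\E"
    ; [^\r\n]* keeps each match on one line, whatever the line ending is.
    Local $aLines = StringRegExp($sText, "(?m)^([^\r\n]*" & $sQuoted & "[^\r\n]*)", 3)
    If @error Then Return SetError(1, 0, 0)
    Return $aLines
EndFunc   ;==>_FindLines

Local $sSample = "Film A (Rossi | Bianchi, 1999)" & @CRLF & "Film B (Verdi, 2001)" & @CRLF
Local $aHits = _FindLines($sSample, "Rossi | Bianchi, 1999")
If IsArray($aHits) Then ConsoleWrite($aHits[0] & @CRLF)
```

This does nothing about missing commas or stray whitespace; those still need a normalisation step or cleaner input.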

 

I guess there is a much faster solution, maybe with a Scripting.Dictionary or something like it (maybe a database?!)
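A rough sketch of what the dictionary idea could look like: put every "Director, Year" key of one file into a Scripting.Dictionary once, then test each key of the other file with a single lookup instead of scanning all lines. The two sample arrays are invented and stand in for the keys extracted from the real files. Note that this matches whole keys exactly, not substrings, so it only helps once the keys are clean.

```autoit
; Sketch only: the sample keys are made up and stand in for the extracted "(...)" parts.
Local $aStandardKeys[3] = ["Rossi, 2012", "Fellini, 1960", "Leone, 1966"]
Local $aInternationalKeys[2] = ["Leone, 1966", "Kubrick, 1968"]

Local $oSeen = ObjCreate("Scripting.Dictionary")
$oSeen.CompareMode = 0 ; binary (case-sensitive) comparison

; Build the index once.
For $sKey In $aStandardKeys
    If Not $oSeen.Exists($sKey) Then $oSeen.Add($sKey, 1)
Next

; Each lookup is a hash probe, not a pass over the whole file.
For $sKey In $aInternationalKeys
    If $oSeen.Exists($sKey) Then ConsoleWrite("Match: " & $sKey & @CRLF)
Next
```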

Edited by AndyG


#9 ·  Posted (edited)

What I would do is take what I have above and add an extra step.

Once a match is found, remove that matching entry from the 2nd array, so each time a match is made the comparison set gets smaller and smaller causing the loop to get faster and faster until completion. 

I'm just not sure of a fast way to do this, as ReDim is slow for large arrays. I think _ArrayDelete() uses ReDim.

Something to play with though, maybe tomorrow I will find some time. 

Edited by ViciousXUSMC


Or... that extra verification extends the time instead of reducing it. lol

 



String functions are usually (!) much faster than array functions, because they are highly optimized "API" functions.

Any additional manipulation of the array (deleting items, for example) makes the script slower, not faster.

 


#12 ·  Posted (edited)

Italiano,

This lists everything from Standard.txt that is also in International.txt using SQLite...

#include <array.au3>
#include <sqlite.au3>

Local $st = TimerInit(), $total = TimerInit()
Local $aStandard = StringRegExp(FileRead(@ScriptDir & '\standard.txt'), '(?:\((.*?)\))', 3)
Local $aInternational = StringRegExp(FileRead(@ScriptDir & '\international.txt'), '(?:\((.*?)\))', 3)
ConsoleWrite('Time to split files to arrays = ' & StringFormat('%2.4f seconds', TimerDiff($st) / 1000) & @CRLF)

$st = TimerInit()

_SQLite_Startup()
_SQLite_Open()
_SQLite_Exec(-1, 'create table t1 (c1); create table t2 (c1);')

Local $sql
For $i = 0 To UBound($aStandard) - 1
    $sql &= ( mod($i,500) = 0 ) ? ';insert into t1 values(' & _SQLite_FastEscape($aStandard[$i]) & ')' : ',(' & _SQLite_FastEscape($aStandard[$i]) & ')'
Next
_SQLite_Exec(-1, $sql)

$sql = ''
For $i = 0 To UBound($aInternational) - 1
    $sql &= ( mod($i,500) = 0 ) ? ';insert into t2 values(' & _SQLite_FastEscape($aInternational[$i]) & ')' : ',(' & _SQLite_FastEscape($aInternational[$i]) & ')'
Next
_SQLite_Exec(-1, $sql)

ConsoleWrite('Time to load SQLite = ' & StringFormat('%2.4f seconds', TimerDiff($st) / 1000) & @CRLF)

$st = timerinit()
Local $ret, $arows, $irow, $icol
_SQLite_GetTable2d(-1, 'select distinct t1.[c1] from t1 join t2 on t2.[c1] = t1.[c1] order by t1.[c1];', $arows, $irow, $icol)
ConsoleWrite('Time to get international entries that are also in standard = ' & StringFormat('%2.4f seconds', TimerDiff($st) / 1000) & @CRLF)

ConsoleWrite('Total time = ' & StringFormat('%2.4f seconds', TimerDiff($total) / 1000) & @CRLF)

_ArrayDisplay($arows)

Warning, my SQL is marginal at best.

kylomas

Edited by kylomas
streamlined the code a bit

Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill


#13 ·  Posted (edited)

Revised my original script.
 
This is my interpretation of what you required. It will output to the console all lines of standard.txt that have a matching "Director" and "Year" from each line of international.txt.
 
This one handles the differences in the lines. I believe it's 100% accurate.
 
; by Hydranix@gmx.com
;
; Assumes .txt files are not editable
;
; Using the contents of the .txt files as a basis,
;  it was determined that only the year part
;  of each line is consistent enough to use to
;  programmatically extract the required data
;  100% of the time without affecting performance
;  too negatively.
#NoTrayIcon

$File1 = FileOpen("T:\international.txt")
$aSearchOrig = FileReadToArray("T:\standard.txt")
$Len = UBound($aSearchOrig)-1
$aSearch = $aSearchOrig
; Preprocess array
For $i = 0 To $Len
  $line = $aSearchOrig[$i]
  $line = StringReverse($line)
  $line = StringTrimLeft($line, StringInStr($line,")"))
  $line = StringReverse(StringTrimRight($line,StringLen($line) - StringInStr($line,"(")+1))
  $aSearch[$i] = $line
Next

While 1
  $line = ""
  ; Process line to ensure accurate results
  $line = FileReadLine($File1)
  if @error = -1 Then ExitLoop
  $line = StringReverse($line)
  $line = StringTrimLeft($line, StringInStr($line,")"))
  $line = StringTrimRight($line,StringLen($line) - StringInStr($line,"(")+1)
  ;year
  $y = StringStripWS(StringReverse(StringLeft($line,4)),3)
  $line = StringReverse($line)
  ;director
  If StringInStr($line, ", ") <> 0 Then
    $d = StringStripWS(StringTrimRight($line,6),3)
  Else
    $d = StringStripWS(StringTrimRight($line,5),3)
  EndIf

  For $i = 0 To $Len
    ; Only care if we find director
    If StringInStr($aSearch[$i],$d) <> 0 Then
      ; If director found, then ensure year is the same
      If StringInStr($aSearch[$i], $y) <> 0 Then
        ; Good match
        ConsoleWrite($aSearchOrig[$i]&@CRLF)
      EndIf
    EndIf
  Next
WEnd
FileClose($File1)

 

On my tablet this script took just over 2.5 minutes to finish.

Edited by hydranix


"I guess there is a much faster solution maybe with a scripting.dictionary or something like this (maybe database?!)"

"Warning, my SQL is marginal at best."

Time to get international entries that are also in standard = 0.2661 seconds
Total time = 1.4269 seconds (edit: laptop in sleep mode @ 800 MHz)

qed...:thumbsup:

"Revised my original script."

"On my tablet this script took just over 2.5 minutes to finish."

100 times slower... but it also works, so who cares :thumbsup:

 

 


Any reason I am not getting an array result for

_ArrayDisplay($arows)

There is no error in the script, I'm just not seeing the result.


Vxusmc,

Are you running it under SciTE? This is just an example of how it might be done using SQLite. There is no error checking at all.

Kylomas

 



Both in SciTE and running the actual script.

I get your ConsoleWrite readouts, just no array with results at the end.


If you are not getting SQL error messages, then I suspect that your input is not in the same directory as the script.

Kylomas

P.S. Answering this on a smartphone, so the typing is atrocious.
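Since the example has no error checking, a missing input file just yields empty arrays and an empty result. A minimal sketch of a guard that could sit in front of the FileRead calls follows; _ReadChecked is a hypothetical helper name.

```autoit
; Sketch only: _ReadChecked is a made-up helper name.
; Returns the file contents, or "" with @error = 1 (not found) or 2 (read failed).
Func _ReadChecked($sPath)
    If Not FileExists($sPath) Then Return SetError(1, 0, "")
    Local $sText = FileRead($sPath)
    If @error Then Return SetError(2, 0, "")
    Return $sText
EndFunc   ;==>_ReadChecked

Local $sStandard = _ReadChecked(@ScriptDir & "\standard.txt")
If @error Then ConsoleWrite("! Cannot read standard.txt, @error = " & @error & @CRLF)
```

The same check applies to international.txt; with both in place, a wrong working directory shows up in the console instead of as a silent empty result.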



#19 ·  Posted (edited)

kylomas,
SQLite is a great idea - with the ability to get the line numbers if necessary  :D

#include <SQLite.au3>
#include <Array.au3>

$t0 = TimerInit()
Local $aStandard = StringRegExp(FileRead(@ScriptDir & '\standard.txt'), "(?m)^(.*)\(([^\)]+).*$", 3)
;_ArrayDisplay($aStandard)
Local $aStandard2D[UBound($aStandard)/2][2]
For $i = 0 to UBound($aStandard)-1 step 2
   $aStandard2D[$i/2][0] = $aStandard[$i]
   $aStandard2D[$i/2][1] = $aStandard[$i+1]
Next
;_ArrayDisplay($aStandard2D)
Local $aInternational = StringRegExp(FileRead(@ScriptDir & '\international.txt'), '\(([^\)]+)', 3)
;_ArrayDisplay($aInternational)

ConsoleWrite('Building arrays = ' & StringFormat('%2.4f seconds', TimerDiff($t0) / 1000) & @CRLF)

$t1 = TimerInit()
  Local $array, $aTemp, $iRows, $iColumns
  _SQLite_Startup()
  _SQLite_Open()   ; ':memory:'
  _SQLite_Exec (-1, "CREATE TABLE table1 (id, names, authors); CREATE TABLE table2 (id, authors);") 
  _SQLite_Exec(-1, "Begin;")
  For $i = 0 to UBound($aStandard2D)-1
        _SQLite_Exec(-1, "INSERT INTO table1 VALUES (" & $i & ", " & _SQLite_FastEscape($aStandard2D[$i][0]) & ", " & _SQLite_FastEscape($aStandard2D[$i][1]) & ");")
  Next
  For $i = 0 to UBound($aInternational)-1
        _SQLite_Exec(-1, "INSERT INTO table2 VALUES (" & $i & ", " & _SQLite_FastEscape($aInternational[$i]) & ");")
  Next
  _SQLite_Exec(-1, "Commit;")
  _SQLite_GetTable2d(-1, "SELECT * FROM table1 WHERE authors IN (SELECT authors FROM table2) ;", $array, $iRows, $iColumns) 

ConsoleWrite('SQLite global work = ' & StringFormat('%2.4f seconds', TimerDiff($t1) / 1000) & @CRLF)
;_ArrayDisplay($array, "end")

$t2 = TimerInit()
Local $result[$iRows]  
For $i = 1 to $iRows
   $result[$i-1] = $array[$i][1] & "(" & $array[$i][2] & ")"
Next
;_ArrayDisplay($result, "end")
$result = _ArrayUnique($result)
ConsoleWrite('Formatting = ' & StringFormat('%2.4f seconds', TimerDiff($t2) / 1000) & @CRLF)

ConsoleWrite('Total time = ' & StringFormat('%2.4f seconds', TimerDiff($t0) / 1000) & @CRLF)
_ArrayDisplay($result, "end")
_SQLite_Close ()
_SQLite_Shutdown ()

 

Edited by mikell
typo


Mikell,

Try SELECT DISTINCT and let the DB engine work for you...

Kylomas
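A sketch of what that could look like, reusing the table and column names from the script above with a few invented rows. It assumes sqlite3.dll can be found by _SQLite_Startup. DISTINCT removes the duplicate line inside the query, so the _ArrayUnique pass afterwards is no longer needed.

```autoit
#include <SQLite.au3>

; Sketch only: the sample rows are made up; table/column names follow the script above.
_SQLite_Startup()
If @error Then
    ConsoleWrite("! sqlite3.dll could not be loaded" & @CRLF)
    Exit 1
EndIf
_SQLite_Open() ; no filename = in-memory database
_SQLite_Exec(-1, "CREATE TABLE table1 (id, names, authors); CREATE TABLE table2 (id, authors);")
_SQLite_Exec(-1, "INSERT INTO table1 VALUES (0, 'Film A ', 'Leone, 1966');")
_SQLite_Exec(-1, "INSERT INTO table1 VALUES (1, 'Film A ', 'Leone, 1966');") ; duplicate on purpose
_SQLite_Exec(-1, "INSERT INTO table1 VALUES (2, 'Film B ', 'Fellini, 1960');")
_SQLite_Exec(-1, "INSERT INTO table2 VALUES (0, 'Leone, 1966');")

Local $aRows, $iRows, $iColumns
; Build the output line in SQL and let DISTINCT collapse identical lines.
_SQLite_GetTable2d(-1, "SELECT DISTINCT names || '(' || authors || ')' AS line FROM table1 " & _
        "WHERE authors IN (SELECT authors FROM table2);", $aRows, $iRows, $iColumns)
For $i = 1 To $iRows ; row 0 holds the column name
    ConsoleWrite($aRows[$i][0] & @CRLF)
Next
_SQLite_Close()
_SQLite_Shutdown()
```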


