Robinson1

Use RegExp on binary data

10 posts in this topic

#1 ·  Posted (edited)

The plan is to use the power of AutoIt's regular expression engine for patching binary data.
Something like this: StringRegExp( $BinaryData,  "(?s)\x55\x8B.." )
 

<cut> ... Okay straight to question/problem


As an introduction, here's a working (if somewhat pointless) example:
~it just matches the first 4 bytes of notepad.exe~

#include <FileConstants.au3>
; = 1a.= Get Data
$BinaryData = FileRead( FileOpen( @SystemDir & "\notepad.exe" , $FO_Binary), 0x1000)

ConsoleWrite('$BinaryData ' & @TAB & '= ' & $BinaryData & @CRLF )

; = 1b.= Convert
;~ $BinaryData = BinaryToString( $BinaryData )

#include "StringConstants.au3"
; = 2.= seek
$pat = "(?s) "

$pat &= "4D 5A.."

; Nice looking => working RE-Pattern
$pat = StringReplace( $pat," ",     ""  )
$pat = StringReplace( $pat,".",     ".." )


$match = StringRegExp( $BinaryData, _
            $pat, _
            $STR_REGEXPARRAYFULLMATCH _
        )
;3 Output
$Pos = @extended
$Pos -= StringLen( $match [0] ) ; seek to start of match

$Pos -= 2                       ; skip the leading '0x'
$Pos = BitShift($Pos,1)         ; divide by 2 (via right shift) to turn hex digits into a byte offset


ConsoleWrite('$Pos ' & @TAB & @TAB &'= ' & hex(  $Pos    ) & @CRLF )
ConsoleWrite('$match[0] ' & @TAB  & '= ' &       $match[0] & @CRLF )


;~ Expected OUTPUT:
;~ $BinaryData  = 0x4D5A900...
;~ $Pos         = 00000000
;~ $match[0]    = 4D5A9000

You may try again with 55 8B to match the start of a typical function:

55            PUSH    EBP
8Bxx          MOV     r32, r/m32   (typically 8B EC = MOV EBP, ESP)

The problem: done like this it's really slow and wastes a lot of memory.
So instead of working with a 'number string monster' that looks like this:
"0x4D5A90..."

it would be really nice to work with the real binary data.
So here we go:

#include <FileConstants.au3>
; = 1a.= Get Data
$BinaryData = FileRead( FileOpen( @SystemDir & "\notepad.exe" , $FO_Binary), 0x1000)

; = 1b.= Convert
$BinaryData = BinaryToString( $BinaryData ) ;Mod #1  line added

ConsoleWrite('$BinaryData ' & @TAB & '= ' & $BinaryData )
ConsoleWrite( @CRLF)

#include "StringConstants.au3"
; = 2.= seek
$pat = "(?s)"

$pat &= " 4D 5A.."

; Nice looking => working RE-Pattern
$pat = StringReplace( $pat," ",     "\x"    )   ;Mod #2 ""  => "\x"
;~ $pat = StringReplace( $pat,".",  ".." )      ;Mod #3  line commented out
ConsoleWrite('pat ' & @TAB  & @TAB  & '= ' & $pat & @CRLF )


$match = StringRegExp( $BinaryData, _
            $pat, _
            $STR_REGEXPARRAYFULLMATCH _
        )
;3 Output
$Pos = @extended
$Pos -= StringLen( $match [0] ) ; seek to start of match

;~ $Pos -= 2                      ;Mod #4  line commented out  ; to skip '0x...'
;~ $Pos = BitShift($Pos,1)        ;Mod #5  line commented out  ; divide by 2 (via rightshift) to


ConsoleWrite('$Pos ' & @TAB & @TAB &'= ' & hex(  $Pos    ) & @CRLF )
ConsoleWrite('$match[0] ' & @TAB  & '= ' &       $match[0] & @CRLF )


;~ Expected OUTPUT:
;~ $BinaryData  = MZ...
;~ pat      = (?s)\x4D\x5A..
;~ $Pos

Wow, that seems to work. BUT ...

... certain bytes in the range from 0x80 to 0xA0 won't match. :'(

Hmm, this seems to be a character encoding problem. In detail these are 27 bytes: 0x80, 0x82~0x8C, 0x8E, 0x91~0x9C, 0x9E, 0x9F

Here's a small code snippet to explore / explain the problem:

#include "StringConstants.au3"

$TestData = BinaryToString("0x7E7F808182")

;Okay
$match = StringRegExp( $TestData ,'\x7E' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

;Okay
$match = StringRegExp( $TestData ,'\x7F' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

;Error no match
$match = StringRegExp( $TestData ,'\x80' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

;Okay
$match = StringRegExp( $TestData ,'\x81' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

;Error no match
$match = StringRegExp( $TestData ,'\x82' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

;~ output:
;~ @extended = 2  $match = 
;~ @extended = 3  $match = 
;~ @extended = 0  $match = 1
;~ @extended = 5  $match = 
;~ @extended = 0  $match = 1

Hmm, what to do? Go back to the 'number string monster' implementation, or just avoid that range of 'unsafe bytes'? What is the root of this problem?

Any idea how to fix this?

Update: Okay, I know a byte is not a character.
But StringRegExp operates on strings, and so at the character level.
As long as you stay with ANSI encoding and only use \x00 - \x7F in the search pattern, StringRegExp works well for searching binary data.

Which bytes in the range \x7F - \xFF can be matched also depends on the code page.
So the advice to avoid searching for bytes in the range 0x80-0xA0 only applies to a German system.
I just changed this country setting:

[Screenshot: vollbildaufzeichnung1p8uaa.jpg]

to Thai, and now nearly all bytes from \x7F - \xFF fail to match.
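
To see what the code page does to each byte, here is a small sketch (the variable names are only for illustration). It assumes that BinaryToString with $SB_ANSI decodes through the active ANSI code page and that the regex engine then compares \xNN against the resulting UTF-16 code unit. If that assumption holds, every byte whose code point differs from its byte value is one that \xNN cannot find.

```autoit
#include <StringConstants.au3>

; Decode each byte 0x80..0x9F on its own and print the code point it turns into.
For $i = 0x80 To 0x9F
    Local $sChar = BinaryToString(Binary("0x" & Hex($i, 2)), $SB_ANSI)
    Local $iCodePoint = AscW($sChar)
    ConsoleWrite(Hex($i, 2) & " -> U+" & Hex($iCodePoint, 4))
    ; A differing code point means the pattern \xNN no longer describes this character.
    If $iCodePoint <> $i Then ConsoleWrite("  (differs)")
    ConsoleWrite(@CRLF)
Next
```

On a Windows-1252 (Western European) system I would expect 0x80 to come out as U+20AC (the euro sign) while 0x81 stays U+0081, which would fit the list of 27 failing bytes above. Other code pages will give a different picture.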

Edited by Robinson1


#2 ·  Posted (edited)

Well I don't know what you're trying to do, but binary is quite meaningless if not interpreted the same way as encoded. Perhaps you should consider trying the other encoding options for BinaryToString(), if you haven't done that already. Sorry I misread your code.

Edit: Try adding (*UCP) to the start of the regular expression and see if that helps with UTF-8 encoding. Perhaps it won't. It's a mystery!

Edited by czardas


#3 ·  Posted (edited)

How about this?

#include "StringConstants.au3"

$TestData = BinaryToString("7E7F808182")

;Okay
$match = StringRegExp( $TestData ,'\x7E' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

;Okay
$match = StringRegExp( $TestData ,'\x7F' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

;Error no match
$match = StringRegExp( $TestData ,'\x80' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

;Okay
$match = StringRegExp( $TestData ,'\x81' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

;Error no match
$match = StringRegExp( $TestData ,'\x82' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

No that's not it. :whistle:

The \x escape in the regular expression is trying to match a Unicode code point. This is most unlikely to coincide with an extended ASCII character. I'm beginning to think that is the problem.

Edited by czardas


#4 ·  Posted (edited)

Cool, thanks for your reply.

Okay, what I want to do now is to patch away this nag screen

"Gesponserte Sitzung" ("Sponsored session")
"Dies war eine kostenlose Sitzung mit Unterstützung von www.teamviewer.com" ("This was a free session sponsored by www.teamviewer.com")

that pops up after each remote session.
 


To do so I need to

  1. seek/locate the 'ShowSponsoredSessionDialog' function via some specific vectors/unique values.
  2. seek to the start of this function and put a 'Return' there to disable it.
  3. Null out the CRC check - same procedure as 1. & 2. ...

Well, the specifications are already there. I just thought it would be nice to apply them with AutoIt using a RegExp patch pattern.

 


Well, yes, BinaryToString seems to be the critical point.
More specifically, its flags that specify how the binary data is converted/decoded.
And of those, only
$SB_ANSI (1) = binary data is ANSI (default) makes some sense here.

However, what is not in the AutoIt documentation is that this encoding/decoding depends on the country settings. I used Python 3 before, and there the string encoding/decoding issue is handled well. It's nice to learn and get practical experience on that topic.

 

Well, so far I have ended up creating a function called BinRegExp() that wraps StringRegExp. It

  1. Does a pre-search by replacing all \x7F-\xFF in the pattern with . (any char)
  2. Checks each match via the slower but more reliable regex version that uses hex number strings
  3. Loops if needed (to filter out match artefacts)

I may post it here sooner or later, but it's not really a solution, more a workaround for the problem.

 

But let's get the focus back on this:

StringRegExp( $TestData ,'\x80'...

Why is it not working?

A.) all 0x80 bytes inside $TestData somehow got messed up by BinaryToString
B.) \x80 is somehow not interpreted by StringRegExp as intended
C.) Something else

Edited by Robinson1


#5 ·  Posted (edited)

Using \x with extended ascii is a futile exercise. The number of bytes may be incorrect or the binary might refer to meaningless code points. This has to be the reason it doesn't work. I still need to try and understand it properly myself.

Edited by czardas


#6 ·  Posted

Short answer:
PCRE isn't well suited to match binary data.

Long answer:
Wait a minute guys. You supply a string and call BinaryToString()?
If you want binary input, then performing conversion to binary would be a good idea, perhaps?

$TestData = Binary("0x7E7F808182")
ConsoleWrite(_vardump($TestData) & @LF)

Gives:
Binary       (5) 0x7E7F808182

Then, please realize that
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)
isn't going to tell you much, as $STR_REGEXPARRAYFULLMATCH returns an array of matches. $match being an array, ConsoleWrite-ing it is nonsensical.
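
A minimal sketch of a more informative check, reusing the test string from the first post. It saves @error and @extended right after the call (a later function call may reset them) and prints an array element rather than the array itself. The variable names ($sTestData, $aMatch, $iErr, $iExt) are made up for illustration.

```autoit
#include <StringConstants.au3>

Local $sTestData = BinaryToString("0x7E7F808182")

Local $aMatch = StringRegExp($sTestData, '\x7F', $STR_REGEXPARRAYFULLMATCH)
Local $iErr = @error, $iExt = @extended ; save both before calling anything else

If $iErr Then
    ; @error = 1 means no match, 2 means a bad pattern
    ConsoleWrite("no match (@error = " & $iErr & ")" & @CRLF)
Else
    ; $aMatch[0] is the full match; @extended is the position just after it
    ConsoleWrite("matched " & StringLen($aMatch[0]) & " char(s), next offset = " & $iExt & @CRLF)
EndIf
```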

Then, the changelog of AutoIt warns us:
3.3.10.0 (23rd December, 2013) (Release)
...
Added: Regular expressions (PCRE engine) now using the new native 16bit mode and also compiled with full UCP support. Prefix patterns with (*UCP) to enable.

PCRE works character per character. Strings supplied to StringRegExp[Replace] were previously converted from native AutoIt UTF16-LE (actually just UCS-2 in fact) to UTF8 and the 8-bit PCRE engine was used. Now in 16-bit mode PCRE matches UCS-2 codepoints (using 16-bit encoding units).

As a user of the StringRegExp[Replace] wrappers, you can't control which engine (8- or 16-bit) is linked with AutoIt core and then used.
Note that you can't use \C to tell PCRE to match individual bytes regardless of character encoding, since \C works in current encoding units size (16-bit since AutoIt v3.3.10.0).
In theory you could bypass the hurdle by first converting your input binary to UTF16, but that would complicate things further, and doing so doesn't remove the final issue below.

Finally, the last problem --even with 8-bit PCRE-- with random binary data is input containing \x00 which is a string stop.

All of this results in binary not being the best food for StringRegExp. You're still not out of business. Forget binary and work on its raw hex representation!

All you have to do then is ensure that your regexp always groups pairs of characters, each pair representing one input byte.

#include "StringConstants.au3"
#include <Array.au3>   ; needed for _ArrayDisplay

$TestData = "7E7F8081823031323300006162637E507F518052815382548200"

$match = StringRegExp($TestData, "(?:..)*?(7E..)", $STR_REGEXPARRAYGLOBALMATCH)
_ArrayDisplay($match)

$match = StringRegExp($TestData, "(?:..)*?(7F..)", $STR_REGEXPARRAYGLOBALMATCH)
_ArrayDisplay($match)

$match = StringRegExp($TestData, "(?:..)*?(80..)", $STR_REGEXPARRAYGLOBALMATCH)
_ArrayDisplay($match)

$match = StringRegExp($TestData, "(?:..)*?(81..)", $STR_REGEXPARRAYGLOBALMATCH)
_ArrayDisplay($match)

$match = StringRegExp($TestData, "(?:..)*?(82..)", $STR_REGEXPARRAYGLOBALMATCH)
_ArrayDisplay($match)
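
To turn a hex-string match back into a byte offset (as the first post does with @extended), one option is to capture the skipped prefix and halve its length. A minimal sketch; the sample bytes and variable names are invented for illustration:

```autoit
#include <StringConstants.au3>

Local $dData = Binary("0x9090558BEC83EC10")
Local $sHex = StringTrimLeft(String($dData), 2) ; String() gives "0x9090...", drop the "0x"

; ^((?:..)*?) consumes whole bytes only, so the hit is always byte-aligned.
; Group 1 = skipped prefix, group 2 = the bytes we were looking for.
Local $aMatch = StringRegExp($sHex, "(?i)^((?:..)*?)(558B..)", $STR_REGEXPARRAYMATCH)
If Not @error Then
    Local $iOffset = Int(StringLen($aMatch[0]) / 2) ; two hex digits per byte
    ConsoleWrite("offset = " & $iOffset & ", bytes = " & $aMatch[1] & @CRLF)
    ; for this sample: offset = 2, bytes = 558BEC
EndIf
```

The anchor at ^ matters: without it the engine could start in the middle of a byte and report a hit that straddles two bytes.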

 


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


#7 ·  Posted

I did notice and tried converting the string to binary, but it didn't solve the problem. Since it had been reported as working with ANSI, I dismissed that as being the issue. The Help File states that \x applies to unicode (not ASCII). I was also quite tired. Thanks for the explanation.


#8 ·  Posted

"(*UCP)\x{0102}"

As an example for UTF-8 encoding note the code point is enclosed in {}  - or am I missing your problem? 


#9 ·  Posted

The only effect of (*UCP) is to enable [Unicode] character properties, so you can use \p and \P specifications. See the PCRE documentation for more details.

Again, current implementation of PCRE in AutoIt is UTF16 only, not UTF8 as before 12/2013.

The OP wants matching on the byte basis, that's why one needs to use the hex representation to match binary because "our" PCRE will never match bytes, only UTF16.



#10 ·  Posted

Okay, so in the end I decided on a hybrid implementation:

1. Do a test to find out which chars RegExp cannot match, and store them.
2. Do a search at the char level (in the match pattern, replace all non-working chars with '.', i.e. any char).
3. Check each match by converting the match to a hex number string (and converting the match pattern as well, so that it matches a hex number string).
 

#include <StringConstants.au3>   ; for the $STR_REGEXP* constants used below

Func init_NotWorkingBytes()
        ;Create TestData
        local $TestData = "0x"
        for $i=00 to 0xff
            $TestData &= StringFormat( "%02X", $i)
        Next
        $TestData = BinaryToString ($TestData)

        Global $RegExpNotWorkingBytes    =    "(?|\\x80)"

        for $i=0x0 to 0xFF
            $pat = StringFormat( "\x%02X", $i)
            $match = StringRegExp( $TestData ,$pat  ,$STR_REGEXPARRAYFULLMATCH)
            if @error<>0 then
;~             ConsoleWrite('$match = ' & _
;~                 $pat& ' - ' & $match & '  > ' & chr($i) ) ;### Debug Console

                $RegExpNotWorkingBytes &= "|(?|\" & $pat & ")"
;~                 ConsoleWrite( @CRLF)

            EndIf
        Next
    Return $RegExpNotWorkingBytes
EndFunc


; #FUNCTION# ====================================================================================================================
; Name ..........: BinRegExp
; Description ...: Use RegExp with binary data
; Syntax ........: BinRegExp($test, $pattern[, $flag = 0[, $offset = 1]])
; Parameters ....: $test                - the data to search, as a string (binary data converted with BinaryToString).
;                  $pattern             - the regular expression; bytes are written as \xXX.
;                  $flag                - [optional] flag passed on to StringRegExp. Default is 0.
;                  $offset              - [optional] string position at which to start matching. Default is 1.
; Return values .: The result of StringRegExp; @error and @extended are set as by StringRegExp.
; Remarks .......: This is a workaround, a kind of hybrid: it first runs StringRegExp directly on the binary data
;                  (reliable only for \x00-\x7F) and then checks each match again with the slower
;                  StringRegExp run on the hex number string of the data.
;                  The code reads $RetVal[0], so it expects a $flag that makes StringRegExp return an array.
; Related .......:
; Link ..........:
; Example .......: No
; ===============================================================================================================================
Func BinRegExp($test, $pattern, $flag = 0, $offset = 1)

    $RegExpNotWorkingBytes = init_NotWorkingBytes()
;~     ConsoleWrite('@@ Debug(' & @ScriptLineNumber & ') : $RegExpNotWorkingBytes = ' & $RegExpNotWorkingBytes & @CRLF & '>Error code: ' & @error & @CRLF) ;### Debug Console
;~     $RegExpNotWorkingBytes = '\\x[7-9A-Fa-f][0-9A-Fa-f]'

    ;Replace not working in Range of /x7F-/xFF with .
    $SafePattern = StringRegExpReplace( $pattern, _
            $RegExpNotWorkingBytes, _
            '.')

;~     ConsoleWrite('@@ Debug(' & @ScriptLineNumber & ') : $SafePattern = ' & $SafePattern & @CRLF & '>Error code: ' & @error & @CRLF) ;### Debug Console

    for $Round = 1 to 0x7FFFFFFF

        Local $RetVal         = StringRegExp($test, $SafePattern, $flag, $offset)
        Local $RetError     = @error
        Local $RetExtended     = @extended

        If $RetError = 0 Then

            $MatchData      = $RetVal[0]
            $MatchLength = StringLen($MatchData)


                $MatchStart         =  $RetExtended
                $MatchStart     -=  $MatchLength
                $MatchStart     -= 1

            $RetVal2 = _BinRegExp($MatchData, $pattern, $flag)
            If @error = 0 Then
                ; Match is valid
                ExitLoop

            ElseIf @error = 3 Then
              ; the match was too big - apply delta; seek back from end of current match

                $offset = $MatchStart + @extended


            else
                ; ... was not a real match - look for more
                $offset = $RetExtended
            EndIf


        Else
            ExitLoop
        EndIf

        ConsoleWrite('.')
;~         myLog(@ScriptLineNumber ,"$offset = " & hex($offset) )
    Next

    Return SetError($RetError, $RetExtended, $RetVal)

EndFunc   ;==>BinRegExp

Func _BinRegExp($test, $pattern, $flag = 0, $offset = 1)
        const $xdigit = "." ;"[0-9A-Fa-f]"

    ; Strip the '\x' prefixes and replace each . with a pair of hex-digit wildcards
    $Numberstring = StringReplace($pattern, '\x', '')
    $Numberstring = StringReplace($Numberstring, ".", "(?:" & $xdigit & $xdigit & ")")


    $test = StringToBinary($test)


    Local $RetVal = StringRegExp( $test, $Numberstring, $STR_REGEXPARRAYMATCH  )
    Local $RetError     = @error
    Local $RetExtended     = @extended


    If $RetError = 0 Then
        $MatchData      = $RetVal[0]

        $MatchLength = StringLen($MatchData)


        $testLength = StringLen( $test ) - 2 ; no '0x'

        $delta = $testLength - $MatchLength
        if $delta >= 2 then
            ; the match was too big - set Error 3 and return adjustment delta

;~             $delta = $MatchLength - $delta ; set delta to how many bytes to seek back from end of current match
            $delta = DivBy2($delta)


            Return SetError(3, $delta)

        EndIf


    endif


    Return SetError($RetError, $RetExtended, $RetVal)
EndFunc   ;==>_BinRegExp


Func DivBy2($Divident)
    Return BitShift($Divident, 1)
EndFunc   ;==>DivBy2

Full sample using this is here:
http://bit.do/TeamViewerNA 
