Well, the plan is to use the power of AutoIt's regular expression engine for patching binary data.
Something like this: StringRegExp( $BinaryData,  "(?s)\x55\x8B.." )
 

<cut> ... Okay straight to question/problem

As an introduction, here's a working (if somewhat pointless) example:
~it just matches the first 4 bytes of Notepad.exe~

#include <FileConstants.au3>
; = 1a.= Get Data
$BinaryData = FileRead( FileOpen( @SystemDir & "\notepad.exe" , $FO_Binary), 0x1000)

ConsoleWrite('$BinaryData ' & @TAB & '= ' & $BinaryData & @CRLF )

; = 1b.= Convert
;~ $BinaryData = BinaryToString( $BinaryData )

#include "StringConstants.au3"
; = 2.= seek
$pat = "(?s) "

$pat &= "4D 5A.."

; Nice looking => working RE-Pattern
$pat = StringReplace( $pat," ",     ""  )
$pat = StringReplace( $pat,".",     ".." )


$match = StringRegExp( $BinaryData, _
            $pat, _
            $STR_REGEXPARRAYFULLMATCH _
        )
;3 Output
$Pos = @extended
$Pos -= StringLen( $match [0] ) ; seek to start of match

$Pos -= 2                       ; to skip '0x...'
$Pos = BitShift($Pos,1)         ; divide by 2 (via right shift) to turn hex digits into a byte offset


ConsoleWrite('$Pos ' & @TAB & @TAB &'= ' & hex(  $Pos    ) & @CRLF )
ConsoleWrite('$match[0] ' & @TAB  & '= ' &       $match[0] & @CRLF )


;~ Expected OUTPUT:
;~ $BinaryData  = 0x4D5A900...
;~ $Pos         = 00000000
;~ $match[0]    = 4D5A9000

You can try again with 55 8B to match the start of a typical function:

55            PUSH    EBP
8Bxx          MOV     r32, r/m32   (usually 8B EC = MOV EBP, ESP)

The problem: done this way it is really slow and wastes a lot of memory.
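
To put a rough number on the memory cost, here is a minimal sketch. It assumes that AutoIt stores strings as 2-byte UTF-16 code units (as stated later in this thread) and that String() renders binary as "0x" plus two hex digits per byte, which is what the output above shows. _HexTextCost is a hypothetical helper name, not an AutoIt function.

```autoit
; Sketch: size of the hex text that StringRegExp has to scan for a given binary.
; _HexTextCost is a hypothetical helper, not part of AutoIt.
Func _HexTextCost($dData)
    Local $sHex = String($dData)               ; "0x" followed by two hex digits per byte
    Local $iChars = StringLen($sHex)
    Local $aCost[2] = [$iChars, $iChars * 2]   ; [characters, bytes at 2 bytes per character]
    Return $aCost
EndFunc   ;==>_HexTextCost

Local $aDemo = _HexTextCost(Binary("0x4D5A9000"))
ConsoleWrite("4 input bytes -> " & $aDemo[0] & " characters, about " & $aDemo[1] & " bytes as text" & @CRLF)
```

For the 0x1000 bytes read above, the same arithmetic gives 2 + 2 * 4096 = 8194 characters, so roughly four bytes of string per byte of input.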
So instead of working with a 'number string monster' that looks like this:
"0x4D5A90..."

it would be really nice to work with the real binary data.
So here we go:

#include <FileConstants.au3>
; = 1a.= Get Data
$BinaryData = FileRead( FileOpen( @SystemDir & "\notepad.exe" , $FO_Binary), 0x1000)

; = 1b.= Convert
$BinaryData = BinaryToString( $BinaryData ) ;Mod #1  line added

ConsoleWrite('$BinaryData ' & @TAB & '= ' & $BinaryData )
ConsoleWrite( @CRLF)

#include "StringConstants.au3"
; = 2.= seek
$pat = "(?s)"

$pat &= " 4D 5A.."

; Nice looking => working RE-Pattern
$pat = StringReplace( $pat," ",     "\x"    )   ;Mod #2 ""  => "\x"
;~ $pat = StringReplace( $pat,".",  ".." )      ;Mod #3  line commented out
ConsoleWrite('pat ' & @TAB  & @TAB  & '= ' & $pat & @CRLF )


$match = StringRegExp( $BinaryData, _
            $pat, _
            $STR_REGEXPARRAYFULLMATCH _
        )
;3 Output
$Pos = @extended
$Pos -= StringLen( $match [0] ) ; seek to start of match

;~ $Pos -= 2                      ;Mod #4  line commented out  ; to skip '0x...'
;~ $Pos = BitShift($Pos,1)        ;Mod #5  line commented out  ; divide by 2 (via rightshift) to


ConsoleWrite('$Pos ' & @TAB & @TAB &'= ' & hex(  $Pos    ) & @CRLF )
ConsoleWrite('$match[0] ' & @TAB  & '= ' &       $match[0] & @CRLF )


;~ Expected OUTPUT:
;~ $BinaryData  = MZ...
;~ pat      = (?s)\x4D\x5A..
;~ $Pos

Wow, that seems to work. BUT ...

... certain bytes in the range from 0x80 to 0xA0 won't match. :'(

Hmm, this seems to be a character encoding problem. In detail these are 27 bytes: 0x80, 0x82~0x8C, 0x8E, 0x91~0x9C, 0x9E, 0x9F

Here's a small code snippet to explore / explain this problem:

#include "StringConstants.au3"

$TestData = BinaryToString("0x7E7F808182")

;Okay
$match = StringRegExp( $TestData ,'\x7E' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

;Okay
$match = StringRegExp( $TestData ,'\x7F' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

;Error no match
$match = StringRegExp( $TestData ,'\x80' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

;Okay
$match = StringRegExp( $TestData ,'\x81' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

;Error no match
$match = StringRegExp( $TestData ,'\x82' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

;~ output:
;~ @extended = 2  $match = 
;~ @extended = 3  $match = 
;~ @extended = 0  $match = 1
;~ @extended = 5  $match = 
;~ @extended = 0  $match = 1

Hmm, what to do? Go back to the 'number string monster' implementation, or just avoid that range of 'unsafe bytes'? What is the root of this problem?

Any idea how to fix this?
 

Update: Okay, I know a byte is not a character.
But StringRegExp operates on strings, and so at the character level.
As long as you stay with ANSI encoding and only use \x00 - \x7F in the search pattern, StringRegExp works well for searching binary data.

Which bytes in the range \x7F - \xFF can be matched also depends on the code page.
So the rule to avoid searching for bytes in the range 0x80-0xA0 only applies to my German setup.
I just changed this country setting:

[screenshot: vollbildaufzeichnung1p8uaa.jpg]

to Thai, and now nearly all bytes from \x7F - \xFF fail to match.
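
One way to see which bytes are affected on a given machine is to decode each byte on its own and compare the resulting character code with the byte value. The sketch below assumes a single-byte ANSI code page; _DecodedCodePoint and _ByteToPatternEscape are hypothetical helper names. The second helper builds a \x{hhhh} escape for the character the byte was decoded to, which may be usable in place of \xXX when the decoder maps every byte to a distinct character.

```autoit
; Sketch: show how BinaryToString() decodes each byte from 0x80 to 0xFF on this machine.
; _DecodedCodePoint and _ByteToPatternEscape are hypothetical helpers, not AutoIt functions.

; Returns the UTF-16 code unit that the ANSI decoder produces for one byte.
Func _DecodedCodePoint($iByte)
    Local $sChar = BinaryToString(Binary("0x" & Hex($iByte, 2)))   ; default flag = ANSI
    Return AscW($sChar)
EndFunc   ;==>_DecodedCodePoint

; Builds a pattern escape that targets the decoded character instead of the raw byte value.
Func _ByteToPatternEscape($iByte)
    Return "\x{" & Hex(_DecodedCodePoint($iByte), 4) & "}"
EndFunc   ;==>_ByteToPatternEscape

For $i = 0x80 To 0xFF
    Local $iCode = _DecodedCodePoint($i)
    If $iCode <> $i Then
        ConsoleWrite("byte 0x" & Hex($i, 2) & " became U+" & Hex($iCode, 4) & _
                "  -> try " & _ByteToPatternEscape($i) & " instead of \x" & Hex($i, 2) & @CRLF)
    EndIf
Next
```

Every byte the loop prints is one that \xXX cannot find, because the string no longer contains a character with that code. The output depends on the system code page, so it will differ between the German and the Thai setting.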

Edited by Robinson1


Well I don't know what you're trying to do, but binary is quite meaningless if not interpreted the same way as encoded. Perhaps you should consider trying the other encoding options for BinaryToString(), if you haven't done that already. Sorry I misread your code.

Edit: Try adding (*UCP) to the start of the regular expression and see if that helps with UTF-8 encoding. Perhaps it won't. It's a mystery!

Edited by czardas


How about this?

#include "StringConstants.au3"

$TestData = BinaryToString("7E7F808182")

;Okay
$match = StringRegExp( $TestData ,'\x7E' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

;Okay
$match = StringRegExp( $TestData ,'\x7F' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

;Error no match
$match = StringRegExp( $TestData ,'\x80' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

;Okay
$match = StringRegExp( $TestData ,'\x81' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

;Error no match
$match = StringRegExp( $TestData ,'\x82' ,$STR_REGEXPARRAYFULLMATCH)
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)

No that's not it. :whistle:

The instruction \x in the regular expression is trying to match a Unicode code point. This is most unlikely to coincide with an ASCII character. I'm beginning to think that is the problem.

Edited by czardas


Cool, thanks for your reply.

Okay, what I want to do now is patch away this nag screen

"Gesponserte Sitzung" ("Sponsored session")
"Dies war eine kostenlose Sitzung mit Unterstützung von www.teamviewer.com" ("This was a free session sponsored by www.teamviewer.com")

that pops up after each remote session.
 


To do so I need to

  1. seek/locate 'ShowSponsoredSessionDialog' via some specific vectors/unique values.
  2. seek to the start of this function and put a 'Return' there to disable it.
  3. null the CRC check - same procedure as 1. & 2. ...

Well, the specifications are already there. I just thought it would be nice to apply them with AutoIt using RegExp patch patterns.

 


Well yes, BinaryToString seems to be the critical point,
and in particular its flags, which specify how the binary data is converted/decoded.
Of those, only
$SB_ANSI (1) = binary data is ANSI (default) makes some sense here.

However, what is not in the AutoIt documentation is that this encoding/decoding depends on the country settings. I used Python 3 before, and there the string encoding/decoding issue is handled well. It's nice to learn and get practical experience on that topic.

 

Well, so far I have ended up creating a function called BinRegExp() that wraps StringRegExp.

  1. Does a pre-search by replacing all \x7F-\xFF in the pattern with . (any char)
  2. Checks each match via the slower but more reliable regex version that uses hex number strings
  3. Loops if needed (to filter out match artefacts)

I may post it here sooner or later, but it's not really a solution, more a workaround for the problem.

 

But let's get the focus back on this:

StringRegExp( $TestData ,'\x80'...

Why is it not working?

A.) all 0x80 bytes inside $TestData got somehow messed up by BinaryToString
B.) \x80 is somehow not transformed by StringRegExp as intended
C.) something else

Edited by Robinson1


Using \x with extended ascii is a futile exercise. The number of bytes may be incorrect or the binary might refer to meaningless code points. This has to be the reason it doesn't work. I still need to try and understand it properly myself.

Edited by czardas


Short answer:
PCRE isn't well suited to match binary data.

Long answer:
Wait a minute guys. You supply a string and call BinaryToString()?
If you want binary input, then performing conversion to binary would be a good idea, perhaps?

$TestData = Binary("0x7E7F808182")
ConsoleWrite(_vardump($TestData) & @LF)

Gives:
Binary       (5) 0x7E7F808182

Then, please realize that
ConsoleWrite('@extended = ' & @extended & '  $match = ' & $match & @CRLF)
isn't going to tell you much, as $STR_REGEXPARRAYFULLMATCH returns an array of matches. $match being an array, ConsoleWrite-ing it is nonsensical.
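
A minimal sketch of reporting the result without printing the array itself: capture @error and @extended right after the call, then read element 0 only when the call succeeded. _DescribeMatch is a hypothetical helper name, and the test strings are made up for illustration.

```autoit
#include <StringConstants.au3>

; Sketch: describe a StringRegExp result instead of ConsoleWrite-ing the array.
; _DescribeMatch is a hypothetical helper, not an AutoIt function.
Func _DescribeMatch($sSubject, $sPattern)
    Local $aMatch = StringRegExp($sSubject, $sPattern, $STR_REGEXPARRAYFULLMATCH)
    Local $iError = @error, $iNext = @extended   ; capture both before any other call resets them
    If $iError Then Return "no match (@error = " & $iError & ")"
    Return "match = '" & $aMatch[0] & "', next offset = " & $iNext
EndFunc   ;==>_DescribeMatch

ConsoleWrite(_DescribeMatch("ab~cd", "\x7E") & @CRLF)   ; subject contains a tilde (0x7E)
ConsoleWrite(_DescribeMatch("abcd", "\x7E") & @CRLF)    ; subject does not
```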

Then, the changelog of AutoIt warns us:
3.3.10.0 (23rd December, 2013) (Release)
...
Added: Regular expressions (PCRE engine) now using the new native 16bit mode and also compiled with full UCP support. Prefix patterns with (*UCP) to enable.

PCRE works character per character. Strings supplied to StringRegExp[Replace] were previously converted from native AutoIt UTF16-LE (actually just UCS-2 in fact) to UTF8 and the 8-bit PCRE engine was used. Now in 16-bit mode PCRE matches UCS-2 codepoints (using 16-bit encoding units).

As a user of the StringRegExp[Replace] wrappers, you can't control which engine (8- or 16-bit) is linked with AutoIt core and then used.
Note that you can't use \C to tell PCRE to match individual bytes regardless of character encoding, since \C works in current encoding units size (16-bit since AutoIt v3.3.10.0).
In theory you could bypass the hurdle by first converting your input binary to UTF16, but that would complicate things further, and doing so doesn't remove the final issue below.

Finally, the last problem --even with 8-bit PCRE-- with random binary data is input containing \x00 which is a string stop.

All of this results in binary not being the best food for StringRegExp. You're still not out of business. Forget binary and work on its raw hex representation!

All you have to do then is ensure that your regexp always groups pairs of characters, each of them representing one input byte.

; Array.au3 provides _ArrayDisplay
#include <Array.au3>
#include "StringConstants.au3"

$TestData = "7E7F8081823031323300006162637E507F518052815382548200"

$match = StringRegExp($TestData, "(?:..)*?(7E..)", $STR_REGEXPARRAYGLOBALMATCH)
_ArrayDisplay($match)

$match = StringRegExp($TestData, "(?:..)*?(7F..)", $STR_REGEXPARRAYGLOBALMATCH)
_ArrayDisplay($match)

$match = StringRegExp($TestData, "(?:..)*?(80..)", $STR_REGEXPARRAYGLOBALMATCH)
_ArrayDisplay($match)

$match = StringRegExp($TestData, "(?:..)*?(81..)", $STR_REGEXPARRAYGLOBALMATCH)
_ArrayDisplay($match)

$match = StringRegExp($TestData, "(?:..)*?(82..)", $STR_REGEXPARRAYGLOBALMATCH)
_ArrayDisplay($match)
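
A related sketch, under the assumption that the hex string has no "0x" prefix: when no aligned match exists, a regex engine normally retries one character further on, which would land in the middle of a byte. Prefixing the pattern with \G is one way to keep the search on byte boundaries. The helper name _HexFindAligned is hypothetical; the offset arithmetic follows the @extended usage shown earlier in the thread.

```autoit
#include <StringConstants.au3>

; Sketch: byte-aligned search in a hex string, returning a zero-based byte offset or -1.
; _HexFindAligned is a hypothetical helper; $sHexPattern is a regex over hex digits, two per byte.
Func _HexFindAligned($sHex, $sHexPattern)
    ; \G pins the match to the start position, so (?:..)*? can only skip whole bytes.
    Local $aMatch = StringRegExp($sHex, "\G(?:..)*?(" & $sHexPattern & ")", $STR_REGEXPARRAYMATCH)
    Local $iError = @error, $iNext = @extended   ; $iNext = 1-based position just after the match
    If $iError Then Return -1
    Return Int(($iNext - 1 - StringLen($aMatch[0])) / 2)
EndFunc   ;==>_HexFindAligned

ConsoleWrite(_HexFindAligned("7E7F8081823031", "80..") & @CRLF)   ; 80 is the third byte
ConsoleWrite(_HexFindAligned("0800", "80") & @CRLF)               ; "80" only straddles two bytes here
```

The second call is the interesting one: the digits "80" do occur in "0800", but only across the boundary between the bytes 08 and 00, so an aligned search should report no match.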

 


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


I did notice and tried converting the string to binary, but it didn't solve the problem. Since it had been reported as working with ANSI, I dismissed that as being the issue. The Help File states that \x applies to unicode (not ASCII). I was also quite tired. Thanks for the explanation.

"(*UCP)\x{0102}"

As an example for UTF-8 encoding; note that the code point is enclosed in {}. Or am I missing your problem?


The only effect of (*UCP) is to enable [Unicode] character properties, so that you can use \p and \P specifications. See the PCRE documentation for more details.

Again, the current implementation of PCRE in AutoIt is UTF16 only, not UTF8 as before 12/2013.

The OP wants matching on a byte basis; that's why one needs to use the hex representation to match binary, because "our" PCRE will never match bytes, only UTF16.




Okay, so in the end I decided on a hybrid implementation:

1. Run a test to find out which chars RegExp cannot match, and store them.
2. Do a search at char level (in the match pattern, replace all non-working chars with '.', any char).
3. Check each match by converting the match to a hex number string (and the match pattern as well, so that it matches a hex number string).
 

Func init_NotWorkingBytes()
        ;Create TestData
        local $TestData = "0x"
        for $i=00 to 0xff
            $TestData &= StringFormat( "%02X", $i)
        Next
        $TestData = BinaryToString ($TestData)

        Global $RegExpNotWorkingBytes    =    "(?|\\x80)"

        for $i=0x0 to 0xFF
            $pat = StringFormat( "\x%02X", $i)
            $match = StringRegExp( $TestData ,$pat  ,$STR_REGEXPARRAYFULLMATCH)
            if @error<>0 then
;~             ConsoleWrite('$match = ' & _
;~                 $pat& ' - ' & $match & '  > ' & chr($i) ) ;### Debug Console

                $RegExpNotWorkingBytes &= "|(?|\" & $pat & ")"
;~                 ConsoleWrite( @CRLF)

            EndIf
        Next
    Return $RegExpNotWorkingBytes
EndFunc


; #FUNCTION# ====================================================================================================================
; Name ..........: BinRegExp
; Description ...: Use RegExp with binary data
; Syntax ........: BinRegExp($test, $pattern[, $flag = 0[, $offset = 1]])
; Parameters ....: $test                - binary data already converted to a string with BinaryToString().
;                  $pattern             - the regular expression, with bytes written as \xXX.
;                  $flag                - [optional] flag passed through to StringRegExp. Default is 0.
;                  $offset              - [optional] 1-based string position at which to start matching. Default is 1.
; Return values .: The result of the last StringRegExp call, with its @error and @extended values
; Remarks .......: This is a kind of workaround / hybrid: for \x00-\x7F it uses StringRegExp on the binary data
;                  and then checks each match again with the slower StringRegExp on the hex number string.
; Related .......:
; Link ..........:
; Example .......: No
; ===============================================================================================================================
Func BinRegExp($test, $pattern, $flag = 0, $offset = 1)

    $RegExpNotWorkingBytes = init_NotWorkingBytes()
;~     ConsoleWrite('@@ Debug(' & @ScriptLineNumber & ') : $RegExpNotWorkingBytes = ' & $RegExpNotWorkingBytes & @CRLF & '>Error code: ' & @error & @CRLF) ;### Debug Console
;~     $RegExpNotWorkingBytes = '\\x[7-9A-Fa-f][0-9A-Fa-f]'

    ; Replace the non-working escapes in the range \x7F-\xFF with .
    $SafePattern = StringRegExpReplace( $pattern, _
            $RegExpNotWorkingBytes, _
            '.')

;~     ConsoleWrite('@@ Debug(' & @ScriptLineNumber & ') : $SafePattern = ' & $SafePattern & @CRLF & '>Error code: ' & @error & @CRLF) ;### Debug Console

    for $Round = 1 to 0x7FFFFFFF

        Local $RetVal         = StringRegExp($test, $SafePattern, $flag, $offset)
        Local $RetError     = @error
        Local $RetExtended     = @extended

        If $RetError = 0 Then

            $MatchData      = $RetVal[0]
            $MatchLength = StringLen($MatchData)


                $MatchStart         =  $RetExtended
                $MatchStart     -=  $MatchLength
                $MatchStart     -= 1

            $RetVal2 = _BinRegExp($MatchData, $pattern, $flag)
            If @error = 0 Then
                ; Match is valid
                ExitLoop

            ElseIf @error = 3 Then
              ; the match was too big - apply delta; seek back from end of current match

                $offset = $MatchStart + @extended


            else
                ; ... was not a real match - look for more
                $offset = $RetExtended
            EndIf


        Else
            ExitLoop
        EndIf

        ConsoleWrite('.')
;~         myLog(@ScriptLineNumber ,"$offset = " & hex($offset) )
    Next

    Return SetError($RetError, $RetExtended, $RetVal)

EndFunc   ;==>BinRegExp

Func _BinRegExp($test, $pattern, $flag = 0, $offset = 1)
        const $xdigit = "." ;"[0-9A-Fa-f]"

    ; Replace \xXX with .
    $Numberstring = StringReplace($pattern, '\x', '')
    $Numberstring = StringReplace($Numberstring, ".", "(?:" & $xdigit & $xdigit & ")")


    $test = StringToBinary($test)


    Local $RetVal = StringRegExp( $test, $Numberstring, $STR_REGEXPARRAYMATCH  )
    Local $RetError     = @error
    Local $RetExtended     = @extended


    If $RetError = 0 Then
        $MatchData      = $RetVal[0]

        $MatchLength = StringLen($MatchData)


        $testLength = StringLen( $test ) - 2 ; no '0x'

        $delta = $testLength - $MatchLength
        if $delta >= 2 then
            ; the match was too big - set @error 3 and return the adjustment delta in @extended

;~             $delta = $MatchLength - $delta ; set delta to how many bytes to seek back from end of current match
            $delta = DivBy2($delta)


            Return SetError(3, $delta)

        EndIf


    endif


    Return SetError($RetError, $RetExtended, $RetVal)
EndFunc   ;==>_BinRegExp


Func DivBy2($Divident)
    Return BitShift($Divident, 1)
EndFunc   ;==>DivBy2

Full sample using this is here:
http://bit.do/TeamViewerNA 


A working solution for using regular expressions on binary data:

#Region ;**** Directives created by AutoIt3Wrapper_GUI ****
#AutoIt3Wrapper_Change2CUI=y
#EndRegion ;**** Directives created by AutoIt3Wrapper_GUI ****

; #FUNCTION# ====================================================================================================================
; Name ..........: BinaryToLatin1String
; Description ...: Convert binary data into a string with a one-to-one
; byte to character representation. This is useful for performing
; regular expressions on binary data.
; Syntax ........: BinaryToLatin1String($dBinary)
; Parameters ....: $dBinary             - binary data.
; Return values .: String
; Remarks .......:
; Related .......:
; Link ..........:
; Example .......: No
; ===============================================================================================================================
Func BinaryToLatin1String($dBinary)
    If Not IsBinary($dBinary) Then Return ""
    Local $sText = ""
    For $i = 1 To BinaryLen($dBinary)
        Local $iCode = BinaryMid($dBinary, $i, 1)
        Local $sChr = ChrW($iCode)
        $sText &= $sChr
    Next
    Return $sText
EndFunc   ;==>BinaryToLatin1String

; #FUNCTION# ====================================================================================================================
; Name ..........: Latin1StringToBinary
; Description ...: Convert a string to binary data with a one-to-one
; character to byte representation. This is useful for displaying
; the binary matches of regular expressions.
; Syntax ........: Latin1StringToBinary($sText)
; Parameters ....: $sText               - string.
; Return values .: Binary
; Remarks .......:
; Related .......:
; Link ..........:
; Example .......: No
; ===============================================================================================================================
Func Latin1StringToBinary($sText)
    If Not IsString($sText) Then Return Null
    Local $tBuffer = DllStructCreate("byte[" & StringLen($sText) & "]")
    Local $dBinary
    For $i = 1 To StringLen($sText)
        Local $sChr = StringMid($sText, $i, 1)
        Local $iCode = AscW($sChr)
        DllStructSetData($tBuffer, 1, $iCode, $i)
    Next
    Return DllStructGetData($tBuffer, 1)
EndFunc   ;==>Latin1StringToBinary


; The following shows an example of using BinaryToLatin1String to perform a regular
; expression search on Notepad.exe

#include <FileConstants.au3>
#include <StringConstants.au3>
; = 1a.= Get Data
$BinaryData = FileRead( FileOpen( @SystemDir & "\notepad.exe" , $FO_Binary), 0x1000)

ConsoleWrite('$BinaryData ' & @TAB & '= ' & $BinaryData & @CRLF)

; = 1b.= Convert
;~ $BinaryData = BinaryToString( $BinaryData )
$BinaryData = BinaryToLatin1String( $BinaryData )

; = 2.= seek
;~ $pat = "(?s)\x4D\x5A.."
$pat = "(?s)\x89\x84.."
ConsoleWrite('$pat ' & @TAB & @TAB & '= ' & $pat & @CRLF)

$match = StringRegExp($BinaryData, _
            $pat, _
            $STR_REGEXPARRAYFULLMATCH _
        )
;3 Output
$Pos = @extended
If IsArray($match) Then
    $Pos -= StringLen( $match [0] ) ; seek to start of match
    $Pos -= 1                       ; make it zero-based offset
    ConsoleWrite('$Pos ' & @TAB & @TAB &'= 0x' & hex(  $Pos    ) & @CRLF )
    ConsoleWrite('$match[0] ' & @TAB  & '= '   & Latin1StringToBinary ( $match[0] ) & @CRLF )
Else
    ConsoleWrite('No matches could be found.' & @CRLF)
EndIf


;~ Expected OUTPUT:
;~ $BinaryData     = 0x4D5A90000300000004000000FFFF00...
;~ $pat            = (?s)\x89\x84..
;~ $Pos            = 0x00000A36
;~ $match[0]       = 0x89842440

 


To be able to match a regular expression against binary data (bytes), the binary data is converted first to a Unicode string (all AutoIt strings are Unicode) using "iso-8859-1", aka, Latin1 encoding. It is the only single-byte encoding that has one-to-one mapping with the first 256 Unicode code points. Other encodings do not preserve all the binary bytes after conversion to text.

1 hour ago, AmrAli said:

To be able to match a regular expression against binary data (bytes), the binary data is converted first to a Unicode string (all AutoIt strings are Unicode) using "iso-8859-1", aka, Latin1 encoding. It is the only single-byte encoding that has one-to-one mapping with the first 256 Unicode code points. Other encodings do not preserve all the binary bytes after conversion to text.

Unfortunately that's not true.

A native AutoIt string uses the UCS2 charset (subset of Unicode limited to the first 64k codepoints), whose first 256 codepoints can't be matched to any codepage.

The range 0x80..0x9F is completely different between Unicode and Latin1 (and any other codepage BTW).

Local $sU, $sA

For $i = 0x20 To 0x7F   ; identical subset = legacy ASCII
    $sU &= ChrW($i)
    $sA &= Chr($i)
Next
cw($sU)
cw($sA)

$sU = ""
$sA = ""
For $i = 0x80 To 0x9F
    $sU &= ChrW($i)     ; Unicode supplementary control characters
    $sA &= Chr($i)      ; Latin1 mix of regular and control characters
Next
cw($sU)
cw($sA)

$sU = ""
$sA = ""
For $i = 0xA0 To 0xFF   ; identical subset Unicode = Latin1
    $sU &= ChrW($i)
    $sA &= Chr($i)
Next
cw($sU)
cw($sA)

cw() is a Unicode-aware ConsoleWrite. This code yields the following result, posted as an image since control characters are best viewed this way:

[image: Unicode not Latin1.jpg]




It's theoretically possible to use AutoIt (PCRE) regexes to act on random binary string data, but that requires representing the data and most parts of the pattern in hex. This means that escape sequences and metacharacters are of no use, which voids much of the power of such a regex. For instance, \d in a regular pattern would have to be converted to something like (?:0030|0031|0032|0033|0034|0035|0036|0037|0038|0039) [or the same with bytes swapped], and that just for legacy ANSI digits; (*UCP)\d would need a very painful expansion!
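
For comparison, a small sketch of how a class translates in the simpler two-hex-digits-per-byte representation used earlier in the thread (not the 4-digit UTF16 form above). Under that assumption the byte range 0x30-0x39 can be written as 3[0-9]; the test data is a shortened copy of the hex string from the earlier example, and \G keeps the match on byte boundaries.

```autoit
#include <StringConstants.au3>

; Sketch: find the first run of ASCII digit bytes (0x30-0x39) in a hex string.
; Each byte is two hex digits, so the byte class [0-9] becomes 3[0-9] at the hex level.
Local $sHex = "7E7F8081823031323300006162"
Local $aRun = StringRegExp($sHex, "\G(?:..)*?((?:3[0-9])+)", $STR_REGEXPARRAYMATCH)
If Not @error Then ConsoleWrite("digit bytes: " & $aRun[0] & @CRLF)
```

This only covers contiguous byte ranges that happen to line up with hex digits; anything like a Unicode-aware \d still needs the expansion described above.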

Two ways to get the hex content of a string (avoiding loss of information): memory image or human-readable hex (by swapping bytes).

Local $s = "Μεγάλο πρόβλημα  Большая проблема  大问题  बड़ी समस्या  مشكلة كبيرة"

cw(StringToBinary($s, 2))   ; raw memory image

Local $a = StringToASCIIArray($s), $t = "0x"
For $v In $a
    $t &= Hex($v, 4)
Next
cw($t)                      ; hex readable content (bytes swapped)

Output:
0x9C03B503B303AC03BB03BF032000C003C103CC03B203BB03B703BC03B1032000200011043E043B044C04480430044F0420003F0440043E0431043B0435043C043004200020002759EE959898200020002C0921093C094009200038092E0938094D092F093E0920002000450634064306440629062000430628064A0631062906
0x039C03B503B303AC03BB03BF002003C003C103CC03B203BB03B703BC03B1002000200411043E043B044C04480430044F0020043F0440043E0431043B0435043C043000200020592795EE989800200020092C0921093C094000200938092E0938094D092F093E0020002006450634064306440629002006430628064A06310629

Note that the first character of the string is 0x039C (GREEK CAPITAL LETTER MU).

Due to the loss of escapes and metacharacters, such pedestrian use of regex on binary is of very limited use.




If we’re going to scan for any arbitrary sequence of bytes with values between 0 and 255, we have to be sure that each character of a string that represents the binary data maps back to its respective byte value.

Unfortunately, none of the encoding schemes that are allowed in AutoIt provide a one-to-one mapping of characters back to their respective byte values. There is a magic encoding scheme that does, however: ISO-8859-1 (Codepage: 28591).

Note that using regular expressions over binary data is not limited to searching; it is also useful for validating a specific byte format. See this link for a regular expression that validates the binary format of UTF-8 text files: https://www.w3.org/International/questions/qa-forms-utf-8 and also this link: https://stackoverflow.com/a/63049031/4208440

 

; Transcode.au3 =========================================================================
; If we’re going to scan for any arbitrary sequence of bytes with values between 0
; and 255, we have to be sure that each character of a string that represents the
; binary data maps back to its respective byte value.

; Unfortunately, none of the encoding schemes that are allowed in AutoIt provide
; a one-to-one mapping of characters back to its respective byte value. There is a
; magic encoding scheme that does, however: ISO-8859-1 (Codepage: 28591).
; =======================================================================================

; = 1a.= Create binary
$tBuffer = DllStructCreate("byte[256]")
For $i = 0x00 To 0xFF
    DllStructSetData($tBuffer, 1, $i, $i + 1)
Next
$BinaryData = DllStructGetData($tBuffer, 1)

ConsoleWrite('$BinaryData ' & @TAB & '= ' & $BinaryData & @CRLF & @CRLF )

; = 1b.= Convert to string
;~ $BinaryData = BinaryToString( $BinaryData )
$sString = BinaryToLatin1String( $BinaryData )

; = 1c.= Transcode back to binary
;~ $Transcoded = StringToBinary( $BinaryData )
$Transcoded = Latin1StringToBinary( $sString )

ConsoleWrite('$Transcoded ' & @TAB & '= ' & $Transcoded & @CRLF & @CRLF )

If $BinaryData = $Transcoded Then
    ConsoleWrite( 'Transcoding is OK.' & @CRLF )
Else
    ConsoleWrite( 'Transcoding Failed.' & @CRLF )
EndIf


;~ Expected OUTPUT:
;~ $BinaryData     = 0x000102030405060708090A0B0C0D0E0F101112131415161718191A1B1C1D1E1F202122232425262728292A2B...
;~ $Transcoded     = 0x000102030405060708090A0B0C0D0E0F101112131415161718191A1B1C1D1E1F202122232425262728292A2B...
;~ Transcoding is OK.

 

@jchd I also did a test for the Unicode string from your post.

; Transcode_v2.au3 ======================================================================
; =======================================================================================

#include <StringConstants.au3>

; = 1a.= Create binary
Local $s = "Μεγάλο πρόβλημα  Большая проблема  大问题  बड़ी समस्या  مشكلة كبيرة"

$BinaryData = StringToBinary($s, $SB_UTF16LE)   ; raw memory image

ConsoleWrite('$BinaryData ' & @TAB & '= ' & $BinaryData & @CRLF & @CRLF )

; = 1b.= Convert to string
;~ $BinaryData = BinaryToString( $BinaryData )
$sString = BinaryToLatin1String( $BinaryData )

; = 1c.= Transcode back to binary
;~ $Transcoded = StringToBinary( $BinaryData )
$Transcoded = Latin1StringToBinary( $sString )

ConsoleWrite('$Transcoded ' & @TAB & '= ' & $Transcoded & @CRLF & @CRLF )

If $BinaryData = $Transcoded Then
    ConsoleWrite( 'Transcoding is OK.' & @CRLF )
Else
    ConsoleWrite( 'Transcoding Failed.' & @CRLF )
EndIf


;~ Expected OUTPUT:
;~ $BinaryData     = 0x9C03B503B303AC03BB03BF032000C003C103CC03B203BB03B703BC03B1032000200011043E043B044C044804...
;~ $Transcoded     = 0x9C03B503B303AC03BB03BF032000C003C103CC03B203BB03B703BC03B1032000200011043E043B044C044804...
;~ Transcoding is OK.

Thank you for your reply.

Posted (edited)
13 hours ago, AmrAli said:

Unfortunately, none of the encoding schemes that are allowed in AutoIt provide a one-to-one mapping of characters back to their respective byte values. There is a magic encoding scheme that does, however: ISO-8859-1 (Codepage: 28591).

Please stop asserting that because it isn't true!

Also you're confusing two distinct things: character set and encoding.

EDIT:
Also, your links to regexes for validating UTF8 don't correctly apply to AutoIt, since we use UCS2 and not full UTF16. The high- and low-surrogate codepoints are hence valid by themselves in UCS2, but not in UTF16. The correct validation should match WTF-8, not UTF-8.

Edited by jchd

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Posted (edited)
19 hours ago, jchd said:

Please stop asserting that because it isn't true!

Also you're confusing two distinct things: character set and encoding.

EDIT:
Also, your links to regexes for validating UTF8 don't correctly apply to AutoIt, since we use UCS2 and not full UTF16. The high- and low-surrogate codepoints are hence valid by themselves in UCS2, but not in UTF16. The correct validation should match WTF-8, not UTF-8.

So, according to what you mentioned before, how can you explain why AutoIt uses these (not) supported constants?

Is it a design error or a misnomer?

BinaryToString ( expression [, flag = 1] )

flag: [optional] Changes how the binary data is converted:
    $SB_ANSI (1) = binary data is ANSI (default)
    $SB_UTF16LE (2) = binary data is UTF16 Little Endian
    $SB_UTF16BE (3) = binary data is UTF16 Big Endian
    $SB_UTF8 (4) = binary data is UTF8
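
A minimal sketch of what these flags do in practice, assuming the same constants are passed to StringToBinary as in the Transcode_v2.au3 test above. The single test character U+00E9 is an arbitrary choice for illustration; it is built with ChrW so the result does not depend on how the script file itself is saved.

```autoit
#include <StringConstants.au3>

; U+00E9 (e with acute accent), built with ChrW to avoid depending on the script file's encoding
Local $s = ChrW(0xE9)

; The same one-character string produces different bytes under different flags
Local $bUtf8 = StringToBinary($s, $SB_UTF8)
Local $bUtf16 = StringToBinary($s, $SB_UTF16LE)
ConsoleWrite("UTF8:    " & $bUtf8 & @LF)   ; 0xC3A9
ConsoleWrite("UTF16LE: " & $bUtf16 & @LF)  ; 0xE900

; Decoding with the flag that matches the bytes restores the original string
ConsoleWrite((BinaryToString($bUtf8, $SB_UTF8) == $s) & @LF)  ; True
```

The flag only tells the function how to interpret the bytes; it does not validate that the data really is in that encoding.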

 

Edit:

I am now adopting your approach anyway.

Edited by AmrAli


Both.

AutoIt started as a simple macro-like tool that served its author's needs and only gradually became a language. Unicode support (or rather wide-character support) was introduced step by step along the way and isn't perfect yet. Unicode solves a great many difficult problems but also introduces a new batch of hard ones. To get a sense of what I mean, read the Unicode Collation Algorithm, which should be used to compare Unicode strings: https://unicode.org/reports/tr10/

These names are popular approximations to more or less describe what's going to happen under the hood when you use the function.

For instance, ANSI is almost always a misnomer. See for example https://en.wikipedia.org/wiki/ANSI_character_set
The term CURRENT_CODEPAGE could be used instead (lacking something more explicit), but that would add to confusion because Windows-supported codepages include single-byte "ANSI"-like (256 entries) codepages as well as multibyte encodings (Big5, UTF8, ...) for large character sets.
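
A quick way to see what "ANSI" means on a given machine is to print where the upper byte values land in Unicode. This is only a sketch, and it rests on the assumption that Chr() maps a byte value through the system's current ANSI codepage; the output therefore differs from one system to another, so no expected output is given.

```autoit
; Assumption: Chr() converts a byte value to a character via the current ANSI codepage,
; while AscW() returns the Unicode codepoint of the resulting character.
For $i = 0x80 To 0x9F
    ConsoleWrite("byte 0x" & Hex($i, 2) & " -> U+" & Hex(AscW(Chr($i)), 4) & @LF)
Next
```

On a system whose codepage is Windows-1252, byte 0x80 would be expected to come out as U+20AC (the euro sign) rather than U+0080, which is exactly the kind of gap between a byte value and a character code that matters when binary data is pushed through a string.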

UTF8 should be correctly named WTF8 within AutoIt, but 99.99% of people are already confused when it comes to character sets and encodings. So if we had used another rarely-used term like WTF8 people would be completely lost.

The situation is similar with UTF16, which AutoIt doesn't use. AutoIt native strings use UCS2 encoding, which means that codepoints devoted to surrogates in UTF16 are just individual private-use codepoints in UCS2. That's why codepoints beyond the BMP range [0x0000..0xFFFF] count as 2 characters and not 1 (see below).

Let's take the ROCKET emoji 🚀 U+1F680. Its UTF16 representation is 0xD83D 0xDE80 (it needs a high and a low surrogate) and its UTF8 representation is 0xF0 0x9F 0x9A 0x80. If you set the AutoIt console to display UTF8 and set a suitable display font, you'll see that everything works fine, except that StringLen($s) is 2 instead of 1.

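Local $s = ChrW(0xD83D) & ChrW(0xDE80)  ; ROCKET emoji as a UTF16 surrogate pair
MsgBox(0, "Rocket", $s)
; Flag 4 = $SB_UTF8, flag 1 = $SB_ANSI: encode to UTF8, then reinterpret the bytes as ANSI for the console
Local $utf8 = BinaryToString(StringToBinary($s, 4), 1)
ConsoleWrite(StringLen($s) & @TAB & $utf8 & @LF)  ; StringLen($s) is 2, not 1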

This is important when you handle text where codepoints > 0xFFFF appear and if you rely on character counts, for instance. Sorting (collation) is another issue, especially with mixed-language input.
Of course, due to internal use of UCS2, ChrW(0x1F680) doesn't yield the wanted emoji, as the help file specifies.

It's remarkable that even if the encoding used is UCS2, the conversion to UTF8 shown above does correctly interpret high and low surrogates (F0 9F 9A 80 à la UTF16) instead of 2 distinct private codepoints (à la UCS2). This shows that the conversion to UTF8 is likely done by a Windows primitive (WideCharToMultiByte), while StringLen($s) is just the count of 16-bit encoding units (= characters in UCS2).
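
Since ChrW cannot take a codepoint above 0xFFFF, the surrogate pair has to be computed by hand. The following is a minimal sketch of that arithmetic; _ChrWFull is a hypothetical helper name chosen for this example, not an AutoIt built-in.

```autoit
; Hypothetical helper (not part of AutoIt): build a string for any Unicode codepoint.
Func _ChrWFull($iCodepoint)
    ; Codepoints inside the BMP fit in one 16-bit unit
    If $iCodepoint < 0x10000 Then Return ChrW($iCodepoint)
    Local $iOffset = $iCodepoint - 0x10000
    Local $iHigh = 0xD800 + BitShift($iOffset, 10)  ; top 10 bits (positive shift = shift right)
    Local $iLow = 0xDC00 + BitAND($iOffset, 0x3FF)  ; bottom 10 bits
    Return ChrW($iHigh) & ChrW($iLow)
EndFunc

Local $sRocket = _ChrWFull(0x1F680)
ConsoleWrite(Hex(AscW(StringLeft($sRocket, 1)), 4) & " " & Hex(AscW(StringRight($sRocket, 1)), 4) & @LF) ; D83D DE80
ConsoleWrite(StringLen($sRocket) & @LF) ; 2
```

The result is the same two-unit string as the hand-written ChrW(0xD83D) & ChrW(0xDE80), so StringLen still reports 2.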

To summarize: ANSI, UTF8 and UTF16 are widely known, close-enough approximations of what AutoIt really deals with, but the subtle differences may break the rare program that is unaware of these details. The general public is unlikely to have to dig that far in everyday scripting.


Posted (edited)

Nice workaround for the console output. 

ANSI-encoded utf8.

Local $utf8 = BinaryToString(StringToBinary($s, 4), 1)

Your explanation for the implementation of Unicode in AutoIt is excellent and clear. 

OK. As a proof of concept, I wrote this small app to demonstrate how to run a regular expression search over binary files from the command line.

Stay safe.

BinFind.au3

#Region ;**** Directives created by AutoIt3Wrapper_GUI ****
#AutoIt3Wrapper_Change2CUI=y
#AutoIt3Wrapper_Run_Tidy=y
#EndRegion ;**** Directives created by AutoIt3Wrapper_GUI ****

AutoItSetOption("MustDeclareVars", 1)

;~ A demonstration to show how to perform a regular expression
;~ search over binary files from command line.
;~ https://www.autoitscript.com/forum/topic/188564-use-regexp-on-binary-data

;~ Examples:
;~ BinFind "C:\Windows\System32\notepad.exe" "\x4D\x5A.."
;~ BinFind "C:\Windows\System32\notepad.exe" "\x89\x84.."

#include <FileConstants.au3>
#include <StringConstants.au3>

If $CmdLine[0] <> 2 Then
    ConsoleWrite("Wrong command line arguments." & @CRLF & @CRLF & "Usage: BinFind <filename> <regexp_pattern>" & @CRLF) ;
    Exit
EndIf

Local Const $sFilePath = $CmdLine[1]
Local Const $sPattern = $CmdLine[2]

If Not FileExists($sFilePath) Then
    ConsoleWrite("File not found: " & $sFilePath & @CRLF)
    Exit
EndIf

ConsoleWrite("Filename: " & $sFilePath & @CRLF)
ConsoleWrite("RegExp pattern: " & $sPattern & @CRLF)

; Get the binary data
Local $hFileOpen = FileOpen($sFilePath, $FO_READ + $FO_Binary)
If $hFileOpen = -1 Then
    ConsoleWrite("An error occurred when opening the file." & @CRLF)
    Exit
EndIf
Local $BinaryData = FileRead($hFileOpen)
FileClose($hFileOpen)

; Convert the binary data into a string with identical one-to-one
; byte to character representation. This is useful for performing
; regular expressions on binary data.
Local $sBinaryText = ""
For $i = 1 To BinaryLen($BinaryData)

    Local $iCode = BinaryMid($BinaryData, $i, 1)
    Local $sChrW = ChrW($iCode)
    $sBinaryText &= $sChrW
Next

; Perform a regular expression search on the mirror-image text.
; Note: search is not run over the original byte array.
Local $aMatch = 0, _
        $iOffset = 1, _
        $iMatches = 0
While 1
    $aMatch = StringRegExp($sBinaryText, _
            "(?sx)" & $sPattern, _
            $STR_REGEXPARRAYFULLMATCH, _
            $iOffset _
            )
    If @error Then ExitLoop
    $iOffset = @extended

    $iMatches += 1
    Local $sMatch = $aMatch[0]       ; get the full match as the first array element
    Local $iPos = $iOffset - StringLen($sMatch) - 1       ; seek to start of match
    ConsoleWrite("Offset: 0x" & Hex($iPos) & "  ")
    ConsoleWrite("Length: " & StringLen($sMatch) & "  ")
    ConsoleWrite("Bytes: ")
    For $j = 1 To StringLen($sMatch)

        Local $sChrW = StringMid($sMatch, $j, 1)
        Local $iCode = AscW($sChrW)
        ConsoleWrite("0x" & Hex($iCode, 2) & " ")
    Next
;~  ConsoleWrite(@TAB & "Char: [" & $sMatch & "]" & @CRLF)
    ConsoleWrite(@TAB & "Char: [" & StringRegExpReplace($sMatch, "[\x0\x09\x0D\x0A]", "?") & "]" & @CRLF)
WEnd

If $iMatches = 0 Then
    ConsoleWrite("No matches could be found." & @CRLF)
EndIf

Expected output:

D:\AutoIt>BinFind "C:\Windows\System32\notepad.exe" "\x4D\x5A.."
Filename: C:\Windows\System32\notepad.exe
RegExp pattern: \x4D\x5A..
Offset: 0x00000000  Length: 4  Bytes: 0x4D 0x5A 0x90 0x00       Char: [MZÉ?]
Offset: 0x00012279  Length: 4  Bytes: 0x4D 0x5A 0x00 0x00       Char: [MZ??]
Offset: 0x000156D0  Length: 4  Bytes: 0x4D 0x5A 0x00 0x00       Char: [MZ??]
Offset: 0x00015D27  Length: 4  Bytes: 0x4D 0x5A 0x00 0x00       Char: [MZ??]
Offset: 0x00019555  Length: 4  Bytes: 0x4D 0x5A 0x00 0x00       Char: [MZ??]
Offset: 0x00023474  Length: 4  Bytes: 0x4D 0x5A 0x00 0x00       Char: [MZ??]
Offset: 0x00023C62  Length: 4  Bytes: 0x4D 0x5A 0x00 0x00       Char: [MZ??]

D:\AutoIt>BinFind "C:\Windows\System32\notepad.exe" "\x89\x84.."
Filename: C:\Windows\System32\notepad.exe
RegExp pattern: \x89\x84..
Offset: 0x000004A9  Length: 4  Bytes: 0x89 0x84 0x24 0x80       Char: [??$?]
Offset: 0x00000D92  Length: 4  Bytes: 0x89 0x84 0x24 0x40       Char: [??$@]
Offset: 0x000010AA  Length: 4  Bytes: 0x89 0x84 0x24 0x40       Char: [??$@]
Offset: 0x0000170F  Length: 4  Bytes: 0x89 0x84 0x24 0x10       Char: [??$?]
Offset: 0x00001BA0  Length: 4  Bytes: 0x89 0x84 0x24 0x40       Char: [??$@]
Offset: 0x00005806  Length: 4  Bytes: 0x89 0x84 0x24 0x70       Char: [??$p]
Offset: 0x000077E4  Length: 4  Bytes: 0x89 0x84 0x24 0x50       Char: [??$P]
Offset: 0x0000AED0  Length: 4  Bytes: 0x89 0x84 0x24 0xA0       Char: [??$á]
Offset: 0x0000B6F0  Length: 4  Bytes: 0x89 0x84 0x24 0xB0       Char: [??$¦]
Offset: 0x0000B7B4  Length: 4  Bytes: 0x89 0x84 0x24 0x10       Char: [??$?]
Offset: 0x0000E54E  Length: 4  Bytes: 0x89 0x84 0x24 0x48       Char: [??$H]
Offset: 0x0000E5D2  Length: 4  Bytes: 0x89 0x84 0x24 0xF4       Char: [??$(]
Offset: 0x0000E5E9  Length: 4  Bytes: 0x89 0x84 0x24 0xF8       Char: [??$°]
Offset: 0x0000E6C1  Length: 4  Bytes: 0x89 0x84 0x24 0x88       Char: [??$?]
Offset: 0x0000E6ED  Length: 4  Bytes: 0x89 0x84 0x24 0x98       Char: [??$?]
Offset: 0x0000E7F4  Length: 4  Bytes: 0x89 0x84 0x24 0x90       Char: [??$É]
Offset: 0x0000E896  Length: 4  Bytes: 0x89 0x84 0x24 0xA0       Char: [??$á]
Offset: 0x0000EB15  Length: 4  Bytes: 0x89 0x84 0x24 0xB8       Char: [??$+]
Offset: 0x0000EBDB  Length: 4  Bytes: 0x89 0x84 0x24 0xB0       Char: [??$¦]
Offset: 0x0000F1F4  Length: 4  Bytes: 0x89 0x84 0x24 0x40       Char: [??$@]
Offset: 0x0001DB88  Length: 4  Bytes: 0x89 0x84 0x24 0x60       Char: [??$`]

Note: Results may vary depending on your Windows version.

 

 

BinFind.au3 test.cmd

Edited by AmrAli
Uploaded .au3 file

Posted (edited)

Sorry but I've a number of issues with this code, nothing personal!

First, I suspect that we don't have the same version of notepad.exe, so to make things comparable and testable by others, I propose using the test.bin file below. It simply consists of 256 bytes from 0x00 to 0xFF.

Now use your code to locate the pattern \x80...

Not found!

Also check locations listed in your example above with a hex editor to verify that your code did really find the searched values there...

Then change AscW to Asc and ChrW to Chr everywhere in your code (as it should be) and \x80 will still not be found!
But then make the pattern \x81.. and report.

Hint: make the pattern [\x80-\x9F] and see which characters match.

Spoiler

Here's why:

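Func cw($s) ; assumed helper (not shown in the post): write one line to the console as UTF8
    ConsoleWrite(BinaryToString(StringToBinary($s & @LF, 4), 1))
EndFunc
For $i = 0x80 To 0x9F
    cw(Hex($i, 2) & @TAB & Chr($i) & @TAB & ChrW($i) & @TAB & ChrW(Asc(ChrW($i))))
Next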

 

FINAL EDIT: the struck-through text above was BS from my tired brain.

test.bin

Edited by jchd
