DeltaRocked

StringregEx - identify hex


DeltaRocked

Hello all,

Enclosed is the code for identifying and converting the HEX values that are an integral part of a PDF. My previous post had workable code, which has since evolved to extract streams from a PDF and convert them from HEX to ASCII characters.

I need help revamping the code, especially the regular expression it uses. I believe there is a lot of room for improvement.

PS: The code works as it stands, but I am a bit disappointed with the way I have used the regular expression.

#include <array.au3>
#include <String.au3>

Local $iLoc = 1, $aPCRE, $cChar

Local $sPCRE = '(?:^_A-Za-z)?[0-9A-Fa-f](?:^_A-Za-z)?[0-9A-Fa-f]|#(?:^_A-Za-z)?[0-9A-Fa-f](?:^_A-Za-z)?[0-9A-Fa-f]|$1';(?=[^_G-zg-z])';(\1+)'

;~ Local $sString = ' function (0d'
Local $sString

; string1 and string2 occur during different scenarios which are under control hence return value will do the trick of identifying
; eg. string1 will be provided to the parser when PDF extraction is in process and
; string2 will be provided when PDF data streams are being extracted by passing them through the zlib function


;string1
;this string will always have HEX preceded by a #
$sString = 'trailer<</Siz#65 7/Ro#6f74 10 R>>'

;string2
;this string may have 'z' or ' ' or '0x' or '' preceding the hex number
;~ $sString ='z0d 0a0d0a0966z75z6ez63z74z69z6fz6e0x200x6dz5fz6bz33z33z31z56z4fz57z32z28z68z5fz30z35z5fz5fz62z5fz32z2cz20z'



Local $prev_iLoc = 0, $prev_len = 0
Local $count = 0
While 1
    $aPCRE = StringRegExp($sString, $sPCRE, 1, $iLoc)
    $iLoc = @extended
    $error = @error
    If $error Then ExitLoop
    $convert = StringReplace($aPCRE[0], '#', '')
    If @extended > 0 Then
        $convert = _HexToString($convert)
        $count = 1
    ElseIf $count == 0 Then
        $convert = _HexToString($convert)
    Else
        $convert = ''
    EndIf

    If Asc($convert) < 128 And Asc($convert) > 31 Then
        $cChar &= $convert
    EndIf
    $prev_iLoc = $iLoc
    $prev_len = StringLen($aPCRE[0])
WEnd
$cChar = StringReplace($cChar, ';', ';' & @CRLF)
$cChar = StringReplace($cChar, '}', '}' & @CRLF)
$cChar = StringReplace($cChar, '{', '{' & @CRLF)
ConsoleWrite("Resulting string: " & $cChar & @CRLF)
Edited by deltarocked

GEOSoft

$sString ='z0d 0a0d0a0966z75z6ez63z74z69z6fz6e0x200x6dz5fz6bz33z33z31z56z4fz57z32z28z68z5fz30z35z5fz5fz62z5fz32z2cz20z' isn't even close to being a hex string.

Valid hex digits are [0-9a-fA-F], which you are better off writing with the xdigit class.

[[:xdigit:]] is the same as writing [0-9a-fA-F] or (?i)[0-9a-f].
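As a rough illustration of that equivalence, the sketch below runs the three spellings over a short made-up sample and decodes every two-digit pair it finds. `_HexPairsToText` and `$sSample` are hypothetical names introduced here, not part of any UDF or of the original script. It also assumes the hex digits always arrive in aligned pairs, which may not hold for real stream data.

```autoit
; Hypothetical helper (not from any AutoIt UDF): decode every two-digit
; hex pair that $sPattern finds in $sIn.
Func _HexPairsToText($sIn, $sPattern)
    ; Flag 3 returns an array holding all matches
    Local $aPairs = StringRegExp($sIn, $sPattern, 3)
    If @error Then Return SetError(1, 0, '')
    Local $sOut = ''
    For $i = 0 To UBound($aPairs) - 1
        ; Dec() turns the hex text into a number, Chr() into a character
        $sOut &= Chr(Dec($aPairs[$i]))
    Next
    Return $sOut
EndFunc

; Made-up sample using the 'z' and '0x' prefixes seen in string2
Local $sSample = 'z66z75z6ez630x20z28'
Local $aPatterns[3] = ['[[:xdigit:]]{2}', '[0-9a-fA-F]{2}', '(?i)[0-9a-f]{2}']

For $i = 0 To UBound($aPatterns) - 1
    ConsoleWrite($aPatterns[$i] & ' -> "' & _HexPairsToText($sSample, $aPatterns[$i]) & '"' & @CRLF)
Next
```

All three patterns should pick out the same pairs (66 75 6e 63 20 28), so each line prints "func (". The 'z' and the 'x' of '0x' are skipped simply because they are not hex digits.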

EDIT: The Toolkit in my signature can be used to test a regex or even to store one in a library for future use.

EDIT 2: I'm off to bed but if you still need help tomorrow I'll look at it again.

Edited by GEOSoft

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"

DeltaRocked


I have the PCRE Toolkit, but I do not know how to use it, i.e. how do I use it for testing?

The string in question is extracted from a malicious PDF, so I can't help much there. These virus makers are making life hell for anyone reversing their work. Whether I should blame Adobe Reader for creating such a versatile file format is a question I am not even thinking about.

When you decode it, you will find that it is heavily obfuscated JavaScript.

I have seen examples in which 'eval' is assembled from a bunch of variables holding a mixture of HEX and ASCII characters. Detecting eval is therefore a huge job in itself, because we have no idea how it will be presented.

eg1.

var Xgsf='e';

var GHSF='va'

var JHSF=0x4c

eg2:

$sString = 'trailer<</Siz#65 7/Ro#6f#74 10 R>>'

After decoding: 'trailer<</Size 7/Root 10 R>>', where 10 is not HEX but refers to something else, i.e. the 10th object.
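A minimal sketch of that decoding step, assuming every escape is exactly a # followed by two hex digits. `_DecodeHashHex` is a hypothetical helper name used only for illustration. It walks the string with the same offset loop as the script in the first post and copies everything else through unchanged, so the plain 10 is left alone.

```autoit
; Hypothetical helper: replace each #hh escape with the character it encodes.
Func _DecodeHashHex($sIn)
    Local $sOut = '', $iPos = 1, $aHit, $iNext, $iErr, $iStart
    While 1
        ; Flag 1 returns the first match at or after offset $iPos;
        ; $aHit[0] holds the captured pair of hex digits
        $aHit = StringRegExp($sIn, '#([[:xdigit:]]{2})', 1, $iPos)
        $iNext = @extended ; offset just past the match
        $iErr = @error
        If $iErr Then ExitLoop
        $iStart = $iNext - 3 ; a match is always '#' plus two digits
        ; Copy the literal text that sits before this escape
        If $iStart > $iPos Then $sOut &= StringMid($sIn, $iPos, $iStart - $iPos)
        $sOut &= Chr(Dec($aHit[0]))
        $iPos = $iNext
    WEnd
    ; Append whatever follows the last escape
    Return $sOut & StringMid($sIn, $iPos)
EndFunc

ConsoleWrite(_DecodeHashHex('trailer<</Siz#65 7/Ro#6f#74 10 R>>') & @CRLF)
```

Building the output piece by piece, rather than calling StringReplace on the whole string, avoids re-reading a decoded '#' as the start of a new escape.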

To make matters worse, I sometimes end up finding SWF files with heap-spray attacks.

[EDIT]

Local $sPCRE = '(?:^_A-Za-z)?[[:xdigit:]](?:^_A-Za-z)?[[:xdigit:]]|#(?:^_A-Za-z)?[[:xdigit:]]?[[:xdigit:]]'

Edited by deltarocked

GEOSoft

I see you have edited with a RegExp so where do you stand with this now?

Using the PCRE Toolkit couldn't be simpler, but I do have to get to work on the help file. Someone volunteered to work on it and then dropped off, so it needs a total restart.

Just paste the string in the edit control, place the regex that you want to test against that string in the larger of the combo boxes, and click the green "Go" arrow to test.

If you want to test against the whole file, use the Local File tab instead and browse to the file. In that case you may be better off extracting the file to some readable format and browsing to that instead.

Would the string you posted earlier be enough to identify the file?


DeltaRocked


All I can see is a GREEN + sign on the right ... anyway, I will work on PCRE in the evening.

The reduced string is also working fine. Thanks for that [[:xdigit:]] tip; I was using only a single [ and ended up getting a huge number of erroneous results.

GEOSoft

Go arrow is on the left.

When using classes, they must also be enclosed within a [] group, but each class doesn't need a separate outer group.

[[:alpha:][:punct:]]+ is perfectly valid, and as long as you were looking for [a-zA-Z] or any punctuation it would be fine.

If you enable Tips in Options then, out of the very limited number that I have included so far, there is one about the [[:group:]] syntax and another about handling the punctuation characters [.!,?], which lose their meta-character status in a [] group.
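Both points can be tried with a couple of lines. This is only a sketch on made-up sample strings, and `$aWords` and `$aMarks` are illustrative variable names.

```autoit
#include <Array.au3>

; Two POSIX classes sharing one outer bracket pair: runs of letters
; and punctuation match, while the space and the digits do not
Local $aWords = StringRegExp('Hello, world! 42', '[[:alpha:][:punct:]]+', 3)
ConsoleWrite(_ArrayToString($aWords, ' | ') & @CRLF)

; Inside [] the dot and the question mark are literal characters,
; so only the real punctuation marks are returned
Local $aMarks = StringRegExp('a.b!c?d', '[.!,?]', 3)
ConsoleWrite(_ArrayToString($aMarks, ' ') & @CRLF)
```

The first call should return "Hello," and "world!", and the second should return the three marks . ! ? without matching any of the letters.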

If you have any questions about the Toolkit just PM me.


DeltaRocked


I don't know why, but the button is not visible on my PC (Windows 2003). Since you added the mouse hover, I realized that there is a control underneath. All these years I was wondering about it, but since I rarely used regex, I never really bothered.

Find the attached screenshot: http://imgur.com/sjP4W

Edited by deltarocked

GEOSoft

Ouch!! That is one ugly GUI running it on 2003.

I know what the problem is for your system (Shell32.dll icon) and I will fix it for the next release. Thanks.

P.S. I used to have that set as the Default button and I'm not sure why it was changed so I'll look into that as well.

