Decipher

RegEx Split String at every 'n' characters into Array

The string only contains word characters and should only be 40 characters long.

String Example: ce1fc50bffb09962be8f3c49478cbeb65e2afe0f

I need to split the string into chunks of two.

Array[0] = ce

Array[1] = 1f

And so on.

My attempt:

Dim $aArray = StringRegExp($sString, '(\w{2})+', 1, 1)

I would also like to know how to exclude specific characters or, if that is not possible, whether regex offers similar functionality.

This will be used to check torrent tracker status. I need to put % before every two characters. The example string above is a SHA-1 torrent info hash.

Edited by Decipher


I don't know if this is any help to you. Look at the second example in that post.

Edit

I'm not sure what you mean by exclude characters. You could remove unwanted characters before or after the split - leading to different results.
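As a rough illustration of removing unwanted characters before the split, here is a minimal sketch. The input string and variable names are made up for the example, and it assumes that "unwanted" means anything that is not a hex digit.

```autoit
#include <Array.au3>

; Hypothetical input: a hash fragment pasted with stray separators.
Local $sRaw = 'ce:1f:c5 0b-ff'

; Drop everything that is not a hex digit before splitting.
Local $sClean = StringRegExpReplace($sRaw, '[^0-9A-Fa-f]', '')

; Flag 3 returns an array of all global matches, here chunks of two characters.
Local $aPairs = StringRegExp($sClean, '.{2}', 3)

ConsoleWrite(_ArrayToString($aPairs, ' ') & @CRLF) ; ce 1f c5 0b ff
```

Cleaning after the split instead would leave chunks such as "e:" in the array, which is why the order matters.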

Edited by czardas


That's what I asked for, thank you. Let's say I wanted to write a regular expression that is true only if the string doesn't contain a given character. How would I do that?

#include <Array.au3>
Dim $aArray = _StringEqualSplit(StringUpper('962fb077e814c55a9a01b8f07e2fd2945cef6998'), 2)
ConsoleWrite('%' & _ArrayToString($aArray, '%') & @CRLF)
Func _StringEqualSplit($sString, $iNumChars)
    If (Not IsString($sString)) Or $sString = "" Then Return SetError(1, 0, 0)
    If (Not IsInt($iNumChars)) Or $iNumChars < 1 Then Return SetError(2, 0, 0)
    Return StringRegExp($sString, "(?s).{1," & $iNumChars & "}", 3)
EndFunc

The above code outputs: %96%2F%B0%77%E8%14%C5%5A%9A%01%B8%F0%7E%2F%D2%94%5C%EF%69%98

It is not valid. I am confused about how to encode the info hash to request information from a torrent tracker. This page shows what I'm trying to do, if someone doesn't mind helping me out.

http://nakkaya.com/2009/12/03/bittorrent-tracker-protocol/

Edited by Decipher


If your condition is based on there being only one character then I would use StringInStr() to check if the string contains the character you want to avoid. I'm not sure about the other technical details of your request. It sounds like it might take some study.
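A small sketch of both approaches for a single character, assuming the character to avoid is '%' and using a made-up test string. The regex variant uses a negated character class, so it only succeeds when every character in the string is something other than '%'.

```autoit
Local $sString = 'ce1fc50bffb09962'

; Plain substring check: StringInStr returns 0 when '%' is absent.
Local $bNoPercent1 = (StringInStr($sString, '%') = 0)

; Regex equivalent: the whole string consists of characters other than '%'.
; With the default flag, StringRegExp returns 1 on a match and 0 otherwise.
Local $bNoPercent2 = (StringRegExp($sString, '^[^%]*$') = 1)

ConsoleWrite($bNoPercent1 & ' ' & $bNoPercent2 & @CRLF)
```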

Edited by czardas


Perhaps this is what you are after, but I'm not sure:

ConsoleWrite('%' & _ArrayToString($aArray, '') & @CRLF)

Thanks for your help. This is my attempt to encode the hash.

ConsoleWrite(_HashEncode(_StringEqualSplit('123456789abcdef123456789abcdef123456789a', 2)) & @CRLF)
Func _HashEncode($aArray)
Local $url = "", $acode
For $i = 0 To UBound($aArray, 1) - 1
  $acode = $aArray[$i]
  Select
   Case ($acode >= 48 And $acode <= 57) Or _
     ($acode >= 65 And $acode <= 90) Or _
     ($acode >= 97 And $acode <= 122)
    $url = $url & Chr($acode)
   Case $acode = 45 Or $acode = 95 Or $acode = 46 Or $acode = 126
    $url = $url & Chr($acode)
   Case Else
    $url = $url & '%' & $aArray[$i]
  EndSelect
Next
Return $url
EndFunc   ;==>_HashEncode
Func _StringEqualSplit($sString, $iNumChars)
    If (Not IsString($sString)) Or $sString = "" Then Return SetError(1, 0, 0)
    If (Not IsInt($iNumChars)) Or $iNumChars < 1 Then Return SetError(2, 0, 0)
    Return StringRegExp($sString, "(?s).{1," & $iNumChars & "}", 3)
EndFunc

Output: %12%348N%9a%bc%de%f1%23-CY%ab%cd%ef%12%348N%9a

Needed Output: %124Vx%9a%bc%de%f1%23Eg%89%ab%cd%ef%124Vx%9a

Based on the above link.
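One likely reason the attempt above gives the wrong output is that each two-character chunk is compared as if it were already a decimal character code. The chunk "56" is treated as the number 56 (which is the digit "8"), whereas hex 56 is decimal 86 (the letter "V"). A minimal sketch of the same idea with Dec() applied first follows; the function name _HashEncodeSketch is made up for this example.

```autoit
; Sketch only: same ranges as the attempt above, but each hex pair is
; converted to its byte value with Dec() before it is compared.
Func _HashEncodeSketch($sHash)
    Local $aPairs = StringRegExp($sHash, '[0-9A-Fa-f]{2}', 3)
    If @error Then Return SetError(1, 0, '')
    Local $sUrl = '', $iCode
    For $i = 0 To UBound($aPairs) - 1
        $iCode = Dec($aPairs[$i]) ; hex pair -> byte value 0..255
        If ($iCode >= 48 And $iCode <= 57) Or _
                ($iCode >= 65 And $iCode <= 90) Or _
                ($iCode >= 97 And $iCode <= 122) Or _
                $iCode = 45 Or $iCode = 46 Or $iCode = 95 Or $iCode = 126 Then
            $sUrl &= Chr($iCode) ; unreserved character: emit it literally
        Else
            $sUrl &= '%' & $aPairs[$i] ; everything else stays percent-encoded
        EndIf
    Next
    Return $sUrl
EndFunc

ConsoleWrite(_HashEncodeSketch('123456789abcdef123456789abcdef123456789a') & @CRLF)
; %124Vx%9a%bc%de%f1%23Eg%89%ab%cd%ef%124Vx%9a
```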

If you don't pay attention to the spec and send this directly to the tracker, you will get an error; it should be in URL-encoded form. Padding every two chars with a % sign also doesn't work. Been there, done that; don't waste your time. Any hex in the hash that corresponds to an unreserved character should be replaced:

a-z A-Z 0-9 -_.~

Partition the hex into chunks of two and check whether each chunk corresponds to any of these values; if it does, replace it with the unreserved char:

(defn url-encode [hash]
  (apply str
         (map (fn [[a b]]
                (let [byte (BigInteger. (str a b) 16)]
                  (if (or (and (>= byte 65) (<= byte 90))  ; A-Z
                          (and (>= byte 97) (<= byte 122)) ; a-z
                          (and (>= byte 48) (<= byte 57))  ; 0-9
                          (= byte 45) (= byte 95) (= byte 46) (= byte 126))
                    (char byte) (str "%" a b)))) (partition 2 hash))))

So that a hash such as,

123456789abcdef123456789abcdef123456789a

becomes,

%124Vx%9a%bc%de%f1%23Eg%89%ab%cd%ef%124Vx%9a

Notice that hex 34 became 4, which is the character it represents in ASCII. You can test the correctness of your hashes using the tracker URL, but request from file rather than from announce:

http://some.tracker.com/file?info_hash=hash

If you get a torrent back that means you have the correct hash.

Here are some trackers you can test it with:

Tr_1=udp://fr33domtracker.h33t.com:3310/announce

Tr_2=http://announce.torrentsmd.com:8080/announce

Tr_4=udp://9.rarbg.com:2710/announce

Tr_6=udp://tracker.openbittorrent.com:80/announce

Here are some info hashes for different torrents, each on its own line:

89fc7b8e7aa220368213bc555e0f24b72295c35d1

1326dcb7fb42312d7d47d8347281a89b8c3804f3

a9e22c72041e3d336766a3ed5769070ebe9c548a

86e1b0fac439d34d34d96af168d8dfe7cbc42000

646ca38de9ec32ae6d1971fcc46665ab88c888fa

25942cb889387c509c165c923cf34c907b6e83aa

Earlier, when I asked how to exclude characters, I was just asking out of curiosity, not for my current project.

$sString = 'thisisatest'

Normally to check if it contains the word 'test' you could do StringRegExp($sString, 'test')

How could I tell it to return False if it also contains the word 'this'?

Edited by Decipher


Hmm, did you solve it? Here's what I wrote just now.

#include <Array.au3>
Dim $aArray = _StringEqualSplit('123456789abcdef123456789abcdef123456789a', 2)

$testStr = "0123456789abcdefghijklmnopqrstuvwxyz-_.~"
For $i = 0 To UBound($aArray) -1
    If StringInStr($testStr, Chr(Dec($aArray[$i]))) Then
        $aArray[$i] = Chr(Dec($aArray[$i]))
    Else
        $aArray[$i] = "%" & $aArray[$i]
    EndIf
Next
ConsoleWrite(_ArrayToString($aArray, "") & @CRLF)

Func _StringEqualSplit($sString, $iNumChars)
    If (Not IsString($sString)) Or $sString = "" Then Return SetError(1, 0, 0)
    If (Not IsInt($iNumChars)) Or $iNumChars < 1 Then Return SetError(2, 0, 0)
    Return StringRegExp($sString, "(?s).{1," & $iNumChars & "}", 3)
EndFunc


Yes you did.

Do you mind explaining how you're using Dec and Chr, for future reference? In other words, could you break it down?


The function Chr only accepts decimal input, and because the 2 digits in each element represent hex values, they first need to be converted to decimal using Dec before you pass them to the Chr function. This can be done easily by passing the functions as parameters in the correct sequence as I did (but it isn't necessary to do it this way). You should look up these functions in the help file and check out the ascii character codes page in there too.

I hope that makes sense. You might need to study it a bit. ;)
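To break that down for a single chunk, here is a tiny sketch (the variable names are only for illustration):

```autoit
Local $sHex = '34'
Local $iCode = Dec($sHex)   ; 52: the hex string converted to a decimal number
Local $sChar = Chr($iCode)  ; "4": the ASCII character with code 52
ConsoleWrite($sHex & ' -> ' & $iCode & ' -> ' & $sChar & @CRLF) ; 34 -> 52 -> 4
```

Writing Chr(Dec($sHex)) simply nests the two steps into one expression.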

One thing that isn't clear to me from the article is whether or not the string needs to begin with %. You may need to watch out for this if the first array element is replaced (my code will not always add a preceding % at the start). I'm unfamiliar with bencode, but it is easy to add % at the start of the string if it is needed.

Edited by czardas

Awesome, I think I got everything I need now. I appreciate your help with this.


It was interesting to spend time on. I'm happy to help. ;)

Edit

Actually looking at this again, it can be improved slightly. It's not a major change.

#include <Array.au3>
Dim $aArray = _StringEqualSplit('123456789abcdef123456789abcdef123456789a', 2)

$testStr = "0123456789abcdefghijklmnopqrstuvwxyz-_.~"
For $i = 0 To UBound($aArray) -1
    $sChar = Chr(Dec($aArray[$i])) ; This should make it slightly faster - less conversion involved.
    If StringInStr($testStr, $sChar) Then
        $aArray[$i] = $sChar
    Else
        $aArray[$i] = "%" & $aArray[$i]
    EndIf
Next
ConsoleWrite(_ArrayToString($aArray, "") & @CRLF)

Func _StringEqualSplit($sString, $iNumChars)
    If (Not IsString($sString)) Or $sString = "" Then Return SetError(1, 0, 0)
    If (Not IsInt($iNumChars)) Or $iNumChars < 1 Then Return SetError(2, 0, 0)
    Return StringRegExp($sString, "(?s).{1," & $iNumChars & "}", 3)
EndFunc

It may be possible to do all this with one complicated RegExpReplace, but that would take some thinking about (I have my doubts about it). Also whether it would be better or not is perhaps debatable (depending on the complexity).

Edited by czardas


If you want to increase speed you should get rid of _ArrayToString.

$sHash = '123456789abcdef123456789abcdef123456789a'
ConsoleWrite(_BencodeHash($sHash) & @CRLF) ; '%124Vx%9a%bc%de%f1%23Eg%89%ab%cd%ef%124Vx%9a'

Func _BencodeHash($sStr)
    Local $aArray = StringRegExp($sStr, '\w{2}', 3)
    Local $sReturn = '', $sChar = ''
    For $i = 0 To UBound($aArray) -1
        $sChar = Chr(Dec($aArray[$i]))
        If StringInStr("0123456789abcdefghijklmnopqrstuvwxyz-_.~", $sChar) Then
            $sReturn &= $sChar
        Else
            $sReturn &= "%" & $aArray[$i]
        EndIf
    Next
    Return $sReturn
EndFunc

As for checking that a string contains the word 'test' but returning False if it also contains the word 'this', I think you want something like this:

#cs
    \A(?!.*this).*test
    \A               Anchor the match at the beginning of the string
    (?!             Start a negative lookahead subpattern
      .*            Match 0 or more of any characters, except newline
      this           The word we do NOT want in the string
    )               Close the negative lookahead
    .*               Match 0 or more of any characters, except newline
    test             The word we do want in the string

    The lookahead is a zero width assertion, so matching will continue at the
    same position after matching the subpattern, provided it did not terminate.
    Because it is a "negative" lookahead the subpattern will be satisfied if it
    can NOT match and continue with the rest of the pattern, it will terminate
    if it does match.
#ce

$sString = 'this is a test'
$sString = 'that was a test'
If StringRegExp($sString, '\A(?!.*this).*test') Then
    ConsoleWrite("Contains 'test' and NOT 'this'" & @CRLF)
ElseIf StringRegExp($sString, 'test') Then
    ConsoleWrite("Contains 'test' and 'this'" & @CRLF)
Else
    ConsoleWrite("Does not contain 'test'" & @CRLF)
EndIf

But for such simple needs, why not just use StringInStr?

$sString = 'this is a test'
$sString = 'that was a test'
If StringInStr($sString, 'test') And Not StringInStr($sString, 'this') Then
    ConsoleWrite("Contains 'test' and NOT 'this'" & @CRLF)
ElseIf StringInStr($sString, 'test') Then
    ConsoleWrite("Contains 'test' and 'this'" & @CRLF)
Else
    ConsoleWrite("Does not contain 'test'" & @CRLF)
EndIf
Edited by Robjong


Yeah that should run slightly faster. I wonder which regexp is quicker. I might test it later. ;)

Of course there are no error checks.
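If an error check is wanted, one possible sketch is to validate the input before encoding it. The helper name _IsInfoHash is made up for this example, and it assumes a valid info hash is exactly 40 hex digits, as described in the first post.

```autoit
; Hypothetical helper, not part of the code posted in this thread.
Func _IsInfoHash($sHash)
    ; \A and \z anchor the whole string: exactly 40 hex digits, nothing else.
    Return StringRegExp($sHash, '\A[0-9A-Fa-f]{40}\z') = 1
EndFunc

ConsoleWrite(_IsInfoHash('1326dcb7fb42312d7d47d8347281a89b8c3804f3') & @CRLF)
ConsoleWrite(_IsInfoHash('not-a-hash') & @CRLF)
```

A caller could then return SetError(1, 0, '') when the check fails instead of encoding a malformed string.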

Edited by czardas
