MachinistProgrammer

generate random string using regex as template


Has anyone come across a way to take a regex and generate a random string that matches it?

Something like:

$my_random_string = randregex("[0-9_a-zA-z]+")
msgbox(0,'',StringRegExp($my_random_string,"[0-9_a-zA-z]+"))

All my projects live on github


That sounds terribly complicated with all the possible patterns.

An engine I'd not even remotely begin to take on myself.


Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.


#3 ·  Posted (edited)

Maybe something like this.

Local $sREPattern = "[0-9_a-zA-Z\r]+"

Local $my_random_string = randregex($sREPattern)
ConsoleWrite($my_random_string & @LF)
ConsoleWrite(StringLen($my_random_string) & " characters <<<<<<<" & @LF)

MsgBox(0, 'Check random string against pattern', StringRegExp($my_random_string, $sREPattern))


Func randregex($sPattern)
    Local $i = Random(1, 100, 1) ; Random length of string
    Local $iCount = 0, $sChar, $sRetStr
    ;ConsoleWrite($i & @LF)
    While $iCount < $i
        $sChar = ChrW(Random(0, 255, 1))
        If StringRegExp($sChar, $sPattern) Then
            $iCount += 1
            $sRetStr &= $sChar
        EndIf
    WEnd
    Return $sRetStr
EndFunc   ;==>randregex

Edit: Moved $iCount += 1 to within "If Then EndIf" statement; &
Changed $sChar = Chr(Random(32, 126, 1)) to $sChar = ChrW(Random(0, 255, 1)) in order to have every character available.

Edited by Malkey


#4 ·  Posted (edited)

Malkey's way is certainly the simplest, though it doesn't really address what a general regex-driven random string generator would need.

I started messing around with the concept, and I can now tell you: I 100% would not attempt it. Just messing with your single bracket expression made my head hurt when I thought of all the other possible regular expression combinations.

Here's an example:

#include <Array.au3>
; please note you have your A-z probably wrong, but I'm leaving it just in case it was intentional
Global $gsRet = _myCustomRandomGen("[0-9_a-zA-Z]+", 22)
ConsoleWrite($gsRet & @CRLF)

Func _myCustomRandomGen($sExpression, $nStrLen, $nEncoding = 0); see StringFromAsciiArray for encoding

    If $nStrLen < 1 Then
        Return SetError(1, 0, "")
    EndIf

    ; only deal with square bracket
    Local $aBrackets = StringRegExp($sExpression, "^(?!\\)\[(?!^)(.+?)(?!\\)\]", 3)
    If @error Then
        ; no square brackets
        Return SetError(2, 0, "")
    EndIf

    Local $aMatch = 0, $aTmp
    Local $aTo[10][2], $iCc
    For $i = 0 To UBound($aBrackets) - 1
        ; match x-x chars
        $aMatch = StringRegExp($aBrackets[$i], "(.-.|.)", 3)
        If @error Then ContinueLoop
        For $j = 0 To UBound($aMatch) - 1
            ; manage our dynamic array size
            If $iCc And Mod($iCc, 10) = 0 Then
                ReDim $aTo[$iCc + 10][2]
            EndIf
            ; extract from / to and single chars
            $aTmp = StringRegExp($aMatch[$j], "(?:(.)-(.)|(.))", 1)
            If @error Then ContinueLoop
            If UBound($aTmp) = 3 Then
                $aTo[$iCc][0] = AscW($aTmp[2])
                $aTo[$iCc][1] = $aTo[$iCc][0]
            Else
                $aTo[$iCc][0] = AscW($aTmp[0])
                $aTo[$iCc][1] = AscW($aTmp[1])
            EndIf
            $iCc += 1
        Next
    Next

    ; if we had nothing in the brackets, escape
    If Not $iCc Then
        Return SetError(3, 0, "")
    EndIf

    ; trim ubound of array based on matches
    ReDim $aTo[$iCc][2]

    ; ascii container
    Local $aRet[100]

    $iCc = 0
    For $i = 0 To UBound($aTo, 1) - 1
        For $j = $aTo[$i][0] To $aTo[$i][1]
            If $iCc And Mod($iCc, 100) = 0 Then
                ReDim $aRet[$iCc + 100]
            EndIf
            $aRet[$iCc] = $j
            $iCc += 1
        Next
    Next

    ; trim the array
    ReDim $aRet[$iCc]

    ; ensure array is at least as large as the string wanted
    If $iCc < $nStrLen Then
        $aTmp = $aRet
        While UBound($aRet) < $nStrLen
            _ArrayConcatenate($aRet, $aTmp)
        WEnd
    EndIf

    ; randomly shuffle the array
    _ArrayShuffle($aRet)

    ; trim to the stringlen we need
    ReDim $aRet[$nStrLen]

    Return StringFromASCIIArray($aRet)
EndFunc 

Keep in mind, I didn't test past your single expression.

Edit:

Had to ensure array size was at least string length wanted... oops!

Edited by SmOke_N


#5 ·  Posted (edited)

This would be very, very, very hard to accomplish algorithmically for an unknown pattern in a finite time.

My first thought was a brute-force random string generator that would, in a flat universe without heat death, reach a solution in finite time for any given pattern. But then I realized that there are infinitely many unmatchable patterns, like ^(?=y)x$ , so any brute-force approach would loop forever on one of those. Fixing that means algorithmically recognizing any unmatchable pattern as such in finite time, which seems like a daunting task in logic, to set the Guinness record for understatement.

But you're not the first one to try:

https://github.com/asciimoo/exrex/blob/master/exrex.py (Python)

https://github.com/mifmif/Generex (Java)

If you accept possible infinite loops and keep your supported regular expression feature set relatively modest (no lookaheads/lookbehinds for instance, they complicate things a lot) it's fairly doable. Still, I have better things to do :D
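For what it's worth, the "accept possible infinite loops" option can be made safe by giving up after a fixed number of tries. A minimal sketch, assuming printable ASCII candidates only; the function name _RandRegexCapped and the $iMaxTries parameter are made up for illustration, and the anchoring wrapper will break patterns that start with a verb such as (*UCP):

```autoit
; Hypothetical helper: brute-force sampling with a hard cap on attempts.
; Returns a matching string, or "" with @error = 1 (and @extended = tries used).
Func _RandRegexCapped($sPattern, $iLength, $iMaxTries = 100000)
    Local $sCandidate
    ; Anchor the pattern so the whole candidate has to match, not just a part of it
    Local $sAnchored = "\A(?:" & $sPattern & ")\z"
    For $iTry = 1 To $iMaxTries
        $sCandidate = ""
        For $i = 1 To $iLength
            $sCandidate &= Chr(Random(32, 126, 1)) ; printable ASCII only (assumption)
        Next
        If StringRegExp($sCandidate, $sAnchored) Then Return $sCandidate
    Next
    Return SetError(1, $iMaxTries, "")
EndFunc   ;==>_RandRegexCapped

; A matchable pattern usually succeeds well within the cap
ConsoleWrite(_RandRegexCapped("[a-f]\d", 2) & @LF)

; The unmatchable pattern from above now fails cleanly instead of looping forever
Local $sNever = _RandRegexCapped("^(?=y)x$", 1, 1000)
If @error Then ConsoleWrite("gave up after " & @extended & " tries" & @LF)
```

Note that sampling whole strings like this gets exponentially less likely to hit as the length grows, so it is only practical for short strings or very permissive patterns.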

Edited by SadBunny

Roses are FF0000, violets are 0000FF... All my base are belong to you.


As long as only a single explicit character class is involved, things can be made pretty simple, even if we optionally allow characters in the whole Unicode range that AutoIt handles:

Local $sPattern = "(*UCP)[0-9_a-zA-Z\p{Greek}]+"

Local $my_random_string

For $i = 1 To 10
    $my_random_string = randregex($sPattern, Random(8, 15, 1),True)
    msgbox(0, '', ">" & $my_random_string & "<" & @LF & StringRegExp($my_random_string, $sPattern))
Next


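Func randregex($sPattern, $iLength = 8, $bUnicode = False)
    ; Cache the allowed characters; rebuild only when the pattern or the range changes
    Local Static $sChars = "", $sCachedKey = "", $iMax = 0
    Local $sKey = $bUnicode & "|" & $sPattern
    If Not ($sKey == $sCachedKey) Then
        $sChars = ""
        For $i = 0x20 To ($bUnicode ? 0xFFFF : 0xFF)
            If $bUnicode And $i = 0x80 Then $i = 0xA0 ; skip C1 control characters
            If $i = 0xD800 Then $i = 0xE000 ; skip the surrogate range
            $sChars &= ChrW($i)
        Next
        ; Negate the class and delete every character it does not allow
        $sChars = StringRegExpReplace($sChars, StringReplace($sPattern, "[", "[^"), "")
        $iMax = StringLen($sChars)
        $sCachedKey = $sKey
    EndIf
    ; Nothing in the scanned range matches the class
    If $iMax = 0 Then Return SetError(1, 0, "")
    Local $sRand = ''
    For $i = 1 To $iLength
        $sRand &= StringMid($sChars, Random(1, $iMax, 1), 1)
    Next
    Return $sRand
EndFunc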

It would be much more problematic to cope with the general case of a PCRE pattern.

Say you want to enforce a more complex set of rules, like any optionally signed, non-empty sequence of Unicode digits followed by a currency symbol (unlikely to be chosen for a password): the pattern would have to be "(*UCP)[-+]?\d+\p{Sc}"

Then you'd have to dissect the pattern just like PCRE does at compile-time and filter allowed characters to build up a valid result.
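To give an idea of what "dissecting" means, here is a rough sketch where the dissection is done by hand rather than by a parser: the pattern is written down as a list of pieces, each with an alphabet and a min/max repeat count, and every piece is generated in turn. The names _RandFromPieces and _RandInt, the piece table layout and the cap of 6 digits are all assumptions for illustration; the digit and currency alphabets are small samples, not the full sets that \d and \p{Sc} cover under (*UCP).

```autoit
; Hand-made dissection of [-+]?\d+\p{Sc}: alphabet, min repeats, max repeats
Local $aPieces[3][3] = [ _
        ["-+", 0, 1], _                                             ; [-+]?
        ["0123456789", 1, 6], _                                     ; \d+  (capped at 6, ASCII digits only)
        ["$" & ChrW(0x20AC) & ChrW(0x00A3) & ChrW(0x00A5), 1, 1]]   ; \p{Sc} (dollar, euro, pound, yen only)

Local $sResult = _RandFromPieces($aPieces)
ConsoleWrite($sResult & @LF)
; Check the result against the real pattern (1 = match)
ConsoleWrite(StringRegExp($sResult, "(*UCP)^[-+]?\d+\p{Sc}$") & @LF)

; Hypothetical generator: emit a random number of random characters per piece
Func _RandFromPieces($aTable)
    Local $sOut = "", $sAlphabet, $iRepeat
    For $i = 0 To UBound($aTable) - 1
        $sAlphabet = $aTable[$i][0]
        $iRepeat = _RandInt($aTable[$i][1], $aTable[$i][2])
        For $j = 1 To $iRepeat
            $sOut &= StringMid($sAlphabet, _RandInt(1, StringLen($sAlphabet)), 1)
        Next
    Next
    Return $sOut
EndFunc   ;==>_RandFromPieces

; Random integer in [$iMin, $iMax]; avoids calling Random() with equal bounds
Func _RandInt($iMin, $iMax)
    If $iMin >= $iMax Then Return $iMin
    Return Random($iMin, $iMax, 1)
EndFunc   ;==>_RandInt
```

The hard part, which this sketch skips entirely, is producing such a table automatically from an arbitrary PCRE pattern (alternation, groups, backreferences, lookarounds and so on).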



#7 ·  Posted (edited)

Another solution for a class similar to the one defined in the first post. Although it's neither case-sensitive nor universal, I think the first step could be a fairly efficient solution for sets containing between 11 and 16 characters. Some replacements can be made after generating the hex. I have not made any comparisons, but it might still be of interest.

;

Local $sRandom = _RandomHexStr(Random(1, 1000, 1)), $sNewString = ""

; Add some underscores :)
For $i = 1 To StringLen($sRandom) +1
    While Not Random(0, 16, 1)
        $sNewString &= "_"
    WEnd
    $sNewString &= StringMid($sRandom, $i, 1)
Next
ConsoleWrite($sNewString & @LF)

Func _RandomHexStr($iLen)
    Local $sHexString = ""
    For $i = 1 To Floor($iLen/7)
        $sHexString &= StringRight(Hex(Random(0, 0xFFFFFFF, 1)), 7)
    Next
    $sHexString &= StringRight(Hex(Random(0, 0xFFFFFFF, 1)), Mod($iLen, 7))
    Return $sHexString
EndFunc ;==> _RandomHexStr
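
One way the "replacements after generating the hex" could look, as a rough sketch: treat each hex digit as an index into a custom 16-character alphabet. The function name _MapHexToAlphabet is made up for illustration, and it assumes the alphabet holds exactly 16 characters; smaller sets would need some extra handling that is not shown here.

```autoit
; Hypothetical helper: map each hex digit (0-F) onto the character at the same
; position in a 16-character alphabet
Func _MapHexToAlphabet($sHex, $sAlphabet)
    If StringLen($sAlphabet) <> 16 Then Return SetError(1, 0, "")
    Local $sOut = ""
    For $i = 1 To StringLen($sHex)
        ; Dec() turns one hex digit into 0..15; StringMid positions are 1-based
        $sOut &= StringMid($sAlphabet, Dec(StringMid($sHex, $i, 1)) + 1, 1)
    Next
    Return $sOut
EndFunc   ;==>_MapHexToAlphabet

; Uses _RandomHexStr() from above: 20 random characters from the set a-p
ConsoleWrite(_MapHexToAlphabet(_RandomHexStr(20), "abcdefghijklmnop") & @LF)
```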
Edited by czardas
