Jump to content

StringRegExp Help


Recommended Posts

So i have this string that i am trying to split up into an array.

 

 
{"q1":{"a2":[0,1],"a5":[0,0],"a3":[0,1],"a4":[0,0],"a1":[1,0]},"q2":{"a1":[1,1],"a2":[0,1]},"q3":{"a2":[1,0],"a4":[0,0],"a5":[0,0],"a3":[0,0],"a1":[0,1]},"q4":{"a2":[1,1],"a4":[0,1],"a1":[0,0],"a5":[0,1],"a3":0,0]},"q5":{"a1":[1,1],"a5":[0,1],"a4":[0,0],"a3":[0,0],"a2":[0,0]},"q6":{"a1":[1,1],"a4":[0,1],"a2":[0,1],"a3":[0,1]},"q7":{"a5":[1,0],"a3":[0,1],"a2":[0,1],"a1":[0,0],"a4":[0,1]},"q8":{"a3":[1,1],"a4":0,1],"a2":[0,1],"a1":[0,1]},"q9":{"a1":[1,0],"a2":[0,0]}}}}
What i am trying to accomplish is build a RegExp that will output something very similar to this:
$aArray[0] = "q1":{"a2":[0,1],"a5":[0,0],"a3":[0,1],"a4":[0,0],"a1":[1,0]},"
$aArray[1] = "q2":{"a1":[1,1],"a2":[0,1]},
$aArray[3] = "q3":{"a2":[1,0],"a4":[0,0],"a5":[0,0],"a3":[0,0],"a1":[0,1]},
....
So you can see i pretty much want the StringRegExp to grab all of the information inbetween "q\d"
Here is what i have done so far:
#include <Array.au3>

$hResults = FileOpen("Results.txt", 0)
$sResults = FileRead($hResults)
$aQuestions = StringRegExp($sResults, '"q\d{1,}.*', 3)
ConsoleWrite(@error & @CRLF & @extended & @CRLF)
_ArrayDisplay($aQuestions)

The output i get is $aQuestions[0] with the entire string. I tried less greedy options for the '.' by adding ++ or +? but neither are working. 

Hopefully i explained it well enough. Any help is appreciated!

Edited by Kidney
Link to comment
Share on other sites

#include<Array.au3>

$string = '{"q1":{"a2":[0,1],"a5":[0,0],"a3":[0,1],"a4":[0,0],"a1":[1,0]},"q2":{"a1":[1,1],"a2":[0,1]},"q3":{"a2":[1,0],"a4":[0,0],"a5":[0,0],"a3":[0,0],"a1":[0,1]},"q4":{"a2":[1,1],"a4":[0,1],"a1":[0,0],"a5":[0,1],"a3":0,0]},"q5":{"a1":[1,1],"a5":[0,1],"a4":[0,0],"a3":[0,0],"a2":[0,0]},"q6":{"a1":[1,1],"a4":[0,1],"a2":[0,1],"a3":[0,1]},"q7":{"a5":[1,0],"a3":[0,1],"a2":[0,1],"a1":[0,0],"a4":[0,1]},"q8":{"a3":[1,1],"a4":0,1],"a2":[0,1],"a1":[0,1]},"q9":{"a1":[1,0],"a2":[0,0]}}}}'


$aSplit = stringsplit($string , "]}" , 3)

For $i = 0 to ubound($aSplit) - 1
$aSplit[$i] = stringtrimleft($aSplit[$i] & "]}" , 1)
Next

_ArrayDelete($aSplit , ubound($aSplit) - 1)
_ArrayDisplay($aSplit)

 

,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)

Link to comment
Share on other sites

In your expression the ending  .*  matches the entire string
You need to get "q\d":  then the following braces with their content :  \{  [^}]+  \}

#include <Array.au3>

$str = '{"q1":{"a2":[0,1],"a5":[0,0],"a3":[0,1],"a4":[0,0],"a1":[1,0]},"q2":{"a1":[1,1],"a2":[0,1]},"q3":{"a2":[1,0],"a4":[0,0],"a5":[0,0],"a3":[0,0],"a1":[0,1]},"q4":{"a2":[1,1],"a4":[0,1],"a1":[0,0],"a5":[0,1],"a3":0,0]},"q5":{"a1":[1,1],"a5":[0,1],"a4":[0,0],"a3":[0,0],"a2":[0,0]},"q6":{"a1":[1,1],"a4":[0,1],"a2":[0,1],"a3":[0,1]},"q7":{"a5":[1,0],"a3":[0,1],"a2":[0,1],"a1":[0,0],"a4":[0,1]},"q8":{"a3":[1,1],"a4":0,1],"a2":[0,1],"a1":[0,1]},"q9":{"a1":[1,0],"a2":[0,0]}}}}'

$aQuestions = StringRegExp($str, '"q\d+":\{[^}]+\}', 3)
_ArrayDisplay($aQuestions)

 

Edited by mikell
Link to comment
Share on other sites

very nice,  in playing with that solution (moreso attempting to write one i understand), is this flawed, or less optimal in regex?  

 

$aQuestions = StringRegExp($str, '("q\d+".*?})(?:,|})', 3)

 

 

Edited by boththose

,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)

Link to comment
Share on other sites

It works too, the closing brace used as 'delimiter' after the lazy star makes the final non-capturing group useless
BTW the simplest could be

$aQuestions = StringRegExp($str, '"q[^}]+\}', 3)

You're right, the }  doesn't need to be escaped but I have this habit ...  :)
The negated class is better than .*?  because it avoids backtracking (http://www.regular-expressions.info/repeat.html)

Edited by mikell
Link to comment
Share on other sites

 

#include<Array.au3>

$string = '{"q1":{"a2":[0,1],"a5":[0,0],"a3":[0,1],"a4":[0,0],"a1":[1,0]},"q2":{"a1":[1,1],"a2":[0,1]},"q3":{"a2":[1,0],"a4":[0,0],"a5":[0,0],"a3":[0,0],"a1":[0,1]},"q4":{"a2":[1,1],"a4":[0,1],"a1":[0,0],"a5":[0,1],"a3":0,0]},"q5":{"a1":[1,1],"a5":[0,1],"a4":[0,0],"a3":[0,0],"a2":[0,0]},"q6":{"a1":[1,1],"a4":[0,1],"a2":[0,1],"a3":[0,1]},"q7":{"a5":[1,0],"a3":[0,1],"a2":[0,1],"a1":[0,0],"a4":[0,1]},"q8":{"a3":[1,1],"a4":0,1],"a2":[0,1],"a1":[0,1]},"q9":{"a1":[1,0],"a2":[0,0]}}}}'


$aSplit = stringsplit($string , "]}" , 3)

For $i = 0 to ubound($aSplit) - 1
$aSplit[$i] = stringtrimleft($aSplit[$i] & "]}" , 1)
Next

_ArrayDelete($aSplit , ubound($aSplit) - 1)
_ArrayDisplay($aSplit)

​This kinda works. the only thing this doesnt do is grab q1. 

 

In your expression the ending  .*  matches the entire string
You need to get "q\d":  then the following braces with their content :  \{  [^}]+  \}

#include <Array.au3>

$str = '{"q1":{"a2":[0,1],"a5":[0,0],"a3":[0,1],"a4":[0,0],"a1":[1,0]},"q2":{"a1":[1,1],"a2":[0,1]},"q3":{"a2":[1,0],"a4":[0,0],"a5":[0,0],"a3":[0,0],"a1":[0,1]},"q4":{"a2":[1,1],"a4":[0,1],"a1":[0,0],"a5":[0,1],"a3":0,0]},"q5":{"a1":[1,1],"a5":[0,1],"a4":[0,0],"a3":[0,0],"a2":[0,0]},"q6":{"a1":[1,1],"a4":[0,1],"a2":[0,1],"a3":[0,1]},"q7":{"a5":[1,0],"a3":[0,1],"a2":[0,1],"a1":[0,0],"a4":[0,1]},"q8":{"a3":[1,1],"a4":0,1],"a2":[0,1],"a1":[0,1]},"q9":{"a1":[1,0],"a2":[0,0]}}}}'

$aQuestions = StringRegExp($str, '"q\d+":\{[^}]+\}', 3)
_ArrayDisplay($aQuestions)

​Awesome! this works perfectly! 

Now its time to figure out how im gonna grab the "a\d" and the corresponding result (the 0 and 1 combo in the []'s

Link to comment
Share on other sites

This kinda works. the only thing this doesnt do is grab q1

im pretty sure it is in $aSplit[0].   But the regex is definitely the route to go.

,-. .--. ________ .-. .-. ,---. ,-. .-. .-. .-.
|(| / /\ \ |\ /| |__ __||| | | || .-' | |/ / \ \_/ )/
(_) / /__\ \ |(\ / | )| | | `-' | | `-. | | / __ \ (_)
| | | __ | (_)\/ | (_) | | .-. | | .-' | | \ |__| ) (
| | | | |)| | \ / | | | | | |)| | `--. | |) \ | |
`-' |_| (_) | |\/| | `-' /( (_)/( __.' |((_)-' /(_|
'-' '-' (__) (__) (_) (__)

Link to comment
Share on other sites

@kidney : for your next step, here is a start point :

#include <Array.au3>

$str = '{"q1":{"a2":[0,1],"a5":[0,0],"a3":[0,1],"a4":[0,0],"a1":[1,0]},"q2":{"a1":[1,1],"a2":[0,1]},"q3":{"a2":[1,0],"a4":[0,0],"a5":[0,0],"a3":[0,0],"a1":[0,1]},"q4":{"a2":[1,1],"a4":[0,1],"a1":[0,0],"a5":[0,1],"a3":0,0]},"q5":{"a1":[1,1],"a5":[0,1],"a4":[0,0],"a3":[0,0],"a2":[0,0]},"q6":{"a1":[1,1],"a4":[0,1],"a2":[0,1],"a3":[0,1]},"q7":{"a5":[1,0],"a3":[0,1],"a2":[0,1],"a1":[0,0],"a4":[0,1]},"q8":{"a3":[1,1],"a4":0,1],"a2":[0,1],"a1":[0,1]},"q9":{"a1":[1,0],"a2":[0,0]}}}}'

$aQuestions = StringRegExp($str, '"q[^}]+\}', 3)
For $i = 0 To UBound($aQuestions) - 1
    $iQuestion = StringMid($aQuestions[$i], StringInStr($aQuestions[$i], "q") + 1, 1)
    $aAnswers = StringRegExp($aQuestions[$i], '"a(\d+)":\[(\d+),(\d+)\]', 3)
    
    ConsoleWrite("Question #" & $iQuestion & " : " & @CRLF)
    For $j = 0 To UBound($aAnswers) - 1 Step 3
        ConsoleWrite("    Answer #" & $aAnswers[$j] & " : " & $aAnswers[$j + 1] & " - " & $aAnswers[$j + 2] & @CRLF)
    Next
    
Next

 

Link to comment
Share on other sites

Looks like json to me. Search the example section for AutoIt json parsers.

UDF List:

 
_AdapterConnections()_AlwaysRun()_AppMon()_AppMonEx()_ArrayFilter/_ArrayReduce_BinaryBin()_CheckMsgBox()_CmdLineRaw()_ContextMenu()_ConvertLHWebColor()/_ConvertSHWebColor()_DesktopDimensions()_DisplayPassword()_DotNet_Load()/_DotNet_Unload()_Fibonacci()_FileCompare()_FileCompareContents()_FileNameByHandle()_FilePrefix/SRE()_FindInFile()_GetBackgroundColor()/_SetBackgroundColor()_GetConrolID()_GetCtrlClass()_GetDirectoryFormat()_GetDriveMediaType()_GetFilename()/_GetFilenameExt()_GetHardwareID()_GetIP()_GetIP_Country()_GetOSLanguage()_GetSavedSource()_GetStringSize()_GetSystemPaths()_GetURLImage()_GIFImage()_GoogleWeather()_GUICtrlCreateGroup()_GUICtrlListBox_CreateArray()_GUICtrlListView_CreateArray()_GUICtrlListView_SaveCSV()_GUICtrlListView_SaveHTML()_GUICtrlListView_SaveTxt()_GUICtrlListView_SaveXML()_GUICtrlMenu_Recent()_GUICtrlMenu_SetItemImage()_GUICtrlTreeView_CreateArray()_GUIDisable()_GUIImageList_SetIconFromHandle()_GUIRegisterMsg()_GUISetIcon()_Icon_Clear()/_Icon_Set()_IdleTime()_InetGet()_InetGetGUI()_InetGetProgress()_IPDetails()_IsFileOlder()_IsGUID()_IsHex()_IsPalindrome()_IsRegKey()_IsStringRegExp()_IsSystemDrive()_IsUPX()_IsValidType()_IsWebColor()_Language()_Log()_MicrosoftInternetConnectivity()_MSDNDataType()_PathFull/GetRelative/Split()_PathSplitEx()_PrintFromArray()_ProgressSetMarquee()_ReDim()_RockPaperScissors()/_RockPaperScissorsLizardSpock()_ScrollingCredits_SelfDelete()_SelfRename()_SelfUpdate()_SendTo()_ShellAll()_ShellFile()_ShellFolder()_SingletonHWID()_SingletonPID()_Startup()_StringCompact()_StringIsValid()_StringRegExpMetaCharacters()_StringReplaceWholeWord()_StringStripChars()_Temperature()_TrialPeriod()_UKToUSDate()/_USToUKDate()_WinAPI_Create_CTL_CODE()_WinAPI_CreateGUID()_WMIDateStringToDate()/_DateToWMIDateString()Au3 script parsingAutoIt SearchAutoIt3 PortableAutoIt3WrapperToPragmaAutoItWinGetTitle()/AutoItWinSetTitle()CodingDirToHTML5FileInstallrFileReadLastChars()GeoIP databaseGUI - Only Close ButtonGUI ExamplesGUICtrlDeleteImage()GUICtrlGetBkColor()GUICtrlGetStyle()GUIEventsGUIGetBkColor()Int_Parse() & Int_TryParse()IsISBN()LockFile()Mapping CtrlIDsOOP in AutoItParseHeadersToSciTE()PasswordValidPasteBinPosts Per DayPreExpandProtect GlobalsQueue()Resource UpdateResourcesExSciTE JumpSettings INISHELLHOOKShunting-YardSignature CreatorStack()Stopwatch()StringAddLF()/StringStripLF()StringEOLToCRLF()VSCROLLWM_COPYDATAMore Examples...

Updated: 22/04/2018

Link to comment
Share on other sites

Or you can get the results in an array  :)

#include <Array.au3>

$str = '{"q1":{"a2":[0,1],"a5":[0,0],"a3":[0,1],"a4":[0,0],"a1":[1,0]},"q2":{"a1":[1,1],"a2":[0,1]},"q3":{"a2":[1,0],"a4":[0,0],"a5":[0,0],"a3":[0,0],"a1":[0,1]},"q4":{"a2":[1,1],"a4":[0,1],"a1":[0,0],"a5":[0,1],"a3":0,0]},"q5":{"a1":[1,1],"a5":[0,1],"a4":[0,0],"a3":[0,0],"a2":[0,0]},"q6":{"a1":[1,1],"a4":[0,1],"a2":[0,1],"a3":[0,1]},"q7":{"a5":[1,0],"a3":[0,1],"a2":[0,1],"a1":[0,0],"a4":[0,1]},"q8":{"a3":[1,1],"a4":0,1],"a2":[0,1],"a1":[0,1]},"q9":{"a1":[1,0],"a2":[0,0]}}}}'

$aQuestions = StringRegExp($str, '"q[^}]+\}', 3)
Local $aResults[1000][4], $n = 0
For $i = 0 To UBound($aQuestions) - 1
    $aResults[$n][0] = StringRegExp($aQuestions[$i], '(q\d+)', 3)[0] 
    $aAnswers = StringRegExp($aQuestions[$i], '"(a\d+)":\[(\d+),(\d+)\]', 4)
    For $j = 0 To UBound($aAnswers) - 1
        For $k = 1 to 3
            $aResults[$n][$k] = ($aAnswers[$j])[$k]
        Next
        $n += 1
    Next
Next
Redim $aResults[$n][4]
_ArrayDisplay($aResults)

 

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

×
×
  • Create New...