Jump to content
Sign in to follow this  
daxle

Parsing Text File

Recommended Posts

daxle

Hi there everyone,

I have a text file that looks something like the following:

Administrator        Guest               Matt                   
jaes                     james               jas                    
jasdfs               jasds               js                 
Matthew              pjaasddasdfs            pjaasdds               
pjads                pjdfs

Notice the odd formatting; there are not always a consistant amount of spaces between terms, I'm looking for a way to break up this text file into the individual terms (in this case user accounts), and not include the spaces. Any ideas?

Thanks for any advice!

Matt

Share this post


Link to post
Share on other sites
SmOke_N

If it's at least "consistant" with two or more spaces separating the columns, below will work.

The example returns a two dimensional array... rows ->[n][n]<- columns

#include <Array.au3> ; for _arraydisplay() func

Global $gs_somefile = "somefile_here.txt"
Global $ga_parsed = _myparse_filefunc($gs_somefile)

_ArrayDisplay($ga_parsed)

Func _myparse_filefunc($s_file)

    Local $s_fread = $s_file
    If FileExists($s_file) Then $s_fread = FileRead($s_file)

    If $s_fread = "" Then Return SetError(1, 0, 0)

    Local $a_lines = StringSplit(StringStripCR($s_fread), @LF)

    Local $a_col = 0
    Local $i_cols = 1, $i_rows = 0, $i_ub
    Local $a_wret[$a_lines[0] + 1][$i_cols]
    For $iline = 1 To $a_lines[0]
        If StringLen(StringStripWS($a_lines[$iline], 8)) = 0 Then ContinueLoop
        $a_col = StringRegExp($a_lines[$iline], "(?s)(.+?)(?:z|s{2,})", 3)
        If @error Then ContinueLoop
        $i_ub = UBound($a_col)
        If $i_ub > $i_cols Then
            $i_cols = $i_ub
            ReDim $a_wret[$a_lines[0] + 1][$i_cols]
        EndIf
        For $iword = 0 To $i_ub - 1
            $a_wret[$i_rows][$iword] = $a_col[$iword]
        Next
        $i_rows += 1
    Next

    If Not $i_rows Then Return SetError(2, 0, 0)

    ReDim $a_wret[$i_rows][$i_cols]
    Return $a_wret
EndFunc
Edited by SmOke_N

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.

Share this post


Link to post
Share on other sites
Spiff59

#include <Array.au3>

$file = @ScriptDir & "test.txt"
$data = FileRead($file)
$data = StringStripWS($data, 7)
$array = StringSplit($data, " " & @CR)

_ArrayDisplay($array)

Edited by Spiff59
  • Like 1

Share this post


Link to post
Share on other sites
daxle

Thanks guys, both work fine!

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
Sign in to follow this  

  • Similar Content

    • caramen
      By caramen
      Hey.  
       
      I requested help about how to get a value from a text in a variable. 
      Now i know how to do that. But i learned with the command FileRead. Now i whould like to know how to replace the command :
      FileRead('Dossier.txt') The purpose is to read a webpage text. To find some value inside. 
       
      Btw i tryed to play with WindowsInfo.au3 but i dont got much thing.  
    • Rskm
      By Rskm
      Hi, I have the following line in a text file 'input.txt'. I know the line number - say '6'. I wish to replace the text 'WWW' in the below line with a random number (I can generate that with random()).
      WERIS  WWWJP   3.83  8.330  1.000                1097.RAXX 
      The WWW is a 3 digit integer (could be any number between 0 to 999), I can use stringtrimleft and get the numerical value of WWW in this file
      so, basically, I know the string to replace (ie; WWW stored in a variable), I know the line number to work on and the file location/name and the replacement variable (through random()). My requirement is to fill that 3 spaces with my random number (which Is a integer between 1 and 999)
      please put ur suggestions
       
    • nacerbaaziz
      By nacerbaaziz
      Hello Members of this best Forum
      i have a question please
      for example if i have a long string
      and i want to extract a text between two tag
      what i can do to make that?
      note :
      i know that there is a
      StringRegExp function
      it's do that work
      but it result is be as an array
      i want the result to be a string
      is there any function on autoit can do that?
      Thanks in advance.
    • therks
      By therks
      I'm trying to create a simple clock widget that automatically scales the text to the size of the window. I came up with the following method, but it doesn't work as well as I'd like. It especially has trouble scaling to the width of the window for some reason (in the example, try resizing the window to be narrow and tall).
      Does anyone have a better method?
      #include <Misc.au3> #include <WinAPIConv.au3> #include <GUIConstants.au3> #include <GDIPlus.au3> Opt('MustDeclareVars', 1) Global $_FONT_FAMILY = 'Arial', $_LB_TEXT Main() Func Main() _GDIPlus_Startup() Local $hGUI GUIRegisterMsg($WM_SIZE, WM_SIZE) $hGUI = GUICreate('', 300, 100, Default, Default, $WS_OVERLAPPEDWINDOW, BitOR($WS_EX_TOOLWINDOW, $WS_EX_TOPMOST)) $_LB_TEXT = GUICtrlCreateLabel('This is a string', 0, 0, 300, 100, BitOR($SS_CENTER, $SS_CENTERIMAGE)) GUICtrlSetFont($_LB_TEXT, _MeasureString($hGUI, GUICtrlRead($_LB_TEXT), $_FONT_FAMILY), 0, 0, $_FONT_FAMILY, 5) GUISetState() Local $iGM While 1 $iGM = GUIGetMsg() Switch $iGM Case $GUI_EVENT_CLOSE ExitLoop EndSwitch WEnd _GDIPlus_Shutdown() EndFunc Func WM_SIZE($hWnd, $iMsg, $wParam, $lParam) GUICtrlSetFont($_LB_TEXT, _MeasureString($hWnd, GUICtrlRead($_LB_TEXT), $_FONT_FAMILY), 0, 0, $_FONT_FAMILY, 5) EndFunc Func _MeasureString($hWnd, $sString, $sFont = 'Arial') Local $iError, $aSize, $hGraphic, $hFormat, $hFamily, $tLayout, $iFontSize, $hFont, $aInfo If Not IsHWnd($hWnd) Then $hWnd = GUICtrlGetHandle($hWnd) EndIf $aSize = WinGetClientSize($hWnd) $hGraphic = _GDIPlus_GraphicsCreateFromHWND($hWnd) $hFormat = _GDIPlus_StringFormatCreate() $hFamily = _GDIPlus_FontFamilyCreate($sFont) $tLayout = _GDIPlus_RectFCreate(0, 0, $aSize[0], $aSize[1]) $iFontSize = 0 Do If Not $hFamily Then $iError = 1 $iFontSize = 10 ExitLoop EndIf $iFontSize += 1 $hFont = _GDIPlus_FontCreate($hFamily, $iFontSize, 0) $aInfo = _GDIPlus_GraphicsMeasureString($hGraphic, $sString, $hFont, $tLayout, $hFormat) _GDIPlus_FontDispose($hFont) If $aInfo[1] = 0 Then ExitLoop Until DllStructGetData($aInfo[0], 3) >= $aSize[0] Or DllStructGetData($aInfo[0], 4) >= $aSize[1] $iFontSize -= 1 _GDIPlus_FontFamilyDispose($hFamily) _GDIPlus_StringFormatDispose($hFormat) _GDIPlus_GraphicsDispose($hGraphic) Return SetError($iError, 0, $iFontSize) EndFunc
    • mistersquirrle
      By mistersquirrle
      Hello!
       
      I wrote myself a script to follow Google Maps Polyline encoding steps: https://developers.google.com/maps/documentation/utilities/polylinealgorithm, and that works (although I think that it's a bit janky), but now I'm having issues getting the output.
       
      When I run the script, all the points come out correctly in the console, and even when they're the only things that I log, it displays them fine. However, I'm adding each point into a variable to return all of them at once at the end, fully formatted, and it's only taking the very first point. I can't figure out what I'm doing wrong, as it seems fine.
       
      When run with the default value, it should output this at the end: Custom Polygon: _p~iF~ps|U_ulLnnqC_mqNvxq`@
      But instead I'm just getting this: Custom Polygon: _p~iF
       
      I know that it's reaching the string combination lines because it's logging the data before it (and even if a put log AFTER the $sPolygon &= $aPoints[0], it's logged fine).
       
      Here's my full code (problem is lines ~209 - 234, search "$sPolygon &= $aPoints[1]"):
      #include <Array.au3> #include <ButtonConstants.au3> #include <EditConstants.au3> #include <GUIConstantsEx.au3> #include <StaticConstants.au3> #include <WindowsConstants.au3> _PolyGUI() Func _PolyGUI() #Region ### START Koda GUI section ### Form= $hInputGUI = GUICreate("Lat Long encoder", 403, 301, 192, 124) GUISetFont(8, 400, 0, "Consolas") GUICtrlCreateLabel("Input polygon points here, format as:", 8, 8, 263, 19) GUICtrlSetFont(-1, 10, 800, 0, "Consolas") GUICtrlCreateLabel("Lat Long - Single point", 8, 24, 142, 17) GUICtrlCreateLabel("Lat Long, Lat Long, Lat Long - Multiple points", 8, 40, 280, 17) Local $sPoints = GUICtrlCreateEdit("", 8, 64, 385, 201, BitOR($ES_WANTRETURN, $WS_VSCROLL)) GUICtrlSetData(-1, "38.5 -120.2, 40.7 -120.95, 43.252 -126.453") GUICtrlSetFont(-1, 10, 400, 0, "Consolas") $bOK = GUICtrlCreateButton("bOK", 16, 272, 123, 25) GUICtrlSetFont(-1, 12, 800, 0, "Consolas") $bCancel = GUICtrlCreateButton("bCancel", 304, 272, 75, 25) GUICtrlSetFont(-1, 12, 800, 0, "Consolas") GUISetState(@SW_SHOW, $hInputGUI) #EndRegion ### END Koda GUI section ### While 1 $nMsg = GUIGetMsg() Switch $nMsg Case $GUI_EVENT_CLOSE Exit Case $bCancel Exit Case $bOK $sPoints = GUICtrlRead($sPoints) GUISetState(@SW_HIDE, $hInputGUI) _GetPoly($sPoints, True) ExitLoop EndSwitch Sleep(10) WEnd EndFunc ;==>_PolyGUI ;https://developers.google.com/maps/documentation/utilities/polylinealgorithm ;https://app.dsmobileidx.com/api/DescribeSearchForLinkId?linkId=469787 ; Note that this will only really work inside the US (this side of the World), as it's assuming any negative is the Longitude ;https://gist.github.com/ismaels/6636986 - decoder ;Using: 41.83162 -87.64696 ; Expected: sfi~F np}uO ; Actual: sfi~f np}uo ; If we remove 32 from the last ASCII code, since the last bit chunk is 0, we get the correct case/ characters ; We need to run this logic back through all the indexes though and do this to all that that <= 63 ;LinkId=469787 ; Expected: q{`aHpa_iVi[kp@}`Aa{@e[eCoqBbAyc@iRy{@g_@mz@|gA{eAh~@Vf~Etv@gB~p@gQ`^yg@~p@ekAldA{KfFxIrJ^pO~Mtl@dPrJnUz[nSpo@wf@fc@yw@n@ob@ ; Actual: s{`aHpa_iVg[kp@}`Aa{@g[gCmqBbA{c@iRy{@e_@kz@|gA{eAh~@Td~Evv@gB|p@gQb^wg@|p@ekAndA{KfFvIpJ`@rO~Mrl@dPrJnU|[lSpo@wf@dc@yw@n@mb@ ; I assume that this is because of bad data, the points have repeating 9's and 0's, which looks fishy. The polygon is (very) close, but not quite the same. Func _GetPoly($sPoints, $bLog = False) Local $timer = TimerInit(), $sConsole[11] Local $sPolygon = "" ; Step 1, take the initial signed value: Local $aCoords = StringRegExp($sPoints, "(-*?\d*\.\d*) (-*?\d*\.\d*)", 3), $aPoints[2] ;~ _ArrayDisplay($aCoords) If $bLog Then _Log(_ArrayToString($aCoords)) For $c = 0 To (UBound($aCoords) - 1) Step 2 ;~ If $bLog Then _Log($c) If $c = 0 Then $aPoints[0] = $aCoords[$c] $aPoints[1] = $aCoords[$c + 1] Else $aPoints[0] = $aCoords[$c] - $aCoords[$c - 2] $aPoints[1] = $aCoords[$c + 1] - $aCoords[$c - 1] EndIf If $bLog Then _Log("- Step 1, take the initial signed value:") _Log(" " & $aPoints[0]) _Log(" " & $aPoints[1]) EndIf ; Step 2, multiply each by 1e5, and round $aPoints[0] = Round($aPoints[0] * 1e5, 0) $aPoints[1] = Round($aPoints[1] * 1e5, 0) If $bLog Then _Log("- Step 2, multiply each by 1e5, and round") _Log(" " & $aPoints[0]) _Log(" " & $aPoints[1]) EndIf ; Step 3, convert Decimal to Binary, using two's complement for negatives. Padded to 32 bits $aPoints[0] = _NumberToBinary($aPoints[0]) $aPoints[1] = _NumberToBinary($aPoints[1]) If $bLog Then _Log("- Step 3, convert Decimal to Binary, using two's complement for negatives. Padded to 32 bits") _Log(" " & $aPoints[0]) _Log(" " & $aPoints[1]) EndIf ; Step 4, left-shifted 1 bit $aPoints[0] = StringTrimLeft($aPoints[0], 1) & "0" $aPoints[1] = StringTrimLeft($aPoints[1], 1) & "0" If $bLog Then _Log("- Step 4, left-shifted 1 bit") _Log(" " & $aPoints[0]) _Log(" " & $aPoints[1]) EndIf ; Step 5, if negative, invert binary If $c = 0 Then If $aCoords[$c] < 0 Then $aPoints[0] = _InvertBinary($aPoints[0]) If $aCoords[$c + 1] < 0 Then $aPoints[1] = _InvertBinary($aPoints[1]) Else If $aCoords[$c] - $aCoords[$c - 2] < 0 Then $aPoints[0] = _InvertBinary($aPoints[0]) If $aCoords[$c + 1] - $aCoords[$c - 1] < 0 Then $aPoints[1] = _InvertBinary($aPoints[1]) EndIf If $bLog Then _Log("- Step 5, if negative, invert binary") _Log(" " & $aPoints[0]) _Log(" " & $aPoints[1]) EndIf Local $aChunks[2][6], $0x20 For $i = 0 To 1 $0x20 = "1" ; This is out BitOR flag, 0x20 BitOR'd onto our 5-bit chunks is always 1*****, except the last chunk $sConsole[5] = "" ; Clearing console variables $sConsole[6] = "" $sConsole[7] = "" $sConsole[8] = "" $sConsole[9] = "" For $j = 0 To 5 ;There will always be 6 chunks ; Step 6 & 7, break into 5-bit chunks, and reverse order $aChunks[$i][$j] = StringTrimLeft($aPoints[$i], StringLen($aPoints[$i]) - 5) ; This splits into 5-bit chunks in reverse order, doing 6 & 7 in one operation ;~ If $bLog Then _Log(" " & $aPoints[$i]) ;~ If $bLog Then _Log(" " & StringLen($aPoints[$i])) ;~ If $bLog Then _Log(" " & StringTrimLeft($aPoints[$i], StringLen($aPoints[$i]) - 5)) ;~ If $bLog Then _Log(" " & $aChunks[$i][$j]) ; Here we consume the original binary string, so the next loop gets the correct next 5-bit chunk $aPoints[$i] = StringTrimRight($aPoints[$i], 5) $sConsole[5] &= $aChunks[$i][$j] & " " ; Once consumed, if the remaining length isn't enough for another bit chunk, switch 0x20 to 0 (no following chunks) If StringLen($aPoints[$i]) <= 5 Then $0x20 = "0" ; Step 8, BitOR 100000 (0x20) to our 5-bit chunks (effectively) $aChunks[$i][$j] = $0x20 & $aChunks[$i][$j] $sConsole[7] &= $aChunks[$i][$j] & " " ; Step 9, converting the chunk from Binary back to Decimal $aChunks[$i][$j] = _BinaryToDec($aChunks[$i][$j]) $sConsole[8] &= $aChunks[$i][$j] & " " ; Step 10, adding 63 to decimal values $aChunks[$i][$j] += 63 $sConsole[9] &= $aChunks[$i][$j] & " " If StringLen($aPoints[$i]) < 5 Then ExitLoop Next If $bLog Then _Log("- Step 6 & 7 (part " & $i & "), break into 5-bit chunks, and reverse order") _Log(" " & $sConsole[5]) _Log("- Step 8 (part " & $i & "), BitOR 100000 (0x20) to our 5-bit chunks (effectively)") _Log(" " & $sConsole[7]) _Log("- Step 9 (part " & $i & "), converting the chunk from Binary back to Decimal") _Log(" " & $sConsole[8]) _Log("- Step 10 (part " & $i & "), adding 63 to decimal values") _Log(" " & $sConsole[9]) EndIf Next Local $aASCII[0] For $i = 0 To 1 Dim $aASCII[0] ; Reset ASCII array For $j = 0 To (UBound($aChunks, 2) - 1) ; For both chunk sets ReDim $aASCII[UBound($aASCII) + 1] ; Add an index for the ASCII array If $aChunks[$i][$j] = "" Or $aChunks[$i][$j] <= 63 Then ; If the chunk is not useful $l = $j For $k = $l To 1 Step -1 If $aChunks[$i][$k] = "" Or $aChunks[$i][$k] <= 63 Or $aASCII[$k] <= 63 Then $aASCII[$k - 1] -= 32 If $aASCII[$k - 1] <= 63 Then _ArrayDelete($aASCII, $k - 1) Else ExitLoop EndIf Next ExitLoop EndIf $aASCII[$j] = Int($aChunks[$i][$j]) Next ;Step 11, convert each value to ASCII equivalent For $k = UBound($aASCII) - 1 To 0 If $aASCII[$k] <= 63 Or $aASCII[$k] = "" Then ReDim $aASCII[UBound($aASCII) - 1] Else ExitLoop EndIf Next $aPoints[$i] = StringFromASCIIArray($aASCII, 0, -1, 0) Next If $bLog Then _Log("- Step 11, convert each value to ASCII equivalent, finished") If $aCoords[$c] <= 0 Then ;@CRLF & " " & If $bLog Then _Log($aPoints[1]) _Log($aPoints[0]) _Log("Next set") EndIf $sPolygon &= $aPoints[1] $sPolygon &= $aPoints[0] Else If $bLog Then _Log($aPoints[0]) _Log($aPoints[1]) _Log("Next set") EndIf $sPolygon &= $aPoints[0] $sPolygon &= $aPoints[1] EndIf Next If $bLog Then _Log("Custom Polygon: " & $sPolygon) _Log(TimerDiff($timer) & @CRLF) EndIf Return $sPolygon EndFunc ;==>_GetPoly Func _NumberToBinary($iNumber) Local $sBinString = "" ; Maximum 32-bit # range is -2147483648 to 2147483647 If $iNumber < -2147483648 Or $iNumber > 2147483647 Then Return SetError(1, 0, "") ; Convert to a 32-bit unsigned integer. We can't work on signed #'s $iUnsignedNumber = BitAND($iNumber, 0x7FFFFFFF) ; Cycle through each bit, shifting to the right until 0 Do $sBinString = BitAND($iUnsignedNumber, 1) & $sBinString $iUnsignedNumber = BitShift($iUnsignedNumber, 1) Until Not $iUnsignedNumber ; Was it a negative #? Put the sign bit on top, and pad the bits that aren't set If $iNumber < 0 Then Return '1' & StringRight("000000000000000000000000000000" & $sBinString, 31) ; Always return 32 bit binaries If StringLen($sBinString) < 32 Then Return StringRight("0000000000000000000000000000000" & $sBinString, 32) Return $sBinString EndFunc ;==>_NumberToBinary Func _BinaryToDec($sBinary) Local Const $aPower[8] = [128, 64, 32, 16, 8, 4, 2, 1] Local $iDec If StringRegExp($sBinary, "[0-1]") Then If StringLen($sBinary) < 8 Then Do $sBinary = "0" & $sBinary Until StringLen($sBinary) = 8 EndIf $aBinary = StringSplit($sBinary, "", 2) For $i = 0 To UBound($aBinary) - 1 ;~ $aBinary[$i] = $aBinary[$i] * $aPower[$i] $iDec += $aBinary[$i] * $aPower[$i] Next Return $iDec Else Return SetError(0, 0, "Not a binary string") EndIf EndFunc ;==>_BinaryToDec Func _InvertBinary($iNumber) ;~ ConsoleWrite(@CRLF & $iNumber) Local $sNumber $aNumber = StringSplit($iNumber, "") For $i = 1 To $aNumber[0] If $aNumber[$i] = 0 Then $aNumber[$i] = 1 ElseIf $aNumber[$i] = 1 Then $aNumber[$i] = 0 Else Return SetError(0, 0, "Not a binary number") EndIf $sNumber &= String($aNumber[$i]) Next Return $sNumber EndFunc ;==>_InvertBinary Func _Log($data) ;~ Local Static $LogEnable = True ConsoleWrite(@CRLF & @HOUR & ":" & @MIN & "." & @SEC & " " & $data) LogData(@HOUR & ":" & @MIN & "." & @SEC & " " & $data, "logs/LOGFILE.txt") EndFunc ;==>_Log Func LogData($text, $File = "logs/LOGFILE.txt") Global $LogFile = "" If $LogFile = "" Then $LogFile = FileOpen($File, 9) OnAutoItExitRegister(CloseLog) EndIf FileWriteLine($LogFile, $text) EndFunc ;==>LogData Func CloseLog() If $LogFile <> "" Then _Log("Closing LoD script" & @CRLF) FileClose($LogFile) EndIf EndFunc ;==>CloseLog  
      I've tried:
      $sPolygon &= $aPoints[0] & $aPoints[1] ;---- $sPolygon = $sPolygon & $aPoints[0] & $aPoints[1] ;---- $sPolygon = $sPolygon & String($aPoints[0] & $aPoints[1]) ;---- $sPolygon = String($sPolygon) & String($aPoints[0]) & String($aPoints[1]) ;---- $sPolygon &= $aPoints[1] $sPolygon &= $aPoints[0] ;----  
      I'm sure it's something basic that I'm overlooking, but I don't understand why it's not combining the strings. 
      Also, unrelated, why doesn't $LogFile = FileOpen($File, 9) create the directory/ file if they don't exist? 9 should be $FO_CREATEPATH (8) + $FO_APPEND (1)?
      Thanks!
×