
Breaking a text file into sections


fu2m8
Hey guys,

Here at work I've got a text (LDIF) file that I would like to extract data from, but I'm not sure how to do it properly.

Basically, the LDIF file contains a whole bunch of LDAP information relating to certain workstations; however, the only parts I'm concerned with at the moment are the lines that begin with groupMembership: and the data those lines contain.

Currently, if I export a single workstation's LDAP information, I can use StringInStr to return the relevant lines and dump them into Excel. What I would like to do is export a whole container of workstations (which I can already do, but everything goes into one file) and somehow pull each workstation's groupMembership: lines into its own array/section, which I could then populate into Excel (either in its own workbook per workstation or in some other layout I choose).

The basic structure of the LDIF file is as follows (a whole chunk of irrelevant information has been taken out for simplicity's sake, but this is the structure of the text file):

#-------------------------------------------------------------------------------
# This file has been generated on 11.03.2006 at 17:02 from test-server:839
# by Softerra LDAP Browser 2.6 (http://www.ldapbrowser.com)
#-------------------------------------------------------------------------------
version: 1
dn: cn=00003,ou=Workstations,ou=ZEN,ou=test,o=test
groupMembership: cn=Base Applications,ou=Workstations,ou=ZEN,ou=test,o=test
groupMembership: cn=A0738 Trim,ou=Applications,ou=Services,ou=test,o=test
groupMembership: cn=A0886 Dial Connect,ou=Applications,ou=ZEN,ou=test,o=test
groupMembership: cn=A1021 Lawpoint,ou=Applications,ou=ZEN,ou=test,o=test
groupMembership: cn=A0533 Corporate Archive Viewer,ou=Applications,ou=ZEN,ou=test,o=test

dn: cn=00004,ou=Workstations,ou=ZEN,ou=test,o=test
groupMembership: cn=Base Applications,ou=Workstations,ou=ZEN,ou=test,o=test
groupMembership: cn=A1021 Lawpoint,ou=Applications,ou=ZEN,ou=test,o=test
groupMembership: cn=A0886 Dial Connect,ou=Applications,ou=ZEN,ou=test,o=test
groupMembership: cn=A0533 Corporate Archive Viewer,ou=Applications,ou=ZEN,ou=test,o=test

dn: cn=00005,ou=Workstations,ou=ZEN,ou=test,o=test
groupMembership: cn=Base Applications,ou=Workstations,ou=ZEN,ou=test,o=test
groupMembership: cn=A0115 CMS Author,ou=Applications,ou=ZEN,ou=test,o=test
groupMembership: cn=A0886 Dial Connect,ou=Applications,ou=ZEN,ou=test,o=test
groupMembership: cn=A0025 MS Project 2000,ou=Applications,ou=ZEN,ou=test,o=test
groupMembership: cn=A0020 MS Access 2000,ou=Applications,ou=ZEN,ou=test,o=test
groupMembership: cn=A0056 ACL,ou=Applications,ou=ZEN,ou=test,o=test
groupMembership: cn=A0533 Corporate Archive Viewer,ou=Applications,ou=ZEN,ou=test,o=test

The dn: lines (highlighted in red in the original post) identify the workstation, so there are 3 workstations in this example. The groupMembership: lines directly below each one list the applications associated with that workstation (delivered through Novell ZENworks). Each workstation's block seems to be separated from the next by a blank line, i.e. an extra carriage return (that's what it's called, right?! :whistle: you know, when you push Enter... ;) )

To recap, I would like to know how I could break the 3 workstations (with their relevant data) into their own arrays/sections that I could then use in some other way (for this job I would ideally like the information populated into Excel).

Here's the basic (uncommented) code I have for importing a single workstation's groupMembership: lines into an Excel document:

#include <file.au3>
#include <array.au3>
#include <ExcelCOM_UDF.au3>

Dim $group[1]
Dim $aRecords
Dim $y = 1 ; next Excel row to write

$iBegin = MsgBox(1, "This will run the test code", "This example will open Excel & write some data from the text file to it. Click OK to begin, or CANCEL to exit.")
If $iBegin = 2 Then Exit        ; Close out if the user clicked "Cancel"

; Read the file into an array, one line per element ([0] holds the line count).
; No separate FileOpen is needed; _FileReadToArray opens the file itself.
If Not _FileReadToArray("Workstations.ldif", $aRecords) Then Exit

$oExcel = _ExcelBookNew()

For $x In $aRecords
    ; Keep groupMembership lines, skipping the common "Base Applications" entry
    If StringInStr($x, "groupMembership:") Then
        If Not StringInStr($x, "Base Applications") Then
            _ExcelWriteCellR1C1($oExcel, $y, 2, $x)
            $y = $y + 1
        EndIf
    EndIf
Next

Exit

Thanks for any ideas or insight you guys can provide! :P

Peace

Well, I took a stab at it and tried to use StringRegExp(). It could just as easily be done with FileRead/StringSplit/StringInStr (or _FileReadToArray + StringInStr) and a few loops.

If anyone wants to figure out why I can't get the last group feel free:

CODE
$sString = FileRead(@DesktopDir & '\blah.txt')
$aArray = _StringBetween($sString, 'dn:\s', '\n', -1, 1);Get [0] of the 2 dim array (the headers of the workstations)
If Not IsArray($aArray) And MsgBox(64, 'Info', 'No WorkStations Found') Then Exit
Dim $aWKStationData[UBound($aArray)][2]
For $iCC = 0 To UBound($aArray) - 1
    $aWKStationData[$iCC][0] = 'dn: ' & $aArray[$iCC]
    $aData = _StringBetween($sString, 'groupMembership:\s', '\r\n\r\n|$', -1, 1) ;supposed to get the remaining groupmemberships under the headers...
    If IsArray($aData) Then $aWKStationData[$iCC][1] = 'groupMembership: ' & $aData[$iCC]
Next
    
;~ ================== _StringBetween is In the newest 3.2.1.12 beta And _ArrayDisplay2D is my own    =====================
_ArrayDisplay2D($aWKStationData, 'Array Display 2Dim', 0)   
Func _ArrayDisplay2D($aArray, $sTitle = 'Array Display 2Dim', $iBase = 1, $sToConsole = 0)
    If Not IsArray($aArray) Then Return SetError(1, 0, 0)
    Local $sHold = 'Dimension 1 Has:  ' & UBound($aArray, 1) -1 & ' Element(s)' & @LF & _
            'Dimension 2 Has:  ' & UBound($aArray, 2) - 1 & ' Element(s)' & @LF & @LF
    For $iCC = $iBase To UBound($aArray, 1) - 1
        For $xCC = 0 To UBound($aArray, 2) - 1
            $sHold &= '[' & $iCC & '][' & $xCC & ']  = ' & $aArray[$iCC][$xCC] & @LF
        Next
    Next
    If $sToConsole Then Return ConsoleWrite(@LF & $sHold)
    Return MsgBox(262144, $sTitle, StringTrimRight($sHold, 1))
EndFunc
Func _StringBetween($sString, $sStart, $sEnd, $vCase = -1, $iSRE = -1)
    If $iSRE = -1 Or $iSRE = Default Then
        If $vCase = -1 Or $vCase = Default Then 
            $vCase = 0
        Else
            $vCase = 1
        EndIf
        Local $sHold = '', $sSnSStart = '', $sSnSEnd = ''
        While StringLen($sString) > 0
            $sSnSStart = StringInStr($sString, $sStart, $vCase)
            If Not $sSnSStart Then ExitLoop
            $sString = StringTrimLeft($sString, ($sSnSStart + StringLen($sStart)) - 1)
            $sSnSEnd = StringInStr($sString, $sEnd, $vCase)
            If Not $sSnSEnd Then ExitLoop
            $sHold &= StringLeft($sString, $sSnSEnd - 1) & Chr(1)
            $sString = StringTrimLeft($sString, $sSnSEnd)
        WEnd
        If Not $sHold Then Return SetError(1, 0, 0)
        $sHold = StringSplit(StringTrimRight($sHold, 1), Chr(1))
        Local $aArray[UBound($sHold) - 1]
        For $iCC = 1 To UBound($sHold) - 1
            $aArray[$iCC - 1] = $sHold[$iCC]
        Next
        Return $aArray
    Else
        If $vCase = Default Or $vCase = -1 Then 
            $vCase = '(?i)'
        Else
            $vCase = ''
        EndIf
        Local $aArray = StringRegExp($sString, '(?s)' & $vCase & $sStart & '(.*?)' & $sEnd, 3)
        If IsArray($aArray) Then Return $aArray
        Return SetError(1, 0, 0)
    EndIf
EndFunc
;~ =====================================================================================================================
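One plausible reason the last group goes missing, offered as a guess rather than a confirmed diagnosis: alternation has the lowest precedence in a regular expression, so in 'groupMembership:\s' & '(.*?)' & '\r\n\r\n|$' the | splits the whole pattern into "everything up to a blank line" or "a bare $". The final block has no blank line after it, so only the bare $ can match there, and it captures nothing. A minimal sketch with the two endings grouped is below. _GroupBlocks is a hypothetical name, and it assumes blocks are separated by exactly @CRLF & @CRLF and that nothing but groupMembership: lines follows the first one in each block.

```autoit
; Hypothetical helper: returns one array element per block of groupMembership lines.
Func _GroupBlocks($sData)
    ; (?: ... ) keeps the "|" local, so a block may end at a blank line OR at the end of the text.
    Local $sPattern = '(?s)groupMembership:\s(.*?)(?:\r\n\r\n|$)'
    Local $aBlocks = StringRegExp($sData, $sPattern, 3)
    If @error Then Return SetError(1, 0, 0) ; no match or bad pattern
    Return $aBlocks
EndFunc
```

Each element still contains the embedded "groupMembership: " prefixes of the second and later lines, so it would need a StringSplit on @CRLF before going anywhere near Excel. Indexing the result by workstation only lines up if every dn: entry has at least one groupMembership: line.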
Edited by SmOke_N

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.


If I understand the task at hand this should do it?

Func testExtractLDIFRecords()
   Local $data, $regexp, $arr, $i
#region - data   
   $data &= '#-------------------------------------------------------------------------------' & @CRLF
   $data &= '# This file has been generated on 11.03.2006 at 17:02 from test-server:839' & @CRLF
   $data &= '# by Softerra LDAP Browser 2.6 (http://www.ldapbrowser.com)' & @CRLF
   $data &= '#-------------------------------------------------------------------------------' & @CRLF
   $data &= 'version: 1' & @CRLF
   $data &= 'dn: cn=00003,ou=Workstations,ou=ZEN,ou=test,o=test' & @CRLF
   $data &= 'groupMembership: cn=Base Applications,ou=Workstations,ou=ZEN,ou=test,o=test' & @CRLF
   $data &= 'groupMembership: cn=A0738 Trim,ou=Applications,ou=Services,ou=test,o=test' & @CRLF
   $data &= 'groupMembership: cn=A0886 Dial Connect,ou=Applications,ou=ZEN,ou=test,o=test' & @CRLF
   $data &= 'groupMembership: cn=A1021 Lawpoint,ou=Applications,ou=ZEN,ou=test,o=test' & @CRLF
   $data &= 'groupMembership: cn=A0533 Corporate Archive Viewer,ou=Applications,ou=ZEN,ou=test,o=test' & @CRLF
   $data &= '' & @CRLF
   $data &= 'dn: cn=00004,ou=Workstations,ou=ZEN,ou=test,o=test' & @CRLF
   $data &= 'groupMembership: cn=Base Applications,ou=Workstations,ou=ZEN,ou=test,o=test' & @CRLF
   $data &= 'groupMembership: cn=A1021 Lawpoint,ou=Applications,ou=ZEN,ou=test,o=test' & @CRLF
   $data &= 'groupMembership: cn=A0886 Dial Connect,ou=Applications,ou=ZEN,ou=test,o=test' & @CRLF
   $data &= 'groupMembership: cn=A0533 Corporate Archive Viewer,ou=Applications,ou=ZEN,ou=test,o=test' & @CRLF
   $data &= '' & @CRLF
   $data &= 'dn: cn=00005,ou=Workstations,ou=ZEN,ou=test,o=test' & @CRLF
   $data &= 'groupMembership: cn=Base Applications,ou=Workstations,ou=ZEN,ou=test,o=test' & @CRLF
   $data &= 'groupMembership: cn=A0115 CMS Author,ou=Applications,ou=ZEN,ou=test,o=test' & @CRLF
   $data &= 'groupMembership: cn=A0886 Dial Connect,ou=Applications,ou=ZEN,ou=test,o=test' & @CRLF
   $data &= 'groupMembership: cn=A0025 MS Project 2000,ou=Applications,ou=ZEN,ou=test,o=test' & @CRLF
   $data &= 'groupMembership: cn=A0020 MS Access 2000,ou=Applications,ou=ZEN,ou=test,o=test' & @CRLF
   $data &= 'groupMembership: cn=A0056 ACL,ou=Applications,ou=ZEN,ou=test,o=test' & @CRLF
   $data &= 'groupMembership: cn=A0533 Corporate Archive Viewer,ou=Applications,ou=ZEN,ou=test,o=test'
#endregion   
   $regexp = '(?m)(?s)(dn:.*(\s{4}|$))'
   $arr = StringRegExp($data, $regexp, 3)
   If @error <> 0 Then ConsoleWrite("@error:=" & @error & ", @extended:=" & @extended & @LF)
   If Not IsArray($arr) Then ConsoleWrite("NOT an array" & @LF)
   For $i = 0 to UBound($arr) -1
      ConsoleWrite($arr[$i] & @LF)
   Next 
EndFunc

thx Smoke_N and Uten for the quick replies ;)

gee this StringRegExp stuff could do your head in couldn't it lol :whistle:

I made only one slight change to your code, Uten (shown below); however, it returns every entry for everything in the Workstations.ldif file, not just the groupMembership: lines.

$data = FileRead("Workstations.ldif")

A full entry for 1 Workstation in the LDIF file looks something like this (these weren't shown in the original post):

CODE

dn: cn=00003,ou=Workstations,ou=ZEN,ou=test,o=test
zenwmDisableUserHistory: FALSE
zenimgCompression: 1
zenimgImageFlags: 0
zenzfdVersion: <?xml version="1.0" encoding="UTF-8"?><AgentData><Version>7.0.1.0</Version><VerWriteTime>1162507394</VerWriteTime></AgentData>
zenwmSubnetMask: 255.255.255.0
zenwmMACAddress: 00:15:C5:43:E4:42
zenwmID: a6507acdc884b8d66b144d398db7fd30
wMLastRegisteredTime: 20061102224314Z
wMNAMEComputer: 00003
wMNAMECPU: PENTIUM III
wMNAMEDNS: 00003.commerce.nsw.gov.au
wMNAMEOS: WINXP (5.1 Service Pack 2)
wMNAMEServer: PWS-IS2-NSRV
wMNAMEUser: cn=testuser,ou=AUDIT,ou=ESD,ou=test,o=test
wMNetworkAddress: 192.168.115.16
wMUserHistory: cn=ZEN7Adm1,ou=Department,ou=ZEN7Test,ou=test,o=test
wMUserHistory: cn=testuser1,ou=TIM,ou=ESD,ou=test,o=test
wMUserHistory: cn=AuditG,ou=Department,ou=ZEN7Test,ou=test,o=test
wMUserHistory: cn=testuser,ou=AUDIT,ou=ESD,ou=test,o=test
groupMembership: cn=Base Applications,ou=Workstations,ou=ZEN,ou=test,o=test
groupMembership: cn=A0738 Trim,ou=Applications,ou=Services,ou=test,o=test
groupMembership: cn=A0886 Dial Connect,ou=Applications,ou=ZEN,ou=test,o=test
groupMembership: cn=A1021 Lawpoint,ou=Applications,ou=ZEN,ou=test,o=test
groupMembership: cn=A0533 Corporate Archive Viewer,ou=Applications,ou=ZEN,ou=test,o=test
securityEquals: cn=Base Applications,ou=Workstations,ou=ZEN,ou=test,o=test
securityEquals: cn=A0738 Trim,ou=Applications,ou=Services,ou=test,o=test
securityEquals: cn=A0886 Dial Connect,ou=Applications,ou=ZEN,ou=test,o=test
securityEquals: cn=A1021 Lawpoint,ou=Applications,ou=ZEN,ou=test,o=test
securityEquals: cn=A0533 Corporate Archive Viewer,ou=Applications,ou=ZEN,ou=test,o=test
objectClass: Workstation
objectClass: computer
objectClass: device
objectClass: top
cn: 00003
ACL: 16#subtree#cn=ZEN7 Server Package:General:Workstation Import,ou=Policies,ou=ZEN,ou=test,o=test#[Entry Rights]
ACL: 1#subtree#cn=00003,ou=Workstations,ou=ZEN,ou=test,o=test#[Entry Rights]
ACL: 15#subtree#cn=00003,ou=Workstations,ou=ZEN,ou=test,o=test#[All Attributes Rights]
ACL: 3#entry#[Public]#wMNetworkAddress
ACL: 3#entry#[Public]#zenwmMACAddress
ACL: 3#entry#[Public]#zenwmSubnetMask
ACL: 3#entry#[Public]#groupMembership

and there potentially could be hundreds of these entries in the file.

I'm assuming it's the regular expression that is returning all the entries when I run the test function, so I'll have a go at working it out, but if it's glaringly obvious to any of you pros what to change, please feel free to pass it on :P .

I was also thinking I could do what Smoke_N mentioned and create a couple of loops that check each line: if a dn: line is found (a dn: line means a new workstation), add every groupMembership: line into an array until the next dn: line is found, then start a new array for the new workstation.
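A minimal sketch of that loop idea, under a few assumptions: _GroupByWorkstation is a hypothetical helper name, the memberships for each workstation are joined with "|" (assumed not to occur in the values), and the file has no folded LDIF continuation lines. It matches on the start of the line rather than anywhere in it, so lines such as ACL: 3#entry#[Public]#groupMembership are never picked up.

```autoit
#include <File.au3>

; Hypothetical helper: returns a 2D array, [n][0] = dn line, [n][1] = memberships joined with "|".
Func _GroupByWorkstation($sFile)
    Local $aLines, $sLine
    If Not _FileReadToArray($sFile, $aLines) Then Return SetError(1, 0, 0)

    Local $aResult[$aLines[0]][2] ; upper bound on the number of records; trimmed below
    Local $iCount = 0

    For $i = 1 To $aLines[0]
        $sLine = StringStripWS($aLines[$i], 3) ; trim leading/trailing whitespace
        If StringLeft($sLine, 3) = "dn:" Then
            ; A dn: line starts a new workstation record
            $iCount += 1
            $aResult[$iCount - 1][0] = $sLine
            $aResult[$iCount - 1][1] = ""
        ElseIf $iCount > 0 And StringLeft($sLine, 16) = "groupMembership:" Then
            ; Append the value (text after the 16-character prefix) to the current record
            If $aResult[$iCount - 1][1] <> "" Then $aResult[$iCount - 1][1] &= "|"
            $aResult[$iCount - 1][1] &= StringStripWS(StringTrimLeft($sLine, 16), 3)
        EndIf
    Next

    If $iCount = 0 Then Return SetError(2, 0, 0) ; no dn: lines found
    ReDim $aResult[$iCount][2]
    Return $aResult
EndFunc
```

Calling _GroupByWorkstation("Workstations.ldif") and then running StringSplit($aResult[$n][1], "|") on each row would give one array of groups per workstation, which could be written to Excel row by row or sheet by sheet.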

Thx again for the help (and for any in the future) ;) .


Yeah, the regexp stuff can be a real pain :whistle: Especially if it has been a while since you last did it. I think a well-crafted regexp will be faster in this case, but you might feel more comfortable with a loop.


Oh yeah, I forgot...

To filter out the lines you don't want, you either have to play with non-capturing groups, or use a regexp that captures only the lines starting with dn: or groupMembership: and do a manual filter in the array loop.

Something like

$regexp = '(?m)(dn:.*)$|(groupMembership:.*)$'
And then filter in a loop?
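A rough sketch of that "capture, then filter in a loop" idea. It is a variation on the pattern above, not the exact expression: the alternation is anchored to the start of the line and split into two captures (attribute name, value), so every match contributes exactly two array elements and the loop can step through them in pairs. _ExtractWithRegExp is a hypothetical name, and the tab and "|" separators in the output are assumptions chosen for illustration.

```autoit
; Hypothetical helper: returns one text line per workstation,
; in the form <dn value> & @TAB & <group1>|<group2>|...
Func _ExtractWithRegExp($sData)
    ; Flag 3 returns all matches; each match adds [name, value] to the array.
    Local $aHits = StringRegExp($sData, '(?m)^(dn|groupMembership):[ \t]*([^\r\n]*)', 3)
    If @error Then Return SetError(1, 0, "")

    Local $sOut = "", $sSep = "", $bSeenDn = False
    For $i = 0 To UBound($aHits) - 2 Step 2
        If $aHits[$i] == "dn" Then
            ; New workstation: start a new output line
            If $bSeenDn Then $sOut &= @CRLF
            $sOut &= $aHits[$i + 1]
            $sSep = @TAB ; the first group after a dn follows a tab
            $bSeenDn = True
        ElseIf $bSeenDn Then
            ; groupMembership value belonging to the current workstation
            $sOut &= $sSep & $aHits[$i + 1]
            $sSep = "|"
        EndIf
    Next
    Return $sOut
EndFunc
```

For example, ConsoleWrite(_ExtractWithRegExp(FileRead("Workstations.ldif")) & @CRLF) would print one line per workstation. The "Base Applications" entry could be skipped inside the ElseIf branch with a StringInStr check, as in the original script.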
