gritts

Using Regex in _ArraySearch to find CRLF

17 posts in this topic

I am working on a script that parses a log file. At the end of each log file entry (several lines) there is a blank line (CRLF). I want to use _ArraySearch with its regex search option to determine the line on which the entry I am currently parsing ends. The closest I have come is below, but it only matches the row that my loop through the file array is currently on.

The log file is read into an array. Once I match the first criterion, I search for the end of the related log entries (the CRLF). Once I have that information I will do further parsing. I'm just stuck getting the results I need.

For $aRow = 1 to $aDumpFile[0]
    ;ConsoleWrite($aDumpFile[$aRow]&@CRLF)
    If StringInStr($aDumpFile[$aRow],"AB000DD4") Then
        $iEntryEnd = _ArraySearch($aDumpFile,"(*CRLF)",$aRow,0,0,3)
        ConsoleWrite("Entry End: "&$iEntryEnd&@CRLF)
        For $aSubRow = $aRow to $iEntryEnd
            ConsoleWrite("Sub entry: "&$aDumpFile[$aSubRow]&@CRLF)
        Next
    EndIf

Next




How are you getting the log file into the array? Is the blank line in your array? Is there a CRLF in that spot in the array or is it just an empty entry? If it's just an empty entry look for that, if it's a CRLF look for that, because the other entries probably won't contain a CRLF anywhere in them.


If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.
Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag Gude
How to ask questions the smart way!

I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from.

Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays.  -  ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script.  -  Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label.  -  _FileGetProperty - Retrieve the properties of a file  -  SciTE Toolbar - A toolbar demo for use with the SciTE editor  -  GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI.  -   Latin Square password generator


So you know \R matches an EOL sequence.
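For reference, in AutoIt's PCRE-based regular expressions the escape \R matches a line break (@CRLF, @CR or @LF) as a single unit. A minimal sketch, using a made-up sample string rather than anything from the log in question:

```autoit
; Hypothetical sample text mixing the three common line endings.
Local $sText = "one" & @CRLF & "two" & @LF & "three" & @CR & "four"

; Replace every line break, whatever its form, with a visible marker.
Local $sMarked = StringRegExpReplace($sText, "\R", "|")
ConsoleWrite($sMarked & @CRLF)
```

Note that \R only helps when the string being searched still contains its line breaks, which may not be the case for elements of an array built from a file.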



I am not sure if the CRLF is being read in as part of the array, to be honest. I had thought of searching for @CRLF alone but didn't come up with a method using _ArraySearch. (There probably is a way, but when you work on 6 things at once  :* )

The log file I am parsing is read into the array with the following:

_FileReadToArray(@ScriptDir&"\20140909.dmp",$aDumpFile)

Don't let the extension distract you, it is just a text file with a different extension.

I've played with this some more using what is in the help files and I think my "trying to overthink REGEX" is shooting me in the foot here. 


#5 ·  Posted (edited)

If there's a blank line in the log, it will more than likely show up as an empty string in the array. Do a search for any empty elements in your array by looping through the array. Something like this.

#include <MsgBoxConstants.au3>
#include <array.au3>
Example()

Func Example()
    ; Read the file into an array using the filepath.
    Local $aArray = FileReadToArray("C:\Windows\WindowsUpdate.log")
    If @error Then
        MsgBox($MB_SYSTEMMODAL, "", "There was an error reading the file. @error: " & @error) ; An error occurred reading the log file.
        Return
    Else
        _ArrayDisplay($aArray)
    EndIf
    For $I = 0 To UBound($aArray) - 1
        ; Loop through the array and look for empty strings in an array element
        If $aArray[$I] = "" Then ConsoleWrite("Empty line found in $aArray[" & $I & "]" & @CRLF)
    Next
EndFunc   ;==>Example
Edited by BrewManNH


gritts,

Can you post a representative example of the file?


Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill


#7 ·  Posted (edited)

Maybe isolate the wanted part before reading it into an array?

#Include <Array.au3>

$txt = "AutoIt has been designed to be as small as possible and stand-alone with no external .dll files " & @crlf & _ 
"or registry entries required making it safe to use on Servers."  & @crlf & _ 
"Scripts can be compiled into stand-alone executables with Aut2Exe." & @crlf & _ 
@crlf & _ 
"Also supplied is a AB000DD4 combined COM and DLL version of AutoIt called AutoItX." & @crlf & _ 
"that allows you to add the unique features of AutoIt" & @crlf & _ 
"to your own favorite scripting or programming languages!" & @crlf & _ 
 @crlf & _ 
"Best of all, AutoIt continues to be free - but if you want to support the time, " & @crlf & _ 
"money and effort spent on the project and web hosting then you may donate at the AutoIt homepage." & @crlf 

Msgbox(0,"", $txt)   ; initial text
 
$res = StringRegExpReplace($txt, '(?s).*\R{2}(.*?AB000DD4.*?)\R{2}.*', "$1")  ; isolate the part
$lines = StringRegExp($res, '(?m)^(\N*)\R?', 3)   ; read the part to array 
_ArrayDisplay($lines)
Edited by mikell


Thank you guys for your input. @Univarsilist your code pretty much is what I do to locate the section of the log I wish to parse.

For example:

Here is some test text from process AKB48 that completed a 4:00pm
Stat 1: 334
Stat 2: 4456
Stat 3: 5543
<crlf>
Here is some test text from process BR549 that completed a 4:00pm
Stat 1: 352
Stat 2: 44446
Stat 3: 5522
<crlf>
 
Once I have isolated the section, I plan to parse through it using the starting and ending row numbers I determined. Then possibly rinse and repeat with a new value.
 
I did manage to locate the end of a section with the following:
$iEntryEnd = _ArraySearch($aDumpFile,"",$aRow)

For learning's sake I'd like to figure out how to do the same with REGEX.
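A minimal sketch of one possible regex equivalent, assuming blank lines end up in the array as empty (or whitespace-only) elements. The sample array below is made up for illustration; only _ArraySearch, its $iCompare = 3 regex mode, and the "AB000DD4" marker come from the thread. A likely reason the earlier "(*CRLF)" attempt returned the current row is that (*CRLF) only sets PCRE's newline convention, leaving an empty pattern that matches every element, including the one the search starts on.

```autoit
#include <Array.au3>

; Hypothetical stand-in for the log array; element 0 holds the line count,
; as it does after _FileReadToArray.
Local $aDumpFile[8] = [7, "Process AKB48 AB000DD4", "Stat 1: 334", "Stat 2: 4456", "", "Process BR549", "Stat 1: 352", ""]
Local $iEntryEnd

For $iRow = 1 To $aDumpFile[0]
    If StringInStr($aDumpFile[$iRow], "AB000DD4") Then
        ; $iCompare = 3 treats the search value as a regular expression.
        ; "^\s*$" matches an element that is empty or holds only whitespace.
        $iEntryEnd = _ArraySearch($aDumpFile, "^\s*$", $iRow, 0, 0, 3)
        If @error Then ExitLoop ; no blank element found after this row
        ConsoleWrite("Entry End: " & $iEntryEnd & @CRLF)
    EndIf
Next
```

Compared with searching for "", the pattern also catches separator lines that contain stray spaces or tabs.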


Geez, I didn't realize mikell gave it earlier...



gritts,

Not an SRE solution but may be useful to you...

#include <array.au3>

local $str  =   'Stat 1: 334' & @crlf & 'Stat 2: 4456' & @crlf & 'Stat 3: 5543' & @crlf & _
                @crlf & 'Stat 1: 352' & @crlf & 'Stat 2: 44446' & @crlf & 'Stat 3: 5522' & @crlf & @crlf & 'blah blah blah'

local $aRSLT = stringsplit($str,@crlf & @crlf,3)

for $1 = 0 to ubound($aRSLT) - 1
    ConsoleWrite('!  Set # ' & $1+1 & @CRLF & $aRSLT[$1] & @lf)
next

kylomas



I figured there was already a post on this, so I did not start another one. I have searched for "" and " ", and both are found, but there are other elements that contain an LF or many spaces. How do I go about finding them without a loop that adds one more space and then searches again? I want to find every array element that is empty (contains no data).


All by me:

"Sometimes you have to go back to where you started, to get to where you want to go." 

"Everybody catches up with everyone, eventually" 

"As you teach others, you are really teaching yourself."

From my dad

"Do not worry about yesterday, as the only thing that you can control is tomorrow."

 


What you want to do is not clear.
Could you provide an example?


#13 ·  Posted (edited)

I have an array from a test file that was created by copying the text of an alert and writing it to the file. There are multiple blank lines in my array that I want to remove. I have removed a few of them that contain either nothing or a single space, but in my tests there are other lines that my search does not find. I am trying to remove those lines and merge the rest together with the first original line of the alert. These alerts are copied from different browsers, so each one has its own format: the same alert arrives in a different format from each browser. Everything is ASCII, but some lines have multiple spaces, or an EOL, or something else that I am not able to find.

 

I have tried this example, but it does not work.

I use this to convert EOL characters to CRLF.

#include <Constants.au3>

; Convert all line endings to @CRLF.
Local $sString = StringEOLToCRLF("This is a sentence " & @CR & Chr(12) & "with " & Chr(13) & "whitespace." & @CRLF)

; Display the converted line endings.
MsgBox($MB_SYSTEMMODAL, "", $sString)

; Visually display that @CR and @LF have converted to @CRLF.
$sString = StringReplace($sString, @CRLF, '@CRLF')
MsgBox($MB_SYSTEMMODAL, "", $sString)

Func StringEOLToCRLF($sString) ; Regular expression by Melba23 and modified By guinness.
    Return StringRegExpReplace($sString, '((?<!\r)\n|\r(?!\n))', @CRLF)
    ; Return StringRegExpReplace($sString, '\R', @CRLF) ; By Ascend4nt
EndFunc   ;==>StringEOLToCRLF

 

 

My sample text is as follows. The last two are the exact same alert, yet I did change the info, as it is proprietary.


11/11/15 09:26:20 AM

 

 

 

 

 

System Up Time : Notification Text (System Up Time : IP Address (X.X.X.X) : System ID (someID) : UpTime ( 6 Days, 13 Hours, 35 Minutes, 53

Seconds))

 

11/11/15 10:45:48 AM
                        
PRISTAT Alarm - X : Alarm Text (Error - X,X : System ID (someID) : System IP Address (X.X.X.X) : Alarm Text (Part

status report Event Source: TYPE Status Down Filename: SomeFileName Line Number 1234) : Log Date and Time (Nov 11 11:53:19) : Code

X,X (X)) : Code X(X)

 

11/11/15 10:45:48 AM
PRISTAT Alarm - X : Alarm Text (Error - X,X : System ID (SomeID) : System IP Address (X.X.X.X) : Alarm Text (Part

status report
 Event Source: TYPE
 TYPE
 Status Down
 Filename: SomeFileName
 Line Number 1234
 
 ) : Log Date and Time (Nov 11 11:53:19) :

Code X,X (X)) : Code X (X)

.

Edited by nitekram


#14 ·  Posted (edited)

This is the closest I have come:

 

If $string = '' Or $string = ' ' Then ContinueLoop
If (StringRegExp($string, "\S", 1)) Then
; Grab the line unless it is only whitespaces.
    $sNewBody &= $string
EndIf

EDIT

But this still is leaving me with blank lines - I am at a loss.

EDIT 2

Yeah, I had other code mixed in, so this did not work as I thought.
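A minimal sketch of the whitespace filter on its own, assuming the goal is simply to drop every element that holds no visible characters. The sample lines are made up, and the sketch uses StringRegExp's default mode 0, which returns 1 or 0 rather than an array, so the result can be tested directly in an If statement.

```autoit
; Hypothetical sample lines: real text, an empty string, spaces only, and a lone @LF.
Local $aLines[5] = ["11/11/15 09:26:20 AM", "", "     ", @LF, "System Up Time : Notification Text"]
Local $sNewBody = ""

For $sLine In $aLines
    ; "\S" matches any single non-whitespace character, so elements made up
    ; only of spaces, tabs, @CR or @LF never pass this test.
    If StringRegExp($sLine, "\S") Then $sNewBody &= $sLine & @CRLF
Next

ConsoleWrite($sNewBody)
```

Because "\S" already rejects empty and space-only strings, the separate comparisons against '' and ' ' are not needed in this form.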

Edited by nitekram


#16 ·  Posted (edited)

That worked great...I wish I could get a handle on RegExp.

Thanks for your help and time.

 

EDIT - spelling

.

 

Edited by nitekram


#17 ·  Posted (edited)

$sText = "Entry 1 Generic log entry" & @CRLF & "Entry 2, " & @LF & "multline Security log entry" & @CRLF & "Entry 3, " & @LF & "multiline System log entry" & @CRLF &  "Entry 4, Application log entry" & @CRLF & "Line 5, telemetry log entry" & @CRLF & "Line 6 log entry"
$aText = stringsplit($sText , @CRLF , 3)


;~ $target = "Security"
$target = "System"
;~ $target = "Application"


For $entry in $aText
    If stringinstr($entry , $target) Then msgbox(0, '' , $entry)
Next

 

 

This applies if the OP legitimately needs to split at the next @CRLF after matching another criterion (a string, I am assuming) and there are no other edge cases.

 

Edited by boththose
