Find string in string by RegEx

Deon

I have 100 text files which contain some information which is (just) human-readable.

I want to extract two things from them, one is a string which fits a RegEx:

[a-zA-Z]{2}\d{4}[a-zA-Z]{2}\d{3}

(although technically it will say DescriptionAA1111AA111, and I just want to catch the AA1111AA111 part of it, but I can use a StringRight() function to clean that up)

 

And the second is a string showing the status of a device on our network, something like:

06:52:16 AWST01.24.00OnPlaying streamIdlePresent and mountedInternet: XX A1str.to: 0, buf.emp: 0, str.dsc: 0, vs.eof: 0

which always starts with a timestamp (something like \d{2}:\d{2}:\d{2}, I think)

Basically, I just want to define two variables:

  • The last 11 characters of DescriptionAA1111AA111 (only appears once in the string)
  • The entire line the first time a time like 06:52:16 is found

 

I tried using FileReadLine(), but that doesn't work because the strings aren't always on the same line from file to file. I also tried passing the entire file to StringRegExp(), but I can't find a way to get StringRegExp() to search through the file for a match, rather than trying to match the entire contents against the expression.

 

Any help appreciated!

 

SadBunny

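The regex pattern modifier for multiline is your friend: (?m). It makes ^ and $ match at the start and end of every line, instead of only at the start and end of the whole string.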

#include <Array.au3>
#include <StringConstants.au3> ; defines $STR_REGEXPARRAYGLOBALMATCH

; Sample data standing in for the contents of one file
$fileContent = "This is the first line." & @CRLF
$fileContent = $fileContent & "Something something, blah, DescriptionAA1111AA111, and another thing." & @CRLF
$fileContent = $fileContent & "06:52:16 this line starts with a valid timestamp"  & @CRLF
$fileContent = $fileContent & "24:01:01 this line doesn't start with a valid timestamp"  & @CRLF
$fileContent = $fileContent & "00:00:00 this line also starts with a valid timestamp"  & @CRLF
$fileContent = $fileContent & "23:59:59 just like this one, which also has a description thingy: DescriptionZZ1234zz123"  & @CRLF
$fileContent = $fileContent & "This is a bonus line." & @CRLF
$fileContent = $fileContent & "DescriptionXy1111aB111, aaaaaand it's gone." & @CRLF

; Every line that starts with a valid hh:mm:ss timestamp followed by a space
$aTimestampLines = StringRegExp($fileContent, "(?m)^((?:0\d|1\d|2[0-3]):[0-5]\d:[0-5]\d .*)$", $STR_REGEXPARRAYGLOBALMATCH )
_ArrayDisplay($aTimestampLines)

; Every ID of the form 2 letters, 4 digits, 2 letters, 3 digits
$aDescriptionLines = StringRegExp($fileContent, "(?m)([a-zA-Z]{2}\d{4}[a-zA-Z]{2}\d{3})", $STR_REGEXPARRAYGLOBALMATCH )
_ArrayDisplay($aDescriptionLines)
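
For the original question (one ID and the first timestamp line per file), the same idea can be narrowed down. The following is only a sketch: the helper name _ExtractInfo and the sample text are made up for illustration, and it assumes the ID always directly follows the word "Description". It uses $STR_REGEXPARRAYMATCH so that StringRegExp() stops at the first match instead of collecting all of them.

```autoit
#include <StringConstants.au3>

; Hypothetical helper (not part of AutoIt): returns a 2-element array
; [0] = the 11-character ID after "Description", [1] = first line starting with hh:mm:ss.
; An element stays "" when nothing was found.
Func _ExtractInfo($sText)
    Local $aResult[2] = ["", ""]

    ; Capture only the ID, so no StringRight() cleanup is needed afterwards.
    Local $aId = StringRegExp($sText, "Description([a-zA-Z]{2}\d{4}[a-zA-Z]{2}\d{3})", $STR_REGEXPARRAYMATCH)
    If Not @error Then $aResult[0] = $aId[0]

    ; (?m) lets ^ and $ match at line boundaries; only the first matching line is returned.
    Local $aLine = StringRegExp($sText, "(?m)^(\d{2}:\d{2}:\d{2}.*)$", $STR_REGEXPARRAYMATCH)
    If Not @error Then $aResult[1] = $aLine[0]

    Return $aResult
EndFunc

; Usage with sample text. For a real file, $sText = FileRead($sPath) returns the whole file as one string.
Local $sText = "Some header" & @CRLF & "Name: DescriptionAA1111AA111" & @CRLF & "06:52:16 AWST status text" & @CRLF
Local $aInfo = _ExtractInfo($sText)
ConsoleWrite("ID:   " & $aInfo[0] & @CRLF)
ConsoleWrite("Line: " & $aInfo[1] & @CRLF)
```

Looping over the 100 files would then just mean calling the helper once per file and checking for empty elements.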

 

Roses are FF0000, violets are 0000FF... All my base are belong to you.

Deon

Perfect! Thank you :D

SadBunny

My pleasure :) Regex is fun.


iamtheky

That's too much safeness for my taste :)

 

#include <Array.au3>
#include <StringConstants.au3> ; defines $STR_REGEXPARRAYGLOBALMATCH

; Sample data standing in for the contents of one file
$fileContent = "This is the first line." & @CRLF
$fileContent = $fileContent & "Something something, blah, DescriptionAA1111AA111, and another thing." & @CRLF
$fileContent = $fileContent & "06:52:16 this line starts with a valid timestamp"  & @CRLF
$fileContent = $fileContent & "24:01:01 this line doesn't start with a valid timestamp"  & @CRLF
$fileContent = $fileContent & "00:00:00 this line also starts with a valid timestamp"  & @CRLF
$fileContent = $fileContent & "23:59:59 just like this one, which also has a description thingy: DescriptionZZ1234zz123"  & @CRLF
$fileContent = $fileContent & "This is a bonus line." & @CRLF
$fileContent = $fileContent & "DescriptionXy1111aB111, aaaaaand it's gone." & @CRLF

$aDescriptionLines = StringRegExp($fileContent, "Description(.{11})", $STR_REGEXPARRAYGLOBALMATCH)  ;11 characters after every instance of 'Description'
_ArrayDisplay($aDescriptionLines)


; Whole line around the first nn:nn:nn followed by whitespace (no validity check on the time)
msgbox(0, '' , stringregexp($fileContent, "(.*\d\d\:\d\d\:\d\d\s.*)", $STR_REGEXPARRAYGLOBALMATCH )[0]) ; First match of timestamp

 

Edited by boththose

SadBunny

Why would you want to decrease safeness?


iamtheky

To show simplicity when things are guaranteed, like X characters after a static string. And to show an alternate syntax for when validating the timestamp is unnecessary and/or the data has no risk of containing another colon-separated string of numbers, like the back half of a MAC address.
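
To make that trade-off concrete, here is a small sketch comparing the two timestamp patterns from this thread on text that contains a MAC address. The helper _FirstMatch, the two constant names and the sample text are invented for this example. The loose pattern should pick up the MAC line, because "11:22:33 " looks like nn:nn:nn followed by whitespace, while the anchored and validated pattern should skip it.

```autoit
#include <StringConstants.au3>

; Pattern names are hypothetical; the patterns themselves are the two posted in this thread.
Global Const $PATTERN_LOOSE = "(.*\d\d\:\d\d\:\d\d\s.*)"
Global Const $PATTERN_STRICT = "(?m)^((?:0\d|1\d|2[0-3]):[0-5]\d:[0-5]\d .*)$"

; Hypothetical helper: first captured group of the first match, or "" if there is no match.
Func _FirstMatch($sText, $sPattern)
    Local $aMatch = StringRegExp($sText, $sPattern, $STR_REGEXPARRAYMATCH)
    If @error Then Return ""
    Return $aMatch[0]
EndFunc

; Sample text: a MAC address line comes before the real status line.
Local $sText = "MAC 00:1A:2B:11:22:33 seen" & @CRLF & "06:52:16 device idle" & @CRLF

ConsoleWrite("Loose:  " & _FirstMatch($sText, $PATTERN_LOOSE) & @CRLF)  ; the MAC line
ConsoleWrite("Strict: " & _FirstMatch($sText, $PATTERN_STRICT) & @CRLF) ; the 06:52:16 line
```

If the files never contain anything like that, the shorter pattern is fine; if they might, anchoring with ^ costs very little.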

