Reg Ex help need (Solved)


Hello,

OK, let's get this out there: I really struggle with regular expressions, so I would like a little help with this issue. I have a text file that I need to extract records from. Each record that I need to extract begins with:

MSH|^~\&|Flexilab|CART

The end of the record, the last line I need (the whole line), begins with:

FT1|

How would you write the regex to extract this data? The beginning and ending strings have special characters. I am pretty sure I can figure out how to loop through the file and get the data needed; I just cannot figure out the regex to select it. Thank you.


This should be pretty easy, but even with your description of the data it would be much better if you actually provided a sample (of course change the parts that are sensitive, but leave all the special characters and such that would be needed for the regex).

Especially since, based on the description, it looks like there could be some simpler patterns to run.


This is a lot like the files I get from the state to validate. Depending on the size of the files you are processing, you may not want to use RegEx unless you have to, as it's one of the slowest ways to find data. (I process millions of rows like these to validate data.)

I'm assuming you're doing a FileReadToArray and then looping through the lines.

Just do an If StringLeft($aLines[$i], 4) = "FT1|" Then

$aSegments = StringSplit($aLines[$i], "|")

;Since your lines end with the separator, the Last value in that line will be in $aSegments[$aSegments[0] - 1]
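
For what it's worth, the loop described above might look roughly like the sketch below. It assumes one segment per line, reuses the file name ExampleData.txt from elsewhere in this thread, and treats the first FT1| line after a matching MSH line as the end of the record; the variable names are made up for illustration.

```autoit
#include <Array.au3>

; Sketch only: file name and record layout (one segment per line) are assumptions.
Local $aLines = FileReadToArray("ExampleData.txt")
If @error Then Exit MsgBox(16, "Error", "Could not read the file.")

Local $sStart = "MSH|^~\&|Flexilab|CART" ; plain string compare, so nothing needs escaping
Local $sRecord = "", $bInRecord = False
Local $aRecords[0]

For $i = 0 To UBound($aLines) - 1
    ; A line with the wanted prefix opens a new record (== is case-sensitive)
    If StringLeft($aLines[$i], StringLen($sStart)) == $sStart Then
        $bInRecord = True
        $sRecord = ""
    EndIf

    If $bInRecord Then
        $sRecord &= $aLines[$i] & @CRLF
        ; The first FT1| line after the opening MSH line closes the record
        If StringLeft($aLines[$i], 4) == "FT1|" Then
            ; Grow the array by hand; _ArrayAdd would split the record on "|" by default
            ReDim $aRecords[UBound($aRecords) + 1]
            $aRecords[UBound($aRecords) - 1] = $sRecord
            $bInRecord = False
        EndIf
    EndIf
Next

_ArrayDisplay($aRecords)
```

MSH lines with a different prefix never open a record, so their segments are skipped. If a record can contain several FT1 lines, the closing condition would need adjusting.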


Agree with @BigDaddyO.  It is very easy to extract since the part you are searching for is at the beginning of the line.  Also, I would test performance between FileReadToArray and a loop with FileReadLine.  If there are millions of those lines, maybe processing directly from the file would be better...
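
A rough way to run that comparison, as a sketch: the file name is the one used elsewhere in this thread, and counting FT1| lines is only a stand-in for whatever real work is done per line.

```autoit
Local $sFile = "ExampleData.txt" ; assumed file name

; Variant 1: load the whole file into an array, then scan it
Local $hTimer = TimerInit()
Local $aLines = FileReadToArray($sFile)
Local $iHitsArray = 0
For $i = 0 To UBound($aLines) - 1
    If StringLeft($aLines[$i], 4) = "FT1|" Then $iHitsArray += 1
Next
Local $fArrayMs = TimerDiff($hTimer)

; Variant 2: read one line at a time straight from the file
$hTimer = TimerInit()
Local $hFile = FileOpen($sFile, 0) ; 0 = open for reading
Local $iHitsLine = 0, $sLine
While 1
    $sLine = FileReadLine($hFile)
    If @error Then ExitLoop ; end of file or read error
    If StringLeft($sLine, 4) = "FT1|" Then $iHitsLine += 1
WEnd
FileClose($hFile)
Local $fLineMs = TimerDiff($hTimer)

ConsoleWrite("FileReadToArray: " & $iHitsArray & " hits in " & Round($fArrayMs) & " ms" & @CRLF)
ConsoleWrite("FileReadLine:    " & $iHitsLine & " hits in " & Round($fLineMs) & " ms" & @CRLF)
```

The second run may benefit from the file already being in the OS cache, so it is worth swapping the order and repeating before drawing conclusions.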


Here are a couple of very basic regular expressions that will extract your records. Of course, there are numerous ways to do it with regular expressions. This is just one.

#include <Constants.au3>
#include <Array.au3>

example()

Func example()
    Local $sData
    Local $aResult

    $sData = FileRead("ExampleData.txt")

    ;Get MSH records
    $aResult = StringRegExp($sData, "(?m)(^MSH\|\^~\\&\|Flexilab\|CART.*)", $STR_REGEXPARRAYGLOBALMATCH)
    _ArrayDisplay($aResult)

    ;Get the first FT1 line ($STR_REGEXPARRAYMATCH stops after one match, and .* does not cross line breaks without (?s))
    $aResult = StringRegExp($sData, "(?m).*(^FT1.*)", $STR_REGEXPARRAYMATCH)
    _ArrayDisplay($aResult)
EndFunc

 

Edited by TheXman
Removed unused variable ($sExtractedRecords)

 

#Include <Array.au3>

$txt = FileRead("ExampleData.txt")

$res = StringRegExp($txt, '(?ms)^MSH.*?FT1\|\N+', 3)
 _ArrayDisplay($res)

For $i = 0 to UBound($res)-1
   $r = StringSplit($res[$i], @crlf, 3)
   _ArrayDisplay($r, $i)
Next

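Comments:
(?m)     multiline mode: ^ matches at the start of each line
(?s)     single-line mode: . also matches newline characters, so .*? can span several lines
.*?      lazy match: stops at the first FT1| that follows
\|       pipe char, escaped
\N+      one or more non-newline characters (the rest of the FT1 line)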

Edited by mikell

Mikell

Thank you! That is getting the last line of the record, but I am still having a hard time getting the beginning line.

MSH|^~\&|Flexilab|CART

I tried this:

$Msh = StringRegExp($txt, '(?m)^MSH\|\^~\\\&\|FLexilab\|CART\N+', 3)

But it does not pull the line. I believe it is because I am not escaping the escape characters correctly?

And once I am able to get the start of the record and the end of the record, would I use StringBetween to get the whole record?
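
A quick way to check that attempt, as a sketch: the sample line below is made up from the prefix quoted in this thread, with hypothetical trailing fields. The escaping in the attempt looks fine (\\ matches the backslash, \& matches the ampersand); the likely culprit is the capital L in "FLexilab", since matching is case-sensitive unless (?i) is set.

```autoit
; Hypothetical sample line; only the prefix comes from the thread
Local $sLine = "MSH|^~\&|Flexilab|CART|more|fields"

; 1) The attempt as posted, with "FLexilab" (capital L)
Local $iPosted = StringRegExp($sLine, '(?m)^MSH\|\^~\\\&\|FLexilab\|CART\N+')

; 2) Same pattern with the spelling matching the data
Local $iFixed = StringRegExp($sLine, '(?m)^MSH\|\^~\\\&\|Flexilab\|CART\N+')

; 3) Alternative: \Q...\E treats everything in between as literal text,
;    so the special characters need no individual escaping
Local $iQuoted = StringRegExp($sLine, '(?m)^\QMSH|^~\&|Flexilab|CART\E\N+')

; With the default flag, StringRegExp returns 1 for a match and 0 for no match
ConsoleWrite("posted: " & $iPosted & ", fixed: " & $iFixed & ", quoted: " & $iQuoted & @CRLF)
```

Note that \N+ requires at least one more character after CART on the same line; \N* would also accept a line that ends right there.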

 


Mikell,

That is just about what I need, except that this code

$res = StringRegExp($txt, '(?ms)^MSH.*?FT1\|\N+', 3)

is pulling every line that begins with MSH. I just need the lines that begin with MSH|^~\&|Flexilab|CART. I cannot figure out how to regex this pattern. Again, thank you for your help.


Oh I see. Damn... I misunderstood again  :sweating:

#Include <Array.au3>

$txt = FileRead("ExampleData.txt")

$mark = "^MSH\|\^~\\&\|Flexilab\|CART"

$res = StringRegExp($txt, '(?ms)' & $mark & '.*?FT1\|\N+', 3)
 _ArrayDisplay($res)

For $i = 0 to UBound($res)-1
   $r = StringSplit($res[$i], @crlf, 3)
   _ArrayDisplay($r, $i)
Next

Edit
Too late  :D

Edited by mikell
