
Parsing a .txt file, using regexp to do logic - help!


Aeterna

Thanks for your guys' help in the last post!

Now on to the next step of my project, which is stumping me too.

So I have about 30,000 lines of random B's and P's like the following.

B
P
B
B
B
P
P
B
P

I need to write a script that goes through the file line by line, reads the current line, and sometimes the previous few lines, depending on the logic. The program then needs to use the lines it has read to make a prediction about the next line, based on some logical rules I have.

Sorry for posting what seems like such a complicated problem (at least to me). I'm just so new to AutoIt that I have no idea where to start.

I'm sure that's not enough information for you guys to even begin to help me, so please ask me for anything you need that would help solve this!

Feel free to contact me on IM too!

Thanks so much in advance, you guys are really helpful!

Link to comment
Share on other sites

Use FileReadLine ( filehandle or "filename" [, line] ) and a counter to keep track of the current line...

e.g.:

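; $fileOpened is assumed to be a handle returned by FileOpen()
$lineNumber = 1
While 1
    $lineText = FileReadLine($fileOpened, $lineNumber)
    If @error Then ExitLoop ; -1 = end of file, 1 = read error
    If $lineNumber > 1 Then ; replace with your own conditions
        ; read previous lines, using $lineNumber as a bookmark
        ; do your prediction logic
    EndIf
    $lineNumber += 1
WEnd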

I'm not sure if this helps you, but it's a starting point...

P.S: Sorry for the bad English

Edited by Chapi
Link to comment
Share on other sites

FileReadLine is not necessary. You are constantly opening and closing the file. This does one read, and splits the strings.

$sFilePath = @ScriptDir & "\mybsandps.txt"

$sFileContent = FileRead($sFilePath)
If @error Then
    MsgBox(0, "", "Unable to open file '" & $sFilePath & "'")
    Exit
EndIf

$sLine = StringSplit( StringStripCR($sFileContent), @LF,1 )

For $i = 1 to $sLine[0]
    ;; $sLine[0] holds the line count, so the data starts at $sLine[1]
    ;; To get the previous line, use the variable $sLine[$i-1] (only when $i > 1)
    ;; To get 5 lines back, use the variable $sLine[$i-5] (only when $i > 5)
    Switch $sLine[$i]
        Case "B"
            ;; You are now reading a B
        Case "P"
            ;; You are now reading a P
    EndSwitch
Next
Edited by Manadar
Link to comment
Share on other sites

FileReadLine is not necessary. You are constantly opening and closing the file. This does one read, and splits the strings.

Not if you use the FILEHANDLE of an open file (FileOpen ( "filename", mode )) rather than filename

Link to comment
Share on other sites

Not if you use the FILEHANDLE of an open file (FileOpen ( "filename", mode )) rather than filename

and you think I didn't know that ...

It is still inefficient, because every time you call FileReadLine with a line number it's going to loop through the file to find that line for you. In large files especially, that is a problem.

Link to comment
Share on other sites

Manadar's method is still more efficient from a performance standpoint.

From a performance standpoint it is a bad idea to read line by line while specifying a "line" parameter whose value is incremented by one each time. This forces AutoIt to reread the file from the beginning until it reaches the specified line.
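
If you do want to stay with FileReadLine, here is a minimal sketch of the handle-based variant: open the file once with FileOpen and leave out the "line" parameter, so each call simply returns the next line. The function name _CountLinesSequential is hypothetical and only there for illustration.

```autoit
; Hypothetical helper: walks a file sequentially through one open handle.
; Returns the number of lines read, or -1 (with @error = 1) if the file
; could not be opened.
Func _CountLinesSequential($sFilePath)
    Local $hFile = FileOpen($sFilePath, 0) ; 0 = open for reading
    If $hFile = -1 Then Return SetError(1, 0, -1)
    Local $iCount = 0, $sLineText
    While 1
        $sLineText = FileReadLine($hFile) ; no line number: reads the next line
        If @error Then ExitLoop           ; -1 = end of file, 1 = read error
        $iCount += 1
        ; ... handle $sLineText here ...
    WEnd
    FileClose($hFile)
    Return $iCount
EndFunc
```

This avoids the rescanning described above, but it only gives you the current line. To look back at earlier lines you would have to remember them yourself, which is why reading everything into an array is probably still the more convenient route for this problem.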

Link to comment
Share on other sites

There is no doubt that FileReadLine() is the slowest possible method of reading a file. In this case each line contains only one character, so the fastest method of creating the array is probably using StringRegExp().

Just modifying Manadar's code here to use the RegExp.

$sFilePath = @ScriptDir & "\mybsandps.txt"
$aLines = StringRegExp(FileRead($sFilePath), "(?m:^|\n)(\w)", 3)
;; The array is 0-based, so the last valid index is Ubound($aLines) - 1
For $i = 0 to Ubound($aLines) - 1
    ;; To get the previous line, use the variable $aLines[$i-1] (only when $i >= 1)
    ;; To get 5 lines back, use the variable $aLines[$i-5] (only when $i >= 5)
    Switch $aLines[$i]
        Case "B"
            ;; You are now reading a B
        Case "P"
            ;; You are now reading a P
    EndSwitch
Next

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"

Link to comment
Share on other sites

In the logic, I often need to refer to the previous 2-8 lines depending on what they are to make a prediction about what comes next.

I'll give you an example.

If the reader sees:

3. B
4. B
5. ?

Then it makes a prediction B for line 5. Unless:

1. P
2. P
3. B
4. B
5. ?

In which case it makes the prediction P for line 5 (the logic here being that the prediction is based on the patterns repeating in groups of 2).
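
The two rules in this example could be sketched roughly like this. This is only an illustration under assumptions: the function name _PredictNext is made up, the lines are assumed to already be in a 0-based array (as in the StringRegExp version earlier in the thread), and only the two rules described here are covered.

```autoit
; Hypothetical helper: predicts the entry at index $i of a 0-based array
; from the entries before it. Only the two rules from this post are
; covered; anything else returns "" (no prediction).
; Note: "=" compares strings case-insensitively in AutoIt.
Func _PredictNext(Const ByRef $aLines, $i)
    ; Rule 2 is checked first, because it overrides rule 1: P, P, B, B -> P
    If $i >= 4 Then
        If $aLines[$i - 4] = "P" And $aLines[$i - 3] = "P" _
                And $aLines[$i - 2] = "B" And $aLines[$i - 1] = "B" Then Return "P"
    EndIf
    ; Rule 1: B, B -> B
    If $i >= 2 Then
        If $aLines[$i - 2] = "B" And $aLines[$i - 1] = "B" Then Return "B"
    EndIf
    Return ""
EndFunc

; Usage: predict the entry that would follow P, P, B, B
Local $aSample[4] = ["P", "P", "B", "B"]
ConsoleWrite("Prediction: " & _PredictNext($aSample, 4) & @CRLF)
```

Checking the longer pattern before the shorter one is the important design choice here: the more specific rule needs the first chance to match, otherwise the B, B rule would always win.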

Thanks again for your help guys!

Edited by Aeterna
Link to comment
Share on other sites

There is no doubt that FileReadLine() is the slowest possible method of reading a file. In this case each line contains only one character so the fastest method of creating the array is probably using StringRegExp()

Woah, give me some proof that StringRegExp is faster to split lines than using StringSplit and I'm using StringRegExp.
Link to comment
Share on other sites

Woah, give me some proof that StringRegExp is faster to split lines than using StringSplit and I'm using StringRegExp.

Don't see where you used StringRegExp in your code, but nonetheless, I didn't say that my code was faster than your method, nor did I say it was slower. What I did say was that it's a simple way to create the array and it is certainly faster than FileReadLine(). It also eliminates a function call to StringStripCR(). And no, I'm not about to run a speed test to compare them on 30,000 lines.

EDIT: Actually now that I have re-read my first post it does read in a manner which would indicate a faster array and that may or may not be the case.
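
For anyone who does want to measure it, a rough timing sketch using TimerInit() and TimerDiff() might look like the following. The test data is made up (30,000 alternating lines rather than a real file), and the numbers will depend on the machine and AutoIt version, so treat it as a starting point rather than a benchmark.

```autoit
; Build 30,000 lines of made-up test data (alternating B and P).
$sData = ""
For $i = 1 To 15000
    $sData &= "B" & @CRLF & "P" & @CRLF
Next

; Time the StringSplit approach
$hTimer = TimerInit()
$aSplit = StringSplit(StringStripCR($sData), @LF, 1)
$fSplitMs = TimerDiff($hTimer)

; Time the StringRegExp approach
$hTimer = TimerInit()
$aRegExp = StringRegExp($sData, "(?m:^|\n)(\w)", 3)
$fRegExpMs = TimerDiff($hTimer)

ConsoleWrite("StringSplit:  " & $fSplitMs & " ms, " & $aSplit[0] & " elements" & @CRLF)
ConsoleWrite("StringRegExp: " & $fRegExpMs & " ms, " & UBound($aRegExp) & " elements" & @CRLF)
```

Note that the element counts differ by one: the test data ends with a line break, so StringSplit returns a trailing empty element, while the RegExp only captures lines that start with a word character.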

Edited by GEOSoft

George


Link to comment
Share on other sites

Don't see where you used StringRegExp in your code, but none the less, I didn't say that my code was faster than your method nor did I say it was slower. What I did say was that it's a simple way to create the array and it is certainly faster than FileReadLine(). It also eliminates a function call to StringStripCR(). And no, I'm not about to run a speed test to compare them on 30,000 lines.

You're right. I only read your code and not your post.

The main benefit I see in using RegExp is that you don't get that annoying $Line[0] element which indicates the size of the array, which you can just as easily get using Ubound.

Edited by Manadar
Link to comment
Share on other sites

You're right. I only read your code and not your post.

The main benefit I see in using RegExp is that you don't get that annoying $Line[0] element which indicates the size of the array, which you can just as easily get using Ubound.

I agree there for sure. Hopefully at some time in the future AutoIt will only use 0-based arrays. I actually wrote a RegExpReplace function to change any of my existing code from "To $array[0]" (or [0][0]) to "To Ubound($array) - 1".
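
As a side note, StringSplit can also be asked to leave out the count element. A minimal sketch, assuming an AutoIt version in which StringSplit supports flag 2 (no count in element 0); the sample string is made up and stands in for the FileRead() output:

```autoit
; Made-up sample data standing in for the result of FileRead()
$sFileContent = "B" & @CRLF & "P" & @CRLF & "B"

; Flag 3 = 1 (treat the delimiter as one whole string) + 2 (no count in element 0)
$aLines = StringSplit(StringStripCR($sFileContent), @LF, 3)

; The array is now 0-based, so loop to UBound() - 1
For $i = 0 To UBound($aLines) - 1
    ConsoleWrite($i & ": " & $aLines[$i] & @CRLF)
Next
```

If the file ends with a line break, the last element will be an empty string, so the loop may need to skip it.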

George


Link to comment
Share on other sites
