Extracting data from a text file.


Q_Engineer

I need some help, please. I need to extract information from a large text file so that I keep only the data I need. I searched and found a few examples, but they only show how to extract one word or one line of data. (One of them is where I got the StringSplit on "|".) I am attempting to use the code below, but it produces no output file with the lines I need.

Any assistance would be appreciated.

#include <file.au3>
$FPath = @DesktopDir & '\blood.txt'

$StringToLocate = "BATCH|PERSON........"

$ModesA = StringSplit($StringToLocate, "|")

_GetRaidInfoTxt($FPath, $ModesA)

Func _GetRaidInfoTxt($fz_Path, $s_String)
    Local $NewPath = StringTrimRight($fz_Path, 4) & 'Info.txt'; Create a new text file
    Local $nArray = ''
    If _FileReadToArray($fz_Path, $nArray) Then
        For $x = 1 To UBound($nArray) - 1
            $SnS = StringInStr($nArray[$x], $s_String)
            If $SnS Then
                $AddCRLF = StringTrimLeft($nArray[$x], ($SnS - 1)) & @CRLF & @CRLF
                FileWriteLine($NewPath, $AddCRLF)
            EndIf
        Next
    Else
        SetError(0);'Error reading file'
        Return 0
    EndIf

EndFunc

Here is an example of the text file I need the info from.

-------------------------------
BATCH #1037 SMITH, AJ              20/123 45 6789  \18

I[Q2OADKLKAW
WF;ALCKV;FKAF
VFA;LFK;FKAVF
LVAKVKL
RXV\TESTING BLOOD FOR PERSON........            AB POS
-------------------------------
SMITH, AJ              20/123 45 6789  \18
I[Q2ERWOADKLKAW
WFEWR;ALCKWREV;FKAF
VFAER;LFK;FKAVF
LVAREWKWERVKL
-------------------------------
BATCH #1032 JACKSON, BG            20/123 45 6789  \18

I[Q2OADKLKAW
WF;ALCKV;FKAF
VFA;LFK;FKAVF
LVAKVKL
RXV\TESTING BLOOD FOR PERSON........            B NEG
-------------------------------
JACKSON, BG            20/123 45 6789  \18
I[Q2OADKLKAW
WF;ALCKV;FKAF
VFA;LFK;FKAVF
LVAKVKL

The desired output would be something like this (or in columns):

1037 SMITH, AJ 123 45 6789

Blood type: AB POS

1032 JACKSON, BG 123 45 6789

Blood type: B NEG

Thanks in advance.
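
A likely reason the script above writes nothing: StringInStr() expects a string to search for, but _GetRaidInfoTxt() is handed $ModesA, the whole array returned by StringSplit(). The sketch below is one way to test each line against each search term in turn. The variable names are placeholders rather than part of the original post, and the output name bloodInfo.txt simply mirrors what the original StringTrimRight() call would build. It writes each matching line unchanged and does no further formatting.

```autoit
#include <File.au3>

; Paths follow the original post; adjust as needed.
Local $sInPath = @DesktopDir & '\blood.txt'
Local $sOutPath = @DesktopDir & '\bloodInfo.txt'

; StringSplit returns the element count in [0] and the pieces in [1..n].
Local $aTerms = StringSplit("BATCH|PERSON........", "|")

Local $aLines
If Not _FileReadToArray($sInPath, $aLines) Then
    MsgBox(16, 'Error', 'Could not read ' & $sInPath)
    Exit 1
EndIf

Local $hOut = FileOpen($sOutPath, 2) ; 2 = overwrite
For $i = 1 To $aLines[0]
    For $j = 1 To $aTerms[0]
        ; Test one search term (a string) at a time, not the whole array.
        If StringInStr($aLines[$i], $aTerms[$j]) Then
            FileWriteLine($hOut, $aLines[$i])
            ExitLoop ; a line is written once even if both terms match
        EndIf
    Next
Next
FileClose($hOut)
```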


  • Moderators

Something like this maybe?

$NameArray = _NameBTypes('~DF5E6D.txt')
If IsArray($NameArray) Then
    For $xCC = 1 To UBound($NameArray) - 1
        MsgBox(64, 'Info', $NameArray[$xCC])
    Next
EndIf

Func _NameBTypes($hFile)
    Local $hRead = FileRead($hFile), $sHold, $aSplit
    Local $a_Names = StringRegExp($hRead, '(?i:BATCH #)(.*?)\\', 3)
    Local $iNExtended = @extended
    Local $a_BType = StringRegExp($hRead, '(?i:\.\.\.\.\.\.\.\.)(.*?)(?:\N)', 3)
    If @extended And $iNExtended Then
        For $iCC = 0 To UBound($a_Names) - 1
            $aSplit = StringSplit($a_Names[$iCC], '/')
            $sHold &= StringStripWS(StringTrimRight($aSplit[1], 2) & $aSplit[2], 7) & @CRLF & _
                'Blood Type: ' & StringStripWS($a_BType[$iCC], 7) & Chr(01)
        Next
    EndIf
    If $sHold Then Return StringSplit(StringTrimRight($sHold, 1), Chr(01))
    Return SetError(1, 0, 0)
EndFunc

Edit:

:)

Just noticed some of your variable names, that looks like stuff I've used :P ... Where did you find those snippets?

Edited by SmOke_N

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.


Thank you very much! I'll try it out with a big file and see what happens.

One of the references I used was located at Your Webpage

....and another: Your Webpage

I have the other at work and will post it later.


Thank you very much, Smoke_N. I ran it and got an error...

==> Array variable has incorrect number of subscripts or subscript dimension range exceeded.: 
$sHold &= StringStripWS(StringTrimRight($aSplit[1], 2) & $aSplit[2], 7) & @CRLF & 'Blood Type: ' & StringStripWS($a_BType[$iCC], 8) & Chr(01) 
$sHold &= StringStripWS(StringTrimRight($aSplit[1], 2) & $aSplit[2], 7) & @CRLF & 'Blood Type: ' & StringStripWS(^ ERROR

#include <file.au3>
;$FPath = @DesktopDir & '\blood.txt'
$NameArray = _NameBTypes(@DesktopDir & '\blood.txt')
If IsArray($NameArray) Then
    For $xCC = 1 To UBound($NameArray) - 1
        MsgBox(64, 'Info', $NameArray[$xCC])
    Next
EndIf

Func _NameBTypes($hFile)
    Local $hRead = FileRead($hFile), $sHold, $aSplit
    Local $a_Names = StringRegExp($hRead, '(?i:BATCH #)(.*?)\\', 3)
    Local $iNExtended = @extended
    Local $a_BType = StringRegExp($hRead, '(?i:\.\.\.\.\.\.\.\.)(.*?)(?:\N)', 3)
    If @extended And $iNExtended Then
        For $iCC = 0 To UBound($a_Names) - 1
            $aSplit = StringSplit($a_Names[$iCC], '/')
            $sHold &= StringStripWS(StringTrimRight($aSplit[1], 2) & $aSplit[2], 7) & @CRLF & _
                'Blood Type: ' & StringStripWS($a_BType[$iCC], 7) & Chr(01)
        Next
    EndIf
    If $sHold Then Return StringSplit(StringTrimRight($sHold, 1), Chr(01))
    Return SetError(1, 0, 0)
EndFunc

Any idea on this?

Q_E


  • Moderators

I didn't do any error checking; I figured that since you had the full file, you could do that.

The code below should make sure you don't get that error. However, your error basically states that there are more names than there are blood types. You might want to check that you are getting "all" the information.

#include <file.au3>
;$FPath = @DesktopDir & '\blood.txt'
$NameArray = _NameBTypes(@DesktopDir & '\blood.txt')
If IsArray($NameArray) Then
    For $xCC = 1 To UBound($NameArray) - 1
        MsgBox(64, 'Info', $NameArray[$xCC])
    Next
EndIf

Func _NameBTypes($hFile)
    Local $hRead = FileRead($hFile), $sHold, $aSplit, $nUbound
    Local $a_Names = StringRegExp($hRead, '(?i:BATCH #)(.*?)\\', 3)
    Local $iNExtended = @extended
    Local $a_BType = StringRegExp($hRead, '(?i:\.\.\.\.\.\.\.\.)(.*?)(?:\N)', 3)
    If @extended And $iNExtended Then
        If UBound($a_Names) > UBound($a_BType) Then
            $nUbound = $a_BType
        Else
            $nUbound = $a_Names
        EndIf
        For $iCC = 0 To $nUbound
            $aSplit = StringSplit($a_Names[$iCC], '/')
            If $aSplit[0] > 1 Then
                $sHold &= StringStripWS(StringTrimRight($aSplit[1], 2) & $aSplit[2], 7) & @CRLF & _
                    'Blood Type: ' & StringStripWS($a_BType[$iCC], 7) & Chr(01)
            EndIf
        Next
    EndIf
    If $sHold Then Return StringSplit(StringTrimRight($sHold, 1), Chr(01))
    Return SetError(1, 0, 0)
EndFunc


Works like a charm, except it only gives the first set of data and stops. Also, I was attempting to output it to a text file but was having minor issues: what I had in the script was being ignored, and all I got back was the single Info box. Do I need to reference the $NameArray variable or the individual data variables when I attempt to output to a file? Like this...

$NewPath = _NameBTypes(@DesktopDir & '\Newblood.txt'); Create a new text file
If IsArray($NameArray) Then
    For $xCC = 1 To UBound($NameArray) - 1
        MsgBox(64, 'Info', $NameArray[$xCC])
        FileWriteLine($NewPath, $NameArray)
    Next
EndIf

Func _NameBTypes($hFile)
    Local $NewPath = @DesktopDir & '\Newblood.txt'; Create a new text file
    Local $hRead = FileRead($hFile), $sHold, $aSplit, $nUbound

Edited by Q_Engineer

  • Moderators

Look at what you're doing: my MsgBox shows $NameArray[$xCC], and your FileWriteLine() only shows $NameArray.

Edit:

Actually, now that I look at that, I have no idea what you are doing.

Edited by SmOke_N


OK, I got it to append the information to the file with this:

$NewPath = FileOpen(@DesktopDir & '\Newblood.txt', 1) ; 1 = append; the file is created if it does not exist
If IsArray($NameArray) Then
    For $xCC = 1 To UBound($NameArray) - 1
        MsgBox(64, 'Info', $NameArray[$xCC])
        FileWriteLine($NewPath, $NameArray[$xCC])
    Next
EndIf
FileClose($NewPath)

Now to have it read until the end of the file.
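
One possible cause of the loop stopping after the first record, not verified against the full file: in the earlier listing that adds the $nUbound check, $nUbound is assigned the arrays themselves ($a_BType or $a_Names) rather than their sizes, so "For $iCC = 0 To $nUbound" has no usable numeric upper bound. The sketch below keeps the same approach but computes the last index of the shorter array. The blood-type pattern 'PERSON\.+(.*)' and the variable names are substitutions made for this example, not the moderator's original code.

```autoit
Local $aInfo = _NameBTypes(@DesktopDir & '\blood.txt')
If IsArray($aInfo) Then
    Local $hOut = FileOpen(@DesktopDir & '\Newblood.txt', 2) ; 2 = overwrite
    For $i = 1 To $aInfo[0]
        FileWriteLine($hOut, $aInfo[$i])
    Next
    FileClose($hOut)
EndIf

Func _NameBTypes($sFile)
    Local $sRead = FileRead($sFile), $sHold = '', $aSplit
    ; Everything between "BATCH #" and the first backslash on the header line.
    Local $aNames = StringRegExp($sRead, '(?i:BATCH #)(.*?)\\', 3)
    If @error Then Return SetError(1, 0, 0)
    ; Assumed pattern: the rest of the line after "PERSON" and its run of dots.
    Local $aBType = StringRegExp($sRead, 'PERSON\.+(.*)', 3)
    If @error Then Return SetError(2, 0, 0)

    ; The loop bound must be a number: the last index of the shorter array.
    Local $iLast = UBound($aNames) - 1
    If UBound($aBType) - 1 < $iLast Then $iLast = UBound($aBType) - 1

    For $i = 0 To $iLast
        $aSplit = StringSplit($aNames[$i], '/')
        If $aSplit[0] > 1 Then
            ; Drop the two characters before the slash, then collapse the spacing.
            $sHold &= StringStripWS(StringTrimRight($aSplit[1], 2) & $aSplit[2], 7) & @CRLF & _
                    'Blood type: ' & StringStripWS($aBType[$i], 7) & Chr(1)
        EndIf
    Next
    If $sHold = '' Then Return SetError(3, 0, 0)
    Return StringSplit(StringTrimRight($sHold, 1), Chr(1))
EndFunc
```

Pairing the two arrays by index still assumes that every BATCH header has exactly one blood-type line, in the same order. If the counts differ, the pairs may be misaligned even though the loop no longer errors.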


As a firm believer in using the right tool for the job, I would not use AutoIt for this task. Instead, I would use mawk, a command-line text processing tool.

http://www.klabaster.com/freeware.htm

http://www.klabaster.com/progs/mawk32.zip

This one line of code that you can run from the command-line will give you your desired results:

mawk "BEGIN { RS='BATCH #' } { print $1' '$2' '$3' 'substr($4,4)' '$5' '$6'\nBlood type: '$16' '$17'\n' }" newblood.txt

outputs this:

Blood type:

1037 SMITH, AJ 123 45 6789
Blood type: AB POS

1032 JACKSON, BG 123 45 6789
Blood type: B NEG

The first line is extraneous which could be edited out with a text editor or something like tail.exe.

-John


This worked like a charm, and in a fraction of the time Smoke_N and I spent on the AutoIt version. Thank you for the advice. I hope it works well with large files. I'll know tomorrow.

I think I would still like to get the AutoIt app up and running for S-n-Giggles.

type blood.txt | mawk "BEGIN { RS= 'BATCH #' } { print $1' '$2' '$3' 'substr($4,4)' '$5' '$6'\nBlood type: '$16' '$17'\n' }" > newblood.txt

Thanks John and Smoke_N for your assistance and I appreciate the help as I am learning AutoIT.

:)

Q_E

Edited by Q_Engineer

The RS function works, but what if I have a record that has no middle initial? Can I use multiple RS functions, and if so, what is the separator for the new RS?

My example that does not work....

type blood.txt | mawk "BEGIN { RS= 'BATCH #' } { print $1' '$2' '$3' 'substr($4,4)' '$5' '$6'\nBlood type: ' }{ RS= 'PERSON' }{ print $1' '$2'\n'} " > newblood.txt

Way off topic for AutoIT but I gotta ask.

Q_E


  • Moderators

Hmm, wish I could help here, but that's a bit over my head... It looks like StringRegExp() believe it or not :) I'm sure if it can be done in Batch, someone can translate it to AutoIt for you, I'm just not your huckleberry here.
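
A rough AutoIt take on the record-based idea, offered as a sketch under assumptions rather than a tested solution. Instead of counting whitespace-separated fields, which shift when a name has no initial, it matches on the fixed markers ("BATCH #", the "/", and "PERSON" followed by dots) in a single pattern. The pattern and variable names are assumptions based only on the sample data in the first post. One known weakness: a BATCH record with no PERSON line would pick up the blood type of the next record.

```autoit
Local $sRead = FileRead(@DesktopDir & '\blood.txt') ; path taken from the thread

; One match per record: batch number, name, ID, blood type.
; [^\r\n] and [ \t] keep each capture on its own line; (?s) lets .*? span
; the lines between the header and the PERSON line.
Local $sPattern = '(?s)BATCH #(\d+)[ \t]+([^\r\n]*?)[ \t]+\d+/([\d ]*\d)' & _
        '.*?PERSON\.+[ \t]*([^\r\n]*)'
Local $aHits = StringRegExp($sRead, $sPattern, 3)
If @error Then
    MsgBox(16, 'Error', 'No complete records found')
    Exit 1
EndIf

Local $hOut = FileOpen(@DesktopDir & '\Newblood.txt', 2) ; 2 = overwrite
; Flag 3 returns the groups of all matches in one flat array, four per record.
For $i = 0 To UBound($aHits) - 4 Step 4
    FileWriteLine($hOut, $aHits[$i] & ' ' & $aHits[$i + 1] & ' ' & $aHits[$i + 2])
    FileWriteLine($hOut, 'Blood type: ' & StringStripWS($aHits[$i + 3], 3))
    FileWriteLine($hOut, '')
Next
FileClose($hOut)
```

Because the name is captured as "whatever sits between the batch number and the digits before the slash", it does not matter whether the record reads "SMITH, AJ", "SMITH, A", or just "SMITH".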
