How would you suggest I parse this?


Zinefer

I am converting some PDF files to text, but I am not sure how to parse the data I get out of them. Maybe you guys could help me out; I would greatly appreciate it.

Here is what a typical text file would look like:

Dispositions:   1= Sale 2= Has It 3= DQ 4= Red   5= Vacant 6= Does Not Exist 7= Business 8= No Solicit 9= Hard No 10= Can Be Reworked 10-23


         NAME            HOUSE STREET                             CITY               ZIP      DISP TT NI DM PR H S. PROF COMMENTS
                             51144   MERRY LN                       SHELBY TOWNSHIP  48317
                             51307   MERRY LN                       SHELBY TOWNSHIP  48317   2
                             51347   MERRY LN                       SHELBY TOWNSHIP  48317
                             51463   MERRY LN                       SHELBY TOWNSHIP  48317
                             51463   MERRY LN Unit 231             SHELBY TOWNSHIP   48317
                             51543   MERRY LN Unit 231             SHELBY TOWNSHIP   48317   2
                             51577   MERRY LN                       SHELBY TOWNSHIP  48317
CHARLES KOVAK                3414   E POINT CT                    UTICA            48316
ANTHONY FERRO                3523   E POINT CT                    UTICA            48316
SAVITA BHAGWAN              4055   W POINT CT                     SHELBY TOWNSHIP    48316
JAN ELECHICON                4151   W POINT CT                    SHELBY TOWNSHIP    48316
                             51030   SANDSHORE DR                   SHELBY TOWNSHIP  48316
                             51078   SANDSHORE DR                   SHELBY TOWNSHIP  48316
                             51149   SANDSHORE DR                   SHELBY TOWNSHIP  48316
                             51299   SANDSHORE DR                   SHELBY TOWNSHIP  48316
                             51538   SANDSHORE DR                   SHELBY TOWNSHIP  48316
RITA RENO                   51054   SANDSHORES DR                  UTICA               48316
EDWARD E WENG               51077   SANDSHORES DR                  UTICA               48316
MICHAEL R POLAROLO         51101   SANDSHORES DR                   SHELBY TOWNSHIP   48316
VINCENT CONSOLINO           51125   SANDSHORES DR                  UTICA               48316   11
SANDRA HALSTEAD           51174   SANDSHORES DR                SHELBY TOWNSHIP   48316
ELIZABETH CRACCHIOLO         51197   SANDSHORES DR                 UTICA               48316
JOSEPH YOUSIF               51269   SANDSHORES DR                  SHELBY TOWNSHIP   48316
ANNA ALICANDRO             51283   SANDSHORES DR                   UTICA               48316
DJURO OGNJANOVSKI           51317   SANDSHORES DR                  UTICA               48316
MARIA CECE                  3287   SANDY PT                     UTICA              48316
BERNARD IVIN                  3354   SANDY PT                       UTICA              48316
                              3355   SANDY POINT                     SHELBY TOWNSHIP     48316
                              3415   TWENTY-THREE MILE RD           SHELBY TOWNSHIP  48316
                              3541   TWENTY-THREE MILE RD           SHELBY TOWNSHIP  48316
                              3583   TWENTY-THREE MILE RD           SHELBY TOWNSHIP  48316
                              3625   TWENTY-THREE MILE RD           SHELBY TOWNSHIP  48316
                              3667   TWENTY-THREE MILE RD           SHELBY TOWNSHIP  48316
                              3709   TWENTY-THREE MILE RD           SHELBY TOWNSHIP  48316   8
                              3751   TWENTY-THREE MILE RD           SHELBY TOWNSHIP  48316   8
                              3835   TWENTY-THREE MILE RD           SHELBY TOWNSHIP  48316
                              3919   TWENTY-THREE MILE RD           SHELBY TOWNSHIP  48316   2
                              3963   TWENTY-THREE MILE RD           SHELBY TOWNSHIP  48316
DA

I need to get the address (house number and street), the city, and the zip code from every line. I would REALLY appreciate ANY suggestions. Thank you.

Edited by Zinefer

#include <file.au3>
Dim $aRows
Dim $sFilename = "New Text document.txt"
Dim $cSpecial = Chr(0)

_FileReadToArray($sFilename,$aRows)
If @ERROR Then
    MsgBox(48,"Error", "Unable to read from file:" & @CRLF & $sFilename)
    Exit
EndIf

;Begin at line 5 and end 1 line from last
For $X = 5 to $aRows[0]-1
    ;ConsoleWrite($X & ": " & $aRows[$X] & @CRLF)
    
    ;Replace all instances of 2 or more spaces with special character
    $sTemp = StringRegExpReplace($aRows[$X],"(\h){2,}", $cSpecial)
    ;ConsoleWrite($sTemp & @CRLF)
    ConsoleWrite("Row " & $X & @CRLF)
    
    ;Split on special character
    $aCols = StringSplit($sTemp, $cSpecial)
    For $Y = 1 to $aCols[0]
        ConsoleWrite(@TAB & '[' & $Y & "]: " & $aCols[$Y] & @CRLF)
    Next

    ConsoleWrite(@CRLF)
Next
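
If you only want the address, city and zip rather than every column, a single regular expression per row might be enough. This is only a rough sketch: the function name _ParseAddressRow is made up for illustration, and it assumes the columns are always separated by two or more spaces and that the zip is always five digits, which holds for the sample above but may not hold for every file.

```autoit
; Hypothetical helper, not part of the original script.
; Returns a 4-element array: [0] house, [1] street, [2] city, [3] zip.
; Sets @error to 1 if the row does not look like an address row.
Func _ParseAddressRow($sRow)
    ; 2+ spaces, house number, 2+ spaces, street, 2+ spaces, city, 2+ spaces, 5-digit zip.
    ; The name column (if any) is skipped because the pattern is not anchored to the line start.
    Local $sPattern = "\h{2,}(\d+)\h{2,}(\S.*?)\h{2,}(\S.*?)\h{2,}(\d{5})\b"
    ; Flag 1 returns an array holding the captured groups of the first match
    Local $aMatch = StringRegExp($sRow, $sPattern, 1)
    If @error Then Return SetError(1, 0, 0)
    Return $aMatch
EndFunc

; Example call on one row from the sample data
Local $sSample = "CHARLES KOVAK                3414   E POINT CT                    UTICA            48316"
Local $aFields = _ParseAddressRow($sSample)
If Not @error Then
    ConsoleWrite("Address: " & $aFields[0] & " " & $aFields[1] & @CRLF)
    ConsoleWrite("City:    " & $aFields[2] & @CRLF)
    ConsoleWrite("Zip:     " & $aFields[3] & @CRLF)
EndIf
```

Rows that do not match, such as the Dispositions line or the column header, set @error, so inside the loop you could call this on each $aRows[$X] and skip the ones that fail instead of hard-coding the start line.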
