Extract text from txt file


Hello,

These days I'm working on text transformation.

I start from a txt file and have to convert it into another format.

While working on it, I noticed that the job can be fully automated using delimiters.

So my idea is:

- identify the text in a file using the 'static' text around it;

- extract it into a 'transition' file (one value per row, I think);

- pour it back into a new file.

A simple idea, but hard to implement.

txt example:

<v6.50><e0>

@campionato:JUNIORES GIRONE A

@risultati:

Bollengo-Victor Favria 5-1

Fenusma-Aosta 4-2

Ha riposato: Atletico 1912, Real Sarre, Sanson.

@hclassifica: Pt G V N P F S

@classifica:Atletico 1912 41 15 13 2 0 48 7

Monte Cervino 33 16 10 3 3 39 21

San Grato 30 15 9 3 3 36 16

Prossimo turno: Aosta-La Romanese; Atletico 1912-Fenusma; Monte Cervino-Bollengo; Real Canavese-Pont Donnaz; Real Sarre-G. Combin; Victor Favria-Sanson. Riposa: San Grato.

Some delimiters are in bold, and some others are hidden (e.g. TAB).

An 'extractor' can identify many of these values and put them in a 'transition' file:

JUNIORES GIRONE A
Bollengo-Victor Favria
5-1
Fenusma-Aosta
4-2
Atletico 1912, Real Sarre, Sanson.
Atletico 1912
41  
15  
13  
2   
0   
48  
7
etc.....
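The extraction step described above could be sketched roughly like this. It is only an illustration: the function name `_ExtractValues`, the list of static delimiters, and the assumption that a TAB separates the teams from the score are mine, not taken from a real model file.

```autoit
; Hypothetical helper: turn every static delimiter into a line break,
; so that each remaining value ends up on its own line.
Func _ExtractValues($sText)
    ; Assumed delimiter list for the sample; a real one would come from a model file
    Local $aStatic[4] = ["@campionato:", "@risultati:", "Ha riposato: ", @TAB]
    For $i = 0 To UBound($aStatic) - 1
        $sText = StringReplace($sText, $aStatic[$i], @LF, 0, 1) ; 0 = all occurrences, 1 = case-sensitive
    Next
    ; Collapse runs of line breaks, then trim one at each end
    $sText = StringRegExpReplace($sText, "[\r\n]+", @LF)
    $sText = StringRegExpReplace($sText, "^\n|\n$", "")
    Return $sText
EndFunc

; Sample input, shortened from the txt example (TAB before the score is an assumption)
Local $sSample = "@campionato:JUNIORES GIRONE A" & @LF & _
        "@risultati:" & @LF & _
        "Bollengo-Victor Favria" & @TAB & "5-1" & @LF & _
        "Fenusma-Aosta" & @TAB & "4-2" & @LF & _
        "Ha riposato: Atletico 1912, Real Sarre, Sanson."

ConsoleWrite(_ExtractValues($sSample) & @CRLF)
```

With this sample, the output should be the first six lines of the transition file listed above, one value per line.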

The next step is to compile an output file from a template with some 'targets'.

The idea is that every line of the transition file can be referenced by its line number, e.g.:

<line1>***<line6>

which produces this output:

JUNIORES GIRONE A***Atletico 1912, Real Sarre, Sanson.

This way I can extract all the info and then pour it into a new (formatted) file.
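A minimal sketch of that 'target' substitution, assuming the transition file has already been read into a string. The helper name `_FillTemplate` is hypothetical, and the `<lineN>` syntax is the one proposed above.

```autoit
; Hypothetical helper: replace every <lineN> placeholder in $sTemplate
; with the N-th line of the transition data.
; $aLines is a StringSplit-style array: [0] = count, [1..n] = the lines.
Func _FillTemplate($sTemplate, $aLines)
    For $i = 1 To $aLines[0]
        $sTemplate = StringReplace($sTemplate, "<line" & $i & ">", $aLines[$i])
    Next
    Return $sTemplate
EndFunc

; Transition data from the example above, one value per line
Local $sTransition = "JUNIORES GIRONE A" & @LF & "Bollengo-Victor Favria" & @LF & "5-1" & @LF & _
        "Fenusma-Aosta" & @LF & "4-2" & @LF & "Atletico 1912, Real Sarre, Sanson."
Local $aTransition = StringSplit($sTransition, @LF)

ConsoleWrite(_FillTemplate("<line1>***<line6>", $aTransition) & @CRLF)
; prints: JUNIORES GIRONE A***Atletico 1912, Real Sarre, Sanson.
```

Because the closing `>` is part of the placeholder, `<line1>` cannot be confused with `<line10>`.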

A LOOONG (and boring) explanation, but I need some comments.

thank you,

m.


Use _FileReadToArray to load the file.

Loop through the array, using StringInStr to search for your static text.

If it is not found, FileWriteLine the unmodified line to a new file.

If it is found, use StringReplace to change the line, then FileWriteLine the modified line to the new file.
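Those steps might look something like this. It is a sketch, not a finished tool: the helper name `_ReplaceStaticText`, the file names and the replacement text are all placeholders.

```autoit
#include <File.au3>

; Hypothetical helper: copy $sInFile to $sOutFile, replacing $sStatic with $sNew
; on every line that contains it. Returns True on success, False otherwise.
Func _ReplaceStaticText($sInFile, $sOutFile, $sStatic, $sNew)
    Local $aLines
    If Not _FileReadToArray($sInFile, $aLines) Then Return False

    Local $hOut = FileOpen($sOutFile, 2) ; 2 = write mode, erase previous content
    If $hOut = -1 Then Return False

    For $i = 1 To $aLines[0] ; element 0 holds the line count
        If StringInStr($aLines[$i], $sStatic) Then
            ; static text found: write the modified line
            FileWriteLine($hOut, StringReplace($aLines[$i], $sStatic, $sNew))
        Else
            ; not found: write the line unchanged
            FileWriteLine($hOut, $aLines[$i])
        EndIf
    Next
    FileClose($hOut)
    Return True
EndFunc

; Example call; the file names and the replacement text are placeholders
_ReplaceStaticText("input.txt", "output.txt", "@campionato:", "Championship: ")
```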

:mellow:

010101000110100001101001011100110010000001101001011100110010000

001101101011110010010000001110011011010010110011100100001

My Android cat and mouse game
https://play.google.com/store/apps/details?id=com.KaosVisions.WhiskersNSqueek

We're gonna need another Timmy!


I've started coding.

Is there any way to script a text replacement like this?

Every <RanD0m t3xt> becomes <*>

<IP0><HR0,0.3,0,0><BXBEIGE,0,16,1.5,-1,-1><HR0,0.3,19,0><EL4><HS4><CF47><CP12>JUNIORES GIRONE A
<IB><EL7><COROSSO><CF44><CP9>RISULTATI
<CO><CP6.5><CS6><CF34>
<HR0,0.3,1,0><EL2>Bollengo-Victor Favria<QM>5-1
<HR0,0.3,1,0><EL2>Fenusma-Aosta<QM>rinv.
<HR0,0.3,1,0><EL2>G. Combin-Real Canavese<QM>rinv.

becomes:

<*><*><*><*><*><*><*><*>JUNIORES GIRONE A
<*><*><*><*><*>RISULTATI
<*><*><*><*>
<*><*>Bollengo-Victor Favria<*>5-1
<*><*>Fenusma-Aosta<*>rinv.
<*><*>G. Combin-Real Canavese<*>rinv.

thank you,

m.


Oh, now I think you want to look into StringRegExpReplace: match everything between < and > and replace it with *.


Try this one:

$string = "<IP0><HR0,0.3,0,0><BXBEIGE,0,16,1.5,-1,-1><HR0,0.3,19,0><EL4><HS4><CF47><CP12>JUNIORES GIRONE A" & @LF & _
                "<IB><EL7><COROSSO><CF44><CP9>RISULTATI" & @LF & _
                "<CO><CP6.5><CS6><CF34>" & @LF & _
                "<HR0,0.3,1,0><EL2>Bollengo-Victor Favria<QM>5-1" & @LF & _
                "<HR0,0.3,1,0><EL2>Fenusma-Aosta<QM>rinv." & @LF & _
                "<HR0,0.3,1,0><EL2>G. Combin-Real Canavese<QM>rinv."

MsgBox(0, "Test", StringRegExpReplace($string, "(?U)<(.*)>", "<*>"))
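A possible alternative to the ungreedy `(?U)` switch is a negated character class, which stops at the first `>` by construction. This is a small sketch with a made-up helper name (`_MaskTags`), tried on one of the sample lines:

```autoit
; Hypothetical helper: mask every <...> tag as <*>.
Func _MaskTags($sText)
    ; "<", then any run of characters that are not ">", then ">"
    Return StringRegExpReplace($sText, "<[^>]*>", "<*>")
EndFunc

ConsoleWrite(_MaskTags("<HR0,0.3,1,0><EL2>Bollengo-Victor Favria<QM>5-1") & @CRLF)
; prints: <*><*>Bollengo-Victor Favria<*>5-1
```

One difference to keep in mind: unlike `.`, the class `[^>]` also matches line breaks, so a tag split across two lines would be masked as a single tag.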

Br,

UEZ

Please don't send me any personal message and ask for support! I will not reply!

Selection of finest graphical examples at Codepen.io

The own fart smells best!
Her 'sikim hıyar' diyene bir avuç tuz alıp koşma!
¯\_(ツ)_/¯  ٩(●̮̮̃•̃)۶ ٩(-̮̮̃-̃)۶ૐ


I have a working version here, with the needed files.

#Include <Array.au3>


$file = FileOpen("IVGFBA.txt", 0)

; Check if file opened for reading OK
If $file = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf



;~ ------------------------------------------------------------
;~  open file and StringReplace according to the given .ini
;~ ------------------------------------------------------------
$content_box = FileRead($file)                          ;put the file content into a variable
FileClose($file)                                        ;done reading: release the handle before $file is reused

$file_filter = FileOpen("model_A.ini", 0)               ;lists every NOT interesting value (each one becomes a delimiter)
; Check if file opened for reading OK
If $file_filter = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf

; change every NOT interesting value into *
While 1
    $line = FileReadLine($file_filter)
    If @error = -1 Then ExitLoop
        
    if $line <> "" then $content_box = StringReplace($content_box, $line, "*", 0)
Wend

FileClose($file_filter)



;~ ------------------------------------------------------------
;~  support for 'MACRO' filter [tab] - [enter] - etc.
;~ ------------------------------------------------------------
$file_filter = FileOpen("model_A.ini", 0)
$file_filter_contenuto = FileRead($file_filter)
FileClose($file_filter)


$result = StringInStr($file_filter_contenuto, '[TAB]')
if $result <> 0 then $content_box = StringReplace($content_box, @TAB, "*")

;~ ------------------------------------------------------------
;~  support for cosmetic text hack
;~ ------------------------------------------------------------
$content_box = StringReplace($content_box, "* ", "*")
$content_box = StringReplace($content_box, " *", "*")






;~ ------------------------------------------------------------
;~  save result and StringSplit row by row
;~ ------------------------------------------------------------
$file = FileOpen("mapped.txt", 10)
FileWrite($file, $content_box)
FileClose($file)

;open the saved file to analyze it and extract the values
;"naval battle" (row/column) coordinates identify each value [so happy :) ]
$file = FileOpen("mapped.txt", 0)
If $file = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf

$line_counter = 0
; Read in lines of text until the EOF is reached
While 1
    $line = FileReadLine($file)                     ;sequential read: no rescan from line 1 on every pass
    If @error = -1 Then ExitLoop
    $line_counter = $line_counter + 1               ;current row number
    $valori = StringSplit($line, "*")               ;$valori[0] = number of columns

    For $r = 1 To $valori[0]                        ;$r is the column number
        If $valori[$r] <> "" Then                   ;show only the non-empty values
            MsgBox(0, "info : ", "Coord: " & @CRLF & "ROW:   " & $line_counter & @CRLF & "Columns:" & $r & @CRLF & "Value: " & $valori[$r])
        EndIf
    Next
Wend

FileClose($file)
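For the output step, the same row/column coordinates could be looked up directly instead of being shown in message boxes. A rough sketch, with a hypothetical helper `_GetCell` and a two-line sample of "*"-mapped text that I made up from the data above:

```autoit
; Hypothetical helper: return the value at ($iRow, $iCol) of "*"-delimited text.
; Both coordinates are 1-based; on a missing coordinate it sets @error and returns "".
Func _GetCell($sMapped, $iRow, $iCol)
    Local $aRows = StringSplit(StringStripCR($sMapped), @LF)
    If $iRow < 1 Or $iRow > $aRows[0] Then Return SetError(1, 0, "")
    Local $aCols = StringSplit($aRows[$iRow], "*")
    If $iCol < 1 Or $iCol > $aCols[0] Then Return SetError(2, 0, "")
    Return $aCols[$iCol]
EndFunc

; Assumed sample of mapped text: two rows, four columns each
Local $sMapped = "**Bollengo-Victor Favria*5-1" & @CRLF & "**Fenusma-Aosta*rinv."

ConsoleWrite(_GetCell($sMapped, 2, 3) & @CRLF)
; prints: Fenusma-Aosta
```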

I feel serendipity in the air. This solves my problem, and a lot of other things.

The last step, the output model, is still missing, but the engine is ready.

Viva AutoIt,

m.

Edited by myspacee

  • 5 weeks later...

I'm still working on this project and need some help again.

After the first step, I need to remove some trash:

<IP0><HR0,0.3,0,0><BXBEIGE,0,12,1.5,-1,-1><HR0,0.3,15,0><EL4><HS4><CF46><CP10>GIOV. FASCIA B GIRONE A<CF31>
<IB><EL5><COROSSO><CF44><CP9>RISULTATI<EL2><CO><CP7><CL8><CF34>
<HR0,0.3,-1,0>Atletico 1912-Evancon<QM>4-1
<HR0,0.3,-1,0>Charvensod-Banchette<QM>4-4
<HR0,0.3,-1,0>Coll. Pedenea-Castellamonte<QM>4-1
<HR0,0.3,-1,0>Rivarolese F-Pont Donnaz<QM>n.d.
<HR0,0.3,-1,0>Rivarolese M-Real Canavese<QM>2-1
<HR0,0.3,-1,0>Ha riposato: Aygreville.<QM>[riga_9_2]
<HR0,0.3,-1,0>[riga_10_1]<QM>
<EL2.5><HR0,0.5,-1,0><EL-8>
<TSL50,c+10,c+10,c+10,c+10,c+10,c+10,c+10><COROSSO>SQUADRE<TB>P<TB>G<TB>V<TB>N<TB>P<TB>F<TB>S<CONERO><CP7><CL8><QC>
<HR120,0.3,-1,0><CF34>9<TB>4<TB><CF33>3<TB>0<TB>1<TB>36<TB>9<TB>
<HR120,0.3,-1,0><CF34>Real Canavese<TB>9<TB><CF33>4<TB>3<TB>0<TB>1<TB>21<TB>
<HR120,0.3,-1,0><CF34>Rivarolese M<TB>9<TB><CF33>3<TB>3<TB>0<TB>0<TB>8<TB>
<HR120,0.3,-1,0><CF34>Coll. Pedenea<TB>6<TB><CF33>5<TB>2<TB>0<TB>3<TB>26<TB>
<HR120,0.3,-1,0><CF34>Charvensod<TB>4<TB><CF33>4<TB>1<TB>1<TB>2<TB>9<TB>
<HR120,0.3,-1,0><CF34>Atletico 1912<TB>3<TB><CF33>4<TB>1<TB>0<TB>3<TB>7<TB>
<HR120,0.3,-1,0><CF34>Aygreville<TB>3<TB><CF33>3<TB>1<TB>0<TB>2<TB>3<TB>
<HR120,0.3,-1,0><CF34>Castellamonte<TB>3<TB><CF33>4<TB>1<TB>0<TB>3<TB>13<TB>
<HR120,0.3,-1,0><CF34>Pont Donnaz<TB>3<TB><CF33>3<TB>1<TB>0<TB>2<TB>7<TB>
<HR120,0.3,-1,0><CF34>Rivarolese F<TB>0<TB><CF33>3<TB>0<TB>0<TB>3<TB>1<TB>
<HR120,0.3,-1,0><CF34>[riga_22_1]<TB>Atletico 1912-Rivarolese F<TB><CF33>Banchette-Rivarolese M<TB>Castellamonte-Pont Donnaz<TB>Evancon-Charvensod<TB>Real Canavese-Aygreville.<TB>Coll. Pedenea. <TB>
<HR120,0.3,-1,0><CF34>[riga_23_1]<TB>[riga_23_2]<TB><CF33>[riga_23_3]<TB>[riga_23_4]<TB>[riga_23_5]<TB>[riga_23_6]<TB>[riga_23_7]<TB>
<HR120,0.3,-1,0><CF34>[riga_24_1]<TB>[riga_24_2]<TB><CF33>[riga_24_3]<TB>[riga_24_4]<TB>[riga_24_5]<TB>[riga_24_6]<TB>[riga_24_7]<TB>
<HR120,0.3,-1,0><EL2><COROSSO><CF44><CP9>PROSSIMO TURNO<IB><CO><CF34><CP6>
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<COROSSO><CF300><CP4> n <CP6><CF34>
<CONERO>^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<COROSSO><CF300><CP4> n <CP6><CF34>
<CONERO>^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<COROSSO><CF300><CP4> n <CP6><CF34>
<CONERO>^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<COROSSO><CF300><CP4> n <CP6><CF34>
<CONERO>^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<COROSSO><CF300><CP4> n <CP6><CF34>
<CONERO>^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<COROSSO><CF300><CP4> n <CP6><CF34>
<CONERO>^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^<COROSSO><CF300><CP4> n <CP6><CF34>
<CONERO>^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

I need to remove, or replace, all text between [ and ].

The text inside is [RanD0m], so I need some help with the StringRegExpReplace function.

thank you for any help,

m.

Edited by myspacee

Try this:

StringRegExpReplace($text, "\[[^\]]*\]", "")

where $text is the text from above.
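When a line holds several bracketed items, as in the [riga_23_1]<TB>[riga_23_2] rows, it matters whether the pattern is greedy. A small comparison on a shortened sample line (the sample is trimmed from the data above for illustration):

```autoit
Local $sLine = "<CF34>[riga_23_1]<TB>[riga_23_2]<TB>"

; Greedy .* : one match runs from the first "[" to the last "]" on the line,
; so the <TB> between the two items is removed as well
ConsoleWrite(StringRegExpReplace($sLine, "\[(.*)\]", "") & @CRLF)
; prints: <CF34><TB>

; Negated class: each [...] pair is matched on its own
ConsoleWrite(StringRegExpReplace($sLine, "\[[^\]]*\]", "") & @CRLF)
; prints: <CF34><TB><TB>
```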

Br,

UEZ
