read a text file and extract information


I am trying to read a file from my server and extract a single line of text.
The information I want to retrieve from the file is the only line that is written as such:
Version: 0.74.1-114-g62a22d6
No other line in the file begins with Version:
I need to get the version number from that line: 0.74.1-114

Can you help?

Func Read()
    Local $file = FileReadToArray('https://shark007.net/build.log')
    Local $version = _ArrayFindAll($file, 'Version:')
    _ArrayDisplay($version)
EndFunc

 


But the file is 10,000 lines long, and the line I want moves around, somewhere near line 3000.

The 007 in my name is a reference to Sean Connery... I always liked his movies. Not so much the Bond films.
Medicine Man and The Rock.

 

I'm guessing some regex, but I wouldn't even know where to start with that.
Luckily, "Version:" with its trailing colon occurs only once in the 10,000 lines.


Well, it's https. If it were plain http, you could open a TCP socket, chit-chat until you get your string, and close the socket.
But https traffic is encrypted (and often compressed on the fly), so a raw socket won't do.

#include <Inet.au3>
#include <Array.au3>

Exit Read()
Func Read()
    Local $hTimer = TimerInit(), $sLogFile = _INetGetSource('https://shark007.net/build.log')
    ConsoleWrite(StringLeft($sLogFile, 100) & @CRLF)
    Local $aLogFile = StringSplit($sLogFile, @CRLF)
    _ArrayDisplay($aLogFile, TimerDiff($hTimer) & " ms.")
;~     Local $version = _ArrayFindAll($aLogFile, 'Version:')
;~     _ArrayDisplay($version)
EndFunc

it takes less than a sec to get the file :) 

#include <Inet.au3>
#include <Array.au3>

Exit Read()
Func Read()
    Local $hTimer = TimerInit(), $sLogFile = _INetGetSource('https://shark007.net/build.log')
    ConsoleWrite(TimerDiff($hTimer) & " ms." & @CRLF)
    ConsoleWrite(StringLeft($sLogFile, 100) & @CRLF)
    Local $sVersion = "not found", $aLogFile = StringSplit($sLogFile, @CRLF)
;~  _ArrayDisplay($aLogFile, TimerDiff($hTimer) & " ms.")
    For $n = 1 To UBound($aLogFile) -1
        If StringInStr($aLogFile[$n], "Version: ") Then
            $sVersion = $aLogFile[$n]
            ExitLoop
        EndIf
    Next
    MsgBox(0,@ScriptName, $sVersion & @CR & "( found in " & Round(TimerDiff($hTimer)) & " ms. )", 120)
EndFunc

... _ArrayFindAll() did not return what is expected but the for loop does :) 


Another way (it would be interesting to test speed of both solutions) :

#include <Constants.au3>
#include <InetConstants.au3>

Example()

Func Example()
    Local $dData = InetRead("https://shark007.net/build.log", $INET_FORCERELOAD)
    Local $sData = BinaryToString($dData)
    Local $sVersion = StringRegExp($sData, "Version: ([^\v]*)", 1)[0] ; flag 1 = array of captures from the first match
    MsgBox($MB_SYSTEMMODAL, "",  $sVersion)
EndFunc   ;==>Example

I believe @argumentum's solution should be faster, since it stops parsing at the first hit...
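The speed question could be checked with a rough timing harness along these lines. This is only a sketch: it runs both approaches on a synthetic 10,000-line string built in memory (the filler text and the position of the target line are assumptions), so it measures parsing only, not the download.

```autoit
; Build a hypothetical 10,000-line log with the target at line 3000.
Local $sLog = ""
For $i = 1 To 10000
    If $i = 3000 Then
        $sLog &= "Version: 0.74.1-114-g62a22d6" & @LF
    Else
        $sLog &= "filler line " & $i & @LF
    EndIf
Next

; Approach 1: split into an array, then loop until the first hit.
Local $hTimer = TimerInit()
Local $aLines = StringSplit($sLog, @LF)
For $n = 1 To $aLines[0]
    If StringInStr($aLines[$n], "Version: ") Then ExitLoop
Next
ConsoleWrite("split + loop: " & TimerDiff($hTimer) & " ms" & @CRLF)

; Approach 2: a single regex over the whole string (flag 1 = first match only).
$hTimer = TimerInit()
Local $aMatch = StringRegExp($sLog, "Version: ([^\v]*)", 1)
ConsoleWrite("regex: " & TimerDiff($hTimer) & " ms" & @CRLF)
```

In the real script the download itself may well dominate either way, so any difference measured here might not matter in practice.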

43 minutes ago, Nine said:

Another way

#include <Constants.au3>
#include <InetConstants.au3>
Exit MsgBox($MB_SYSTEMMODAL, "", Example("https://shark007.net/build.log"))
Func Example($sURL)
    Local $v = StringRegExp(BinaryToString(InetRead($sURL, $INET_FORCERELOAD)), "Version: ([^\v]*)", 1)[0]
    Return StringLeft($v, StringInStr($v, "-", 0, -1)-1)
EndFunc   ;==>Example

actually, the OP asks for "0.74.1-114" only. But I don't know RegEx from FedEx :D 
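For what it's worth, a regex can also capture just the "0.74.1-114" part directly. The sketch below works on a hypothetical in-memory sample instead of the downloaded file, and it checks @error before indexing the result, since StringRegExp sets @error when nothing matches.

```autoit
; Hypothetical sample; the real input would be the downloaded log text.
Local $sText = "foo" & @LF & "Version: 0.74.1-114-g62a22d6" & @LF & "bar"

; Flag 1 returns an array holding the capture groups of the first match.
; The pattern captures three dot-separated numbers, a dash, and one more number.
Local $aMatch = StringRegExp($sText, "Version: (\d+\.\d+\.\d+-\d+)", 1)
If @error Then
    ConsoleWrite("no version line found" & @CRLF)
Else
    ConsoleWrite($aMatch[0] & @CRLF)
EndIf
```

The guard matters because indexing `[0]` straight onto a failed StringRegExp call stops the script with a subscript error.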

3 minutes ago, Shark007 said:

I changed the dash to a dot so it is a proper file version

#include <Constants.au3>
#include <InetConstants.au3>
Exit MsgBox($MB_SYSTEMMODAL, "", Example("https://shark007.net/build.log"))
Func Example($sURL)
    Local $v = StringRegExp(BinaryToString(InetRead($sURL, $INET_FORCERELOAD)), "Version: ([^\v]*)", 1)[0]
    Return StringReplace(StringLeft($v, StringInStr($v, "-", 0, -1)-1), "-", ".")
EndFunc   ;==>Example

ok then :) 


@Shark007 : if you want your original script to work, there is one line to change in it :

#include <Array.au3>

Func Read()
    Local $file = FileReadToArray('https://shark007.net/build.log')

    ; Local $version = _ArrayFindAll($file, 'Version:') ; original: default compare needs the whole element to match
    Local $version = _ArrayFindAll($file, 'Version:', 0, 0, 0, 1) ; 1 = partial search

    _ArrayDisplay($version)
EndFunc

Result: [screenshot: 1 line found]
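Note that _ArrayFindAll() returns the indices of the matching elements, not the elements themselves, so one more lookup is needed to get the text of the line. A minimal sketch, using a small hypothetical array in place of the file contents:

```autoit
#include <Array.au3>

; Hypothetical stand-in for the lines of the log file.
Local $aLines[3] = ["build started", "Version: 0.74.1-114-g62a22d6", "build finished"]

; $iCompare = 1 requests a partial (substring) match.
Local $aHits = _ArrayFindAll($aLines, "Version:", 0, 0, 0, 1)
If @error Then
    ConsoleWrite("not found" & @CRLF)
Else
    ; $aHits holds indices into $aLines, so look the line up in the source array.
    ConsoleWrite($aLines[$aHits[0]] & @CRLF)
EndIf
```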


@argumentum : You're welcome, as always.

String manipulation can be interesting too (StringInStr and StringMid are really fast). It avoids the time spent creating an array and then looping through it to find what you want. Also, if you're a bit reluctant with RegEx (that's me lol), string manipulation can sometimes help.

For example, this log file uses @LF or @CRLF as line separators, so this works fine and fast:

#include <Inet.au3>

Local $sLogFile = _INetGetSource('https://shark007.net/build.log')
Local $iPosVersion = StringInStr($sLogFile, "Version: ")
If $iPosVersion Then
    Local $iPosLF = StringInStr($sLogFile, @LF, Default, 1, $iPosVersion) ; start search at "Version: "
    MsgBox(0, "", StringMid($sLogFile, $iPosVersion, $iPosLF - $iPosVersion))
EndIf


But if the line separators were @CR only, then it would have failed, while Nine's RegEx (based on \v which takes care of @CR or @LF or @CRLF) would have worked.
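If you want to stay with string functions and still cope with @CR-only files, one option is to look for both kinds of line break and cut at whichever comes first. This is just a sketch: _ExtractVersionLine is a made-up helper name, and the sample text is hypothetical.

```autoit
#include <MsgBoxConstants.au3>

; Hypothetical sample text: three lines separated by bare @CR only.
Local $sSample = "line one" & @CR & "Version: 0.74.1-114-g62a22d6" & @CR & "line three"

MsgBox($MB_SYSTEMMODAL, "", _ExtractVersionLine($sSample))

; Returns the text after "Version: " up to the next @CR or @LF, or "" if the marker is absent.
Func _ExtractVersionLine($sText)
    Local $iStart = StringInStr($sText, "Version: ")
    If Not $iStart Then Return ""
    $iStart += StringLen("Version: ")
    ; Find the nearest line break of either kind after the marker.
    Local $iCR = StringInStr($sText, @CR, 0, 1, $iStart)
    Local $iLF = StringInStr($sText, @LF, 0, 1, $iStart)
    Local $iEnd = StringLen($sText) + 1 ; default: the marker is on the last line
    If $iCR And $iCR < $iEnd Then $iEnd = $iCR
    If $iLF And $iLF < $iEnd Then $iEnd = $iLF
    Return StringMid($sText, $iStart, $iEnd - $iStart)
EndFunc   ;==>_ExtractVersionLine
```

With @CRLF separators the @CR is found first, so the returned text carries no trailing carriage return either.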

5 hours ago, argumentum said:
#include <Constants.au3>
#include <InetConstants.au3>
Exit MsgBox($MB_SYSTEMMODAL, "", Example("https://shark007.net/build.log"))
Func Example($sURL)
    Local $v = StringRegExp(BinaryToString(InetRead($sURL, $INET_FORCERELOAD)), "Version: ([^\v]*)", 1)[0]
    Return StringReplace(StringLeft($v, StringInStr($v, "-", 0, -1)-1), "-", ".")
EndFunc   ;==>Example

ok then :) 

This is some damn nice code too. I just had a chance to test this after grabbing some much needed sleep.


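Or this one-liner:

#include <Constants.au3>
#include <InetConstants.au3>

MsgBox($MB_SYSTEMMODAL, "", Example1("https://shark007.net/build.log"))
Func Example1($sURL)
    ; Dots are escaped so they match a literal "."; the "-" before the build number is then replaced by "."
    Return StringReplace(StringRegExp(BinaryToString(InetRead($sURL, $INET_FORCERELOAD)), "Version: (\d+\.\d+\.\d+-\d+)", 1)[0], "-", ".")
EndFunc   ;==>Example1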

 
