read a text file and extract information


I am trying to read a file from my server and extract a single line of text.
The information I want to retrieve from the file is the only line that is written as such:
Version: 0.74.1-114-g62a22d6
No other line in the file begins with Version:
I need to get that version number from the file, 0.74.1-114

Can you help?

Func Read()
    Local $file = FileReadToArray('https://shark007.net/build.log')
    Local $version = _ArrayFindAll($file, 'Version:')
    _ArrayDisplay($version)
EndFunc

 


But the file is about 10,000 lines long, and the line I want to retrieve moves around, somewhere near line 3000.

The 007 in my name is a reference to Sean Connery... I always liked his movies. Not so much the Bond films.
Medicine Man and The Rock.

 

I'm guessing this needs some regex, but I wouldn't even know where to start with that.
Luckily, "Version:" with its trailing colon is the only occurrence of that format in the 10,000 lines.

Edited by Shark007

Well, it's https. With plain http you could open a TCP socket, read until you get your string, and then close the socket.
But https is encrypted and is usually compressed on the fly, so that shortcut does not apply here.

#include <Inet.au3>
#include <Array.au3>

Exit Read()
Func Read()
    Local $hTimer = TimerInit(), $sLogFile = _INetGetSource('https://shark007.net/build.log')
    ConsoleWrite(StringLeft($sLogFile, 100) & @CRLF)
    Local $aLogFile = StringSplit($sLogFile, @CRLF)
    _ArrayDisplay($aLogFile, TimerDiff($hTimer) & " ms.")
;~     Local $version = _ArrayFindAll($aLogFile, 'Version:')
;~     _ArrayDisplay($version)
EndFunc

it takes less than a sec to get the file :) 


#include <Inet.au3>
#include <Array.au3>

Exit Read()
Func Read()
    Local $hTimer = TimerInit(), $sLogFile = _INetGetSource('https://shark007.net/build.log')
    ConsoleWrite(TimerDiff($hTimer) & " ms." & @CRLF)
    ConsoleWrite(StringLeft($sLogFile, 100) & @CRLF)
    Local $sVersion = "not found", $aLogFile = StringSplit($sLogFile, @CRLF)
;~  _ArrayDisplay($aLogFile, TimerDiff($hTimer) & " ms.")
    For $n = 1 To UBound($aLogFile) -1
        If StringInStr($aLogFile[$n], "Version: ") Then
            $sVersion = $aLogFile[$n]
            ExitLoop
        EndIf
    Next
    MsgBox(0,@ScriptName, $sVersion & @CR & "( found in " & Round(TimerDiff($hTimer)) & " ms. )", 120)
EndFunc

... _ArrayFindAll() did not return what was expected, but the For loop does :)

Edited by argumentum
=)


Another way (it would be interesting to test the speed of both solutions):

#include <Constants.au3>
#include <InetConstants.au3>

Example()

Func Example()
    Local $dData = InetRead("https://shark007.net/build.log", $INET_FORCERELOAD)
    Local $sData = BinaryToString($dData)
    Local $sVersion = StringRegExp($sData, "Version: ([^\v]*)", 1)[0]
    MsgBox($MB_SYSTEMMODAL, "",  $sVersion)
EndFunc   ;==>Example

I believe @argumentum's solution should be faster, since it stops parsing at the first hit...

Edited by Nine

43 minutes ago, Nine said:

Another way

#include <Constants.au3>
#include <InetConstants.au3>
Exit MsgBox($MB_SYSTEMMODAL, "", Example("https://shark007.net/build.log"))
Func Example($sURL)
    Local $v = StringRegExp(BinaryToString(InetRead($sURL, $INET_FORCERELOAD)), "Version: ([^\v]*)", 1)[0]
    Return StringLeft($v, StringInStr($v, "-", 0, -1)-1)
EndFunc   ;==>Example

actually, the OP asks for "0.74.1-114" only. But I don't know RegEx from FedEx :D 
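For anyone who would rather let the pattern do the trimming, here is a minimal sketch that captures only "0.74.1-114" in one step. It assumes the version line always ends in one extra "-segment" (the "-g62a22d6" part). The sample text in $sLog is made up for illustration so the snippet runs without a download.

```autoit
#include <StringConstants.au3>

; Hypothetical sample standing in for the downloaded log; only the Version line comes from the thread.
Local $sLog = "some line" & @CRLF & "Version: 0.74.1-114-g62a22d6" & @CRLF & "another line"

; (\S+) is greedy, so it backtracks to the LAST dash; the final "-segment" stays outside the capture group.
Local $aMatch = StringRegExp($sLog, "Version: (\S+)-[^-\s]+", $STR_REGEXPARRAYMATCH)
If @error Then
    ConsoleWrite("Version line not found" & @CRLF)
Else
    ConsoleWrite($aMatch[0] & @CRLF) ; writes 0.74.1-114 for the sample above
EndIf
```

Checking @error before indexing the result also avoids a crash when the line is missing, which the one-liner with [0] on the end does not guard against.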


3 minutes ago, Shark007 said:

I changed the dash to a dot so it is a proper file version

#include <Constants.au3>
#include <InetConstants.au3>
Exit MsgBox($MB_SYSTEMMODAL, "", Example("https://shark007.net/build.log"))
Func Example($sURL)
    Local $v = StringRegExp(BinaryToString(InetRead($sURL, $INET_FORCERELOAD)), "Version: ([^\v]*)", 1)[0]
    Return StringReplace(StringLeft($v, StringInStr($v, "-", 0, -1)-1), "-", ".")
EndFunc   ;==>Example

ok then :) 


@Shark007: if you want your original script to work, there is only one line to change in it:

Func Read()
    Local $file = FileReadToArray('https://shark007.net/build.log')

    ; Local $version = _ArrayFindAll($file, 'Version:')
      Local $version = _ArrayFindAll($file, 'Version:', 0, 0, 0, 1) ; 1 = partial search
    
    _ArrayDisplay($version)
EndFunc

Result: one line found (screenshot not reproduced).


@pixelsearch, if you look at _ArrayFindAll(), the for loop is faster than the _ArrayFindAll() function. But thanks for the clarification. I did not catch that :) 

@Nine, the RegExp should be faster than the for loop.
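Rather than guessing, both approaches can be timed on the same data. The following is only a rough sketch: it builds a synthetic 10,000-line log (the filler lines are invented, with the version placed at line 3000 as in the original question) and times the parsing only, not the download. Actual numbers will vary by machine, so none are quoted here.

```autoit
#include <StringConstants.au3>

; Build a synthetic log: 10,000 lines, version on line 3000 (filler content is made up).
Local $sLog = ""
For $i = 1 To 10000
    If $i = 3000 Then
        $sLog &= "Version: 0.74.1-114-g62a22d6" & @CRLF
    Else
        $sLog &= "filler line " & $i & @CRLF
    EndIf
Next

; Approach 1: split into an array, then loop until the first hit.
Local $hTimer = TimerInit()
Local $aLines = StringSplit($sLog, @CRLF, $STR_ENTIRESPLIT)
Local $sLoop = "not found"
For $n = 1 To $aLines[0]
    If StringInStr($aLines[$n], "Version: ") Then
        $sLoop = $aLines[$n]
        ExitLoop
    EndIf
Next
Local $fLoop = TimerDiff($hTimer)

; Approach 2: a single regular expression over the whole string.
$hTimer = TimerInit()
Local $aMatch = StringRegExp($sLog, "Version: ([^\v]*)", $STR_REGEXPARRAYMATCH)
Local $sRegEx = @error ? "not found" : $aMatch[0]
Local $fRegEx = TimerDiff($hTimer)

ConsoleWrite("Loop:   " & $sLoop & " in " & $fLoop & " ms" & @CRLF)
ConsoleWrite("RegExp: " & $sRegEx & " in " & $fRegEx & " ms" & @CRLF)
```

Note that the loop timing includes the cost of StringSplit(), since building the array is part of that approach.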

Edited by argumentum


@argumentum : You're welcome, as always.

String manipulation can be interesting too (StringInStr and StringMid are really fast). It avoids the time spent creating an array and then looping through it to find what you want. Also, if you're a bit reluctant with RegEx (that's me lol), then string manipulation can help sometimes.

For example, because the line separators in this log file are @LF or @CRLF, this works fine and fast:

#include <Inet.au3> ; needed for _INetGetSource()

Local $sLogFile = _INetGetSource('https://shark007.net/build.log')
Local $iPosVersion = StringInStr($sLogFile, "Version: ")
If $iPosVersion Then
    ; Find the first @LF after "Version: " (casesense = Default, occurrence = 1, start = $iPosVersion)
    Local $iPosLF = StringInStr($sLogFile, @LF, Default, 1, $iPosVersion)
    MsgBox(0, "", StringMid($sLogFile, $iPosVersion, $iPosLF - $iPosVersion))
EndIf


But if the line separators were @CR only, then it would have failed, while Nine's RegEx (based on \v which takes care of @CR or @LF or @CRLF) would have worked.
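One way to make the string approach tolerate @CR-only files is to normalise the line endings first. This is a sketch rather than a tested fix for the real log: the sample text is invented, and it relies on \R, which in AutoIt's regular expressions matches any newline sequence (@CRLF, @LF or @CR).

```autoit
; Hypothetical sample using @CR-only line endings, the case that breaks a plain @LF search.
Local $sLog = "first line" & @CR & "Version: 0.74.1-114-g62a22d6" & @CR & "last line"

; Convert every newline sequence to a single @LF.
Local $sNorm = StringRegExpReplace($sLog, "\R", @LF)

Local $iPosVersion = StringInStr($sNorm, "Version: ")
If $iPosVersion Then
    ; Search for the end of the line, starting at "Version: ".
    Local $iPosLF = StringInStr($sNorm, @LF, 0, 1, $iPosVersion)
    ; If the version line is the last line, there is no trailing @LF.
    If $iPosLF = 0 Then $iPosLF = StringLen($sNorm) + 1
    ConsoleWrite(StringMid($sNorm, $iPosVersion, $iPosLF - $iPosVersion) & @CRLF)
EndIf
```

The extra StringRegExpReplace() pass costs some time on a large file, so it only makes sense when the line endings are not known in advance.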


5 hours ago, argumentum said:
#include <Constants.au3>
#include <InetConstants.au3>
Exit MsgBox($MB_SYSTEMMODAL, "", Example("https://shark007.net/build.log"))
Func Example($sURL)
    Local $v = StringRegExp(BinaryToString(InetRead($sURL, $INET_FORCERELOAD)), "Version: ([^\v]*)", 1)[0]
    Return StringReplace(StringLeft($v, StringInStr($v, "-", 0, -1)-1), "-", ".")
EndFunc   ;==>Example

ok then :) 

This is some damn nice code too. I just had a chance to test this after grabbing some much needed sleep.
