String In String


I've read the manual section on strings a few times. Is there a way to get the strings that sit between <FONT SIZE=2> and </FONT> and output them one by one? Here's an example of one line to analyze:

</FONT></TD><TD NOWRAP><FONT SIZE=2>+48 12 00 00 00</FONT></TD><TD NOWRAP><FONT SIZE=2> </FONT></TD><TD NOWRAP><FONT SIZE=2>+48 12 000 00 05</FONT></TD><TD NOWRAP><FONT SIZE=2> </FONT></TD><TD NOWRAP><FONT SIZE=2>6125</FONT></TD><TD NOWRAP><FONT SIZE=2> </FONT></TD><TD NOWRAP><FONT SIZE=2>+48 101 100 000</FONT></TD><TD NOWRAP><FONT SIZE=2>SSS</FONT></TD></TR>

I can of course do it the hard way, like I did to get the username that was between <a> and </a>, but those tags occurred only once per line. With FONT appearing multiple times it's harder. I hope there's some nifty, fast feature I didn't know about.

Func FindName($adress)
        $file = FileOpen($tempname, 10)
        If $file = -1 Then
            MsgBox(0, "Error", "Unable to open file " & $tempname )
        Exit
        EndIf
        FileWrite( $file, _INetGetSource($adress))
        Dim $aFile
        If Not _FileReadToArray($tempname,$aFile) Then
            MsgBox(4096,"Error", " Error reading log to Array    error:" & @error)
        Exit
        EndIf
        For $x = 1 to $aFile[0]
            If StringInStr($aFile[$x], $name) Then
                If StringInStr($aFile[$x], "OpenDocument") Then
                   ; GETS NAMES
                    $FirstSplit = StringSplit($aFile[$x], 'OpenDocument">',1)
                    $SecondSplit = StringSplit($FirstSplit[2], '</a>',1)
                   ; GETS PHONE NUMBER
                    $ThirdSplit = StringSplit($SecondSplit[2], '<FONT SIZE=2>',1)
                    MsgBox(1,"Text", $SecondSplit[2])
                    ConsoleWrite( $SecondSplit[2])
                   ;$username = $SecondSplit[1]
                    MsgBox(1,"Text", $ThirdSplit[2])
                EndIf
            EndIf
        Next
EndFunc

My little company: Evotec (PL version: Evotec)


Hi,

I didn't quite get what you're really trying to do, but maybe this helps.

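Func _StringBetween($s, $from, $to)
    ; Find $from; bail out with @error = 1 if it is missing
    Local $x = StringInStr($s, $from)
    If $x = 0 Then Return SetError(1, 0, "")
    $x = $x + StringLen($from) ; position of the first character after $from
    ; Find $to in the remainder, which starts at position $x
    Local $y = StringInStr(StringTrimLeft($s, $x - 1), $to)
    If $y = 0 Then Return SetError(2, 0, "")
    Return StringMid($s, $x, $y - 1)
EndFunc   ;==>_StringBetween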

So long,

Mega

Scripts & functions Organize Includes Let Scite organize the include files

Yahtzee The game "Yahtzee" (Kniffel, DiceLion)

LoginWrapper Secure scripts by adding a query (authentication)

_RunOnlyOnThis UDF Make sure that a script can only be executed on ... (Windows / HD / ...)

Internet-Café Server/Client Application Open CD, Start Browser, Lock remote client, etc.

MultipleFuncsWithOneHotkey Start different funcs by hitting one hotkey different times

From this:

</FONT></TD><TD NOWRAP><FONT SIZE=2>+48 12 00 00 00</FONT></TD><TD NOWRAP><FONT SIZE=2> </FONT></TD><TD NOWRAP><FONT SIZE=2>+48 12 000 00 05</FONT></TD><TD NOWRAP><FONT SIZE=2> </FONT></TD><TD NOWRAP><FONT SIZE=2>6125</FONT></TD><TD NOWRAP><FONT SIZE=2> </FONT></TD><TD NOWRAP><FONT SIZE=2>+48 101 100 000</FONT></TD><TD NOWRAP><FONT SIZE=2>SSS</FONT></TD></TR>

I have to get $phone = +48 12 00 00 00, $phone2 = +48 12 000 00 05, $cell = 6125, $cellphone = +48 101 100 000


Great function, th.meger. It should be included in the next beta ;p I have replaced all my lines with just 2 of yours :) But the code still only displays the first match between <FONT SIZE=2> and </FONT>, and when there are 3, 4 or 5 occurrences, as in the line I showed you, I need a way to get them into separate variables.

Func FindName($adress)
        $file = FileOpen($tempname, 10)
        If $file = -1 Then
            MsgBox(0, "Error", "Unable to open file " & $tempname )
        Exit
        EndIf
        FileWrite( $file, _INetGetSource($adress))
        Dim $aFile
        If Not _FileReadToArray($tempname,$aFile) Then
            MsgBox(4096,"Error", " Error reading log to Array    error:" & @error)
        Exit
        EndIf
        For $x = 1 to $aFile[0]
            If StringInStr($aFile[$x], $name) Then
                If StringInStr($aFile[$x], "OpenDocument") Then
                  ; GETS NAMES
                    $username = _StringBetween($aFile[$x],'OpenDocument">',"</A>")
                    $firstsplit = StringSplit($aFile[$x],'</A>',1)
                    $phone_office = _StringBetween($FirstSplit[2], "<FONT SIZE=2>","</FONT>")
                    MsgBox(1,"Text", $username & " " & $phone_office)
                EndIf
            EndIf
        Next
EndFunc

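Func _StringBetween($s, $from, $to)
    ; Find $from; bail out with @error = 1 if it is missing
    Local $x = StringInStr($s, $from)
    If $x = 0 Then Return SetError(1, 0, "")
    $x = $x + StringLen($from) ; position of the first character after $from
    ; Find $to in the remainder, which starts at position $x
    Local $y = StringInStr(StringTrimLeft($s, $x - 1), $to)
    If $y = 0 Then Return SetError(2, 0, "")
    Return StringMid($s, $x, $y - 1)
EndFunc   ;==>_StringBetween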

EDIT:

Removed $firstsplit = StringSplit($aFile[$x],'</A>',1) by accident

Edited by MadBoy


SOLVED! --> _StringBetween OWNS!!

FindName($adress)

Func FindName($adress)
        $file = FileOpen($tempname, 10)
        If $file = -1 Then
            MsgBox(0, "Error", "Unable to open file " & $tempname )
        Exit
        EndIf
        FileWrite( $file, _INetGetSource($adress))
        Dim $aFile
        If Not _FileReadToArray($tempname,$aFile) Then
            MsgBox(4096,"Error", " Error reading log to Array    error:" & @error)
        Exit
        EndIf
        For $x = 1 to $aFile[0]
            If StringInStr($aFile[$x], $name) Then
                If StringInStr($aFile[$x], "OpenDocument") Then
                   ; GETS NAMES
                    $FirstSplit = StringSplit($aFile[$x],'</A>',1)
                    $SecondSplit = StringSplit($FirstSplit[2],'/FONT>',1)
                    $username = _StringBetween($aFile[$x],'OpenDocument">',"</A>")
                    $phone_office = _StringBetween($FirstSplit[2], "<FONT SIZE=2>","</FONT>")
                    $phone_all = _StringBetween($SecondSplit[4], "<FONT SIZE=2>","<")
                    $phone_internal = _StringBetween($SecondSplit[6], "<FONT SIZE=2>","<")
                    $phone_cell = _StringBetween($SecondSplit[8], "<FONT SIZE=2>","<")
                    $username_sko = _StringBetween($SecondSplit[9], "<FONT SIZE=2>","<")
                   ;$phone_cell = 
                   ;MsgBox(1,"TEXT", $SecondSplit[9])
                    MsgBox(1,"Text " & $x, "Nazwisko i Imie: " & $username & @CRLF & "Telefon na biurko: " & $phone_office & @CRLF & "Telefon wewnetrzny: " & $phone_internal & @CRLF & "Telefon komorkowy: " & $phone_cell)
                EndIf
            EndIf
        Next
EndFunc
    
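Func _StringBetween($s, $from, $to)
    ; Find $from; bail out with @error = 1 if it is missing
    Local $x = StringInStr($s, $from)
    If $x = 0 Then Return SetError(1, 0, "")
    $x = $x + StringLen($from) ; position of the first character after $from
    ; Find $to in the remainder, which starts at position $x
    Local $y = StringInStr(StringTrimLeft($s, $x - 1), $to)
    If $y = 0 Then Return SetError(2, 0, "")
    Return StringMid($s, $x, $y - 1)
EndFunc   ;==>_StringBetween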


This piece of code will return an array of all the matches (NOTE: including the blank matches).

#include <Array.au3>
$rawText = FileRead("input.txt")
$asResults = StringRegExp($rawText, "(?:<FONT SIZE=2>)(.*?)(?:</FONT>)", 3)
_ArrayDisplay($asResults,"")
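If the blank matches get in the way, they can be skipped while looping over the result. The lines below are only a rough sketch: the sample text is shortened from the line in the first post, and $sLine, $aAll, $sCell and $iCount are names made up for this example.

```autoit
; Sample line shortened from the first post (assumption: real cells look like this).
Local $sLine = '<TD NOWRAP><FONT SIZE=2>+48 12 00 00 00</FONT></TD><TD NOWRAP><FONT SIZE=2> </FONT></TD>' & _
        '<TD NOWRAP><FONT SIZE=2>+48 12 000 00 05</FONT></TD><TD NOWRAP><FONT SIZE=2> </FONT></TD>' & _
        '<TD NOWRAP><FONT SIZE=2>6125</FONT></TD>'

; Flag 3 returns every captured group in a 0-based array; @error is set when nothing matches.
Local $aAll = StringRegExp($sLine, "<FONT SIZE=2>(.*?)</FONT>", 3)
If @error Then Exit MsgBox(16, "Error", "No <FONT SIZE=2> cells found")

; Print only the cells that still hold text after trimming whitespace.
Local $iCount = 0
For $i = 0 To UBound($aAll) - 1
    Local $sCell = StringStripWS($aAll[$i], 3) ; 3 = strip leading and trailing whitespace
    If $sCell = "" Then ContinueLoop
    $iCount += 1
    ConsoleWrite($iCount & ": " & $sCell & @CRLF)
Next
```

Instead of writing to the console, the non-blank cells could just as well be assigned to $phone, $phone2 and $cell in the order they appear.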

[u]My UDFs[/u]Coroutine Multithreading UDF LibraryStringRegExp GuideRandom EncryptorArrayToDisplayString"The Brain, expecting disaster, fails to find the obvious solution." -- neogia

I have an alternative version that returns an array with all matches and $Array[0] = count:

; String Between function
#include <array.au3>

; String to test with
$MyString = '</FONT></TD><TD NOWRAP><FONT SIZE=2>+48 12 00 00 00</FONT></TD><TD NOWRAP><FONT SIZE=2> </FONT></TD><TD NOWRAP><FONT SIZE=2>+48 12 000 00 05</FONT></TD><TD NOWRAP><FONT SIZE=2> </FONT></TD><TD NOWRAP><FONT SIZE=2>6125</FONT></TD><TD NOWRAP><FONT SIZE=2> </FONT></TD><TD NOWRAP><FONT SIZE=2>+48 101 100 000</FONT></TD><TD NOWRAP><FONT SIZE=2>SSS</FONT></TD></TR>'
$StartStr = '<FONT SIZE=2>'
$EndStr = '</FONT>'

$Results = _StringBetween($MyString, $StartStr, $EndStr)
_ArrayDisplay($Results, 'Results between "' & $StartStr & '" and "' & $EndStr & '"')


; Get string between start and end
Func _StringBetween($s, $from, $to)
    Local $Betweens[1]
    
    While (1)
        $x = StringInStr($s, $from)
        If $x = 0 Then ExitLoop
        $x = $x + StringLen($from) - 1
        $s = StringTrimLeft($s, $x)
        
        $y = StringInStr($s, $to)
        If $y = 0 Then
            $y = StringLen($s)
        Else
            $y = $y - 1
        EndIf
        
        _ArrayAdd($Betweens, StringLeft($s, $y))
        $s = StringTrimLeft($s, $y)
    WEnd
    
    $Betweens[0] = UBound($Betweens) - 1
    Return $Betweens
    
EndFunc  ;==>_StringBetween

Hope that helps! :)

Valuater's AutoIt 1-2-3, Class... Is now in Session!For those who want somebody to write the script for them: RentACoder"Any technology distinguishable from magic is insufficiently advanced." -- Geek's corollary to Clarke's law

Bah, just when I was thinking what a great coder I am and that everyone had abandoned this topic, I get 2 replies :mellow: Just kidding, you guys are GREAT. It's the best forum ever! Will try it out!

Even BETTER with the regex from neogia:

; Test Find Strings Between Function
#include <array.au3>

; Strings to test with
$MyString = '</FONT></TD><TD NOWRAP><FONT SIZE=2>+48 12 00 00 00</FONT></TD><TD NOWRAP><FONT SIZE=2> </FONT></TD><TD NOWRAP><FONT SIZE=2>+48 12 000 00 05</FONT></TD><TD NOWRAP><FONT SIZE=2> </FONT></TD><TD NOWRAP><FONT SIZE=2>6125</FONT></TD><TD NOWRAP><FONT SIZE=2> </FONT></TD><TD NOWRAP><FONT SIZE=2>+48 101 100 000</FONT></TD><TD NOWRAP><FONT SIZE=2>SSS</FONT></TD></TR>'
$StartStr = '<FONT SIZE=2>'
$EndStr = '</FONT>'

$Results = _StringBetween($MyString, $StartStr, $EndStr)
_ArrayDisplay($Results, 'Results between "' & $StartStr & '" and "' & $EndStr & '"')


; Get string between start and end
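Func _StringBetween($s, $from, $to)
    ; \Q...\E makes the regex treat both delimiters as literal text
    Local $Betweens = StringRegExp($s, '\Q' & $from & '\E(.*?)\Q' & $to & '\E', 3)
    If @error Then ; no match: return an array holding only the count
        Local $None[1] = [0]
        Return $None
    EndIf
    _ArrayInsert($Betweens, 0, "")
    $Betweens[0] = UBound($Betweens) - 1
    Return $Betweens
EndFunc   ;==>_StringBetween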

I love that regex! :)

neogia, thanks for this elegant piece of code. I can use this in many places.

Can you tell me what the regexp should look like in the case where the start string may be either:

<FONT SIZE=2> OR <FONT SIZE=1>

I have run into instances where the thing I'm searching for may be preceded or followed by a slightly different startstring or endstring.

...by the way, it's pronounced: "JIF"... Bob Berry --- inventor of the GIF format

This should do it:

$asResults = StringRegExp($rawText, "(?:<FONT SIZE=2>|<FONT SIZE=1>)(.*?)(?:</FONT>)", 3)

Bah, billmez beat me to it. That's exactly correct, "|" stands for "or" in regexp. Nice work billmez.

I've done a considerable amount of work in Perl regex and some in VBS, but haven't gotten into it much in AutoIt because of the syntax inconsistencies.

It does my heart good to see beautiful examples such as yours :) Thanks for sharing.

This is nice, y'all!! I wish someone would take the time to sit down and write out a step-by-step guide with example setups for StringRegExp(). I feel like a noob when it comes to it (well, actually I am!!).

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.

Good idea

Thanks to both neogia and billmez! I appreciate the help.

Is there a way to use something like an asterisk, so that it could cover any case:

<FONT SIZE=*>

Edited by jefhal

Without trying it, this should work since the . means 1 or more characters:

$asResults = StringRegExp($rawText, "(?:<FONT SIZE=.>(.*?)(?:</FONT>)", 3)

$asResults = StringRegExp($rawText, "(?:<FONT SIZE=.>)(.*?)(?:</FONT>)", 3)
Your code will work, but keep in mind that "." means "match any character", but only 1 character. If you want to match 1 or more of any character you would use ".+?"; the question mark just means take the smallest match instead of the largest match. Note that you were missing the closing parenthesis on "(?:<FONT SIZE=.>)".

Edited by neogia
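To see the difference between ".", ".+?" and a greedy ".*" side by side, something like the following can be run. It is only a small sketch; the test string and the variable names ($sText, $aDot, $aLazy, $aGreedy) are made up for illustration.

```autoit
; Made-up test string with a one-digit and a two-digit font size.
Local $sText = '<FONT SIZE=2>one</FONT><FONT SIZE=10>two</FONT>'

; "." is exactly one character, so the SIZE=10 tag cannot match.
Local $aDot = StringRegExp($sText, '<FONT SIZE=.>(.*?)</FONT>', 3)
; ".+?" is one or more characters, as few as possible, so both tags match.
Local $aLazy = StringRegExp($sText, '<FONT SIZE=.+?>(.*?)</FONT>', 3)
; Without the "?", ".*" grabs as much as it can and runs on to the last </FONT>.
Local $aGreedy = StringRegExp($sText, '<FONT SIZE=.+?>(.*)</FONT>', 3)

ConsoleWrite('"." matches: ' & UBound($aDot) & @CRLF)
ConsoleWrite('".+?" matches: ' & UBound($aLazy) & @CRLF)
ConsoleWrite('greedy capture: ' & $aGreedy[0] & @CRLF)
```

The greedy line should show a single capture that swallows the tags between the first and the last cell, which is why the lazy "?" is wanted here.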

Good catch on the parens, and entirely right on the .+?. I didn't think anyone would use font size=10 or better.

As an aside, I usually also use case insensitive matches for something like this, just in case the font tag is in lower case.

$asResults = StringRegExp($rawText, "(?i)(?:<FONT SIZE=.+?>)(.*?)(?:</FONT>)", 3)

Thank you again. I will try these on Monday. I have to sit down in a quiet place with the regexp documentation and the sample code. However, even if I figure it out, it is very difficult to create one from scratch myself, so I appreciate both of your assistance...