Simple Regular Expression



This should only take a few seconds for one of you RegExp geniuses...

I'm trying to create a simple _IEEmbedded GUI to display the users in chat, and size the GUI accordingly...

RAW String I'm trying to match against...

<br /> 2 users are in chat::<br /><a href="http://xxxxx.com/index.php?action=profile;u=3333" style="color: #38d7c1">BinaryBrother</a>, <a href="http://xxxxx.com/index.php?action=profile;u=2222" style="color: #38d7c1">Dude1</a>

Put simply: when I call this function, I'd like a small GUI window to appear and display the users currently in chat. I want to use an embedded IE control because I can pass the HTML straight back to it, so the individual users stay 'colored' by level of authority (admins are red, etc.).

But the only real thing I need help with right now is...

Taking 3 pages worth of HTML and pulling only the above (in the code tags) out... Doing so will give me everything I need... :)

Of course the number of users "x users are in chat::" as well as the following HTML (URLs) will be dynamic...

The Below can be ignored really... It was just to prove (even if sloppy) I did try... x.x

Func GetChatUsers()

    Opt("GUIResizeMode", 1)
    ;$RAWData = _INetGetSource("http://www.Win7Vista.com")  ; Quit using this because the txt alignment was seemingly random...
    ;If @error Then Return MsgBox(0, "ERROR", "Error?")

    $RAWPath = @ScriptDir & "\RAW.txt"
    If FileExists($RAWPath) Then
        FileDelete($RAWPath)
    EndIf
    $oAnother = _IECreate("http://www.Win7Vista.com",0,0)
    $oSource = _IEDocReadHTML($oAnother)
    ;InetGet("http://www.Win7Vista.com", $RAWPath, 1) ; no-go  InetGet causes txt to lose structure (So the line functions don't work)
    FileWrite($RAWPath,$oSource)
    _IEQuit($oAnother)
    $RAWLineCount = _FileCountLines($RAWPath)
    For $N = 1 To $RAWLineCount ; FileReadLine line numbers start at 1
        $LineN = FileReadLine($RAWPath, $N) ; read the line before testing it
        If StringInStr($LineN, "users are in") Or StringInStr($LineN, "user is in") Then
            ExitLoop
        EndIf
    Next

    $EndOfString = StringLen($LineN)
    $LineN = _StringInsert($LineN, "<EndOfString>", $EndOfString)
    $UsersInChat = _StringBetween($LineN, "[1234567890]+ users are in chat\:\:", "<EndOfString>", -1)
    If Not IsArray($UsersInChat) Then
        $UsersInChat = _StringBetween($LineN, "user is in chat\:\:", "<EndOfString>", -1)
    EndIf

    ;$UsersInChat = StringRegExp($LineN,"([1234567890]+ users are in chat\:\:[.]+<EndOfString>)")
    ;$UsersInChat = _StringBetween($LineN,"[1234567890]+ users are in chat\:\:","",-1,1)
    If IsArray($UsersInChat) Then
        $UsersGUI = GUICreate("User(s) in Chat", 100, 40, -1, -1, $WS_OVERLAPPEDWINDOW + $WS_VISIBLE + $WS_CLIPSIBLINGS + $WS_CLIPCHILDREN)
        $oUsers = _IECreateEmbedded()
        $Embed = GUICtrlCreateObj($oUsers, 0, 0, 300, 40)
        _IENavigate($oUsers, "about:blank", 1)
        $UsersHTML = StringTrimLeft($UsersInChat[0], 6)
        $UsersHTML1 = StringReplace($UsersHTML, ",", "<br>")
        If $UsersHTML1 <> "" Then
            $UsersHTML = $UsersHTML1
        EndIf

        $SizePlus = @Extended
        $CurrentSize = WinGetPos($UsersGUI) ; use the handle; the title "Users in Chat" would not match "User(s) in Chat"
        WinMove($UsersGUI, "", $CurrentSize[0], $CurrentSize[1], $CurrentSize[2], $CurrentSize[3] + ($SizePlus * 25))
        GUISetState(@SW_SHOW, $UsersGUI) ; flag first, then the window handle
        $oUsers.document.body.scroll = "no"
        _IEDocWriteHTML($oUsers, "<body bgcolor='black'>" & $UsersHTML)
        GUISetOnEvent($GUI_EVENT_CLOSE, "UsersInChatClose", $UsersGUI)
    Else
        MsgBox(0, "Users", "No Users in chat.")
    EndIf
EndFunc   ;==>GetChatUsers
Edited by BinaryBrother


;get your html stuff
;$HTML = htmlStuffInAString($htmlStuff);not sure how you'd do this.
$stringIndex = StringInStr($HTML, "users are in chat")
$numberOfUsers = StringMid($HTML, $stringIndex-3, 3)

Get the HTML into a string, find "users are in chat", take the 3 characters before it, and you have your number of users. You'll probably want to increase that to 4 and do a little post-processing for cases where there are 100 or more users, and so on.
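If the count is all that's needed, a regular expression can capture the digits directly instead of slicing a fixed number of characters. This is only a minimal sketch: $sHTML is a made-up sample based on the markup quoted in the first post, and the singular "user is in chat" wording is taken from the original script, not checked against the site.

```autoit
; Hypothetical sample; in practice this would be the full page source.
Local $sHTML = '<br /> 12 users are in chat::<br /><a href="#">SomeUser</a>'

; (\d+) captures one or more digits, so the number can be any length.
; "users?" and "(?:is|are)" cover both the singular and plural wording.
Local $aCount = StringRegExp($sHTML, '(?i)(\d+)\s+users?\s+(?:is|are)\s+in\s+chat', 1)
If @error Then
    ConsoleWrite("No user count found" & @CRLF)
Else
    ConsoleWrite("Users in chat: " & Number($aCount[0]) & @CRLF)
EndIf
```

Because the digits are captured as a group, nothing needs adjusting when the count grows to three or four digits.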


Hey! Now that's a handy little trick. Thanks for that. :P

I hate to sound unappreciative, because I'm not.

I just sort of needed the userlist as well, the #number of users is just a bonus for the Window Title. :)

I truly appreciate your help JRowe. I've been messing with this for a week, and I'm starting to get flustered with it... :)


NP. Since anchors are common tags, you don't want to get all of them, so do as I said above and find the index of the "<br /> 2 users are in chat::<br />" string.

You need a RegEx pattern from "chat::<br />" to "</a>", inclusive, so something like [chat::<br/>]?s+[</a>]

$REParse = StringRegExp($myNewHTML, "[chat::<br/>]?s+[</a>]", 1, $stringIndex)

This, of course, will return an array of matches, which will be incorrect if there is another anchor tag after the user list.
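To illustrate the trailing-anchor problem, here is a rough sketch comparing a greedy capture with one bounded by a closing tag. The sample string and the closing </div> after the user list are assumptions made up for the example, not taken from the real page.

```autoit
; Hypothetical sample with an unrelated link after the user list.
Local $sHTML = 'chat::<br /><a href="#">A</a>, <a href="#">B</a></div><a href="#">Other</a>'

; Greedy: .+ runs to the LAST </a> on the line, so the unrelated link is included.
Local $aGreedy = StringRegExp($sHTML, 'chat::<br />(.+</a>)', 1)
; Lazy .+? followed by the (assumed) closing </div> stops at the end of the user list.
Local $aLazy = StringRegExp($sHTML, 'chat::<br />(.+?</a>)</div>', 1)

If IsArray($aGreedy) Then ConsoleWrite("Greedy: " & $aGreedy[0] & @CRLF)
If IsArray($aLazy) Then ConsoleWrite("Lazy:   " & $aLazy[0] & @CRLF)
```

The lazy version only helps if something reliable follows the list; without such a boundary, a second pass over the captured text is still needed.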

I think what we need to do is parse anchor tags after chat:: and then parse again to remove anything not in the

<a href="site;usernumber" style="color: #38d7c1">name</a> format.

Let me know if you beat me to it. I'll see what I can whip up.


That RegEx was horrible.

"chat::<br />(.+</a>)" does what I wanted (returns a match of everything from chat::<br /> to the last </a> tag.)


Wanna give me an entire example page? I can probably use that, I'm seeing things like commas in places that can be used to match the whole pattern (I haven't used RE in a few months, kinda fun to do.)


Ok, easy enough. Get the text between "chat::" and the </div> before "Most Online Today", then parse out usernames and colors.

As a side effect, you won't have to explicitly grab the number of users; you can just count the users parsed out.

"chat::<br />(.+</a>)</div>"

;first grab the relevant stuff from html
$HTMLParse = StringRegExp($HTML, "chat::<br />(.+</a>)</div>", 1)
$UserListUnparsed = $HTMLParse[0]
;Now parse out the chunks you need (the anchor element containing username and style information)
;flag 3 returns every match, and the lazy .+? stops each match at its own </a>
$UserListParse = StringRegExp($UserListUnparsed, "<a href=(.+?)</a>", 3)
;$UserListParse is an array of anchor tags, I'm sure you can finish it (just need to grab the style:color value and the username.)

Let me know if you need more or if I horribly broke something somewhere.
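For the remaining step (the color value and the username), one possible approach is a single pattern with two capture groups. This is a sketch only: it assumes every anchor carries a style="color: ..." attribute like the two in the first post, and $sUserList is a hypothetical stand-in for the text captured above.

```autoit
; Hypothetical sample mirroring the markup quoted in the first post.
Local $sUserList = '<a href="http://xxxxx.com/index.php?action=profile;u=3333" style="color: #38d7c1">BinaryBrother</a>, ' & _
        '<a href="http://xxxxx.com/index.php?action=profile;u=2222" style="color: #38d7c1">Dude1</a>'

; Flag 3 returns all captures as one flat array: color, name, color, name, ...
Local $aPairs = StringRegExp($sUserList, '(?i)<a\b[^>]*style="color:\s*([^";]+)"[^>]*>([^<]+)</a>', 3)
If Not @error Then
    For $i = 0 To UBound($aPairs) - 2 Step 2
        ConsoleWrite($aPairs[$i + 1] & " -> " & $aPairs[$i] & @CRLF)
    Next
    ; Two captures per user, so the user count falls out of the array size.
    ConsoleWrite("Users: " & UBound($aPairs) / 2 & @CRLF)
EndIf
```

Anchors without a color style would be skipped by this pattern, so it would need loosening if some users are rendered without one.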


I just saved your file to the desktop as a txt file and ran this against it. You can change your method of getting the original file contents to whatever suits you. If you want the results in an array, change the @CRLF to @LF and StringSplit it, or change that whole RegExpReplace line to $aRegEx = StringRegExp($aSRE[0], $sSRE, 3)

$sStr = FileRead(@DeskTopDir & "\regexptoy.txt")
$sSRE = "(?i)(?s)\d*.+chat::(.*?)\s*</div>"
$aSRE = StringRegExp($sStr, $sSRE, 1)
If NOT @Error Then
    $sSRE = "(?i).*?<a.+?>(.+?)</a>"
    $sStr = StringRegExpReplace($aSRE[0], $sSRE, "$1" & @CRLF)
    MsgBox(0, "Result", StringStripWS($sStr, 3))
EndIf

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.

Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.***

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. it is soon to be closed and replaced with something else.

"Old age and treachery will always overcome youth and skill!"


This is the better method for returning the names as an array:

$Result = _GetNames(FileRead(@DesktopDir & "\regexptoy.txt"))
If IsArray($Result) Then
    For $i = 0 To UBound($Result) -1
        MsgBox(0, "Result " & $i +1, $Result[$i])
    Next
EndIf

Func _GetNames($sStr)
    $sSRE = "(?i)(?s)\d*.+chat::(.*?)\s*</div>"
    $aSRE = StringRegExp($sStr, $sSRE, 1)
    If Not @Error Then
        $sSRE = "(?i).*?<a.+?>(.+?)</a>"
        $aSRE = StringRegExp($aSRE[0], $sSRE, 3)
        If Not @Error Then Return $aSRE
    EndIf
    Return ""
EndFunc   ;;<==>_GetNames()

George
