RegEx Not Working.


Nice forum update.

This is the piece of code I want to find and pull data from. I want to pull the username out of the whole page.

<td><b>Name</b></td>
<td>Popcorned</td>

I attempted to write a regular expression to get the name from the page, but failed badly. :lmao:

; This function collects all the stats from my profile and places them into the program.
Func koc_stats()
Local $read_html = _IEBodyReadHTML ($oIE)

$asResult = StringRegExp($read_html, '<td><b>Name</b></td><td>Popcorned</td>', 1)
If @error == 0 Then
    MsgBox(0, "SRE Example 6 Result", $asResult[0])
EndIf

EndFunc

So the end result would be Popcorned, but with different names in place, etc ...

Any help is really appreciated. (Yes, I am desperate and pulling my hair out ;) )

Regards,

-Matt

Edited by Googler24022

  • Moderators

The code below will give you the desired result. However, there may be an easier way of retrieving this text. Can you provide a link to the website that contains this source?

#include <Array.au3>

$sString = "<td><b>Name</b></td><td>Popcorned</td>"
$aArray = StringRegExp($sString, "\<td\>\<b\>Name\</b\>\</td\>\<td\>(.+)\</td\>", 1)
If Not @error Then _ArrayDisplay($aArray, "Test")
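
To run the same idea against the live page rather than a test string, a minimal sketch might look like the one below. _ExtractName is a hypothetical helper name, not part of IE.au3, and the pattern assumes the two cells are separated only by whitespace. The (?i) is there because older Internet Explorer versions tend to return upper-case tag names from the DOM, which is worth checking against your own page source.

```autoit
; Hypothetical helper: returns the text of the cell that follows the "Name" label, or "" if not found
Func _ExtractName($sHtml)
    ; (?i) ignores tag case, \s* tolerates line breaks between the cells, ([^<]+) stops at the next tag
    Local $aMatch = StringRegExp($sHtml, "(?i)<td><b>Name</b></td>\s*<td>([^<]+)</td>", 1)
    If @error Then Return SetError(1, 0, "")
    Return $aMatch[0]
EndFunc   ;==>_ExtractName

; Markup as posted in the question, with a line break between the two cells
Local $sSample = "<td><b>Name</b></td>" & @CRLF & "<td>Popcorned</td>"
ConsoleWrite(_ExtractName($sSample) & @CRLF)
```

Inside koc_stats() this could be called as _ExtractName(_IEBodyReadHTML($oIE)).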

If you're looking to get the "Name", _StringBetween would be a very good option I would think.
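
A minimal sketch of the _StringBetween approach, assuming the markup sits on one line exactly as in the test string above. The delimiters are matched literally, so any whitespace or line break between the two cells would have to be included in the start delimiter.

```autoit
#include <String.au3>

; Sample markup taken from the question; real input would come from _IEBodyReadHTML
Local $sHtml = "<td><b>Name</b></td><td>Popcorned</td>"

; _StringBetween returns an array of every substring found between the two delimiters
Local $aNames = _StringBetween($sHtml, "<td><b>Name</b></td><td>", "</td>")
If Not @error Then
    ; The first element holds the first match
    MsgBox(0, "Name", $aNames[0])
EndIf
```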


The (.+) pattern is not the best choice because it is greedy: it takes everything up to the last \</td\> on the line. If several name constructs sit on the same line, you will get undesired results. If you're able to define what a legal name is, it would be better to restrict the pattern.

EX: replace (.+) with

([\w \._-]+)

Can't remember if you should escape the space in there. ;)

This limits a valid name to letters, digits, spaces, periods, hyphens and underscores. Take away what you don't need.
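
To see the difference, here is a small sketch with two name cells on one line. The second name, Kernel_42, is made-up sample data. As far as I know, the space needs no escaping inside a character class unless the pattern runs in extended (?x) mode, and the dot is already literal there, so the class is written without backslashes.

```autoit
#include <Array.au3>

; Two Name cells on one line, to show where the greedy pattern goes wrong (made-up sample data)
Local $sLine = "<td><b>Name</b></td><td>Popcorned</td><td><b>Name</b></td><td>Kernel_42</td>"

; Greedy: (.+) runs on to the last </td> on the line, swallowing the markup in between
Local $aGreedy = StringRegExp($sLine, "<td><b>Name</b></td><td>(.+)</td>", 1)
; Restricted: the character class cannot cross a "<", so each name is captured separately
; Flag 3 returns every match in the string instead of only the first
Local $aNames = StringRegExp($sLine, "<td><b>Name</b></td><td>([\w ._-]+)</td>", 3)

If IsArray($aGreedy) Then ConsoleWrite("Greedy: " & $aGreedy[0] & @CRLF)
If IsArray($aNames) Then _ArrayDisplay($aNames, "Names")
```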


  • Moderators

See if this gives you any ideas.

#include <IE.au3>
#include <GuiListView.au3>
#include <GuiConstants.au3>

Opt("GUIOnEventMode", True)

$Form1 = GUICreate("User Info", 620, 440, 200, 125)
GUISetOnEvent($GUI_EVENT_CLOSE, "_Exit")
$List1 = GUICtrlCreateListView("Property|Value", 8, 50, 600, 350)
_GrabInfo()
GUISetState()

While 1
    Sleep(100)
WEnd

Func _GrabInfo()
    Local $oUserInfo = 0
    
    ; Register the IE.au3 internal Error Handler
    _IEErrorHandlerRegister()
    
    ; Specify the URL to the Webpage
    $sURL = @ScriptDir & "\source.html"
    ; Create a hidden Browser Window
    $oIE = _IECreate($sURL, 0, 0)
    
    ; Get a collection of all Table Header Cells
    $oHeaders = _IETagNameGetCollection($oIE, "TH")
    If @error Then Return 0
    
    ; Loop through each object in the collection
    For $oHeader In $oHeaders
        ; Compare the innerText of the current Header Cell
        If String($oHeader.innerText) = "User Info" Then
            ; Get a reference to the third parentElement of the Header Cell (This is the <Table> element)
            $oUserInfo = $oHeader.parentElement.parentElement.parentElement
            ExitLoop
        EndIf
    Next
    
    ; Check that $oUserInfo is an object, otherwise no match was found
    If Not IsObj($oUserInfo) Then Return 0
    
    ; Read the contents of the Table into an array
    $aUserInfo = _IETableWriteToArray($oUserInfo)
    If @error Then Return 0
    
    ; Loop through each element of the array
    For $i = 0 To UBound($aUserInfo, 2) - 1
        ; Skip any blank elements
        If $aUserInfo[0][$i] = "" Then ContinueLoop
        ; Add the element to the list view
        $hListItem = GUICtrlCreateListViewItem($aUserInfo[0][$i] & "|" & $aUserInfo[1][$i], $List1)
        ; Check if $i is even or odd
        If Mod($i, 2) = 0 Then
            ; Set the items background color if even
            GUICtrlSetBkColor($hListItem, 0x5CACEE)
        Else
            ; Set the items background color if odd
            GUICtrlSetBkColor($hListItem, 0x9FB6CD)
        EndIf
    Next
    
    ; Auto size the column to the contents
    _GUICtrlListViewSetColumnWidth($List1, 0, $LVSCW_AUTOSIZE)
    ; Extend the last column to the remainder of the listview
    _GUICtrlListViewSetColumnWidth($List1, 1, $LVSCW_AUTOSIZE_USEHEADER)
    
    ; Close the Browser Window
    _IEQuit($oIE)
EndFunc   ;==>_GrabInfo

Func _Exit()
    Exit
EndFunc   ;==>_Exit

Edit: Updated code with extensive commenting

Edited by big_daddy

Perhaps something like this? It retrieves the Popcorned part of your code; I am guessing that Popcorned is an account username and that it varies. If this is not what you want, describe in a little more detail what it should return.

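#include <Inet.au3>
; 'website.com' is a placeholder URL
$source = _InetGetSource('website.com')
; \s* allows whitespace between the cells; the lazy (.*?) stops at the first </td>
$text = "<td><b>Name</b></td>\s*<td>(.*?)</td>"
; Flag 1 returns the captured groups as an array
$array = StringRegExp($source, $text, 1)
If Not @error Then MsgBox(0, '', 'This is what it retrieved: ' & $array[0])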

Hopefully this helped,

Kurt
