Spask

Finding a text inside HTML


Hi, I'm trying to find a text value inside an HTML page.

This is what the line looks like normally:

<p id="line1" class>
    <span class="bot">TEXT HERE</span>
</p>

The text then changes to a non-breaking space:

<p id="line1" class>
    <span class="bot">&nbsp;</span>
</p>

And then it changes back to normal text but it's different every time.

Can I code this so that it grabs the text every time it changes and has a variable that represents it?

I currently have this inside of my loop:

$span = .document.getElementsByTagName("span")
    For $text In $span
        If $text.value = "&nbsp;" Then
            Sleep(50)
            MsgBox(0,0,0) ;messagebox to test if it can be found, but I don't know how to grab the text
        EndIf
    Next

The problem is that many other lines in the HTML contain the same span, but under ids such as "line3", "line5", etc., and the one I need is in "line1".

I'd appreciate any help with this!


Here is how I'd do it, assuming the HTML is saved as temp.html in the @ScriptDir directory. To keep checking, you would run the regular expression periodically in a loop:

#include <MsgBoxConstants.au3>

; Open the file for reading and store the handle to a variable.
$hFileOpen = FileOpen(@ScriptDir & "\temp.html", 0)
If $hFileOpen = -1 Then
    MsgBox($MB_SYSTEMMODAL, "", "An error occurred when reading the file.")
    Exit ; nothing to read, so stop here
EndIf

; Read the contents of the file using the handle returned by FileOpen, then close it.
$sFileRead = FileRead($hFileOpen)
FileClose($hFileOpen)

; Collect every match of the pattern's capture group (flag 3 returns an array of global matches).
$aArray = StringRegExp($sFileRead, '(?i)(?-s)<p id="line1" class>\r?\n.*?bot">(.*?)[<;]+.*?\r?\n</P>', 3)


For $i = 0 To UBound($aArray) - 1
    MsgBox($MB_SYSTEMMODAL, "RegExp Test with Option 3 - " & $i, $aArray[$i])
Next

 


Does this work if I'm trying to do it in an IE window? I've created a variable called $ie = ObjCreate("InternetExplorer.Application")


Sure, here it is:

$ie = ObjCreate("InternetExplorer.Application")

#include <ButtonConstants.au3>
#include <EditConstants.au3>
#include <GUIConstantsEx.au3>
#include <WindowsConstants.au3>

Global $bot = GUICreate("Bot", 226, 139, -1, -1)
Global $startie = GUICtrlCreateButton("Start IE", 32, 24, 155, 25)
Global $startloop = GUICtrlCreateButton("Start Loop", 32, 56, 75, 25)
Global $pauseloop = GUICtrlCreateButton("Pause Loop", 112, 56, 75, 25)

GUICtrlSetOnEvent($startie, 'StartIE')
GUICtrlSetOnEvent($startloop, 'startloop')
GUICtrlSetOnEvent($pauseloop, 'Pauseloop')
GUISetOnEvent($GUI_EVENT_CLOSE, 'ExitApp')
Opt('GUIOnEventMode', 1)

GUISetState(@SW_SHOW)

Global $var = 0

;function to start internet explorer and get website ready
Func StartIE()
   With $ie
      .visible = true
      .navigate("http://cleverbot.com")
      While($ie.busy)
         Sleep(500)
      WEnd
   EndWith
EndFunc

While 1
   With $ie
      ;while loop to interact with the website
      While $var = 1
         While($ie.busy)
            Sleep(500)
         WEnd
         Sleep(5000)
         $sayitbutton = .document.getElementsByTagName("input")
         For $b in $sayitbutton
            if $b.value = "think for me" Then
               $b.click()
            EndIf
         Next
         ;finds the text in html
         $span = .document.getElementsByTagName("span")
         For $text In $span
           If $text.value = "&nbsp;" Then
              Sleep(50)
              MsgBox(0,0,0) ;messagebox to test if it can be found, but I don't know how to grab the text
           EndIf
         Next
      WEnd
   EndWith
WEnd

;starts the loop
Func startloop()
   $var = 1
EndFunc

;pauses the loop
Func Pauseloop()
   $var = 2
EndFunc

;exits the app
Func ExitApp()
   Opt('GUIOnEventMode', 0)
   GUIDelete($bot)
   Exit
EndFunc

 


Something like this. The more information you supply, along with an example of what you are trying, the better the responses to your questions will be.

#include <IE.au3>


$oIE = _IECreate("Your url here")
$sHTML = _IEBodyReadHTML($oIE)


$aArray = StringRegExp($sHTML, '(?i)(?-s)<p id="line1" class>\r?\n.*?bot">(.*?)[<;]+.*?\r?\n</P>', 3)

For $i = 0 To UBound($aArray) - 1
    ConsoleWrite($aArray[$i] & @CRLF)
Next
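
As an alternative to the regular expression, the element can also be read through the DOM. This is only a sketch under assumptions: it reuses the "Your url here" placeholder, it assumes the page really exposes a <p id="line1"> containing a <span class="bot">, and the names $sPrev and $sCurrent and the 250 ms polling interval are made up for illustration. It polls the span, skips the non-breaking-space state, and keeps the latest real text in $sCurrent:

```autoit
#include <IE.au3>

; "Your url here" is a placeholder; substitute the real page.
Local $oIE = _IECreate("Your url here")
Local $sPrev = ""    ; hypothetical: last state seen in the span (may be empty)
Local $sCurrent = "" ; hypothetical: most recent real (non-blank) text

While 1
    ; Look up the <p id="line1"> element directly, so "line3", "line5", etc. are ignored.
    Local $oLine = _IEGetObjById($oIE, "line1")
    If IsObj($oLine) Then
        Local $oSpans = $oLine.getElementsByTagName("span")
        For $oSpan In $oSpans
            If $oSpan.className = "bot" Then
                ; &nbsp; shows up in innerText as character 160; replace it with a
                ; plain space, then strip leading and trailing whitespace (flag 3).
                Local $sText = StringStripWS(StringReplace($oSpan.innerText, ChrW(160), " "), 3)
                ; == is the case-sensitive comparison, so a change in case also counts.
                If Not ($sText == $sPrev) Then
                    $sPrev = $sText
                    If $sText <> "" Then
                        $sCurrent = $sText
                        ConsoleWrite($sCurrent & @CRLF)
                    EndIf
                EndIf
            EndIf
        Next
    EndIf
    Sleep(250) ; arbitrary polling interval
WEnd
```

Because the blank state is tracked in $sPrev as well, the same text appearing twice in a row (with a non-breaking space in between) is still reported both times.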

 

