Finding a text inside HTML

Spask

Hi, I'm trying to find a text value inside an HTML page.

This is what the line looks like normally:

<p id="line1" class>
    <span class="bot">TEXT HERE</span>
</p>

The text then changes to a non-breaking space:

<p id="line1" class>
    <span class="bot">&nbsp;</span>
</p>

Then it changes back to normal text, but the text is different every time.

Can I code this so that it grabs the text every time it changes and stores it in a variable?

I currently have this inside of my loop:

$span = .document.getElementsByTagName("span")
For $text In $span
    If $text.value = "&nbsp;" Then
        Sleep(50)
        MsgBox(0,0,0) ;messagebox to test if it can be found, but I don't know how to grab the text
    EndIf
Next

The problem is that many other lines in the HTML contain the same span, but their ids are "line3", "line5", etc., and the one I need is in "line1".

I would appreciate it if anyone can help with this!

Jury

Here is how I'd do it, assuming the HTML is saved as temp.html in the @ScriptDir directory. To keep checking for changes, run the regular expression periodically in a loop:

#include <MsgBoxConstants.au3>

; Open the file for reading (mode 0) and store the handle in a variable.
$hFileOpen = FileOpen(@ScriptDir & "\temp.html", 0)
If $hFileOpen = -1 Then
    MsgBox($MB_SYSTEMMODAL, "", "An error occurred when opening the file.")
    Exit ; nothing to read, so stop here
EndIf

; Read the contents of the file using the handle returned by FileOpen, then close the handle.
$sFileRead = FileRead($hFileOpen)
FileClose($hFileOpen)

; Collect every match (flag 3); the capture group holds the text of the "bot" span inside "line1".
$aArray = StringRegExp($sFileRead, '(?i)(?-s)<p id="line1" class>\r?\n.*?bot">(.*?)[<;]+.*?\r?\n</P>', 3)

; When nothing matches, UBound returns 0 and the loop is skipped.
For $i = 0 To UBound($aArray) - 1
    MsgBox($MB_SYSTEMMODAL, "RegExp Test with Option 3 - " & $i, $aArray[$i])
Next
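
The reply mentions checking periodically, but the snippet reads the file only once. A minimal sketch of the polling loop might look like the following. It assumes temp.html is rewritten by something else while the script runs; the 250 ms interval and the names $sPattern, $sLast, $sHtml and $aMatch are illustrative and not from the original post.

```autoit
; Sketch only: poll temp.html and print the "line1" text whenever it changes.
; $sPattern is the expression from the post above; the other names are made up for this example.
Local Const $sPattern = '(?i)(?-s)<p id="line1" class>\r?\n.*?bot">(.*?)[<;]+.*?\r?\n</P>'
Local $sLast = "", $sHtml, $aMatch

While True
    ; FileRead accepts a file name directly, so no explicit FileOpen/FileClose is needed here.
    $sHtml = FileRead(@ScriptDir & "\temp.html")
    $aMatch = StringRegExp($sHtml, $sPattern, 3)
    If IsArray($aMatch) Then
        ; The pattern stops at the first ";", so "&nbsp;" is captured as "&nbsp".
        ; "==" is the case-sensitive comparison, so a change in letter case alone still counts.
        If $aMatch[0] <> "&nbsp" And Not ($aMatch[0] == $sLast) Then
            $sLast = $aMatch[0] ; this variable always holds the most recent text
            ConsoleWrite($sLast & @CRLF)
        EndIf
    EndIf
    Sleep(250) ; assumed polling interval
WEnd
```

Because the capture group ends at the first "<" or ";", any text that itself contains a semicolon would be cut short; that is a limitation of the pattern as posted, not of the loop.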

 

Spask

Does this work if I'm trying to do it in an IE window? I've created a variable called $ie = ObjCreate("InternetExplorer.Application")

kylomas

Spask,

Can you post the runnable code you are trying?

kylomas

Spask

Sure, here it is:

$ie = ObjCreate("InternetExplorer.Application")

#include <ButtonConstants.au3>
#include <EditConstants.au3>
#include <GUIConstantsEx.au3>
#include <WindowsConstants.au3>

Global $bot = GUICreate("Bot", 226, 139, -1, -1)
Global $startie = GUICtrlCreateButton("Start IE", 32, 24, 155, 25)
Global $startloop = GUICtrlCreateButton("Start Loop", 32, 56, 75, 25)
Global $pauseloop = GUICtrlCreateButton("Pause Loop", 112, 56, 75, 25)

GUICtrlSetOnEvent($startie, 'StartIE')
GUICtrlSetOnEvent($startloop, 'startloop')
GUICtrlSetOnEvent($pauseloop, 'Pauseloop')
GUISetOnEvent($GUI_EVENT_CLOSE, 'ExitApp')
Opt('GUIOnEventMode', 1)

GUISetState(@SW_SHOW)

Global $var = 0

;function to start internet explorer and get website ready
Func StartIE()
   With $ie
      .visible = true
      .navigate("http://cleverbot.com")
      While($ie.busy)
         Sleep(500)
      WEnd
   EndWith
EndFunc

While 1
   With $ie
      ;while loop to interact with the website
      While $var = 1
         While($ie.busy)
            Sleep(500)
         WEnd
         Sleep(5000)
         $sayitbutton = .document.getElementsByTagName("input")
         For $b in $sayitbutton
            if $b.value = "think for me" Then
               $b.click()
            EndIf
         Next
         ;finds the text in html
         $span = .document.getElementsByTagName("span")
         For $text In $span
           If $text.value = " " Then
              Sleep(50)
              MsgBox(0,0,0) ;messagebox to test if it can be found, but I don't know how to grab the text
           EndIf
         Next
      WEnd
   EndWith
WEnd

;starts the loop
Func startloop()
   $var = 1
EndFunc

;pauses the loop
Func Pauseloop()
   $var = 2
EndFunc

;exits the app
Func ExitApp()
   Opt('GUIOnEventMode', 0)
   GUIDelete($bot)
   Exit
EndFunc

 

Spask

Whoops, my bad, I didn't realize I had made another post.

Jury

Something like this. The more information you supply, including an example of what you are trying, the better the responses to your questions will be.

#include <IE.au3>


$oIE = _IECreate("Your url here")
$sHTML = _IEBodyReadHTML($oIE)


$aArray = StringRegExp($sHTML, '(?i)(?-s)<p id="line1" class>\r?\n.*?bot">(.*?)[<;]+.*?\r?\n</P>', 3)

For $i = 0 To UBound($aArray) - 1
    ConsoleWrite($aArray[$i] & @CRLF)
Next
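
Since the original question was about picking out "line1" specifically and catching each change, a DOM-based variant may be worth trying alongside the regular expression. The sketch below is an assumption about how that could look, not something posted in the thread: it looks up the paragraph by id with _IEGetObjById, reads the first span's innerText, and polls in a loop. The 250 ms interval and all variable names are illustrative.

```autoit
#include <IE.au3>

; Sketch only: "Your url here" is the placeholder from the post above; replace it with the real page.
Local $oIE = _IECreate("Your url here")
Local $sLast = "", $sText, $oLine, $oSpans

While True
    ; Look the paragraph up by its id, so "line3", "line5", etc. are never touched.
    $oLine = _IEGetObjById($oIE, "line1")
    If Not @error Then
        $oSpans = $oLine.getElementsByTagName("span")
        If $oSpans.length > 0 Then
            $sText = $oSpans.item(0).innerText
            ; In innerText, &nbsp; arrives as the single character U+00A0, i.e. ChrW(160).
            ; "==" compares case-sensitively, so a change in letter case alone still counts.
            If $sText <> "" And $sText <> ChrW(160) And Not ($sText == $sLast) Then
                $sLast = $sText ; this variable always holds the most recent text
                ConsoleWrite($sText & @CRLF)
            EndIf
        EndIf
    EndIf
    Sleep(250) ; assumed polling interval
WEnd
```

Reading innerText avoids depending on how the browser serializes the markup, which a regular expression over the page HTML has to match exactly. The loop has no exit condition or COM error handler, so a real script would likely want both.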

 
