jmp

How to get text from a webpage?


I want to get only "Raj Rajesh Dave", without the "Selected Candidate" label, from a webpage element that has no id or title.

HTML

<h5 style='color: rgb(58, 135, 173); font-family: "Trebuchet MS", "Lucida Sans Unicode", "Lucida Grande", "Lucida Sans", Arial, sans-serif; font-weight: bold;'>Selected Candidate : Raj Rajesh Dave</h5>

 


You should be able to accomplish this with _StringBetween. Something like this:

#include <Array.au3>
#include <String.au3>

$sHTML = '<h5 style="color: rgb(58, 135, 173); font-family: "Trebuchet MS", "Lucida Sans Unicode", "Lucida Grande", "Lucida Sans", Arial, sans-serif; font-weight: bold;">Selected Candidate : Raj Rajesh Dave</h5>'
$aResult = _StringBetween($sHTML, "Selected Candidate : ", "</h5>")
_ArrayDisplay($aResult)

 

1 hour ago, jmp said:

I want to get only Raj Rajesh Dave without Selected Candidate from webpage without id or title

Great, so what have you tried that doesn't work?

StringReg* functions could be a good place to start.

Jos

2 hours ago, Danp2 said:

You should be able to accomplish this with _StringBetween. Something like this --

$sHTML = '<h5 style="color: rgb(58, 135, 173); font-family: "Trebuchet MS", "Lucida Sans Unicode", "Lucida Grande", "Lucida Sans", Arial, sans-serif; font-weight: bold;">Selected Candidate : Raj Rajesh Dave</h5>'
$aResult = _StringBetween($sHTML, "Selected Candidate : ", "</h5>")
_ArrayDisplay($aResult)

 

@Danp2 How can I find this directly from the opened webpage instead of using a hard-coded $sHTML? The name is different every time.

4 hours ago, jmp said:

How can I find this directly from the opened webpage instead of using a hard-coded $sHTML? The name is different every time.

#include <Inet.au3>
#include <Array.au3>
#include <String.au3>

Local $sHTML   = _INetGetSource("https://www.autoitscript.com/forum/topic/200610-how-to-get-text-from-webpage/")
Local $aResult = _StringBetween($sHTML, "Selected Candidate : ", "<span class=""sc2"">")
If Not @error Then
    _ArrayDisplay($aResult)
Else
    ConsoleWrite('! No strings found.' & @CRLF)
EndIf

The example fetches this forum thread, so its end delimiter is specific to this page. You will have to adjust the end parameter of _StringBetween to match your own page.

Edited by Musashi


"In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move."


Using StringRegExp, adjusting an end parameter is not needed. This

Local $aResult = StringRegExp($sHTML, "Selected Candidate : ([^<]+)", 1)

means: after "Selected Candidate : ", capture one or more characters that are not "<".
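To see the pattern on its own, here is a minimal sketch that runs it against a hard-coded copy of the heading from the first post. The shortened style attribute and the whitespace trimming are assumptions added for illustration, not part of the suggestion above.

```autoit
#include <StringConstants.au3>

; Sample markup modelled on the first post (style attribute shortened for readability).
Local $sHTML = '<h5 style="font-weight: bold;">Selected Candidate : Raj Rajesh Dave</h5>'

; Flag $STR_REGEXPARRAYMATCH (1) returns an array holding the captured group.
Local $aResult = StringRegExp($sHTML, "Selected Candidate : ([^<]+)", $STR_REGEXPARRAYMATCH)
If @error Then
    ConsoleWrite("! No match found" & @CRLF)
Else
    ; Trim stray whitespace around the captured name before using it.
    ConsoleWrite(StringStripWS($aResult[0], $STR_STRIPLEADING + $STR_STRIPTRAILING) & @CRLF)
EndIf
```

Because the capture stops at the next "<", it does not matter which tag closes the heading.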

1 hour ago, mikell said:

Using StringRegExp, adjusting an end parameter is not needed

You're right, I was thinking of that variation, too.

I used _StringBetween because it appeared in the question from @jmp. In addition, some users have a certain aversion to StringRegExp ;).

Here is an extended script with both variations:

#include <Inet.au3>
#include <Array.au3>
#include <String.au3>

Global $sURL = "https://www.autoitscript.com/forum/topic/200610-how-to-get-text-from-webpage/"

; Example 1 with _StringBetween :
_Example1()

; Example 2 with StringRegExp (thanks to @mikell) :
_Example2()

; -----------------------------------------------------------------------------
Func _Example1()
    Local $sHTML   = _INetGetSource($sURL)
    Local $sEndPrm = "<span class=""sc2"">" ; End of the string to find
    Local $aResult = _StringBetween($sHTML, "Selected Candidate : ", $sEndPrm)
    If Not @error Then
        _ArrayDisplay($aResult, "_StringBetween")
    Else
        ConsoleWrite('! _StringBetween : No matches found' & @CRLF)
    EndIf
EndFunc   ;==>_Example1

; -----------------------------------------------------------------------------
Func _Example2()
    Local $sHTML   = _INetGetSource($sURL)
    Local $aResult = StringRegExp($sHTML, "Selected Candidate : ([^<]+)", 1)
    If Not @error Then
        _ArrayDisplay($aResult, "StringRegExp")
    Else
        ConsoleWrite('! StringRegExp   : No matches found' & @CRLF)
    EndIf
EndFunc   ;==>_Example2

 



@Musashi @mikell, thanks, but why are you giving me such long code?

I tried it myself and found the candidate name using this code:

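#include <IE.au3>

; Attach to an already open IE window whose title contains "Webpage"
Local $oIE = _IEAttach("Webpage")
; Collect every <h5> element on the page
Local $oH5s = _IETagNameGetCollection($oIE, "h5")
For $oH5 In $oH5s
    Local $sText = $oH5.innertext
    ; "Selected Candidate : " is 21 characters long; trimming 20 would leave a leading space
    Local $sName = StringTrimLeft($sText, 21)
    MsgBox(0, "", $sName)
Next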

It was easy and simple for me.
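One caveat with the loop above is that it trims every h5 on the page, including headings that do not contain the label. A hedged variation, assuming the label text is always exactly "Selected Candidate : " and that an IE window with "Webpage" in its title is already open, could check for the label first and trim by its measured length. The name $sLabel is introduced here only for illustration.

```autoit
#include <IE.au3>

; Assumption: an IE window whose title contains "Webpage" is already open.
Local $oIE = _IEAttach("Webpage")
If @error Then
    ConsoleWrite("! Could not attach to the browser window" & @CRLF)
    Exit 1
EndIf

; Hypothetical helper constant: the label that precedes the candidate name.
Local Const $sLabel = "Selected Candidate : "

Local $oH5s = _IETagNameGetCollection($oIE, "h5")
For $oH5 In $oH5s
    Local $sText = $oH5.innerText
    ; Only handle headings that actually start with the label.
    If StringLeft($sText, StringLen($sLabel)) = $sLabel Then
        ; Drop the label by its real length instead of a hard-coded count.
        ConsoleWrite(StringTrimLeft($sText, StringLen($sLabel)) & @CRLF)
    EndIf
Next
```

Measuring the label with StringLen avoids off-by-one mistakes if the label text ever changes.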
