Posted

hello

I found the code below in the forum, and it works, but only to extract the title:

#Include <String.au3>
#Include <INET.au3>

$html = _StringBetween(_INetGetSource('http://somdcomputerguy.com'), '<title>', '</title>')
MsgBox(0, "title", $html[0])

 

When I put

$html = _StringBetween(_INetGetSource('http://www.google.com/search?q=autoit', '<h3 ….>', '</h3>')
MsgBox(0, "title", $html[0])

it doesn't work. Is that maybe because it finds many </h3> tags? Can you please point me in the right direction?

  • Developers
Posted

First of all, I would like to point out that it is somewhat impolite to crosspost questions (post them multiple times).
As to your question: 

1: The posted line has a syntax error, so it will not run:

$html = _StringBetween(_INetGetSource('http://www.google.com/search?q=autoit', '<h3 ….>', '</h3>') 
; --- should be
$html = _StringBetween(_INetGetSource('http://www.google.com/search?q=autoit'), '<h3 ….>', '</h3>') 
MsgBox(0, "title", $html[0])

2: When you run this code, it tells you what is wrong with the _StringBetween() call:

#Include <String.au3>
#Include <INET.au3>
$html = _StringBetween(_INetGetSource('http://www.google.com/search?q=autoit'), '<h3 ….>', '</h3>')
ConsoleWrite('@@ Debug(' & @ScriptLineNumber & ') >Error code: ' & @error & @CRLF) ;### Debug Console

That returns "Error code: 1", and the help file says that in that case: "@error: 1 - No strings found."

So what exactly were you expecting this start parameter to find: '<h3 ….>' ?
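
To illustrate the point, here is a minimal sketch run against a hard-coded sample string rather than the live page. The sample markup and the variable names are assumptions made up for the illustration. It compares a placeholder start string, the exact start tag, and a regular expression that accepts any attributes on the tag:

```autoit
#include <String.au3>

; Hypothetical sample markup, hard-coded so the sketch runs without a network call.
Local $sSample = '<div><h3 class="LC20lb DKV0Md">AutoIt - Home</h3></div>' & _
        '<div><h3 class="LC20lb DKV0Md">AutoIt Downloads</h3></div>'

; A placeholder start string is searched for literally. It never occurs
; in the markup, so _StringBetween() sets @error to 1 (no strings found).
Local $aLiteral = _StringBetween($sSample, '<h3 ...>', '</h3>')
ConsoleWrite('Placeholder start string, @error = ' & @error & @CRLF)

; The exact start tag does occur, so the texts between it and '</h3>' are returned.
Local $aExact = _StringBetween($sSample, '<h3 class="LC20lb DKV0Md">', '</h3>')
If Not @error Then ConsoleWrite('Exact start tag: ' & $aExact[0] & @CRLF)

; A regular expression can accept any attributes on the <h3> tag.
; Flag 3 returns an array of all matches and sets @error when there are none.
Local $aAny = StringRegExp($sSample, '(?is)<h3[^>]*>(.*?)</h3>', 3)
If Not @error Then
    For $i = 0 To UBound($aAny) - 1
        ConsoleWrite('Regex match ' & $i & ': ' & $aAny[$i] & @CRLF)
    Next
EndIf
```

Whether either approach works on the real page still depends on what the downloaded source actually contains.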

Jos

SciTE4AutoIt3 Full installer Download page   - Beta files       Read before posting     How to post scriptsource   Forum etiquette  Forum Rules 
 
Live for the present,
Dream of the future,
Learn from the past.
  :)

Posted (edited)

@Nina
Always check for errors in your code. You were missing a closing parenthesis in the _INetGetSource() call, and the dots in the search string were definitely not helping the search.
The code below shows how to check for errors returned by the functions you are using. By the way, the string <h3> is not in the HTML source code, so it will always return error 1:

#include <Inet.au3>
#include <String.au3>

Global $strHTML = _INetGetSource('http://www.google.com/search?q=autoit')
If @error Then
    ConsoleWrite("_INetGetSource ERR: " & @error & @CRLF)
    Exit
Else
    ConsoleWrite($strHTML & @CRLF)

    ; _StringBetween returns an array on success, so keep it in its own variable
    Global $aTitles = _StringBetween($strHTML, '<h3>', '</h3>')
    If @error Then
        ConsoleWrite("_StringBetween ERR: " & @error & @CRLF)
    Else
        MsgBox(0, "", $aTitles[0])
    EndIf
EndIf

 

Edited by FrancescoDiMuro


Posted

Jos, I'm sorry, I didn't mean to be impolite. After posting the question in the existing topic, I noticed that it is very old and thought that maybe it was wrong to post it there, so I opened a new topic.

As for my question, I must have deleted the parenthesis by mistake while typing it. Anyway, it's not working (even with the parenthesis).

Posted

I think my post was not very clear. The '<h3 ….>' was just an example; the text I would like to extract is between <h3 class="LC20lb DKV0Md"> and </h3>.

 

  • Developers
Posted (edited)
  On 5/21/2020 at 8:42 AM, Nina said:

I think my post was not very clear. The '<h3 ….>' was just an example; the text I would like to extract is between <h3 class="LC20lb DKV0Md"> and </h3>.

 


Don't provide "just an example" which doesn't make any sense, but rather provide an actual case that isn't working so we can help you.  ;) 

Edited by Jos


Posted (edited)

What I'm trying to do is search for a word in Google, then return the titles of the first 3 links that were found. I checked the HTML code of the Google search results page, and each title is between <h3 class="LC20lb DKV0Md"> and </h3>.

I have updated my code, and for now I have it partially working. There is still an issue somewhere, because it only returns the first title and then shows the following error:

  Quote

--> IE.au3 T3.0-2 Error from function _IETagNameGetCollection, $_IESTATUS_InvalidDataType
"C:\Users\IEUser\Desktop\AutoIT\testIE.au3" (76) : ==> Variable must be of type "Object".:


 

_IETagNameGetCollection($findH3, "h3")

 

 

Edited by Nina
Posted

https://www.autoitscript.com/autoit3/docs/libfunctions/_IETagNameGetCollection.htm

If you saw the help file's example for the _IETagNameGetCollection function, then you must have seen that it uses $oIE as the first parameter. Being new to AutoIt, why did you choose to use some nonexistent object instead of trying to follow the example?

In your _IETagNameGetCollection call, change $findH3 to $oIE (like the help file's example) and see if that works.

Posted (edited)

Works fine:

 

#include <IE.au3>
#include <Array.au3>

; Open the results page in Internet Explorer
Local $oIE = _IECreate("https://www.google.com/search?q=autoit")

; Collect every <h3> element on the page
Local $oItems = _IETagNameGetCollection($oIE, "h3")
Local $aTitle_Table[3]
Local $iUbound = UBound($aTitle_Table)
Local $iCount = 0

For $oItem In $oItems
    ; Skip headings whose class name does not start with 'LC'
    If StringLeft($oItem.ClassName, 2) <> 'LC' Then
        ContinueLoop
    EndIf

    $aTitle_Table[$iCount] = $oItem.InnerText
    $iCount += 1

    ; Stop once the title table is full (first three titles)
    If $iCount >= $iUbound Then
        ExitLoop
    EndIf
Next

_IEQuit($oIE)
_ArrayDisplay($aTitle_Table)

 

Edited by MrCreatoR

 


 

 

AutoIt is simple, subtle, elegant. © AutoIt Team

Posted
  On 5/21/2020 at 12:33 PM, TheXman said:

https://www.autoitscript.com/autoit3/docs/libfunctions/_IETagNameGetCollection.htm

If you saw the help file's example for the _IETagNameGetCollection function, then you must have seen that it uses $oIE as the first parameter. Being new to AutoIt, why did you choose to use some nonexistent object instead of trying to follow the example?

In your _IETagNameGetCollection call, change $findH3 to $oIE (like the help file's example) and see if that works.


Thank you very much! 

Posted

Curiously, the tag <h3 class="LC20lb DKV0Md"> does exist, but when I use something like this:

#Include <INET.au3>
$source = _INetGetSource('http://www.google.com/search?q=autoit')
ConsoleWrite($source &@CRLF)

and then search the output for that tag, it is not there.

Shouldn't it be retrieved?


IUIAutomation - Topic with framework and examples

Au3Record.exe

Posted
  On 5/21/2020 at 12:46 PM, careca said:

then search the text for those words, they're not in it.


Because the browser does additional processing while loading the page, whereas InetRead loads only the raw page data.

This is how we can get titles in this case:

#Include <Array.au3>
#Include <INET.au3>

$sSource = _INetGetSource('http://www.google.com/search?q=autoit')
$aTitles = StringRegExp($sSource, '<a href="/url\?q=.+?><div class=".*?"><span dir=".*?">(.*?)</span>', 3)
_ArrayDisplay($aTitles)
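
Since the pattern depends on markup that the site may change or serve differently, it may help to check @error before using the result. The sketch below is only an illustration under stated assumptions: _ExtractTitles is a hypothetical helper (not part of any UDF), and the sample strings are made up to match, and to not match, the pattern above.

```autoit
; Hypothetical helper: returns the array of captured titles,
; or sets @error to 1 when the pattern matches nothing.
Func _ExtractTitles($sHtml)
    Local $aTitles = StringRegExp($sHtml, '<a href="/url\?q=.+?><div class=".*?"><span dir=".*?">(.*?)</span>', 3)
    If @error Then Return SetError(1, 0, 0)
    Return $aTitles
EndFunc   ;==>_ExtractTitles

; Made-up sample shaped like the markup the pattern expects.
Local $sSample = '<a href="/url?q=https://www.autoitscript.com/"><div class="a">' & _
        '<span dir="ltr">AutoIt Home</span></div></a>'

Local $aFound = _ExtractTitles($sSample)
If @error Then
    ConsoleWrite('No titles matched; the markup probably differs from the pattern.' & @CRLF)
Else
    For $i = 0 To UBound($aFound) - 1
        ConsoleWrite('Title ' & $i & ': ' & $aFound[$i] & @CRLF)
    Next
EndIf

; Markup without the expected structure produces @error instead of an array.
_ExtractTitles('<html><body>Some other page</body></html>')
Local $iNoMatchError = @error
ConsoleWrite('Non-matching markup, @error = ' & $iNoMatchError & @CRLF)
```

With the live page, the string returned by _INetGetSource would be passed to the helper in place of the sample. When nothing matches, StringRegExp does not return an array, so there is nothing for _ArrayDisplay to show.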

 

 


Posted

I get nothing at all, I mean not even the _ArrayDisplay window. But since this is not my thread and not my issue, let's leave it at that. Thanks.
