Extract data from the web page and store it in notepad


Hello arunachandu,

Take a look at the help file and see the documentation for InetRead or InetGet. You can save the received data as a text file with FileWrite, FileWriteLine, ...
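As a rough illustration of that suggestion, here is a minimal sketch that downloads a page with InetRead and saves it with FileWrite. The URL and the output path are placeholders chosen for this example, not values from the thread.

```autoit
; Minimal sketch: download a page with InetRead and save it as text.
; The URL and output path below are placeholders for illustration only.
Local $sUrl = "http://www.example.com"
Local $sOutFile = @ScriptDir & "\page.txt"

Local $dData = InetRead($sUrl) ; InetRead returns the page as binary data
If @error Then
    MsgBox(0, "Error", "Download failed.")
    Exit
EndIf

Local $hFile = FileOpen($sOutFile, 2) ; 2 = write mode, erase previous contents
If $hFile = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf

FileWrite($hFile, BinaryToString($dData)) ; convert the binary data to a string first
FileClose($hFile)
```

The resulting file can then be opened in Notepad like any other text file.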

:unsure:

Regards, Hannes
If you can't convince them, confuse them!

I pass a string to the Google search and the search results are displayed. I want to store all the links (URLs) from the search results in Notepad or an Excel sheet.

Can you show your script so that I can adapt to it?

Edit: I remember you!

Try this:

#include <IE.au3>

$oIE = _IECreate ("http://www.google.com")
$oForm = _IEFormGetObjByName ($oIE, "f")
$oQuery = _IEFormElementGetObjByName ($oForm, "q")
_IEFormElementSetValue ($oQuery, "AutoIt IE.au3")
_IEFormSubmit ($oForm )
$oLinks = _IELinkGetCollection ($oIE)
$iNumLinks = @extended
MsgBox(0, "Link Info", $iNumLinks & " links found")
For $oLink In $oLinks
    MsgBox(0, "Link Info", $oLink.href)
Next
Edited by wakillon

AutoIt 3.3.14.2 X86 - SciTE 3.6.0 - WIN 8.1 X64 - Other Example Scripts

A way to filter links!

#include <IE.au3>
#include <Array.au3>

$oIE = _IECreate ("http://www.google.com")
$oForm = _IEFormGetObjByName ($oIE, "f")
$oQuery = _IEFormElementGetObjByName ($oForm, "q")
_IEFormElementSetValue ($oQuery, "AutoIt IE.au3")
_IEFormSubmit ($oForm )
$oLinks = _IELinkGetCollection ($oIE)
$iNumLinks = @extended
$_Display=0
Dim $_LinkArray[1]

For $oLink In $oLinks
    If $_Display Then _ArrayAdd ( $_LinkArray, $oLink.href )
    If StringInStr ( $oLink.href, 'advanced_search?' ) <> 0 Then $_Display=1
    If StringInStr ( $oLink.href, 'google.com/support' ) <> 0 Then ExitLoop
Next

$_LinkArray = _DeleteArrayElementWithStringInstr ( $_LinkArray, 'webcache.' )
$_LinkArray = _DeleteArrayElementWithStringInstr ( $_LinkArray, 'search?' )
_ArrayDisplay ( $_LinkArray )
_IEQuit ( $oIE )

Exit

Func _DeleteArrayElementWithStringInstr ( $_Array, $_String )
    Local $_Item = 0 ; index of the current element in the (shrinking) array
    For $_Element In $_Array
        If StringInStr ( $_Element, $_String ) <> 0 Then
            _ArrayDelete ( $_Array, $_Item )
        Else
            $_Item+=1
        EndIf
    Next
    Return $_Array
EndFunc ;==> _DeleteArrayElementWithStringInstr ( )
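To get from the filtered array to the text file the original question asked for, one option is _FileWriteFromArray from File.au3. The sketch below uses a made-up stand-in array and a hypothetical output path so that it runs on its own; in the script above, $_LinkArray would take the place of $aLinks.

```autoit
#include <File.au3>

; Stand-in for $_LinkArray. Element 0 is empty because the array in the
; script above is created with Dim $_LinkArray[1]. The URLs are made up.
Local $aLinks[3] = ["", "http://www.example.com/a", "http://www.example.com/b"]

; Hypothetical output path; adjust as needed.
Local $sOutFile = @ScriptDir & "\links.txt"

; Write one element per line, starting at index 1 to skip the empty first element.
If _FileWriteFromArray($sOutFile, $aLinks, 1) Then
    Run('notepad.exe "' & $sOutFile & '"') ; open the result in Notepad
Else
    MsgBox(0, "Error", "Could not write the file.")
EndIf
```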


Thanks a lot for the reply. I have just started learning scripting, so I have a lot of silly questions.

I tried extracting the headings from the web page.

The code is as follows:

$oIE = _IECreate ("$url")
$heading=_IEGetObjById ($oIE, "summary") ; to extract only the h2-level headings; the id for that is "summary"
$result=_IEPropertyGet($heading, "innertext")
MsgBox(0,"heading",$result)

The result shown in the message box is "0".

How do I get the content of the h2 tag?
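Nobody in the thread answers this directly, so the following is only a hedged sketch of one way to read h2 headings, using _IETagNameGetCollection instead of an id lookup. Two things in the snippet above may contribute to the "0": IE.au3 is not included, and "$url" in quotes is passed as the literal text $url, because AutoIt does not expand variables inside strings by default. The address below is a placeholder, since the post does not say which page is being read.

```autoit
#include <IE.au3>

; Placeholder address; the original post does not name the page.
Local $sUrl = "http://www.example.com"

; Pass the variable itself, not the quoted text "$url".
Local $oIE = _IECreate($sUrl)

; Collect every <h2> element on the page.
Local $oHeadings = _IETagNameGetCollection($oIE, "h2")
If @error Then
    MsgBox(0, "heading", "No h2 collection returned.")
Else
    For $oHeading In $oHeadings
        ; Show the visible text of each heading.
        MsgBox(0, "heading", _IEPropertyGet($oHeading, "innertext"))
    Next
EndIf

_IEQuit($oIE)
```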


This is the code I wrote:

#include <IE.au3>

$oIE = _IECreate("www.google.com")
$html=_IEBodyReadText($oIE)
$htmlfile=FileOpen("..\Htmlfile.txt",1)
if $htmlfile = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf
FileWrite($htmlfile, $html)
$testurl=StringRegExp($htmlfile, '(?i)(?s)A faster way to browse the web',1)
MsgBox(0,"data",$testurl)

I was able to open the page and write it to the file, but I was not able to pick out the text and display it in the message box.

The message box shows 0.

Can you please help me with this?

Thanks

Aruna


Unless you need the page text later, there's no need to write to a file, but that's pretty much where the problem stems from.

StringRegExp() will not open a file to do a search. In the usage you have above, the function is just matching against the value of $htmlfile, which is the handle returned by FileOpen(), not the contents of ..\Htmlfile.txt.

Looking in the help you can see what your return of "0" is referring to, though it depends on the mode you're matching in.

I'm not sure what you're trying to match exactly, but to give you an example using google, try this:

#include <IE.au3>
#include <array.au3>

$oIE = _IECreate("www.google.com",0,0)
$html=_IEBodyReadText($oIE)


MsgBox(0,"",$html)

$testurl=StringRegExp($html, '(?i).*search.*',3)

_IEQuit($oIE)
_ArrayDisplay($testurl)
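If the page text does need to come from the file, a minimal sketch of the alternative is to read the file back into a string with FileRead and match against that string. It assumes ..\Htmlfile.txt was already written by the earlier script; the pattern is the one from the question.

```autoit
; Read the saved text back into a string, then search the string.
; Assumes ..\Htmlfile.txt already exists from the earlier script.
Local $sText = FileRead("..\Htmlfile.txt")
If @error Then
    MsgBox(0, "Error", "Unable to read file.")
    Exit
EndIf

; Flag 1 returns an array of matches and sets @error when nothing matches.
Local $aMatch = StringRegExp($sText, '(?i)A faster way to browse the web', 1)
If @error Then
    MsgBox(0, "data", "No match found.")
Else
    MsgBox(0, "data", $aMatch[0]) ; show the matched text, not the array itself
EndIf
```

Checking @error before using the result avoids passing an invalid array to MsgBox.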
Edited by bwochinski