How to read text within the paragraph tag in HTML


Hi, I am very new to AutoIt.

I am trying to grab text from a website and copy it to a text file.

Originally I used $sText = _IEBodyReadText($oIE); however, this grabs everything on the page, and the result is hard to read.

If I only want to grab the text within the paragraph tags, how do I achieve this?

Example:

<p>Grab this text</p>

Thanks for all the help.

Hi,

Just read the file and then use _StringBetween.

Mega
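
For reference, a minimal sketch of what the _StringBetween suggestion could look like for the <p> example in the question. The sample HTML string is made up for illustration and stands in for the text read from the file. Note that a plain "<p>" start marker only matches paragraph tags that carry no attributes.

```autoit
#include <String.au3> ; _StringBetween

; Sample HTML standing in for the downloaded page (made up for illustration).
Local $sHtml = "<html><body><p>Grab this text</p><div>skip</div><p>And this</p></body></html>"

; _StringBetween returns an array holding every match and sets @error if there is none.
Local $aParas = _StringBetween($sHtml, "<p>", "</p>")

; Guard with IsArray so the script does not crash when nothing matched.
If IsArray($aParas) Then
    For $sPara In $aParas
        ConsoleWrite($sPara & @CRLF)
    Next
EndIf
```

ConsoleWrite sends the matches to the SciTE output pane; swapping it for FileWriteLine would put them in a text file instead.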

Scripts & functions Organize Includes Let Scite organize the include files

Yahtzee The game "Yahtzee" (Kniffel, DiceLion)

LoginWrapper Secure scripts by adding a query (authentication)

_RunOnlyOnThis UDF Make sure that a script can only be executed on ... (Windows / HD / ...)

Internet-Café Server/Client Application Open CD, Start Browser, Lock remote client, etc.

MultipleFuncsWithOneHotkey Start different funcs by hitting one hotkey different times

Thank you for the quick reply.

Is there any way I can shorten the steps? Currently I grab all the text from the website, copy it to a text file, and then use _StringBetween to extract the text from <p></p>.

Is there any way to grab the text within <p></p> directly from the website?

Thanks for any help.

Hi,

try using the _INetGetSource function.

Mega

Edited by Xenobiologist

Example:

#include <INet.au3> ; needed for _INetGetSource (fetches the HTML)
#include <String.au3> ; needed for _StringBetween
$source = _INetGetSource("http://de.yahoo.com/") ; get the HTML
FileWrite(@ScriptDir & "\yahoo.html", $source) ; save it to a file
$read = FileRead(@ScriptDir & "\yahoo.html") ; read the file back
$readtitle = _StringBetween($read, "<title>", "</title>") ; array of all matches
If IsArray($readtitle) Then MsgBox(0, "", $readtitle[0]) ; display the first match, if any

Hi,

You do not need the file.

#include <INet.au3> ; needed for _INetGetSource (fetches the HTML)
#include <String.au3> ; needed for _StringBetween
$readtitle = _StringBetween(_INetGetSource("http://de.yahoo.com/"), "<title>", "</title>")
If IsArray($readtitle) Then MsgBox(0, "", $readtitle[0]) ; display the first match, if any

Mega

Another approach:

#include <IE.au3>
$oIE = _IECreate("your-url") ; replace your-url with the page address
$oPs = _IETagNameGetCollection($oIE, "p") ; all <p> elements on the page
For $oP In $oPs
    ConsoleWrite(_IEPropertyGet($oP, "innertext") & @CRLF & "----------------" & @CRLF)
Next

Dale
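
Since the original goal was a text file rather than console output, here is a rough sketch of the same IE.au3 approach writing each paragraph to a file. It assumes Internet Explorer is available on the machine; the URL and the output file name paragraphs.txt are placeholders chosen for illustration, not taken from the thread.

```autoit
#include <IE.au3>

; Hypothetical address; replace with the page you want to read.
Local $sUrl = "http://www.example.com/"

; Second argument 0 = do not attach to an existing window, third 0 = keep the window hidden.
Local $oIE = _IECreate($sUrl, 0, 0)
Local $oPs = _IETagNameGetCollection($oIE, "p")

; Only continue if a collection object actually came back.
If IsObj($oPs) Then
    Local $hFile = FileOpen(@ScriptDir & "\paragraphs.txt", 2) ; 2 = overwrite
    For $oP In $oPs
        ; "innertext" is the visible text of the element with the markup stripped.
        FileWriteLine($hFile, _IEPropertyGet($oP, "innertext"))
    Next
    FileClose($hFile)
EndIf

_IEQuit($oIE)
```

Because the browser parses the page, this also handles <p> tags that carry attributes, which a plain "<p>" string match would miss.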

Free Internet Tools: DebugBar, AutoIt IE Builder, HTTP UDF, MODIV2, IE Developer Toolbar, IEDocMon, Fiddler, HTML Validator, WGet, curl

MSDN docs: InternetExplorer Object, Document Object, Overviews and Tutorials, DHTML Objects, DHTML Events, WinHttpRequest, XmlHttpRequest, Cross-Frame Scripting, Office object model

Automate input type=file (Related)

Alternative to _IECreateEmbedded? better: _IECreatePseudoEmbedded  Better Better?

IE.au3 issues with Vista - Workarounds

SciTe Debug mode - it's magic: #AutoIt3Wrapper_run_debug_mode=Y

Doesn't work needs to be ripped out of the troubleshooting lexicon. It means that what you tried did not produce the results you expected. It begs the questions 1) what did you try?, 2) what did you expect? and 3) what happened instead?

Reproducer: a small (the smallest?) piece of stand-alone code that demonstrates your trouble

Hi, thank you for the reply.

<p style="border-bottom: 1px solid #ccc; padding-bottom: 10px; width: 450px; {bgcolor}">

<strong>Title</strong><br />

Content

</p>

How do I grab the title and the content and save them to a file? I have the following code, but it does not quite do what I want.

$file = FileOpen(@ScriptDir & "\test.txt", 2)

$readtitle = _StringBetween(_INetGetSource("http://www.abc.com"), "<strong>", "</strong>")

FileWriteLine($file,$readtitle[0])

$readcontent = _StringBetween(_INetGetSource("http://www.abc.com"), "</strong><br />", "</p>")

FileWriteLine($file,$readcontent[0])

FileClose($file)

Please advise. Thanks for all the help.
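
One possible direction, offered as a sketch rather than a tested answer for the actual site: download the page once, pull out every <p ...> block with StringRegExp (so the style attribute does not get in the way), and then split each block into title and content. The URL is the placeholder from the post above, and the sketch assumes every block follows the <strong>Title</strong><br /> Content layout shown there.

```autoit
#include <INet.au3>   ; _INetGetSource
#include <String.au3> ; _StringBetween

; Placeholder URL from the post above; replace with the real page.
Local $sHtml = _INetGetSource("http://www.abc.com")

; Flag 3 returns an array of all matches. (?s) lets "." span newlines,
; (?i) ignores case, and ".*?" stops at the first closing </p>.
Local $aParas = StringRegExp($sHtml, "(?is)<p\b[^>]*>(.*?)</p>", 3)

If IsArray($aParas) Then
    Local $hFile = FileOpen(@ScriptDir & "\test.txt", 2) ; 2 = overwrite
    For $sPara In $aParas
        ; Title: text inside the first <strong>...</strong> of this block, if any.
        Local $aTitle = _StringBetween($sPara, "<strong>", "</strong>")
        If IsArray($aTitle) Then FileWriteLine($hFile, StringStripWS($aTitle[0], 3))

        ; Content: everything after the first <br />, with leftover tags removed.
        Local $aBody = StringRegExp($sPara, "(?is)<br\s*/?>(.*)", 1)
        If IsArray($aBody) Then
            Local $sBody = StringRegExpReplace($aBody[0], "<[^>]+>", "")
            FileWriteLine($hFile, StringStripWS($sBody, 3)) ; 3 = trim both ends
        EndIf
    Next
    FileClose($hFile)
EndIf
```

Compared with the code in the post, this fetches the source only once, loops over all matches instead of only element [0], and trims the line breaks that surround the content inside the tag.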

Link to comment
Share on other sites

Hi,

which site is it? It would be much easier to give you the correct code if I knew the site.

Mega
