HowieJ

How to read text within the paragraph tag in HTML

HowieJ

Hi, I am very new to AutoIt.

I am trying to grab text from a website and copy it to a text file.

Originally I used $sText = _IEBodyReadText($oIE), but this grabs everything on the page and the result is hard to read.

If I only want to grab the text within the paragraph tag, how do I achieve this?

Example:

<p>Grab this text</p>

Thanks for any help.

Xenobiologist

Hi,

just read the file and then use _StringBetween.

Mega


Scripts & functions Organize Includes Let Scite organize the include files

Yahtzee The game "Yahtzee" (Kniffel, DiceLion)

LoginWrapper Secure scripts by adding a query (authentication)

_RunOnlyOnThis UDF Make sure that a script can only be executed on ... (Windows / HD / ...)

Internet-Café Server/Client Application Open CD, Start Browser, Lock remote client, etc.

MultipleFuncsWithOneHotkey Start different funcs by hitting one hotkey different times

HowieJ

Thank you for the quick reply.

Is there any way I can shorten the steps? Currently I grab all the text from the website, copy it to a text file, and then use _StringBetween to extract the text from <p></p>.

Is there any way to grab the text within <p></p> directly from the website?

Thanks for any help.

Xenobiologist

Hi,

try using the function _INetGetSource.

Mega

BurakSZ

Example:

#include <INet.au3> ; needed for _INetGetSource (fetches the HTML)
#include <String.au3> ; needed for _StringBetween
$source = _INetGetSource("http://de.yahoo.com/") ; get the HTML
FileWrite(@ScriptDir & "\yahoo.html", $source) ; save it to a file
$read = FileRead(@ScriptDir & "\yahoo.html") ; read the file back
$readtitle = _StringBetween($read, "<title>", "</title>") ; returns an array of matches
If Not @error Then MsgBox(0, "", $readtitle[0]) ; display the first match, if any

Xenobiologist

Hi,

you do not need the file.

#include <INet.au3> ; needed for _INetGetSource (fetches the HTML)
#include <String.au3> ; needed for _StringBetween
$readtitle = _StringBetween(_INetGetSource("http://de.yahoo.com/"), "<title>", "</title>") ; returns an array of matches
If Not @error Then MsgBox(0, "", $readtitle[0]) ; display the first match, if any

Mega
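
The same call can be pointed at the paragraph tags from the original question. The following is only a minimal sketch: the $sHtml literal is made-up sample markup standing in for a downloaded page, and the variable names are illustrative rather than taken from the thread. It shows that _StringBetween returns every match, so all paragraphs can be looped over instead of reading just element 0.

```autoit
#include <String.au3> ; provides _StringBetween

; Sample markup standing in for a downloaded page (an assumption for illustration).
Local $sHtml = "<html><body><p>Grab this text</p><div>skip me</div><p>And this too</p></body></html>"

; _StringBetween returns a 0-based array holding every match,
; or sets @error when nothing is found.
Local $aParagraphs = _StringBetween($sHtml, "<p>", "</p>")
If @error Then
    ConsoleWrite("No <p>...</p> blocks found" & @CRLF)
Else
    ; Print the text of each paragraph on its own line.
    For $i = 0 To UBound($aParagraphs) - 1
        ConsoleWrite($aParagraphs[$i] & @CRLF)
    Next
EndIf
```

To run it against a live page, the literal could be replaced with _INetGetSource("your-url") after adding #include <INet.au3>. Note that the plain "<p>" delimiter only matches paragraph tags written without attributes.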


DaleHohm

Another approach:

#include <IE.au3>
$oIE = _IECreate("your-url") ; replace your-url with the page address
$oPs = _IETagNameGetCollection($oIE, "p") ; collection of all <p> elements
For $oP In $oPs
    ConsoleWrite(_IEPropertyGet($oP, "innertext") & @CRLF & "----------------" & @CRLF)
Next

Dale


Free Internet Tools: DebugBar, AutoIt IE Builder, HTTP UDF, MODIV2, IE Developer Toolbar, IEDocMon, Fiddler, HTML Validator, WGet, curl

MSDN docs: InternetExplorer Object, Document Object, Overviews and Tutorials, DHTML Objects, DHTML Events, WinHttpRequest, XmlHttpRequest, Cross-Frame Scripting, Office object model

Automate input type=file (Related)

Alternative to _IECreateEmbedded? better: _IECreatePseudoEmbedded  Better Better?

IE.au3 issues with Vista - Workarounds

SciTE Debug mode - it's magic: #AutoIt3Wrapper_run_debug_mode=Y

"Doesn't work" needs to be ripped out of the troubleshooting lexicon. It means that what you tried did not produce the results you expected. It begs the questions 1) what did you try?, 2) what did you expect? and 3) what happened instead?

Reproducer: a small (the smallest?) piece of stand-alone code that demonstrates your trouble

HowieJ

Hi, thank you for the reply.

<p style="border-bottom: 1px solid #ccc; padding-bottom: 10px; width: 450px; {bgcolor}">

<strong>Title</strong><br />

Content

</p>

How do I grab the title and content and save them to a file? I have the following code, but it does not quite do what I want.

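#include <INet.au3> ; needed for _INetGetSource
#include <String.au3> ; needed for _StringBetween
$source = _INetGetSource("http://www.abc.com") ; fetch the page once and reuse it
$file = FileOpen(@ScriptDir & "\test.txt", 2) ; 2 = open for writing, erase previous contents
$readtitle = _StringBetween($source, "<strong>", "</strong>")
If Not @error Then FileWriteLine($file, $readtitle[0])
$readcontent = _StringBetween($source, "</strong><br />", "</p>")
If Not @error Then FileWriteLine($file, $readcontent[0])
FileClose($file)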

Please advise. Thanks for all the help.

Xenobiologist

Hi,

which site is it? It would be much easier to give you the correct code if I knew the site.

Mega
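
Without knowing the site, one possible direction is a regular expression that tolerates attributes on the opening <p> tag, since a plain "<p>" delimiter will not match <p style="...">. This is only a sketch under assumptions: the $sHtml literal is the snippet from the post above standing in for the real page, the pattern assumes the title always sits in <strong> followed by <br />, and the variable names are illustrative.

```autoit
; Sample markup copied from the post above, standing in for the real page.
Local $sHtml = '<p style="border-bottom: 1px solid #ccc; padding-bottom: 10px; width: 450px; {bgcolor}">' & @CRLF & _
        '<strong>Title</strong><br />' & @CRLF & _
        'Content' & @CRLF & _
        '</p>'

; (?is): case-insensitive, and "." also matches line breaks.
; <p\b[^>]*> accepts an opening paragraph tag with any attributes.
; Group 1 = text inside <strong>, group 2 = everything up to </p>.
Local $sPattern = '(?is)<p\b[^>]*>\s*<strong>(.*?)</strong>\s*<br\s*/?>(.*?)</p>'

; Flag 3 returns the captured groups of all matches in one flat array.
Local $aMatches = StringRegExp($sHtml, $sPattern, 3)
If @error Then
    ConsoleWrite("No matching paragraph found" & @CRLF)
Else
    Local $hFile = FileOpen(@ScriptDir & "\test.txt", 2) ; 2 = overwrite
    ; Entries come in pairs: title, then content.
    For $i = 0 To UBound($aMatches) - 2 Step 2
        FileWriteLine($hFile, StringStripWS($aMatches[$i], 3))     ; title, trimmed
        FileWriteLine($hFile, StringStripWS($aMatches[$i + 1], 3)) ; content, trimmed
    Next
    FileClose($hFile)
EndIf
```

For the real page, $sHtml would come from _INetGetSource (with #include <INet.au3>). If the actual markup differs from the sample, the pattern would need adjusting, and for heavily nested HTML the IE.au3 approach shown earlier in the thread may be more robust than a regular expression.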

