marcotv

get all link from html code

5 posts in this topic

#1 ·  Posted (edited)

Hi guys,

I've been a fan of AutoIt for many years!

This is my problem.

I have the HTML source code of a page.

For example this:

<html>

example text

<a href="www.url1.com">description1</a>

<a href="www.url2.com">description2</a>

<a href="www.url3.com">description3</a>

</html>

I would like to get a two-dimensional array (preferably without using the IE browser) that contains all the links and their descriptions:

array={(www.url1.com,description1),(www.url2.com,description2),(www.url3.com,description3)}

What is the best solution?

Thank you




marcotv,

Welcome to the AutoIt forum.

Not sure about the "best", but it certainly works:

#include <Array.au3>
#include <String.au3>

$sSource = '<html>' & @CRLF & _
'example text' & @CRLF & _
'<a href="www.url1.com">description1</a>' & @CRLF & _
'<a href="www.url2.com">description2</a>' & @CRLF & _
'<a href="www.url3.com">description3</a>' & @CRLF & _
'</html>'

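; URLs: everything between '<a href="' and '">description'
$aArray1 = _StringBetween($sSource, '<a href="', '">description')
; Descriptions: everything between '.com">' and '</a>'
$aArray2 = _StringBetween($sSource, '.com">', '</a>')

; Column 0 holds the URL, column 1 the description
Global $a2DArray[UBound($aArray1)][2]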

For $i = 0 To UBound($aArray1) - 1
    $a2DArray[$i][0] = $aArray1[$i]
    $a2DArray[$i][1] = $aArray2[$i]
Next

_ArrayDisplay($a2DArray)

M23


Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind.

My UDFs:

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ---------- Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 

Thank you for the reply.

However, the source code I posted was only an example.

For instance, I could have this source code:

<html>

example text

<a href="www.house.net">house</a>

<a href="www.car.com">car</a>

<a href="www.pc.biz">computer/a>

<a href="animal.com">cat/a>

</html>

In this case, your solution doesn't work :(

marcotv,

You will have to adjust the _StringBetween parameters to match the actual file structure. For example, this works on the example you just posted:

#include <Array.au3>
#include <String.au3>

$sSource = '<html>' & @CRLF & _
'example text' & @CRLF & _
'<a href="www.house.net">house</a>' & @CRLF & _
'<a href="www.car.com">car</a>' & @CRLF & _
'<a href="www.pc.biz">computer/a>' & @CRLF & _
'<a href="animal.com">cat/a>' & @CRLF & _
'</html>'

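; URLs: everything between '<a href="' and the closing '">'
$aArray1 = _StringBetween($sSource, '<a href="', '">')

_ArrayDisplay($aArray1)

; Descriptions: everything between '">' and the next '/'
$aArray2 = _StringBetween($sSource, '">', '/')

_ArrayDisplay($aArray2)

; Column 0 holds the URL, column 1 the description
Global $a2DArray[UBound($aArray1)][2]

For $i = 0 To UBound($aArray1) - 1
    $a2DArray[$i][0] = $aArray1[$i]
    ; Well-formed '</a>' tags leave a trailing '<' on the match, so strip it
    If StringRight($aArray2[$i], 1) = "<" Then $aArray2[$i] = StringTrimRight($aArray2[$i], 1)
    $a2DArray[$i][1] = $aArray2[$i]
Next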

_ArrayDisplay($a2DArray)

M23
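
If the delimiters keep changing from page to page, a regular expression may need less adjusting than _StringBetween. What follows is only a rough sketch and not part of the answers in this thread. It assumes that href is the first attribute of each <a> tag, that its value is in double quotes, and that the link text contains neither "<" nor "/" (the second restriction is there only to cope with the malformed "/a>" endings in the sample). The names $aFlat, $a2D and $iLinks are made up for illustration.

```autoit
#include <Array.au3>

; Sample source from the post above, including the two malformed closing tags
Global $sSource = '<html>' & @CRLF & _
        'example text' & @CRLF & _
        '<a href="www.house.net">house</a>' & @CRLF & _
        '<a href="www.car.com">car</a>' & @CRLF & _
        '<a href="www.pc.biz">computer/a>' & @CRLF & _
        '<a href="animal.com">cat/a>' & @CRLF & _
        '</html>'

; Group 1 = href value, group 2 = link text up to the next '<' or '/'
; Flag 3 returns every captured group of every match in one flat array:
; [url, text, url, text, ...]
Global $aFlat = StringRegExp($sSource, '(?i)<a\s+href="([^"]*)"[^>]*>([^</]*)', 3)
If @error Then
    MsgBox(0, "Links", "No links found")
    Exit
EndIf

; Two flat entries per link
Global $iLinks = Int(UBound($aFlat) / 2)
Global $a2D[$iLinks][2]

For $i = 0 To $iLinks - 1
    $a2D[$i][0] = $aFlat[2 * $i]     ; URL
    $a2D[$i][1] = $aFlat[2 * $i + 1] ; description
Next

_ArrayDisplay($a2D)
```

For real-world pages with nested tags, single-quoted attributes or attributes placed before href, the pattern would need widening, so treat it as a starting point only.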



#5 ·  Posted (edited)

Thank you very much, great! I'll use it; it's very simple. Yesterday I was at work and in a hurry.

AutoIt: great language and great support.
