TD extract algorithm


Hello!

I have a small problem extracting data from a table.

I want to extract these names:

Aer conditionat mobil

Aer conditionat

Agentii imobiliare

etc...

How can I do that?

The names are between <TD> and </TD>.

This is my script:

#include <Inet.au3>
#NoTrayIcon
#include <String.au3>
#include <ButtonConstants.au3>
#include <GUIConstantsEx.au3>
#include <ListViewConstants.au3>
#include <WindowsConstants.au3>
#include <Ie.au3>
#Region ### START Koda GUI section ### Form=
$Form1 = GUICreate("Form1", 404, 292, -1, -1)
$Button1 = GUICtrlCreateButton("Arata", 8, 8, 75, 25)
$ListView1 = GUICtrlCreateListView("#Nr|#Name", 8, 40, 386, 238)
GUICtrlSendMsg(-1, $LVM_SETCOLUMNWIDTH, 0, 50)
GUICtrlSendMsg(-1, $LVM_SETCOLUMNWIDTH, 1, 300)
GUISetState(@SW_SHOW)
#EndRegion ### END Koda GUI section ###
While 1
$nMsg = GUIGetMsg()
Switch $nMsg
Case $GUI_EVENT_CLOSE
Exit
Case $Button1
     $link =_IECreate("file:///H:/Tabel%20De%20Facut/index.html",Default,1,1,0)
     _IELoadWait($link)
     Global $SourceText = _IEBodyReadHTML($link)
     _IEQuit($link)
     $string1 = _StringBetween($SourceText,'<TD>','</TD>')
     For $1 = 0 To 106
     GUICtrlCreateListViewItem($1+1&"|"&$string1[$1], $ListView1)
     Next
     MsgBox(0,"","Finish !")
EndSwitch
WEnd

index.rar

Edited by incepator

How about posting the table, with private data removed? The _IETable functions in IE.au3 will return what you need once you select the proper parent table, and then you can loop through the array as necessary.

A table is built from <th>, <tr>, and <td> elements:

<th> is a header cell, <tr> is a row, and <td> is a data cell within the row.

So once you read in the proper table and write it to an array, you can loop through the array as you need.
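
A minimal sketch of that loop, assuming the array is laid out as [column][row], which is how an untransposed _IETableWriteToArray result is indexed. The array contents and the names $aTableData and $iRow are hypothetical and stand in for a real table:

```autoit
; Hypothetical 2 x 3 array standing in for table data: [column][row]
Local $aTableData[2][3] = [["www.example.org", "www.example.com", "www.example.net"], _
        ["Cabinet stomatologic", "Cadouri", "Camere supraveghere"]]

; The second dimension holds the rows, so ask UBound for dimension 2
For $iRow = 0 To UBound($aTableData, 2) - 1
    ; Column index 1 is the second <td> of each row
    ConsoleWrite(($iRow + 1) & "|" & $aTableData[1][$iRow] & @CRLF)
Next
```

Using UBound for the loop limit means the script does not need to know the number of rows in advance.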

http://www.w3schools.com/html/html_tables.asp

Edited by jdelaney
IEbyXPATH-Grab IE DOM objects by XPATH IEscriptRecord-Makings of an IE script recorder ExcelFromXML-Create Excel docs without excel installed GetAllWindowControls-Output all control data on a given window.

I prefer not to download things from forums. Can you post the code and wrap it in the proper [] tags (html in this case)? Otherwise, someone else will be around, probably tomorrow.


OK, this is part of the code:

<tr>
    <td><a href="http://www.apartamentedevanzare.org" target="_blank">www.apartamentedevanzare.org</a></td>
    <td>Apartamente de vanzare</td>
    <td>
        <span class="btn small blue oai_seo" data-ttgoogle="16" data-ttyahoo="-" data-ttbing="-" data-ttwords="apartamente de vanzare, vanzari apartamente">Vezi pozitie</span>
    </td>
    <td>
        <a class="btn small blue" href=" http://wstats.net/ro/website/apartamentedevanzare.org " target="_blank">Google analytics</a>
    </td>
    <td>
        <a href="http://secret.ideideafaceri.org/site/editare/22" class="btn small green">Editare</a>
    </td>
    <td><a href="http://secret.ideideafaceri.org/site/sterge/22" class="delete_confirm btn small orange">Sterge</a></td>
</tr>
<tr>
    <td><a href="http://www.autosecondhand.org" target="_blank">www.autosecondhand.org</a></td>
    <td>Auto second hand</td>
    <td>
        <span class="btn small blue oai_seo" data-ttgoogle="n/a" data-ttyahoo="-" data-ttbing="-" data-ttwords="auto second hand, auto secondhand">Vezi pozitie</span>
    </td>
    <td>
        <a class="btn small blue" href="http://wstats.net/ro/website/autosecondhand.org " target="_blank">Google analytics</a>
    </td>
    <td>
        <a href="http://secret.ideideafaceri.org/site/editare/98" class="btn small green">Editare</a>
    </td>
    <td><a href="http://secret.ideideafaceri.org/site/sterge/98" class="delete_confirm btn small orange">Sterge</a></td>
</tr>
<tr>
    <td><a href="http://www.bijuterii-aur-argint.org" target="_blank">www.bijuterii-aur-argint.org</a></td>
    <td>Bijuterii argint</td>
    <td>
        <span class="btn small blue oai_seo" data-ttgoogle="n/a" data-ttyahoo="-" data-ttbing="-" data-ttwords="bijuterii argint, bijuterii aur">Vezi pozitie</span>
    </td>
    <td>
        <a class="btn small blue" href=" http://wstats.net/ro/website/bijuterii-aur-argint.org " target="_blank">Google analytics</a>
    </td>
    <td>
        <a href="http://secret.ideideafaceri.org/site/editare/23" class="btn small green">Editare</a>
    </td>
    <td><a href="http://secret.ideideafaceri.org/site/sterge/23" class="delete_confirm btn small orange">Sterge</a></td>
</tr>
<tr>
    <td><a href="http://www.biletavion.org" target="_blank">www.biletavion.org</a></td>
    <td>Bilete de avion</td>
    <td>
        <span class="btn small blue oai_seo" data-ttgoogle="27" data-ttyahoo="-" data-ttbing="-" data-ttwords="bilete de avion, bilet avion, bilete de avion low cost, bilete de avion ieftine">Vezi pozitie</span>
    </td>
    <td>
        <a class="btn small blue" href=" http://wstats.net/ro/website/biletavion.org " target="_blank">Google analytics</a>
    </td>
    <td>
        <a href="http://secret.ideideafaceri.org/site/editare/13" class="btn small green">Editare</a>
    </td>
    <td><a href="http://secret.ideideafaceri.org/site/sterge/13" class="delete_confirm btn small orange">Sterge</a></td>
</tr>
<tr>
    <td><a href="http://www.cabinet--stomatologic.org" target="_blank">www.cabinet--stomatologic.org</a></td>
    <td>Cabinet stomatologic</td>
    <td>
        <span class="btn small blue oai_seo" data-ttgoogle="n/a" data-ttyahoo="-" data-ttbing="-" data-ttwords="cabinet stomatologic">Vezi pozitie</span>
    </td>
    <td>
        <a class="btn small blue" href="http://wstats.net/ro/website/cabinet--stomatologic.org " target="_blank">Google analytics</a>
    </td>
    <td>
        <a href="http://secret.ideideafaceri.org/site/editare/92" class="btn small green">Editare</a>
    </td>
    <td><a href="http://secret.ideideafaceri.org/site/sterge/92" class="delete_confirm btn small orange">Sterge</a></td>
</tr>
<tr>
    <td><a href="http://www.cadouricadou.org" target="_blank">www.cadouricadou.org</a></td>
    <td>Cadouri</td>
    <td>
        <span class="btn small blue oai_seo" data-ttgoogle="n/a" data-ttyahoo="-" data-ttbing="-" data-ttwords="cadouri, cadou">Vezi pozitie</span>
    </td>
    <td>
        <a class="btn small blue" href=" http://wstats.net/ro/website/cadouricadou.org " target="_blank">Google analytics</a>
    </td>
    <td>
        <a href="http://secret.ideideafaceri.org/site/editare/24" class="btn small green">Editare</a>
    </td>
    <td><a href="http://secret.ideideafaceri.org/site/sterge/24" class="delete_confirm btn small orange">Sterge</a></td>
</tr>
<tr>
    <td><a href="http://www.camere--supraveghere.org" target="_blank">www.camere--supraveghere.org</a></td>
    <td>Camere supraveghere</td>
    <td>
        <span class="btn small blue oai_seo" data-ttgoogle="n/a" data-ttyahoo="-" data-ttbing="-" data-ttwords="camere supraveghere, supraveghere video">Vezi pozitie</span>
    </td>
    <td>
        <a class="btn small blue" href=" http://wstats.net/ro/website/camere--supraveghere.org " target="_blank">Google analytics</a>
    </td>
    <td>

For example, in this:

<td><a href="http://www.cabinet--stomatologic.org" target="_blank">www.cabinet--stomatologic.org</a></td>

<td>Cabinet stomatologic</td>

<td>

there are <td> tags everywhere. How can I delimit the match so that I get only

Cabinet stomatologic


Looks like a standard table to me. jdelaney gave you the correct answer, and here's the code to prove it:

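#include <IE.au3>
#include <Array.au3>

_IEErrorHandlerRegister()

; Attach to the already-open IE window whose title contains "adsManager"
$oIE = _IEAttach("adsManager")

; Get the first table on the page (index 0) and copy its cells into a 2D array
Local $oTable = _IETableGetCollection($oIE, 0)
Local $aTableData = _IETableWriteToArray($oTable)

_ArrayDisplay($aTableData)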

Seems standard to me. You are looking for the second <td> in each row; if you use _IETableWriteToArray, it should be easy.

Damn autoformatter!

Anyway, look above for the solution :)
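
If you would rather stay with plain string parsing, as in the first script, here is a rough sketch of a different technique (a regular expression, not the table functions). It assumes every row starts with <tr> followed by attribute-free <td> cells, as in the posted fragment. The sample string and the names $sHtml and $aNames are hypothetical:

```autoit
; Hypothetical sample with the same row layout as the posted table
Local $sHtml = '<tr><td><a href="http://www.example.org">www.example.org</a></td>' & @CRLF & _
        '<td>Cabinet stomatologic</td><td><span>Vezi pozitie</span></td></tr>' & @CRLF & _
        '<tr><td><a href="http://www.example.com">www.example.com</a></td>' & @CRLF & _
        '<td>Cadouri</td><td><span>Vezi pozitie</span></td></tr>'

; (?is): ignore case and let "." span line breaks.
; Skip the first <td>...</td> after each <tr>, then capture the second cell.
; Flag 3 = return an array of all global matches.
Local $aNames = StringRegExp($sHtml, '(?is)<tr>\s*<td>.*?</td>\s*<td>(.*?)</td>', 3)

If @error Then
    ConsoleWrite("No rows matched" & @CRLF)
Else
    For $i = 0 To UBound($aNames) - 1
        ConsoleWrite($aNames[$i] & @CRLF)
    Next
EndIf
```

For the sample string this should print Cabinet stomatologic and Cadouri. It breaks as soon as the markup changes (attributes on <td>, nested tables), so the table functions remain the safer route.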

Edited by jdelaney

#include <Inet.au3>
#NoTrayIcon
#include <String.au3>
#include <ButtonConstants.au3>
#include <GUIConstantsEx.au3>
#include <ListViewConstants.au3>
#include <WindowsConstants.au3>
#include <Ie.au3>
#Region ### START Koda GUI section ### Form=
$Form1 = GUICreate("Form1", 404, 292, -1, -1)
$Button1 = GUICtrlCreateButton("Arata", 8, 8, 75, 25)
$ListView1 = GUICtrlCreateListView("#Nr|#Name", 8, 40, 386, 238)
GUICtrlSendMsg(-1, $LVM_SETCOLUMNWIDTH, 0, 50)
GUICtrlSendMsg(-1, $LVM_SETCOLUMNWIDTH, 1, 300)
GUISetState(@SW_SHOW)
#EndRegion ### END Koda GUI section ###
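While 1
    $nMsg = GUIGetMsg()
    Switch $nMsg
        Case $GUI_EVENT_CLOSE
            Exit
        Case $Button1
            ; Load the page in a hidden IE window and read its first table
            $link = _IECreate("file:///H:/Tabel%20De%20Facut/index.html", Default, 0, 1, 0)
            _IELoadWait($link)
            Local $oTable = _IETableGetCollection($link, 0)
            ; Without transposing, the array is indexed [column][row]
            Local $aTableData = _IETableWriteToArray($oTable)
            _IEQuit($link)
            ; Column 1 is the second <td> of each row; UBound replaces the hard-coded 106
            For $1 = 0 To UBound($aTableData, 2) - 1
                GUICtrlCreateListViewItem($1 + 1 & "|" & $aTableData[1][$1], $ListView1)
                Sleep(10)
            Next
            MsgBox(0, "Info", "Finish !")
    EndSwitch
WEnd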

RESOLVED!

Thank you very much, jdelaney and DanP2.
