reading html code - save matching lines?


megges

Hi, first I want to say that AutoIt is a great tool. I have programmed a few things with it, but now I don't know how to fix my problem.

I would like to do the following:

- surf to a website with Internet Explorer

- look at the HTML code of the site

- save the lines that match a defined search string to a file

The surfing and the code saving work nicely, but how do I save only the matching lines to a file? I want to do it without a temp file, because I have to spider thousands of sites ...

My code for now is:

#include <IE.au3>
#include <file.au3>

$url = somthing...

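$oIE = _IECreate()
$file = FileOpen("html.txt", 2) ; 2 = write mode, erases previous contents
_IENavigate($oIE, "http://" & $url)
_IELoadWait($oIE) ; make sure the page has finished loading before reading it
$html = _IEBodyReadHTML($oIE)

FileWriteLine($file, $html & @LF)

FileClose($file)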

^^ With this code I save the full HTML to the file, but now I want to save only the lines of code that begin with

"<TD><A onmouseover="return overlib('<table class=mapTableOthers><tr><th><strong>"

Thanks for your help

megges

Well, with HTML I don't think it's a good idea to save by line, simply because statements don't need to be on separate lines. The best way to go may be to build one string from the whole source and then parse out what you need. Some code I wrote a while back for a similar project:

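#include <Array.au3>

; Returns an array with the position of every occurrence of $sSub in $sText.
; Element [0] holds the number of occurrences found.
Func _StringSubLoc($sText, $sSub)
    Local $aPos[1] = [0]
    Local $iPos = StringInStr($sText, $sSub)
    While $iPos
        _ArrayAdd($aPos, $iPos)
        $aPos[0] += 1
        ; Search again, starting just past the match that was found
        $iPos = StringInStr($sText, $sSub, 0, 1, $iPos + StringLen($sSub))
    WEnd
    Return $aPos
EndFunc   ;==>_StringSubLoc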

What it does is take a string and a substring as arguments, locate every occurrence of the substring in the string, and return an array of occurrences and positions.

So you could call it with a string of the whole document and the string you want to search by, and save the result to an array $linestosave. Then pick a substring that would come immediately after the line you want to save (maybe a </table>?), call the function again with the full source and that substring, and save the result to an array $endtags or something.

Then, for each item of $linestosave, you could find the $endtags entry that is closest AFTER it and use StringMid to extract the data between the two points. It may sound complicated, but it really wouldn't be.
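The steps described above could be sketched roughly like this. It is only a sketch under a few assumptions: _ExtractBetween is a hypothetical helper name, the sample HTML string and the shortened start marker are made up for illustration, "</TD>" is just a guess at a suitable end marker, and "matches.txt" is a placeholder file name. Instead of building two arrays first, it uses the start parameter of StringInStr to find the closest end marker after each hit, which amounts to the same thing.

```autoit
; Sketch only: _ExtractBetween, the sample HTML and the file name are hypothetical.
; Collects every block that starts with $sStart and runs up to (but not
; including) the next $sEnd, one block per line.
Func _ExtractBetween($sHtml, $sStart, $sEnd)
    Local $sOut = ""
    Local $iTo = 0
    Local $iFrom = StringInStr($sHtml, $sStart)
    While $iFrom
        ; Closest end marker AFTER the start marker
        $iTo = StringInStr($sHtml, $sEnd, 0, 1, $iFrom + StringLen($sStart))
        If Not $iTo Then ExitLoop
        $sOut &= StringMid($sHtml, $iFrom, $iTo - $iFrom) & @CRLF
        ; Continue searching behind the end marker just used
        $iFrom = StringInStr($sHtml, $sStart, 0, 1, $iTo + StringLen($sEnd))
    WEnd
    Return $sOut
EndFunc   ;==>_ExtractBetween

; Stand-in for the string you would get from _IEBodyReadHTML($oIE)
Local $sHtml = "<TR><TD><A onmouseover=x>one</A></TD><TD>skip</TD><TD><A onmouseover=y>two</A></TD></TR>"
Local $sHits = _ExtractBetween($sHtml, "<TD><A onmouseover=", "</TD>")

; Append the hits straight to the output file, so no temp file is needed
Local $hFile = FileOpen("matches.txt", 1) ; 1 = append mode
FileWrite($hFile, $sHits)
FileClose($hFile)
```

With the sample string, the two cells containing the onmouseover links should end up in the file and the "skip" cell should not. For the real pages you would pass the full start string from the question and whatever end marker actually follows it in the source.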
