Trouble parsing webpages

Hello and thanks for reading. I am having trouble parsing webpages using InetGet and saving the file as a txt file. The saved file seems to lose all its formatting, which makes it difficult for me to anticipate how to extract URLs and other info. To explain what I'm trying to do in more depth: I want to input an initial webpage and search it for a list of URLs that share the same basic format, then download each URL that is found, parse each page for some specific info, and compile a list of what was found. The output will probably be a txt file, maybe shown in a GUI once I get the basic parts done. I've done something like this previously, but I lost an HDD and with it all the scripts I had written for reference.

Another problem I am having: even if I download the initial page by hand, when I use StringSplit to split it into arrays, the array does not populate as I expect and contains some numbers, and I have no idea why. If someone could give me an example of this or point me in the right direction, I would be very thankful for the help.

Um, yes, but my code is ugly. Well, here is what I was fiddling with. The debug message boxes are there because I was trying to figure out what was going on. I am trying to fetch a clan page on WoT, such as http://uc.worldoftanks.com/uc/clans/1000000006/, and then evaluate each player's tank tiers in that clan to gauge their relative strength at a glance. Going through this by hand takes an extreme amount of time, and I would really like to simplify it by scripting it. Ugly code below.

$numplayers = 0
MsgBox(0, "debug", @ScriptDir & "\Clans.txt")
If FileExists(@ScriptDir & "\Clans.txt") Then MsgBox(0, "debug", "File Exists.")
$file = FileOpen(@ScriptDir & "\Clans.txt", 0) ; mode 0 = open for reading

; Check if file opened for reading OK
If $file = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf

; Read in lines of text until the EOF is reached
While 1
    ; With no line number given, each call reads the next line
    $line = FileReadLine($file)
    ; @error = -1 means end of file was reached, not that the open failed
    If @error = -1 Then ExitLoop

    ; StringInStr returns the numeric position of the match (0 if not found)
    $playerexists = StringInStr($line, 'href="http://uc.worldoftanks.com/uc/accounts/')
    If $playerexists <> 0 Then
        ; To split on a double quote instead, use Chr(34) as the delimiter
        ; Split the line itself; splitting $playerexists would split a number
        $split = StringSplit($line, ">")
        If $split[0] >= 2 Then
            MsgBox(0, "debug", $playerexists & " " & $split[0] & " " & $split[2] & " " & $numplayers)
            $numplayers = $numplayers + 1
        EndIf
    EndIf
WEnd

FileClose($file)
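
A note on the stray numbers: StringInStr returns the numeric position of a match rather than the matched text, so passing its return value to StringSplit splits a number, not the line. As a minimal sketch of another way to collect the player links, the snippet below uses StringRegExp on the whole page text. The helper name _ExtractAccountUrls, the sample HTML string, and the assumption that every link is written as href="..." with double quotes are illustrative only and not taken from the real page.

```autoit
; Hypothetical helper: returns a 0-based array of account URLs found in $sHtml.
; Sets @error to 1 and returns 0 when nothing matches.
Func _ExtractAccountUrls($sHtml)
    ; Flag 3 = return an array holding every match of the capture group
    Local $aMatches = StringRegExp($sHtml, 'href="(http://uc\.worldoftanks\.com/uc/accounts/[^"]+)"', 3)
    If @error Then Return SetError(1, 0, 0)
    Return $aMatches
EndFunc

; Made-up sample text standing in for the downloaded page.
; For a live page, something like BinaryToString(InetRead($sUrl)) could supply $sSample.
Local $sSample = '<a href="http://uc.worldoftanks.com/uc/accounts/1-PlayerA/">PlayerA</a>' & @CRLF & _
        '<a href="http://uc.worldoftanks.com/uc/accounts/2-PlayerB/">PlayerB</a>'

Local $aUrls = _ExtractAccountUrls($sSample)
If @error Then
    ConsoleWrite("No account links found." & @CRLF)
Else
    For $i = 0 To UBound($aUrls) - 1
        ConsoleWrite($aUrls[$i] & @CRLF)
    Next
EndIf
```

Working on the page as one string avoids depending on how the saved file happens to be broken into lines, which may help if the download loses its formatting.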
