Help with _IE and table formatting


Good morning to everyone,

Let me start by saying that I have already done a fair amount of research on how to solve this problem myself, as I always have with other issues I ran into, but this time I think I need some hints from you guys :)

I am coding something that works with these 2 web pages:

Link : http://s8.travian.it/allianz.php?s=3

Example image of what this page looks like:

Image of Ally Report

Link: http://s8.travian.it/berichte.php?id= "xxxxxxxx"

Example image of what this page looks like:

Image of Single Report

Below is the entire piece of code I wrote to handle these 2 web pages:

#include <IE.au3>
#include <array.au3>
HotKeySet ( "{F1}" , "Scanner" )
HotKeySet ( "{ESC}" , "iExit" )

Func Scanner()
    Local $i = 0 ; index of the next free slot in $LinkToReport
    Local $LinkToReport[100]
    Local $oTable0, $oTable1, $oTable2
    Local $aTabellaReport0, $aTabellaReport1, $aTabellaReport2
    ; Two hidden IE instances: one for the alliance report list, one for the single reports
    Local $oIE = _IECreate("http://s8.travian.it/allianz.php?s=3", 0, 0)
    Local $oIE2 = _IECreate("about:blank", 0, 0)
    Local $oLinks = _IELinkGetCollection($oIE)

    For $oLink In $oLinks
        ; Keep only report links whose URL is shorter than 48 characters
        If StringInStr($oLink.href, "berichte.php?id=") And StringLen($oLink.href) < 48 Then
            If $i >= UBound($LinkToReport) Then ExitLoop ; no room left in the array
            $LinkToReport[$i] = $oLink.href
            ConsoleWrite("Trovato link: " & $LinkToReport[$i] & @CRLF)
            _IENavigate($oIE2, $LinkToReport[$i])

            ; Read the first three tables of the report page;
            ; True swaps rows and columns in the returned array
            $oTable0 = _IETableGetCollection($oIE2, 0)
            $aTabellaReport0 = _IETableWriteToArray($oTable0, True)

            $oTable1 = _IETableGetCollection($oIE2, 1)
            $aTabellaReport1 = _IETableWriteToArray($oTable1, True)

            $oTable2 = _IETableGetCollection($oIE2, 2)
            $aTabellaReport2 = _IETableWriteToArray($oTable2, True)

            ConsoleWrite("-- Report --" & @CRLF)
            ConsoleWrite(" Orario: " & $aTabellaReport0[1][1] & @CRLF)
            ConsoleWrite(" Attaccante: " & $aTabellaReport1[0][1] & @CRLF)
            ConsoleWrite(" Difensore: " & $aTabellaReport2[0][1] & @CRLF)
            ConsoleWrite(" Bottino: " & $aTabellaReport1[4][1] & @CRLF)
            ConsoleWrite("-- Fine Report --" & @CRLF)

            ; Count only the links that were actually stored
            $i += 1
        EndIf
    Next
EndFunc

Func iExit()
    Exit
EndFunc
While 1
    Sleep(500)
WEnd

What I get in the console is something like this (repeated for every "berichte" page I find):

Trovato link: http://s8.travian.it/berichte.php?id=15400276

-- Report --

Orario: il oggi alle 12:16:21

Attaccante: kuknassbrothers dal villaggio 01 Piallachetipassa

Difensore: red34 dal villaggio Villaggio di red34

Bottino: 28 | 28 | 35 | 23114/330

The console report should be a line-by-line representation of what I see on the web page (check the image: http://img600.imageshack.us/i/reportattaccosingolo.jpg/)

But as you can see, in the line starting with "Bottino:" the last numbers get messed up... the other lines are just what I wanted.

My goal is to extract the individual values from this table, like 28, 28, 35, 23, 114/330 (or 114, 330) (taken from the example above).

Is there any way other than MouseDrag (I want this program to run in the background) to extract the data from the web page?

I am attaching the source of the berichte web page to this post.

berichte.php

*Note: If you spot any other improvements I could make to my source code, feel free to point them out :)*

Thanks in advance and best regards,

Luciano from Italy.

Edited by r3dbull

It won't let me edit this post, I don't know why...

By the way, the relevant part of the page source for the "Bottino:" numbers in the table looks like this:

<th>Bottino</th>
<td colspan="10">   
<div class="res"><img class="r1" src="img/x.gif" alt="Legno" title="Legno">58 | <img class="r2" src="img/x.gif" alt="Argilla" title="Argilla">59 | <img class="r3" src="img/x.gif" alt="Ferro" title="Ferro">45 | <img class="r4" src="img/x.gif" alt="Grano" title="Grano">40</div><div class="carry"><img class="car" src="img/x.gif" alt="porta" title="porta">202/480</div>

Why do you say that line is screwed up? It has exactly what I would expect in it.

All of the values you are working with are stored in a single table cell:

<th>Bottino</th>
    <td colspan="10">
    <div class="res">
    <img class="r1" src="img/x.gif" alt="Legno" title="Legno" />58 | 
    <img class="r2" src="img/x.gif" alt="Argilla" title="Argilla" />59 | 
    <img class="r3" src="img/x.gif" alt="Ferro" title="Ferro" />45 | 
    <img class="r4" src="img/x.gif" alt="Grano" title="Grano" />40</div>
    <div class="carry">
    <img class="car" src="img/x.gif" alt="porta" title="porta" />202/480</div>
    </td>

_IETableWriteToArray pulls the text information from a cell and you can see that the initial values have " | " appended and there is no such marker between the last two.

If you want to get distinct values, you'll need to get a reference to the specific table cell that includes this information and parse it using some other DOM methods or get the .innerHTML and parse through it.
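
A minimal sketch of the .innerHTML route, assuming the cell markup looks like the fragment quoted above. `_ExtractNumbers` is a hypothetical helper, not part of IE.au3, and the sample string is copied from the posted page source rather than read from a live page:

#include <IE.au3>

; Hypothetical helper (not part of IE.au3): returns a 0-based array of the
; numbers found in the text of an HTML fragment, ignoring anything inside tags.
Func _ExtractNumbers($sHtml)
    ; Replace every tag with a space so that "40</div><div ...>202" cannot fuse into "40202"
    Local $sText = StringRegExpReplace($sHtml, "<[^>]*>", " ")
    ; Flag 3 returns all matches as an array; @error is set when nothing matches
    Local $aNumbers = StringRegExp($sText, "\d+", 3)
    If @error Then Return SetError(1, 0, 0)
    Return $aNumbers
EndFunc

; Sample markup taken from the page source posted earlier in this thread
Global $sCell = '<div class="res"><img class="r1" src="img/x.gif" alt="Legno" title="Legno">58 | ' & _
        '<img class="r2" src="img/x.gif" alt="Argilla" title="Argilla">59 | ' & _
        '<img class="r3" src="img/x.gif" alt="Ferro" title="Ferro">45 | ' & _
        '<img class="r4" src="img/x.gif" alt="Grano" title="Grano">40</div>' & _
        '<div class="carry"><img class="car" src="img/x.gif" alt="porta" title="porta">202/480</div>'

Global $aBounty = _ExtractNumbers($sCell)
; Expect six values: four resources, then carried amount and capacity
If IsArray($aBounty) And UBound($aBounty) = 6 Then
    ConsoleWrite("Legno: " & $aBounty[0] & @CRLF)
    ConsoleWrite("Argilla: " & $aBounty[1] & @CRLF)
    ConsoleWrite("Ferro: " & $aBounty[2] & @CRLF)
    ConsoleWrite("Grano: " & $aBounty[3] & @CRLF)
    ConsoleWrite("Porta: " & $aBounty[4] & " / " & $aBounty[5] & @CRLF)
EndIf

On the live page the fragment could come from something like $oTable1.rows.item(4).cells.item(1).innerHTML. The row and cell indexes there are guesses based on the [4][1] lookup in the original script and would need checking against the real table.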

Dale

Edited by DaleHohm


I probably expressed myself badly: I said "messed up" because I wanted those numbers to be separate.

I would be satisfied just to extract the final value "202/480" separately from everything else.

I will try to get around this by parsing with _IEBodyReadHTML and _StringBetween.
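
A rough sketch of what that could look like, assuming the HTML contains the "carry" div exactly as quoted earlier in the thread. The delimiter strings are assumptions: the HTML that _IEBodyReadHTML returns comes from IE's own serialization, which may reorder attributes or drop quotes, so they would need adjusting to whatever the function actually returns. The sample string stands in for the live page:

#include <String.au3>

; Stand-in for the result of _IEBodyReadHTML($oIE2); copied from the posted page source
Global $sHtml = '<div class="carry"><img class="car" src="img/x.gif" alt="porta" title="porta">202/480</div>'

; Assumed delimiters: the end of the "porta" image tag and the closing div
Global $aCarry = _StringBetween($sHtml, 'title="porta">', '</div>')

If IsArray($aCarry) Then
    ; Strip all whitespace, then split "202/480" on the slash;
    ; StringSplit stores the number of parts in element [0]
    Global $aParts = StringSplit(StringStripWS($aCarry[0], 8), "/")
    If $aParts[0] = 2 Then
        ConsoleWrite("Carried: " & $aParts[1] & @CRLF)
        ConsoleWrite("Capacity: " & $aParts[2] & @CRLF)
    EndIf
Else
    ConsoleWrite("Carry marker not found, check the delimiters" & @CRLF)
EndIf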

I hope I will come up with something that works well :)

And thanks for the reply, Dale, I really appreciate it; it cleared up one point for me =)

Edit: I will get back to this piece of code on Monday, I am taking a rest over the weekend :)
