abdulrahmanok

Get specific text from an HTML file (Solved)

8 posts in this topic

#1 ·  Posted (edited)

Hello all,

This is my HTML file:

<span id="line148"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>1</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line149"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>Student Services Department / Student General Service Section ( )</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line150"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>SEP-2016</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line151"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>3 - Met expectations</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line152"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>Y</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line153"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>Y</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line154"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>Y</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line155"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>Y</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line156"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>24</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line157"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>840</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line158"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>SEP</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line159"></span></span><span>&lt;/<span class="end-tag">TR</span>&gt;</span><span>
<span id="line160"></span></span><span>&lt;<span class="start-tag">TR</span>&gt;</span><span>
<span id="line161"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>2</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line162"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>Student Services Department / Student General Service Section ( )</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line163"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>AUG-2016</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line164"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>3 - Met expectations</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line165"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>Y</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line166"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>Y</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line167"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>Y</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line168"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>Y</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line169"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>18.5</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line170"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>648</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>
<span id="line171"></span></span><span>&lt;<span class="start-tag">TD</span> <span class="attribute-name">CLASS</span>="<a class="attribute-value">dddefault</a>"&gt;</span><span>SEP</span><span>&lt;/<span class="end-tag">TD</span>&gt;</span><span>

What I need is to get values like "840", "648", "18.5" and "24", and ignore all the other text.
My attempt:

 

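#include <IE.au3>
#include <String.au3>
#include <Array.au3>

$sText = FileRead('try.txt') ; replace this with _IEBodyReadHTML
; First pass: every piece of text that sits between a '>' and the next '<'
$aDate = _StringBetween($sText, '>', '<')
; Second pass: search only the first piece for text between the two span markers
If IsArray($aDate) Then $aDate = _StringBetween($aDate[0], '</span>', '</span><')
If IsArray($aDate) Then
    _ArrayDisplay($aDate)
Else
    ConsoleWrite('Nothing found')
EndIf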

Solved By:
@mikell

 

Edited by abdulrahmanok




Hi, I tried this one with regex:

#include <Array.au3>
Local $sText = FileRead("try.txt")    ;replace this with _IEBodyReadHTML
; Flag 3 returns an array holding every match of the capture group
Local $aDate = StringRegExp($sText, "<span>([\d.]+)<\/span>", 3)
_ArrayDisplay($aDate)

It returns every number (digits, optionally with a dot: [\d.]) that sits directly between <span> and </span>. See the StringRegExp entry in the help file for more information.
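To see the pattern at work without the file, here is a minimal sketch on an inline string. The sample text and the names $sSample and $aNumbers are made up for illustration; only the pattern and flag 3 come from the code above.

```autoit
#include <Array.au3>

; Hypothetical sample: three cells in the same <span>value</span> shape as the file
Local $sSample = "<span>24</span><span>Y</span><span>18.5</span>"

; Flag 3 collects every match of the capture group into one array
Local $aNumbers = StringRegExp($sSample, "<span>([\d.]+)</span>", 3)

If @error Then
    ; With flag 3, @error is set when nothing matched
    ConsoleWrite("No numeric cells found" & @CRLF)
Else
    ; The "Y" cell is skipped because it contains no digit or dot
    _ArrayDisplay($aNumbers, "Numeric cells")
EndIf
```

Checking @error right after the call avoids passing a non-array to _ArrayDisplay when the page contains no numeric cells.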

Conrad




You should also take a look at this UDF:

 

It could help with further work on stripping data out of HTML.


@Simpel thanks for your code, but I'm stuck trying to get only the values 24 and 18.5. I tried using "Step" in a loop over the array to remove the unwanted rows.

#include <IE.au3>
#include <String.au3>
#include <Array.au3>

$sText = FileRead('try.txt') ; replace this with _IEBodyReadHTML
Local $aDate = StringRegExp($sText, "<span>([\d.]+)<\/span>", 3)
_ArrayDisplay($aDate)

For $i = UBound($aDate, 1) To 1 Step -2
    ; MsgBox(0, "$i", $i)
    _ArrayDelete($aDate, $i)
Next
; _ArrayDelete($aDate, 0)
_ArrayDisplay($aDate)

Values found: 1,24,840,2,18.5,648

Wanted values: 24,18.5

 


@Fr33b0w I will take a look at this later, because it will take me some time to figure out.


Another try:
 

#include <IE.au3>
#include <String.au3>
#include <Array.au3>

$sText = FileRead('try.txt') ; replace this with _IEBodyReadHTML
Local $aDate = StringRegExp($sText, "<span>([\d.]+)<\/span>", 3)
_ArrayDisplay($aDate, "Before")

For $i = 0 To UBound($aDate) - 1 Step 2
    ; If $aDate[$i] = "2" Then
    ; MsgBox(0, "$i", $i)
    ConsoleWrite("$i=" & $i & @CRLF)
    _ArrayDelete($aDate, $i)
    ; EndIf
Next
_ArrayDisplay($aDate, "After")

; _ArrayDelete($aDate, 0)

; Values found: 1,24,840,2,18.5,648
; Wanted values: 24,18.5
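
One likely reason a forward loop with _ArrayDelete gives surprising results is that every deletion shifts the remaining elements down by one, so the indexes visited later no longer point at the intended values. A minimal sketch of one way around that, walking the array backwards, is shown here. It assumes every row yields exactly three numeric matches and uses a hard-coded array with the values listed above in place of the file.

```autoit
#include <Array.au3>

; Assumed input: the six values listed above, three per row
Local $aDate[6] = ["1", "24", "840", "2", "18.5", "648"]

; Walk backwards so a deletion never shifts the indexes still to be visited
For $i = UBound($aDate) - 1 To 0 Step -1
    ; Keep only the middle value of each group of three (indexes 1, 4, ...)
    If Mod($i, 3) <> 1 Then _ArrayDelete($aDate, $i)
Next

_ArrayDisplay($aDate, "Remaining") ; leaves 24 and 18.5
```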

 


#8 ·  Posted (edited)

Thank you very much. I almost went crazy, because I don't know much about mathematics and I actually hate the math operators :(
I'm lucky that there is someone here who can deal with it :)
I also added code to sum the array values:
 

#include <IE.au3>
#include <String.au3>
#include <Array.au3>

Local $sText = FileRead('try.txt')
Local $aDate = StringRegExp($sText, "<span>([\d.]+)<\/span>", 3)
;_ArrayDisplay($aDate, "Before")

; In this data each row yields three numeric matches; keep the middle one (indexes 1, 4, ...)
Local $res[Ceiling(UBound($aDate) / 3)]
For $i = 0 To UBound($aDate) - 1
    If Mod($i, 3) = 1 Then $res[($i - 1) / 3] = $aDate[$i]
Next
_ArrayDisplay($res, "After")

; Sum the kept values
Local $Sum = 0
For $i = 0 To UBound($res) - 1
    $Sum += $res[$i]
Next
MsgBox(0, 0, $Sum)
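
The same selection can probably be written without Mod by starting the loop at index 1 and stepping by 3. This is only a sketch under the same assumption of three numeric matches per row; the hard-coded array and the names $fSum and $sPicked are illustrative, not part of the original code.

```autoit
; Assumed input: the six matches reported earlier in the thread
Local $aDate[6] = ["1", "24", "840", "2", "18.5", "648"]

Local $fSum = 0
Local $sPicked = ""

; Start at index 1 and jump three at a time: indexes 1, 4, 7, ...
For $i = 1 To UBound($aDate) - 1 Step 3
    $sPicked &= $aDate[$i] & " "
    ; Number() makes the string-to-number conversion explicit
    $fSum += Number($aDate[$i])
Next

ConsoleWrite("Picked: " & $sPicked & @CRLF) ; 24 and 18.5
ConsoleWrite("Sum: " & $fSum & @CRLF)       ; 24 + 18.5 = 42.5
```

Stepping by 3 visits only the wanted elements, so no second array and no index arithmetic are needed.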

Reference:

 

Edited by abdulrahmanok

