Posted

Hi Team,

I want to extract text that spans multiple lines between HTML tags.

I used the following code:

********************************************************

$fileForRead = FileOpen("test.html", 0)

; Check if file opened for reading OK
If $fileForRead = -1 Then
    MsgBox(0, "Error", "Unable to open file test.html.")
    Exit
EndIf

; Read the whole file into a string
$sText = FileRead($fileForRead)
FileClose($fileForRead)

$subString = StringRegExp($sText, '<div id="likely_problems">(.*?)</div>', 1)
ConsoleWrite($subString[0])

*******************************************************

The file I use as input is:

***********************************************************

abcdefghijk

<div id="likely_problems"><span class='congrats_msg'><img src='http://test.com/images/feedback.gif' alt='Feedback' /> Congratulations! No likely problems.</span> ukjkjghjka

ksadfhjhagk;

jadghkl;aosdjl

aklgoruiouporip

</div>

abcdefghijk

**************************************************************

It does not work and throws the following error: "D:\AccesibilityTesting\regExpsimple.au3 (15) : ==> Subscript used with non-Array variable.:"

The pattern matches if the input file has <div id="likely_problems"> and </div> on the same line.

How can I match a pattern that spans multiple lines (ignoring whitespace and newlines)?

Thanks,

Thomas

Posted

$nOffset = 1
While 1
    $array = StringRegExp('<div id="likely_problems"><span class="congrats_msg"><img src="http://test.com/images/feedback.gif" alt="Feedback"/> Congratulations! No likely problems.</div>', '<(?i)div id="likely_problems">(.*?)</(?i)div>', 1, $nOffset)
    
    If @error = 0 Then
        $nOffset = @extended
    Else
        ExitLoop
    EndIf
    for $i = 0 to UBound($array) - 1
        msgbox(0, "RegExp Test with Option 1 - " & $i, $array[$i])
    Next
WEnd

Keep in mind the " and ' characters in the HTML file. Put a line like:

MsgBox(0, "", $sText)
before the StringRegExp call to see the string you are feeding it.
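
To illustrate the quoting point, here is a minimal sketch of a pattern that accepts either quote style around the id value. The sample string and the variable names are made up for illustration, and the (?is) options are an addition here (case-insensitive, and "." also matches line breaks), not something taken from the loop above.

```autoit
; Sketch only: accept either " or ' around the id attribute value.
; Inside a single-quoted AutoIt string, '' stands for one literal ' character.
Local $sPattern = '(?is)<div id=["'']likely_problems["'']>(.*?)</div>'

; Hypothetical sample input using single quotes in the HTML.
Local $sSample = "<div id='likely_problems'>some text</div>"

Local $aMatch = StringRegExp($sSample, $sPattern, 1)
If Not @error Then MsgBox(0, "Match", $aMatch[0])
```

Checking @error before indexing the result avoids the "Subscript used with non-Array variable" error when nothing matches.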
Posted (edited)

You didn't really say what portion of that code you wanted returned. I'll assume it's everything within the div tags.

$sText = FileRead("test.html") ;; No need to open a file in read mode to use FileRead()
If $sText Then
    ;; (?i) = case-insensitive, (?s) = "." also matches line breaks
    $subString = StringRegExp($sText, "(?i)(?s)<div id=.?likely_.+?>(.+?)\s*</div>", 1)
    If Not @error Then
        ConsoleWrite($subString[0] & @CRLF)
        ;;  To clean it up even further use this line
        $sCleaned = StringRegExpReplace($subString[0], "<.+?>", "")
        If @extended Then ConsoleWrite($sCleaned & @CRLF)
    EndIf
EndIf

Look in my signature for a toolkit to test AutoIt PCRE expressions.

EDIT: $subString[0] should contain

<span class='congrats_msg'><img src='http://test.com/images/feedback.gif' alt='Feedback' /> Congratulations! No likely problems.</span> ukjkjghjka

ksadfhjhagk;

jadghkl;aosdjl

aklgoruiouporip

After cleaning that would become

Congratulations! No likely problems. ukjkjghjka

ksadfhjhagk;

jadghkl;aosdjl

aklgoruiouporip
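
For anyone wondering why the original pattern failed: by default "." does not match line-break characters, so (.*?) cannot reach a </div> on a later line. With no match, StringRegExp sets @error and does not return an array, which is where the "Subscript used with non-Array variable" error comes from. A minimal sketch of the difference, using an inline string as a stand-in for the file contents (the string and variable names are assumptions for illustration):

```autoit
; Sketch only: an inline string stands in for the contents of test.html.
Local $sHtml = '<div id="likely_problems">first line' & @CRLF & 'second line' & @CRLF & '</div>'

; Without (?s), "." stops at line breaks, so nothing matches and @error is set.
Local $aPlain = StringRegExp($sHtml, '<div id="likely_problems">(.*?)</div>', 1)
If @error Then ConsoleWrite("No match without (?s)" & @CRLF)

; With (?s), "." also matches line breaks, so the capture can span lines.
Local $aDotAll = StringRegExp($sHtml, '(?s)<div id="likely_problems">(.*?)</div>', 1)
If Not @error Then ConsoleWrite($aDotAll[0] & @CRLF)
```

Either way, test @error before using the result as an array.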

Edited by GEOSoft

George

The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.

Posted

Thank you. It worked right away.
