Mithrandir, posted November 2, 2011 (edited)

Hi,

A long time ago I wrote a regular expression to extract div tags from a string:

    (?i)(?s)<div whatever >.*?(?(?=<div).*?</div>)</div>

It is meant to extract everything from the first <div whatever > up to its closing </div>. Whenever it finds another <div tag on the way, it should also consume everything up to that tag's closing </div>. This worked well in a couple of my scripts, but it does not work in this one. Here is the problem isolated (I can post the full HTML code if necessary):

    <div class="entry clearfix" >asd<div c> <div cl</div> <div clas</div> <div cot</div> <div clas</div><div cls</div> </div> <!-- end .entry-content --><div clix"> </div> </div> <!-- end .entry -->
    <div class="entry clearfix" >asd<div c> <div cl</div> <div clas</div> <div cot</div> <div clas</div><div cls</div> </div> <!-- end .entry-content --><div clix"> </div> </div> <!-- end .entry -->

I tested the regular expression in the PCRE toolkit using flag 3. It should return an array with two cells, each containing:

    <div class="entry clearfix" >asd<div c> <div cl</div><div clas</div><div cot</div><div clas</div><div cls</div> </div> <!-- end .entry-content --><div clix"> </div></div>

but it returns an array with one cell containing:

    <div class="entry clearfix" >asd<div c> <div cl</div>

It is as if it ignores the condition (?(?=<div).*?</div>) and simply searches until it finds the first closing </div>. I also tried making it greedy, but then it captures both blocks in a single cell of the array.

I'm stuck on this and would appreciate any help. Thanks!
czardas, posted November 2, 2011 (edited)

This seems pretty complicated to me, but you could try a different approach. The following code finds the positions of the opening and closing div tags in a string. From there you could perhaps figure something out. I imagine GOESoft will laugh, but I'm not sure how to fix the RegExp.

    #include <Array.au3>

    $sTest = "<div>data_A</div><div><div>data_B</div></div>"

    ; Count the opening tags
    Dim $iNumDiv = 1
    While StringInStr($sTest, "<div", 0, $iNumDiv)
        $iNumDiv += 1
    WEnd
    $iNumDiv -= 1 ; Since we started with a value of 1 before searching.

    ; Count the closing tags
    Dim $iNumEndDiv = 1
    While StringInStr($sTest, "</div>", 0, $iNumEndDiv)
        $iNumEndDiv += 1
    WEnd
    $iNumEndDiv -= 1

    Dim $iBound = $iNumDiv + $iNumEndDiv ; This should be an even number
    If $iBound = 0 Then
        MsgBox(0, "Error", "The string contains no div tags")
        Exit ; To avoid errors with the next part of the script
    EndIf

    ; Column 0 holds the tag type, column 1 its position in the string
    Dim $aArray[$iBound][2], $iCount = 0
    For $i = 1 To $iNumDiv
        $aArray[$iCount][0] = "<div whatever>"
        ; Search for "<div" (as in the counting loop) so tags with attributes are found too
        $aArray[$iCount][1] = StringInStr($sTest, "<div", 0, $i)
        $iCount += 1
    Next
    For $i = 1 To $iNumEndDiv
        $aArray[$iCount][0] = "</div>"
        $aArray[$iCount][1] = StringInStr($sTest, "</div>", 0, $i)
        $iCount += 1
    Next

    _ArraySort($aArray, 0, 0, 0, 1) ; Sort the div tags in the order they appear
    _ArrayDisplay($aArray)

Edit: The RegExp you have will never work anyway, since the opening and closing tags need to be paired. I believe a more sophisticated parsing method is required.
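One way to do the pairing mentioned above is to walk through the string and keep a depth counter: each opening tag goes one level deeper, each closing tag one level up, and a block is complete whenever the depth returns to zero. The following is only a minimal sketch under stated assumptions. It assumes well-formed tags (every <div ...> has its own </div>, which is not true of the truncated sample in the first post), and the function name _ExtractTopLevelDivs and the test string are made up for illustration.

```autoit
#include <Array.au3>

; Hypothetical helper: returns an array whose element [0] is the number of
; top-level <div ...>...</div> blocks found and whose elements [1..n] are the blocks.
Func _ExtractTopLevelDivs($sHtml)
    Local $aBlocks[1] = [0]
    Local $iDepth = 0, $iStart = 0, $iPos = 1, $iOpen, $iClose
    While 1
        $iOpen = StringInStr($sHtml, "<div", 0, 1, $iPos) ; next opening tag at or after $iPos
        $iClose = StringInStr($sHtml, "</div>", 0, 1, $iPos) ; next closing tag at or after $iPos
        If $iClose = 0 Then ExitLoop ; no closing tag left, nothing more to pair
        If $iOpen > 0 And $iOpen < $iClose Then
            ; An opening tag comes first: go one level deeper
            If $iDepth = 0 Then $iStart = $iOpen ; remember where the outer block starts
            $iDepth += 1
            $iPos = $iOpen + 4 ; continue after "<div"
        Else
            ; A closing tag comes first: go one level up
            If $iDepth > 0 Then
                $iDepth -= 1
                If $iDepth = 0 Then
                    ; Depth is back to zero, so the outer block is complete
                    $aBlocks[0] += 1
                    ReDim $aBlocks[$aBlocks[0] + 1]
                    $aBlocks[$aBlocks[0]] = StringMid($sHtml, $iStart, $iClose + 6 - $iStart)
                EndIf
            EndIf
            $iPos = $iClose + 6 ; continue after "</div>"
        EndIf
    WEnd
    Return $aBlocks
EndFunc   ;==>_ExtractTopLevelDivs

Local $sTest = "<div>data_A</div><div><div>data_B</div></div>"
Local $aResult = _ExtractTopLevelDivs($sTest)
_ArrayDisplay($aResult)
```

With this test string the array should hold a count of 2, followed by <div>data_A</div> and <div><div>data_B</div></div>. Note that the search for "<div" would also match longer tag names that merely start with "div", so real pages may need a stricter check.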
DeltaRocked, posted August 20, 2012

Hi,

Hope this helps.

    (?i)(?:<[s*]{0,1}div[^>].*?/div)

Regards,
Deltarocked
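A lazy pattern like the one above stops at the first /div it meets, so it cannot keep nested tags paired. The conditional in the original pattern has a similar limit: it is only evaluated at the position the lazy .*? has currently reached, and when the lookahead fails it matches nothing, so the match still ends at the first </div>. PCRE offers recursion for this kind of problem. Below is a sketch, assuming the StringRegExp build in use supports (?R) and that the HTML is well formed; the test string is made up, and deeply nested or very large input may run into the engine's recursion limits.

```autoit
#include <Array.au3>

; Hypothetical, well-formed test input: two top-level blocks containing nested divs
Local $sHtml = '<div class="entry clearfix">asd<div class="a"><div>x</div></div><div>y</div></div>' & _
        '<!-- end .entry -->' & _
        '<div class="entry clearfix">qwe<div>z</div></div>'

; <div\b[^>]*>      an opening div tag
; (?!</?div\b).     any single character that does not start a div tag
; (?R)              or a complete nested div block (recursion into the whole pattern)
; </div>            the closing tag that belongs to the opening tag
Local $sPattern = '(?is)<div\b[^>]*>(?:(?!</?div\b).|(?R))*</div>'

; Flag 3 returns an array of all matches; without capturing groups these are the full matches
Local $aBlocks = StringRegExp($sHtml, $sPattern, 3)
If @error Then
    MsgBox(0, "Error", "No balanced div block found")
Else
    _ArrayDisplay($aBlocks)
EndIf
```

Since nested blocks are consumed by the recursion, only the outermost blocks end up in the array. The pattern deliberately has no capturing groups, because with groups flag 3 returns the group contents instead of the whole match.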
jdelaney, posted August 20, 2012

Using the HTML DOM, you can search for 'DIV' elements with x.getElementsByTagName(name), which gets all elements with the specified tag name. That returns a collection of all the DIV elements, which you can loop through to get the .text of each child (the inner text of the node).
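The DOM route could look roughly like the sketch below. It assumes Internet Explorer is installed and uses the IE.au3 UDF to load the markup into a hidden browser window, so the browser does the tag pairing. The test string and the class filter are assumptions for illustration, and outerHTML is used here instead of the inner text so that the whole block is printed.

```autoit
#include <IE.au3>

; Hypothetical test input with one nested div
Local $sHtml = '<div class="entry clearfix">asd<div class="inner">x</div></div>'

; Hidden IE instance (no attach, not visible), used only as an HTML parser
Local $oIE = _IECreate("about:blank", 0, 0)
_IEDocWriteHTML($oIE, "<html><body>" & $sHtml & "</body></html>")

; Collection of every DIV element in the document, nested ones included
Local $oDivs = _IETagNameGetCollection($oIE, "div")
For $oDiv In $oDivs
    ; Keep only the outer blocks, identified by their class attribute
    If $oDiv.className = "entry clearfix" Then
        ConsoleWrite($oDiv.outerHTML & @CRLF)
    EndIf
Next

_IEQuit($oIE)
```

Keep in mind that outerHTML is the browser's own serialization of the parsed element, so it may differ from the original source text in case, quoting, or whitespace.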