RegEx Help - Parsing HTML

Ned

Why won't this work? I have spent a few hours on this one little problem and tried everything I can think of, but it never works. I have tested the pattern in a few other scripts and it works fine there. @error says it is a bad string, but how can String($HTML) be bad? Is it too much text to convert to a string?

#include <IE.au3>    ; _IECreate, _IEBodyReadHTML, _IEQuit
#include <Array.au3> ; _ArrayDisplay

$IE = _IECreate("C:\Users\Ned\Dropbox\Public\Findings.html", "", 0)
$HTML = _IEBodyReadHTML($IE)
_IEQuit($IE)
$HTML = String($HTML)
$array = StringRegExp($HTML, '(?<=\QNew: <a href="/items/view/\E)(.*?)(?=\Q">\E)', 3) ;gets new item numbers
_ArrayDisplay($array)

Beege

I don't see any error checking. How do you know the error is coming from String()? Have you tried printing $HTML to the console to verify that _IEBodyReadHTML() worked?
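
A minimal sketch of what that error checking could look like, assuming the same file path and pattern as in the first post. The `_Check` helper and the variable names are made up for illustration. According to the AutoIt help file, StringRegExp() sets @error to 1 when nothing matched and to 2 when the pattern itself is bad (with @extended holding the offset of the problem), so checking it separately shows which of the two is happening.

```autoit
#include <IE.au3>
#include <Array.au3>

; Print a message and stop if a step reported an error.
; _Check is a hypothetical helper, not part of any UDF.
Func _Check($sStep, $iErr, $iExt = 0)
    If $iErr = 0 Then Return
    ConsoleWrite($sStep & " failed: @error = " & $iErr & ", @extended = " & $iExt & @CRLF)
    Exit 1
EndFunc   ;==>_Check

Local $oIE = _IECreate("C:\Users\Ned\Dropbox\Public\Findings.html", 0, 0)
_Check("_IECreate", @error, @extended)

Local $sHTML = _IEBodyReadHTML($oIE)
_Check("_IEBodyReadHTML", @error, @extended)
_IEQuit($oIE)

; Confirm something was actually read before blaming the pattern.
ConsoleWrite("Read " & StringLen($sHTML) & " characters" & @CRLF)

Local $sPattern = '(?<=\QNew: <a href="/items/view/\E)(.*?)(?=\Q">\E)'
Local $aItems = StringRegExp($sHTML, $sPattern, 3)
Local $iErr = @error, $iExt = @extended ; save both before any other call resets them

Switch $iErr
    Case 0
        _ArrayDisplay($aItems)
    Case 1
        ConsoleWrite("Pattern is valid, but nothing in the HTML matched." & @CRLF)
    Case 2
        ConsoleWrite("Bad pattern, error at offset " & $iExt & @CRLF)
EndSwitch
```

If the last step reports @error = 1, the string and the pattern are both fine and the text simply does not contain what the pattern expects.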

Ned

I printed $HTML to the console and everything was there as it should be. The pattern works fine when I put the raw text directly in the script, just never when the text comes from an IE window.
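
One way to narrow this down is to test the literal pieces of the pattern one at a time against the text that came back from IE. This is only a sketch: `_ReportFragments` is a hypothetical helper, and the sample string is invented to illustrate one possible (unconfirmed) cause, namely that the markup IE hands back is not character-for-character identical to the file on disk, for example in tag case or attribute quoting.

```autoit
; Report where each literal piece of the pattern occurs in $sHTML.
; StringInStr() returns 0 when the piece is not found.
; _ReportFragments is a hypothetical helper for illustration only.
Func _ReportFragments($sHTML)
    Local $aParts[3] = ['New: <a href="/items/view/', 'href="/items/view/', '/items/view/']
    For $sPart In $aParts
        ConsoleWrite("'" & $sPart & "' -> position " & StringInStr($sHTML, $sPart) & @CRLF)
    Next
EndFunc   ;==>_ReportFragments

; Invented sample: the same link, serialized without attribute quotes.
_ReportFragments('New: <A href=/items/view/123>Item</A>')

; In the real script, pass the string returned by _IEBodyReadHTML() instead.
```

If the longest fragment is missing but a shorter one is found, the difference between the two shows which characters in the pattern need to be loosened.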

Beege

Wait... if you manually add the $HTML string to your script, it works? That is weird. I have a few ideas that might work, but nothing that explains what's actually going wrong. Maybe try putting it on the clipboard and then reading it back?

ClipPut($HTML)
$HTML = ClipGet()

If that doesn't work, maybe:

ClipPut($HTML)
$HTML = ClipGet()
_FileCreate(@ScriptDir & '\Temphtml.txt') ; requires #include <File.au3>
FileWrite(@ScriptDir & '\Temphtml.txt', $HTML)
$HTML = FileRead(@ScriptDir & '\Temphtml.txt')
FileDelete(@ScriptDir & '\Temphtml.txt')

Ned

I ended up going a different route, using a table-to-array function, since this wasn't working. Luckily I am able to use that function. I never tried the other options you gave in your last post, but thanks for the help.
