Ned

RegEx Help - Parsing HTML

5 posts in this topic

#1 ·  Posted (edited)

Why will this not work? I have spent a few hours on this one little problem. I have tried everything I can think of to make it work, but it never does. I have tested the pattern in a few other scripts and it works fine. The @error value says it is a bad string, but how can String($HTML) be bad? Is it too much text to convert to a string?

#include <IE.au3>
#include <Array.au3>

$IE = _IECreate("C:\Users\Ned\Dropbox\Public\Findings.html", 0, 0)
$HTML = _IEBodyReadHTML($IE)
_IEQuit($IE)
$HTML = String($HTML)
$array = StringRegExp($HTML, '(?<=\QNew: <a href="/items/view/\E)(.*?)(?=\Q">\E)', 3) ; gets new item numbers
_ArrayDisplay($array)
Edited by Ned

I don't see any error checking. How do you know it's coming from String()? Have you tried printing $HTML to the console to verify that _IEBodyReadHTML() worked?
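
To make that suggestion concrete, here is a minimal sketch of the original script with @error checked after each step. The variable names ($oIE, $sHTML, $aItems) are my own, and it assumes the standard IE.au3 and Array.au3 UDFs. For StringRegExp with flag 3, @error = 1 means the pattern is valid but matched nothing, and @error = 2 means the pattern itself is bad, so the check also shows which of the two is happening.

```autoit
#include <IE.au3>
#include <Array.au3>

; Same calls as the original post, with @error checked after each step.
Local $oIE = _IECreate("C:\Users\Ned\Dropbox\Public\Findings.html", 0, 0)
If @error Then
    ConsoleWrite("_IECreate failed, @error = " & @error & @CRLF)
    Exit 1
EndIf

Local $sHTML = _IEBodyReadHTML($oIE)
Local $iReadErr = @error ; save it before _IEQuit resets @error
_IEQuit($oIE)
If $iReadErr Then
    ConsoleWrite("_IEBodyReadHTML failed, @error = " & $iReadErr & @CRLF)
    Exit 1
EndIf

; Show the type, the length, and the exact text the regex will see.
ConsoleWrite(VarGetType($sHTML) & ", " & StringLen($sHTML) & " chars" & @CRLF & $sHTML & @CRLF)

Local $aItems = StringRegExp($sHTML, '(?<=\QNew: <a href="/items/view/\E)(.*?)(?=\Q">\E)', 3)
Local $iErr = @error, $iExt = @extended
Switch $iErr
    Case 0
        _ArrayDisplay($aItems)
    Case 1
        ConsoleWrite("Pattern is valid but matched nothing" & @CRLF)
    Case 2
        ConsoleWrite("Bad pattern at offset " & $iExt & @CRLF)
EndSwitch
```

If the output is "matched nothing", String() is not the culprit. Compare the printed HTML character by character with the text the pattern expects.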

I printed $HTML to the console and everything was there as it should be. The pattern works fine when I put the raw text in the script directly, just never when the text comes from an IE window.
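
One possible explanation, which nobody in this thread confirmed, is that the HTML returned by the browser is not byte-for-byte the same as the file: older IE versions may upper-case tag names or rewrite href values, so a literal `New: <a href="/items/view/` would no longer match. The sketch below shows a more tolerant pattern under that assumption. The sample string and the item numbers in it are made up for illustration.

```autoit
#include <Array.au3>

; Hypothetical sample markup: the first line mimics what IE might return
; (upper-case tag, rewritten href), the second is the form found in the raw file.
Local $sSample = 'New: <A href="about:/items/view/12345">Widget</A>' & @CRLF & _
        'New: <a href="/items/view/678">Gadget</a>'

; (?i) ignores case, "? tolerates a missing quote, and [^">]*? allows a prefix
; before /items/view/. Flag 3 returns every capture-group match in one array.
Local $aIDs = StringRegExp($sSample, '(?i)New:\s*<a\s[^>]*?href="?[^">]*?/items/view/([^">\s]+)', 3)
If @error Then
    ConsoleWrite("No match, @error = " & @error & @CRLF)
Else
    _ArrayDisplay($aIDs)
EndIf
```

Using a capture group instead of a look-behind also avoids the fixed-length restriction on look-behinds, which makes it easier to loosen the prefix.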

Wait... if you manually add the string $html to your script, it works? That is weird. I have a few ideas that might work, but nothing that would figure out what's actually going wrong. Maybe try putting it on the clipboard and then reading it back?

ClipPut($html)
$html = ClipGet()

If that doesn't work, maybe round-trip it through a temporary file as well:

ClipPut($html)
$html = ClipGet()
; FileWrite creates the file if it does not exist, so _FileCreate (File.au3) is not needed
FileWrite(@ScriptDir & '\Temphtml.txt', $html)
$html = FileRead(@ScriptDir & '\Temphtml.txt')
FileDelete(@ScriptDir & '\Temphtml.txt')

I ended up going a different route and used a table-to-array function, since the regex wasn't working. Luckily I am able to use that function here. I never tried the other options you gave in your last post, but thanks for the help.
