Jump to content
Sign in to follow this  
wimhek

Regexp question

Recommended Posts

wimhek

Pleas help me , I am converting HTML to csv using the command stringreg exp. In the example belot, the field Help is not detected.

How to change my regexp ?

#include <Array.au3>

$sString = "<td NOWRAP>cel1</td><td NOWRAP>cel2</td><td NOWRAP>cel3</td><td>Help</td><td NOWRAP>cel4</td>"

$aReturn = StringRegExp($sString, '(?s)(?i)<td NOWRAP>(.*?)</td>', 3)

_ArrayDisplay($aReturn)

thnx.

Share this post


Link to post
Share on other sites
water

I assume you use Internet Explorer as browser. Then you could use the builtin IE UDF, fucntion _IETableWriteToArray, to read the content of a table into an array (for further processing).


My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2018-10-19 - Version 1.4.10.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX (2018-09-01 - Version 1.3.4.0) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts
PowerPoint (2017-06-06 - Version 0.0.5.0) - Download - General Help & Support
Excel - Example Scripts - Wiki
Word - Wiki
 
Tutorials:

ADO - Wiki

 

Share this post


Link to post
Share on other sites
wimhek

Thnx I will try that

Share this post


Link to post
Share on other sites
PhoenixXL

#include <Array.au3>

$sString = "<td NOWRAP>cel1</td><td NOWRAP>cel2</td><td NOWRAP>cel3</td><td>Help</td><td NOWRAP>cel4</td>"

$aReturn = StringRegExp($sString, '(?s)(?i)<td(?: NOWRAP)?>(.*?)</td>', 3)
_ArrayDisplay($aReturn)

;orelse to get only the Help

$aReturn = StringRegExp( $sString , '<td>(.*?)</td>', 3 )
_ArrayDisplay($aReturn)
Ask if you don't get the code


My code:

PredictText: Predict Text of an Edit Control Like Scite. Remote Gmail: Execute your Scripts through Gmail. StringRegExp:Share and learn RegExp.

Run As System: A command line wrapper around PSEXEC.exe to execute your apps scripts as System (LSA). Database: An easier approach for _SQ_LITE beginners.

MathsEx: A UDF for Fractions and LCM, GCF/HCF. FloatingText: An UDF for make your text floating. Clipboard Extendor: A clipboard monitoring tool. 

Custom ScrollBar: Scroll Bar made with GDI+, user can use bitmaps instead. RestrictEdit_SRE: Restrict text in an Edit Control through a Regular Expression.

Share this post


Link to post
Share on other sites
wimhek

Super, it works. I do not get the code, but that is my restriction :-)

Share this post


Link to post
Share on other sites
kylomas

winhek,

Here's a couple alternatives

#include <Array.au3>
$sString = "<td NOWRAP>cel1</td><td NOWRAP>something before Help1</td><td NOWRAP>help,once again</td><br><td>Help</td><td NOWRAP>cel4</td>"
; to get everything that is not HTML
$aReturn = StringRegExp($sString, '(?si)>([^<].*?)<', 3)
 _ArrayDisplay($aReturn, 'All NON-HTML')
 ; get any non-HTML that begins with the string "help"
 $aReturn = StringRegExp($sString, '(?s)(?i)(help.*?)<', 3)
 _ArrayDisplay($aReturn,'Help Only')
 ;==================================================================================
 ;
 ; REGEXP Experts - How would I get get any non-HTML that contains the string "help"
 ;
 ; I've tried multiple variations of the following without success
 ;
 ;===================================================================================
 $aReturn = StringRegExp($sString, '(?si)>([^>].*?help.*?)<', 3)
 _ArrayDisplay($aReturn,'Help Only')

@SRE Experts - I can't figure out how to get the third example to work. I am trying to get any non-HTML containing a string.

kylomas


Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill

Share this post


Link to post
Share on other sites
PhoenixXL

An Example

#include <Array.au3>
$sString = "<td NOWRAP>cel1</td><td NOWRAP>something before Help1</td><td NOWRAP>help,once again</td><td>Help</td><td NOWRAP>cel4</td>"
Local $a, $aReturn = StringRegExp($sString, '>([^<>]+)<', 4), $aRet[1]
For $i = 0 To UBound($aReturn) - 1
$a = $aReturn[$i]
If StringInStr($a[1], "help") Then _ArrayAdd($aRet, $a[1])
Next
_ArrayDelete($aRet, 0 )
_ArrayDisplay($aRet)

Direct Approach

#include <Array.au3>
$sString = "<td NOWRAP>cel1</td><td NOWRAP>something before Help1</td><td NOWRAP>help,once again</td><td>Help</td><td NOWRAP>cel4</td>"
$aReturn = StringRegExp($sString, '(?i)>([^<>]*?help[^<>]*?)<', 3)
_ArrayDisplay($aReturn)

Regards :)

Edited by PhoenixXL

My code:

PredictText: Predict Text of an Edit Control Like Scite. Remote Gmail: Execute your Scripts through Gmail. StringRegExp:Share and learn RegExp.

Run As System: A command line wrapper around PSEXEC.exe to execute your apps scripts as System (LSA). Database: An easier approach for _SQ_LITE beginners.

MathsEx: A UDF for Fractions and LCM, GCF/HCF. FloatingText: An UDF for make your text floating. Clipboard Extendor: A clipboard monitoring tool. 

Custom ScrollBar: Scroll Bar made with GDI+, user can use bitmaps instead. RestrictEdit_SRE: Restrict text in an Edit Control through a Regular Expression.

Share this post


Link to post
Share on other sites
kylomas

@PhoenixXL,

I see it now. I was negating the "<" and ">", but then matching on any char "." (which is probably contradictory).

Thanks,

kylomas

edit: additional question

This pattern also works

'(?si)>([^<>]*?help.*?)<'

Because the "<" is the first char encountered following "help"???? Edited by kylomas

Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
Sign in to follow this  

  • Similar Content

    • MrCheese
      By MrCheese
      argh, pulling my hair out.
      considering this post: 
       
      say for a string = "03a", how can I strip out the leading 0 and the a.
      I have tried:
      $new = StringRegExpReplace($string, '[^1-9][^0-9]', '')
       
      and various combinations:
      ^0+[^0-9]
      [^[:digit:]]
      "[^0].*"
      "^0*(d+)"
       
      I'm going loopy!
       
       
    • MrCheese
      By MrCheese
      hi all,
      reviewing the forum, this thread is applicable: 
       
       
      I wanted to know if there is now a better way to do this?
      In essence, I load a tab delimited txt file into an array (works well). I used tab, as some fields in the original csv contains commas.
      However, I needed autoit to manipulate this array, and output it as a csv.
      IF my array contains items with a comma, without double quotes around the field, then how best do I get a csv out of this?
      My current workaround is to filewritefromarray tab delimited, then open it in excel and save as a csv. I will need to check this to see how the address fields behave that contain a comma.
       
      Any thoughts would be appreciated.
       
    • SOF-TECH
      By SOF-TECH
      Dear all,
      Can someone show  me how to en hance the below function to write in CSV  into column  and rows the input values ? 
      I am getting this result: 

      I would like the result to be as this 

      From A1:C1 is for headers
      From A2:C2 is for input Data
      Global Const $GUI_EVENT_CLOSE = -3 $sDataFilePath = @ScriptDir & "\Records.csv" #region ### START Koda GUI section ### Form= $Form1 = GUICreate("Demo1: New Record", 580, 115) $Input1 = GUICtrlCreateInput("", 10, 30, 270, 21) $Input2 = GUICtrlCreateInput("", 300, 30, 270, 21) $Input3 = GUICtrlCreateInput("", 10, 80, 270, 21) $Label1 = GUICtrlCreateLabel("Name:", 10, 10, 35, 17) $Label2 = GUICtrlCreateLabel("ID:", 300, 10, 18, 17) $Label3 = GUICtrlCreateLabel("Phone No:", 10, 60, 55, 17) $Button1 = GUICtrlCreateButton("Save to CSV", 450, 70, 120, 30) GUISetState(@SW_SHOW) #endregion ### END Koda GUI section ### While 1 $nMsg = GUIGetMsg() Switch $nMsg Case $GUI_EVENT_CLOSE Exit Case $Button1 _ExportData() MsgBox(64, @ScriptName, "Record Saved.") EndSwitch WEnd Func _ExportData() If Not FileExists($sDataFilePath) Then FileWriteLine($sDataFilePath, "Name;ID;Phone No.;") EndIf For $i = $Input1 To $Input3 FileWrite($sDataFilePath, GUICtrlRead($i) & ";") Next FileWriteLine($sDataFilePath, "") EndFunc ;==>_ExportData May be Excel UDF has be to be added but I can manage that my self  
      Thank you in advance
    • PClough
      By PClough
      Hi everyone!
      After updating autoit, I tried to run an old program using complex regexp's.  It did not work.  Eventually I broke the problem down to this example:
       
      #include <Array.au3> $buf = "First title" & @CRLF & "Tom" & Chr(0x92) & "s sleepwalking" & @CRLF & "Last | line" & @CRLF $items = StringRegExp($buf, '([\x20-\xff]+)\x0d\x0a', 3) _ArrayDisplay($items,'') And this is the result I get when running it:
      Row 0
       
×