Regexp question

wimhek · March 27, 2013

Please help me. I am converting HTML to CSV using the StringRegExp command. In the example below, the field Help is not detected.

How should I change my regexp?

#include <Array.au3>

$sString = "<td NOWRAP>cel1</td><td NOWRAP>cel2</td><td NOWRAP>cel3</td><td>Help</td><td NOWRAP>cel4</td>"

$aReturn = StringRegExp($sString, '(?s)(?i)<td NOWRAP>(.*?)</td>', 3)

_ArrayDisplay($aReturn)

Thanks.

water · March 27, 2013

I assume you use Internet Explorer as your browser. If so, you could use the function _IETableWriteToArray from the built-in IE UDF to read the content of a table into an array (for further processing).
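
A minimal sketch of that suggestion, assuming the table is the first one on a page that Internet Explorer can load. The URL is a placeholder and not from the original post; the calls are from the standard IE.au3 and Array.au3 includes.

```autoit
#include <IE.au3>
#include <Array.au3>

; Placeholder URL (assumption): replace with the page that holds the table
Local $oIE = _IECreate("http://example.com/table.html")

; Index 0 selects the first <table> element on the page
Local $oTable = _IETableGetCollection($oIE, 0)

; True transposes the result so that array rows correspond to table rows
Local $aTable = _IETableWriteToArray($oTable, True)
_ArrayDisplay($aTable, 'Table content')

_IEQuit($oIE)
```

This reads the cell text from the rendered table, so no regular expression on the raw HTML is needed.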

wimhek · March 27, 2013

Thanks, I will try that.

PhoenixXL · March 28, 2013

#include <Array.au3>

$sString = "<td NOWRAP>cel1</td><td NOWRAP>cel2</td><td NOWRAP>cel3</td><td>Help</td><td NOWRAP>cel4</td>"

$aReturn = StringRegExp($sString, '(?s)(?i)<td(?: NOWRAP)?>(.*?)</td>', 3)
_ArrayDisplay($aReturn)

; or else, to get only the Help cell

$aReturn = StringRegExp( $sString , '<td>(.*?)</td>', 3 )
_ArrayDisplay($aReturn)

Ask if you don't get the code

wimhek · April 1, 2013

Super, it works. I do not get the code, but that is my restriction :-)
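
For readers who also do not get the code, here is a short side-by-side sketch on a shortened version of the sample string. The variable names $aStrict and $aLoose are only illustrative. The key part is (?: NOWRAP)?: a non-capturing group (?: ... ) containing " NOWRAP", made optional by the trailing "?", so both <td NOWRAP> and <td> match.

```autoit
#include <Array.au3>

Local $sString = "<td NOWRAP>cel1</td><td>Help</td><td NOWRAP>cel4</td>"

; Original pattern: the literal text "<td NOWRAP>" is required,
; so the plain "<td>Help</td>" cell is skipped
Local $aStrict = StringRegExp($sString, '(?si)<td NOWRAP>(.*?)</td>', 3)
_ArrayDisplay($aStrict, 'NOWRAP required') ; cel1, cel4

; (?: NOWRAP)? makes the attribute optional without adding a capture group,
; so the returned array still holds only the cell contents
Local $aLoose = StringRegExp($sString, '(?si)<td(?: NOWRAP)?>(.*?)</td>', 3)
_ArrayDisplay($aLoose, 'NOWRAP optional') ; cel1, Help, cel4
```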

kylomas · April 1, 2013

wimhek,

Here are a couple of alternatives

#include <Array.au3>
$sString = "<td NOWRAP>cel1</td><td NOWRAP>something before Help1</td><td NOWRAP>help,once again</td><br><td>Help</td><td NOWRAP>cel4</td>"
; to get everything that is not HTML
$aReturn = StringRegExp($sString, '(?si)>([^<].*?)<', 3)
_ArrayDisplay($aReturn, 'All NON-HTML')
; get any non-HTML that begins with the string "help"
$aReturn = StringRegExp($sString, '(?s)(?i)(help.*?)<', 3)
_ArrayDisplay($aReturn, 'Help Only')
;==================================================================================
;
; REGEXP Experts - How would I get any non-HTML that contains the string "help"?
;
; I've tried multiple variations of the following without success
;
;==================================================================================
$aReturn = StringRegExp($sString, '(?si)>([^>].*?help.*?)<', 3)
_ArrayDisplay($aReturn, 'Help Only')

@SRE Experts - I can't figure out how to get the third example to work. I am trying to get any non-HTML containing a string.

kylomas

PhoenixXL · April 1, 2013

An Example

#include <Array.au3>
$sString = "<td NOWRAP>cel1</td><td NOWRAP>something before Help1</td><td NOWRAP>help,once again</td><td>Help</td><td NOWRAP>cel4</td>"
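; Flag 4 returns an array of arrays: $a[0] is the full match, $a[1] is the captured text
Local $a, $aReturn = StringRegExp($sString, '>([^<>]+)<', 4), $aRet[1]
For $i = 0 To UBound($aReturn) - 1
    $a = $aReturn[$i]
    ; StringInStr is case-insensitive by default
    If StringInStr($a[1], "help") Then _ArrayAdd($aRet, $a[1])
Next
_ArrayDelete($aRet, 0) ; drop the empty placeholder element
_ArrayDisplay($aRet)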

Direct Approach

#include <Array.au3>
$sString = "<td NOWRAP>cel1</td><td NOWRAP>something before Help1</td><td NOWRAP>help,once again</td><td>Help</td><td NOWRAP>cel4</td>"
$aReturn = StringRegExp($sString, '(?i)>([^<>]*?help[^<>]*?)<', 3)
_ArrayDisplay($aReturn)

Regards

Edited April 1, 2013 by PhoenixXL

kylomas · April 1, 2013

@PhoenixXL,

I see it now. I was negating the "<" and ">", but then matching on any char "." (which is probably contradictory).

Thanks,

kylomas

edit: additional question

This pattern also works

'(?si)>([^<>]*?help.*?)<'

Because the "<" is the first char encountered following "help"?

Edited April 1, 2013 by kylomas
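
A hedged sketch comparing the three patterns on a shortened sample string (the variable names are only illustrative). It suggests the guess is right: the lazy .*? after "help" stops at the first "<" it can reach, while [^<>]*? before "help" keeps the match from starting in an earlier cell. In the failing pattern, "." also matches "<" and ">", so the part before "help" can run across tags.

```autoit
#include <Array.au3>

Local $sString = "<td NOWRAP>cel1</td><td NOWRAP>something before Help1</td><td>Help</td>"

; Failing pattern: .*? before "help" may cross tags, so the first capture
; starts at "cel1" and runs through to "Help1"
Local $aCrossing = StringRegExp($sString, '(?si)>([^>].*?help.*?)<', 3)
_ArrayDisplay($aCrossing, 'Crosses tags')

; [^<>]*? cannot leave the current cell; the lazy .*? after "help"
; ends at the next "<"
Local $aLazy = StringRegExp($sString, '(?si)>([^<>]*?help.*?)<', 3)
_ArrayDisplay($aLazy, 'Lazy tail') ; something before Help1, Help

; A negated class on both sides states the same intent explicitly
Local $aNegated = StringRegExp($sString, '(?i)>([^<>]*help[^<>]*)<', 3)
_ArrayDisplay($aNegated, 'Negated class') ; something before Help1, Help
```

The lazy and negated-class versions differ only if a literal ">" appears in the text after "help"; the negated class rejects that case, while .*? accepts it.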
