StringRegExp problem


Let's say I have a big file (it's an HTML source), something like this:

bla bla bla

line2 sdfkjkdf

bla bla bla

href="http://www.example.com/bla-example/29-here/">' bla bla bla bla

another text

OK, I have to replace every "-" in this link with %20: href="http://www.example.com/29-bla-example/bla-here/">

Luckily, the "href="http://www.example.com/bla-example/" part is static, and it is always followed by a number (2 or 3 digits).

The output should look like this: href="http://www.example.com/29%20bla%20example/bla%20here/">

And it should be done with a regex.

Thanks in advance.

Link to comment
Share on other sites

Where exactly are you using the href? In a web page? If so, post the page so we can examine it.

Link to comment
Share on other sites

  • Moderators

Age old question... what have you tried? Not much for reading code requests personally.

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.

Link to comment
Share on other sites

For example, I tried this:

$text = 'dflgfdlògflkdlgdlfg bla bla bladsfkdf' & @CRLF & _
        'sksksksk line ldfldlfdl bla bla dsfdfdf bla' & @CRLF & _
        'href="http://www.example.com/29-bla-example/bla-here/">' & @CRLF & _
        'bla bla bla line line bla bla'


$NewText = StringRegExpReplace($text, '(href="http://www.example.com/)(\d{1,2})((-)([^-]+))+(/">)', '\1\2%20\5\6')
MsgBox(0, '', $NewText)

It fails because it strips "example/bla-".

I tried a lot of other combinations as well.
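
One likely reason the attempt above drops text, offered as a sketch rather than a definitive diagnosis: in PCRE, the engine behind StringRegExp, a capturing group that repeats keeps only the text of its last iteration, so \5 ends up holding just the final hyphen-separated piece. The input string and variable names below are made up for illustration.

```autoit
; Sketch: a repeated capturing group only remembers its last iteration.
; $sLink and $aGroups are illustrative names, not from the original post.
Local $sLink = '29-bla-example'
; Group 1 = the number; group 2 sits inside a repeated (?:...)+ group.
Local $aGroups = StringRegExp($sLink, '(\d+)(?:-([^-]+))+', 1)
If Not @error Then
    ; Group 2 matched twice ("bla", then "example"); only the last one survives.
    ConsoleWrite('Group 1: ' & $aGroups[0] & @CRLF) ; 29
    ConsoleWrite('Group 2: ' & $aGroups[1] & @CRLF) ; example
EndIf
```

Since the number of hyphens is not known in advance, a fixed set of back-references cannot rebuild every piece of the link.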

Link to comment
Share on other sites

With StringReplace you can replace whichever character you want.

; your text
$text = "http://www.example.com/29-bla-example/bla-here/"

; replace every "-" with "%20" and store the result in $s
$s = StringReplace($text,"-","%20")

; shows http://www.example.com/29%20bla%20example/bla%20here/
MsgBox(0,"",$s)

Now you just need to extract the URL from the text and you're done :D

Link to comment
Share on other sites

  • Moderators

There are too many variables to use RegExReplace (or I'm too tired to think of a way to do it) ... The reason I say that is, you don't know how many hyphens will be in that part of the string, so you could end up cutting off data with back-referencing.

If you want a regex solution, then use the regex to grab the data you want. Then replace the data with a couple of StringReplace calls.

Global $s_str = "Stuff before" & @CRLF & 'href="http://www.example.com/bla-example/29-here-you-are/">' & @CRLF & "Stuff after"
Global $s_out = $s_str
; Flag 3 returns all matches; each match adds 3 groups: whole link, static prefix plus number, remainder
Global $a_sre = StringRegExp($s_str, "(?i)((href=\x22http://www\.example\.com/bla-example/\w+)(.+?))/\x22", 3)

For $i = 0 To UBound($a_sre) - 1 Step 3
    ; Replace hyphens only in the remainder, then swap the rebuilt link into the output
    $s_out = StringReplace($s_out, $a_sre[$i], $a_sre[$i + 1] & StringReplace($a_sre[$i + 2], "-", "%20"))
Next
ConsoleWrite($s_out & @CRLF)
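
If a StringRegExpReplace-only approach is preferred, one option (a sketch, not something proposed in this thread) is to replace a single hyphen per link on each pass and repeat until nothing is left to replace. It assumes the link layout from the failed attempt earlier in the thread, a fixed prefix followed by a 2- or 3-digit number, and relies on @extended holding the number of replacements made. The sample text and variable names are hypothetical.

```autoit
; Sketch: replace one hyphen per link per pass, loop until no replacement happens.
Local $sText = 'Stuff before - keep' & @CRLF & _
        'href="http://www.example.com/29-bla-example/bla-here/">' & @CRLF & _
        'Stuff-after'
; Group 1 = prefix, number, and any non-hyphen, non-quote text before the next hyphen.
Local $sPattern = '(href="http://www\.example\.com/\d{2,3}[^"-]*)-'
Local $iCount
Do
    ; $1 re-inserts everything matched before the hyphen.
    $sText = StringRegExpReplace($sText, $sPattern, '$1%20')
    $iCount = @extended ; number of replacements made in this pass
Until $iCount = 0
ConsoleWrite($sText & @CRLF)
```

Because the character class excludes the double quote, the match cannot run past the end of the href value, so hyphens elsewhere in the text are left alone. The cost is one full pass over the text per hyphen in the longest link, plus a final pass that finds nothing.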


Link to comment
Share on other sites

I mistakenly believed StringReplace() was faster than StringRegExpReplace().

;
Local $text = 'dflgfdlògflkdlgdlfg bla bla bladsfkdf' & @CRLF & _
        'sksksksk line ldfldlfdl bla bla dsfdfdf bla' & @CRLF & _
        'href="http://www.example.com/29-bla-example/bla-here/">' & @CRLF & _
        'bla bla bla line line bla bla'
Local $NewText, $NewText1, $Diff, $Diff1, $Timer, $Timer1

;StringRegExpReplace
Local $Timer = TimerInit()
For $x = 1 To 10000
    $NewText = StringRegExpReplace($text, '(-)', '%20')
Next
$Diff = "StringRegExpReplace" & @CRLF & TimerDiff($Timer) & @CRLF & @CRLF & $NewText

;StringReplace
Local $Timer1 = TimerInit()
For $x = 1 To 10000
    $NewText1 = StringReplace($text, '-', '%20')
Next
$Diff1 = "StringReplace" & @CRLF & TimerDiff($Timer1) & @CRLF & @CRLF & $NewText1


MsgBox(0, 'Results', $Diff & @CRLF & @CRLF & @CRLF & @CRLF & $Diff1)
;
Link to comment
Share on other sites

  • Moderators

@Malkey, StringReplace() should be faster than StringRegExpReplace() in almost every circumstance. Remember, with StringReplace() you're always looking for literals. With StringRegExpReplace() you're typically looking for some kind of pattern, which slows it down.

Edit:

Try that test with a case-sensitive search (like you're doing with StringRegExpReplace()) and see if that is faster.

Edit2:

Here I did it:

StringRegExpReplace
173.363095030234

dflgfdlògflkdlgdlfg bla bla bladsfkdf
sksksksk line ldfldlfdl bla bla dsfdfdf bla
href="http://www.example.com/29%20bla%20example/bla%20here/">
bla bla bla line line bla bla

StringReplace
67.6549673212657

dflgfdlògflkdlgdlfg bla bla bladsfkdf
sksksksk line ldfldlfdl bla bla dsfdfdf bla
href="http://www.example.com/29%20bla%20example/bla%20here/">
bla bla bla line line bla bla


Link to comment
Share on other sites

Yes, I think this solution should work, thanks.
Link to comment
Share on other sites

  • 5 months later...

Whoa, these are my results:

StringRegExpReplace

511.232293489815

StringReplace - non case-sensitive using user locale:

1320.58934864627

StringReplace - case-sensitive:

196.782551972388

StringReplace - non case-sensitive:

221.04988203808

Zoicks! What a huge difference! What does the default "using user's locale" option actually do? Am I safe to stick with the faster non-case-sensitive option (2)? What situations might call for the default search (0)?
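
For reference, the modes being compared correspond to the fifth (casesense) parameter of StringReplace: 0 is the default case-insensitive comparison using the user's locale, 1 is case-sensitive, and 2 is a basic case-insensitive comparison. A rough timing sketch along the lines of the earlier benchmark might look like this. The loop count and variable names are arbitrary, and timings will vary by machine.

```autoit
; Sketch: time StringReplace with each casesense value (0, 1, 2).
Local $sText = 'href="http://www.example.com/29-bla-example/bla-here/">'
Local $sNew, $hTimer
For $iCase = 0 To 2
    $hTimer = TimerInit()
    For $i = 1 To 10000
        ; 4th parameter 0 = replace all occurrences, 5th = casesense mode
        $sNew = StringReplace($sText, '-', '%20', 0, $iCase)
    Next
    ; TimerDiff returns elapsed milliseconds
    ConsoleWrite('casesense=' & $iCase & ': ' & TimerDiff($hTimer) & ' ms' & @CRLF)
Next
```

Since "-" has no upper or lower case, all three modes produce the same output here. The choice only affects the result when the search string contains letters.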

Link to comment
Share on other sites
