qwert

Need help with StringRegExpReplace


I've been trying to build a RegEx pattern that removes all "<script>" blocks from a page of HTML.

The RegEx Toolkit has been a great help ... but I've reached the point where my knowledge of RegEx runs out.

Can someone explain why the expression in this example doesn't find (and replace) both of the script statements?

Thanks in advance for any help.

[Attached screenshot: RegEx.PNG]


@qwert
Something like this:

#include <MsgBoxConstants.au3>
#include <StringConstants.au3>

Global $strString = 'This is HTML!' & @CRLF & _
                    '<script>' & @CRLF & _
                    'document.getElementById("demo").innerHTML = "Hello JavaScript!";' & @CRLF & _
                    '</script>' & @CRLF & _
                    'This is more HTML!'

MsgBox($MB_ICONINFORMATION, "Before:", $strString)

; [^<]* matches any run of characters other than "<", line breaks included
$strString = StringRegExpReplace($strString, '<script>[^<]*<\/script>', '[Replaced]')

MsgBox($MB_ICONINFORMATION, "After:", $strString)

:)



It always helps when you post a scriptlet with the input data and your current source, so others can play with it and modify it.

Jos


@Francesco: Well, almost!  When I try it in RegEx Toolkit, it replaces the first occurrence, but not the second.

Quote

This is a test.
  [replaced]
  <script src='https://www.google.com/recaptcha/api.js'></script>
</head>
<body class="with-hero ">
Now is the time.

Any ideas?

 


I think I found it. The opening tag in the pattern was defined with a hard ">" as its last character, so it could not match a script tag that carries attributes such as src.

I changed to this and it works for both cases:  <script[^<]*<\/script>
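For anyone who wants to reproduce this outside the Toolkit, here is a minimal sketch comparing the two patterns. The sample text is only modeled on the quoted fragment (the first script body is made up for illustration), and the counts are read from @extended, which StringRegExpReplace sets to the number of replacements performed.

```autoit
; Sample input (an assumption, modeled on the quoted fragment): one bare
; <script> tag and one <script> tag that carries a src attribute.
Global $sSample = 'This is a test.' & @CRLF & _
        '<script>var x = 1;</script>' & @CRLF & _
        "<script src='https://www.google.com/recaptcha/api.js'></script>" & @CRLF & _
        'Now is the time.'

; Original pattern: needs the literal text "<script>", so the tag with
; attributes is left untouched.
Global $sStrict = StringRegExpReplace($sSample, '<script>[^<]*<\/script>', '[Replaced]')
Global $iStrict = @extended ; number of replacements performed

; Relaxed pattern: "<script" followed by any run of characters other than "<",
; which also covers the attributes and the closing ">" of the opening tag.
Global $sLoose = StringRegExpReplace($sSample, '<script[^<]*<\/script>', '[Replaced]')
Global $iLoose = @extended

ConsoleWrite('Strict pattern replacements:  ' & $iStrict & @CRLF) ; 1
ConsoleWrite('Relaxed pattern replacements: ' & $iLoose & @CRLF) ; 2
ConsoleWrite($sLoose & @CRLF)
```

Note that the relaxed pattern still stops at the first "<" it meets, so a script body that itself contains "<" will not be matched.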

I appreciate your help!

 

@Jos: yes, I see the value in doing that. But since I was already using the Toolkit, I thought there might be a benefit in showing others that it exists.


@mikell: Yes, indeed. When I tried the first expression in my full script, it only worked for scripts on a single line.

The expression you provided caught all occurrences.

Thanks for posting. I appreciate your help.
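The expression referred to above is not quoted in this thread, so the following is only a sketch of one common approach and should not be read as that expression. It assumes the failures came from script bodies containing a "<" character, which [^<]* cannot cross, and uses a lazy match with the (?s) and (?i) options instead. The sample HTML is made up for illustration.

```autoit
; Hypothetical sample: the script body spans several lines and contains "<".
Global $sPage = '<p>Before</p>' & @CRLF & _
        '<script type="text/javascript">' & @CRLF & _
        'if (a < b) { document.write("<b>hi</b>"); }' & @CRLF & _
        '</script>' & @CRLF & _
        '<p>After</p>'

; The [^<]* pattern cannot get past the "<" inside the script body.
Global $sUnchanged = StringRegExpReplace($sPage, '<script[^<]*<\/script>', '')
Global $iNegated = @extended ; 0 replacements for this sample

; (?i) ignores case, (?s) lets "." match line breaks, and ".*?" is lazy,
; so each match ends at the first closing </script> tag.
Global $sClean = StringRegExpReplace($sPage, '(?is)<script\b.*?<\/script>', '')
Global $iLazy = @extended ; 1 replacement for this sample

ConsoleWrite('Negated-class pattern replacements: ' & $iNegated & @CRLF)
ConsoleWrite('Lazy pattern replacements:          ' & $iLazy & @CRLF)
ConsoleWrite($sClean & @CRLF)
```

A regular expression like this is fine for quick clean-up, but it would still be fooled by a literal "</script>" inside a script string, so treat it as a starting point and test it against your own pages.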

 

