
StringRegExpReplace to correct HTML tag


Solved by PhoenixXL

  • Moderators

luckyluke,
 
Why use an SRER?  You can do it very simply like this: :)

$sString = "< / li  >  < /  li  >  < /li > < / li> < / li  > </ li  >"

; Replace all spaces and then add one between the ><
$sNewString = StringReplace(StringReplace($sString, " ", ""), "><", "> <")

ConsoleWrite($sNewString & @CRLF)

Good enough?  Or is there something you have not told us? :huh:

M23

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind

My UDFs:

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ---------- Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 

  • Solution

The requirement sounds incomplete. I hope there aren't any more cases you have left out.

Here is what you asked for:

; Working: match any closing li tag written with stray spaces, e.g. "< / li >", and replace it with "</li>"
$string = StringRegExpReplace("< / li  >  < /  li  >  < /li > < / li> < / li  > </ li  >", "<[^>/]*/[^>]*?li[^>]*>", "</li>")
MsgBox(64, "", $string)

Regards :)

My code:

PredictText: Predict Text of an Edit Control Like Scite. Remote Gmail: Execute your Scripts through Gmail. StringRegExp:Share and learn RegExp.

Run As System: A command line wrapper around PSEXEC.exe to execute your apps scripts as System (LSA). Database: An easier approach for _SQ_LITE beginners.

MathsEx: A UDF for Fractions and LCM, GCF/HCF. FloatingText: An UDF for make your text floating. Clipboard Extendor: A clipboard monitoring tool. 

Custom ScrollBar: Scroll Bar made with GDI+, user can use bitmaps instead. RestrictEdit_SRE: Restrict text in an Edit Control through a Regular Expression.


M23, the StringReplace approach no longer works once I use it on real HTML code, because it also strips every space from the text, e.g.:

<BR>new ControlPanel 
<LI>perfect . < / Li >
<LI>perfect Taskbar Button Position . < / Li >Improved 
<LI>StartMenu . < / Li >
<LI>Improved maximum , minimum , close button . < / Li >
<LI>New Improved Taskbar Improved 
<LI><A href="" target=_blank>. Control Panel </A>View 
<LI>Improved address bar 
<LI>Adressbar Improved text size . . . < / Li >
<LI>Adressbar Glow Improved Text 
<UL></UL><BR>

PhoenixXL, I think your pattern is what I wanted. StringRegExp is so difficult to learn. I will test this with other HTML tags.

Thank you very much!
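
Note that the closing tags in the HTML sample are written "Li", while the pattern in the solution is case-sensitive and looks for lowercase "li". Below is a minimal sketch of a stricter variant, assuming the goal is only to tidy closing tags that contain stray whitespace. The (?i) flag, the \s* spacing, the generic (\w+) variant, and the variable names are illustrative assumptions, not code from the thread.

```autoit
; Two lines adapted from the HTML sample in the thread (mixed-case "Li")
Local $sHtml = "<LI>perfect . < / Li >" & @CRLF & "<LI>StartMenu . < / Li >"

; (?i) makes the match case-insensitive; \s* allows any whitespace
; around the slash and the tag name, but nothing else inside the tag
Local $sLiOnly = StringRegExpReplace($sHtml, "(?i)<\s*/\s*li\s*>", "</LI>")

; Generic variant: capture the tag name and reuse it via the $1 back-reference
Local $sAnyTag = StringRegExpReplace($sHtml, "(?i)<\s*/\s*(\w+)\s*>", "</$1>")

ConsoleWrite($sLiOnly & @CRLF)
ConsoleWrite($sAnyTag & @CRLF)
```

Unlike the StringReplace approach, this leaves the spaces in the text content untouched. Because only whitespace is allowed inside the tag, something like <a href="/list"> should not be matched by mistake.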

  • Moderators

LuckyLuke,

So, as both PhoenixXL and I suspected, there was more to your request than you initially stated. In future, please make your questions clear from the beginning - then we do not waste time producing code which does not fulfil the actual requirement. ;)

M23


I just tested this simpler version, which works fine too.
I'm no RegExp guru, so I don't know whether this simpler solution has any disadvantages or bugs.

$string = "a <1 /2 li 3 > 4 < /  li 5 > 6 < /li > < / li> < / li  > </ li  > < / li  >  < /  li  >  < /li > < / li> < / li  > </ li  >"
$string = StringRegExpReplace($string, "(<.*?/.*?li.*?>)", '</li>')
ConsoleWrite($string & @CRLF)

@Zedna,

Your method doesn't have any bugs; it works as expected on this test string. However, lazy quantifiers make the engine do a lot of backtracking (at least in this scenario), so I prefer greedy quantifiers.
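
The difference between the two patterns shows up once the input contains other tags. Here is a small sketch using a made-up sample string (an assumption, not taken from the thread): the negated classes [^>] cannot run past a ">", whereas the lazy .*? can, so the second pattern may swallow a preceding tag and its text.

```autoit
; Hypothetical sample: a well-formed tag pair followed by a malformed closing tag
Local $sHtml = "<b>bold</b> item < / li >"

; Negated character classes: the match cannot extend across a ">"
Local $sClass = StringRegExpReplace($sHtml, "<[^>/]*/[^>]*?li[^>]*>", "</li>")

; Lazy dots: the match starts at the first "<" and may extend across ">"
Local $sLazy = StringRegExpReplace($sHtml, "<.*?/.*?li.*?>", "</li>")

ConsoleWrite($sClass & @CRLF) ; <b>bold</b> item </li>
ConsoleWrite($sLazy & @CRLF)  ; </li>
```

On the test strings used in the thread both patterns give the same result, since those strings contain nothing but closing li tags.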
