Iczer
Posted October 14, 2020

I need to convert a string containing a number of repeated items that is not known in advance into another string using StringRegExpReplace().

From:
<div class='tags'><h3>Tags</h3><ul><li>aaa</li><li>bbb</li><li>ccc</li><li>ddd</li><li>eee</li></ul></div>

To:
aaa, bbb, ccc, ddd, eee

But I still cannot grasp repeating patterns in StringRegExpReplace(). With an array it is simple, but with a string... My attempt so far:

$sTags = StringRegExpReplace($aEntry[$i],"(?si).+?<h3>Tags</h3><ul>((<li>\w++</li>)\K).++","$2")
seadoggie01
Posted October 14, 2020 (edited)

This mostly satisfies your request, but leaves a trailing ", ":

$sTags = StringRegExpReplace($aEntry[$i], "(?:.*?)<li>(.*?)<\/li>(?:(?!<\/?li>).)*", "$1, ")

Explanation:

(?:.*?)<li>(.*?)<\/li>(?:(?!<\/?li>).)*

(?:.*?) - Matches (without capturing) everything before the next <li> tag, only expanding as needed (lazy)
<li>(.*?)<\/li> - Captures everything between the <li> and </li> tags, only expanding as needed (lazy)
(?:(?!<\/?li>).)* - The beast... uses a negative lookahead to consume characters only while no <li> or </li> tag starts at the current position, so it stops before the next item and, after the final </li>, swallows the rest of the string

Personally though, I would match everything between <li> and </li> tags with StringRegExp option 3 (return an array of all matches) and concatenate the results back together:

Func ConcatRegExp($sText)
    ; Flag 3 returns an array holding every captured group from all matches
    Local $aResults = StringRegExp($sText, "<li>(.*?)<\/li>", 3)
    ; @error is set when nothing matched (or the pattern is invalid)
    If @error Then Return SetError(1, 0, False)
    Local $sReturn = ""
    For $i = 0 To UBound($aResults) - 1
        $sReturn &= $aResults[$i] & ", "
    Next
    ; Drop the final ", " appended by the loop
    Return StringTrimRight($sReturn, 2)
EndFunc

Edited October 14, 2020 by seadoggie01: Added explanation

All my code provided is Public Domain... but it may not work. Use it, change it, break it, whatever you want.

My Humble Contributions:
Personal Function Documentation - A personal HelpFile for your functions
Acro.au3 UDF - Automating Acrobat Pro
ToDo Finder - Find #ToDo: lines in your scripts
UI-SimpleWrappers UDF - Use UI Automation more Simply-er
KeePass UDF - Automate KeePass, a password manager
InputBoxes - Simple Input boxes for various variable types
Iczer (Author)
Posted October 15, 2020

Thanks! This leaves a trailing "</li>" in the resulting string, but it is still a good result, as it is faster than having to deal with an array after StringRegExp().
seadoggie01
Posted October 15, 2020

Do you have different data? When I try it, I don't get a trailing tag:

ConsoleWrite(StringTrimRight(StringRegExpReplace("<div class='tags'><h3>Tags</h3><ul><li>aaa</li><li>bbb</li><li>ccc</li><li>ddd</li><li>eee</li></ul></div>", "(?:.*?)<li>(.*?)<\/li>(?:(?!<\/?li>).)*", "$1, "), 2) & @CRLF)
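
For the array-free result asked for above, one possible alternative avoids the trailing ", " altogether by joining first and trimming second. This is only a sketch under some assumptions that the thread does not confirm: the items sit directly next to each other ("</li><li>" with no whitespace or attributes in between), and the string contains a single list. The variable names $sHtml, $sJoined and $sTags are purely illustrative.

```autoit
; Sample input copied from the first post (assumed to be representative of the real data).
Local $sHtml = "<div class='tags'><h3>Tags</h3><ul><li>aaa</li><li>bbb</li><li>ccc</li><li>ddd</li><li>eee</li></ul></div>"

; Step 1: turn every "</li><li>" boundary between adjacent items into the separator.
Local $sJoined = StringReplace($sHtml, "</li><li>", ", ")

; Step 2: strip everything up to and including the first <li>,
; and everything from the one remaining </li> to the end of the string.
; (?s) lets the dot cross line breaks in case the wrapper markup spans several lines.
Local $sTags = StringRegExpReplace($sJoined, "(?s)^.*?<li>|</li>.*$", "")

ConsoleWrite($sTags & @CRLF)
```

With the sample string this should give aaa, bbb, ccc, ddd, eee without needing StringTrimRight(). If the real data has whitespace between items or attributes on the <li> tags, the StringReplace() step would have to become a StringRegExpReplace() with a looser pattern.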