IAMK Posted February 17, 2019 (edited)

How I currently write HTML_INNERTEXT to a text file without editing it:

    After this is a link!!! <a href="https://imgur.com/a/u9E0Srf" target="_blank">https://imgur.com/a/u9E0Srf</a><br>Before this was a link!!! //Note: This is 1 line.

Goal is to have this:

    After this is a link!!! https://imgur.com/a/u9E0Srf
    Before this was a link!!! //Note: This is 2 lines.

I'm trying to do something along the lines of the following, but am stuck on the second-to-last line of the function:

    Func cleanup($temp) ;Turns special characters back to their real symbols.
        Local $cleanupResult = $temp
        $cleanupResult = StringReplace($cleanupResult, "<br>", @CRLF)
        $cleanupResult = StringReplace($cleanupResult, ">", ">")
        $cleanupResult = StringReplace($cleanupResult, "<", "<")
        $cleanupResult = StringReplace($cleanupResult, "&", "&")

        Local $tempURL = $cleanupResult
        $tempURL = StringSplit($tempURL, '<a href="', 1)
        $tempURL = StringSplit($tempURL[1], '"', 1)
        $cleanupResult = StringReplace($cleanupResult, '<a href' UNTIL '</a>', $tempURL[0], 1) ;How do I do something like this?
        Return $cleanupResult
    EndFunc

Or perhaps there's a MUCH simpler way of doing this?

FYI: Where you see ">", ">" etc., it's actually changing & gt ; to >, but the formatting on this site hides it.

Edited February 17, 2019 by IAMK: Added FYI
FrancescoDiMuro Posted February 17, 2019 (edited)

@IAMK Something like this should do the trick, as long as the lines in your file follow this pattern:

    #include <Array.au3>
    #include <StringConstants.au3>

    Global $strString = 'After this is a link!!! <a href="https://imgur.com/a/u9E0Srf" target="_blank">https://imgur.com/a/u9E0Srf</a><br>Before this was a link!!! //Note: This is 1 line.' & @CRLF & _
                        'After this there is another link!!! <a href="https://imgur.com/a/something" target="_blank">https://imgur.com/a/something</a><br>Before this was another link!!! //Note: This is 2 line.', _
           $arrSplit

    $arrSplit = StringSplit($strString, @CRLF, $STR_ENTIRESPLIT)

    _ArrayDisplay($arrSplit)

    For $i = 1 To $arrSplit[0] Step 1
        ConsoleWrite(StringRegExpReplace($arrSplit[$i], '([^<]+)(?=<a href[^>]+>)<a[^>]+>([^<]+)<\/a>(?<=<\/a>)([^\/]{2})(?=\/{2})(.*)', "$1$2" & @CRLF & "$3$4") & @CRLF)
    Next

Edited February 17, 2019 by FrancescoDiMuro
mikell Posted February 17, 2019

If the provided example is representative, this should work (remove all tags, keep newlines):

    $s = 'After this is a link!!! <a href="https://imgur.com/a/u9E0Srf" target="_blank">https://imgur.com/a/u9E0Srf</a><br>Before this was a link!!! '
    $r = StringRegExpReplace(StringReplace($s, '<br>', @crlf), '(<.*?>)+', " ")
    Msgbox(0,"", $r)
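A minimal sketch of how the one-liner above might be folded back into the original cleanup function. The name `_CleanupHtml` is hypothetical, and the sketch assumes the input only contains `<br>`, anchor tags, and the three entities mentioned in the first post. Decoding the entities after the tags are stripped is a design choice made here, not something stated in the thread: it keeps a decoded `<` from being mistaken for the start of a tag.

```autoit
; Sketch only: _CleanupHtml is a hypothetical name, not taken from the thread.
Func _CleanupHtml($sHtml)
    ; Turn explicit line breaks into real newlines first.
    Local $sText = StringReplace($sHtml, "<br>", @CRLF)
    ; Replace every remaining run of tags with a single space.
    $sText = StringRegExpReplace($sText, "(<.*?>)+", " ")
    ; Decode entities last, and "&amp;" at the very end,
    ; so that "&amp;lt;" ends up as "&lt;" rather than "<".
    $sText = StringReplace($sText, "&gt;", ">")
    $sText = StringReplace($sText, "&lt;", "<")
    $sText = StringReplace($sText, "&amp;", "&")
    Return $sText
EndFunc

ConsoleWrite(_CleanupHtml('After this is a link!!! <a href="https://imgur.com/a/u9E0Srf" target="_blank">https://imgur.com/a/u9E0Srf</a><br>Before this was a link!!! ') & @CRLF)
```

Because the tags are replaced with a space, the result keeps a space on either side of the link text; swap the `" "` for `""` if that is not wanted.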
IAMK (Author) Posted February 17, 2019

@FrancescoDiMuro That regex is too complex for me to understand :s Also, why is $arrSplit appended to the string? I'm not sure what that does, since $arrSplit has not been defined yet.

@mikell Jesus, you and your regex... I laughed as soon as I saw your name, expecting regex. I'm not sure if something like X < Y will ever appear, but I'll keep your code in mind in case I become sure it won't.
FrancescoDiMuro Posted February 17, 2019

@IAMK $arrSplit is used as a container for the string after it has been split on @CRLF. Then, for each line, the text is formatted into the expected output.
Dwalfware Posted February 17, 2019

$arrSplit is an array. You read it by index, like a cell in an Excel sheet, for example $arrSplit[0]. Do a ConsoleWrite on it.
Dwalfware Posted February 17, 2019

    _ArrayDisplay($arrSplit)

This shows you what is inside the array. It's more of a debug tool for examining the array.
iamtheky Posted February 17, 2019

I suppose this is similar to @mikell's solution, but it also increments the value of the number of lines:

    $str = 'After this is a link!!! <a href="https://imgur.com/a/u9E0Srf" target="_blank">https://imgur.com/a/u9E0Srf</a><br>Before this was a link!!! //Note: This is 1 line.'
    msgbox(0, '' , StringRegExpReplace(stringreplace(stringreplace($str , '<br>' , @LF) , "1 line" , @extended + 1 & " line(s)") , '<.*?>' , ''))
mikell Posted February 17, 2019

4 hours ago, IAMK said: I'm not sure if something like X < Y will ever appear

We can deal with that:

    $s = 'After this is a link!!! x<y <a href="https://imgur.com/a/u9E0Srf" target="_blank">https://imgur.com/a/u9E0Srf</a>p>q<br>Before this was a link!!! '
    $r = StringRegExpReplace(StringReplace($s, '<br>', @crlf), '(<(?![^>]*<).*?>)+', " ")
    Msgbox(0,"", $r)

But obviously "generic" patterns have limits. Let's say you got this:

    $s = 'After this is a link!!! x<y and a>b <a href="https://imgur.com/a/u9E0Srf" target="_blank">https://imgur.com/a/u9E0Srf</a><br>Before this was a link!!! '

How would you explain to the regex that "<y and a>" is not a tag? In this case you must use a specific regex like the one from FrancescoDiMuro... or a non-regex way.
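One possible shape for the "non-regex way", offered as a sketch rather than as anyone's posted solution. It does what the pseudocode line in the first post was reaching for: it replaces everything from `<a href="` up to and including `</a>` with the URL found in the href attribute. The function name `_AnchorsToUrls` is hypothetical, and the sketch assumes every anchor starts with exactly `<a href="` and is closed by `</a>`.

```autoit
; Hypothetical helper: replaces each <a href="...">...</a> element with its href value.
Func _AnchorsToUrls($sHtml)
    Local $sOpen = '<a href="', $sClose = "</a>"
    Local $iStart, $iUrlEnd, $iEnd, $sUrl
    While 1
        $iStart = StringInStr($sHtml, $sOpen)
        If $iStart = 0 Then ExitLoop ; no anchors left
        ; The URL runs from just after the opening marker to the next double quote.
        $iUrlEnd = StringInStr($sHtml, '"', 0, 1, $iStart + StringLen($sOpen))
        If $iUrlEnd = 0 Then ExitLoop
        $iEnd = StringInStr($sHtml, $sClose, 0, 1, $iUrlEnd)
        If $iEnd = 0 Then ExitLoop ; unterminated anchor: leave the rest untouched
        $sUrl = StringMid($sHtml, $iStart + StringLen($sOpen), $iUrlEnd - $iStart - StringLen($sOpen))
        ; Keep the text before the anchor, the URL, and the text after the closing tag.
        $sHtml = StringLeft($sHtml, $iStart - 1) & $sUrl & StringMid($sHtml, $iEnd + StringLen($sClose))
    WEnd
    Return $sHtml
EndFunc

Local $s = 'After this is a link!!! x<y and a>b <a href="https://imgur.com/a/u9E0Srf" target="_blank">https://imgur.com/a/u9E0Srf</a><br>Before this was a link!!! '
ConsoleWrite(StringReplace(_AnchorsToUrls($s), "<br>", @CRLF) & @CRLF)
```

Since only the literal markers are searched for, text such as `x<y and a>b` is never treated as a tag. The trade-off is that any anchor written differently (extra attributes before href, single quotes, other spacing) is left as it is.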
Dwalfware Posted February 17, 2019

It's JSON. Have you thought of using JSMN.au3?
IAMK (Author) Posted February 17, 2019

@FrancescoDiMuro @Dwalfware I understand how StringSplit() and $arrSplit[0] work. I meant this part:

    Global $strString = 'After this is a link!!! <a href="https://imgur.com/a/u9E0Srf" target="_blank">https://imgur.com/a/u9E0Srf</a><br>Before this was a link!!! //Note: This is 1 line.' & @CRLF & _
                        'After this there is another link!!! <a href="https://imgur.com/a/something" target="_blank">https://imgur.com/a/something</a><br>Before this was another link!!! //Note: This is 2 line.', _
           $arrSplit

@iamtheky Thanks, but I don't require this. I only wrote it because this website word-wraps there.

@mikell Yes, that's what I was getting at. Thanks.
FrancescoDiMuro Posted February 17, 2019

@IAMK That statement declares the "source" string and an array. What more is there to say?
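To make the point about the declaration concrete, here is a small sketch with hypothetical variable names (`$sSource`, `$aLines`). In a single Global statement the comma separates two declarations, and the trailing ` _` only continues the statement onto the next line, so nothing is appended to the string.

```autoit
; One Global statement, two variables.
; $sSource gets the two-line string; $aLines is only declared here.
Global $sSource = "first line" & @CRLF & _
        "second line", _
        $aLines

; Flag 1 splits on the whole delimiter string (@CRLF) instead of on each character.
$aLines = StringSplit($sSource, @CRLF, 1)

; Element 0 holds the number of parts; the parts themselves start at index 1.
For $i = 1 To $aLines[0]
    ConsoleWrite($i & ": " & $aLines[$i] & @CRLF)
Next
```

Written out as two separate statements, `Global $sSource = ...` followed by `Global $aLines`, the script behaves the same way.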
Dwalfware Posted February 18, 2019

So something like this then:

    #include <Array.au3>
    #include <String.au3>

    $String = '<a href="https://imgur.com/a/u9E0Srf" target="_blank">https://imgur.com/a/u9E0Srf</a><br>'
    Local $aArray = _StringBetween($String, '<a href="', '"')
    _ArrayDisplay($aArray)
    For $xi = 0 to UBound($aArray) - 1
        ConsoleWrite("After this is a link!!! " & $aArray[$xi] & " Before this was a link!!! //Note: This is " & $xi & " lines." & @CRLF)
    Next