Deye Posted September 11, 2020 (edited)

I'm trying to pull just the story text from a page so I can paste it into a text reader. Any ideas on how to get the output cleaner? Thanks.

    #include <Array.au3> ; provides _ArrayToString

    $url = "https://www.newframe.com/the-long-term-effects-of-covid-19/"
    Global $InetRead = InetRead($url, 1) ; 1 = force a reload from the remote site
    Global $text = BinaryToString($InetRead)
    $p1 = '(?<=>).*(?=</p>)' ; text after a ">" up to the last "</p>" on each line
    $p2 = '<[^>]*>' ; leftover tags; PCRE lookbehinds must be fixed-length, so match the tags directly
    $text = StringRegExpReplace(_ArrayToString(StringRegExp($text, $p1, 3), @CRLF), $p2, "")
    $text = _ArrayToString(StringRegExp($text, '[\w\d\,\.\!\#]++', 3), " ") ; keep words and basic punctuation
    ConsoleWrite($text & @CRLF)

Edit: Got something made that seems to do just fine. Thanks.

Deye
faustf Posted September 13, 2020

Try the _IEBodyReadText UDF; look it up in the help file (F1).
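
For anyone following the suggestion, the body-text function in IE.au3 is named _IEBodyReadText. Below is a minimal sketch of how it might be used, assuming the same article URL as in the first post and a Windows machine where Internet Explorer automation still works. The variable names are illustrative and not from the thread.

```autoit
#include <IE.au3>

; Assumption: same article URL as in the first post.
Local $sUrl = "https://www.newframe.com/the-long-term-effects-of-covid-19/"

; Open the page in a hidden Internet Explorer instance (no attach, not visible)
; and wait for it to finish loading.
Local $oIE = _IECreate($sUrl, 0, 0)
If @error Then
    ConsoleWrite("Could not create the browser object." & @CRLF)
    Exit 1
EndIf

; _IEBodyReadText returns the text of the document body with the markup removed.
Local $sText = _IEBodyReadText($oIE)
_IEQuit($oIE)

ConsoleWrite($sText & @CRLF)
```

This returns the text of the whole page body, so menus and footers will probably come along with the story and may still need trimming.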