John117 Posted April 28, 2009 (edited)
Hey, I need to strip out the <>'s and everything inside them. Please show me how if you can. I have tried string replace, but it needs a wildcard.

    #include <array.au3>
    #include <string.au3>
    #include <IE.au3>

    $Subject = "Male"
    $oIE = _IECreate("http://thesaurus.reference.com/browse/" & $Subject, 0, 1, 1, 1)
    $sHTML = _IEDocReadHTML($oIE)
    _Run1()

    Func _Run1()
        $aNameArray = _StringBetween($sHTML, '<b>Synonyms:</b></td>', '</span></td>')
        _ArrayDisplay($aNameArray)
    EndFunc

    _IEQuit($oIE)
    ;need to strip everything within and including <> and strip *

Edited May 2, 2009 by John117

John117 (Author) Posted April 28, 2009
Solved:

    #include <array.au3>
    #include <string.au3>
    #include <IE.au3>

    Dim $aNameArray
    $Subject = "Male"
    $oIE = _IECreate("http://thesaurus.reference.com/browse/" & $Subject, 0, 1, 1, 1)
    $sHTML = _IEDocReadHTML($oIE)
    _Run1()

    Func _Run1()
        $aNameArray = _StringBetween($sHTML, '<b>Synonyms:</b></td>', '</span></td>')
        For $i = 0 To UBound($aNameArray) - 1
            ; Remove every <...> tag (lazy match up to the nearest ">")
            $aNameArray[$i] = StringRegExpReplace($aNameArray[$i], '<(.*?)>', "", 0)
        Next
    EndFunc

    _ArrayDisplay($aNameArray)
    _IEQuit($oIE)
    ;need to strip everything within and including <> and strip *

SmOke_N (Moderators) Posted April 28, 2009
You could also do something like:

    _get_synonyms($sHTML)

    Func _get_synonyms($s_html)
        ; Drop the Antonyms block first so only synonym links remain
        $s_html = StringRegExpReplace($s_html, "(?s)(?i)<b>Antonyms:</b></td>.*?</span></td>", "")
        ; Flag 3 returns an array of all captured words from the browse links
        Local $a_name_array = StringRegExp($s_html, "(?s)(?i)\x22http://thesaurus.reference.com/browse/(\w+)\x22", 3)
        _ArrayDisplay($a_name_array)
    EndFunc

Valuater Posted April 28, 2009
Well... here's my approach...

    #include <array.au3>
    #include <string.au3>
    #include <IE.au3>

    $Subject = "Male"
    $oIE = _IECreate("http://thesaurus.reference.com/browse/" & $Subject, 0, 1, 1, 1)
    $sHTML = _IEDocReadHTML($oIE)
    _Run1()

    Func _Run1()
        $aNameArray = _StringBetween($sHTML, '<b>Synonyms:</b></td>', '</span></td>')
        For $x = 0 To UBound($aNameArray) - 1
            $aNameArray[$x] = StringReplace($aNameArray[$x], "<TD><SPAN>", "")
            $aNameTemp = ""
            ; Split into lines, then keep only the text between ">" and "</A>" on each
            $split = StringSplit($aNameArray[$x], @CRLF)
            For $i = 1 To UBound($split) - 1
                $aNameTemp &= __StringBetween($split[$i], ">", "</A>")
            Next
            $aNameArray[$x] = $aNameTemp
        Next
        _ArrayDisplay($aNameArray)
    EndFunc   ;==>_Run1

    _IEQuit($oIE)
    ;need to strip everything within and including <> and strip *

    ; Return the text between the first $s_Start and the first $s_End
    Func __StringBetween($s_String, $s_Start, $s_End = 0)
        $s_Start = StringInStr($s_String, $s_Start) + StringLen($s_Start)
        Return StringMid($s_String, $s_Start, StringInStr($s_String, $s_End) - $s_Start)
    EndFunc   ;==>__StringBetween

... a little late, but it works 8)

DaleHohm Posted April 28, 2009
I don't understand:

    ;need to strip everything within and including <> and strip *

What do you mean by "strip *"?

Dale

John117 (Author) Posted May 2, 2009
Sorry guys, after marking it as solved I didn't check back on it. As soon as I get to my home PC (which has AutoIt) I will test out your methods to see what I can learn from them.

@Dale, by 'strip' I just meant remove: remove anything like <this>, including "this", "<" and ">", and also "*".

Ex: <this>some stuff*<that><and the other>
Result: some stuff

I am just learning 'StringRegExpReplace', so it took me a while to come up with something.
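
For anyone landing here later, the Ex/Result pair in the last post can be sketched in a few lines. This is only a minimal illustration, not what anyone in the thread posted: `_StripTagsAndStars` is a made-up helper name (it is not part of any UDF), and the pattern is one possible choice among several.

```autoit
; Minimal sketch: remove every <...> tag and every literal * from a string.
; _StripTagsAndStars is a hypothetical helper name used only for illustration.
Func _StripTagsAndStars($sText)
    ; <[^>]*>  matches "<", any run of characters that are not ">", then ">"
    ; \*       matches a literal asterisk
    ; With the default count of 0, all matches are replaced.
    Return StringRegExpReplace($sText, "<[^>]*>|\*", "")
EndFunc   ;==>_StripTagsAndStars

; The example string from the post above
ConsoleWrite(_StripTagsAndStars("<this>some stuff*<that><and the other>") & @CRLF)
; prints: some stuff
```

The negated class `[^>]*` is used here instead of the lazy `(.*?)` from the solved post mainly because it needs no capture group and also matches tags that happen to span a line break, where a plain `.` would need the `(?s)` option. Either pattern should handle the single-line tags in this thread.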