zackrspv Posted March 9, 2008 (edited)

I have a source, pulled from a website, and I need to find every occurrence of <br /> and </p> so that I can pull the text. Ideally I would find ="def_p"> and then capture everything after it up to the next </p>, but that doesn't seem to work. Just finding <br /> and </p> works well, however, because only the text I want is wrapped in them.

So, did I type this right?

    StringRegExp($source, '(.*?)<[[br /]|[/p]]>', 1, $nOffset)

or is it:

    StringRegExp($source, '(.*?)<[br|/p]>', 1, $nOffset)

or is it:

    StringRegExp($source, '(.*?)[<br /> | </p>]', 1, $nOffset)

Thanks!

Edited April 10, 2008 by zackrspv

-_-------__--_-_-____---_-_--_-__-__-_ ^^
€ñ†®øÞÿ ë×阮§ wï†høµ† ƒë@®, wï†høµ† †ïmë, @ñd wï†høµ† @ †ïmïdï†ÿ ƒø® !ïƒë. €×阮 ñø†, bµ† ïñ§†ë@d wï†hïñ, ñ@ÿ, †h®øµghøµ† †hë 맧ëñ§ë øƒ !ïƒë.
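A note on the three attempts in the question: square brackets in a regular expression form a character class (a set of single characters), so none of them expresses "<br /> or </p>". Alternation between whole strings needs a group with | between the alternatives. The following is a minimal sketch, assuming a made-up sample string rather than the real page source; the variable names are illustrative only.

```autoit
; Sketch only: $sSample is a made-up string, not the real page source.
Local $sSample = 'first line<br />second line</p>'

; (?:<br />|</p>) is a non-capturing group that matches either whole tag.
; (.*?) lazily captures the text in front of it; (?s) lets . cross newlines.
; Flag 3 returns every capture of group 1 in a 0-based array.
Local $aParts = StringRegExp($sSample, '(?s)(.*?)(?:<br />|</p>)', 3)

If @error Then
    ConsoleWrite('No match' & @CRLF)
Else
    For $i = 0 To UBound($aParts) - 1
        ConsoleWrite($i & ': ' & $aParts[$i] & @CRLF)
    Next
EndIf
```

With flag 3 the manual $nOffset loop is not needed, because all matches come back in one call.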
Paulie Posted March 9, 2008

Can we see the website source you want to pull from? Maybe not the whole thing, just the relevant area.
Xand3r Posted March 9, 2008 (edited)

    $pat = "test1 <br />test" & @CRLF & "string and another " & @LF & " new " & @CR & " new line</p> test2"
    $reg = StringRegExp($pat, "(?s)<br \/>(.)+<\/p>", 2)
    MsgBox(0, "", StringTrimLeft(StringTrimRight($reg[0], 4), 6))

Explanation:
(?s) = . matches even newlines
\/ = matches a /
(.)+ = matches any character except newlines (but with the (?s) it matches newlines too)

Cheers :)

Edited March 9, 2008 by alexmadman

Only two things are infinite, the universe and human stupidity, and I'm not sure about the former. -Albert Einstein
Practice makes perfect! But nobody's perfect, so why practice at all?
http://forum.ambrozie.ro
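A side note on the pattern in the post above: when a one-character group is repeated, as in (.)+, the group only keeps the last character it matched, which is why that example has to trim the tags off the full match in $reg[0]. A minimal sketch of an alternative, assuming the same test string, puts the whole run inside one lazy group and reads it from element 1:

```autoit
; Same test string as in the post above; variable names are illustrative.
Local $sText = 'test1 <br />test' & @CRLF & 'string and another ' & @LF & ' new ' & @CR & ' new line</p> test2'

; Flag 2: element 0 is the full match, element 1 is capture group 1.
; (.+?) captures everything between <br /> and the nearest </p>.
Local $aMatch = StringRegExp($sText, '(?s)<br />(.+?)</p>', 2)

If Not @error Then MsgBox(0, '', $aMatch[1])
```

The forward slashes do not need escaping here, since AutoIt patterns are plain strings without / delimiters.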
zackrspv Posted March 10, 2008 (Author)

> Can we see the website source you want to pull from? maybe not the whole thing... but just the relevant area.

Per your request:

    <div class="def_p">
    <p>Research Science Institute. Insanely prestigious summer research program for rising high school seniors at MIT and CalTech. Practically impossible to gain admission.</p>
    <p style="font-style: italic">RSI has the best of the best.</p>
    <div class="tags">by <a href="/author.php?author=nerdish">nerdish</a> Jan 2, 2005 <a href="javascript:void(0)" class="shareoff" onclick="share_toggle(this, 973915, 'http://www.urbanup.com/973915', 'http://del.icio.us/post?url=http%3A%2F%2Fwww.urbanup.com%2F973915&title=&notes=')">email it</a></div>
    </div>

I'll try the other solution as well to see if I can get it to work, but if you can help me work this out, that would be great. Each item I want to extract is wrapped in the div with class def_p.
Xenobiologist Posted March 10, 2008

Hi,

this way?

    (?<="def_p">)(.*)(?=<\/p>)

Mega

Scripts & functions
Organize Includes: Let Scite organize the include files
Yahtzee: The game "Yahtzee" (Kniffel, DiceLion)
LoginWrapper: Secure scripts by adding a query (authentication)
_RunOnlyOnThis UDF: Make sure that a script can only be executed on ... (Windows / HD / ...)
Internet-Café Server/Client Application: Open CD, Start Browser, Lock remote client, etc.
MultipleFuncsWithOneHotkey: Start different funcs by hitting one hotkey different times
zackrspv Posted March 10, 2008 (Author)

> Hi, this way?
>     (?<="def_p">)(.*)(?=<\/p>)
> Mega

Hiya,

Thanks for the response, but that didn't pull up any results.
Xenobiologist Posted March 10, 2008 (edited)

Hi,

??? You have this ...

    <div class="def_p">
    <p>Research Science Institute. Insanely prestigious summer research program for rising high school seniors at MIT and CalTech. Practically impossible to gain admission.</p>
    <p style="font-style: italic">RSI has the best of the best.</p>
    <div class="tags">by <a href="/author.php?author=nerdish">nerdish</a> Jan 2, 2005 <a href="javascript:void(0)" class="shareoff" onclick="share_toggle(this, 973915, 'http://www.urbanup.com/973915', 'http://del.icio.us/post?url=http%3A%2F%2Fwww.urbanup.com%2F973915&title=&notes=')">email it</a></div>
    </div>

and you need this?

    <p>Research Science Institute. Insanely prestigious summer research program for rising high school seniors at MIT and CalTech. Practically impossible to gain admission.</p>
    <p style="font-style: italic">RSI has the best of the best.</p>

Right? Then this works for me:

    (?-i)(?s)(?<="def_p">)(.*)(?=<\/p>)

Mega

P.S.: I tested the pattern only in a tool, not in AutoIt.

Edited March 10, 2008 by Xenobiologist
zackrspv Posted March 10, 2008 (Author)

> $pat = "test1 <br />test" & @CRLF & "string and another " & @LF & " new " & @CR & " new line</p> test2"
> $reg = StringRegExp($pat, "(?s)<br \/>(.)+<\/p>", 2)
> MsgBox(0, "", StringTrimLeft(StringTrimRight($reg[0], 4), 6))
>
> Explanation:
> (?s) = . matches even newlines
> \/ = matches a /
> (.)+ = matches any character except newlines (but with the (?s) it matches newlines too)
>
> Cheers :)

Hmm, implementing your changes causes the script to close on its own once it hits the regular expression. Here's the modified code:

    Func sSearch()
        $source = ""
        $str = ""
        $str2 = ""
        GUICtrlSetState($Edit1, $GUI_SHOW)
        GUICtrlSetState($Combo1, $GUI_HIDE)
        GUICtrlSetState($Terms, $GUI_HIDE)
        GUICtrlSetState($Defs, $GUI_HIDE)
        GUICtrlSetState($comboLable1, $GUI_HIDE)
        GUICtrlSetState($Progress1, $GUI_SHOW)
        $item = StringReplace($item2, " ", "+")
        $source = (_INetGetSource("http://www.urbandictionary.com/define.php?term=" & $item))
        If StringInStr($source, "isn't defined", 0) > "0" Then
            GUICtrlSetData($Edit1, "Term not found, sorry, try doing a Google Search instead!")
            GUICtrlSetState($Progress1, $GUI_HIDE)
        Else
            GUICtrlSetData($Edit1, "Term found, loading definition!")
            For $i = 0 To 100
                GUICtrlSetData($Progress1, $i)
                Sleep(5)
                GUICtrlSetData($Progress1, 0)
            Next
            GUICtrlSetState($Progress1, $GUI_HIDE)
            $nOffset = 1
            $str = ""
            While 1
                $array = StringRegExp($source, "(?s)<br \/>(.)+<\/p>", 1, $nOffset)
                If @error = 0 Then
                    $nOffset = @extended
                Else
                    ExitLoop
                EndIf
                For $i = 0 To UBound($array) - 1
                    $str = $array[$i]
                    $str = StringRegExpReplace($str, "&#(.*?);", "")
                    $str = StringRegExpReplace($str, "<(.*?)>", "")
                    $str = StringRegExpReplace($str, "</(.*?)>", "")
                    $str = StringRegExpReplace($str, "&(.*?);", "")
                    $str = StringRegExpReplace($str, "\t", "")
                Next
            WEnd
            GUICtrlSetData($Edit1, $str)
        EndIf
    EndFunc   ;==>sSearch

Why is it closing like that?
zackrspv Posted March 10, 2008 (Author)

WARNING: LONG LONG POST. It includes most of the code I'm working with, the resources I'm searching, and the site they come from.

> Then this works for me:
>     (?-i)(?s)(?<="def_p">)(.*)(?=<\/p>)
> P.S.: I tested the pattern only in a tool not in Autoit.

Hmm, well, at least I get results, yay! However, it pulls out way too much. For example, the search I gave you was for 'RSI', and this is what your regex pulled:

    Repetitive Strain Injury. The price you pay for over-indulging in a single form of entertainment.
    For goodness sake put down that mouse, Johnny! You'll give yourself RSI...
    by CougarSW2 Nov 19, 2004 email it
    permalink: del.icio.us
    Send to a friend
    your email:
    their email:
    send me the word of the day (it's free)
    2. RSI 21 up, 6 down
    Research Science Institute. Insanely prestigious summer research program for rising high school seniors at MIT and CalTech. Practically impossible to gain admission.
    RSI has the best of the best.
    by nerdish Jan 2, 2005 email it
    3. RSI 1 thumb up
    An acronym for Rapid Sequence Induction, an emergency medical procedure sometimes performed by paramedics prior to intubation on conscience patients. It is essentially a rapid anesthetization.
    EMT: Resps are 6 chief! Medic: Alright, we need to intubate. Is he still conscious?
    EMT: Partially Medic: Go ahead and set up for an RSI
    tags ems emt paramedic intubation emergency
    by pcbene C-ville, VA Dec 24, 2007 email it
    <!-- google_ad_client = "pub-4733233155277872"; google_alternate_ad_url = "http://www.urbandictionary.com/asbackup_medrect.html"; google_ad_width = 300; google_ad_height = 250; google_ad_format = "300x250_as"; google_ad_type = "image"; //2007-03-31: define rectangle google_ad_channel = "1113511432"; google_color_border = "FBFFEA"; google_color_bg = "FFF3DA"; google_color_link = "83AE84"; google_color_text = "000000"; google_color_url = "000000"; //--> <script type="text/javascript" src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
    4. RSi 5 up, 5 down
    RSi - a small band from the Belfast area in Northern Ireland. Currently un-signed, but have a manager. Could be described as Space Rock or progressive
    Kid one : Hey dOOd. Heard of RSi? Kid two : No but my older brother gets it all the time.

Note the sheer number of blank lines in that output; I'm not sure why that happens. But note also all the extra information it pulls. Here's the source:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> <html> <head> <title>Urban Dictionary: rsi</title> <meta name="description" content="Urban Dictionary is a slang dictionary with your definitions. 
Define your world"> <style type="text/css"> <!-- @import "http://static.urbandictionary.com/css/urban.css?1205139850"; @import "http://static.urbandictionary.com/css/define.css?1205139850"; --> </style> <script type="text/javascript" language="javascript" src="http://static.urbandictionary.com/js/urban.js?1205139850"></script><script type="text/javascript" language="javascript" src="http://static.urbandictionary.com/js/thumbs3.js?1205139850"></script><script type="text/javascript" language="javascript" src="http://static.urbandictionary.com/js/share.js?1205139850"></script><link rel="search" type="application/opensearchdescription+xml" title="Urban Dictionary Search" href="http://www.urbandictionary.com/osd.xml" /> </head> <body> <div id="banner1"> <div id="banner2"> <div id="banner3"> <div id="logo"><a href="/"><img src="http://static.urbandictionary.com/img/logo.gif" alt="urbandictionary.com" width="259" height="81"></a></div> <div id="form"><form method="get" action="http://www.urbandictionary.com/define.php" name="define"> <table><tr><td><input type="text" name="term" size="30" tabindex="1" value="rsi"></td><td><input type="submit" value="search"></td></tr></table> <div id="tagline">Urban Dictionary is a slang dictionary with your definitions. 
<b>Define your world.</b></div> </form></div> <div id="topnav"><a href="/" class="active">browse</a> <a href="/daily.php">word of the day</a> <a href="/insert.php?word=rsi">add</a> <a href="/editor.php">edit</a> <a href="/book.php">new book</a> <a href="/news.php">press</a> <a href="/tools.php">tools</a> <a href="/chat.php">chat</a> <a href="/yesterday.php">newest</a></div> <div id="banner-color"> </div> <div><img src="http://static.urbandictionary.com/img/banner.jpg" width="765" height="94"></div> </div> </div> </div> <div id="whole"> <div id="content"> <div id="subnav1"> <div id="subnav2"><a href="/random.php">random</a> <a href="/browse.php?character=A">A</a> <a href="/browse.php?character=B">B</a> <a href="/browse.php?character=C">C</a> <a href="/browse.php?character=D">D</a> <a href="/browse.php?character=E">E</a> <a href="/browse.php?character=F">F</a> <a href="/browse.php?character=G">G</a> <a href="/browse.php?character=H">H</a> <a href="/browse.php?character=I">I</a> <a href="/browse.php?character=J">J</a> <a href="/browse.php?character=K">K</a> <a href="/browse.php?character=L">L</a> <a href="/browse.php?character=M">M</a> <a href="/browse.php?character=N">N</a> <a href="/browse.php?character=O">O</a> <a href="/browse.php?character=P">P</a> <a href="/browse.php?character=Q">Q</a> <a href="/browse.php?word=rsi" class="active">R</a> <a href="/browse.php?character=S">S</a> <a href="/browse.php?character=T">T</a> <a href="/browse.php?character=U">U</a> <a href="/browse.php?character=V">V</a> <a href="/browse.php?character=W">W</a> <a href="/browse.php?character=X">X</a> <a href="/browse.php?character=Y">Y</a> <a href="/browse.php?character=Z">Z</a> <a href="/browse.php?character=*">#</a></div> </div> <table border="0" cellpadding="0" cellspacing="0" style="width: 805px"> <tr style="vertical-align: top"> <td style="width: 150px; background-color: #FFF3DA"> <div class="leftist-tabs"> <a 
href="http://www.amazon.com/gp/product/0740768751?ie=UTF8&tag=urbandictio08-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=0740768751"><img src="http://static.urbandictionary.com/img/book-on-star.gif" width="137" height="105" border="0"/></a> <a href="http://www.amazon.com/gp/product/0740768751?ie=UTF8&tag=urbandictio08-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=0740768751">order <i>mo' urban</i></a><br/> <a href="http://www.amazon.com/gp/product/0740768751?ie=UTF8&tag=urbandictio08-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=0740768751">on amazon</a> and <a href="http://search.barnesandnoble.com/Mo-Urban-Dictionary/Aaron-Peckham/e/9780740768750/?itm=1&afsrc=1&lkid=J23965981&pubid=K134569&byo=1">b&n</a> <b>now shipping</b><br/> </div> <div style="padding: 10px;"> <ul class="leftist" id="leftist"> <li><a href="/define.php?term=RS+Ship">RS Ship</a><li><a href="/define.php?term=rs%2A">rs*</a><li><a href="/define.php?term=rs-resources">rs-resources</a><li><a href="/define.php?term=rs2">rs2</a><li><a href="/define.php?term=Rs2+Product">Rs2 Product</a><li><a href="/define.php?term=rs6">rs6</a><li><a href="/define.php?term=RSA">RSA</a><li><a href="/define.php?term=rsag">rsag</a><li><a href="/define.php?term=rsb">rsb</a><li><a href="/define.php?term=rsbo">rsbo</a><li><a href="/define.php?term=RSC">RSC</a><li><a href="/define.php?term=RsCheatNet">RsCheatNet</a><li><a href="/define.php?term=rsd">rsd</a><li><a href="/define.php?term=rsd+-+Real+Street+Drags">rsd - Real Street Drags</a><li><a href="/define.php?term=RSe">RSe</a><li><a href="/define.php?term=rsf">rsf</a><li><a href="/define.php?term=rsfx">rsfx</a><li><a href="/define.php?term=RSGB">RSGB</a><li><a href="/define.php?term=RSGC">RSGC</a><li><a href="/define.php?term=rsh">rsh</a><li><div class="active">rsi</div><li><a href="/define.php?term=rsk">rsk</a><li><a href="/define.php?term=Rskillz">Rskillz</a><li><a href="/define.php?term=RSL">RSL</a><li><a href="/define.php?term=RSM">RSM</a><li><a 
href="/define.php?term=RSM+International">RSM International</a><li><a href="/define.php?term=rsmami">rsmami</a><li><a href="/define.php?term=rsmv">rsmv</a><li><a href="/define.php?term=RSN">RSN</a><li><a href="/define.php?term=RSO">RSO</a><li><a href="/define.php?term=rsod">rsod</a><li><a href="/define.php?term=RSP">RSP</a><li><a href="/define.php?term=RSP%27d+off">RSP'd off</a><li><a href="/define.php?term=RSPCA">RSPCA</a><li><a href="/define.php?term=rspct">rspct</a><li><a href="/define.php?term=Rspecial">Rspecial</a><li><a href="/define.php?term=RSPW">RSPW</a><li><a href="/define.php?term=RSR">RSR</a><li><a href="/define.php?term=RSS">RSS</a><li><a href="/define.php?term=rssXz">rssXz</a><li><a href="/define.php?term=RST">RST</a></ul> </div> </td> <td style="width: 465px; padding: 15px"> <table cellpadding="0" cellspacing="0" border="0" width="100%"> <tr valign="top"> <td class="def_number" width="20">1.</td> <td class="def_word">RSI</td> <td class="def_thumbs"> <table cellpadding="0" cellspacing="3" border="0" style="margin-left: auto"><tr> <td><a href="java script:void(0)" onclick="thumbs.click(906999, 1)"><img src="http://static.urbandictionary.com/thumbsup.gif" width="19" height="19" id="thumbs_906999_1_gif"></a></td> <td nowrap><span id="thumbs_906999"><strong>34</strong> up, <strong>3</strong> down</span></td> <td><a href="java script:void(0)" onclick="thumbs.click(906999, 0)"><img src="http://static.urbandictionary.com/thumbsdown.gif" width="19" height="19" id="thumbs_906999_0_gif"></a></td> </tr></table> </td> </tr> <tr> <td></td> <td colspan="2"> <div class="def_p"> <p>Repetitive Strain Injury. <br /> The price you pay for over-indulging in a single form of entertainment.</p> <p style="font-style: italic">"For goodness sake put down that mouse, Johnny! 
You'll give yourself RSI..."</p> <div class="tags">by <a href="/author.php?author=CougarSW2">CougarSW2</a> Nov 19, 2004 <a href="java script:void(0)" class="shareoff" onclick="share_toggle(this, 906999, 'http://www.urbanup.com/906999', 'http://del.icio.us/post?url=http%3A%2F%2Fwww.urbanup.com%2F906999&title=¬es=')">email it</a></div> <div id="fold" style="display: none"> <form onsubmit="share_send(this)" action="java script:void(0)" name="share"><input type="hidden" name="defid"> <table width="100%" border="0" cellspacing="0" cellpadding="0"> <tr> <td class="fold-left"><span onclick="document.getElementById('permalink').click()">permalink:</span></td> <td><input type="text" value="" onclick="this.focus(); this.select()" size="30" name="permalink" id="permalink"> <a href="java script:void(0)" id="delicious_fold">del.icio.us</a></td> </tr> <tr> <td></td> <td style="padding-top: 15px">Send to a friend</td> </tr> <tr> <td class="fold-left">your email:</td> <td><input type="text" size="30" name="yours" id="session_email"></td> </tr> <tr> <td class="fold-left">their email:</td> <td><input type="text" size="30" name="theirs"></td> </tr> <tr> <td></td> <td><input type="checkbox" name="subscribe"> send me the word of the day (it's free)</td> </tr> <tr> <td></td> <td> <div class="height: 5px"> </div> <table> <tr> <td><input type="submit" value="Send message"></td> <td class="never" id="share_status" width="180"></td> </tr> </table> </td> </tr> </table> </form> </div> </div> </td> </tr> <tr valign="top"> <td class="def_number" width="20">2.</td> <td class="def_word">RSI</td> <td class="def_thumbs"> <table cellpadding="0" cellspacing="3" border="0" style="margin-left: auto"><tr> <td><a href="java script:void(0)" onclick="thumbs.click(973915, 1)"><img src="http://static.urbandictionary.com/thumbsup.gif" width="19" height="19" id="thumbs_973915_1_gif"></a></td> <td nowrap><span id="thumbs_973915"><strong>21</strong> up, <strong>6</strong> down</span></td> <td><a href="java 
script:void(0)" onclick="thumbs.click(973915, 0)"><img src="http://static.urbandictionary.com/thumbsdown.gif" width="19" height="19" id="thumbs_973915_0_gif"></a></td> </tr></table> </td> </tr> <tr> <td></td> <td colspan="2"> <div class="def_p"> <p>Research Science Institute. Insanely prestigious summer research program for rising high school seniors at MIT and CalTech. Practically impossible to gain admission.</p> <p style="font-style: italic">RSI has the best of the best.</p> <div class="tags">by <a href="/author.php?author=nerdish">nerdish</a> Jan 2, 2005 <a href="java script:void(0)" class="shareoff" onclick="share_toggle(this, 973915, 'http://www.urbanup.com/973915', 'http://del.icio.us/post?url=http%3A%2F%2Fwww.urbanup.com%2F973915&title=¬es=')">email it</a></div> </div> </td> </tr> <tr valign="top"> <td class="def_number" width="20">3.</td> <td class="def_word">RSI</td> <td class="def_thumbs"> <table cellpadding="0" cellspacing="3" border="0" style="margin-left: auto"><tr> <td><a href="java script:void(0)" onclick="thumbs.click(2758354, 1)"><img src="http://static.urbandictionary.com/thumbsup.gif" width="19" height="19" id="thumbs_2758354_1_gif"></a></td> <td nowrap><span id="thumbs_2758354"><strong>1</strong> thumb up</span></td> <td><a href="java script:void(0)" onclick="thumbs.click(2758354, 0)"><img src="http://static.urbandictionary.com/thumbsdown.gif" width="19" height="19" id="thumbs_2758354_0_gif"></a></td> </tr></table> </td> </tr> <tr> <td></td> <td colspan="2"> <div class="def_p"> <p>An acronym for Rapid Sequence Induction, an emergency medical procedure sometimes performed by paramedics prior to intubation on conscience patients. It is essentially a rapid anesthetization. </p> <p style="font-style: italic">EMT: Resps are 6 chief!<br /> Medic: Alright, we need to intubate. 
Is he still conscious?<br /> EMT: Partially<br /> Medic: Go ahead and set up for an RSI</p> <div class="tags">tags <a href="/define.php?term=ems">ems</a> <a href="/define.php?term=emt">emt</a> <a href="/define.php?term=paramedic">paramedic</a> <a href="/define.php?term=intubation">intubation</a> <a href="/define.php?term=emergency">emergency</a></div> <div class="tags">by <a href="/author.php?author=pcbene">pcbene</a> C-ville, VA Dec 24, 2007 <a href="java script:void(0)" class="shareoff" onclick="share_toggle(this, 2758354, 'http://www.urbanup.com/2758354', 'http://del.icio.us/post?url=http%3A%2F%2Fwww.urbanup.com%2F2758354&title=¬es=ems+emt+paramedic+intubation+emergency+urbandictionary')">email it</a></div> </div> </td> </tr> <tr> <td></td> <td style="padding: 10px" colspan="2"> <center> <script type="text/javascript"><!-- google_ad_client = "pub-4733233155277872"; google_alternate_ad_url = "http://www.urbandictionary.com/asbackup_medrect.html"; google_ad_width = 300; google_ad_height = 250; google_ad_format = "300x250_as"; google_ad_type = "image"; //2007-03-31: define rectangle google_ad_channel = "1113511432"; google_color_border = "FBFFEA"; google_color_bg = "FFF3DA"; google_color_link = "83AE84"; google_color_text = "000000"; google_color_url = "000000"; //--> </script> <script type="text/javascript" src="http://pagead2.googlesyndication.com/pagead/show_ads.js"> </script> </center> </td> </tr> <tr valign="top"> <td class="def_number" width="20">4.</td> <td class="def_word">RSi</td> <td class="def_thumbs"> <table cellpadding="0" cellspacing="3" border="0" style="margin-left: auto"><tr> <td><a href="java script:void(0)" onclick="thumbs.click(2213018, 1)"><img src="http://static.urbandictionary.com/thumbsup.gif" width="19" height="19" id="thumbs_2213018_1_gif"></a></td> <td nowrap><span id="thumbs_2213018"><strong>5</strong> up, <strong>5</strong> down</span></td> <td><a href="java script:void(0)" onclick="thumbs.click(2213018, 0)"><img 
src="http://static.urbandictionary.com/thumbsdown.gif" width="19" height="19" id="thumbs_2213018_0_gif"></a></td> </tr></table> </td> </tr> <tr> <td></td> <td colspan="2"> <div class="def_p"> <p>RSi - a small band from the Belfast area in Northern Ireland.<br /> <br /> Currently un-signed, but have a manager.<br /> <br /> Could be described as <a href="/define.php?term=Space+Rock">Space Rock</a> or <a href="/define.php?term=progressive">progressive</a><br /> <br /> </p> <p style="font-style: italic">Kid one : Hey dOOd. Heard of RSi?<br /> <br /> Kid two : No but my older brother gets it all the time.</p> <div class="tags">tags <a href="/define.php?term=rsi">rsi</a> <a href="/define.php?term=spacerock">spacerock</a> <a href="/define.php?term=progressive">progressive</a> <a href="/define.php?term=music">music</a> <a href="/define.php?term=belfast">belfast</a></div> <div class="tags">by <a href="/author.php?author=mel2k7">mel2k7</a> Northern Ireland Jan 23, 2007 <a href="java script:void(0)" class="shareoff" onclick="share_toggle(this, 2213018, 'http://www.urbanup.com/2213018', 'http://del.icio.us/post?url=http%3A%2F%2Fwww.urbanup.com%2F2213018&title=¬es=rsi+spacerock+progressive+music+belfast+urbandictionary')">email it</a></div> </div> </td> </tr> </table><div id="strip" style="margin-top: 50px"> <table> <tr id="pics"><td width="20%"><a href="/define.php?term=mi-24&i=1"><img src="http://media.urbandictionary.com/image/icon/mi-24-36293.jpg" width="64" height="37" border="0"></a></td><td width="20%"><a href="/define.php?term=poverty&i=1"><img src="http://media.urbandictionary.com/image/icon/poverty-35059.jpg" width="64" height="29" border="0"></a></td><td width="20%"><a href="/define.php?term=puppie&i=1"><img src="http://media.urbandictionary.com/image/icon/puppie-9801.jpg" width="64" height="48" border="0"></a></td><td width="20%"><a href="/define.php?term=keep+it+real&i=1"><img src="http://media.urbandictionary.com/image/icon/keepitreal-2047.jpg" width="64" 
height="46" border="0"></a></td><td width="20%"><a href="/define.php?term=super+chill&i=1"><img src="http://media.urbandictionary.com/image/icon/superchill-55866.jpg" width="64" height="48" border="0"></a></td></tr><tr><td><a href="/define.php?term=mi-24&i=1">mi-24</a></td><td><a href="/define.php?term=poverty&i=1">poverty</a></td><td><a href="/define.php?term=puppie&i=1">puppie</a></td><td><a href="/define.php?term=keep+it+real&i=1">keep it real</a></td><td><a href="/define.php?term=super+chill&i=1">super chill</a></td></tr> </table> </div> <td id="right2" style="width: 130px; background-color: #FFF3DA"> <div style="margin: 10px 5px"> <div style="margin: 0 auto; text-align: center"> <script type="text/javascript"><!-- google_ad_client = "pub-4733233155277872"; google_alternate_ad_url = "http://www.urbandictionary.com/asbackup_skyscraper.html"; google_ad_width = 160; google_ad_height = 600; google_ad_format = "160x600_as"; google_ad_type = "text_image"; //2006-12-04: wide skyscraper google_ad_channel = "9172383876"; google_color_border = "FFF3DA"; google_color_bg = "FFF3DA"; google_color_link = "DE5F25"; google_color_text = "000000"; google_color_url = "000000"; //--></script> <script type="text/javascript" src="http://pagead2.googlesyndication.com/pagead/show_ads.js"> </script> </div> </div> </td> </tr> </table> <div id="bottomnav"><a href="/">browse</a> <a href="/daily.php">word of the day</a> <a href="/insert.php?word=rsi">add</a> <a href="/editor.php">edit</a> <a href="/book.php">new book</a> <a href="/news.php">press</a> <a href="/tools.php">tools</a> <a href="/yesterday.php">newest</a></div> </div> <div id="footer"><a href="http://www.urbandictionary.com/">Urban Dictionary</a> is not appropriate for all audiences. ©1999-2008. 
<a href="http://www.urbandictionary.com/tos.php">terms of service</a> | <a href="http://www.urbandictionary.com/feedback.php">feedback</a> | <a href="http://www.urbandictionary.com/tools.php">tools</a> | <a href="http://www.urbandictionary.com/tools_rss.php">rss</a> | <a href="https://adwords.google.com/select/OnsiteSignupLandingPage?client=ca-pub-4733233155277872&referringUrl=http://www.urbandictionary.com">advertise</a></div> <!-- 6 --> </div> </body> <script type="text/javascript" src="http://www.google-analytics.com/urchin.js"></script> <script> _uacct="UA-31805-1"; _udn="urbandictionary.com"; urchinTracker(); </script> <!-- Start Quantcast tag --> <script type="text/javascript" src="http://edge.quantserve.com/quant.js"></script> <script type="text/javascript"> _qacct="p-77H27_lnOeCCI";quantserve();</script> <noscript> <img src="http://pixel.quantserve.com/pixel/p-77H27_lnOeCCI.gif" style="display: none" height="1" width="1" alt="Quantcast"/></noscript> <!-- End Quantcast tag --> </html> <script src="/share_load.php"></script><script src="http://www.urbandictionary.com/thumbs_load.php?defid=906999,973915,2758354,2213018"></script>

Here's the function I have to pull that information out:

    Func sSearch()
        $source = ""
        $str = ""
        $str2 = ""
        GUICtrlSetState($Edit1, $GUI_SHOW)
        GUICtrlSetState($Combo1, $GUI_HIDE)
        GUICtrlSetState($Terms, $GUI_HIDE)
        GUICtrlSetState($Defs, $GUI_HIDE)
        GUICtrlSetState($comboLable1, $GUI_HIDE)
        GUICtrlSetState($Progress1, $GUI_SHOW)
        $item = StringReplace($item2, " ", "+")
        $source = (_INetGetSource("http://www.urbandictionary.com/define.php?term=" & $item))
        If StringInStr($source, "isn't defined", 0) > "0" Then
            GUICtrlSetData($Edit1, "Term not found, sorry, try doing a Google Search instead!")
            GUICtrlSetState($Progress1, $GUI_HIDE)
        Else
            GUICtrlSetData($Edit1, "Term found, loading definition!")
            For $i = 0 To 100
                GUICtrlSetData($Progress1, $i)
                Sleep(5)
                GUICtrlSetData($Progress1, 0)
            Next
            GUICtrlSetState($Progress1, $GUI_HIDE)
            $nOffset = 1
            $str = ""
            While 1
                $array = StringRegExp($source, '(?-i)(?s)(?<="def_p">)(.*)(?=<\/p>)', 1, $nOffset)
                If @error = 0 Then
                    $nOffset = @extended
                Else
                    ExitLoop
                EndIf
                For $i = 0 To UBound($array) - 1
                    $str = $array[$i]
                    $str = StringRegExpReplace($str, "&#(.*?);", "")
                    $str = StringRegExpReplace($str, "<(.*?)>", "")
                    $str = StringRegExpReplace($str, "</(.*?)>", "")
                    $str = StringRegExpReplace($str, "&(.*?);", "")
                    $str = StringRegExpReplace($str, "\t", "")
                Next
            WEnd
            GUICtrlSetData($Edit1, $str)
        EndIf
    EndFunc   ;==>sSearch

So then, why, oh why, is it not pulling exactly what I need?
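On why the pattern pulls too much: (.*) is greedy, so with (?s) switched on it runs from the first "def_p"> all the way to the last </p> in the page. A lazy (.*?) stops at the first </p> after each "def_p">. The sketch below assumes a made-up two-entry sample in place of the real page source, and the variable names are illustrative.

```autoit
; Made-up two-entry sample standing in for the real page source.
Local $sHtml = '<div class="def_p"> <p>First definition.</p> <p>Example one.</p></div>' & @CRLF & _
        '<div class="def_p"> <p>Second definition.</p> <p>Example two.</p></div>'

; Greedy: (.*) extends to the LAST </p> in the string.
Local $aGreedy = StringRegExp($sHtml, '(?s)(?<="def_p">)(.*)(?=</p>)', 3)

; Lazy: (.*?) stops at the FIRST </p> after each "def_p">.
Local $aLazy = StringRegExp($sHtml, '(?s)(?<="def_p">)(.*?)(?=</p>)', 3)

ConsoleWrite('greedy matches: ' & UBound($aGreedy) & @CRLF)
ConsoleWrite('lazy matches:   ' & UBound($aLazy) & @CRLF)
```

On this sample the greedy pattern should come back as one large match spanning both entries, while the lazy pattern returns one match per entry.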
Xenobiologist Posted March 10, 2008

Hi,

just post the source and the expected result. Then we can do the rest.

Mega
zackrspv Posted March 10, 2008 (Author)

> Hi, just post : Source and expected result. Then we can do the rest.
> Mega

Hiya,

I posted the source above. As to what I need out of it: I need to pull the definition of the term, as defined in the source. For example, the search term 'RSI' returns:

    <div class="def_p"> <p>Repetitive Strain Injury. <br /> The price you pay for over-indulging in a single form of entertainment.</p>
    ...
    <div class="def_p"> <p>Research Science Institute. Insanely prestigious summer research program for rising high school seniors at MIT and CalTech. Practically impossible to gain admission.</p> <p style="font-style: italic">RSI has the best of the best.</p>
    ...
    <div class="def_p"> <p>An acronym for Rapid Sequence Induction, an emergency medical procedure sometimes performed by paramedics prior to intubation on conscience patients. It is essentially a rapid anesthetization. </p> <p style="font-style: italic">EMT: Resps are 6 chief!<br /> Medic: Alright, we need to intubate. Is he still conscious?<br />
    ...
    <div class="def_p"> <p>RSi - a small band from the Belfast area in Northern Ireland.<br /> <br /> Currently un-signed, but have a manager.<br />

I only want to see:

    Repetitive Strain Injury.
    ---The price you pay for over-indulging in a single form of entertainment.

    Research Science Institute. Insanely prestigious summer research program for rising high school seniors at MIT and CalTech. Practically impossible to gain admission.
    ---RSI has the best of the best.

    An acronym for Rapid Sequence Induction, an emergency medical procedure sometimes performed by paramedics prior to intubation on conscience patients. It is essentially a rapid anesthetization.
    ---EMT: Resps are 6 chief!
    ---Medic: Alright, we need to intubate. Is he still conscious?

    RSi - a small band from the Belfast area in Northern Ireland.
    ---Currently un-signed, but have a manager.

Note, I don't necessarily need to see the tidbits after the ---'s, but if I can get them, fine. I really just need the main description. The whole project revolves around pulling just the definitions, but the quotes and examples are nice too.

Note, again, I do not wish to use an IE COM object to display the page; I just want to see the data.

Thanks
Xenobiologist Posted March 10, 2008

Hi,

is this already enough? I saved your source in source.txt.

    #include <Array.au3>

    ; The page source was saved as source.txt next to the script.
    ; FileRead accepts a file name directly, so no FileOpen handle is left open.
    Global $source = FileRead(@ScriptDir & '\source.txt')
    MsgBox(0, 0, $source)
    Global $re_A = _getIt($source)
    _ArrayDisplay($re_A)

    ; Returns an array with the text between each "def_p"> and the next </p>,
    ; or -1 if nothing matched.
    Func _getIt($text)
        Local $re = StringRegExp($text, '(?-i)(?s)"def_p">(.*?)<\/p>', 3)
        If @error Then Return -1
        Return $re
    EndFunc

Mega
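To get from those raw captures to the plain text asked for earlier in the thread, the leftover tags and entities still have to be removed. The sketch below is one possible way to do it, not the method used by anyone in the thread: the helper name _GetDefinitions and the sample string are made up for illustration, and the cleanup patterns are deliberately simple.

```autoit
#include <Array.au3>

; Made-up sample standing in for the page source posted earlier in the thread.
Local $sSource = '<div class="def_p"> <p>Repetitive Strain Injury. <br /> The price you pay.</p>' & @CRLF & _
        '<p style="font-style: italic">Example text.</p> </div>' & @CRLF & _
        '<div class="def_p"> <p>Research Science Institute.</p> </div>'

Local $aDefs = _GetDefinitions($sSource)
If IsArray($aDefs) Then _ArrayDisplay($aDefs)

; Hypothetical helper (the name is not from the thread): returns an array of
; plain-text definitions, or -1 if nothing matched.
Func _GetDefinitions($sText)
    ; Capture from the first <p> after "def_p"> up to the nearest </p>.
    Local $aRaw = StringRegExp($sText, '(?s)"def_p">\s*<p[^>]*>(.*?)</p>', 3)
    If @error Then Return -1
    For $i = 0 To UBound($aRaw) - 1
        $aRaw[$i] = StringRegExpReplace($aRaw[$i], '<[^>]*>', ' ') ; replace tags such as <br /> with a space
        $aRaw[$i] = StringRegExpReplace($aRaw[$i], '&#?\w+;', '')  ; drop entities such as &amp; or &#39;
        $aRaw[$i] = StringStripWS($aRaw[$i], 7)                    ; trim ends and collapse repeated whitespace
    Next
    Return $aRaw
EndFunc   ;==>_GetDefinitions
```

This only returns the first paragraph of each def_p block (the main description); the italic example paragraphs would need a second capture or a second pass.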