bvr, posted February 10, 2012:
Basically, I gather data from the Google Trends RSS feed into one string. I tell it to grab everything from the "<ol" onward, so it fetches the ordered list of the 10 trending keywords, each on its own line. The problem is that it is all one string. I want to be able to search the string for words or phrases that occur more than once. Of course there won't be repeated trends in the first hour, but I plan on gathering keywords every hour, comparing the new keywords to the old ones, and separating out the matches. It would be easy to search for a known keyword; the problem is that I never know what the keyword is going to be, so I can't specify what to search for. Could I search each line separately, make each line a substring of the string, or use an array? I'm kind of confused about how to handle the data.
JLogan3o13 (Moderator), posted February 10, 2012:
Hi, bvr. You're not the only one confused here. It's hard to suggest how to perform your search when you say you don't know what you'll be searching for.
bvr (author), posted February 10, 2012 (edited February 10, 2012):
    Local $Url = 'http://www.google.com/trends/hottrends/atom/hourly'
    Local $Html = BinaryToString(InetRead($Url))
    ; Keep everything from the first <ol onward
    $Html = StringMid($Html, StringInStr($Html, '<ol'))
    ; Remove the JavaScript blocks
    $Html = StringRegExpReplace($Html, '(?s)(?i)<script.+?script>', '')
    ; Strip the remaining tags to get the visible text
    Local $Text = StringRegExpReplace($Html, '(?s)<.+?>', '')
    ; Get rid of the "]]>" that is left over after stripping the tags
    Local $Text1 = StringReplace($Text, "]]>", " ")
    ; Append the visible text to keywords.txt
    FileWrite("keywords.txt", @CRLF & $Text1)
So after I pull the keywords, they are saved as one long string in a text file. I want to break that string up into one keyword per line, then search through those keywords every time I get a new keyword list and group the repeats together to see which keywords are valuable. After I figure that out, I'm going to check the competition for those keywords and phrases. This is meant for SEO; I thought it would be fun to try after I found out how to pull data from the web.
Blue_Drache, posted February 10, 2012 (edited February 10, 2012):
Are the keywords delimited in any way? Like a comma ... or a space? orisitallonebigfatlumpofrobotext?
Spiff59, posted February 10, 2012 (edited February 10, 2012):
StringSplit() gets you a nice array to play with. I'm not sure I'm clear enough on your goal to say whether a second array column would be useful for counting how many times each string was encountered, or whether you'd want to merge newer scans into this array, etc.
    #include <Array.au3>
    Local $Url = 'http://www.google.com/trends/hottrends/atom/hourly'
    Local $Html = BinaryToString(InetRead($Url))
    ; Keep everything from the first <ol onward
    $Html = StringMid($Html, StringInStr($Html, '<ol'))
    ; Remove the JavaScript blocks
    $Html = StringRegExpReplace($Html, '(?s)(?i)<script.+?script>', '')
    ; Strip the remaining tags to get the visible text
    Local $Text = StringRegExpReplace($Html, '(?s)<.+?>', '')
    ; Get rid of the leftover "]]>" and trim leading/trailing whitespace
    Local $Text1 = StringStripWS(StringReplace($Text, "]]>", ""), 3)
    ; Split on line feeds; carriage returns are stripped first because the
    ; default StringSplit flag treats each delimiter character separately,
    ; so splitting CRLF text on @CRLF would leave empty elements
    Local $aText = StringSplit(StringStripCR($Text1), @LF)
    ; Display the keywords ($aText[0] holds the count)
    _ArrayDisplay($aText)
    ;FileWrite("keywords.txt", @CRLF & $Text1)
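The counting idea mentioned above could be sketched roughly as follows. This is only a minimal illustration, not anyone's posted solution: it swaps the extra array column for a Scripting.Dictionary COM object, the helper name _MergeScan is made up for the example, and the two scans are hard-coded placeholder strings rather than real feed output.

```autoit
; Hypothetical sample data standing in for two hourly scans (not real trends).
; Flag 2 makes StringSplit return a 0-based array without a count element.
Local $aScan1 = StringSplit("keyword one|keyword two|keyword three", "|", 2)
Local $aScan2 = StringSplit("keyword two|keyword four|Keyword One", "|", 2)

; The dictionary maps keyword -> number of scans it appeared in.
Local $oCounts = ObjCreate("Scripting.Dictionary")
$oCounts.CompareMode = 1 ; 1 = text compare, so "Keyword One" matches "keyword one"

_MergeScan($oCounts, $aScan1)
_MergeScan($oCounts, $aScan2)

; Collect the keywords that showed up in more than one scan.
Local $sRepeats = ""
For $sKey In $oCounts.Keys()
    If $oCounts.Item($sKey) > 1 Then $sRepeats &= $sKey & @LF
Next
ConsoleWrite("Seen more than once:" & @LF & $sRepeats)

; Adds one scan's keywords to the running counts (illustrative helper).
Func _MergeScan($oDict, $aKeywords)
    For $i = 0 To UBound($aKeywords) - 1
        Local $sWord = StringStripWS($aKeywords[$i], 3) ; trim both ends
        If $sWord = "" Then ContinueLoop ; skip blank lines
        If $oDict.Exists($sWord) Then
            $oDict.Item($sWord) = $oDict.Item($sWord) + 1
        Else
            $oDict.Add($sWord, 1)
        EndIf
    Next
EndFunc
```

Because the lookup is by key, there is no need to know the keywords in advance: every new scan is simply merged in, and anything whose count rises above 1 is a repeat. To keep counts between runs, the dictionary contents would have to be written to and reloaded from a file, which is left out here.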
bvr (author), posted February 10, 2012:
Here is Google's code: view-source:http://www.google.com/trends/hottrends/atom/hourly
The only way to tell the keywords apart would be the classes. Some are medium, no change, low, and so on. So maybe I could just grab the medium or low keywords and then look for ones that recur in new scans?
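Filtering by class might look something like the sketch below. The markup in $sSample is an assumption made up for illustration (the thread never shows the feed's actual tags or class names), so the pattern would need adjusting against the real source before it is of any use.

```autoit
; Hypothetical markup: the real feed's tags and class names may differ.
Local $sSample = '<ol>' & _
        '<li><span class="Medium new"><a href="#">first keyword</a></span></li>' & _
        '<li><span class="Low down"><a href="#">second keyword</a></span></li>' & _
        '<li><span class="Medium up"><a href="#">third keyword</a></span></li>' & _
        '</ol>'

; Capture the link text of every item whose class attribute starts with "Medium".
; Flag 3 returns an array holding every match of the capture group.
Local $aMedium = StringRegExp($sSample, '(?is)<span class="Medium[^"]*">\s*<a[^>]*>(.*?)</a>', 3)
If @error Then
    ConsoleWrite("No matches; check the pattern against the actual feed." & @LF)
Else
    For $i = 0 To UBound($aMedium) - 1
        ConsoleWrite($aMedium[$i] & @LF)
    Next
EndIf
```

Note that this has to run on the HTML before the tags are stripped, since the class information is lost once only the visible text remains.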