
[UPDATE 2][STUCK] RSS Search Help


kzboy

Hi, I have used AutoIt many times, but I am by no means a pro programmer, so you might need to elaborate on some of the basics. Please try not to be too hard on me; I just don't know my way around the code and functions as well as I wish I did. I'm learning. :-)

I am trying to make a program that can:

open an RSS feed > check to see if there are any updates since the last scan > if yes, pull the titles > return the titles (and descriptions) that are new.

(Maybe I should make this its own function?)

After the program has a list of new titles, it would perform a scan to see if any of them match any of the preset criteria. If so, those matching titles and their descriptions would be returned.

I have been messing with using this script from

#include-once
#region _RSS
; RSS Reader
; Created By: Frostfel
#include <INet.au3>
#include <Array.au3>
; ============================================================================
; Function: _RSSGetInfo($RSS, $RSS_InfoS, $RSS_InfoE[, $RSS_Info_ = 1])
; Description: Gets RSS Info
; Parameter(s): $RSS = RSS Feed Example: "http://feed.com/index.xml"
;               $RSS_InfoS = String to find for info start Example:
;               $RSS_Info_Start = [optional] / To start at
;                                 Some RSS feeds will have page titles
;                                 you don't want Default = 0
; Requirement(s): None
; Return Value(s): On Success - Returns RSS Info in Array Starting at 1
;                  On Failure - Returns 0
;                  @Error = 1 - Failed to get RSS Feed
; Author(s): Frostfel
; ============================================================================
Func _RSSGetInfo($RSS, $RSS_InfoS, $RSS_InfoE, $RSS_Info_Start = 0)
    $RSSFile = _INetGetSource($RSS)
    If @Error Then
        SetError(1)
        Return -1
    EndIf
    Dim $InfoSearchS = 1
    Dim $Info[1000]
    Dim $InfoNumA
    $InfoNum = $RSS_Info_Start
    While $InfoSearchS <> 6
        $InfoNum += 1
        $InfoNumA += 1
        $InfoSearchS = StringInStr($RSSFile, $RSS_InfoS, 0, $InfoNum)
        $InfoSearchE = StringInStr($RSSFile, $RSS_InfoE, 0, $InfoNum)
        $InfoSearchS += 6
        $InfoSS = StringTrimLeft($RSSFile, $InfoSearchS)
        $InfoSearchE -= 1
        $InfoSE_Len = StringLen(StringTrimLeft($RSSFile, $InfoSearchE))
        $InfoSE = StringTrimRight($InfoSS, $InfoSE_Len)
        _ArrayInsert($Info, $InfoNumA, $InfoSE)
    WEnd
    Return $Info
EndFunc
#endregion
So far I have managed to get my program to use the script above to return the list of titles from the RSS feed (though it's kind of slow). I'm thinking about having it scan again to retrieve the descriptions; my only fear is that if the feed is updated between scans one and two, it could result in mismatched titles and descriptions.

I am also thinking about having the program write the returned titles to a file and then, each time it's run, check whether the new titles match any of the ones in the file. But I'm not sure how to make it still return the ones that ARE new. Maybe I could just record the newest one each time and compare to that? But I DO want it to retrieve any and all of the ones newer than that last one.

Anyway, I'm still working on my program, but I thought I would post this to get your ideas/input.

Thanks
Kz

EDIT: I wish I knew more about parsing XML :-/
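The "record the newest one and compare to that" idea can be sketched roughly as follows. This is only an illustration under some assumptions: the feed lists items newest first, each item's link is unique, and the links have already been pulled into an array. The helper name _CountNewItems, the state file rss_state.ini, and the example.com links are all made up for the sketch. Pulling titles, descriptions, and links out of a single download of the feed (rather than scanning twice) would also avoid the mismatch worry.

```autoit
; Hypothetical helper: returns how many leading entries of $aLinks are new,
; assuming the feed lists items newest first.
Func _CountNewItems(Const ByRef $aLinks, $sLastSeen)
    For $i = 0 To UBound($aLinks) - 1
        If $aLinks[$i] == $sLastSeen Then Return $i
    Next
    ; Last-seen link is not in the feed any more: treat every item as new.
    Return UBound($aLinks)
EndFunc

; Sample data standing in for links pulled from a feed (newest first).
Local $aLinks[3] = ["http://example.com/t/103", _
        "http://example.com/t/102", _
        "http://example.com/t/101"]

; Hypothetical state file that remembers the newest link from the last run.
Local $sIni = @ScriptDir & "\rss_state.ini"
Local $sLastSeen = IniRead($sIni, "prevScan", "link", "")

; Everything before the last-seen link counts as new.
Local $iNew = _CountNewItems($aLinks, $sLastSeen)
For $i = 0 To $iNew - 1
    ConsoleWrite("New: " & $aLinks[$i] & @CRLF)
Next

; Remember the newest link for the next run.
If UBound($aLinks) > 0 Then IniWrite($sIni, "prevScan", "link", $aLinks[0])
```

On the first run nothing is stored yet, so every item is reported as new; later runs only report items that sit above the stored link.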

[UPDATE]

Ok, so I have been able to retrieve the RSS feeds and pull the content I want (thanks guinness & mihaibr, that info was a big help!). Now I am using _ArraySearch and it's working well so far, but I have a question about it. The $iCompare parameter (the 4th optional parameter) can be set to 0 (default), 1, or 2. I am unclear what the difference is between 0 and 2 (1 is a partial string search; this I understand).

From AutoIt Help:

0 AutoIt variables compare (default), "string" = 0, "" = 0 or "0" = 0 match

1 executes a partial search (StringInStr)

2 comparison match if variables have same type and same value

[UPDATE 2]

Still not sure what 0 & 2 do. Though it seems like 0 will also return true if it finds an empty location. I'm using 2.
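A small experiment along these lines may make the difference between 0 and 2 easier to see. It is only a sketch based on the help text quoted above; the sample array is made up. With 0 the elements are compared the way AutoIt's "=" operator compares them, so searching for the number 0 can match a non-numeric string or an empty string. With 2 the element must also be of the same variable type as the search value.

```autoit
#include <Array.au3>

; Mixed-type sample data: a non-numeric string, an empty string, and the number 0.
Local $aData[3] = ["string", "", 0]

; $iCompare = 0: plain AutoIt "=" comparison, so the number 0 can match
; the non-numeric string at index 0 (the help file's "string" = 0 case).
Local $iLoose = _ArraySearch($aData, 0, 0, 0, 0, 0)

; $iCompare = 2: the element must also have the same variable type,
; so only the real number 0 at index 2 should qualify.
Local $iStrict = _ArraySearch($aData, 0, 0, 0, 0, 2)

ConsoleWrite("Compare 0 found index: " & $iLoose & @CRLF)
ConsoleWrite("Compare 2 found index: " & $iStrict & @CRLF)
```

That would also fit the observation that 0 seems to return a hit on an empty location, since an empty string compares equal to 0 under the loose rules.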

Anyway, on to the newest problem. I need a way to search the array and find all indices that match a list of preset criteria. I have been trying _ArrayFindAll, but I can only search for one word at a time. I could do this with a long list of _ArrayFindAll calls, but that seems like a poor idea.

Also, I would like to search only the new results. I have a variable that holds the location of the first old result, but _ArrayFindAll won't accept a variable for $iEnd.
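One possible way around both problems is to skip _ArrayFindAll and loop over the new titles directly, checking each one against an array of keywords with StringInStr. The following is a minimal sketch, not a tested solution for the real feed: the sample titles, the keyword list, and the value of $iEndPos are invented for illustration, and $iEndPos is assumed to be the index of the first old item.

```autoit
; Sample titles standing in for the titles pulled from the feed (newest first).
Local $aTitles[5] = ["Watch LIVE stream free", "Patch notes discussion", _
        "Contact me @ home", "Old free offer", "Old thread"]

; Index of the first already-seen item; only 0 .. $iEndPos - 1 are scanned.
Local $iEndPos = 3

; Keywords kept in an array so one loop can test all of them.
Local $aSpamWords[5] = ["live", "online", "watch", "free", "@"]

Local $sHits = "" ; collects matching indices, separated by "|"
For $i = 0 To $iEndPos - 1
    For $j = 0 To UBound($aSpamWords) - 1
        ; StringInStr is case-insensitive by default.
        If StringInStr($aTitles[$i], $aSpamWords[$j]) Then
            $sHits &= $i & "|"
            ExitLoop ; one hit is enough, move on to the next title
        EndIf
    Next
Next

; Turn the collected list into an array of indices (element 0 holds the count).
If $sHits <> "" Then
    Local $aHits = StringSplit(StringTrimRight($sHits, 1), "|")
    For $k = 1 To $aHits[0]
        Local $iIdx = Number($aHits[$k])
        ConsoleWrite("Match at index " & $iIdx & ": " & $aTitles[$iIdx] & @CRLF)
    Next
EndIf
```

Because the outer loop stops at $iEndPos - 1, old items are never examined, and adding a new keyword only means adding one more element to the array.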

Edited by kzboy

Search for _GetXML in the forum.


Try to refrain from bumping your posts within 24 hours. Could you provide a small reproducer with an example of an RSS feed you're using?


Wasn't trying to bump, just been working on this a lot, so my last post was rendered irrelevant. I thought about starting a new thread but that seemed unnecessary.

#cs ----------------------------------------------------------------------------
AutoIt Version: 3.3.8.1
Author:      KzBoy
Script Function:
Find spam post on GBX forums
ToDo:
Report post
#ce ----------------------------------------------------------------------------
#include <Array.au3>
#include <INet.au3>
#include <RSS_Reader2.au3>
;temp stuff
IniWrite ( "data", "prevScan", "link", "http://forums.gearboxsoftware.com/" )
;/temp
;Pull new topics from GBX forum RSS feed
$feed = _INetGetSource ("http://forums.gearboxsoftware.com/external.php?type=RSS1")
;Pull title, description, & link info from RSS feed
$rssTitle = _GetXML($feed, "title")
$rssDisc = _GetXML($feed, "description")
$rssLink = _GetXML($feed, "link")
_ArrayDisplay($rssDisc, "RSS Reader")
;Pull first link from last scan to compare to feed to check for updates
$prevScanL = IniRead ( "data", "prevScan", "link", "default" )

;Remove "<![CDATA[" from titles
Local $noPhrseT = 0
While $noPhrseT <> -1
    $noPhrseT = _ArraySearch($rssTitle, "<![CDATA[", 0, 0, 0, 1)
    If $noPhrseT = -1 Then ExitLoop
    _ArrayTrim($rssTitle, 9, 0, $noPhrseT, $noPhrseT)
    _ArrayTrim($rssTitle, 3, 1, $noPhrseT, $noPhrseT)
WEnd
;Remove "<![CDATA[" from descriptions
Local $noPhrseD = 0
While $noPhrseD <> -1
    $noPhrseD = _ArraySearch($rssDisc, "<![CDATA[", 0, 0, 0, 1)
    If $noPhrseD = -1 Then ExitLoop
    _ArrayTrim($rssDisc, 9, 0, $noPhrseD, $noPhrseD)
    _ArrayTrim($rssDisc, 3, 1, $noPhrseD, $noPhrseD)
WEnd
;Check for new items, set $endPos at the end of new items
Local $endPos = _ArraySearch($rssLink, $prevScanL, 0, 0, 0, 2)
;If $endPos >= 0 Then
; MsgBox(0, "Found", '"' & $prevScanL & '" was found in the array at position ' & $endPos & ".")
; EndIf

;Search for spam
Local $spamWords = "live" Or "online" Or "watch" Or "free"
Local $spamSymbols = "@"; Or "~" Or "[" Or "]" Or "(" Or ")" Or "^" Or "$" Or "#" Or "<" Or ">" Or "*"
;Find titles that meets criteria
$spamList = _ArrayFindAll($rssTitle, $spamSymbols, 0, 0, 0, 1)
;Diag
MsgBox(0, "Found", '"' & $spamList & '" was found in the array at position ' & $endPos & ".")
_ArrayDisplay($spamList, "RSS Reader")
;Exit after diag closes
WinWaitClose ( "RSS Reader" )
Exit

Sorry for the sloppy code. Some things are commented out on purpose for testing various stuff, but that's my code so far...
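One thing that may be worth checking in the script above is the $spamWords line. Or is a logical operator in AutoIt, so chaining strings with it does not build a list of words; it evaluates to a single Boolean. The short sketch below (variable names are made up) shows what ends up in the variable and how an array could hold the words instead.

```autoit
; Or is a logical operator, so this stores a Boolean, not a list of words.
Local $vWords = "live" Or "online" Or "watch" Or "free"
ConsoleWrite(VarGetType($vWords) & ": " & $vWords & @CRLF)

; An array keeps each word separately so it can be checked in a loop.
Local $aWords[4] = ["live", "online", "watch", "free"]
For $i = 0 To UBound($aWords) - 1
    ConsoleWrite($aWords[$i] & @CRLF)
Next
```

With the words in an array, each one can be passed to a search call or to StringInStr in turn.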

Edited by kzboy
