Khryus

Advice for speeding up this procedure?

Hey guys!
 

I'm working on a project that relies on a script I've written to fetch data from a website (using their API). The script works like this:

  1. Authenticate to the API using user-supplied credentials.
  2. Get a list of folders inside a given directory using _FileListToArray.
  3. For each folder, check whether we already have a cached query. This reduces the number of requests we make to the API.
  4. If we don't, start a synchronous query to the API using InetGet. This is probably what's slowing everything down. I'm aware I should be using async requests, but I don't know how to handle them: how do I keep track of all the queries and know which have completed and which have failed? The whole process usually takes about 14 seconds for me (I have roughly 240 folders in the root directory I'm processing).
  5. Cache the result of the query (if any) as an XML file.

Here's the function that handles the steps above. _log and _report each append a line to a different file. You can find queryMAL_Base below the snippet.

Func CacheCollection($refresh = False)
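        ; Flag 2 = return folders only. Element [0] of the result holds the folder count.
        Local $FolderList = _FileListToArray($collection_root, "*", 2, False)
        If @error Then
            ; On failure the result is not an array, so looping over it would crash.
            _log("Could not list folders in '"&$collection_root&"'.")
            Return
        EndIf
        Local $downloadedBytes = 0
        Local $newQueries = 0, $skippedQueries = 0, $successfullCachings = 0, $failedCachings = 0, $timer = TimerInit()
        ; Start at index 1 to skip the count, instead of comparing each name against it.
        For $i = 1 To $FolderList[0]
            Local $Folder = $FolderList[$i]
            If StartsWith($Folder, "_") Then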
                _log("Skipped Folder '"&$Folder&"' because of special character at start or vector count.")
                $skippedQueries = $skippedQueries + 1
            Else
                ;_log("Checking if '"&$query_root&CachedName($Folder)&"' exists... ")
                If Not Cached($Folder) Or $refresh  Then
                    _log("Caching Folder '"&$Folder&"' as '"&CachedName($Folder)&"'.")
                    Local $CacheAttempt = queryMAL_Base($cfg_username, $cfg_password, $Folder)
                    $newQueries = $newQueries + 1
                    If $CacheAttempt == 13 Then
                        _log("Failed Cache Attempt for '"&$Folder&"'.")
                        $failedCachings = $failedCachings + 1
                    Else
                        _log("Successfully cached '"&$Folder&"'.")
                        $successfullCachings = $successfullCachings + 1
                        $downloadedBytes = $downloadedBytes + $CacheAttempt
                    EndIf
                Else
                    _log("Skipped Folder '"&$Folder&"' because already cached as '"&GetCachedFileByFolderName($Folder)&"'")
                    $skippedQueries = $skippedQueries + 1
                EndIf
            EndIf
        Next
        _report("JOB_"&$job_id&"_COMPLETED")
        _report($failedCachings)
        _log("Done caching collection in approx. " & Floor(TimerDiff($timer)/1000) & " seconds. " & "Downloaded kilobytes: " & ($downloadedBytes > 0 ? "approx. " : "") & Floor($downloadedBytes / 1024)  & " Kb.")
        _log("Queries: " & $newQueries & ", of which: " & $successfullCachings & " were successful and " & $failedCachings & " failed. Skipped: " & $skippedQueries&". " & ($newQueries - $failedCachings == 0 ? " (successful queries are automatically cached)" : ""))
    EndFunc

 

queryMAL_Base

Func queryMAL_Base($user, $password, $query)
        ; Returns 13 on failure, number of bytes downloaded on success.
        Local $url = StringReplace(StringReplace(StringReplace($mal_root, "[u]", $user), "[p]", $password), "[query]", $query)
        ; With the last parameter set to 0, InetGet blocks until the download has finished.
        Local $request = InetGet($url, $query_root&CachedName($query), 0, 0)
        ; Always return the documented failure code, whatever value @error holds,
        ; because the caller only checks for 13.
        If @error Then Return 13
        Return $request
    EndFunc

I think I should be using asynchronous requests instead, but how should I handle that? Could you give me some advice? Is there anything else I can improve speed-wise, especially in the part where I check which folder should be processed?

Edit: here's the API documentation in case you want to check it out: http://myanimelist.net/modules.php?go=api

Edited by Khryus

"The story of a blade is linked in Blood." 

―Yasuo

 


Hi. You can do something like this. (Adapt it as needed, of course.)

 

#include <InetConstants.au3>

Local $sDir = @ScriptDir & "\Downloads\"
If Not FileExists($sDir) Then DirCreate($sDir)

; Column indexes: download handle, target file path, finished flag
Global Enum $eHANDLE, $eFILEPATH, $eDONE
Local $sURL = 'https://www.autoitscript.com/forum/uploads/profile/photo-thumb-69185.jpg'
Local $iNDownLoads = 15
Local $aDownLoads[$iNDownLoads][3]

; Start every download in the background. InetGet returns a handle immediately.
For $i = 0 To $iNDownLoads - 1
    $aDownLoads[$i][$eFILEPATH] = $sDir & "Imagen-" & String($i + 1) & ".jpg"
    $aDownLoads[$i][$eHANDLE] = InetGet($sURL, $aDownLoads[$i][$eFILEPATH], $INET_FORCERELOAD, $INET_DOWNLOADBACKGROUND)
    $aDownLoads[$i][$eDONE] = False
    ConsoleWrite(">HANDLE " & String($i + 1) & @TAB & $aDownLoads[$i][$eHANDLE] & @CRLF)
Next

Local $iSize = 0
Local $iError = 0
Local $iCountFinish = 0
ConsoleWrite("!Waiting for Downloads... " & @CRLF)
While True
    For $i = 0 To $iNDownLoads - 1
        ; Skip finished downloads so a closed handle is never polled again.
        If $aDownLoads[$i][$eDONE] Then ContinueLoop

        ; Optional: the total size, if you want to check it (it may not always be present).
        $iSize = InetGetInfo($aDownLoads[$i][$eHANDLE], $INET_DOWNLOADSIZE)
;~      ConsoleWrite("-Instance: " & String($i + 1) & " Size: " & $iSize & @CRLF)

        $iError = InetGetInfo($aDownLoads[$i][$eHANDLE], $INET_DOWNLOADERROR)
        If $iError Then ; on error, close the handle and start the download again (retries are unlimited)
            ConsoleWrite("!Download Error Instance: " & String($i + 1) & " Error: " & $iError & @TAB & "Try DownLoad Again" & @CRLF)
            InetClose($aDownLoads[$i][$eHANDLE])
            $aDownLoads[$i][$eHANDLE] = InetGet($sURL, $aDownLoads[$i][$eFILEPATH], $INET_FORCERELOAD, $INET_DOWNLOADBACKGROUND)
        EndIf

        If InetGetInfo($aDownLoads[$i][$eHANDLE], $INET_DOWNLOADCOMPLETE) Then ; check complete
            ConsoleWrite("+Download Completed Instance: " & String($i + 1) & @CRLF)
            InetClose($aDownLoads[$i][$eHANDLE])
            $aDownLoads[$i][$eHANDLE] = 0
            $aDownLoads[$i][$eDONE] = True
            $iCountFinish += 1
        EndIf
    Next
    If $iCountFinish = $iNDownLoads Then ExitLoop
    Sleep(100)
WEnd

MsgBox(64, "info", "Downloads finished")
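
For roughly 240 queries against an API, it may be better not to start every download at once and not to retry forever. Below is a minimal sketch of the same polling idea with a cap on parallel downloads, a limited number of retries, and a per-job state so you can tell afterwards which queries completed and which failed. The URLs, the "Cache" folder, the file names, and both limits are placeholder assumptions, not values taken from the original script or from the API.

```autoit
#include <InetConstants.au3>

; Hypothetical inputs: replace with the real query URLs and cache file paths.
Local $aURLs[3] = ["http://example.com/a.xml", "http://example.com/b.xml", "http://example.com/c.xml"]
Local $sCacheDir = @ScriptDir & "\Cache\"
If Not FileExists($sCacheDir) Then DirCreate($sCacheDir)

; Column indexes of the job table, and the states a job can be in.
Global Enum $eJOB_STATE, $eJOB_HANDLE, $eJOB_TRIES
Global Enum $eST_PENDING, $eST_RUNNING, $eST_OK, $eST_FAILED
Local Const $iMaxParallel = 5, $iMaxTries = 3 ; assumed limits, tune them to the API

Local $iTotal = UBound($aURLs)
Local $aJobs[$iTotal][3]
For $i = 0 To $iTotal - 1
    $aJobs[$i][$eJOB_STATE] = $eST_PENDING
    $aJobs[$i][$eJOB_TRIES] = 0
Next

Local $iRunning = 0, $iDone = 0, $iFailed = 0
While $iDone < $iTotal
    For $i = 0 To $iTotal - 1
        Switch $aJobs[$i][$eJOB_STATE]
            Case $eST_PENDING
                ; Only start a new download while we are below the parallel limit.
                If $iRunning < $iMaxParallel Then
                    $aJobs[$i][$eJOB_HANDLE] = InetGet($aURLs[$i], $sCacheDir & $i & ".xml", $INET_FORCERELOAD, $INET_DOWNLOADBACKGROUND)
                    $aJobs[$i][$eJOB_TRIES] += 1
                    $aJobs[$i][$eJOB_STATE] = $eST_RUNNING
                    $iRunning += 1
                EndIf
            Case $eST_RUNNING
                If InetGetInfo($aJobs[$i][$eJOB_HANDLE], $INET_DOWNLOADCOMPLETE) Then
                    ; Read the result before closing the handle.
                    Local $bOK = InetGetInfo($aJobs[$i][$eJOB_HANDLE], $INET_DOWNLOADSUCCESS)
                    InetClose($aJobs[$i][$eJOB_HANDLE])
                    $iRunning -= 1
                    If $bOK Then
                        $aJobs[$i][$eJOB_STATE] = $eST_OK
                        $iDone += 1
                    ElseIf $aJobs[$i][$eJOB_TRIES] < $iMaxTries Then
                        $aJobs[$i][$eJOB_STATE] = $eST_PENDING ; queue it for another attempt
                    Else
                        $aJobs[$i][$eJOB_STATE] = $eST_FAILED
                        $iDone += 1
                        $iFailed += 1
                    EndIf
                EndIf
        EndSwitch
    Next
    Sleep(100)
WEnd

ConsoleWrite("Finished: " & ($iTotal - $iFailed) & " ok, " & $iFailed & " failed" & @CRLF)
```

After the loop, every row of $aJobs is either $eST_OK or $eST_FAILED, so the failed queries can be logged or retried later by walking the table once.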

Regards
