Statistic Functions


Hi there. First post, but can I thank all the active posters here? I've found many answers to my queries in these forums!

I'm looking for some statistics functions inside of AutoIt. If they don't exist, I'm not expecting anyone to write them for me, I just hoped there might be a way to include a script or something to allow me to use the following statistics (per Excel names):

STEYX (standard regression error)

RSQ (R squared)

SLOPE (gradient of linear regression)

INTERCEPT (Y intercept of linear regression)

If any of these already exist, forgive me, but I couldn't find them in the AutoIt help. I would prefer my script to be independent of the Excel program, so attaching to Excel would be a less-than-ideal solution for me.

Many thanks for any pointers you can give me!

Mike


From the (real/official) AutoIt help file, which differs a bit from the wiki:

Math functions Reference

Below is a complete list of the (math) functions available in AutoIt. Click on a function name for a detailed description.

Function - Description

Abs - Calculates the absolute value of a number.

ACos - Calculates the arcCosine of a number.

ASin - Calculates the arcsine of a number.

ATan - Calculates the arctangent of a number.

BitAND - Performs a bitwise AND operation.

BitNOT - Performs a bitwise NOT operation.

BitOR - Performs a bitwise OR operation.

BitRotate - Performs a bit shifting operation, with rotation.

BitShift - Performs a bit shifting operation.

BitXOR - Performs a bitwise exclusive OR (XOR) operation.

Cos - Calculates the cosine of a number.

Ceiling - Returns a number rounded up to the next integer.

Exp - Calculates e to the power of a number.

Floor - Returns a number rounded down to the closest integer.

Log - Calculates the natural logarithm of a number.

Mod - Performs the modulus operation.

Random - Generates a pseudo-random float-type number.

Round - Returns a number rounded to a specified number of decimal places.

Sin - Calculates the sine of a number.

Sqrt - Calculates the square-root of a number.

SRandom - Set Seed for random number generation.

Tan - Calculates the tangent of a number.

Nope. They need to be coded manually.

Try the forum search again. There isn't much statistical function code around, but there is some.

"Straight_and_Crooked_Thinking" : A "classic guide to ferreting out untruths, half-truths, and other distortions of facts in political and social discussions."
"The Secrets of Quantum Physics" : New and excellent 2 part documentary on Quantum Physics by Jim Al-Khalili. (Dec 2014)

"Believing what you know ain't so" ...

Knock Knock ...
 


For anyone finding this topic at a later date, here's the solution I came up with. It's all in one function so that it doesn't have to keep calculating the same sums over and over if you need all of the stats. Of course, if you only need certain bits, hacking away at it can make it less bloated.

Inputs:

$YValues - array containing Y values of dataset

$XValues - array containing X values of dataset

Returns:

RegStat[0] = Mean of Y

RegStat[1] = Mean of X

RegStat[2] = Slope of Best Fit (Linear Regression)

RegStat[3] = Intercept of Best Fit (Linear Regression)

RegStat[4] = Average Difference from Best Fit Line - NOTE: this is not the STEYX statistic I mentioned in my earlier post, this is a bit different.
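
For reference, the quantities listed above can be written out as follows. This is only a summary of what the function computes, next to the textbook definition of the standard error of the regression (Excel's STEYX) for comparison. The symbols a, b and d-bar are notation chosen here for illustration; they do not appear in the code.

```latex
% n = number of data points; all sums run over i = 1..n
\begin{align*}
\text{Slope: } b &= \frac{\sum x_i y_i - \frac{1}{n}\sum x_i \sum y_i}
                         {\sum x_i^2 - \frac{1}{n}\left(\sum x_i\right)^2} \\
\text{Intercept: } a &= \bar{y} - b\,\bar{x} \\
\text{RegStat[4]: } \bar{d} &= \frac{1}{n}\sum \left| y_i - (a + b\,x_i) \right| \\
\text{STEYX: } s_{y \cdot x} &= \sqrt{\frac{\sum \left( y_i - (a + b\,x_i) \right)^2}{n - 2}}
\end{align*}
```

So RegStat[4] averages the absolute residuals over n, while STEYX takes the root of the squared residuals averaged over n - 2, which is why the two numbers differ.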

; ---- RegStats - Regression Statistics Function ----
; Returns several statistics based on two datasets, stored in arrays.
;
; NOTE: use 0-based arrays.
;
; Inputs:
; $YValues - array containing Y values of dataset
; $XValues - array containing X values of dataset
;
; Returns:
; RegStat[0] = Mean of Y
; RegStat[1] = Mean of X
; RegStat[2] = Slope of Best Fit (Linear Regression)
; RegStat[3] = Intercept of Best Fit (Linear Regression)
; RegStat[4] = Average Difference from Best Fit Line (mean absolute residual)
;
; Errors:
; 1 - array had fewer than 2 entries
; 2 - YValues and XValues were of different array lengths
; 3 - all X values were identical, so the slope is undefined

Func RegStats($YValues, $XValues)

    ; COUNT (N)
    Local $Count = UBound($YValues)
    If $Count < 2 Then Return SetError(1, 0, 0)

    Local $CountX = UBound($XValues)
    If $Count <> $CountX Then Return SetError(2, 0, 0)

    ; SUMS (initialised to 0 so they are numeric from the start)
    Local $SumY = 0
    Local $SumYSq = 0 ; not used by the statistics below
    Local $SumX = 0
    Local $SumXSq = 0
    Local $SumXY = 0

    For $i = 0 To $Count - 1
        $SumX += $XValues[$i]
        $SumY += $YValues[$i]
        $SumYSq += $YValues[$i] ^ 2
        $SumXSq += $XValues[$i] ^ 2
        $SumXY += $XValues[$i] * $YValues[$i]
    Next

    ; MEAN
    Local $MeanY = $SumY / $Count
    Local $MeanX = $SumX / $Count

    ; SLOPE (Y on X), guarding against division by zero
    Local $SlopeDivisor = $SumXSq - $SumX ^ 2 / $Count
    If $SlopeDivisor = 0 Then Return SetError(3, 0, 0)
    Local $Slope = ($SumXY - $SumX * $SumY / $Count) / $SlopeDivisor

    ; INTERCEPT (Y on X)
    Local $Intercept = $MeanY - $MeanX * $Slope

    ; SUM DIFFERENCE FROM LINE (absolute vertical distance of each point)
    Local $SumDifference = 0
    For $i = 0 To $Count - 1
        $SumDifference += Abs($YValues[$i] - ($Slope * $XValues[$i] + $Intercept))
    Next

    ; AVERAGE DIFFERENCE FROM LINE
    Local $AverageDifference = $SumDifference / $Count

    ; RETURN VALUES
    Local $RegAnalyseArray[5]

    $RegAnalyseArray[0] = $MeanY
    $RegAnalyseArray[1] = $MeanX
    $RegAnalyseArray[2] = $Slope
    $RegAnalyseArray[3] = $Intercept
    $RegAnalyseArray[4] = $AverageDifference

    Return $RegAnalyseArray

EndFunc
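
Since the original question also asked about RSQ and STEYX, here is a minimal sketch of how they could be derived from the same running sums. It assumes the usual textbook definitions of R squared and the standard error of the regression. The function name _RegFitQuality and its error codes are made up for this illustration and are not part of the function above.

```autoit
; Hypothetical helper (name and error codes are illustrative only).
; Returns R squared and the standard error of Y on X for two 0-based arrays.
;
; Returns:
; [0] = R squared
; [1] = Standard error of the regression (divides by N - 2)
;
; Errors:
; 1 - fewer than 3 entries, or arrays of different lengths
; 2 - all X values or all Y values were identical
Func _RegFitQuality($YValues, $XValues)
    Local $Count = UBound($YValues)
    If $Count < 3 Or $Count <> UBound($XValues) Then Return SetError(1, 0, 0)

    Local $SumX = 0, $SumY = 0, $SumXSq = 0, $SumYSq = 0, $SumXY = 0
    For $i = 0 To $Count - 1
        $SumX += $XValues[$i]
        $SumY += $YValues[$i]
        $SumXSq += $XValues[$i] ^ 2
        $SumYSq += $YValues[$i] ^ 2
        $SumXY += $XValues[$i] * $YValues[$i]
    Next

    ; Centred sums of squares and cross products
    Local $Sxx = $SumXSq - $SumX ^ 2 / $Count
    Local $Syy = $SumYSq - $SumY ^ 2 / $Count
    Local $Sxy = $SumXY - $SumX * $SumY / $Count
    If $Sxx = 0 Or $Syy = 0 Then Return SetError(2, 0, 0)

    ; Residual sum of squares; clamp tiny negative rounding errors to 0
    Local $Residual = $Syy - $Sxy ^ 2 / $Sxx
    If $Residual < 0 Then $Residual = 0

    Local $Result[2]
    $Result[0] = $Sxy ^ 2 / ($Sxx * $Syy)
    $Result[1] = Sqrt($Residual / ($Count - 2))
    Return $Result
EndFunc
```

It takes its arguments in the same order as RegStats (Y values first, then X values). For perfectly linear data, R squared comes out as 1 and the standard error as 0.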

Or, if you don't need the X values (I don't, I just thought I'd script them while I was at it), the next one assumes the X values are 1, 2, 3, 4...

; ---- RegStatsY - Regression Statistics Function ----
; Returns several statistics based on one dataset, stored in an array.
; The other dataset is assumed to be 1, 2, 3, 4, 5...
;
; NOTE: use 0-based arrays.
;
; Inputs:
; $YValues - array containing Y values of dataset
;
; Returns:
; RegStatsY[0] = Mean of Y
; RegStatsY[1] = Mean of X
; RegStatsY[2] = Slope of Best Fit (Linear Regression)
; RegStatsY[3] = Intercept of Best Fit (Linear Regression)
; RegStatsY[4] = Average Difference from Best Fit Line
;
; Errors:
; 1 - array had fewer than 2 entries


Func RegStatsY($YValues)

    ; COUNT (N)
    Local $Count = UBound($YValues)
    If $Count < 2 Then Return SetError(1, 0, 0)

    ; X values are 1, 2, 3, ... so array index $i holds $i + 1
    Local $XValues[$Count]
    For $i = 0 To $Count - 1
        $XValues[$i] = $i + 1
    Next

    ; SUMS (initialised to 0 so they are numeric from the start)
    Local $SumY = 0
    Local $SumYSq = 0 ; not used by the statistics below
    Local $SumX = 0
    Local $SumXSq = 0
    Local $SumXY = 0

    For $i = 0 To $Count - 1
        $SumX += $XValues[$i]
        $SumY += $YValues[$i]
        $SumYSq += $YValues[$i] ^ 2
        $SumXSq += $XValues[$i] ^ 2
        $SumXY += $XValues[$i] * $YValues[$i]
    Next

    ; MEAN
    Local $MeanY = $SumY / $Count
    Local $MeanX = $SumX / $Count

    ; SLOPE (Y on X); the divisor cannot be zero because the X values are distinct
    Local $Slope = ($SumXY - $SumX * $SumY / $Count) / ($SumXSq - $SumX ^ 2 / $Count)

    ; INTERCEPT (Y on X)
    Local $Intercept = $MeanY - $MeanX * $Slope

    ; SUM DIFFERENCE FROM LINE (absolute vertical distance of each point)
    Local $SumDifference = 0
    For $i = 0 To $Count - 1
        $SumDifference += Abs($YValues[$i] - ($Slope * $XValues[$i] + $Intercept))
    Next

    ; AVERAGE DIFFERENCE FROM LINE
    Local $AverageDifference = $SumDifference / $Count

    ; RETURN VALUES
    Local $RegAnalyseArray[5]

    $RegAnalyseArray[0] = $MeanY
    $RegAnalyseArray[1] = $MeanX
    $RegAnalyseArray[2] = $Slope
    $RegAnalyseArray[3] = $Intercept
    $RegAnalyseArray[4] = $AverageDifference

    Return $RegAnalyseArray

EndFunc

And finally, some example code to test them:

; Sample data: 12 paired observations
Local $TestDataX[12] = [94, 65, 88, 83, 92, 50, 67, 100, 100, 73, 90, 83]
Local $TestDataY[12] = [89, 52, 57, 78, 76, 30, 67, 96, 74, 65, 87, 78]

; Note the argument order: Y values first, then X values
Local $Stats = RegStats($TestDataY, $TestDataX)
If @error Then Exit MsgBox(0, "", "RegStats failed with error code " & @error)

; $Stats[0] is the mean of Y and $Stats[1] is the mean of X
Local $str = ""
$str &= "Mean of Y is " & $Stats[0] & @CRLF
$str &= "Mean of X is " & $Stats[1] & @CRLF
$str &= "Slope of Linear Best Fit is " & $Stats[2] & @CRLF
$str &= "Intercept of Best Fit is " & $Stats[3] & @CRLF
$str &= "Average Difference from Best Fit is " & $Stats[4]

MsgBox(0, "", $str)

Should I also post this to the snippets section? I'm happy for anybody to use it.


Should I also post this to the snippets section?

You might like to wait a bit until you have a slightly higher post count (for editing and updating code and such). Other than that, it's your call.
