# Statistic Functions


Hi there. This is my first post, but first let me thank all the active posters here. I've found many answers to my queries in these forums!

I'm looking for some statistics functions inside of AutoIt. If they don't exist, I'm not expecting anyone to write them for me, I just hoped there might be a way to include a script or something to allow me to use the following statistics (per Excel names):

RSQ (R squared)

INTERCEPT (Y intercept of linear regression)

If any of these already exist, forgive me, but I couldn't find them in the AutoIt help. I would prefer my script to be independent of Excel, so attaching to Excel would be a less-than-ideal solution for me.

Many thanks for any pointers you can give me!

Mike

##### Share on other sites

From the official AutoIt help file (it differs a bit from the wiki):

Math functions Reference

Below is a complete list of the (math) functions available in AutoIt. Click on a function name for a detailed description.

Function - Description

Abs - Calculates the absolute value of a number.

ACos - Calculates the arcCosine of a number.

ASin - Calculates the arcsine of a number.

ATan - Calculates the arctangent of a number.

BitAND - Performs a bitwise AND operation.

BitNOT - Performs a bitwise NOT operation.

BitOR - Performs a bitwise OR operation.

BitRotate - Performs a bit shifting operation, with rotation.

BitShift - Performs a bit shifting operation.

BitXOR - Performs a bitwise exclusive OR (XOR) operation.

Cos - Calculates the cosine of a number.

Ceiling - Returns a number rounded up to the next integer.

Exp - Calculates e to the power of a number.

Floor - Returns a number rounded down to the closest integer.

Log - Calculates the natural logarithm of a number.

Mod - Performs the modulus operation.

Random - Generates a pseudo-random float-type number.

Round - Returns a number rounded to a specified number of decimal places.

Sin - Calculates the sine of a number.

Sqrt - Calculates the square-root of a number.

SRandom - Set Seed for random number generation.

Tan - Calculates the tangent of a number.

Nope. They need to be coded manually.

Try the forum search again. There isn't much statistical function code around, but there is some.

##### Share on other sites

I had another poke around the forums and I really can't find any statistics snippets posted. Okay, thanks for confirming I wasn't wasting my time scripting these!

##### Share on other sites

For anyone finding this topic at a later date, here's the solution I came up with. It's all in one function so that it doesn't have to recalculate the same sums over and over if you need all of the stats. Of course, if you only need certain bits, you can hack away at it to make it less bloated.

Inputs:

`$YValues` - array containing Y values of dataset

`$XValues` - array containing X values of dataset

Returns:

RegStat[0] = Mean of Y

RegStat[1] = Mean of X

RegStat[2] = Slope of Best Fit (Linear Regression)

RegStat[3] = Intercept of Best Fit (Linear Regression)

RegStat[4] = Average Difference from Best Fit Line - NOTE: this is not the STEYX statistic I mentioned in my earlier post, this is a bit different.
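For reference, the values listed above correspond to the usual least-squares formulas. This is only a sketch in notation I am assuming here ($n$ is the number of points, $b$ the slope, $a$ the intercept), not something taken from the AutoIt help:

$$
b = \frac{\sum x_i y_i - \frac{1}{n}\sum x_i \sum y_i}{\sum x_i^2 - \frac{1}{n}\left(\sum x_i\right)^2},
\qquad
a = \bar{y} - b\,\bar{x}
$$

$$
\text{RegStat[4]} = \frac{1}{n}\sum_{i=1}^{n}\left|\, y_i - (b\,x_i + a) \,\right|
$$

By comparison, Excel's STEYX is the standard error of the predicted Y, which squares the residuals and divides by $n - 2$ rather than averaging their absolute values:

$$
\text{STEYX} = \sqrt{\frac{1}{n-2}\sum_{i=1}^{n}\bigl(y_i - (b\,x_i + a)\bigr)^2}
$$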

```autoit
; ---- RegStats - Regression Statistics Function ----
; Returns several statistics based on two datasets, stored in arrays.
;
; NOTE: use 0-based arrays.
;
; Inputs:
; $YValues - array containing Y values of dataset
; $XValues - array containing X values of dataset
;
; Returns:
; RegStat[0] = Mean of Y
; RegStat[1] = Mean of X
; RegStat[2] = Slope of Best Fit (Linear Regression)
; RegStat[3] = Intercept of Best Fit (Linear Regression)
; RegStat[4] = Average Difference from Best Fit Line
;
; Errors:
; 1 - array had fewer than 2 entries
; 2 - YValues and XValues were of different array lengths

Func RegStats($YValues, $XValues)

    ; COUNT (N)
    Local $Count = UBound($YValues)
    If $Count < 2 Then Return SetError(1, 0, 0)

    Local $CountX = UBound($XValues)
    If $Count <> $CountX Then Return SetError(2, 0, 0)

    ; SUMS (initialised to 0 so the running totals are numeric from the start)
    Local $SumY = 0
    Local $SumYSq = 0 ; not used below; handy if you extend this to R squared
    Local $SumX = 0
    Local $SumXSq = 0
    Local $SumXY = 0

    For $i = 0 To $Count - 1
        $SumX = $SumX + $XValues[$i]
        $SumY = $SumY + $YValues[$i]
        $SumYSq = $SumYSq + $YValues[$i] ^ 2
        $SumXSq = $SumXSq + $XValues[$i] ^ 2
        $SumXY = $SumXY + $XValues[$i] * $YValues[$i]
    Next

    ; MEAN
    Local $MeanY = $SumY / $Count
    Local $MeanX = $SumX / $Count

    ; SLOPE (Y on X)
    ; NOTE: if all X values are identical the denominator is 0 and the slope is undefined.
    Local $Slope = ($SumXY - $SumX * $SumY / $Count) / ($SumXSq - $SumX ^ 2 / $Count)

    ; INTERCEPT (Y on X)
    Local $Intercept = $MeanY - $MeanX * $Slope

    ; SUM DIFFERENCE FROM LINE (absolute distance of each point from the fitted line)
    Local $SumDifference = 0
    For $i = 0 To $Count - 1
        $SumDifference = $SumDifference + Abs($YValues[$i] - ($Slope * $XValues[$i] + $Intercept))
    Next

    ; AVERAGE DIFFERENCE FROM LINE
    Local $AverageDifference = $SumDifference / $Count

    ; RETURN VALUES
    Local $RegAnalyseArray[5]

    $RegAnalyseArray[0] = $MeanY
    $RegAnalyseArray[1] = $MeanX
    $RegAnalyseArray[2] = $Slope
    $RegAnalyseArray[3] = $Intercept
    $RegAnalyseArray[4] = $AverageDifference

    Return $RegAnalyseArray

EndFunc
```
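The original question also asked for RSQ, which RegStats does not return. The following is a minimal sketch of how it could be computed from the same running sums. The function name `RegRSq` and error code 3 are my own assumptions for illustration; they are not part of the function above or of AutoIt itself.

```autoit
; Hypothetical helper (not from the original post): R squared, as in Excel's RSQ,
; for a simple linear regression of Y on X. Uses 0-based arrays like RegStats.
;
; Errors:
; 1 - array had fewer than 2 entries
; 2 - YValues and XValues were of different array lengths
; 3 - all X values or all Y values are identical (R squared is undefined)
Func RegRSq($YValues, $XValues)
    Local $Count = UBound($YValues)
    If $Count < 2 Then Return SetError(1, 0, 0)
    If $Count <> UBound($XValues) Then Return SetError(2, 0, 0)

    Local $SumX = 0, $SumY = 0, $SumXSq = 0, $SumYSq = 0, $SumXY = 0
    For $i = 0 To $Count - 1
        $SumX += $XValues[$i]
        $SumY += $YValues[$i]
        $SumXSq += $XValues[$i] ^ 2
        $SumYSq += $YValues[$i] ^ 2
        $SumXY += $XValues[$i] * $YValues[$i]
    Next

    ; Centred sums of squares and cross products
    Local $Sxy = $SumXY - $SumX * $SumY / $Count
    Local $Sxx = $SumXSq - $SumX ^ 2 / $Count
    Local $Syy = $SumYSq - $SumY ^ 2 / $Count
    If $Sxx = 0 Or $Syy = 0 Then Return SetError(3, 0, 0)

    ; R squared is the squared correlation coefficient
    Return $Sxy ^ 2 / ($Sxx * $Syy)
EndFunc
```

It is kept separate here so RegStats keeps its five-element return array; folding it in as a sixth element would work just as well.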

Or, if you don't need the X values (I don't; I just thought I'd script them while I was at it), the next one assumes the X values are 1, 2, 3, 4...

```autoit
; ---- RegStatsY - Regression Statistics Function ----
; Returns several statistics based on one dataset, stored in an array.
; The other dataset is assumed to be 1, 2, 3, 4, 5...
;
; NOTE: use 0-based arrays.
;
; Inputs:
; $YValues - array containing Y values of dataset
;
; Returns:
; RegStatsY[0] = Mean of Y
; RegStatsY[1] = Mean of X
; RegStatsY[2] = Slope of Best Fit (Linear Regression)
; RegStatsY[3] = Intercept of Best Fit (Linear Regression)
; RegStatsY[4] = Average Difference from Best Fit Line
;
; Errors:
; 1 - array had fewer than 2 entries

Func RegStatsY($YValues)

    ; COUNT (N)
    Local $Count = UBound($YValues)
    If $Count < 2 Then Return SetError(1, 0, 0)

    ; X VALUES: 1, 2, 3, ... as described above (element 0 holds X = 1)
    Local $XValues[$Count]
    For $i = 0 To $Count - 1
        $XValues[$i] = $i + 1
    Next

    ; SUMS (initialised to 0 so the running totals are numeric from the start)
    Local $SumY = 0
    Local $SumYSq = 0 ; not used below; handy if you extend this to R squared
    Local $SumX = 0
    Local $SumXSq = 0
    Local $SumXY = 0

    For $i = 0 To $Count - 1
        $SumX = $SumX + $XValues[$i]
        $SumY = $SumY + $YValues[$i]
        $SumYSq = $SumYSq + $YValues[$i] ^ 2
        $SumXSq = $SumXSq + $XValues[$i] ^ 2
        $SumXY = $SumXY + $XValues[$i] * $YValues[$i]
    Next

    ; MEAN
    Local $MeanY = $SumY / $Count
    Local $MeanX = $SumX / $Count

    ; SLOPE (Y on X)
    Local $Slope = ($SumXY - $SumX * $SumY / $Count) / ($SumXSq - $SumX ^ 2 / $Count)

    ; INTERCEPT (Y on X)
    Local $Intercept = $MeanY - $MeanX * $Slope

    ; SUM DIFFERENCE FROM LINE (absolute distance of each point from the fitted line)
    Local $SumDifference = 0
    For $i = 0 To $Count - 1
        $SumDifference = $SumDifference + Abs($YValues[$i] - ($Slope * $XValues[$i] + $Intercept))
    Next

    ; AVERAGE DIFFERENCE FROM LINE
    Local $AverageDifference = $SumDifference / $Count

    ; RETURN VALUES
    Local $RegAnalyseArray[5]

    $RegAnalyseArray[0] = $MeanY
    $RegAnalyseArray[1] = $MeanX
    $RegAnalyseArray[2] = $Slope
    $RegAnalyseArray[3] = $Intercept
    $RegAnalyseArray[4] = $AverageDifference

    Return $RegAnalyseArray

EndFunc
```
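As a side note, when the X values really are 1, 2, 3, ..., n, the X sums have closed forms, so the X array is not strictly needed. This is just the standard arithmetic worked through under that assumption ($x_i = i$), not a change to the function above:

$$
\sum_{i=1}^{n} x_i = \frac{n(n+1)}{2},
\qquad
\sum_{i=1}^{n} x_i^2 = \frac{n(n+1)(2n+1)}{6},
\qquad
\bar{x} = \frac{n+1}{2}
$$

$$
\sum x_i^2 - \frac{1}{n}\left(\sum x_i\right)^2 = \frac{n(n^2-1)}{12}
$$

$$
b = \frac{\sum_{i=1}^{n} i\,y_i - \frac{n+1}{2}\sum_{i=1}^{n} y_i}{n(n^2-1)/12},
\qquad
a = \bar{y} - b\,\frac{n+1}{2}
$$

Shifting the X values to 0, 1, 2, ... would leave the slope unchanged and only move the intercept by one slope's worth.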

and finally, some example code to test them:

```autoit
; Test data: 12 paired observations
Local $TestDataX[12] = [94, 65, 88, 83, 92, 50, 67, 100, 100, 73, 90, 83]
Local $TestDataY[12] = [89, 52, 57, 78, 76, 30, 67, 96, 74, 65, 87, 78]

; RegStats takes the Y values first, then the X values
Local $Stats = RegStats($TestDataY, $TestDataX)

; $Stats[0] is the mean of Y and $Stats[1] is the mean of X
Local $str = ""
$str = $str & "Mean of Y is " & $Stats[0] & @CRLF
$str = $str & "Mean of X is " & $Stats[1] & @CRLF
$str = $str & "Slope of Linear Best Fit is " & $Stats[2] & @CRLF
$str = $str & "Intercept of Best Fit is " & $Stats[3] & @CRLF
$str = $str & "Average Difference from Best Fit is " & $Stats[4]

MsgBox(0, "", $str)
```
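Since the functions report problems through `SetError`, it may be worth checking `@error` before indexing the result. Here is a small sketch, assuming `RegStats` from the post above is in the same script; the variable names are made up for illustration. The arrays are deliberately of different lengths, so this call should come back with error code 2.

```autoit
; Deliberately mismatched lengths to trigger the "different array lengths" error
Local $ShortX[3] = [1, 2, 3]
Local $LongY[4] = [2, 4, 6, 8]

Local $Result = RegStats($LongY, $ShortX)
Local $Err = @error ; copy @error straight away, before any other function call resets it

If $Err Then
    MsgBox(0, "RegStats", "RegStats failed with error code " & $Err)
Else
    MsgBox(0, "RegStats", "Slope is " & $Result[2] & ", intercept is " & $Result[3])
EndIf
```

On failure the function returns 0 rather than an array, so indexing `$Result` without this check would stop the script with an array error.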

Should I also post this to the snippets section? I'm happy for anybody to use it.

##### Share on other sites

Should I also post this to the snippets section?

You might like to wait a bit until you have a slightly higher post count (for editing/updating code and such). Other than that, it's your call.
