Numbers and Variants

Jon · December 9, 2003

After the posts on how numbers are handled I had a deeper dig into VBScript to see how it handled them.

Run this code with cscript.exe test.vbs

Dim vVar

vVar = 1
wscript.echo vVar & " - " & VarType(vVar)

vVar = vVar + 32767
wscript.echo vVar & " - " & VarType(vVar)

vVar = vVar + 0.1
wscript.echo vVar & " - " & VarType(vVar)

vVar = vVar - 0.1
wscript.echo vVar & " - " & VarType(vVar)

vVar = vVar - 1
wscript.echo vVar & " - " & VarType(vVar)

vVar = 2147483647
wscript.echo vVar & " - " & VarType(vVar)

vVar = 2147483647 + 1
wscript.echo vVar & " - " & VarType(vVar)

vVar = vVar - 1
wscript.echo vVar & " - " & VarType(vVar)

The VarTypes given are 2 (16-bit int), 3 (32-bit int) and 5 (double float). From the test you can see that once an integer grows past the 32-bit limit it is converted to a double. Also, if a floating-point operation takes place (including division, even if the answer would have been an integer) the result is converted to a double.

Also worth noting is that once a variant has "become" a double it can never go back to being a straight integer.
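The promotion rule that the test output suggests can be sketched in C++. This is only a model inferred from the results above, not VBScript's actual implementation. The names NumKind, Number and add are made up for illustration, and the tag values simply reuse the VarType numbers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical tags; the numeric values mirror VarType 2, 3 and 5.
enum class NumKind { Int16 = 2, Int32 = 3, Double = 5 };

struct Number {
    NumKind kind;
    double  value;  // a double holds every 32-bit integer exactly
};

inline bool fits_int16(double v) {
    return v >= std::numeric_limits<std::int16_t>::min() &&
           v <= std::numeric_limits<std::int16_t>::max();
}

inline bool fits_int32(double v) {
    return v >= std::numeric_limits<std::int32_t>::min() &&
           v <= std::numeric_limits<std::int32_t>::max();
}

// Assumed rule: start from the wider operand kind, widen on overflow,
// and never narrow again once the kind is Double.
inline Number add(const Number &a, const Number &b) {
    const double sum = a.value + b.value;
    NumKind kind = std::max(a.kind, b.kind);
    if (kind == NumKind::Int16 && !fits_int16(sum)) kind = NumKind::Int32;
    if (kind == NumKind::Int32 && !fits_int32(sum)) kind = NumKind::Double;
    return Number{kind, sum};
}

// Usage check replaying a few steps of the test script.
inline void check_promotion() {
    Number v{NumKind::Int16, 1.0};
    v = add(v, Number{NumKind::Int16, 32767.0});
    assert(v.kind == NumKind::Int32);            // 32768 no longer fits 16 bits
    v = add(v, Number{NumKind::Double, 0.1});
    assert(v.kind == NumKind::Double);
    v = add(v, Number{NumKind::Double, -0.1});
    assert(v.kind == NumKind::Double);           // stays a double
}
```

Subtraction and multiplication would follow the same pattern under this model; division would always produce Double.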

I can just about imagine making this work in AutoIt3 to make it a little more intuitive than the strict types we currently have.

Comments?

Valik · December 9, 2003

I'm like Larry, I don't really use math with AutoIt, so it doesn't matter a whole lot to me how it works, either. My only issue is will there be a performance hit in any way by the changes?

Nutster · December 10, 2003

I just want to make sure that calculations have a consistent behaviour that could then be documented. :iamstupid:

To determine whether an integer should be interpreted as a double instead requires checking the calculation's original values and results on each addition, subtraction or multiplication.

e.g.

30000 + 10000

-20000 - 30000

16000 * 25000

This is actually a lot of work. I could do it by converting the arguments to double, performing the operation on the integer arguments and on the double arguments separately, then converting the integer result to a double and comparing it with the double result. If they are the same, return the integer result; otherwise give the double result. This will slow down calculations considerably. Any other thoughts on how to do this?
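One possible alternative to comparing integer and double results, sketched here in C++: do the arithmetic in a wider integer type and range-check the result. This is a different technique from the one described above, not something proposed in the thread. It assumes 32-bit script integers, and the names ArithResult, from_wide and checked_* are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

struct ArithResult {
    bool         is_int;  // true when the result fits a 32-bit integer
    std::int32_t i;       // valid only when is_int is true
    double       d;       // always valid
};

// Classify a 64-bit intermediate result.
inline ArithResult from_wide(std::int64_t wide) {
    ArithResult r{false, 0, static_cast<double>(wide)};
    if (wide >= std::numeric_limits<std::int32_t>::min() &&
        wide <= std::numeric_limits<std::int32_t>::max()) {
        r.is_int = true;
        r.i = static_cast<std::int32_t>(wide);
    }
    return r;
}

// Sums, differences and products of two 32-bit values cannot overflow 64 bits.
inline ArithResult checked_add(std::int32_t a, std::int32_t b) {
    return from_wide(std::int64_t{a} + b);
}
inline ArithResult checked_sub(std::int32_t a, std::int32_t b) {
    return from_wide(std::int64_t{a} - b);
}
inline ArithResult checked_mul(std::int32_t a, std::int32_t b) {
    return from_wide(std::int64_t{a} * b);
}

// Usage check with the three example calculations plus two overflowing ones.
inline void check_arith() {
    assert(checked_add(30000, 10000).is_int && checked_add(30000, 10000).i == 40000);
    assert(checked_sub(-20000, 30000).i == -50000);
    assert(checked_mul(16000, 25000).i == 400000000);
    assert(!checked_add(2147483647, 1).is_int);
    assert(!checked_mul(65536, 65536).is_int);
}
```

The cost per operation is one widening and two comparisons, with no round trip through double on the integer path.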

:whistle:

Edited December 10, 2003 by Nutster

Jon · December 10, 2003

Actually the way it currently works is whenever a variant changes values a new representation is worked out.

So, doing

$var = 10

Actually does

$var (int) = 10

$var (float) = 10.0

$var (string) = "10"

That's why internally you can just use .fvalue, .szvalue or .nvalue for a variant and not care how it was initialised.

So if we just get rid of integers altogether and just use doubles everywhere then perversely performance will increase as less conversion happens... Then we just have two types: strings and numbers.

The only problem then is if you have a double 10.0223 how can you tell - for the IsInt/IsFloat functions - if it is a whole number or not....seems simple but I can't think of a way. :whistle:
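One common way to make that check in C++ is to compare the value with its own floor. This is a minimal sketch only; is_whole is a made-up helper name, and how IsInt/IsFloat should treat infinities or NaN is an assumption here (both are reported as not whole).

```cpp
#include <cassert>
#include <cmath>

// True when x is finite and has no fractional part.
inline bool is_whole(double x) {
    return std::isfinite(x) && std::floor(x) == x;
}

// Usage check.
inline void check_is_whole() {
    assert(is_whole(10.0));
    assert(!is_whole(10.0223));
    assert(is_whole(-3.0));
    assert(is_whole(1e20));            // large doubles are always whole
    assert(!is_whole(std::nan("")));
}
```

A value that is whole only "on paper" (for example the result of a chain of operations that accumulated rounding error) will still be reported as not whole, so a script-level IsInt built on this would be strict.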

There is soooooo much overhead in the way that variants are created and copied around the place that I think any changes to doubles will be unnoticeable. I'd say the actual arithmetic operations only account for 1% of the work AutoIt does.

tylo · December 27, 2003

I do a lot of calculations in my scripts. But there is an issue that must be looked into: String <-> double conversions.

The double-to-string conversion today only uses the sprintf format string "%.15f". This is unacceptable for at least two reasons:

1) Printing e.g. numbers like 100.57 and 100.65 will result in something like 100.569999999999999 and 100.650000000000001.
2) Big (e.g. 1e+20) and small numbers (e.g. 1e-20) are impossible to input/output, even though double has no trouble representing them.

The only solution to this problem is to support "engineer" real numbers with exponents, like in 2).

The following code (in quote) parses a floating point number matching the following regular expression, and throws an error if it is not well-formed: [0-9]+\.?[0-9]*([Ee][-+]?[0-9]+)?

(Sorry, next time I will submit code suggestions the proper way)

AUT_RESULT AutoIt_Script::Lexer_Number(const char *szLine, uint &iPos, Token &rtok, char *szTemp)
{
   uint   iPosTemp = 0;
   int    nTemp;

   if ( (szLine[iPos] == '0') && (szLine[iPos+1] == 'x' || szLine[iPos+1] == 'X') )
   {
      // Hex number
      ...
   }
   else
   {
      // float or integer (stored as doubles in either case)
      enum   { L_DIGIT = 1, L_COMMA = 2, L_EXP = 4, L_SIGN = 8, L_MORE = 16 };
      uint   iState = (L_DIGIT | L_COMMA | L_EXP | L_MORE);
      for (;;)
      {
         char ch = szLine[iPos];
         if (ch >= '0' && ch <= '9' && (iState & L_DIGIT))
            iState &= ~(L_MORE | L_SIGN); // no more chars required; a sign is only valid directly after the 'e'
         else if (ch == '.' && (iState & L_COMMA))
            iState = (L_DIGIT | L_EXP);
         else if ((ch == 'e' || ch == 'E') && (iState & L_EXP))
            iState = (L_DIGIT | L_SIGN | L_MORE);
         else if ((ch == '+' || ch == '-') && (iState & L_SIGN))
            iState = (L_DIGIT | L_MORE);
         else if (iState & L_MORE)
            return AUT_ERR; // results in a GENPARSE error
         else
            break;

         szTemp[iPosTemp++] = szLine[iPos++];
      }

      szTemp[iPosTemp] = '\0';    // Terminate
      rtok.m_nType     = TOK_VARIANT;
      rtok.m_Variant   = atof(szTemp);
   }
   return AUT_OK;
}

Modified "default:" section in the Lexer function: (btw, what's the deal with the tokVariant variable - tok should be enough. Also, why push_back in every case - can be done after switch).

AUT_RESULT AutoIt_Script::Lexer(const char *szLine, VectorToken &vLineToks)
....
default:
   // If nothing else matched must be the start of a number OR keyword/func
   ch = szLine[iPos];
   if ( ((ch >= '0' && ch <= '9') || ch == '.') && Lexer_Number(szLine, iPos, tokVariant, szTemp) == AUT_OK )
   {
      vLineToks.push_back(tokVariant);
   }
   else if ( (ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z') || ch == '_')
   {
      Lexer_KeywordOrFunc(szLine, iPos, tokVariant, szTemp);
      vLineToks.push_back(tokVariant);
   }
   else
   {
      FatalError(IDS_AUT_E_GENPARSE, iPos);
      tok.m_nType = TOK_END;
      vLineToks.push_back(tok);
      return AUT_ERR;
   }
   break;

And finally, the much simplified GenStringValue() - note the "%.15g" format.

void Variant::GenStringValue(void)
{
   char   szTemp[1024]; // It is unclear just how many 0000 the sprintf function can add...

   // Do we need to generate or does a value already exist?
   if (m_bszValueAvail == true)
      return;

   if (m_nVarType == VAR_DOUBLE)
   {
      // Work out the string representation of the number, don't print trailing zeros
      sprintf(szTemp, "%.15g", m_fValue); // %g: up to 15 significant digits in total (default is 6)
      m_szValue = new char[strlen(szTemp)+1];
      strcpy(m_szValue, szTemp);
   }
   else
   {
      // Oh dear, someone is requesting the szValue() of an array or reference, throw them a blank string
      m_szValue = new char[1];
      m_szValue[0] = '\0';
   }

   m_bszValueAvail = true;

} // GenStringValue()

IsFloat(), etc. must probably be modified as well.

Edited December 28, 2003 by tylo

tylo · December 28, 2003

An optimized AutoIt_Script::Lexer_Number() float parser:

else
{
   // float or integer (stored as doubles in either case)
   enum { L_DIGIT = 1, L_COMMA = 2, L_EXP = 4, L_SIGN = 8, L_MORE = 16, L_OK = 32 };
   uint iState = (L_DIGIT | L_COMMA | L_EXP | L_MORE);
   for (;;)
   {
      switch (szLine[iPos])
      {
         case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9':
            // test not needed - digit acceptable in all states.
            // remove more bit - no more chars required; a sign is only valid directly after the 'e'
            iState = (iState & ~(L_MORE | L_SIGN)) | L_OK;
            break;
         case '.':
            if (iState & L_COMMA) iState = (L_DIGIT | L_EXP | L_OK);
            break;
         case 'e': case 'E': // more bit should be unset
            if ( (iState & (L_EXP | L_MORE)) == L_EXP ) iState = (L_DIGIT | L_SIGN | L_MORE | L_OK);
            break;
         case '+': case '-':
            if (iState & L_SIGN) iState = (L_DIGIT | L_MORE | L_OK);
            break;
      }
      if (iState & L_OK)
         szTemp[iPosTemp++] = szLine[iPos++];
      else if (iState & L_MORE)
         return AUT_ERR; // results in a GENPARSE error
      else
         break; // done
      iState &= ~L_OK;
   }
   szTemp[iPosTemp] = '\0'; // Terminate
   rtok.m_nType = TOK_VARIANT;
   rtok.m_Variant = atof(szTemp);
}

Jon · December 28, 2003

Thanks, I'm looking at the changes now.

I've got no idea why .15f is printing incorrectly and .15g is OK. As far as the docs go f should be fine (until working with massive numbers). But whatever. :whistle:

The tokVariant variable is only used for storing variants, as when you work with variants some memory allocations/conversions take place. I used a separate tok just for storing TOK_ stuff as it stayed the same "type" (integer). In tests it doubled the speed of lexing, although this difference may have vanished since I rewrote the variant class. This is also why I don't just do a push_back at the end of the switch, as sometimes I push a tok and other times a tokVariant.

tylo · January 2, 2004

Thanks for doing this so fast, and a Happy New Year!

The reason %.15g works is that its precision specifies the total number of significant digits (before and after the decimal point), and trailing zeros are dropped. E.g. 30001/3 = 10000.3333333333 (15 digits in total; the resolution of a double is about 16).
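The difference between the two formats can be checked with a small standalone C++ sketch. The helper names format_fixed15 and format_general15 are made up for this illustration and are not part of the AutoIt source.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Fixed notation: always 15 digits after the decimal point.
inline std::string format_fixed15(double v) {
    char buf[512];  // generous: the largest double needs about 330 chars here
    std::snprintf(buf, sizeof buf, "%.15f", v);
    return buf;
}

// General notation: at most 15 significant digits, trailing zeros removed,
// switching to an exponent for very large or very small values.
inline std::string format_general15(double v) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.15g", v);
    return buf;
}

// Usage check.
inline void check_formats() {
    assert(format_general15(100.57) == "100.57");
    assert(format_general15(100.65) == "100.65");
    assert(format_general15(30001.0 / 3.0) == "10000.3333333333");
    assert(format_general15(1e-20) == "1e-20");
    assert(format_fixed15(1e-20) == "0.000000000000000");  // the value is lost
    assert(format_fixed15(100.57) != "100.57");            // rounding noise shows
}
```

The exponent checks assume a C runtime that prints two-digit exponents; some older runtimes print three digits (1e-020), so those strings may differ there.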

Just to complete the float/math stuff, here are a few functions that it now makes sense to add. Go ahead and rename them, or add others (I just used the C lib names):

  F_ABS, F_SIN, F_COS, F_TAN,
  F_ASIN, F_ACOS, F_ATAN, F_SQRT,
  F_LOG, F_EXP, F_POW, F_ROUND,
  F_MAX
};

  "Abs", "Sin", "Cos", "Tan",       // 42
  "ASin", "ACos", "ATan", "Sqrt",   // 43
  "Log", "Exp", "Pow", "Round"      // 44
};

  {1,1}, {1,1}, {1,1}, {1,1},       // 42
  {1,1}, {1,1}, {1,1}, {1,1},       // 43
  {1,1}, {1,1}, {2,2}, {2,2}        // 44
};

    case F_ABS: vResult = fabs(vParams[0].fValue()); return AUT_OK;
    case F_SIN: vResult = sin(vParams[0].fValue()); return AUT_OK;
    case F_COS: vResult = cos(vParams[0].fValue()); return AUT_OK;
    case F_TAN: vResult = tan(vParams[0].fValue()); return AUT_OK;
    case F_ASIN: vResult = asin(vParams[0].fValue()); return AUT_OK;
    case F_ACOS: vResult = acos(vParams[0].fValue()); return AUT_OK;
    case F_ATAN: vResult = atan(vParams[0].fValue()); return AUT_OK;
    case F_SQRT: vResult = sqrt(vParams[0].fValue()); return AUT_OK;
    case F_LOG: vResult = log(vParams[0].fValue()); return AUT_OK;
    case F_EXP: vResult = exp(vParams[0].fValue()); return AUT_OK;
    case F_POW:
      vResult = pow(vParams[0].fValue(), vParams[1].fValue());
      return AUT_OK;
    case F_ROUND: {
      double m = 1.0;
      int n = vParams[1].nValue();
      while (--n >= 0) m *= 10.0;
      vResult = floor(vParams[0].fValue() * m + 0.5) / m;
      return AUT_OK;
    }
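The rounding formula in the F_ROUND case can be tried outside AutoIt with a standalone C++ sketch. round_to is a made-up name for this illustration; the arithmetic is the same scale, add 0.5, floor, unscale sequence as above.

```cpp
#include <cassert>
#include <cmath>

// Round value to the given number of decimal places.
// A places value of zero or less leaves the scale at 1, as in the loop above.
inline double round_to(double value, int places) {
    double m = 1.0;
    for (int n = places; n > 0; --n) m *= 10.0;
    return std::floor(value * m + 0.5) / m;
}

// Usage check.
inline void check_round_to() {
    assert(std::fabs(round_to(3.14159, 2) - 3.14) < 1e-12);
    assert(round_to(2.5, 0) == 3.0);
    assert(round_to(-2.5, 0) == -2.0);  // halves go towards +infinity
    assert(round_to(7.0, -1) == 7.0);   // negative places: no scaling
}
```

Because of the floor(x + 0.5) form, exact halves always round upwards, so -2.5 becomes -2 rather than -3 as C's round() would give.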

Jon · January 2, 2004

Jeremy submitted pretty much most of those a while back, but when I added them it added 20KB to the code (must have forced the linking of extra libs) so I cut them out again.

tylo · January 2, 2004

It added only 2KB to the .exe when I compiled it with VC6:

3.0.84:

2.01.04  17:10          78 848  AutoIt3.exe
 2.01.04  17:10         184 320  AutoItSC.bin

3.0.84 with math functions:

2.01.04  13:57          80 896  AutoIt3.exe
 2.01.04  13:57         188 416  AutoItSC.bin

Edited January 2, 2004 by tylo

Jon · January 4, 2004

Odd. Seems the same here now too. :whistle:

CyberSlug · January 4, 2004

So will we get any math functions? [Or where should Tylo's code go: utility.h/utility.cpp? its own file(s)?]

Exponentiation and rounding are the most important to me....

Edited January 4, 2004 by CyberSlug

Jon · January 4, 2004

So will we get any math functions? [Or where should Tylo's code go: utility.h/utility.cpp? its own file(s)?]

Exponentiation and rounding are the most important to me....

All those above are in my 3.0.85 - some modified though to use versions that JP and Jeremy submitted a while ago (some extra error trapping on some).
