czardas

Disambiguation in an Imaginary Language


czardas

In an imaginary language.

I declare a variable called CONFLICT.
I later use a function which expects the string parameter CONFLICT.
There is no way to differentiate between the variable and the string parameter.

What should the interpreter do?
1) Prevent the user from declaring the variable CONFLICT.
2) Allow the declaration and only throw an error if something actually breaks.
3) Something else - please say what.

orbs

There is no way to differentiate between the variable and the string parameter.

 

i read "there is no way to differentiate between variable name and variable value." do i understand correctly? if so, the imaginary language may have some very real problems. i think the imaginary developer should push his/her imagination a bit further and have this resolved, for this is the cause of the issue, and your suggested options of resolution are handling one possible symptom only.

Mat

With naming conflicts, variables in the innermost scope take preference. So in this case using CONFLICT inside the function refers to the argument. 
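A minimal sketch of that scoping rule in AutoIt itself, assuming a global variable and a function parameter that happen to share the name `$CONFLICT` (the function name `ShowConflict` is made up for illustration):

```autoit
Global $CONFLICT = "global value"

Func ShowConflict($CONFLICT)
    ; Inside the function, $CONFLICT names the parameter, not the global.
    Return $CONFLICT
EndFunc

ConsoleWrite(ShowConflict("argument value") & @CRLF) ; argument value
ConsoleWrite($CONFLICT & @CRLF)                      ; global value
```

The global is untouched; it is only hidden while the function body runs.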

trancexx

The way I read czardas, the problem here is conflict of literal value of the argument and its identifier.

PRINT(CONFLICT) // prints CONFLICT
CONFLICT = ABCD

PRINT(CONFLICT) // prints ABCD
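One way to picture the behaviour in that pseudocode is a lookup that falls back to the word itself when nothing has been assigned. This is only a sketch: the names `$g_oVars` and `Resolve` are hypothetical, and a `Scripting.Dictionary` COM object stands in for the imaginary interpreter's variable table.

```autoit
; Hypothetical variable table for the imaginary language.
Global $g_oVars = ObjCreate("Scripting.Dictionary")

Func Resolve($sWord)
    ; An assigned word yields its value; an unassigned word stands for itself.
    If $g_oVars.Exists($sWord) Then Return $g_oVars.Item($sWord)
    Return $sWord
EndFunc

ConsoleWrite(Resolve("CONFLICT") & @CRLF) ; CONFLICT
$g_oVars.Add("CONFLICT", "ABCD")          ; CONFLICT = ABCD
ConsoleWrite(Resolve("CONFLICT") & @CRLF) ; ABCD
```

The ambiguity in the thread is visible here: once the assignment happens, there is no longer any way to ask for the literal word.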


czardas

i read "there is no way to differentiate between variable name and variable value." do i understand correctly?

 

Assume this to be true.

With naming conflicts, variables in the innermost scope take preference. So in this case using CONFLICT inside the function refers to the argument. 

 

Unlike the following scenario.

The way I read czardas, the problem here is conflict of literal value of the argument and its identifier.

PRINT(CONFLICT) // prints CONFLICT
CONFLICT = ABCD

PRINT(CONFLICT) // prints ABCD

 

This scenario!

Would the language not expect "CONFLICT" as a parameter and CONFLICT as a variable?

 

That would be cheating. :naughty:

czardas

Here's an idea! Preventing the variable CONFLICT from being passed to the function PRINT, but allowing it in other situations, is possibly a solution.


CONFLICT = ABCD
ABCD = CONFLICT

PRINT(CONFLICT) // prints CONFLICT (always)
PRINT(ABCD) // prints ABCD (because this value was assigned)


It seems to be more or less what Mat was saying. It's still a bit tricky. Actually!



Local $sCode = "distance (Floor( au + distance(distance, au, angst )) , au , angst ) + angst" ; Imaginary syntax
$sCode = StringRegExpReplace($sCode, "(\bdistance)(\s*\()", "_$1(")

Local $sDistParam = '(au|angst)' ; Potential conflicts
Local $sRegExpDistParam = '(_distance\()(.+)(\(.+\))*(\s*,\s*)' & $sDistParam & '(\s*,\s*)' & $sDistParam & '\s*\)'
Local $iWorking = -1

While $iWorking
    $sCode = StringRegExpReplace($sCode, $sRegExpDistParam, '$1$2$3$4"$5","$7")')
    $iWorking = @extended
WEnd
MsgBox(0, "Interpreter", $sCode)

Piece of cake! I wish. :(

I thought I had it. I know I can fix this.

Mat

Solutions are: start variables with a letter, use quotes for strings, or start string literals with a character.
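AutoIt itself works roughly along these lines: variables carry a `$` prefix and string literals are quoted, so the two can never be confused. A tiny illustration (the variable name is chosen only to mirror the thread):

```autoit
Local $CONFLICT = "ABCD"

ConsoleWrite("CONFLICT" & @CRLF) ; the literal text: CONFLICT
ConsoleWrite($CONFLICT & @CRLF)  ; the variable's value: ABCD
```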

Anything else would just be confusing.

czardas

Sorry, strings are to be reserved for numeric variants with an undefined base, e.g. 'ACE'. :) It's a kind of numbers-only language with minimal syntax. So 100 is not necessarily the same as '100', and string variables may only contain the letters A-F.
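To see why 100 and '100' can differ, here is a small sketch that assumes the quoted form is read as hexadecimal. That base is an assumption made for the example only; the post leaves it undefined. AutoIt's built-in `Dec` does the conversion.

```autoit
; Assumption: quoted values are hexadecimal; the thread does not fix the base.
Local $iPlain = 100
Local $iQuoted = Dec("100")

ConsoleWrite($iPlain & " vs " & $iQuoted & @CRLF) ; 100 vs 256
ConsoleWrite(Dec("ACE") & @CRLF)                  ; 2766
```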

czardas

I couldn't find a working regexp and gave up. If someone can figure it out, I'd love to see a more elegant solution. I eventually resorted to brute force.


Local $sCode = "Distance (Floor( au + Random(Distance(Distance, au, angst ), au, angst )) , au , angst ) + angst + Distance(1, au , angst )" ; Imaginary syntax

$sCode = StringRegExpReplace($sCode, "(\s*,\s*)", ",")
$sCode = StringRegExpReplace($sCode, "(Random|Distance|Floor)(\s*\()", "_$1(")

Local $aFunctions = StringRegExp($sCode, "(_\w+[^_]+)",3)

Local $iCharCount, $sParentheses, $iCurrentFunc, _
$sSearch = "_Distance(", $sDistParam = '(au|angst)' ; Potential conflicts

For $i = 0 To UBound($aFunctions) -1
    If Not StringInStr($aFunctions[$i], $sSearch) Then ContinueLoop

    $iCharCount = 1
    $sParentheses = 1
    $iCurrentFunc = $i

    For $j = $iCurrentFunc To UBound($aFunctions) -1
        For $k = 1 To StringLen($aFunctions[$j])
            If $iCharCount > Stringlen($sSearch) Then
                If StringMid($aFunctions[$j], $k, 1) = ")" Then
                    $sParentheses -= 1
                ElseIf StringMid($aFunctions[$j], $k, 1) = "(" Then
                    $sParentheses += 1
                EndIf
            EndIf

            If $sParentheses = 0 Then
                $aFunctions[$j] = StringRegExpReplace(StringLeft($aFunctions[$j], $k), "(.*)" & $sDistParam & '(,)' & $sDistParam & "(\s*\)\z)",'$1"$2"$3"$4"$5') _
                & StringTrimLeft($aFunctions[$j], $k)
                ExitLoop 2
            EndIf
            $iCharCount += 1
        Next
    Next
Next

$sCode = ""
For $i = 0 To UBound($aFunctions) -1
    $sCode &= $aFunctions[$i]
Next

MsgBox(0, "", $sCode)


It appears to be parsing correctly now, even though the unfinished code looks somewhat unsophisticated - it's just a proof of concept for the time being. Identical variable names can be used, but will never be recognised by any function in which they appear as parameters. The same goes for function names, which will never be mistaken for variables.

 Disambiguation

; Before
Distance (Floor( au + Random(Distance(Distance, au, angst ), au, angst )) , au , angst ) + angst + Distance(1, au , angst )

; After
_Distance(_Floor( au + _Random(_Distance(Distance,"au","angst" ),au,angst )),"au","angst" ) + angst + _Distance(1,"au","angst" )
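The core of the brute-force approach is counting parentheses until the depth returns to zero. Pulled out on its own, that step might look like the sketch below; the function name `FindClosingParen` is hypothetical and not part of the script above.

```autoit
; Returns the position of the ")" that closes the "(" at position $iOpen,
; or sets @error and returns 0 if the parentheses never balance.
Func FindClosingParen($sText, $iOpen)
    Local $iDepth = 0
    For $i = $iOpen To StringLen($sText)
        Switch StringMid($sText, $i, 1)
            Case "("
                $iDepth += 1
            Case ")"
                $iDepth -= 1
                If $iDepth = 0 Then Return $i
        EndSwitch
    Next
    Return SetError(1, 0, 0)
EndFunc

Local $sCode = "Distance(Floor(au), au, angst) + angst"
Local $iOpen = StringInStr($sCode, "(")
Local $iClose = FindClosingParen($sCode, $iOpen)
ConsoleWrite(StringMid($sCode, $iOpen, $iClose - $iOpen + 1) & @CRLF) ; (Floor(au), au, angst)
```

With the argument list isolated like this, only the last two arguments of that one call need to be inspected for unit names.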


I don't actually expect anyone to write such a messy expression as the one above. Nor do I expect anyone to convert a random number of astronomical units to angstroms. :whistle:

TypeIt

I couldn't find a working regexp and gave up. If someone can figure it out, I'd love to see a more elegant solution.

A finite-state automaton cannot accept every context-free language. The language you're describing, with its arbitrarily nested parentheses, is context-free but not regular. That means that no regular expression in the formal sense can do that for you.
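As a side note, and only as a sketch: AutoIt's `StringRegExp` is built on PCRE, whose recursion construct `(?R)` goes beyond regular expressions in the formal sense, so a single balanced pair of parentheses can still be matched. This does not replace a parser, since it only finds the span and tells you nothing about what the arguments mean.

```autoit
Local $sCode = "Distance(Floor(au), au) + angst"

; An opening parenthesis, then any mix of non-parenthesis runs and
; recursive matches of the whole pattern, then the closing parenthesis.
Local $aMatch = StringRegExp($sCode, '\((?:[^()]++|(?R))*\)', 1)
If Not @error Then ConsoleWrite($aMatch[0] & @CRLF) ; (Floor(au), au)
```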

A simplified grammar of your language could be:

expression      -> expression + term | term
term            -> term * factor | factor
factor          -> ( expression ) | string | number | identifier
You can write a simple recursive descent parser which will accept your language. You simply write these rules in AutoIt:

#include <Array.au3>


#Region Parser

Func ParseExpression (Const ByRef $charArray, ByRef $index, ByRef $errors)
    Local $expression = ParseTerm ($charArray, $index, $errors)
    While $charArray [$index] == "+"
        $index += 1
        Local $newExpression = ["+", $expression, ParseTerm ($charArray, $index, $errors)]
        $expression = $newExpression
    WEnd
    Return $expression
EndFunc

Func ParseTerm (Const ByRef $charArray, ByRef $index, ByRef $errors)
    Local $term = ParseFactor ($charArray, $index, $errors)
    While $charArray [$index] == "*"
        $index += 1
        Local $newTerm = ["*", $term, ParseFactor ($charArray, $index, $errors)]
        $term = $newTerm
    WEnd
    Return $term
EndFunc

Func ParseFactor (Const ByRef $charArray, ByRef $index, ByRef $errors)
    If $charArray [$index] == "(" Then
        $index += 1
        Local $factor = ["()", ParseExpression ($charArray, $index, $errors)]
        If $charArray [$index] == ")" Then
            $index += 1
        Else
            _ArrayAdd ($errors, "syntax error: missing ')' at " & $index)
        EndIf
        Return $factor
    EndIf

    If $charArray [$index] == '"' Then Return ParseString ($charArray, $index, $errors)

    If NextIsNumber ($charArray, $index) Then Return ParseNumber ($charArray, $index, $errors)

    Return ParseIdentifier ($charArray, $index, $errors)
EndFunc

Func NextIsNumber (Const ByRef $charArray, ByRef $index)
    Local $charCode = Asc ($charArray [$index])
    Return $charCode >= Asc ("0") And $charCode <= Asc ("9")
EndFunc

Func ParseNumber (Const ByRef $charArray, ByRef $index, ByRef $errors)
    Local $number = 0
    Do
        $number *= 10
        $number += Asc ($charArray [$index]) - Asc ("0")
        $index += 1
    Until Not NextIsNumber ($charArray, $index)
    Local $result = ["number", $number]
    Return $result
EndFunc

Func ParseString (Const ByRef $charArray, ByRef $index, ByRef $errors)
    Local $string = ""
    $index += 1

    ; Keep it simple and do not allow escape sequences.
    While $charArray [$index] <> '"' And $index < UBound ($charArray) - 1
        $string &= $charArray [$index]
        $index += 1
    WEnd

    If $charArray [$index] == '"' Then
        $index += 1
    Else
        _ArrayAdd ($errors, "syntax error: missing '""' at " & $index)
    EndIf

    Local $result = ["string", $string]
    Return $result
EndFunc

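Func NextIsIdentifierStart (Const ByRef $charArray, ByRef $index)
    ; A statement split across lines needs the " _" continuation marker.
    Return $charArray [$index] == "$" _
        Or $charArray [$index] == "@" _
        Or $charArray [$index] == "_" _
        Or StringRegExp ($charArray [$index], "\p{L}")
EndFunc

Func NextIsIdentifierPart (Const ByRef $charArray, ByRef $index)
    Return StringRegExp ($charArray [$index], "\w")
EndFunc

Func ParseIdentifier (Const ByRef $charArray, ByRef $index, ByRef $errors)
    ; Report anything that cannot start an identifier instead of consuming it.
    If Not NextIsIdentifierStart ($charArray, $index) Then
        _ArrayAdd ($errors, "syntax error: unexpected character at " & $index)
        Local $empty = ["identifier", ""]
        Return $empty
    EndIf

    Local $identifier = $charArray [$index]
    $index += 1
    ; Pass the array and index (not a single character) and advance on
    ; every iteration so that the loop terminates.
    While NextIsIdentifierPart ($charArray, $index)
        $identifier &= $charArray [$index]
        $index += 1
    WEnd
    Local $result = ["identifier", $identifier]
    Return $result
EndFunc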

Func Parse (Const ByRef $expression, ByRef $errors)
    Local $charArray = StringSplit (StringRegExpReplace ($expression & Chr (0), "\s", ""), "")
    Local $index = 1 ; StringSplit stores the element count at index 0, so the characters start at index 1.
    Local $result = ParseExpression ($charArray, $index, $errors)

    If $index < UBound ($charArray) - 1 Then _ArrayAdd ($errors, "syntax error: unexpected character at " & $index)

    If UBound ($errors) > 0 Then SetError (1, UBound ($errors))

    Return $result
EndFunc

#EndRegion


#Region Visitor

Func VisitExpression (ByRef $expression)
    Switch $expression [0]
        Case "+"
            Return VisitAddition ($expression)
        Case "*"
            Return VisitMultiplication ($expression)
        Case "()"
            Return VisitParentheses ($expression)
        Case "number"
            Return VisitNumber ($expression)
        Case "string"
            Return VisitString ($expression)
        Case "identifier"
            Return VisitIdentifier ($expression)
    EndSwitch
EndFunc

Func VisitAddition (ByRef $expression)
    Return VisitExpression ($expression [1]) & " + " & VisitExpression ($expression [2])
EndFunc

Func VisitMultiplication (ByRef $expression)
    Return VisitExpression ($expression [1]) & " * " & VisitExpression ($expression [2])
EndFunc

Func VisitParentheses (ByRef $expression)
    Return "(" & VisitExpression ($expression [1]) & ")"
EndFunc

Func VisitNumber (ByRef $expression)
    Return $expression [1]
EndFunc

Func VisitString (ByRef $expression)
    Return '"' & $expression [1] & '"'
EndFunc

Func VisitIdentifier (ByRef $expression)
    Return $expression [1]
EndFunc

#EndRegion


Local $defaultExpression = '0*1 + 2*(34 + "5") + 6'
Local $expression = InputBox ("", "Please enter an expression.", $defaultExpression)
If @error Then Exit

Local $output = "Before: " & $expression & @CRLF & @CRLF

Local $errors [0]
Local $abstractSyntaxTree = Parse ($expression, $errors)

If @error Then
    $output &= UBound ($errors) & " error(s):" & @CRLF
    For $i = 1 To UBound ($errors)
        $output &= $i & ". " & $errors [$i - 1] & @CRLF
    Next
    $output &= @CRLF
EndIf

Local $transformedExpression = VisitExpression ($abstractSyntaxTree)
$output &= "After: " & $transformedExpression & @CRLF & @CRLF

Func _Sin ($argument)
    Return Sin ($argument)
EndFunc
Local $result = Execute ($transformedExpression)

$output &= "Result: " & $result

MsgBox (0, "", $output)
A general visitor, i.e. VisitExpression ($visitAddition, $visitMultiplication, ..., ByRef $expression) calling VisitAddition (ByRef $expression, $visitExpression), isn't possible without first-class functions.
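To make the shape of the syntax tree concrete, here is a small sketch that builds the tree for 2 + 3 * 4 by hand, in the same nested-array layout the parser produces, and walks it with a second, evaluating visitor. The function name `EvalNode` is hypothetical and only covers numbers, addition and multiplication.

```autoit
; Evaluates a node of the form ["number", n], ["+", left, right] or ["*", left, right].
Func EvalNode($aNode)
    Switch $aNode[0]
        Case "number"
            Return $aNode[1]
        Case "+"
            Return EvalNode($aNode[1]) + EvalNode($aNode[2])
        Case "*"
            Return EvalNode($aNode[1]) * EvalNode($aNode[2])
    EndSwitch
    Return SetError(1, 0, 0) ; unknown node type
EndFunc

Local $aTwo = ["number", 2]
Local $aThree = ["number", 3]
Local $aFour = ["number", 4]
Local $aProduct = ["*", $aThree, $aFour]
Local $aSum = ["+", $aTwo, $aProduct]

ConsoleWrite(EvalNode($aSum) & @CRLF) ; 14
```

Because the multiplication sits deeper in the tree than the addition, operator precedence falls out of the structure rather than the visitor.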
czardas

Whoa - TypeIt, that's brilliant! I very much appreciate you spending the time to write such a detailed script for me. I can see it will be very helpful - not only for me. I will study it and try to make the best use of it. Thank you. :thumbsup:
