Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interfaces: More flexible abstract syntax binding instead of define? #267

Open
andreasabel opened this issue Nov 13, 2019 · 1 comment
Open
Labels
AST Concerning the generated abstract syntax enhancement

Comments

@andreasabel
Copy link
Member

andreasabel commented Nov 13, 2019

This is an issue to contemplate a more flexible handling of LBNF rules in the backend. Currently there is a define pragma that allows rule names to be first-order functions over other rule names. However, this could be obsolete if we produced only an interface for abstract syntax (with a standard implementation) that could be implemented in different ways.
The interface would defined methods to construct abstract syntax and a view on abstract syntax for the pretty printer.

For example, consider the (ambiguous) LBNF grammar:

EInt.   Exp ::= Integer ;
EMinus. Exp ::= Exp "-" Exp ;
EStm.   Exp ::= "{" Stm ";" Exp "}" ;

SExp.   Stm ::= Exp ";" ;
SWhile. Stm ::= "while" "(" Exp ")" Stm ;

In Haskell, this could be represented by the following abstract syntax:

{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FunctionalDependencies #-}

-- * Model class

class Exp s e | e -> s where  -- FunctionalDependency needed
  eInt   :: Integer -> e
  eMinus :: e -> e -> e
  eStm   :: s -> e -> e

class Stm s e where -- | s -> e where  -- FunctionalDependency not needed
  sExp   :: e -> s
  sWhile :: e -> s -> s

-- ** View

class ViewExp s e where
  viewE  :: e -> ExpView s e

class ViewStm s e where
  viewS  :: s -> StmView s e

data ExpView s e
  = EInt Integer
  | EMinus e e
  | EStm s e

data StmView s e
  = SExp e
  | SWhile e s

-- * Initial syntax

data IExp
  = IEInt Integer
  | IEMinus IExp IExp
  | IEStm IStm IExp

data IStm
  = ISExp IExp
  | ISWhile IExp IStm

-- ** Constructors

instance Exp IStm IExp where
  eInt   = IEInt
  eMinus = IEMinus
  eStm   = IEStm

instance Stm IStm IExp where
  sExp   = ISExp
  sWhile = ISWhile

-- ** Views

instance ViewExp IStm IExp where
  viewE = \case
    IEInt i      -> EInt i
    IEMinus e e' -> EMinus e e'
    IEStm s e    -> EStm s e

instance ViewStm IStm IExp where
  viewS = \case
    ISExp e     -> SExp e
    ISWhile e s -> SWhile e s

One could also make an instance that directly evaluates programs:

instance Exp () Integer where
  eInt   = id
  eMinus = (-)
  eStm _ = id

instance Stm () Integer where
  sExp _ = ()
  sWhile _ _ = ()

Basically, this is generating a standard interface to what are semantic actions of conventional parser generators.

Advantages:

  • subsumes define (see also Some backends do not implement the define pragma #266)
  • subsumes the undocumented views feature
  • allows to plug a custom implementation of parts of the abstract syntax, like the booleans of the host language for a type of booleans (see Bool production #231), or a different implementation of lists, etc.
  • allows any semantic actions in the parser (would need a monadic interface for effectful actions in Haskell)

Issue:

  • semantic actions are physically separated from grammar (in other file)

This can be an issue, but also an advantage, since the grammar can be studied by itself without being interleaved with the semantic actions. This is in the spirit of BNFC.

Disadvantage:

  • If one uses a custom implementation of AST and changes the grammar, one has to replay the changes on ones custom AST implementation. This is not an issue with define.
@ScottFreeCode
Copy link

ScottFreeCode commented Oct 19, 2021

I love when one solution addresses many needs!

Would this be an alternative to the GADT and functor implementations? Or would those become alternative default implementations under this design?

I presume this design will allow for position information, is that correct?

In terms of ease of use (edit: and/or error handling, printing, etc.) how do you think it compares with, say, using the current design and splitting the interpreter/compiler up into two parts:

  1. A function that converts from AST to desired data structure/types.
  2. Another that converts from that structure to…
    • (compiler) generated code, or
    • (interpreter) function taking input and producing output or actions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
AST Concerning the generated abstract syntax enhancement
Projects
None yet
Development

No branches or pull requests

2 participants