Heron Language Specification

Draft version June 7, 2008

Copyright, all rights reserved, Christopher Diggins, http://www.cdiggins.com

Heron Language Home Page, http://www.heron-language.com

About this Document

This is the working draft describing the Heron language specification. This document assumes basic familiarity with the Java language and UML.

About Heron

Heron is effectively a combination of a programming language and a modeling language. It is designed to allow seamless transitioning between textual and diagrammatic representations of executable UML (xUML) models. Heron is the first programming language designed specifically to support the model-driven architecture (MDA) paradigm.

 

Executable UML is a subset of UML 2.1 [3] designed to support translational approaches to model-driven architecture [4].

 

Heron is designed to be easily understood by programmers who have a basic level of familiarity with Java and UML. Heron also strives to minimize language complexity by using higher-order functions and a more flexible syntax to reduce the number of concepts. The syntax for higher-order language features is modeled on the Scala programming language [5].

Acknowledgements

Heron is being developed in collaboration with Abdelwahab Hamou-Lhadj, PhD (http://users.encs.concordia.ca/~abdelw/), assistant professor in the Department of Electrical and Computer Engineering (ECE) at Concordia University in Montreal, Canada.

Programming Elements

Notes Regarding Syntax

The syntax is presented using a parsing expression grammar (PEG) [10]. A PEG is similar to an extended BNF format except that it is unambiguous. This leads to a more direct implementation of recursive-descent or linear time complexity packrat parsers.
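
As an illustration of why ordered choice makes a PEG unambiguous, and of the memoization behind packrat parsing, here is a minimal sketch in Python (not Heron). The helper names literal, choice, memoize, and prefix_first are hypothetical and exist only for this example; the multiplicity rule is taken from the Associations section of this document.

```python
from functools import lru_cache


def literal(s):
    """Match the exact string s; return the new position or None."""
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse


def choice(*alternatives):
    """PEG ordered choice: commit to the first alternative that matches."""
    def parse(text, pos):
        for alternative in alternatives:
            end = alternative(text, pos)
            if end is not None:
                return end
        return None
    return parse


def memoize(rule):
    """Packrat step: cache each (text, position) result so it is computed once."""
    return lru_cache(maxsize=None)(rule)


# The <multiplicity> rule, with alternatives in the order the grammar lists them.
multiplicity = memoize(choice(*(literal(m) for m in
                                ("[0..1]", "[0..*]", "[1..1]", "[1..*]", "[0..0]"))))

# Ordered choice removes ambiguity: "a" wins even though "ab" would also match.
prefix_first = choice(literal("a"), literal("ab"))
```

With these definitions, multiplicity("[1..*];", 0) returns 6 (the position just past the match) and prefix_first("ab", 0) returns 1, because the first alternative is always preferred.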

 

Note that unlike many languages, Heron is strict about the ordering of sections within many elements (e.g. domains, classes).

Source File

<sourcefile> ::== <domain>+

        

Heron source files contain one or more domains.

Domains 

<domain> ::== domain <identifier> {
  <attributes>?

  <operations>?

  <datatypes>?

  <classes>?

  <associations>?

}

A domain is a representation of a domain model, and corresponds to a single class diagram. Domains may contain attributes and operations that are not associated with any particular class. Domains are represented in a UML diagram as singleton classifiers labeled with the <<domain>> stereotype.

Datatypes

<datatypes> ::== datatypes { <datatype>* }

 

<datatype> ::== <tags>? datatype <identifier> <templateparameters>? {

  <attributes>?

  <operations>? 

}

A datatype is similar to a class but it does not have state or invariants, and does not participate in any associations. A datatype is represented on the class diagram as a classifier with the "<<datatype>>" stereotype.

Classes

<classes> ::== classes { <class>* }

 

<class> ::== <tags>? class <identifier> <templateparameters>? {

  <attributes>?

  <operations>? 

  <states>?

  <invariants>?

}

Classes correspond to classifiers in UML class diagrams. 

 

Some classes are not implemented in the domain model. Such a class is called a realized class and is identified in a class diagram with the "<<realized>>" stereotype. This means that none of its operations has a body. An example of usage would be to represent a CORBA or COM object in a domain model.

Attributes

<attributes> ::== attributes { <attribute>* }

<attribute> ::== <identifier> <typedecl>? ;

<typedecl> ::== : <type>

An attribute corresponds to a class field. The type of an attribute should not be a modeled classifier in the domain. Doing otherwise is not a syntactic or semantic error; this is simply a rule of thumb. If an attribute has the type of a modeled classifier, then it indicates an implicit association.

Operations

<operations> ::== operations { <operation>* }

<operation> ::== <identifier> <arglist> <typedecl>? <codeblock>?

<onearg> ::== <identifier> <typedecl>?

<manyargs> ::== <onearg> <nextarg>?

<nextarg> ::== , <manyargs>

<arglist> ::== ( <manyargs>? )

An operation is a synchronous member function associated with a class or domain.

States

A state represents a stage in the lifecycle of an object. Object lifecycles are modeled in xUML using Moore state machines. An object is always in exactly one state. As soon as the object is created it is in the "initial" state. Objects change state when they receive an event signal. An event signal is a value of any type that is associated with a transition from the current state to another state.

 

The timing of event signal processing is implementation dependent; however, event signals are guaranteed to be processed in the order that they are received.

 

Each state has two sub-sections: an entry procedure and a transition table. The entry procedure is also called an event handler. The entry procedure is executed immediately upon receipt of the event. The implementation does not make any promises about the length of time between when an event is triggered and when it will be received. When multiple events are triggered from a single signal source, they are guaranteed to arrive at an object in the order in which they were triggered, but this is not true when they arrive from multiple sources.

 

Commentary: The precise definition of source still has to be decided.

 

An object with states is called an active object. An active object must have an initial state. The initial state has no entry procedure. State entry procedures will never be executed in parallel with each other; however, they may be executed in parallel with operations in the class.

 

All entry procedures must have precisely one argument of any type (called the signal type).

<states> ::== states { <state>* }

<state> ::== state <identifier> ( <identifier> <typedecl> ) { <entryproc>? <transitiontable>? }

<entryproc> ::== entry { <statement>* }

<transitiontable> ::== transitions { <transition>* }

<transition> ::== <typedecl> -> <identifier> ;

Transition tables specify which states can be reached from a particular state, and what type of signal will trigger the transition. The same type cannot be used to identify different transitions. An object is always in exactly one state.
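
The lifecycle rules above can be sketched in Python (not Heron) as a transition table keyed by signal type, with signals processed in the order received. The Door class, the Open and Close signal types, and the state names are hypothetical. Ignoring a signal that has no matching transition is an assumption of this sketch, since this draft does not say what happens in that case.

```python
from collections import deque


class Open:
    """Hypothetical signal type."""


class Close:
    """Hypothetical signal type."""


class Door:
    # Transition table: state -> {signal type -> next state}.
    # Each signal type appears at most once per state.
    TRANSITIONS = {
        "initial": {Open: "opened"},
        "opened": {Close: "closed"},
        "closed": {Open: "opened"},
    }

    def __init__(self):
        self.state = "initial"  # an active object starts in its initial state
        self.queue = deque()    # pending signals, first in, first out
        self.log = []

    def send(self, signal):
        self.queue.append(signal)

    def process(self):
        # Signals are handled one at a time, so entry procedures never overlap.
        while self.queue:
            signal = self.queue.popleft()
            target = self.TRANSITIONS[self.state].get(type(signal))
            if target is None:
                continue  # assumption: a signal with no transition is ignored
            self.state = target
            self.entry(target, signal)

    def entry(self, state, signal):
        # Stand-in for the entry procedure, which receives exactly one signal.
        self.log.append((state, type(signal).__name__))
```

Sending Open, Close, and then a second Close leaves the object in the "closed" state. The second Close has no transition from "closed" and is dropped under the assumption above.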

Associations

<associations> ::== associations { <association>* }

<association> ::== association <identifier> {

  ends {

    <identifier> <typedecl> <multiplicity>;

    <identifier> <typedecl> <multiplicity>;

  }

  <attributes>?

  <operations>?

}

<multiplicity> ::== [0..1] | [0..*] | [1..1] | [1..*] | [0..0]

Associations represent relationships between classes. The identifiers of an association's ends are called rolenames. A rolename is accessible as if it were a collection-valued field of the classifier at the opposite end of the association.

 

class Author {

  attributes {

    name : String;

  } 

}

 

class Book {

  attributes {

    name : String;

  }

}

 

association AuthorToBookRelation {

  ends {

    authors : Author[1..*];

    books : Book[1..*];

  }

}

 

var me : Author = new Author();

var mybook : Book = new Book();

me.name = "Christopher Diggins";

mybook.authors.Add(me);

 

// Notice: "me.books" was updated automatically.

print(me.books[0]);
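
The automatic update of the opposite end in the example above can be sketched in Python (not Heron) by storing each link once and deriving both ends from it. The Association helper and the author_to_book instance are hypothetical names; this is one possible implementation strategy, not one mandated by this specification.

```python
class Association:
    """Stores links once so that both ends always see the same relationships."""

    def __init__(self):
        self.links = []  # (author, book) pairs in insertion order

    def link(self, author, book):
        if (author, book) not in self.links:
            self.links.append((author, book))

    def books_of(self, author):
        return [b for a, b in self.links if a is author]

    def authors_of(self, book):
        return [a for a, b in self.links if b is book]


author_to_book = Association()


class Author:
    def __init__(self, name):
        self.name = name

    @property
    def books(self):
        # The rolename reads like a collection-valued field.
        return author_to_book.books_of(self)


class Book:
    def __init__(self, name):
        self.name = name

    @property
    def authors(self):
        return author_to_book.authors_of(self)
```

After author_to_book.link(me, mybook), both me.books and mybook.authors reflect the new link, because neither end keeps its own copy of the relationship.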

 

There are four kinds of legal multiplicities of associations in xUML: [1..1], [1..*], [0..1], and [0..*]. The first two multiplicities are "unconditional", or mandatory, which means that an object instance always participates. The other two multiplicities are conditional, meaning that the instance might not participate.

 

In Heron a fifth kind of multiplicity is introduced: [0..0] which is used to model unidirectional associations.  
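
One way to read a multiplicity is as bounds on the number of instances linked at an end. The Python sketch below (not Heron) checks a count against a multiplicity written in the notation of the grammar; the function name allows is hypothetical. Treating [0..0] as "no instances reachable from this end" is an interpretation made for this sketch.

```python
def allows(multiplicity, count):
    """Return True if `count` linked instances satisfy a "[lo..hi]" multiplicity."""
    lower, upper = multiplicity.strip("[]").split("..")
    upper_ok = upper == "*" or count <= int(upper)  # "*" means no upper bound
    return int(lower) <= count and upper_ok
```

For example, allows("[1..*]", 0) is False because the end is unconditional, while allows("[0..1]", 0) is True because the end is conditional.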

Unidirectional Associations (xUML Extension)

In programming it is common to specify only unidirectional associations; however, this is not part of executable UML. In Heron we can express this notion by identifying one end of an association as having multiplicity [0..0].

 

Rationale: In xUML it is currently required that all associations can be queried from both directions. This forces an implementation to become very inefficient.  

Implicit Associations (xUML Extension)

Implicit associations are associations declared as a class attribute. The end representing the class that contains the attribute has multiplicity [0..0], and the end representing the referred class has multiplicity [0..1]. A multiplicity of [0..0] is not part of the xUML standard.

 

 

Given two classes MyClass1 and MyClass2, if MyClass1 has an attribute named "mine" of type MyClass2:

class MyClass1 {

  attributes {

    mine : MyClass2;

  }

}

this is precisely the same thing as declaring the following explicit association:

association unnamed {

  ends {

    owner : MyClass1[0..0];

    mine : MyClass2[0..1];

  }

}

 

Tools are expected to convert implicit associations to explicit ones when switching between diagrammatic and textual views of the source code.

Types and Values

Functions

Class operations, domain operations, anonymous functions, and state machine entry procedures are all considered functions. Functions in Heron can have multiple results.

Primitives

Heron primitives are: Int, Real, Bool, Char, String, DateTime. All Heron primitives are datatypes.

Collections

Heron collections are instances of the abstract data type: Collection. There are three concrete types of collections predefined in Heron: Set, Bag, and List.
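
This draft does not define the three collection types further. Assuming they carry their usual UML meanings (a Set has no duplicates and no order, a Bag allows duplicates without order, a List is ordered and allows duplicates), the difference can be sketched in Python (not Heron) with rough stand-ins from the standard library:

```python
from collections import Counter

items = ["a", "b", "a"]

as_set = set(items)      # Set: duplicates collapse, no order
as_bag = Counter(items)  # Bag: duplicates counted, no order
as_list = list(items)    # List: order and duplicates both preserved
```

Here as_set holds two elements, as_bag records that "a" occurs twice, and as_list keeps all three items in their original order.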

Exceptions

As in C++, any object can be thrown as an exception. A thrown exception causes execution to unwind the call stack until it reaches a try/catch statement that can handle the thrown object; execution then jumps to the catch statement.

 

Commentary: Maybe it should trigger a state transition. This would however require state machines for exception handling, and it is not clear what the impact of this would be. Such a feature is under investigation.

Statements

<statement> ::==

  <ifstatement>

  | <whilestatement>

  | <breakstatement>

  | <returnstatement>

  | <deletestatement>

  | <foreachstatement>

  | <switchstatement>

  | <throwstatement>

  | <codeblock>

 

If Statement

<ifstatement> ::== if ( <expr> ) <codeblock> <elsestatement>?

<elsestatement> ::== else <codeblock> <elsestatement>?

While Statement

<whilestatement> ::== while ( <expr> ) <statement> 

Break Statement

<breakstatement> ::== break ;

The break statement can be used only from within a while statement or foreach statement. A break statement causes a loop to halt immediately.

Return Statement

<returnstatement> ::== return <expr> ;

Delete Statement

<deletestatement> ::== delete <expr> ;

The delete statement requests that the system destroy an object instance. The resources associated with an object instance are recovered when the implementation chooses.

 

Commentary: It would be desirable that if an object instance is still in use by other parts of the system, then a soft runtime error occurs immediately, but it is not clear that is always enforceable.

Foreach Statement

<foreachstatement> ::== foreach ( <identifier> <typedecl>? in <expr> ) <statement>

Iterates over each item in a collection in indexing order if the collection is ordered, or in an unspecified order if the collection is unordered. On each iteration the identifier is assigned the next value in the collection and the loop body statement is executed.

Code Block

<codeblock> ::== { <statement>* }

A code block is a series of statements delimited by "{" and "}".

Switch, Case, and Default Statements

<switchstatement> ::== switch ( <expr> ) { <casestatement>* <defaultstatement>? }

<casestatement> ::== case (<expr>) <codeblock>

<defaultstatement> ::== default <codeblock>

The type of <expr> may be any type as long as it supports comparison. The semantics of a switch statement are similar to a chain of "if else" statements where each condition is a comparison with the initial value. However, the comparison is not guaranteed to be executed for every case.
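
The "if else" reading of switch can be sketched in Python (not Heron) as follows. The function name switch and its parameters are hypothetical. The sketch assumes that the first matching case wins, which is what a chain of "if else" statements would do.

```python
def switch(value, cases, default=None):
    """Run the body of the first case equal to `value`, else the default body."""
    for candidate, body in cases:
        if value == candidate:
            # Later cases are never compared once a match is found.
            return body()
    return default() if default is not None else None
```

Because comparisons after the first match are skipped, case expressions should not rely on side effects. This is consistent with the statement that a comparison is not guaranteed to run for every case.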

Throw Statement

<throwstatement> ::== throw <expr> ;

Expressions

Literals

<literal> ::==

  true

  | false

  | <decimalintliteral>

  | <hexintliteral>

  | <octintliteral>

  | <binintliteral>

  | <floatliteral>

  | <charliteral>

  | <stringliteral>

  | <tuple>

Operators

Heron operators (e.g. "+", "!", etc.) are legal identifiers. The interpretation of an identifier as an infix, prefix, or postfix operator is type dependent.

Tuples

A tuple is expressed as "(x0, x1, ..., xN)" where x0 through xN are expressions of arbitrary type. The tuple constructor is the comma operator.

Function Application

The most common way to express function application is Java-style notation: f(a0, a1, ...). This is parsed as two separate expressions: "f" and the tuple "(a0, a1, ...)".

 

Function application can also be expressed as two expressions side by side: "f x". This applies f, which must be a function, to the expression x (i.e. f(x)). In the case of "f x y", the interpretation is "(f(x))(y)".
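
Both forms of application can be sketched in Python (not Heron) with a single helper that applies a function to one argument, where that argument may be a tuple. The names apply, add, and adder are hypothetical and serve only as illustration.

```python
def apply(f, arg):
    """Apply f to one argument; a tuple argument is spread over f's parameters."""
    return f(*arg) if isinstance(arg, tuple) else f(arg)


def add(a, b):
    return a + b


def adder(x):
    # Returns a function, so that "adder x y" can be read as (adder(x))(y).
    return lambda y: x + y


tuple_style = apply(add, (1, 2))              # f(a0, a1): f applied to a tuple
juxtaposed = apply(apply(adder, 1), 2)        # f x y: (f(x))(y)
```

Both tuple_style and juxtaposed evaluate to 3. They differ only in whether the arguments arrive together as a tuple or one at a time.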

"is" Expression

<objectexpr> is <typeexpr> =>

  typeof(<objectexpr>).equals(<typeexpr>)

The "is" function is associated with clause 11.3.32 of the UML specification, ReadIsClassifiedAction.
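
Read literally, the rewrite rule compares types for equality. The Python sketch below (not Heron) shows the consequence of that literal reading; the names is_, Base, and Derived are hypothetical. Whether "is" should also accept instances of subtypes is not stated in this draft.

```python
class Base:
    pass


class Derived(Base):
    pass


def is_(obj, cls):
    """Literal reading of the rewrite rule: exact type equality."""
    return type(obj) == cls
```

Under this reading, is_(Derived(), Derived) is True but is_(Derived(), Base) is False, whereas Python's built-in isinstance(Derived(), Base) is True.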

Anonymous Functions

An anonymous function is a function defined within the scope of a code block, and is constructed dynamically. An anonymous function can refer to any variable that can be reached from the scope of its construction. However, the variable itself is not bound; a copy of the variable's value is made instead. Because of this, anonymous functions in Heron are not true closures.
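
The difference between copying a value and binding a variable can be sketched in Python (not Heron). Python lambdas are true closures, so the copy is simulated here with a default argument, which is evaluated once when the function is constructed. The sketch assumes that Heron makes its copy at construction time; the names demo, by_copy, and by_reference are hypothetical.

```python
def demo():
    counter = 1
    by_copy = lambda captured=counter: captured  # value copied at construction
    by_reference = lambda: counter               # true closure: reads the variable
    counter = 2                                  # later assignment
    return by_copy(), by_reference()
```

Calling demo() returns (1, 2): the copying function still sees the old value, while the true closure sees the update. The first behavior corresponds to what this section describes for Heron.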

Error Handling

There are two classes of errors: soft errors and hard errors. A soft error is an error that may or may not be catastrophic, depending on how the model compiler is directed to handle it. The software is not considered to be in an unsafe state after a soft error.

 

A hard error is a catastrophic error. This indicates the system is in an unsafe mode and must be shut down. The system is given a chance to react during the "hard fail phase" before the software system is shut down to protect other resources.
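
The two error classes can be sketched in Python (not Heron) as follows. All names (SoftError, HardError, run, soft_errors_fatal, hard_fail_handler) are hypothetical. The sketch assumes that a soft error configured as catastrophic follows the same path as a hard error, which this draft does not state.

```python
class SoftError(Exception):
    """Recoverable unless the model compiler is told otherwise."""


class HardError(Exception):
    """Always catastrophic: the system must shut down."""


def run(action, soft_errors_fatal, hard_fail_handler):
    """Run `action` and report "ok", "continued", or "shutdown"."""
    try:
        action()
        return "ok"
    except SoftError:
        if not soft_errors_fatal:
            return "continued"  # the system is still considered safe
        hard_fail_handler()     # assumption: escalated soft errors take the hard path
        return "shutdown"
    except HardError:
        hard_fail_handler()     # the "hard fail phase": last chance to react
        return "shutdown"
```

The soft_errors_fatal flag stands in for the model compiler setting mentioned above, and hard_fail_handler stands in for whatever the system does during the hard fail phase.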

Appendices

Phase

A Heron program has several phases (stages) that are executed in sequence:

 

 

Runtime Phase Entry Point

The entry point of a Heron application is a compiler-specified domain operation (often called "main" by convention) that is executed at run-time.

Verification Phase Entry Point

The verifier is a compiler-specified domain operation (often called "verify" by convention) that is executed immediately after compilation, as opposed to at run-time. The verifier is expected to use the reflection API to traverse the abstract syntax tree, examine tags, and verify that the static semantics associated with the tags are respected.

Template / Generics (xUML Extension)

Executable UML does not currently support templates, though UML does. Heron supports templates that have similar semantics to C++ templates. Unlike C++ templates, a Heron template is itself a valid type.

 

Commentary: We strongly advocate the inclusion of templates in executable UML, as a way to increase type safety and reduce unnecessary boilerplate code. A template can take any type as a parameter.

Related Work

The OMG publishes a specification for the Human-Usable Textual Notation (HUTN) [7]. This syntax supports the full Meta-Object Facility (MOF). This is much more general than the scope of Heron, which is limited to xUML. The drawback of HUTN for representing xUML is that it does not have a syntax for actions.

References

[1] Executable UML: A Foundation for Model-Driven Architecture by Stephen J. Mellor and Marc J. Balcer.

[2] Model Driven Architecture with Executable UML, by Raistrick, Frances, Wright, Carter, Wilkie.

[3] UML 2.1 Specification by OMG

[4] MDA Guide version 1.0.1 http://www.omg.org/docs/omg/03-06-01.pdf

[5] Scala language specification, Martin Odersky et al.

[6] XMI specification http://www.omg.org/spec/XMI/2.1.1/

[7] Human-Usable Textual Notation (HUTN) http://www.omg.org/technology/documents/formal/hutn.htm

[8] Action Semantics for UML 2.0, Response to RFP http://www.omg.org/cgi-bin/apps/doc?ptc/02-01-09.pdf

[9] SDL Mapping for the UML Action Semantics http://www.omg.org/docs/ad/00-08-01.pdf

[10] Parsing Expression Grammars: A Recognition-Based Syntactic Foundation by Bryan Ford http://pdos.csail.mit.edu/~baford/packrat/popl04/peg-popl04.pdf