Additional Language Features of SML/NJ

SML/NJ provides some additional language features beyond those specified in the SML '97 Definition.

Vector expressions and patterns

Vectors are homogeneous, immutable arrays (see the Vector structure). They are a standard feature of SML '97, but in SML '97 a vector can be created only by calling functions from the Vector structure, and vectors cannot be pattern-matched. SML/NJ adds special syntax for vector expressions and vector patterns.

The vector expression

#[exp₀, ..., exp_n]

(where n >= 0) creates a vector of length n+1 whose elements are the values of the corresponding subexpressions. As with other aggregate expressions, the element expressions are evaluated from left to right. Vectors may be pattern-matched by vector patterns of the form

#[pat₀, ..., pat_n]

Such a pattern will only match a vector value of the same length.

Vectors have a more compact and efficient runtime representation than lists, and vector expressions and vector patterns are comparable in cost to records.
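As a minimal sketch of the syntax described above, the following builds a vector with a vector expression and dispatches on its length with vector patterns. The names v, describe, and s are illustrative and do not come from the SML/NJ documentation. The code relies on the SML/NJ extension, so a strict SML '97 implementation is expected to reject it.

```sml
(* A three-element vector built with the SML/NJ vector-expression syntax. *)
val v : int vector = #[1, 2, 3]

(* describe is a hypothetical helper. Each vector pattern matches only
   vectors of exactly that length; the last clause catches every other length. *)
fun describe #[x] = "one element: " ^ Int.toString x
  | describe #[x, y] = "two elements"
  | describe #[x, y, z] = "three elements, sum " ^ Int.toString (x + y + z)
  | describe w = Int.toString (Vector.length w) ^ " elements"

val s = describe v
```

The final clause uses an ordinary variable pattern, which is the usual way to handle lengths that no vector pattern covers.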

Or-patterns

SML/NJ has also extended the syntax of patterns to allow “or-patterns.” The basic syntax is:

(apat₁ | ... | apat_n)

where the apat_i are atomic patterns. The only other restriction is that every apat_i must bind the same set of variables, and each variable must have the same type in every alternative. A simple example is:

fun f ("y" | "yes") = true
  | f _ = false

which has the same meaning as:

fun f "y" = true
  | f "yes" = true
  | f _ = false
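The example above binds no variables. A small sketch of an or-pattern that does bind one, assuming the SML/NJ extension is available, is shown here. The function other is hypothetical. Both alternatives are tuple patterns, which are atomic, and both bind the same variable n at type int, as the restriction requires.

```sml
(* other is a hypothetical function: if either component of the pair is 0,
   it returns the remaining component. Both alternatives bind n : int. *)
fun other ((0, n) | (n, 0)) = SOME n
  | other _ = NONE

val a = other (0, 5)   (* matched by the first alternative *)
val b = other (7, 0)   (* matched by the second alternative *)
val c = other (1, 2)   (* matched by neither alternative *)
```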

Quote and Antiquote

The original use of ML was as a Meta Language for manipulating terms in an object language (typically a logic; originally LCF, Scott's “Logic for Computable Functions”). The original LCF/ML had features to parse one particular object language (called the OL). Standard ML of New Jersey supports arbitrary object languages, with user-supplied object-language parsing.

Higher-order Modules

The module system of Standard ML has always supported first-order parametric modules in the form of functors (also known as module functions). But there are occasions when one would like to parameterize over functors as well as structures, which requires a truly higher-order module system (see, for instance, this powerset functor example). Since Version 0.93 (Feb 1993), SML/NJ has provided a higher-order extension of the module system.

Parameterization over functors can be provided in a straightforward way by allowing functors to be components of structures. Syntactically this can be accomplished merely by allowing functor declarations in structure bodies, and by providing syntax for functor specifications in signatures. Functor specifications were already part of the module syntax of the 1990 Definition of Standard ML (Figure 8, p. 14), so we have implemented that syntax and added it to the spec syntax class (Figure 7, p. 13). In addition, it is convenient to have a way of declaring functor signatures, along with some syntactic sugar for curried functor definitions and partial application of curried functors, and so these are provided. This extension is an “upward-compatible” enrichment of the language that breaks no existing programs.

Functors as structure components.

In the extended language, a signature can contain a functor specification:

signature SIG =
sig
  type t
  val a : t
  functor F(X: sig type s
                   val b: s
               end) : sig val x : t * X.s end
end

To match such a signature, a structure is allowed to contain a functor declaration:

structure S : SIG =
struct
  type t = int
  val a = 3
  functor F(X: sig type s val b: s end) = struct val x = (a,X.b) end
end
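A functor component is applied through a qualified name, like any other structure component. The sketch below repeats the two declarations above so that it loads on its own, then applies S.F. The names R, n, and str are illustrative only.

```sml
signature SIG =
sig
  type t
  val a : t
  functor F (X : sig type s val b : s end) : sig val x : t * X.s end
end

structure S : SIG =
struct
  type t = int
  val a = 3
  functor F (X : sig type s val b : s end) = struct val x = (a, X.b) end
end

(* Apply the functor component via its qualified name.
   R is a hypothetical structure name. *)
structure R = S.F (struct type s = string val b = "hi" end)

(* R.x pairs S.a with the b supplied in the argument structure. *)
val (n, str) = R.x
```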

This makes it possible to define higher-order functors by including a functor as a component of a parameter structure or of a result structure. The case of a functor parameter is illustrated by the following example.

signature MONOID =
sig
  type t
  val plus: t*t -> t
  val e: t
end;

(* functor signature declaration *)
funsig PROD (structure M: MONOID
             structure N: MONOID) = MONOID

functor Square(structure X: MONOID
                functor Prod: PROD): MONOID =
  Prod(structure M = X
       structure N = X);

Note that this example involves the definition of a functor signature PROD.
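To show how such a functor might be applied, here is a hedged sketch in which the argument supplies both a structure and a functor. MONOID, PROD, and Square are restated from the example above so the snippet is self-contained. PairProd, IntMonoid, IntSquare, and twoUnits are hypothetical names introduced only for illustration.

```sml
signature MONOID =
sig
  type t
  val plus : t * t -> t
  val e : t
end

funsig PROD (structure M : MONOID
             structure N : MONOID) = MONOID

functor Square (structure X : MONOID
                functor Prod : PROD) : MONOID =
  Prod (structure M = X
        structure N = X)

(* PairProd is a hypothetical functor that matches the functor signature PROD. *)
functor PairProd (structure M : MONOID
                  structure N : MONOID) : MONOID =
struct
  type t = M.t * N.t
  val e = (M.e, N.e)
  fun plus ((m1, n1), (m2, n2)) = (M.plus (m1, m2), N.plus (n1, n2))
end

structure IntMonoid =
struct
  type t = int
  val e = 0
  val plus = (op +) : int * int -> int
end

(* The argument is a declaration list containing a structure and a functor. *)
structure IntSquare = Square (structure X = IntMonoid
                              functor Prod = PairProd)

(* Only operations from MONOID are used, so this does not depend on how
   much of IntSquare.t is visible to the type checker. *)
val twoUnits = IntSquare.plus (IntSquare.e, IntSquare.e)
```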

Currently functor signature declarations take one of the following forms:

funsig funid (strid_i: sigexp_i) = sigexp
funsig funid (specs) = sigexp
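The two forms can be sketched side by side as follows. ENDO, Twice, Id, IntMonoid, and T are hypothetical names, and the snippet assumes the provisional funsig syntax described here is accepted by the SML/NJ version in use.

```sml
signature MONOID =
sig
  type t
  val plus : t * t -> t
  val e : t
end

(* First form: a single named structure parameter. ENDO is hypothetical. *)
funsig ENDO (X : MONOID) = MONOID

(* Second form: the parameter is a list of specifications. *)
funsig PROD (structure M : MONOID
             structure N : MONOID) = MONOID

(* Twice is a hypothetical functor whose parameter F has functor signature ENDO. *)
functor Twice (structure X : MONOID
               functor F : ENDO) : MONOID =
  F (F (X))

(* Id is a hypothetical functor that matches ENDO. *)
functor Id (X : MONOID) : MONOID = X

structure IntMonoid =
struct
  type t = int
  val e = 0
  val plus = (op +) : int * int -> int
end

structure T = Twice (structure X = IntMonoid
                     functor F = Id)
```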

Warning

This syntax is viewed as provisional and subject to change (but it hasn't changed since 0.93).

A common use of functors whose result contains a functor is to approximate a curried functor with multiple parameters. Here is how one might define a curried monoid product functor:

functor CurriedProd (M: MONOID) =
struct
  functor Prod1 (N: MONOID) : MONOID =
    struct
      type t = M.t * N.t
      val e = (M.e, N.e)
      fun plus((m1,n1),(m2,n2))=(M.plus(m1,m2),N.plus(n1,n2))
    end;
end

This works, but the partial application of this functor is rather awkward because it requires the explicit creation of an intermediate structure:

structure IntMonoid =
struct
  type t = int
  val e = 0
  val plus = (op +): int*int -> int
end;

structure Temp = CurriedProd(IntMonoid);

functor ProdInt = Temp.Prod1;

To simplify the use of this sort of functor, some derived forms provide syntactic sugar for curried functor definition and partial application. Thus the above example can be written:

functor CurriedProd (M: MONOID) (N: MONOID) : MONOID =
struct
  type t = M.t * N.t
  val e = (M.e, N.e)
  fun plus((m1,n1),(m2,n2))=(M.plus(m1,m2),N.plus(n1,n2))
end;

functor ProdInt = CurriedProd(IntMonoid);
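A short sketch of both full and partial application of the curried form follows. MONOID, CurriedProd, and IntMonoid are restated from the examples above so the snippet loads on its own. StringMonoid, IntString, IntInt, p, and q are hypothetical names.

```sml
signature MONOID =
sig
  type t
  val plus : t * t -> t
  val e : t
end

functor CurriedProd (M : MONOID) (N : MONOID) : MONOID =
struct
  type t = M.t * N.t
  val e = (M.e, N.e)
  fun plus ((m1, n1), (m2, n2)) = (M.plus (m1, m2), N.plus (n1, n2))
end

structure IntMonoid =
struct
  type t = int
  val e = 0
  val plus = (op +) : int * int -> int
end

(* StringMonoid is a hypothetical second monoid: strings under concatenation. *)
structure StringMonoid =
struct
  type t = string
  val e = ""
  val plus = (op ^) : string * string -> string
end

(* Full application: both arguments are supplied, yielding a structure. *)
structure IntString = CurriedProd (IntMonoid) (StringMonoid)

(* Partial application: one argument is supplied, yielding a functor. *)
functor ProdInt = CurriedProd (IntMonoid)
structure IntInt = ProdInt (IntMonoid)

val p = IntString.plus ((1, "a"), (2, "b"))
val q = IntInt.plus ((1, 2), (3, 4))
```

The partial application needs no intermediate structure such as Temp, which is the point of the derived forms.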

The syntax for curried forms of functor signature and functor declarations and for the corresponding partial applications can be summarized as follows:

funsig funsigid (par₁) ... (par_n) = sigexp

functor funid (par₁) ... (par_n) = strexp

functor funid1 = funid2 (arg₁) ... (arg_n)

structure strid = funid (arg₁) ... (arg_n)

where

par ::= id : sigexp | specs
arg ::= strexp | dec

In the case of a partial application defining a functor, it is assumed that the funid2 on the right-hand side takes more than n arguments, while in the case of the structure declaration funid should take exactly n arguments. As a degenerate case, where n=0, we have identity functor declarations:

functor funid1 = funid2

There is also a "let" form of functor expression:

fctexp ::= let dec in fctexp end

which can only be used in functor definitions of the form:

functor funid = let dec in fctexp end

The curried functor declaration

functor F (par₁) ... (par_n) = strexp

is a derived form that is translated into the following declaration

functor F (par₁) =
struct
  functor %fct% (par₂) ... (par_n) = strexp
end

and the declarations

structure S = F (arg₁) ... (arg_n)
functor G = F (arg₁) ... (arg_n)

are derived forms expanding into (respectively):

local
  structure %hidden% = F (arg₁)
in
  structure S = %hidden%.%fct% (arg₂) ... (arg_n)
end

and

local
  structure %hidden% = F (arg₁) ... (arg_n)
in
  functor G = %hidden%.%fct%
end

Currently there is no checking that a complete set of arguments is supplied when a curried functor is applied to define a structure, as illustrated by the following example:

functor Foo (X: sig type s end) (Y: sig type t end) =
struct
  type u = X.s * Y.t
end

structure A = struct type s = int end

structure S = Foo (A)  (* Foo A yields a (useless) structure *)

functor G = Foo (A)    (* Foo A yields a functor *)
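For contrast, the following sketch shows the intended uses of the same functor: completing the partial application G with a second argument, and applying Foo to both arguments at once. Foo, A, and G are restated from the example above. B, T1, T2, p, and q are hypothetical names.

```sml
functor Foo (X : sig type s end) (Y : sig type t end) =
struct
  type u = X.s * Y.t
end

structure A = struct type s = int end

(* B is a hypothetical structure serving as the second argument. *)
structure B = struct type t = bool end

functor G = Foo (A)          (* partial application yields a functor *)
structure T1 = G (B)         (* supplying the remaining argument *)
structure T2 = Foo (A) (B)   (* full application in one step *)

(* In both structures u abbreviates A.s * B.t, that is, int * bool. *)
val p : T1.u = (1, true)
val q : T2.u = p
```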