APartSML(2)——语法，类型与评估

Posted Nov 24, 2024 Updated Jan 5, 2025

By ZoeyWang

12 min read

1.Expressions

在 ML 语言中，程序执行的过程通常可以分为三步：语法解析、类型检查和评估。在语法解析阶段，ML 编译器会检查代码是否符合语言的语法规则，ML 是一种静态类型语言，这意味着所有变量和表达式的类型在编译时就必须确定。在类型检查阶段，编译器会根据语法树推断出每个表达式的类型，并检查类型是否符合预期。通过语法解析和类型检查后，ML 开始执行代码。这一阶段会对表达式进行实际求值，生成结果。

每类表示式的运行有三步：

（1）syntax：语法是什么

（2）type-checking：检查变量类型是否合法

（3）evaluation：获取变量值，抛出异常或进入循环

举例说明addition表达式

syntax：e1+e2，e1和e2要求都是某个表达式
type-checking：如果e1和e2是int类型，那么e1+e2也是int类型
evaluation：如果e1评估值为v1，e2评估值为v2，那么e1+e2的值评估为v1和v2的和

2.Pairs and Tuples

ML中的pairs构建方式

syntax：(e1, e2)
type-checking：如果e1类型是ta，e2类型是tb，那么(e1, e2)类型是ta * tb
evaluation：如果e1评估值为v1，e2评估值为v2，那么(e1, e2)值评估为(v1, v2)

ML中的pairs读取方式

syntax：#1 e，#2 e
type-checking：如果类型是ta * tb，那么#1 e类型是ta，#2 e类型是tb
evaluation：e的值是一对，返回第一个值或第二个值

  
fun swap(pr : int * bool) =
    (#2 pr, #1 pr);

(* (int * int) * (int * int) -> int *)
fun sum_two_pairs(pr1 : int * int, pr2 : int * int)=
    (#1 pr1) + (#2 pr1) + (#1 pr2) + (#2 pr2);

(* int * int -> int * int *)
fun div_mod(x : int, y : int) =
    (x div y, x mod y);

fun sort_pair(pr : int * int) =
    if (#1 pr) < (#2 pr)
    then pr
    else (#2 pr, #1 pr);

  
- use "tuple.sml";
[opening tuple.sml]
val swap = fn : int * bool -> bool * int
val sum_two_pairs = fn : (int * int) * (int * int) -> int
val div_mod = fn : int * int -> int * int
val sort_pair = fn : int * int -> int * int
val it = () : unit
- swap(7, true);
val it = (true,7) : bool * int
- sort_pair(4,3);
val it = (3,4) : int * int

tuple可以有多个部分， tuple和pair可以嵌套，比如((7, true),9)

3.Lists

ML中的list可以有任何数量的元素，但是所有的元素类型都要相同

ML中的list构建方式

空列表是一个值：[]
列表形式：[v1, v2, …, vn]
如果e1值为v，e2值为[v1,…, vn]，那么e1::e2值为[v, v1,…, vn]

ML中的list读取方式

null e：如果e为[]，那么null e值为true
hd e：如果e值为[v1,…, vn]，那么hd e为v1
t1 e：如果e值为[v1,…, vn]，那么tl e为[v1,…, vn]

ML中的list type-checking

t list：表示这个list的所有元素类型都是t，比如int list，(int * int) list
[]：类型为’a list，表示可以为任何类型

  
fun sum_list(xs : int list) =
    if null xs
    then 0
    else hd xs + sum_list(tl xs);

fun countdown (x : int) =
    if x=0
    then []
    else x::countdown(x-1);

fun append(xs : int list, ys : int list) =
    if null xs
    then ys
    else (hd xs) :: append((tl xs), ys);

fun sum_pair_list(xs :(int * int) list) =
    if null xs
    then 0
    else #1 (hd xs) + #2 (hd xs) + sum_pair_list(tl xs);

fun firsts(xs : (int * int) list) =
    if null xs
    then []
    else (#1 (hd xs)) :: firsts(tl xs);

fun seconds(xs : (int * int) list) =
    if null xs
    then []
    else (#2 (hd xs)) :: seconds(tl xs);

fun sum_pair_lists(xs : (int * int) list) =
    sum_list (firsts xs) + sum_list (seconds xs);

用递归的方式实现list的常见方法

4.Let Expression

ML中Let构建方式

syntax：let b1 b2… bn in e and
type-checking：整个let-expression类型是e
evaluation：整个let-expression值是e

可以在let expression中使用nested function，即嵌套函数

  
fun silly1 (z : int)=
    let val x = if z > 0 then z else 34
    val y = x + z + 9
    in
    if x > y then x * 2 else y * y
    end;

fun silly2() =
    let
    val x = 1
    in
    (let val x = 2 in x+1 end) + (let val y=x+2 in y+1 end)
    end;

fun countup_from(x : int) =
    let
    fun count (from : int) =
        if from = x
        then x::[]
        else from::count(from+1);
    in
    count(1)
    end;

- use "let.sml";
[opening let.sml]
val silly1 = fn : int -> int
val silly2 = fn : unit -> int
val countup_from = fn : int -> int list
val it = () : unit
- silly2();
val it = 7 : int
- countup_from(7);
val it = [1,2,3,4,5,6,7] : int list

5.Let Efficiency

举一个求列表最大值的例子说明，用let expression来解决回溯重复操作问题

  
fun countup(from : int, to : int) =
    if from=to
    then to::[]
    else from::countup(from+1,to);


fun countdown(from : int, to : int) =
    if from=to
    then to::[]
    else from::countdown(from-1,to);

一个不好的实现代码：bad_max(tl xs)会重复执行，如果遇到的列表是正序，那么重复的次数会更多，因此在bad_max(countup(1,40))就会发现需要一点时间才能计算出来

  
fun bad_max(xs : int list) =
    if null xs
    then 0
    else if null (tl xs)
    then hd xs
    else if hd xs > bad_max(tl xs)
    then hd xs
    else bad_max(tl xs);

- bad_max(countup(1,20));
val it = 20 : int
- bad_max(countup(1,40));
  C-c C-c
Interrupt
- bad_max(countdown(30,1));
val it = 30 : int

为了避免重复操作，使用let expression提前存储bad_max(tl xs)的值，速度快很多

  
fun good_max(xs : int list) =
    if null xs
    then 0
    else if null (tl xs)
    then hd xs
    else
    let val tl_ans = good_max(tl xs)
    in
        if hd xs > tl_ans
        then hd xs
        else tl_ans
    end;

- bad_max(countdown(30,1));
val it = 30 : int
- good_max(countup(1, 400));
val it = 400 : int
- good_max(countdown(400,1));
val it = 400 : int

关键：不要重复某些操作，提前存储保留某些值进行局部绑定，能有效减少耗时

6.Option

在5.Let Efficiency中，对于空列表返回的是0，这可以运行但是显然不合理，我们希望函数可以返回多种类型，可以代表空，也可以是int，ML提供了Options

NONE：代表’a option，比如[]的类型是’a list
SOME e：如果e类型为t，那么SOME e类型为t option

SOME e不是一个可以直接读取的对象

isSome：判断是否有类型’a option，如果为NONE返回false
valOf：读取SOME e中e的值

对上面的max函数再进行优化，函数返回的是SOME e，类型为int list -> int option，对于SOME类型不能直接运算，需要通过valOf取值

  
fun max1(xs : int list) =
    if null xs
    then NONE
    else
    let
        val tl_ans = max1(tl xs)
    in
        if isSome tl_ans andalso valOf tl_ans > hd xs
        then tl_ans
        else SOME(hd xs)
    end;

- use "option.sml";
[opening option.sml]
val max1 = fn : int list -> int option
val it = () : unit
- max1([3,2,5]);
val it = SOME 5 : int option
- max1([3,2,5]) + 1;
stdIn:7.1-7.18 Error: operator and operand do not agree [overload - bad instantiation]
- valOf(max1([3,2,5])) + 1;
val it = 6 : int

上面代码还可以再优化： isSome tl_ans是为了检查tl_ans不是NONE类型，但是实际上只有在xs只有一个元素时才会出现这种情况，而上面代码每次都在isSome判断，因此可以再把tl xs为null的情况单独写出来

  
fun max2(xs : int list) =
    if null xs
    then NONE
    else
    let
        fun max_nonempty(xs : int list) =
        if null (tl xs)
        then hd xs
        else
            let
                val tl_ans = max_nonempty(tl xs)
            in
                if hd xs > tl_ans
                then hd xs
                else tl_ans
            end
    in
        SOME(max_nonempty xs)
    end;

- max2([3,2,5]);
val it = SOME 5 : int option

7.Bool and Comparsion

ML中bool操作

syntax：e1 andalso e2，e1 orelse e2，not e1
type-checking：e1和e2的类型都是bool
evaluation：对于e1 andalso e2，如果e1的结果是false，结果是false，否则返回e2的值

ML中比较操作

=，<>，>，<，>=，<=
<，>=，<=可以用于real，但是不能一个int，一个real

8.No mutation

（1）可变性：ML中的变量是不可变的，某些语言中变量可变，比如python list是可变的，对于可变变量需要小心处理是否前后一致。

（2）赋值语句有两种结果：引用reference(别名alisa)vs复制(copy)，alisa在内存层面引用的是原来的地址，copy是占用内存新建了一个变量，在ML中是alias

  
fun append(xs : int list, ys : int list) =
    if null xs
    then ys
    else hd(xs)::append(tl(xs), ys);

val x = [2, 4];
val y = [5, 3, 0];
val z = append(x, y);

# z的存储方式为第一种，返回的是ys的alias，而不是copy

9.Pieces

对于任何一种语言，需要关注一下5点

Syntax：语法
Semantics：语义，程序是什么意思（Evaluation rule）
Idioms：惯用语，有哪些典型的样式使用语言特征来表达计算
Libraries：有哪些可以帮助语言的使用更加标准，比如文件读取标准库
Tools：语言实现的工具，帮助开发，比如REPL，debuggerval

Coursera, ProgrammingLanguage

programming

This post is licensed under CC BY 4.0 by the author.