[Ocaml MOOC] week3: MORE ADVANCED DATA STRUCTURES
Last week, we only defined flat data structures which are nice to aggregate values but quite limited when you try to structure values.
This week: algebraic datatypes .
1. TAGGED VALUES
⇒ change the return type to a type
query_result
, which can be either of these:
 an error
 a new database (in case of successful insertion/deletion)
 a contact and its index (in case of successful search)
in ocaml, can define such a type (called sum type ) by :
# type query_result =
 Error
 NewDatabase of database
 FoundContact of contact*int;;
More generally, to define disjoint union of types:
type some_type_identifier =
 SomeTag of some_type
 ...
 SomeTag of some_type
tag must start with uppercase letter
Taga are also called
conscturcors
, grammar is like java constructors:
SomeTag (some_expr, ..., some_expr)
(the parenthesis can be omitted if only 1 expr is required)
enumeration:
type color = Black  Gray  White;;
observing tagged values
must prvide an expression
for each possible case
of the value. A
case
is described by a pattern:
SomeTag (some_pattern, ..., some_pattern)
A branch is composed of a pattern an an expr separated by an arrow.
some_pattern > some_expr
pattern matching is a seq of branches:
match some_expr with
 some_pattern > some_expr
...
 some_pattern > some_expr
example:
let engine db query =
match query with
 Insert contact > insert db contact
 Delete contact > delete db contact
 Search name > search db name;;
synatactic shortcut: function keyword (for functions with only 1 argument)
pitfalls
 illtyped pattern
 nonexhaustive case analysis
These errors can be caught by the checker.
2. RECURSIVE TYPES
data structures with unbounded depth, ie, list/tree.
For example, an integer list can be defined as:
# type int_list =
 EmptyList
 SomeElement of int * int_list;;
type int_list = EmptyList  SomeElement of int * int_list
in the machine:
functions on such datastruct usually use pattern matching:
# let rec length = function
EmptyList > 0
SomeElement (x,l) > 1 + length l;;
val length : int_list > int = <fun>
The predefined type in ocaml:
t list

empty list:
[]
([]
is just a special tage corresponding to EmptyList) 
head and tail:
i::r
(::
is just a special tage corresponding to SomeElement) 
a list can be defined by enumeration:
[some_expr; ...; some_expr]

list concatenation:
@
# let rec length = function
 [] > 0
 x::xs > 1 + length xs;;
val length : 'a list > int = <fun>
# length [1;2;3;];;
 : int = 3
# let rec rev = function
 [] > []
 x::xs > (rev xs)@[x];;
val rev : 'a list > 'a list = <fun>
# rev [1;2;3;4];;
 : int list = [4; 3; 2; 1]
the
rev
function above has quadcomplexity → here is the tail rec version:
# let rec rev_aux accu = function
 [] > accu
 x::xs > rev_aux (x::accu) xs;;
val rev_aux : 'a list > 'a list > 'a list = <fun>
# let rev l = rev_aux [] l;;
val rev : 'a list > 'a list = <fun>
3. TREELIKE VALUES
the database type is formed in a (binary)treelike fashion:
# type database =
 NoContact
 DataNode of database * contact * database;;
type database = NoContact  DataNode of database * contact * database
impose the BST invariant:
Now the functions insert/search/delete is BST fashion:
4. CASE STUDY: A STORY TELLER
typedirected programming: writing the right type declaration is half success.
define a story type (and other types:
type story = {
context: context;
perturbation: event;
adventure: event list;
conclusion: context;
}
and context = {characters: character list}
and character = {
name: string;
state: state;
location: location;
}
and event =
 Change of character * state
 Action of character * action
and state = Happy  Hungry
and action = Eat  GoToRestaurant
and location = Appartment  Restaurant;;
5. POLYMORPHIC ALGEBRAIC DATATYPES
parametric programming: example — list is parametrized by the element type.
Hence in
List
module contains polymorphic functions.
Good for code reuse.
define your own polymorphic types, using '
a
to indicate unkonw types:
type ('a1,...,1aN) some_type_identifier = some_type
example:
type 'a option =
 None
 Some of 'a;;
type ('a, 'b) either =
 Left of 'a
 Right of 'b;;
type square = {dimension: int);;
type circle = {radius: int);;
type shape = (square, circle) either;;
another example: bst:
type 'a bst =
 Empty
 Node of 'a bast * a' * 'a bst ;;
let rec insert x = function
 Empty > Node (Empty, x, Empty)
 Node (l, y, r) >
if x=y then Node (l,y,r)
else if x<y then Node (insert x l, y, r)
else Node (l, y, insert x r);;
6. ADVANCED TOPICS
precise typing
when 2 types have the same structure but different semantical meaning: a sum type with only one constructor can be useful to distinguish them.
example:
type euro = Euro of float;;
type dollar = Dollar of float;;
let euro_of_dollar (Dollar d) = Euro (d /. 1.33);;
let x = Dollar 4;;
let y = Euro 5;;
let valid_comparison = (euro of dollar x < y)
disjunctive patterns
Use orpatterns to factorize branches into a unique branch:
some_pattern_1  some_pattern_2
means observation of either pattern 1 or pattern 2.
constraint: both must contain the same identifiers .
ex:
let remove_zero_or_one_head = function
 0::xs  1::xs > xs
 l > l
let remove_zero_or_one_head' = function
 (01)::xs > xs
 l > l
aspatterns
convenient ot
name a matched component
:
some_pattern as x
( if the value can be observed using some_pattern, name it x)
ex.
let rec duplicate_head_at_the_end = function
 [] > []
 (x::_) as l > l @[x]
guard: pattern matching branch using
when
a guard (some boolexpression) can add an extra constraint to a pattern:
ex.
let rec push_max_at_the_end = function
 ([]  [_]) as l > l
 x::((y::_) as l) when x<=y > x::(push_max_at_the_end l)
 x::y::ys > y::push_max_at_the_end (x::ys);; (*when x>y, should permuate x and y*)