-
Notifications
You must be signed in to change notification settings - Fork 16
/
Copy pathscott.lhs
199 lines (146 loc) · 6.5 KB
/
scott.lhs
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
= The Scott encoding =
Encoding data as functions may seem strange at first, but after playing
with combinators for a while,
https://en.wikipedia.org/wiki/Mogensen%E2%80%93Scott_encoding[the Scott
encoding] suggests itself.
Take an algebraic data type:
------------------------------------------------------------------------
data Adt a b c = Foo a | Bar | Baz b c
------------------------------------------------------------------------
Suppose we come across a value `x` of type `Adt`. What can we do with it?
The only nontrivial course of action is to scrutinize it with a case statement:
------------------------------------------------------------------------
case x of
Foo a -> f a
Bar -> g
Baz b c -> h b c
------------------------------------------------------------------------
What's the quickest way to turn this expression into a closed lambda calculus
term?
One idea that springs to mind is to delete the keywords `case` and `of` and
replace each data constructor in a pattern with `\`. Then we get:
------------------------------------------------------------------------
x (\a -> f a) (\ -> g) (\b c -> h b c)
------------------------------------------------------------------------
Here, a lambda abstraction with no variables has the obvious meaning, namely,
`+\ -> g+` is the same as `g`.
Then the Scott encoding of the value `x` of type `Adt` is whatever it takes to
make the above work like the original case expression.
For example, the value `Foo 42` must be the function:
------------------------------------------------------------------------
\f _ _ -> f 42
------------------------------------------------------------------------
the value `Bar` must be:
------------------------------------------------------------------------
\_ g _ -> g
------------------------------------------------------------------------
and the value `Baz "qux" 9000` must be:
------------------------------------------------------------------------
\_ _ h -> h "qux" 9000
------------------------------------------------------------------------
More generally:
------------------------------------------------------------------------
Foo a = \f _ _ -> f a
Bar = \_ g _ -> g
Baz a b = \_ _ h -> h a b
------------------------------------------------------------------------
== Booleans, Numbers, Lists ==
We would like to define booleans as:
------------------------------------------------------------------------
data Bool = False | True
------------------------------------------------------------------------
Since `False` appears before `True`, a compiler seeing this would naturally
index them with the numbers 0 and 1 respectively, which matches common
practice. It matches Haskell, too, as can be seen in the functions: `fromEnum
toEnum minBound maxBound succ pred`.
Unfortunately, for the time being, we must define:
------------------------------------------------------------------------
data Bool = True | False
------------------------------------------------------------------------
so our compiler produces the Scott encodings:
------------------------------------------------------------------------
True = \x _ -> x
False = \_ y -> y
------------------------------------------------------------------------
This is because long before computers were commonplace, Alonzo Church devised
this particular encoding of booleans, which subsequently became standard.
We bow down to this convention to avoid confusion.
Peano numbers are defined by the following algebraic data type:
------------------------------------------------------------------------
data Peano = Zero | Succ Peano
------------------------------------------------------------------------
The Scott encoding is:
------------------------------------------------------------------------
Zero = \f _ -> f
Succ n = \_ g -> g n
------------------------------------------------------------------------
Unlike Church numerals, the predecessor function (where we define the
predecessor of zero to be zero) is easy to write down:
------------------------------------------------------------------------
predecessor n = case n of
Zero -> Zero
Succ n -> n
------------------------------------------------------------------------
Using the Scott encoding:
------------------------------------------------------------------------
predecessor n = n Zero (\n -> n)
------------------------------------------------------------------------
Pure lambda calculus has an undeserved reputation for sloth.
Perhaps one of the misconceptions is arithmetic must be performed in unary.
Not so! We may define numbers in binary instead of unary:
------------------------------------------------------------------------
data Binary = End | Nil Binary | One Binary
------------------------------------------------------------------------
But we may as well use a list of booleans instead.
A list is defined by:
------------------------------------------------------------------------
data [a] = [] | a : [a]
------------------------------------------------------------------------
yielding the Scott encodings:
------------------------------------------------------------------------
[] = \f _ -> f
(:) a as = \_ g -> g a as
------------------------------------------------------------------------
Here's a starter pack of functions for numbers encoded in binary:
\begin{code}
type Binary = [Bool]
suc :: Binary -> Binary
suc bs = case bs of
[] -> [True]
False : bt -> True : bt
True : bt -> False : suc bt
addC :: Binary -> Binary -> Bool -> Binary
addC as bs c = case as of
[] -> if c then suc bs else bs
(a:at) -> case bs of
[] -> if c then suc as else as
(b:bt) -> ((a /= b) /= c) : addC at bt (a && b || c && (a || b))
add :: Binary -> Binary -> Binary
add as bs = addC as bs False
strip :: Binary -> Binary
strip bs = case bs of
[] -> []
True : bt -> True : strip bt
False : bt -> case strip bt of
[] -> []
bt' -> False : bt'
pre' bs = case bs of
[] -> []
True : bs' -> False : bs'
False : bs' -> True : pre' bs'
pre :: Binary -> Binary
pre bs = strip (pre' bs)
decode' n acc bs = case bs of
[] -> acc
False : bt -> decode' (2*n) acc bt
True : bt -> decode' (2*n) (acc + n) bt
decode :: Binary -> Int
decode = decode' 1 0
encode :: Int -> Binary
encode n = if n == 0 then [] else case divMod n 2 of
(n', 0) -> False : encode n'
(n', 1) -> True : encode n'
\end{code}
We have no need for these since we'll extend CL to support the native integer
types of the underlying machine. But we stress again pure lambda calculus
numerals are not condemned to be unary!