Solving Partially Reduced Problem

We met the “partially reduced problem” when we were adding the builtin function feature. That is, the primitive expression is returned unevaluated, though it can still be reduced. Actually, it doesn’t only happen on the primitive expression, but also exists for ordinary expressions. Let’s try another one:

$ ./main
Welcome to Lambda Calculus Evaluator.
Press Ctrl+C to quit.

> (lambda x (lambda y y x)) 1 (lambda x x)
-> (lambda x x) 1

Obviously, the result can be further reduced to 1. Let’s do a simulation of the evaluation algorithm to see why we have such problem. The expression is evaluated from left to right. First, we need to evaluate (lambda x (lambda y y x)) 1. The left hand side is an abstraction, whose body is an abstraction too. We need to evaluate its body first. The inner body y x is evaluated to y x, so it remains unchanged. The right hand side is a constant and the result is itself. After performing a beta-reduction, the expression turns to (lambda y y 1). Then, we have (lambda y y 1) (lambda x x). Both left and right hand sides are abstraction and keeps unchanged after evaluating their bodies. Finally a beta-reduction is performed and it results in (lambda x x) 1. It is not a primitive expression, so the evaluation terminates and result (lambda x x ) 1 is returned.

This kind of “partially reduced problem” exists in many expressions, so we need to fix it. However, before doing that, we will define what “totally reduced” is, as opposed to “partially reduced”.

Evaluation Values

Partially reduced result is generated because the evaluator terminates at a point where it shouldn’t. So essentially we are talking of the termination condition. That means our evaluator should only terminate where some conditions are met. Otherwise, it should keep evaluating the resulted expression. So if the resulted expression doesn’t meet the conditions, it should never be returned. I’d like to call the expressions that can be returned as evaluation values, values for short, because they are the only results our evaluator can give out.

From our current evaluation algorithm, we can easily identify that identifiers and constants are values. For an abstraction, though we may evaluate its body first(it turns out to be a bad strategy, we will get rid of this later), we can still return it as a result. For applications and primitives, we will always perform reductions on them. So we define the evaluation values as:

V = identifier | constant | abstraction

The evaluator terminates only if it can return the above values. It should continue to evaluate the expression if it is an application or primitive expression. However, the evaluator doesn’t always guarantee to return a value for every expression. Take x y as an example, it is an application expression but it can not be further reduced because x is not a defined function. Another example is + (lambda x x) 1, with the + function can only be applied on constants. It doesn’t make sense to apply + on abstractions, so it cannot be reduced to a value. Those two are invalid expressions, and we will add error handling mechanism for them in the future. Another set of expression is those that can be reduced but it never stops, such as (lambda x x x) (lambda x x x). It is an infinite loop, and the evaluator runs forever. The evaluator needs to take both of these expressions into consideration, too.

Defective Evaluation Strategy

I never inspected our current evaluation strategy in the past posts, so I am going to discuss it in detail and find the defects. Currently, the evaluator obeys the following rules:

1) Evaluate the body of an abstraction before returning it;
2) Evaluate both parts of the application first, then perform beta-reduction for once, and return the result.

In rule 2), I perform beta-reduction for only once because I assume after performing rule 1) to an abstraction expression, it will be simple enough so no reducible expressions could be generated by applying it to any expression. For example, for expression (lambda x (lambda y y) x) 1, before applying the abstraction to 1, it will be simplified to (lambda x x). After that, we apply it to 1 and we get the result 1. However, this is not always true. The expression mentioned at the beginning of this post is a good counterexample. And we probably have this problem when providing an abstraction as a parameter to another abstraction.

The defective assumption for rule 2) is the root cause of “partially reduced problem”. Besides that, rule 1) is also a stupid choice. Consider expression (lambda x 7) (lambda x (lambda y y y) (lambda y y y)). Apparently, the result is 7, but if we evaluate the body of an abstraction first, we will go into an endless loop.

From the above analysis, we learned that performing beta-reduction only once on an application may not always give a value, and evaluating the body of an abstraction first is not a good strategy. Based on this, we are going to propose a robust evaluation strategy.

Robust Evaluation Strategy

Our robust evaluation strategy states the following rules:

1) Return an abstraction expression directly, without evaluating its body;
2) Keeps evaluating the expression until we get a value.

We can easily change the evaluator to use this new strategy. Here is the code:

 1 // file: eval.c
 2 TreeNode * evaluate(TreeNode *expr) {
 3     TreeNode* result = expr;
 4     if(expr!=NULL) {
 5         switch(expr->kind) {
 6             case IdK:
 7             case ConstK:
 8             case AbsK:
 9                 return expr;
10             case AppK:
11                 expr->children[0] = evaluate(expr->children[0]);
12                 expr->children[1] = evaluate(expr->children[1]);
13                 if(expr->children[0]->kind==IdK) {
14                     // expand tree node for builtin functions
15                     BuiltinFun* fun = lookupBuiltinFun(expr->children[0]->name);
16                     if(fun!=NULL) {
17                         expr->children[0] = (fun->expandFun)();
18                     }
19                 }
20                 return evaluate(betaReduction(expr));
21             case PrimiK:
22                 expr->children[0] = evaluate(expr->children[0]);
23                 expr->children[1] = evaluate(expr->children[1]);
24                 // only perform primitive operation if operands are constants
25                 if(expr->children[0]->kind==ConstK
26                     && expr->children[1]->kind==ConstK) {
27                     result = evalPrimitive(expr);
28                 }
29                 return result;
30             default:
31                 fprintf(errOut,"Unkown expression kind.\n");
32         }
33     }
34     return expr;
35 }

Abstractions are returned just like identifiers and constants, and we also evaluate the result of a beta-reduction. One thing I need to mention here is that we have expressions which cannot be reduced to values, as described in “Evaluation Values” section. For those invalid expressions, the evaluator could identify it. For example, for x y, we will treat x as an identifier of a builtin function and we will fail because it is not. For + (lambda x x) 1, after expanding + we will finally try to evaluate the primitive expression (lambda x x) + 1, and by checking its operands we will know it is invalid. We will add error handling mechanisms to keep it consistent with current computation model for these invalid expressions. For now, the evaluator just prints an error message and stop evaluating. For the expressions causing endless loops, I don’t think we can find any method to determine whether an expression runs forever, because it is a halting problem. So our evaluator doesn’t handle this case, and it will run forever. It is the programmer’s responsibility to make sure there is no endless loop in the code. Here is the revised code:

 1 // file: eval.c
 2 TreeNode * evaluate(TreeNode *expr) {
 3     TreeNode* result = expr;
 4     if(expr!=NULL) {
 5         switch(expr->kind) {
 6         case IdK:
 7         case ConstK:
 8         case AbsK:
 9             return expr;
10         case AppK:
11             expr->children[0] = evaluate(expr->children[0]);
12             expr->children[1] = evaluate(expr->children[1]);
13             if(expr->children[0]->kind==IdK) {
14                 // expand tree node for builtin functions
15                 BuiltinFun* fun = lookupBuiltinFun(expr->children[0]->name);
16                 if(fun!=NULL) {
17                     expr->children[0] = (fun->expandFun)();
18                 } else {
19                     fprintf(errOut, "Error: %s is not a builtin function.\n", expr->children[0]->name);
20                     return expr;
21                 }
22             }
23             return evaluate(betaReduction(expr));
24         case PrimiK:
25             expr->children[0] = evaluate(expr->children[0]);
26             expr->children[1] = evaluate(expr->children[1]);
27             // only perform primitive operation if operands are constants
28             if(expr->children[0]->kind==ConstK
29                 && expr->children[1]->kind==ConstK) {
30                 result = evalPrimitive(expr);
31             } else {
32                 fprintf(errOut, "Error: %s can only be applied on constants.\n", expr->name);
33             }
34             return result;
35         default:
36             fprintf(errOut,"Unkown expression kind.\n");
37         }
38     }
39     return expr;
40 }

With this robust evaluation strategy, we are able to get rid of the “partially reduced problem”.