4

I think i've understood more or less what a parsed Scheme program looks like (a binary tree with atomic values on the leaves, if i have understood correctly). Can anybody please define to me, or give a reference, what a state (or a computation) of a Scheme program is? Is it just the current binding plus a position, or a stack of positions, on the syntax tree? (In such a case, i would appreciate a reference for a formal definition of Scheme binding as well :).)

Is there some simple description, like the one for the Turing Machine (the program + the current content of the tape + the current position on the tape)?

Alexey
  • 932
  • 7
  • 19
  • 1
    check out [SECD machine](http://en.wikipedia.org/wiki/SECD_machine). Also, "Introduction to Functional Programming Systems Using Haskell" 1992 by Davie, describes it on pp 109 - ... (preview on Google Books). – Will Ness Mar 30 '13 at 14:45

4 Answers4

3

Both of the other answers look excellent to me.

I would add the following: the state of a scheme program is... a scheme program! That is, you can define a "meaning" function for scheme programs by showing how programs reduce to other programs. This is called a "small-step" semantics. To show you what I mean: what are the "steps"--in a seventh grade sense--in evaluating

3 * (4 + 5)

? According to the small-step rules provided by most teachers, the first step in evaluating this program is to reduce it to

3 * 20

In exactly the same way, you can define a small-step semantics for Scheme that reduces a program to an answer by taking a series of reduction steps.

John Clements
  • 351
  • 1
  • 5
  • I have though of this, but could not convince myself that there is nothing more in a state. For example, if there is some external input-output. Could you give me some reference to more information on small-step semantics for Scheme, please? It is hard for me for now to translate operational semantics to such "program transformation" semantics, for example i do not see what would be a small step to reduce something like `(quote x)`, or why this form is "irreducible". – Alexey Mar 29 '13 at 07:28
  • 1
    Well, you have two choices here. You can regard inputs and outputs as being "internal" to the model, or "external." In the first case, you'd probably use tricks like Morrisett/Felleisen/Harper to model external streams as mutable variables specified at the outside of the program. In the second case, you would probably model input operations as nondeterministic, and output operations as no-ops. more... – John Clements Apr 01 '13 at 22:36
  • Alternatively, you could go with a "labeled transition" semantics that attaches outputs to the arrows; basically, each step produces both a new expression and an optional output. If you're looking for a book, I think I'd probably recommend "Engineering Reduction Semantics with PLT Redex", by Findler/Felleisen/Flatt . – John Clements Apr 01 '13 at 22:37
  • John, your definition of state agrees with [this lecture](http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-001-structure-and-interpretation-of-computer-programs-spring-2005/video-lectures/1b-procedures-and-processes-substitution-model/) of [a course on Structure and Interpretation of Computer Programs](http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-001-structure-and-interpretation-of-computer-programs-spring-2005/). – Alexey Apr 03 '13 at 06:32
  • There is however something awkward with this semantics/state: its use of `define` seems far from optimal. To make a "small step" of replacing an identifier with its definition, the whole program must be searched for a matching `define` or `set!`. This semantics looks to me a bit impractical even from theoretical view point. The rules for searching for a matching `define` are complicated (for example, it should not occur in a quoted expression). – Alexey Apr 14 '13 at 09:48
  • All substitutions are going to require searching the whole program; naturally, an implementation would avoid (or amortize) this cost by using an environment. A semantics such as the one we're describing here is chosen for its simplicity, not for its performance. – John Clements Apr 15 '13 at 21:01
  • I think when i was asking for a simplest definition, i meant the one where transitions from state to state would be the easiest to define. Otherwise it can be said that a program has only 2 states: the program itself and its result, and everything happens during the transition from the first to the second. – Alexey Apr 17 '13 at 07:41
  • I am now at lecture 5 out of 10 on SICP, and it looks to me that the environment (in the form of a pointer in a rooted tree of frames, each frame consisting of variable bindings), should be a part of the state. – Alexey Apr 17 '13 at 07:44
  • Two responses, to your two comments. In your first comment, you're basically describing the difference between a "small-step" and "big-step" semantics; the first takes many small steps to get to the answer, the second takes one giant step to get to the answer. – John Clements Apr 17 '13 at 17:21
  • In response to your second question: the environment is part of the state if you're using a model with environments. If you're using substitution, though, you don't need environments. Environments are a way of implementing substitution efficiently. It looks to me like you'd *really* get a lot out of Shriram Krishnamurthi's (free, online, PDF) textbook, "Programming Languages: Application and Interpretation" – John Clements Apr 17 '13 at 17:23
  • Thanks for the reference, i will look at that book. About the substitution model, it is claimed in SICP that it is not appropriate once you allow assignment. – Alexey Apr 17 '13 at 18:09
  • That's true... unless you instead substitute memory locations, rather than values. That's an unusual combination, though, I agree. – John Clements Apr 17 '13 at 18:42
2

I'm not aware of any formal definition, but I can speculate a bit.

Basically you can represent any program with a lambda expression in lambda calculus. A binary tree is a simpler view introduced by creators of Scheme, to give a nice idea to newbies in computer science.

Lambda calculus explicitly defines operational semantics for evaluation of lambda expression. Basically this means, in lambda calculus, they define how you will calculate the actual value of any lambda expression, or if we map lambda expressions to programs, how a program will work.

There are different alternatives for defining operational semantics of a programming language. In lambda calculus world these alternatives involve reduction rules and reduction strategy. They always seem like identical but small variations change how lambda expression is reduced. In programming world this corresponds to how program executes. For example different reduction mechanisms might evaluate the actual parameters of a function before or after function definition is evaluated. This corresponds to eager and lazy parameter evaluation mechanisms of different programming languages.

In Scheme it is possible to represent each expression in a simpler binary tree view. Similar to lambda expressions, binary trees are reduced from leaves to root. So each configuration of a Scheme program running is still a binary tree.

However, function calls complicate this. When you call a new function, you have a new Scheme expression to evaluate. It is semantically possible to attach the new expression to the binary tree by replacing the function call. But practically it is not a nice method. I know that List does not use that approach, instead use a stack of configurations, each corresponding to a function. I wouldn't be surprised if Scheme follows this faction.

Moreover, bindings complicate that issue further. Scheme is not purely functional, it is possible to define bindings and their semantic should be included in the program by obeying scope rules. Even when there are simple scope rules, it is difficult to combine bindings with binary expression trees. Simplest way is using virtual links to bindings (like pointers).

These are only based on my observations from the times I've used Scheme, and some of them might false. But I guess it's clear that a very simple model such as Turing Machine is not purely applicable when factors such as function call and binding go into the picture. Programming language and compiler design is a complex puzzle with many sub-problems inside.

There is book available on web, called An Introduction to Scheme and its Implementations. It might provide more insight and real facts about the Scheme implementation. If you want to learn more on this subject I recommend reading a programming languages book on lambda calculus and a compiler design book (probably the famous dragon book).

  • Thanks for the information and references, but i am looking for a one- or two-line definition if possible. – Alexey Mar 28 '13 at 16:38
2

State and meaning are two very different things, and your question sort of conflates them.

To the question "What does this Scheme program mean?", one can look at a semantics, either an informal one (a specification written in English, say), or a formal one (usually an operational semantics, but occasionally a denotational semantics, etc.). However, the semantics of a real-world programming language like Scheme is going to be extremely complicated.

Now, if you've chosen an operational semantics (as opposed to a denotational semantics), it will have something that looks like a state (you described the state of the operational semantics of a Turing Machine in your question). This may be a reasonably simple object... however, there are many different operational semanticses you can come up with for Scheme (or any real-world language), and each of them will have their state look like something different!

So, essentially, there are infinitely many answers to your question, none of them is going to be illuminating without its corresponding semantics.

The Turing Machine, because the way it's usually presented is so close to an operational semantics anyways (because it's so low-level and "state-y"), doesn't really have this problem, because there's only one good choice, which you described well in your question.

Paul Stansifer
  • 196
  • 1
  • 3
  • I think i would be satisfied with any single answer out of infinitely many, preferably the shortest one, for any single "semantics". (I guess that without a semantics a Scheme program is just a sequence of Unicode symbols anyway.) If a Scheme program is not to be evaluated but to be "reduced", i would be interested to hear about it too, but from what i have read so far, it looks like after all a Scheme program describes a "computation" in some sense. As far as i know, a computation must be a sequence of states. – Alexey Mar 28 '13 at 22:07
  • 2
    Well, the simplest one is probably the least helpful: the state of a Scheme program is simply the memory of an x86 machine executing a conforming implementation of Scheme. I say this not to be difficult, but to illustrate that, absent a semantics, the state is not very useful. But anyways, here's an actually useful operational semantics for Scheme: http://www.eecs.northwestern.edu/~robby/an-operational-semantics-for-scheme/scheme-opsem.pdf (in general, arrows go between two states). You'll see that they start with an environment and an expression, much like what you suggested... – Paul Stansifer Mar 28 '13 at 22:21
  • ...but then things get more complicated as they reflect more and more of Scheme's actual features. – Paul Stansifer Mar 28 '13 at 22:22
  • Thanks for the link, Paul. I hoped there was an easier description of a state that "machine memory". By the way, it is not a complete definition, because you would need to also define *what* is stored in the memory and *how*. I will try to look through the article. – Alexey Mar 29 '13 at 07:35
0

After looking into Alonzo Church's The calculi of lambda-conversion, and thinking a bit about rewriting systems, different definitions of algorithms, and about Lisp, I have come to mostly agree with John Clements' answer: that the state of a Scheme program is a "reduced" Scheme program.

I would like however to formulate my own version of it (which i can edit or improve if I learn more).

In my opinion, a state of a Scheme program can be viewed as a sort of a Scheme program, where instead of constant literals, arbitrary constant data values are allowed (they have to be allowed because I am not sure if, for example, an arbitrary float value can be written as a float literal).

For example, a program can execute as follows (the sequence of states):

The program:

(define x 0)
(set! x (lambda (x) (+ x 1)))
(define y 2)
(x y)

Program state after step 1:

(define x (lambda (x) (+ x 1)))
(define y 2)
(x y)

Program state after step 2:

(define y 2)
((lambda (x) (+ x 1)) y)

Program state after step 3:

((lambda (x) (+ x 1)) 2)

Program state after step 4:

(+ 2 1)

Program state after step 5:

3

Update. To confirm this point of view, here are quotes from Racket reference:

1.1 Evaluation Model

Racket evaluation can be viewed as the simplification of expressions to obtain values. For example, just as an elementary-school student simplifies

  1 + 1 = 2

Racket evaluation simplifies

  (+ 1 1) → 2

The arrow above replaces the more traditional = to emphasize that evaluation proceeds in a particular direction towards simpler expressions. In particular, a value is an expression that evaluation simplifies no further, such as the number 2.

[...]

1.1.4 Top-Level Variables

Given

x = 10

then an algebra student simplifies x + 1 as follows:

x + 1 = 10 + 1 = 11

Racket works much the same way, in that a set of top-level variables are available for substitutions on demand during evaluation. [...]

Each evaluation step, then, takes the current set of definitions and program to a new set of definitions and program. Before a define can be moved into the set of definitions, its right-hand expression must be reduced to a value.

  defined:
  evaluate: (begin (define x (+ 9 1)) (+ x 1))
→ defined:
  evaluate: (begin (define x 10) (+ x 1))
→ defined:  (define x 10)
  evaluate: (begin (void) (+ x 1))
→ defined:  (define x 10)
  evaluate: (+ x 1)
→ defined:  (define x 10)
  evaluate: (+ 10 1)
→ defined:  (define x 10)
  evaluate: 11

Using set!, a program can change the value associated with an existing top-level variable:

  defined:  (define x 10)
  evaluate: (begin (set! x 8) x)
→ defined:  (define x 8)
  evaluate: (begin (void) x)
→ defined:  (define x 8)
  evaluate: x
→ defined:  (define x 8)
  evaluate: 8

I have also read about the SECD machine, closures, and environments, but they look to me like just a peculiar implementation method.

Alexey
  • 932
  • 7
  • 19