16

There's a quotation by Ralph William Gosper, Jr that says:

A data structure is just a stupid programming language.

What did he mean by this? Alas, all I can find in Google about it is relentless copy/pastes of the quotation itself, without any context.

missingfaktor
  • 3,906
  • 1
  • 24
  • 31
  • 2
    http://en.wikipedia.org/wiki/Homoiconicity – nicerobot Dec 05 '11 at 18:09
  • 1
    This type of question is now being [discussed on our meta-discussion site](http://meta.programmers.stackexchange.com/q/2645/8). –  Dec 05 '11 at 20:17
  • There are languages with Turing-complete type systems. Some of them are stupid. – SK-logic Dec 06 '11 at 08:39
  • @SK-logic: What do type systems, Turing complete or otherwise, have got to do with this quote? – missingfaktor Dec 06 '11 at 09:11
  • @missingfaktor, data structures *are* types. And you can code arbitrary complex programs in data structures definitions only - for example, see the entry "*Static Haskell programmer*" here: http://www.willamette.edu/~fruehr/haskell/evolution.html (and yes, it is quite stupid, so the quote is relevant) – SK-logic Dec 06 '11 at 09:38
  • @SK-logic, no data structures can be described (specified) by types. It's true that you can encode them into some type systems to run at compile time. Still, I wouldn't claim that data structures ≡ types. – Rehno Lindeque Sep 27 '12 at 05:34
  • 1
    @RehnoLindeque, have you ever seen Agda or Coq? Types can be Turing-complete. – SK-logic Sep 27 '12 at 06:01
  • @SK-logic, I realize that you can perform (universal) computations using types, but like I said - encoding a data structure (which is essentially just an algorithm) using types does not make types ≡ data structures. You're conflating types that describe data structures with types that perform computations. – Rehno Lindeque Sep 27 '12 at 18:55
  • @SK-logic sorry, that last sentence sounded more rude than I intended. I meant, "From my point of view your comment *seems* to be conflating types that describe data structures with types that perform computations." Computations performed at the type-level just doesn't seem very relevant to the quote IMO. – Rehno Lindeque Sep 27 '12 at 19:16

4 Answers4

10

Well, it seems like the heart of the statement is:

A data structure is just a ... programming language

Which is quite true if you think about it. After all, compilers rely on this transitivity all the time; they take a programming language, convert it into a data structure, do some transformations on that data, and then turn the result into another programming language.

In fact, if you wanted to you could even make something crazy like a C data structure, that lets you write C code by calling its various methods - for instance (in kinda C#, because that's what I'm using right now):

var C = new HorribleCObject();
C.Function<int>("main", typeof(char[][]), typeof(int))
  .Variable("i", typeof(int), 0)
  .While("i", Func(i) => i < 10))
     .Call("printf", "%d", "i")
     .PostIncrement("i")
  .EndWhile();
  .Return(0)
 .EndFunction();

Now, as to the full quote: why would something like that be stupid compared to (say) writing in C itself? It should be pretty obvious that this is verbose and not nearly as legible as its equivalent in C (and, in practice, might not support the full scope of what C can do - typedefs would be tricky); hence, this data structure is just a "stupid" programming language, embedded in a "real" programming language. That same logic can be generalized to any data structure you can think of; linked lists are just a "stupid" version of Lisp, and hash maps are just a "stupid" version of some theoretical Hash Programming Language (Hasp?).

The thing is, though, that we don't always want to write Hasp in order to interact with our hash maps. It's the problem all domain specific languages have - on the one hand, a well-implemented DSL is powerful enough to express everything the underlying model can do; on the other hand, you have to implement the DSL in the first place, and then other people have to learn it. That takes time and effort that they probably don't want to spend; after all, I just want to put things in my hash map and then check other things are in there, I don't want to learn all the intricacies of Hash Oriented Programming.

So, pretty much without thinking about it, we take these theoretical highly specific and very smart programming languages and distill them down to the few, stupid operations embodied in a data structure. A linked list has one small collection of simple methods; a hash map has some others. We ignore the other, more powerful operations you could potentially perform over the data structure (most LinkedList implementations don't have a .Map or .ForEach function, for instance, and I can't even imagine what you would do in Hasp), in favor of implementing them explicitly in the parent programming language - which is what most programmers are going to be familiar with.

Data structures are, essentially, a stupid extension of their parent language into the problem space that they conceptually represent. A sufficiently smart extension would require a new, specific programming language, and most people aren't going to want to learn that.

Tacroy
  • 667
  • 6
  • 11
2

A data structure is a REPRESENTATION of a programming language. But not a particularly "sharp" one.

This can be seen from a "node diagram" like the one in the wiki article below:

http://en.wikipedia.org/wiki/Root_node#Terminology

Nevertheless, a data structure is INCOMPLETE as a programming language, because it lacks syntax and complete thoughts that would be intelligible to a programmer. A data structure's "language" might be compared to a child who said something like, "Me, cold. Get coat."

The "language" is fractured, but can be understood. The child is saying that "s/he is cold, and would like more clothes as covering." The child's utterance is a "stupid" version of the English language, and likewise data structure in relation to a programming language.

Tom Au
  • 893
  • 7
  • 17
1

I believe that what Bill Gosper intended is that all data structures are just programming constructs with limited applicability. This is also related to the idea that "Language design is library design and library design is language design" [1].

One way of thinking about the issue is to consider data structures merely on an algorithmic basis. Forget about storage requirements or type annotations for the moment because these are simply ancillary.

For example, you could codify an associative array (called a map in some languages) in two ways: Either by using some kind of index stored in memory or by using a simple case expression.

In Haskell you could codify an associative array as a data structure...

let assocArray = [("a", 1),("b", 2),("c", 3)]
let key = "b"
lookup key assocArray

...or by using a case expression...

let key = "b"
case key of 
  "a" -> 1
  "b" -> 2
  "c" -> 3

...or even more directly...

let key = "b"
if key == "a" 
  then 1 
  else if key == "b"
    then 2
    else if key == "c"
      then 3
      else undefined

It is easy to see to that this kind of mirroring between data structures and code is possible by looking at the lambda calculus. Any value can be represented by a function in the lambda calculus and the calculus itself is universal (turing complete).

[1] The quote is thanks to Bjarne Stroustrup.

0

Consider Javascript, where all data is code. Consider LISP, where all data is code and all code is data.

In the beginning, end, and everywhere in between, data is just bits. That we attempt to ontologize bits with text and symbols to make them easily human-readable and human-transformable is a layer of abstraction that requires a) You learn the definition language and b) you learn the leakiness of the abstraction.

For example, in C#, learning the difference between a struct and class requires you learn the difference in equality comparison between value-types and reference-types. Every data ontology requires its own set of rules you must learn and adhere to. And, like any language, it allows you to quickly get to the general idea, but the closer you want to approach the actual truth of the matter, the more you should just look at the binary yourself.

Finally, when one considers a B-tree or similar data structure, navigating the structure of the tree and performing other kinds of operations on it requires a specialized kind of syntax that's not necessarily transferable across trees, structures, or languages.

Legatou
  • 117
  • 3
  • 3
    I’m not sure this really gets to the heart of it. Generic programming, for instance, is specifically about data structure–agnostic algorithms (typically with iterators or ranges). – Jon Purdy Dec 05 '11 at 17:35
  • 4
    Are you sure this is what Ralph William Gosper, Jr. actually meant? – Robert Harvey Dec 05 '11 at 18:53
  • In Common Lisp, not all data can be compiled as code, but all code can be treated as data. There aren't many syntax rules, but all code has to be S-expressions, at least after macro processing, and not all data is S-expressions. – David Thornley Dec 05 '11 at 22:47