6

I'm writing a compiler that compiles to C, one thing I'm attempting to do is implement a construct like Rust's match:

// { some function
    let mut foo = 32;
    match foo {
        3 => return "hey",
        4 | 5 | 6 => return "foo",
        9 ... 12 => return "blah",
        _ => return "bar",
    }
// }

How would a construct like this look in C code? I was thinking something simple like this could be a switch construct like so:

switch (foo) {
    case 3: return "hey";
    case 4: case 5: case 6: return "foo";
    case 9: case 10: case 11: case 12: return "blah";
    default: return "bar";
}

And most importantly, how would it look for more complicated examples like this:

let pair = (0, -2);

match pair {
    (0, y) => println!("First is `0` and `y` is `{:?}`", y),
    (x, 0) => println!("`x` is `{:?}` and last is `0`", x),
    _      => println!("It doesn't matter what they are"),
}

Where the tuple is destructured and can handle values too?

Jon Flow
  • 363
  • 2
  • 7
  • I'm not familiar with Rust, but Swift has a `switch` statement with this same functionality, and it is spelled mostly like the C `switch`. Here's the doc: https://developer.apple.com/library/prerelease/content/documentation/Swift/Conceptual/Swift_Programming_Language/ControlFlow.html#//apple_ref/doc/uid/TP40014097-CH9-ID127 – jscs Sep 26 '16 at 16:47
  • @JoshCaswell I'm not sure what the spelling has to do with the implementation or semantics... – 8bittree Sep 26 '16 at 17:23
  • I'm not sure which facet the question is actually about, so I thought another example would be useful, @8bittree. "And most importantly, how would it look..." suggests that syntax is under consideration. – jscs Sep 26 '16 at 17:25
  • 1
    @JoshCaswell and Jee, I think the main issue here might be how to implement pattern matching and destructuring in C. `match` isn't the only place these can happen in Rust, e.g. `let (a,b) = (1,2);`. Solving the simpler situations might make the solution for `match` more obvious. Knowing how you're implementing tuples and other complex data types may also help. – 8bittree Sep 26 '16 at 17:35
  • This is called generalized pattern matching. It must be built into the language. C does not have this functionality. – gardenhead Sep 26 '16 at 18:00
  • @8bittree Hmm. I've managed to support destructuring of simple tuples and structures like that in this language, so perhaps it might just make sense when I get down to implementing it with the match statement. I think I sort of have an idea at the moment too – Jon Flow Sep 26 '16 at 18:06
  • I've just discovered this [rust playground site here](https://play.rust-lang.org), and I can compile my example to MIR which is their intermediate language... perhaps it would be easier in this case to do the same with my language, translate the match into more primitive instructions and then translate those into C? – Jon Flow Sep 26 '16 at 18:09
  • That may help. There is a post on the Rust blog that [introduces MIR](https://blog.rust-lang.org/2016/04/19/MIR.html), you might find that useful as well. They actually include an example of simplifying `match`, but it's only a very simple enum match and destructure, which should be fairly straightforward to implement in C anyway, compared to more complex matches and destructuring, such as your example. – 8bittree Sep 26 '16 at 18:21
  • It might help to look at how scalac uses unapply functions to translate pattern matching / unpacking into jvm bytecode (readable as java) – Daenyth Sep 26 '16 at 18:36
  • This is a dup of a question on StackOverflow that has great answers: http://stackoverflow.com/questions/586362/pattern-matching-implementation#642896 – Jason Orendorff Sep 28 '16 at 16:45
  • Or rather, that question is specifically about Erlang, but the answers citing specific papers and books are relevant anyway. – Jason Orendorff Sep 28 '16 at 16:51

1 Answers1

5

Let's consider a super basic example. Your language only has two types, int and pair. Pairs can hold other pairs. For your "cases", you also have unbound variables.

So when you pass in the value to match, you can have an int or a pair. The pattern to match can be an int, a pair, or an unbound-variable. The pattern matching algorithm then goes:

if my pattern is an int, and the input is an int, if they're equal, do this.
if my pattern is a var, assign the value to the var, and do this.
if my pattern is a pair, and the input is a pair, then if 
    pattern-match(pattern.first, value.first) && 
    pattern-match(pattern.second, value.second), do this.
otherwise, try the next pattern.

For basic Lisps, which only have symbols and pairs - that's all they need. For other languages, you may need to extend the basics to handle other types, and maybe other structures.

And hopefully you can see how this becomes dicey once you have mutable state.

Telastyn
  • 108,850
  • 29
  • 239
  • 365