
I'm currently creating a programming language for fun, where the idea is that every function call/new block (if clauses, loops etc) will work in a separate thread. Instead of the programmer creating new threads, the language should do that automatically by default, and if you want something to run in the main thread you'll have to specify that.

I'm not that informed about multi-threaded, parallel programming, but I do know the basics (futures, thread-safe objects). Therefore I'm wondering how such a language could look syntax-wise, and whether it's even possible to begin with. The goal is not to make it "useful"; it's more for the fun of it and a learning experience.

(I'm sorry if this is the wrong place to post. If so I would gladly appreciate if you point me to the right place where a question like mine is allowed.)

Giorgio
Grimbox
  • Creating threads is simple. The trick with multi-threading is when they need to talk to each other or work with the same resources. In other words, it's not the forking that's tough, it's the joins. The chief problem you need to solve is how you control that. – JimmyJames Feb 28 '17 at 18:31
  • IMO, you can't do that without seriously stretching the usual definition of the word "function" or the word "thread." You might want to read about _actors_ if you aren't already familiar with them: https://en.wikipedia.org/wiki/Actor_model – Solomon Slow Feb 28 '17 at 18:34
  • Get fine-grained enough, and you'll find that CPUs already parallelize instructions automatically, without putting any extra burden on the software developer. See [Out-of-order Execution](https://en.wikipedia.org/wiki/Out-of-order_execution) and [Superscalar Processor](https://en.wikipedia.org/wiki/Superscalar_processor). – 8bittree Feb 28 '17 at 19:30
  • That actually sounds kind of horrifying. I was going to explain why, but Karl Bielefeldt already did. – Mason Wheeler Feb 28 '17 at 21:36
  • What kind of paradigm do you want to support in your language? Should it be imperative or functional? Are you saying that all statements would be executed at the same time – the then/else blocks together with what comes after them? – Bergi Feb 28 '17 at 23:49
  • I'm not 100% sure. I'm thinking that the program has a global thread pool that always exists. When a statement is called, it's added to the pool's queue and available threads work on it. The same goes for loops and the like. So basically the language is just feeding statements to a pool that handles them. Then, if I need a return value from a statement, I have to make it wait until whatever operation is done. – Grimbox Mar 01 '17 at 00:34
  • Apart from the excellent answer suggesting you check out Erlang, I'd also suggest you look up programming languages that compile to circuits instead of CPU instructions. Languages like VHDL and Verilog generate real-world digital circuits, so when programming in them you have to keep in mind that every block (their equivalent of functions) processes logic in parallel all the time, because the universe does not pause one part in order to run another. Simulations of circuits are by necessity still done sequentially, but the final product runs in parallel. – slebetman Mar 01 '17 at 02:31
  • The problem here seems to be that the suggested programming language uses "Thread" in a way that's definitely at odds with established languages and Operating Systems, but the question fails to establish exactly what the properties of a Thread are in this context. E.g. in many programming languages, each thread has its unique call stack, where all call frames on that stack belong to the same thread, but this language by definition does not have that. – MSalters Mar 01 '17 at 10:06
  • Don't use threads, use processes (which have their own memory). That will make it much more useful. – reinierpost Mar 01 '17 at 10:44
  • Is the idea to make everything run concurrently, as if you'd add `&` to every call in a Unix shell? In that case you either need primitives to wait for finished threads or processes you called and reap the results, or you need constructs such that this is taken care of by design, e.g. by basing your language on something like the [join calculus](https://en.wikipedia.org/wiki/Join-calculus). – reinierpost Mar 01 '17 at 10:52
  • After some more reading I'm really interested in Actors and how to make a language solely based on creating Actors instead of threads. Maybe more suited to create a new question. – Grimbox Mar 01 '17 at 14:39
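The global-pool scheme sketched in the comments could look roughly like this (a Python sketch under my own assumptions; the `run` helper is a hypothetical name, not part of any existing language):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical runtime: one global pool that always exists; every
# "statement" is queued on it and picked up by an available worker.
POOL = ThreadPoolExecutor(max_workers=4)

def run(stmt, *args):
    """Feed a statement to the pool; returns a future immediately."""
    return POOL.submit(stmt, *args)

a = run(lambda: 2 + 3)   # executes on some worker thread
b = run(lambda: 4 * 5)   # may execute concurrently with `a`

# A statement that needs a return value has to wait for it:
total = a.result() + b.result()
```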

10 Answers

39

every function call/new block (if clauses, loops etc) will work in a separate thread.

Read a lot more about continuations and continuation-passing style (and their relation to threads and coroutines). I suggest reading SICP and Lisp In Small Pieces. Also, Programming Language Pragmatics gives a useful overview of several languages and would help you design your own.

Therefore I'm wondering how such a language could look syntax wise

Syntax does not matter much for exploring ideas. Semantics matters much more. I suggest adopting some S-expression-like syntax at first (so you could prototype using Scheme and its call/cc).

Once your ideas are clearer (after some experimentation), you might discuss them on Lambda the Ultimate. You could then define a sexier syntax, which actually matters if you want people to adopt your language. What also matters is the quality of implementation: a free-software sample implementation, the documentation, the standard library, the foreign function interface, etc.

Basile Starynkevitch
  • +1 for the mention of why syntax is irrelevant. Another +1 if I could for explaining why syntax is highly important. – Jörg W Mittag Feb 28 '17 at 21:11
  • Good suggestion, but to discuss stuff on LtU you'd want an account and it's not trivial to get one from my experience. – Mael Mar 01 '17 at 15:05
  • You should use a thread pool to reduce overhead. https://en.wikipedia.org/wiki/Thread_pool – shawnhcorey Mar 01 '17 at 16:51
37

You may be interested in reading about the research into Data Parallel Haskell. If you search around on YouTube, Simon Peyton Jones has given some interesting talks on the subject as well.

If I recall correctly from his talks, in pure functional programming it's almost trivial to find opportunities to create threads. His main problem in his research is having too many threads that are too short-lived, such that the overhead of creating threads and communicating their results essentially outweighs the benefits of parallelism. For example, it's easy to teach a compiler to spin up 100 threads to compute `sum $ fmap (+1) xs` for a 100-element vector `xs`, but will the overhead make it worthwhile?

The trick then is consolidating threads into beneficial sizes, without imposing a burden on the programmer to denote that manually. It's a difficult problem that needs to be cracked in order to effectively use future PCs with thousands of cores.
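A rough sketch of that consolidation (in Python rather than Haskell; the chunk size here is an arbitrary assumption) might look like this:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_increment_sum(vec, chunk_size=25):
    # Consolidate per-element work into chunks so the per-task
    # overhead is amortized over many elements.
    chunks = [vec[i:i + chunk_size] for i in range(0, len(vec), chunk_size)]
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda chunk: sum(x + 1 for x in chunk), chunks)
    return sum(partials)

result = parallel_increment_sum(list(range(100)))  # 5050
```

Choosing the chunk size well is exactly the hard part: too small and overhead dominates, too large and parallelism is lost.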

Karl Bielefeldt
  • I'd already be happy to work with an old-fashioned PC with only hundreds of cores ... – Hagen von Eitzen Feb 28 '17 at 22:22
  • I'd throw in a note that for heavily data-parallel designs, we've already been doing lots of parallelization very effectively: GPUs are essentially 200–1000-core machines (they've even got their own memory hierarchy). – Delioth Feb 28 '17 at 23:03
  • @HagenvonEitzen: The Azul Vega JCA (Java Compute Appliance) had 16 CPUs with 54 cores each for a total of 864 cores, and it wasn't some research-level or prototype machine, but a real product. It also didn't fill an entire building or even an entire room; it was a desk-side size appliance. And it was available 9 years ago. GPUs were already mentioned; this, however, was a machine with 864 *general purpose* CPU cores. – Jörg W Mittag Feb 28 '17 at 23:59
  • Of course, there's a reason why we _do_ have those massive data-parallel GPUs today, but no longer those massive multi-core CPUs for desktops. The former are commercially viable, the latter aren't, and that's because you couldn't efficiently program them. – MSalters Mar 01 '17 at 10:02
  • @MSalters maybe we just need an infinite number of monkeys? Or, Quantum programs that simply run every possible operation at each step? –  Mar 01 '17 at 19:36
  • @MSalters: Actually, the reason Azul isn't building their own systems anymore is the same reason that there aren't any Lisp or Java or Smalltalk machines anymore: no matter how many PhDs you throw at the problem, Intel just throws a couple billion on top and overtakes you with manufacturing process improvements alone. The Azul Vega-3 CPU had specialized hardware and instructions for garbage collection; however, Intel and AMD added essentially the same functionality (nested page tables) for a totally different reason: running an OS inside an OS, i.e. virtualization. At that point, it didn't make sense to design and manufacture their own CPUs, and they instead ported their technology to run on a bare-metal hypervisor without an OS alongside Windows and Linux. – Jörg W Mittag Mar 02 '17 at 00:25
  • @JörgWMittag but why don't they make machines with 864 low-end Intel cores? – user253751 Mar 08 '17 at 23:51
15

This is exactly what Erlang does. It handles rejoining the threads mostly by using queues. It's a brilliant concept, but a bit difficult to wrap your head around initially if your background is in more procedural-style languages. I highly recommend looking into it.
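A minimal sketch of the idea in Python (Erlang processes are far lighter-weight than OS threads; this only illustrates the mailbox/queue style of rejoining):

```python
import queue
import threading

def actor(mailbox, replies):
    # An "actor": a thread that owns a private mailbox and communicates
    # only by receiving and sending messages.
    while True:
        msg = mailbox.get()       # receive: blocks until a message arrives
        if msg == "stop":
            break
        replies.put(msg * 2)      # rejoin by sending a message back

mailbox, replies = queue.Queue(), queue.Queue()
threading.Thread(target=actor, args=(mailbox, replies), daemon=True).start()

mailbox.put(21)
result = replies.get()            # waits for the actor's reply
mailbox.put("stop")
```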

GenericJam
4

First I would recommend you look at PROMELA, a language used for describing concurrent algorithms so that a model checker can brute-force all possible executions to verify that they are incapable of incorrect behaviour. (Concurrent programming is notoriously hard to get right, which is why such verification techniques are important.) It does not run all constructs in separate threads, but it does have rather odd syntax and semantics, because its focus is the nondeterminism of concurrent programs.

Going more abstract, the π-calculus is a beautiful approach to modelling parallel computation. It is hard to get your head around unless you get the book Communicating and Mobile Systems: The Pi Calculus, by Robin Milner. It helped me think about parallel computation in a broader sense than "multiple threads accessing a shared memory". It is quite interesting how conditional statements, "gotos" and so forth can be built from simpler, naturally parallel primitives.

With regards to syntax... the best way to work this out is to write some sample programs. Write a program to sort an array, or concurrently ping several servers and report which one responds the fastest, or try to solve a maze in parallel, or whatnot. While you are doing this, the things that are missing from your syntax will become apparent, and you can add them. After you have added several things, ask yourself if they have something in common, and if so maybe you can find a simpler approach that can serve multiple purposes.
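For instance, the "ping several servers" exercise might start out like this in conventional Python (a sketch only; the sleeps stand in for network round-trips and the server list is made up):

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def ping(name, latency):
    time.sleep(latency)           # pretend network round-trip
    return name

servers = [("alpha", 0.3), ("beta", 0.1), ("gamma", 0.2)]
with ThreadPoolExecutor() as pool:
    futures = [pool.submit(ping, name, lat) for name, lat in servers]
    fastest = next(as_completed(futures)).result()  # first responder wins
```

Writing even this much forces decisions about how results rejoin the main flow, which is exactly where the gaps in a new syntax show up.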

Artelius
4

Similar projects have been attempted in the past. I suggest reading up on the classics to loot ideas. (All links go to Wikipedia)

  • UNITY This language was/is used to teach parallel programming. I don't think it was ever actually implemented. The syntax is somewhat cryptic, but basically you have a collection of statements that execute in unknown order, repeatedly, until there is nothing more to be done. This is closest to what you ask for.

  • Occam This language was designed for actual use, but it never really caught on. It has a keyword PAR, which means that a list of statements should be executed in parallel.

  • Erlang Another real-world language. This one is used by the telecom company Ericsson and has quite a following. They have put a lot of work into making the parallelism practical and usable.

  • Google Go This is my favourite of the bunch. Conceptually much the same as Erlang, but with better syntax and the weight of Google behind it. What possibly could wrong go?

I would like to close with a warning: parallelism is very hard to get right. Most bugs in modern programs are the result of getting it wrong. Are you sure you want to go there?

Stig Hemmer
3

It is possible, but it would not be useful for 99+% of all thinkable applications. Logic is typically sequence-bound; it is a flow. Step by step you reach a solution to a problem, and the order of the steps matters, mostly because the output of one step is input for the next.

In the few cases you do have a lot of tasks that can be performed independently of one another, it is typically cheap to set them up sequentially before letting them run in parallel.

So, I think your time would be better spent learning how to use the multi-threading features in your favorite programming language.

Martin Maat
  • This is an excellent answer to a different question; OP is not asking about the wisdom of such a scheme but merely whether it can be done and what the "syntax" would look like. – DepressedDaniel Mar 01 '17 at 00:41
  • Those who do not ask for it are in need of my wisdom the most. If someone asks if it is possible to descend a hill in a shopping cart, the best answer would not be "You bet! Make sure to grease up them wheels!". It would be more like "Well, yes, but...". – Martin Maat Mar 01 '17 at 06:24
  • Reminds me of the old EIAO assembly language instruction - *Execute In Any Order*. (And where is that Any Key anyway?) –  Mar 01 '17 at 19:40
  • @MartinMaat nobody is going to kill themselves by writing a programming language for fun – user253751 Mar 08 '17 at 23:52
2

While not a programming language as such, you should take a look at VHDL. It is used to describe digital circuitry, which naturally does everything in parallel unless you specifically tell it to do it serially. It might give you some ideas both how to design your language and what kind of logic it might be suited for.

Toby Speight
OnePie
2

Clojure may be worth a look for some ideas.

http://clojure-doc.org/articles/language/concurrency_and_parallelism.html

Here are some thoughts. If we call a unit of computation that can be performed independently a task:

  1. Tasks are independent, so they can be run concurrently.
  2. Different tasks require different resources and take different times to run.
  3. Therefore, tasks should be scheduled for maximum throughput.
  4. The only program in a position to act as a scheduler is the operating system.

Things like Apple's Grand Central Dispatch are an attempt to provide such a scheduler.

The above means that the responsibility of executing tasks is not necessarily that of the programming language.

A second thought is to reduce the burden of programming parallel systems as much as possible. The only way to do this is to remove any specification of how something is to be done from the program. A program should only specify what should be done and the rest should happen automatically.

The above probably means that dynamic languages and just in time compilation are the way to go.

foobar
2

What you're looking for is called implicit parallelism, and there are languages which have explored this concept, such as Sun/Oracle's Fortress. Among other things, it (potentially) runs loops in parallel.

Unfortunately, it has been discontinued and there's a lot of dead links out there, but you can still find a few PDFs floating around there, if you google hard enough:

https://www.eecis.udel.edu/~cavazos/cisc879-spring2008/papers/fortress.pdf (the language spec)

http://stephane.ducasse.free.fr/Teaching/CoursAnnecy/0506-Master/ForPresentations/Fortress-PLDITutorialSlides9Jun2006.pdf

http://www.oracle.com/technetwork/systems/ts-5206-159453.pdf

http://dl.acm.org/citation.cfm?id=1122972 (paywalled)

Worth noting is that you wouldn't typically want to start an actual thread for every statement/expression, as creating and starting threads tends to be expensive – instead, you'd have a pool of threads to which you post bits of work that need doing. But that's an implementation detail.
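An implicitly parallel loop built on such a pool might be sketched like this (a Python illustration; `parallel_for` is a made-up name, not Fortress syntax):

```python
from concurrent.futures import ThreadPoolExecutor

# One shared pool, rather than a thread per loop iteration.
POOL = ThreadPoolExecutor(max_workers=8)

def parallel_for(body, items):
    # Dispatch every iteration to the pool, then join.
    futures = [POOL.submit(body, x) for x in items]
    return [f.result() for f in futures]  # results keep iteration order

squares = parallel_for(lambda x: x * x, range(5))  # [0, 1, 4, 9, 16]
```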

gustafc
  • +1 for mentioning Fortress... I did like the idea of the language very much. I was really sad when they announced that development was discontinued... – Roland Tepp Mar 06 '17 at 11:35
0

This can be simulated quite easily in C++. Just make certain that "every"* function call is implemented by a std::future. Handling the return value is simply done by calling .get() on the future.

Hence, you might prototype your language by compiling to C++. This also tells us what the syntax would look like: the chief difference is that you separate the call point (where input arguments are provided) from the return point (where the function output is used).

(*) I say "every function" but it's up to you what counts as a function. Is memset an intrinsic or a function? Is assignment of integers, or assignment of user-defined types a function call?
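The analogous separation can be sketched with Python's concurrent.futures in place of std::future (an illustration of the idea, not a C++ compilation scheme):

```python
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor()

def square(x):
    return x * x

# Call point: arguments are bound and the call returns a future at once.
fut = pool.submit(square, 7)

# ...other statements could run here, concurrently with square()...

# Return point: asking for the value (like .get() on a std::future)
# blocks until the computation has finished.
value = fut.result()  # 49
```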

MSalters
  • And, if you defer answering a question long enough, usually the need for an answer goes away. I think this is called "Lazy Existing". –  Mar 01 '17 at 19:42