12

I would like to learn how to build graphs and perform local operations on them in Haskell, but the question is not specific to Haskell, and instead of graphs we may consider doubly linked lists.

Question: What would be an idiomatic or recommended way to implement a doubly linked list (or another doubly linked or circular data structure) and operations on it in a language that mainly supports and advocates for immutable data structures (Haskell, Clojure, etc.)? In particular, how can one use in-place updates, which the language formally forbids?

I can easily imagine that when some local operation is performed on a doubly linked list (an item is inserted, for example), the laziness of the language may mean there is no need to copy the entire list immediately. However, since the list is doubly linked, a modification in one place means that none of the old nodes can be reused in the new version of the list: sooner or later they would all have to be marked, copied, and garbage-collected. These operations are obviously redundant if only the updated copy of the list is ever used, yet they add an "overhead" proportional to the size of the list.
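For example, here is a minimal sketch of what I mean (the types and names are only illustrative):

```haskell
-- Singly linked: a local change at the front shares the whole old tail.
shared :: [Int]
shared = 0 : [1, 2, 3]          -- [1,2,3] is reused, not copied

-- Doubly linked, built purely by "tying the knot":
data DNode a = DNode
  { prevN :: Maybe (DNode a)
  , valN  :: a
  , nextN :: Maybe (DNode a)
  }

fromList :: [a] -> Maybe (DNode a)
fromList []       = Nothing
fromList (x : xs) = Just (go Nothing (x : xs))
  where
    go _ []         = error "unreachable: go only sees non-empty lists"
    go p [y]        = DNode p y Nothing
    go p (y : rest) = let node = DNode p y (Just (go (Just node) rest))
                      in  node

-- Every node is (transitively) reachable from every other node, so
-- replacing a single value forces rebuilding all nodes of the new list;
-- none of the old nodes can be shared.
```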

Does this mean that for such tasks immutable data is simply inappropriate, and functional declarative languages without "native" support for mutable data are not as good as imperative ones? Or, is there some tricky workaround?

P.S. I have found some articles and presentations on this subject on the Internet but had a hard time following them, while I think that the answer to this question should not take more than one paragraph and maybe a diagram... I mean, if there is no "functional" solution to this problem, the answer probably is "use C". If there is one, then how complicated can it be?


Relevant quotations

Purely functional programming languages allow many algorithms to be expressed very concisely, but there are a few algorithms in which in-place updatable state seems to play a crucial role. For these algorithms, purely-functional languages, which lack updatable state, appear to be inherently inefficient ([Ponder, McGeer and Ng, 1988]).

-- John Launchbury and Simon Peyton Jones, *Lazy functional state threads* (1994); also John Launchbury and Simon Peyton Jones, *State in Haskell* (1995). These papers introduce the `ST` monadic type constructor in Haskell.
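For reference, here is a minimal sketch (my own, not from the papers) of what `ST` enables: a computation may mutate local state in place, and `runST` returns its result as an ordinary pure value.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Sums a list with a mutable accumulator; the mutable state cannot
-- escape runST, so sumST is a pure function to the outside world.
sumST :: [Int] -> Int
sumST xs = runST $ do
  acc <- newSTRef 0
  mapM_ (\x -> modifySTRef' acc (+ x)) xs
  readSTRef acc
```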

Alexey
  • 932
  • 7
  • 19
  • 4
    Recommended: [Okasaki](http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504) – Robert Harvey Dec 16 '15 at 17:11
  • 2
    Thanks for the reference. I've found [his thesis](http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf). – Alexey Dec 16 '15 at 17:40
  • This paper looks promising: [*Lazy depth-first search and linear graph algorithms in Haskell*](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.45.3876) (1994), by David King and John Launchbury. – Alexey Dec 30 '15 at 23:20
  • It looks like a similar problem with arrays is addressed by the [*diffarray*](http://hackage.haskell.org/package/diffarray) package that implements the [`DiffArray`](https://en.wikibooks.org/wiki/Haskell/Libraries/Arrays#DiffArray) type. Looking at the [source](http://hackage.haskell.org/package/diffarray-0.1.1/src/Data/Array/Diff.hs) of the *diffarray* package, I see 91 occurrences of `unsafePerformIO`. It looks like the answer to my question is "yes, no, purely functional languages with immutable data are not appropriate for implementing algorithms that normally rely on in-place updates". – Alexey Dec 31 '15 at 14:56
  • My current solution (in Haskell) is to use a dictionary (`Map`, `IntMap`, or `HashMap`) as storage and to make nodes contain the IDs of the linked nodes; a minimal sketch of this approach follows below. ["All problems in computer science can be solved by another level of indirection."](http://www.dmst.aueb.gr/dds/pubs/inbook/beautiful_code/html/Spi07g.html) – Alexey Jan 05 '16 at 20:25
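A minimal sketch of the dictionary-as-storage approach from the last comment (the names, such as `insertAfter`, are illustrative rather than a fixed API):

```haskell
import qualified Data.IntMap.Strict as IM

type NodeId = Int

data Node a = Node
  { nodePrev :: Maybe NodeId
  , nodeVal  :: a
  , nodeNext :: Maybe NodeId
  }

-- The whole doubly linked list lives in one IntMap; links are plain IDs.
data DList a = DList
  { dlNodes :: IM.IntMap (Node a)
  , dlFresh :: NodeId            -- next unused ID
  }

-- Insert a new value after an existing node.  Only the entries along the
-- affected paths of the map are rebuilt; the rest is shared with the old
-- version of the list.
insertAfter :: NodeId -> a -> DList a -> DList a
insertAfter i x dl@(DList m fresh) =
  case IM.lookup i m of
    Nothing -> dl
    Just n  ->
      let newNode = Node (Just i) x (nodeNext n)
          m1      = IM.insert fresh newNode m
          m2      = IM.adjust (\nd -> nd { nodeNext = Just fresh }) i m1
          m3      = case nodeNext n of
                      Nothing -> m2
                      Just j  -> IM.adjust (\nd -> nd { nodePrev = Just fresh }) j m2
      in DList m3 (fresh + 1)
```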

2 Answers

6

There could be other efficient immutable data structures that fit your particular task, but are not as general as a doubly-linked list (which is unfortunately prone to concurrent modification bugs due to its mutability). If you specify your problem more narrowly, such a structure can probably be found.

The general answer for (relatively) economical traversal of immutable structures is lenses. The idea is that you can keep just enough information to reconstruct a modified immutable structure from its unmodified parts and the currently modified piece, and navigate over it to a neighboring node.

Another useful structure is a zipper. (The funny part is that the type of a zipper is a school-math derivative of the type of the structure it navigates.)
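A rough sketch of a list zipper (the names are only illustrative): it is essentially a functional doubly linked list, holding the focused element together with its left context (nearest neighbor first) and its right context.

```haskell
-- Focused element, everything to its left (nearest first), everything to its right.
data Zipper a = Zipper [a] a [a]

toZipper :: [a] -> Maybe (Zipper a)
toZipper []       = Nothing
toZipper (x : xs) = Just (Zipper [] x xs)

left, right :: Zipper a -> Maybe (Zipper a)
left  (Zipper (l : ls) x rs) = Just (Zipper ls l (x : rs))
left  _                      = Nothing
right (Zipper ls x (r : rs)) = Just (Zipper (x : ls) r rs)
right _                      = Nothing

-- Replacing the focused element is O(1) and shares both contexts untouched.
modify :: (a -> a) -> Zipper a -> Zipper a
modify f (Zipper ls x rs) = Zipper ls (f x) rs
```

Moving the focus one step is also O(1), and everything not touched by the move is shared between the old and new zipper.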


9000
  • 24,162
  • 4
  • 51
  • 79
  • 1
    depending on what is needed a zipper might also be useful – jk. Dec 16 '15 at 17:06
  • To specify my problem more narrowly, suppose I want to program a graph rewriting system, for example a lambda calculus evaluator based on graph rewriting. – Alexey Dec 16 '15 at 21:47
  • 1
    @Alexey: Are you familiar with the work of the Clean people on graph rewriting? http://wiki.clean.cs.ru.nl/Functional_Programming_and_Parallel_Graph_Rewriting – Giorgio Dec 16 '15 at 22:46
  • @Giorgio, not yet, thanks for the link. Did they program Clean in Haskell, by any chance? ;) – Alexey Dec 17 '15 at 10:40
  • I have a very clear idea about the graph rewriting system I want to program, and I do not think I would have major difficulty programming it in C or another imperative language with mutable data, but I do not yet see how to do it in a functional language. I will try to learn about lenses. – Alexey Dec 17 '15 at 10:45
  • 1
    @Alexey: Not that I know: Clean is a cousin of Haskell that was developed on its own. It also has a different mechanism for dealing with side-effects (AFAIK it is called unique types). On the other hand, the developers have worked a lot with graph rewriting. So they could be among the best people who know both about graph rewriting and functional programming. – Giorgio Dec 17 '15 at 10:51
  • BTW, what kind of graph rewriting would you like to program? DPO, SPO, ...? – Giorgio Dec 17 '15 at 10:56
  • @Giorgio, I do not understand the difference very well, but I think that in my case it can be viewed as both. In fact, my graphs are not really graphs, but more like interaction nets. – Alexey Dec 18 '15 at 21:10
  • @9000, I have tried to learn about lenses, but I do not see how they answer my question. They seem to be functional "getters" and "setters", where the "setters" make copies. For example, if you build a 1000-element list of integers, then modify its last element (through a lens or not -- I do not see why it should make a difference), and then calculate and print out the sum of the elements of the new list, wouldn't the whole list be copied first? – Alexey Dec 18 '15 at 21:19
  • 1
    I agree that a zipper seems to solve the problem with a doubly linked list or a tree if I want to navigate it and modify it at the spot I am currently at, but it is not clear what to do if I want to focus on several spots simultaneously and, for example, swap two elements in two places far apart. It is even less clear whether it can be used with "circular" structures.
2

Haskell does not prevent the use of mutable data structures. They are heavily discouraged and made harder to use, because the parts of code that use them must eventually return an IO action (which must eventually be bound into the IO action that is returned by the main function), but that does not make it impossible to use such structures if you really need them.

I would suggest investigating the use of software transactional memory as a way forward. As well as providing an efficient way to implement mutable structures, it also provides very useful guarantees for thread-safety. See the module description at https://hackage.haskell.org/package/stm and the wiki overview at https://wiki.haskell.org/Software_transactional_memory .
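For example, a mutable doubly linked node could be built from `TVar`s along these lines (a sketch with made-up names, not a complete library):

```haskell
import Control.Concurrent.STM

-- Each link is a TVar, so neighbouring nodes can be relinked in place
-- inside a single atomic transaction.
data Node a = Node
  { prevVar :: TVar (Maybe (Node a))
  , value   :: a
  , nextVar :: TVar (Maybe (Node a))
  }

newNode :: a -> STM (Node a)
newNode x = Node <$> newTVar Nothing <*> pure x <*> newTVar Nothing

-- Splice a fresh node in directly after an existing one.
insertAfter :: Node a -> a -> STM (Node a)
insertAfter n x = do
  new  <- newNode x
  next <- readTVar (nextVar n)
  writeTVar (nextVar n) (Just new)
  writeTVar (prevVar new) (Just n)
  writeTVar (nextVar new) next
  case next of
    Nothing -> pure ()
    Just m  -> writeTVar (prevVar m) (Just new)
  pure new
```

Wrapping a call like `insertAfter` in `atomically` turns the whole relinking step into one transaction, which is where the thread-safety guarantees come from.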

Jules
  • 17,614
  • 2
  • 33
  • 63
  • Thanks, I will try to learn about STM. It looks like there are several ways in Haskell to have mutability and state (I've stumbled upon `MVar`, `State`, `ST`), so I'll need to figure out their differences and intended uses. – Alexey Dec 19 '15 at 13:36
  • @Alexey: Good point regarding `ST`, IMO it should be mentioned in the answer because it allows one to run a stateful computation, then throw away the state and extract the result as a pure value. – Giorgio Dec 19 '15 at 13:38
  • @Giorgio, is it possible to use Haskell's `ST` with STM to have both concurrency and disposable state? – Alexey Dec 24 '15 at 11:17
  • Just one more terminology suggestion: the composed main IO action is not "*returned* by the main function" but is assigned to the `main` variable. :) (`main` does not even hold a function.) – Alexey Dec 24 '15 at 21:31
  • I see your point, but still "variable" has a connotation in most people's minds as a simple value, rather than a process that produces a value, and main is clearly better thought of as the latter rather than the former. The change you suggest, while clearly technically correct, has the potential to confuse those unfamiliar with the subject. – Jules Dec 24 '15 at 22:20