37

Let's say, I've the below logic. How to write that in Functional Programming?

    public int doSomeCalc(int[] array)
    {
        int answer = 0;
        if(array!=null)
        {
            for(int e: array)
            {
                answer += e;
                if(answer == 10) break;
                if(answer == 150) answer += 100;
            }
        }
        return answer;
    }

The examples in most blogs, articles... I see just explains the simple case of one straight forward math function say 'Sum'. But, I have a logic similar to the above written in Java and would like to migrate that to functional code in Clojure. If we can't do the above in FP, then the kind of promotions for FP doesn't state this explicitly.

I know that the above code is totally imperative. It was not written with the forethought of migrating it to FP in future.

Konrad Rudolph
  • 13,059
  • 4
  • 55
  • 75
Vicky
  • 399
  • 3
  • 6
  • 1
    Note that the combination of `break` and `return answer` can be replaced by a `return` inside the loop. In FP you could implement this early return using continuations, see e.g. https://en.wikipedia.org/wiki/Continuation – Giorgio Jan 03 '18 at 14:00
  • 1
    @Giorgio continuations would be an enormous overkill here. It's a loop anyway, to call its next iteration you do a tail call, so to break you just don't call it anymore and just return the answer. For *nested* loops, or other complicated control flow, that's where you might use continuations instead of heaving to restructure your code to use the above simple technique (which should always be possible, but may lead to overly complex code structure which would more or less *explicate* the continuation; and for more than one exit point you'd certainly need them). – Will Ness Jan 03 '18 at 14:28
  • 8
    In this case: `takeWhile`. – Jonathan Cast Jan 03 '18 at 14:39
  • 1
    @WillNess: I just wanted to mention it because it can be used to leave a complex computation at any point. It is probably not the best solution for the OP's concrete example. – Giorgio Jan 03 '18 at 16:14
  • @Giorgio you're right, it is the most comprehensive one, in general. actually this question is very broad, IYKWIM (i.e. would be closed on SO in a heartbeat). – Will Ness Jan 03 '18 at 16:18

7 Answers7

46

The closest equivalent to looping over an array in most functional languages is a fold function, i.e. a function that calls a user-specified function for each value of the array, passing an accumulated value along the chain. In many functional languages, fold is augmented by a variety of additional functions that provide extra features, including the option to stop early when some condition arises. In lazy languages (e.g. Haskell), stopping early can be achieved simply by not evaluating any further along the list, which will cause additional values to never be generated. Therefore, translating your example to Haskell, I would write it as:

doSomeCalc :: [Int] -> Int
doSomeCalc values = foldr1 combine values
  where combine v1 v2 | v1 == 10  = v1
                      | v1 == 150 = v1 + 100 + v2
                      | otherwise = v1 + v2

Breaking this down line by line in case you're not familiar with Haskell's syntax, this works like this:

doSomeCalc :: [Int] -> Int

Defines the type of the function, accepting a list of ints and returning a single int.

doSomeCalc values = foldr1 combine values

The main body of the function: given argument values, return foldr1 called with arguments combine (which we'll define below) and values. foldr1 is a variant of the fold primitive that starts with the accumulator set to the first value of the list (hence the 1 in the function name), then combines it using the user specified function from left to right (which is usually called a right fold, hence the r in the function name). So foldr1 f [1,2,3] is equivalent to f 1 (f 2 3) (or f(1,f(2,3)) in more conventional C-like syntax).

  where combine v1 v2 | v1 == 10  = v1

Defining the combine local function: it receives two arguments, v1 and v2. When v1 is 10, it just returns v1. In this case, v2 is never evaluated, so the loop stops here.

                      | v1 == 150 = v1 + 100 + v2

Alternatively, when v1 is 150, adds an extra 100 to it, and adds v2.

                      | otherwise = v1 + v2

And, if neither of those conditions is true, just adds v1 to v2.

Now, this solution is somewhat specific to Haskell, because the fact that a right fold terminates if the combining function doesn't evaluate its second argument is caused by Haskell's lazy evaluation strategy. I don't know Clojure, but I believe it uses strict evaluation, so I would expect it to have a fold function in its standard library that includes specific support for early termination. This is often called foldWhile, foldUntil or similar.

A quick look at the Clojure library documentation suggests that it is a little different from most functional languages in naming, and that fold isn't what you're looking for (it's a more advanced mechanism aimed at enabling parallel computation) but reduce is the more direct equivalent. Early termination occurs if the reduced function is called within your combining function. I'm not 100% sure I understand the syntax, but I suspect what you're looking for is something like this:

(reduce 
    (fn [v1 v2]
        (if (= v1 10) 
             (reduced v1)
             (+ v1 v2 (if (= v1 150) 100 0))))
    array)

NB: both translations, Haskell and Clojure, are not quite right for this specific code; but they convey the general gist of it -- see discussion in the comments below for specific problems with these examples.

Jules
  • 17,614
  • 2
  • 33
  • 63
  • 11
    the names `v1 v2` are confusing: `v1` is a "value from array", but `v2` is the accumulated result. and your translation is erroneous, I believe, the OP's loop exits when the *accumulated **(from left)** value* hits 10, not some element in the array. Same with the incrementing by 100. If to use folds here, use the left fold with early exit, some variation on `foldlWhile` [here](https://wiki.haskell.org/Foldl_as_foldr_alternative). – Will Ness Jan 03 '18 at 11:56
  • like e.g. `foldl'Breaking break reducer reduced acc list = foldr cons (\acc -> acc) list acc where cons x r acc = if (break acc x) then (reduced acc x) else (r $! reducer acc x)`, with the appropriate break test function `break`. `$!` makes for the strict evaluation, more efficient here. (but that's Haskell-specific). Clojure's `reduce` is indeed left-to-right, and there **`v1`** is the one referring to the value being accumulated; still there's an error in your Clojure translation as well. -- the first comment referred to the Haskell translation with `foldr1`. – Will Ness Jan 03 '18 at 12:26
  • 2
    it *is* funny how the most wrong answer gets the most upvotes on SE.... it's OK to make mistakes, you're in [the good company :)](http://www.dreamsongs.com/10ideas.html), too. But the knowledge discovery mechanism on SO/SE is definitely broken. – Will Ness Jan 03 '18 at 12:41
  • 1
    The Clojure code is almost correct, but the condition of `(= v1 150)` uses the value before `v2` (aka. `e`) is summed to it. – NikoNyrh Jan 03 '18 at 13:05
  • 1
    `Breaking this down line by line in case you're not familiar with Haskell's syntax` -- You are my hero. Haskell is a mystery to me. – Captain Man Jan 03 '18 at 13:16
  • 15
    @WillNess It’s upvoted because it’s the most immediately understandable translation and explanation. The fact that it’s wrong is a shame but *relatively* unimportant here because the slight errors don’t negate the fact that the answer is otherwise helpful. But of course it should be corrected. – Konrad Rudolph Jan 03 '18 at 14:32
  • @WillNess: I have no knowledge of Haskell, but found this answer enlightening. If it has an error, please use the edit button to make it better. – Mooing Duck Jan 03 '18 at 19:01
  • @MooingDuck I would have to re-write it whole. It can be done, but I feel it'd be too presumptuous of me to do it. There's this famous saying by Isaac Newton I think it was, how he stated some problem and the people felt he explained the solution to them, because of the general feeling of understanding, that he "explained it" to them. But he actually just stated the problem. I don't remember the specifics. – Will Ness Jan 03 '18 at 21:17
  • @WillNess - go ahead and edit it. Your comments are good, but I don't have time to fix the error at the moment. – Jules Jan 03 '18 at 23:27
  • A little tidbit _when_ stopping early in a `fold` actually works without extra work. Your `fold` needs to be lazy, your working function needs to be lazy. Or, better said, you need a working function that is a multiplication on some monoid, it would find an absorbing element of that monoid somewhere in the list, and doing so would yield no further demand on next list elements. I actually wrote a paper about it: http://www.mathematik.uni-marburg.de/~lobachev/papers/lobachev-flops12.pdf – Oleg Lobachev Jan 04 '18 at 01:04
  • @Jules OK I only told half the truth there (but it was implied) - it's too big of a change; too much work. It would be like me writing a whole new answer. then why would I not post it under my own name? :) the small error in the Clojure code is already corrected in an answer down below, too. --- OK, I've edited in a small note at the bottom. --- have you read that link about John McCarthy? it's a nice read. – Will Ness Jan 04 '18 at 07:57
  • @WillNess don't forget that this question is currently on HNQ where many users *can only upvote* (from association bonus) but not downvote. Also, as someone who doesn't write Haskell or Clojure, the detailed explanation in this answer would be convincing enough for me to upvote this, whether the info is correct or wrong (I didn't vote any posts on this particular Q&A though). – Andrew T. Jan 04 '18 at 09:52
  • @AndrewT. yes, as I said re: knowledge discovery mechanisms. this is of course a discussion not suitable here; maybe on meta, if at all (i.e. it touches on the very core of what SE is). Is it, still, on HNQ? It doesn't show for me. --- Also, the explanation is OK, in general; just that the code is not the correct translation of OP's specific code. I didn't downvote BTW. – Will Ness Jan 04 '18 at 10:08
  • so yeah, "the most wrong answer" wasn't quite right. – Will Ness Jan 04 '18 at 10:11
33

You could easily convert it to recursion. And it has nice tail-optimized recursive call.

Pseudocode :

public int doSomeCalc(int[] array)
{
    return doSomeCalcInner(array, 0);
}

public int doSomeCalcInner(int[] array, int answer)
{
    if (array is empty) return answer;

    // not sure how to efficiently implement head/tails array split in clojure
    var head = array[0] // first element of array
    var tail = array[1..] // remainder of array

    answer += head;
    if (answer == 10) return answer;
    if (answer == 150) answer += 100;

    return doSomeCalcInner(tail, answer);
}
Euphoric
  • 36,735
  • 6
  • 78
  • 110
  • 14
    Yep. The functional equivalent to a loop is tail-recursion, and the functional equivalent to a conditional is still a conditional. – Jörg W Mittag Jan 03 '18 at 08:15
  • 4
    @JörgWMittag I'd rather say tail recursion is the functional equivalent to `GOTO`. (Not quite as bad, but still pretty awkward.) The equivalent to a loop, as Jules says, is a suitable fold. – leftaroundabout Jan 03 '18 at 10:28
  • 6
    @leftaroundabout I'd disagree actually. I'd say tail recursion is more constrained than a goto, given the need to jump to itself and only in tail position. It is fundamentally equivalent to a looping construct. I'd say recursion in general is equivalent to `GOTO`. In any case, when you compile tail recursion it mostly just boils down to a `while (true)` loop with the function body in where early return is just a `break` statement. A fold, whilst you are correct about it being a loop, is actually more constrained than a general looping construct; it is more like a for-each loop – J_mie6 Jan 03 '18 at 11:02
  • cont: By that I mean that a fold has a deterministic number of iterations. For any given list, you know exactly how many iterations that "loop" takes. So a fold is like a traditional for loop (in the sense that it has a start, an end and a deterministic step), and a tail recursive function is like a while loop, where it is undecidable now long it takes to compute. In terms of raw power then, the unrestricted recursion is the real `GOTO` – J_mie6 Jan 03 '18 at 11:05
  • 1
    @J_mie6 the reason I consider tail recursion more as a `GOTO` is that you need to do fiddly bookkeeping of what arguments in what state are passed to the recursive call, to ensure it actually behaves as intended. That is not necessary to the same degree in decently written imperative loops (where it's pretty clear what the stateful variables are and how they change in each iteration), nor in naïve recursion (where usually not much is done with the arguments, and instead the result is assembled in a quite intuitive way). ... – leftaroundabout Jan 03 '18 at 12:19
  • 1
    ... As for folds: you're right, a traditional fold (catamorphism) is a very specific kind of loop, but these recursion schemes can be generalised (ana- / apo- / hylomorphisms); collectively these are IMO the proper replacement for imperative loops. – leftaroundabout Jan 03 '18 at 12:19
  • 1
    @J_mie6 The recursive equivalent of your "traditional for loop" also has a start, an end, and a deterministic step, and it's easy to show through induction that it'll stop. – Doval Jan 03 '18 at 17:27
  • @leftaroundabout That raises some questions, such as: Why do you think `GOTO` is bad? (To be clear, I'm not making any claim here on whether it's good or bad. I just want to know your reasoning.) Why do you single out tail recursion, and not include recursion in general, tail calls in general, or function calls in general (or do you include some or all of those, if so, why?)? – 8bittree Jan 03 '18 at 23:08
  • @8bittree I didn't really intend to single out tail recursion. As J_mie6 argued, _general recursion_ might be what should be considered analogous to GOTO. However, and that's where I'm coming from here, well-written _non-tail_- recursive code tends to not be such ”pathologically recursive”. On the contrary, functional programmers like to lay out their code in such a way that the defining equations almost spell out the invariants. But this is not enforced by the recursion mechanism per se (unlike with loops, which do constrain the amount of nonsense the programmer can commit). – leftaroundabout Jan 04 '18 at 00:40
14

I really like Jules' answer, but I wanted to additionally point out something people often miss about lazy functional programming, which is that everything doesn't have to be "inside the loop." For example:

baseSums = scanl (+) 0

offsets = scanl (\offset sum -> if sum == 150 then offset + 100 else offset) 0

zipWithOffsets xs = zipWith (+) xs (offsets xs)

stopAt10 xs = if 10 `elem` xs then 10 else last xs

result = stopAt10 . zipWithOffsets . baseSums

result [1..]         -- 10
result [11..1000000] -- 500000499945

You can see that each part of your logic can be calculated in a separate function then composed together. This allows for smaller functions which are usually much easier to troubleshoot. For your toy example, perhaps this adds more complexity than it removes, but in real world code the split apart functions are often much simpler than the whole.

Karl Bielefeldt
  • 146,727
  • 38
  • 279
  • 479
  • the logic is scattered all over here. this code won't be easy to maintain. `stopAt10` is ***not*** a good consumer. your answer *is* better than the one you cite in that it correctly isolates the basic producer `scanl (+) 0` of values. their consuming should incorporate the control logic directly though, is better implemented with just two `span`s and a `last`, explicitly. that would closely follow the original code structure and logic, too, and be easy to maintain. – Will Ness Jan 03 '18 at 17:27
6

Most list processing examples you will see use functions like map, filter, sum etc. which operate on the list as a whole. But in your case you have a conditional early exit - a rather uncommon pattern which is not supported by the usual list operations. So you need to drop down an abstraction level and use recursion - which is also closer to how the imperative example looks.

This is a rather direct (probably not idiomatic) translation into Clojure:

(defn doSomeCalc 
  ([lst] (doSomeCalc lst 0))
  ([lst sum]
    (if (empty? lst) sum
        (if (= sum 10) sum
            (let [sum (+ sum (first lst))]
                 [sum (if (= sum 150) (+ sum 100) sum)]
               (recur (rest lst) sum))))))) 

Edit: Jules point out that reduce in Clojure do support early exit. Using this is more elegant:

(defn doSomeCalc [lst]  
  (reduce (fn [sum val]
    (if (= sum 10) (reduced sum)
        (let [sum (+ sum val)]
             [sum (if (= sum 150) (+ sum 100) sum)]
           sum))
   lst)))

In any case, you can do anything in functional languages as you can in imperative languages, but you often have to change your mindset somewhat to find an elegant solution. In imperative coding you think of processing a list step by step, while in functional languages you look for an operation to apply to all elements in the list.

JacquesB
  • 57,310
  • 21
  • 127
  • 176
  • see the edit I just added to my answer: Clojure's `reduce` operation supports early exit. – Jules Jan 03 '18 at 08:55
  • @Jules: Cool - that is probably a more idiomatic solution. – JacquesB Jan 03 '18 at 08:57
  • Incorrect - or is `takeWhile` not a 'common operation'? – Jonathan Cast Jan 03 '18 at 14:41
  • @jcast - while `takeWhile` is a common operation, it isn't especially useful in this case, because you need the results of your transformation before you can decide whether to stop. In a lazy language this doesn't matter: you can use `scan` and `takeWhile` on the results of scan (see Karl Bielefeldt's answer, which while it doesn't use `takeWhile` could easily be rewritten to do so), but for a strict language like clojure this would mean processing the entire list and then discarding results afterwards. Generator functions could solve this, however, and I believe clojure supports them. – Jules Jan 04 '18 at 10:12
  • @Jules `take-while` in Clojure produces a lazy sequence (according to the docs). Another way to tackle this would be with [transducers](https://stackoverflow.com/a/45851399/849891) (maybe the best one). – Will Ness Jan 04 '18 at 10:13
5

As pointed out by other answers, Clojure has reduced for stopping reductions early:

(defn some-calc [coll]
  (reduce (fn [answer e]
            (let [answer (+ answer e)]
               (case answer
                 10  (reduced answer)
                 150 (+ answer 100)
                 answer)))
          0 coll))

This is the best solution for your particular situation. You can also get a lot of mileage from combining reduced with transduce, which lets you use transducers from map, filter etc. However it is far from a complete answer to your general question.

Escape continuations are a generalized version of break and return statements. They are directly implemented in some Schemes (call-with-escape-continuation), Common Lisp (block + return, catch + throw) and even C (setjmp + longjmp). More general delimited or undelimited continuations as found in standard Scheme or as continuation monads in Haskell and Scala can also be used as escape continuations.

For example, in Racket you could use let/ec like this:

(define (some-calc ls)
  (let/ec break ; let break be an escape continuation
    (foldl (lambda (answer e)
             (let ([answer (+ answer e)])
               (case answer
                 [(10)  (break answer)] ; return answer immediately
                 [(150) (+ answer 100)]
                 [else  answer])))
           0 ls)))

Many other languages also have escape continuation -like constructs in the form of exception handling. In Haskell you could also use one of the various error monads with foldM. Because they are primarily error handling constructs using exceptions or error monads for early returns is usually culturally unacceptable and possibly quite slow.

You can also drop down from higher-order functions to tail calls.

When using loops, you enter the next iteration automatically when you reach the end of the loop body. You can enter the next iteration early with continue or exit the loop with break (or return). When using tail calls (or Clojure's loop construct which mimics tail recursion), you must always make an explicit call to enter the next iteration. To stop looping you just don't make the recursive call but give the value directly:

(defn some-calc [coll]
  (loop [answer 0, [e es :as coll] coll]
    (if (empty? coll)
      answer
      (let [answer (+ answer e)]
        (case answer
          10 answer
          150 (recur (+ answer 100) es)
          (recur answer es))))))
nilern
  • 59
  • 1
  • 1
    Re using error monads in Haskell, I don't believe there is any real performance penalty here. They tend to get thought of along the lines of exception handling, but they don't work the same way and there isn't any stack walking required, so really shouldn't be a problem if used this way. Also, even if there's a cultural reason not to use something like `MonadError`, the basically-equivalent `Either` has no such bias towards only error handling, so can easily be used as a substitute. – Jules Jan 04 '18 at 10:26
  • @Jules I think returning Left does not prevent the fold from visiting the entire list (or other sequence). Not intimately familiar with Haskell standard library internals though. – nilern Jan 04 '18 at 10:30
2

The intricate part is the loop. Let us start with that. A loop is usually converted to functional style by expressing the iteration with a single function. An iteration is a transformation of the loop variable.

Here is a functional implementation of a general loop :

loop : v -> (v -> v) -> (v -> Bool) -> v
loop init iter cond_to_cont = 
    if cond_to_cont init 
        then loop (iter init) iter cond
        else init

It takes (an initial value of the loop variable, the function that expresses a single iteration [on the loop variable]) (a condition to continue the loop).

Your example uses a loop on an array, which also breaks. This capability in your imperative language is baked into the language itself. In functional programming such capability is usually implemented at the library level. Here is a possible implementation

module Array (foldlc) where

foldlc : v -> (v -> e -> v) -> (v -> Bool) -> Array e -> v
foldlc init iter cond_to_cont arr = 
    loop 
        (init, 0)
        (λ (val, next_pos) -> (iter val (at next_pos arr), next_pos + 1))
        (λ (val, next_pos) -> and (cond_to_cont val) (next_pos < size arr))

In it :

I use a ((val, next_pos)) pair which contains the loop variable visible outside and the position in the array, which this function hides.

The iteration function is slightly more complex than in the general loop, this version makes it possible to use the current element of the array. [It is in curried form.]

Such functions are usually named "fold".

I put an "l" in the name to indicate that the accumulation of the elements of the array is done in a left-associative manner; to mimic the habit of imperative programming languages to iterate an array from low to high index.

I put a "c" in the name to indicate that this version of fold takes a condition that controls if and when the loop to be stopped early.

Of course such utility functions are likely to be readily available in the base library shipped with the functional programming language used. I wrote them here for demonstration.

Now that we have all the tools that are in the language in the imperative case, we can turn to implement the specific functionality of your example.

The variable in your loop is a pair ('answer', a boolean that encodes whether to continue).

iter : (Int, Bool) -> Int -> (Int, Bool)
iter (answer, cont) collection_element = 
  let new_answer = answer + collection_element
  in case new_answer of
    10 -> (new_answer, false)
    150 -> (new_answer + 100, true)
    _ -> (new_answer, true)

Note that i used a new "variable" 'new_answer'. This is because in functional programming i can not change the value of an already initialized "variable". I do not worry about performance, the compiler may get to reuse the memory of 'answer' for 'new_answer' via life-time analysis, if it thinks that is more efficient.

Incorporating this into our loop function developed earlier :

doSomeCalc :: Array Int -> Int
doSomeCalc arr = fst (Array.foldlc (0, true) iter snd arr)

"Array" here is the module name which exports function foldlc is.

"fist", "second" stand for functions that returns the first, second component of its pair parameter

fst : (x, y) -> x
snd : (x, y) -> y

In this case "point-free" style increases the readability of the implementation of doSomeCalc:

doSomeCalc = Array.foldlc (0, true) iter snd >>> fst

(>>>) is function composition : (>>>) : (a -> b) -> (b -> c) -> (a -> c)

It is the same as above, just the "arr" parameter is left out from both sides of the defining equation.

One last thing : checking for case (array == null). In better designed programming languages, but even in badly designed languages with some basic discipline one rather uses an optional type to express non-existence. This does not have much to do with functional programming, which the question is ultimately about, thus i do not deal with it.

libeako
  • 159
  • 3
1

First, rewrite the loop slightly, such that each iteration of the loop either early-exits, or mutates answer exactly once:

    public int doSomeCalc(int[] array)
    {
        int answer = 0;
        if(array!=null)
        {
            for(int e: array)
            {
                if(answer + e == 10) return answer + e;
                else if(answer + e == 150) answer = answer + e + 100;
                else answer = answer + e;
            }
        }
        return answer;
    }

It should be clear that the behavior of this version is exactly the same as before, but now, it's much more straightforward to convert to recursive style. Here's a direct Haskell translation:

doSomeCalc :: [Int] -> Int
doSomeCalc = recurse 0
  where recurse :: Int -> [Int] -> Int
        recurse answer [] = answer
        recurse answer (e:array)
          | answer + e == 10 = answer + e
          | answer + e == 150 = recurse (answer + e + 100) array
          | otherwise = recurse (answer + e) array

Now it's purely functional, but we can improve it from both an efficiency and readability standpoint by using a fold instead of explicit recursion:

import Control.Monad (foldM)

doSomeCalc :: [Int] -> Int
doSomeCalc = either id id . foldM go 0
  where go :: Int -> Int -> Either Int Int
        go answer e
          | answer + e == 10 = Left (answer + e)
          | answer + e == 150 = Right (answer + e + 100)
          | otherwise = Right (answer + e)

In this context, Left early-exits with its value, and Right continues the recursion with its value.


This could now be simplified a bit more, like this:

import Control.Monad (foldM)

doSomeCalc :: [Int] -> Int
doSomeCalc = either id id . foldM go 0
  where go :: Int -> Int -> Either Int Int
        go answer e
          | answer' == 10 = Left 10
          | answer' == 150 = Right 250
          | otherwise = Right answer'
          where answer' = answer + e

This is better as final Haskell code, but it's now a bit less clear how it maps back to the original Java.