40

I have seen that in imperative paradigms

f(x)+f(x)

might not be the same as:

2*f(x)

But in a functional paradigm it should be the same. I have tried to implement both cases in Python and Scheme, but for me they look pretty straightforward the same.

What would be an example that could point out the difference with the given function?

Peter Mortensen
  • 1,050
  • 2
  • 12
  • 14
asgard
  • 667
  • 1
  • 7
  • 9
  • 8
    You can, and often do, write referentially transparent functions in python. The difference is the language doesn't enforce it. – Karl Bielefeldt Aug 24 '14 at 16:22
  • 7
    in C and alike: `f(x++)+f(x++)` might be not the same as `2*f(x++)` (in C it's especially lovely when stuff like that is hidden within macros - did I broke my nose on that? you bet) – gnat Aug 24 '14 at 21:59
  • In my understanding, @gnat's example is why functionally-oriented languages like R employ pass-by-reference and explicitly avoid functions that modify their arguments. In R, at least, it can actually be difficult to skirt these restrictions (at least, in a stable, portable way) without digging into the language's complicated system of environments and namespaces and search paths. – shadowtalker Aug 25 '14 at 01:30
  • 5
    @ssdecontrol: Actually, when you have referential transparency, pass-by-value and pass-by-reference *always* yield the exact same result, so it doesn't matter which one the language uses. Functional languages are frequently specified with something akin to pass-by-value for semantic clarity, but their implementations often use pass-by-reference for performance (or even both, depending on which one is faster for the given context). – Jörg W Mittag Aug 25 '14 at 05:54
  • 5
    @gnat: In particular, `f(x++)+f(x++)` can be absolutely anything, since it's invoking undefined behavior. But that's not really related to referential transparency - which would not help for this call, it's 'undefined' for referentially transparent functions as in `sin(x++)+sin(x++)`, too. Could be 42, could format your hard drive, could have demons flying out of the users nose … – Christopher Creutzig Aug 25 '14 at 06:54
  • i really liked the definition from haskell wiki (https://wiki.haskell.org/Referential_transparency) and I've searched for other sources, which result is here: http://pedrorijo.com/blog/fp-concepts/ – pedrorijo91 Feb 21 '17 at 18:00

4 Answers4

65

Referential transparency, referred to a function, indicates that you can determine the result of applying that function only by looking at the values of its arguments. You can write referentially transparent functions in any programming language, e.g. Python, Scheme, Pascal, C.

On the other hand, in most languages you can also write non referentially transparent functions. For example, this Python function:

counter = 0

def foo(x):
  global counter

  counter += 1
  return x + counter

is not referentially transparent, in fact calling

foo(x) + foo(x)

and

2 * foo(x)

will produce different values, for any argument x. The reason for this is that the function uses and modifies a global variable, therefore the result of each invocation depends on this changing state, and not only on the function's argument.

Haskell, a purely functional language, strictly separates expression evaluation in which pure functions are applied and which is always referentially transparent, from action execution (processing of special values), which is not referentially transparent, i.e. executing the same action can have each time a different result.

So, for any Haskell function

f :: Int -> Int

and any integer x, it is always true that

2 * (f x) == (f x) + (f x)

An example of an action is the result of the library function getLine:

getLine :: IO String

As a result of expression evaluation, this function (actually a constant) first of all produces a pure value of type IO String. Values of this type are values like any other: you can pass them around, put them in data structures, compose them using special functions, and so on. For example you can make a list of actions like so:

[getLine, getLine] :: [IO String]

Actions are special in that you can tell the Haskell runtime to execute them by writing:

main = <some action>

In this case, when your Haskell program is started, the runtime walks through the action bound to main and executes it, possibly producing side-effects. Therefore, action execution is not referentially transparent because executing the same action two times can produce different results depending on what the runtime gets as input.

Thanks to Haskell's type system, an action can never be used in a context where another type is expected, and vice versa. So, if you want to find the length of a string you can use the length function:

length "Hello"

will return 5. But if you want to find the length of a string read from the terminal, you cannot write

length (getLine)

because you get a type error: length expects an input of type list (and a String is, indeed, a list) but getLine is a value of type IO String (an action). In this way, the type system ensures that an action value like getLine (whose execution is carried out outside the core language and which may be non-referentially transparent) cannot be hidden inside a non-action value of type Int.

EDIT

To answer exizt question, here is a small Haskell program that reads a line from the console and prints its length.

main :: IO () -- The main program is an action of type IO ()
main = do
          line <- getLine
          putStrLn (show (length line))

The main action consists of two subactions that are executed sequentially:

  1. getline of type IO String,
  2. the second is constructed by evaluating the function putStrLn of type String -> IO () on its argument.

More precisely, the second action is built by

  1. binding line to the value read by the first action,
  2. evaluating the pure functions length (compute length as an integer) and then show (turn the integer into a string),
  3. building the action by applying function putStrLn to the result of show.

At this point, the second action can be executed. If you have typed "Hello", it will print "5".

Note that if you get a value out of an action using the <- notation, you can only use that value inside another action, e.g. you cannot write:

main = do
          line <- getLine
          show (length line) -- Error:
                             -- Expected type: IO ()
                             --   Actual type: String

because show (length line) has type String whereas the do notation requires that an action (getLine of type IO String) be followed by another action (e.g. putStrLn (show (length line)) of type IO ()).

EDIT 2

Jörg W Mittag's definition of referential transparency is more general than mine (I have upvoted his answer). I have used a restricted definition because the example in the question focuses on the return value of functions and I wanted to illustrate this aspect. However, RT in general refers to the meaning of the whole program, including changes to global state and interactions with the environment (IO) caused by evaluating an expression. So, for a correct, general definition, you should refer to that answer.

Giorgio
  • 19,486
  • 16
  • 84
  • 135
  • 10
    Can the downvoter suggest how I can improve this answer? – Giorgio Aug 24 '14 at 18:03
  • So how would one get the length of a string read from terminal in Haskell? – Stas Bichenko Aug 24 '14 at 19:53
  • 2
    This is extremely pedantic, but for the sake of completeness, it's not Haskell's type system that ensures actions and pure functions don't mix; it's the fact that the language doesn't provide any impure functions that you can call directly. You can actually implement Haskell's `IO` type pretty easily in any language with lambdas and generics, but because anyone can call `println` directly, implementing `IO` doesn't guarantee purity; it'd merely be a convention. – Doval Aug 25 '14 at 14:15
  • I meant that (1) all functions are pure (of course, they are pure because the language does not provide any impure ones, even though as far as I know there are some mechanisms to bypass that), and (2) pure functions and impure actions have different types, so they cannot be mixed. BTW, what do you mean by call **directly**? – Giorgio Aug 25 '14 at 14:21
  • By "directly" I mean without using the `IO` type. – Doval Aug 25 '14 at 14:57
  • I think it would be more instructive to simply use `fmap :: (a -> b) -> IO a -> IO b` to construct the `IO Int` that represents the length of a string read from stdin. The `do` syntactic sugar hides what really happens. – ziggystar Feb 10 '15 at 08:22
  • 6
    Your point about `getLine` not being referentially transparent is incorrect. You are presenting `getLine` as if it evaluates to or reduces to some String, the particular String of which depends on the user's input. This is incorrect. `IO String` does not contain a String any more than `Maybe String` does. `IO String` is a recipe for maybe, possibly obtaining a String and, as an expression, it as pure as any other in Haskell. – LuxuryMode Jun 07 '16 at 13:51
  • @LuxuryMode: `getLine` is a constant that refers to an action of type `IO String`. As such, `getLine` is of course referentially transparent since it always evaluates to the same action. Should I stress this in my answer or how can I explain it better? – Giorgio Jun 07 '16 at 18:35
  • @Giorgio I think you should remove the part that states that "Therefore, this action is not referentially transparent because its result depends on what the user types in each time." Even better would be removing the entire discussion about IO. In my opinion, treating IO as if it is somehow special with regards to referential transparency is a common source of confusion. IO String is as referentially transparent of an expression as any other in Haskell. – LuxuryMode Jun 07 '16 at 20:05
  • @LuxuryMode: But evaluating actions is not referentially transparent: action evaluation can produce different results according to user input. As an expression, `getLine :: IO String` always evaluates to the same action and is therefore referentially transparent. Evaluating the resulting action is not. – Giorgio Jun 08 '16 at 05:04
  • @Giorgio You are confusing execution with evaluation and the effect with the semantics of `IO a` in a program. Referential transparency is a property of expressions in a program. `getLine :: IO String`is referentially transparent because it produces the same IO action every time. An `IO a` can only be executed in one way: by the special value `main :: IO ()` being handed to the runtime system. But the fact of "executing" IO eventually by the runtime system has nothing to with the semantics of my program. In my program, `IO a` is an expression and it is RT. – LuxuryMode Jun 08 '16 at 13:37
  • @LuxuryMode: I am not confusing action execution with expression evaluation, even though I am using the same word. By "evaluating an action" I did not mean the same as "evaluating an expression": expressions and actions are different things. If the term "execution" is more appropriate (and even the official one) for actions, you are right that it should be used. But I am not confusing the two: expression evaluation and action execution are two levels of the language and the first is definitely referentially transparent. I will try to improve my answer as soon as I have some time. – Giorgio Jun 08 '16 at 17:05
  • @Giorgio No. Aside from the escape hatch of `unsafePerformIO` there is no notion of execution in the *language*. – LuxuryMode Jun 08 '16 at 17:51
  • @LuxuryMode: I have tried to improve the explanation in my answer to stress the difference between expression evaluation and action execution. I hope it is less misleading now. – Giorgio Jun 08 '16 at 18:11
  • Let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/40904/discussion-between-giorgio-and-luxurymode). – Giorgio Jun 08 '16 at 18:12
  • @LuxuryMode: Where is the notion of execution defined? – Giorgio Jun 08 '16 at 18:32
28
def f(x): return x()

from random import random
f(random) + f(random) == 2*f(random)
# => False

However, that's not what Referential Transparency means. RT means that you can replace any expression in the program with the result of evaluating that expression (or vice versa) without changing the meaning of the program.

Take, for example, the following program:

def f(): return 2

print(f() + f())
print(2)

This program is referentially transparent. I can replace one or both occurences of f() with 2 and it will still work the same:

def f(): return 2

print(2 + f())
print(2)

or

def f(): return 2

print(f() + 2)
print(2)

or

def f(): return 2

print(2 + 2)
print(f())

will all behave the same.

Well, actually, I cheated. I should be able to replace the call to print with its return value (which is no value at all) without changing the meaning of the program. However, clearly, if I just remove the two print statements, the meaning of the program will change: before, it printed something to the screen, after it doesn't. I/O isn't referentially transparent.

The simple rule of thumb is: if you can replace any expression, sub-expression or subroutine call with the return value of that expression, sub-expression or subroutine call anywhere in the program, without the program changing its meaning, then you have referential transparency. And what this means, practically speaking is that you can't have any I/O, can't have any mutable state, can't have any side-effects. In every expression, the value of the expression must depend solely on the values of the constituent parts of the expression. And in every subroutine call, the return value must depend solely on the arguments.

Vincent Savard
  • 1,906
  • 1
  • 17
  • 16
Jörg W Mittag
  • 101,921
  • 24
  • 218
  • 318
  • 4
    "can't have any mutable state": Well, you can have it if it is hidden and does not influence the observable behaviour of your code. Think e.g. about memoization. – Giorgio Aug 24 '14 at 16:25
  • 4
    @Giorgio: This is perhaps subjective, but I'd argue that cached results are not really "mutable state" if they're hidden and have no observable effects. Immutability is always an abstraction implemented on top of mutable hardware; frequently it's provided by the language (giving the abstraction of "a value" even if the value can move between registers and memory locations during execution, and can vanish once it's known it will never be used again), but it's no less valid when it's provided by a library or whatnot. (Assuming it's implemented correctly, of course.) – ruakh Aug 24 '14 at 23:04
  • 1
    +1 I really like the `print` example. Perhaps one way to see this, is that what's printed on the screen is part of the "return value". If you can replace `print` with its function return value and the equivalent writing on the terminal, the example works. – Pierre Arlaud Aug 25 '14 at 08:33
  • @ruakh: I 100% agree with you if you consider cached results that are used internally by the language runtime and are totally invisible to the programmer. On the other hand, if you implement memoization yourself (e.g. in Python) using some private mutable variable, I am not so sure. But, as you say, it may be subjective. – Giorgio Aug 25 '14 at 09:32
  • @ruakh It's the semantics of the language that matter, not the hardware. The memory backing an immutable value may be mutable, but neither programmer nor the compiler has to worry about the value changing. I think you're playing too fast and loose with the definitions; if you have a mutable value at the language level, you're dealing with mutable state, even if the rest of the program doesn't have to worry about it. – Doval Aug 25 '14 at 16:39
  • Another kind of effect that one might consider in combination with memoization is the delay caused by each call. If you not only look at the return value but also at the duration of each call, then the first call might take much more time than the subsequent calls that use the cache. In certain contexts the delay of each call is considered a relevant side effect and therefore I guess that in such contexts a memoized function would not be considered referentially transparent. – Giorgio Aug 25 '14 at 16:45
  • 1
    @Giorgio Space/time usage can't be considered a side effect for the purposes of referential transparency. That would make `4` and `2 + 2` non-interchangeable since they have different running times, and the whole point of referential transparency is that you can substitute an expression with whatever it evaluates to. The important consideration would be thread safety. – Doval Aug 25 '14 at 17:05
  • @Doval: I recently attended a course on reactive systems and, in that context, the delay of a computation is considered an effect. If you think about it, `4` could be the result of a query to a remote server. Then it makes a difference whether the result comes within 2 seconds or after 30 minutes. – Giorgio Aug 25 '14 at 17:35
  • @Giorgio If that `4` comes from a query to a remote server, then that's a side effect, so it wouldn't be considered equal to computing `2 + 2` locally even if the query to the remote server was somehow faster than local computations. There's way bigger differences between the two than just running times. At any rate, it sounds like in that context you care about more than *just* referential transparency. – Doval Aug 25 '14 at 17:51
  • In this [answer](http://codereview.stackexchange.com/a/80027/37254), did I not maintain "immutability of state" of an object that is referenced by name `listOfSequence`. Does this program still follow functional paradigm? Because am not rewriting in every index that is once written in the list object using `append`. – overexchange Feb 09 '15 at 13:40
  • 1
    @overexchange: Referential Transparency means that you can replace every subexpression with its value without changing the meaning of the program. `listOfSequence.append(n)` returns `None`, so you should be able to replace every call to `listOfSequence.append(n)` with `None` without changing the meaning of your program. Can you do that? If not, then it is not referentially transparent. – Jörg W Mittag Feb 09 '15 at 13:46
  • Can I say that, any program that is referentially transparent, do not require error handling? Because we are suppose to replace every sub-expression with its value. So, Where is the necessity of error handling(compile-time/run-time)? – overexchange Apr 23 '15 at 09:19
1

Parts of this answer are taken directly from an unfinished tutorial on functional programming, hosted on my GitHub account:

A function is said to be referentially transparent if it, given the same input parameters, always produces the same output (return value). If one is looking for a raison d'être for pure functional programming, referential transparency is a good candidate. When reasoning with formulae in algebra, arithmetic, and logic, this property — also called substitutivity of equals for equals — is so fundamentally important that it is usually taken for granted...

Consider a simple example:

x = 42

In a pure functional language, the left-hand and right-hand side of the equals sign are substitutable for each other both ways. That is, unlike in a language like C, the above notation truly asserts an equality. A consequence of this is that we can reason about program code just like mathematical equations.

From Haskell wiki:

Pure computations yield the same value each time they are invoked. This property is called referential transparency and makes possible to conduct equational reasoning on the code...

To contrast this, the type of operation performed by C-like languages is sometimes referred to as a destructive assignment.

The term pure is often used to describe a property of expressions, relevant to this discussion. For a function to be considered pure,

  • it is not allowed to exhibit any side effects, and
  • it must be referentially transparent.

According to the black-box metaphor, found in numerous mathematical textbooks, a function's internals are completely sealed off from the outside world. A side-effect is when a function or expression violates this principle — that is, the procedure is allowed to communicate in some way with other program units (e.g. to share and exchange information).

In summary, referential transparency is a must for functions to behave like true, mathematical functions also in the semantics of programming languages.

gnat
  • 21,442
  • 29
  • 112
  • 288
laserpants
  • 21
  • 3
  • this seems to open with [word-by-word copy taken from here](https://github.com/johanneshilden/code-overload): "A function is said to be referentially transparent if it, given the same input parameters, always produces the same output..." Stack Exchange has [rules for plagiarism](http://meta.stackexchange.com/tags/plagiarism/info), are you aware about these? **"Plagiarism is the soulless act of copying chunks of someone else's work, slapping your name on it and passing yourself of as the original author..."** – gnat Aug 24 '14 at 19:15
  • 3
    I wrote that page. – laserpants Aug 24 '14 at 19:21
  • if this is the case, consider making it look less of a plagiarism - because readers have no way to tell. Do you know how to do this at SE? 1) You refer the originals source, like "As (I have) written `[here](link to source)`..." followed by 2) proper quote formatting (use quote marks, or better yet, `>` symbol for that). It also wouldn't hurt if besides giving general guidance, answer addresses concrete question asked about, in this case about `f(x)+f(x)` / `2*f(x)`, see [Answer] - otherwise it may look like you're simply advertising your page – gnat Aug 24 '14 at 19:24
  • 1
    Theoretically, I understood this answer. But, practically following these rules, I need to return hailstone sequence list in this [program](http://codereview.stackexchange.com/a/80027/37254). How do I do this? – overexchange Feb 09 '15 at 15:27
0

The referential transparency is the property of a function when a function always returns same result for the same input. The referential transparency is a property possessed by a mathematical function.

e.g. f (x) = y;

Here the function,

1]Always accepts some argument. (A function without an argument is never referetially transparent).

2]Always return some value. (Here 'Y').

3]Acts only on it's inputs and not on any outside entity.

4]For a given 'x' it always returns same 'Y'.

Many answers here have explained it well with good examples. So I am not going to cover them. I just want to make the concept more clear with what I have understood.

Referential transparency is a vital thing in functional programming. Application of functional transparency results in a testable cacheable codebase.

A referentially transparent function never accesses or manipulates the global data. Thus it leads to thread safe code that does not need locking or synchronization.

Referential transparency and testable code: (Citing a JavaScript code)

var percentValue = 5;
var calculateTax = (value) => {
    return value/100 * (100 + percentValue);
}

If we consider the above code for testing the result will vary every time the value of global variable "percentValue" changes. Thus the function will not give same result for same input every time. So there is no referential transparency here. This will not lead to testable code.

Now look at this code:

var calculateTax = (value, percentValue) => {
    return value/100 * (100 + percentValue);
}

The above code is referentially transparent and thus can be tested painlessly. Thus referential transparency leads to testable code.

Deepeshkumar
  • 141
  • 4