36

Encapsulation

In object-oriented programming (OOP), encapsulation refers to the bundling of data with the methods that operate on that data, or the restricting of direct access to some of an object's components. Encapsulation is used to hide the values or state of a structured data object inside a class, preventing unauthorized parties' direct access to them. (Wikipedia - Encapsulation (Computer Programming))

Immutability

In object-oriented and functional programming, an immutable object (unchangeable object) is an object whose state cannot be modified after it is created. (Wikipedia - Immutable object)

If you can guarantee immutability, do you need to think about encapsulation?

I have seen these concepts being used in explaining ideas in object-oriented programming (OOP) and functional programming (FP).

I tried to research encapsulation, immutability, and their relation to one another. I couldn't find a post that explicitly asked whether encapsulation is guaranteed if you have immutability.

Please correct me if I have misunderstood anything about encapsulation or immutability. I wish to understand these concepts better. Also, please direct me to any other posts on the topic that answer the question above.

Peter Mortensen
  • 12
    Does this answer your question? [Is guaranteeing immutability a justification for exposing a field instead of a property?](https://softwareengineering.stackexchange.com/questions/288797/is-guaranteeing-immutability-a-justification-for-exposing-a-field-instead-of-a-p) – gnat Jan 28 '20 at 10:07
  • see also: [Complete immutability and Object Oriented Programming](https://softwareengineering.stackexchange.com/a/232714/31260) – gnat Jan 28 '20 at 10:09
  • @gnat, Thank you for the links. It gives the general idea of immutability and encapsulation, however, I wanted a more straightforward question to the relation between immutability and encapsulation. It seems most tackle these concepts with the problem they are attempting to solve. Which is great, but hard to understand the concepts themselves sometimes. – Christopher Trotter Jan 28 '20 at 18:27

7 Answers

60

I hate how encapsulation is always framed as preventing unauthorized access. If this were the best way to think of it, immutability would indeed eliminate most of the need for encapsulation. In fact, immutability does eliminate many cases of overzealous encapsulation, where the only purpose of the encapsulation was to keep the bumbling callers out.

Encapsulation is better thought of as providing good customer service. You're adding an interface that is easier to use and at a more appropriate level of abstraction. It's like the utilidors, a system of tunnels under Disney parks. Their purpose isn't to provide some privileged space where guests aren't allowed. It's to avoid spoiling the guests' experience and immersion. Immutability makes no difference either way to the "visitor experience" kind of encapsulation.

Damian Yerrick
Karl Bielefeldt
  • 1
    Although the accepted answer kind of touched this, I think this is much clearer and to the point and is by far the most important issue. Might add an example though. Wish I could +5 – Bill K Jan 30 '20 at 21:13
  • This answer implies that encapsulation is always a benefit to the _user_ of the module; but sometimes it exists for the benefit of the _author_ of the module. For instance, in the calendar example in [Flater's answer](https://softwareengineering.stackexchange.com/a/404373/96713), "good customer service" would be to _expose_ the raw string, because that's what the "customer" wants; but doing so restricts what the _author_ of the module can do with it. In your analogy, the guests might benefit from access to those tunnels, but that would hamper the staff's use, so they're kept out. – IMSoP Jan 31 '20 at 13:15
53

The question

Casting your question to real life:

Is it okay for your doctor to post your private medical records publicly to Facebook, provided no one (other than you) is able to change it?

Is it okay for me to let strangers in your house, provided they can't steal or damage anything?

It's asking the same thing. The core assumption of your question is that the only concern with exposing data is that it can be changed. As per your reference material:

Encapsulation is used to hide the values or state of a structured data object inside a class, preventing unauthorized parties' direct access to them.

The ability to change values or state is definitely the biggest concern, but it's not the only concern. "Direct access" entails more than just write access. Read access can be a source of weakness as well.

A simple example here is that you are generally advised to not show stacktraces to an end user. Not just because errors shouldn't occur, but because stacktraces sometimes reveal specific implementations or libraries, which leads to an attacker knowing about the internal structure of your system.

The exception stacktrace is readonly, but it can be of use to those who wish to attack your system.

Edit
Due to confusion mentioned in the comments, the examples given here are not intended to suggest that encapsulation is used for data protection (such as medical records).
This part of the answer so far has only addressed the core assertion that your question is built upon, i.e. that read access without write access is not harmful; which I believe to be incorrect, hence the simplified counterexamples.


Encapsulation as a safety guard

Additionally, in order to prevent write access, you would need to have immutability all the way to the bottom. Take this example:

public class Level1
{
    public string MyValue { get; set; }
}

public class Level2 // immutable
{
    public readonly Level1 _level1;

    public Level2(Level1 level1) { _level1 = level1; }
}

public class Level3 // immutable
{
    public readonly Level2 _level2;

    public Level3(Level2 level2) { _level2 = level2; }
}

We've let Level2 and Level3 expose their readonly fields, which is doing what your question is asserting to be safe: read access, no write access.

And yet, as a consumer of a Level3 object, I can do this:

// fetch the object - this is allowed behavior
var myLevel3 = ...; 

// but this wasn't the intention!
myLevel3._level2._level1.MyValue = "SECRET HACK ATTACK!";

This code compiles and runs perfectly fine, because read access on a field (e.g. myLevel3._level2) gives you access to an object (Level2), which in turn exposes read access to another object (Level1), which in turn exposes read and write access to its MyValue property.

And this is the danger of brazenly making everything immutably public. Any mistake will be visible and become an open door for unwanted behavior. By needlessly exposing some things that could easily have been hidden, you have opened them up to scrutiny and abuse of weakness if any exists.

Edit
Caleth mentioned that a class is not immutable if it exposes something that itself is not immutable. I think that this is a semantic argument. Level2's properties are readonly, which ostensibly makes it immutable.

To be fair, if the law of Demeter had been followed in my example, the issue wouldn't have been as glaring since Level2 wouldn't expose direct access to Level1 (but that precludes the issue I was trying to highlight); but the point of the matter is that it's a fool's errand to try and ensure the immutability of an entire codebase. If someone makes one adjustment in a single class (that a lot of other classes depend on in some way), that could lead to an entire assembly worth of classes becoming mutable without anyone noticing it.

This issue can be attributed to a lack of encapsulation or to not following the law of Demeter. Both contribute to it. But regardless of what you attribute it to, the fact remains that this is unmistakably a problem in the codebase.
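One way to close the hole shown above, as a minimal sketch: make every level immutable, including the innermost one. The class and field names reuse the example; the get-only property and the constructor on Level1 are assumptions of this sketch, not part of the original code.

```csharp
using System;

public sealed class Level1
{
    // Get-only property: assigned once in the constructor, never again.
    public string MyValue { get; }

    public Level1(string myValue) { MyValue = myValue; }
}

public sealed class Level2
{
    public readonly Level1 _level1;

    public Level2(Level1 level1) { _level1 = level1; }
}

public sealed class Level3
{
    public readonly Level2 _level2;

    public Level3(Level2 level2) { _level2 = level2; }
}

public static class Program
{
    public static void Main()
    {
        var myLevel3 = new Level3(new Level2(new Level1("original")));

        // Reading through the whole chain is still allowed.
        Console.WriteLine(myLevel3._level2._level1.MyValue);

        // Writing is now rejected by the compiler, because MyValue has no setter:
        // myLevel3._level2._level1.MyValue = "SECRET HACK ATTACK!";
    }
}
```

The guarantee holds only as long as every type in the chain stays immutable. Adding a setter back to Level1 silently reopens the door for every consumer of Level2 and Level3, which is the fragility described above.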


Encapsulation for clean code

But that's not all you use encapsulation for.

Suppose my application wants to know the time, so I make a Calendar which tells me the date. Currently, I read this date as a string from a file (let's assume there is a good reason for this).

public class Calendar
{
    public readonly string fileContent; // e.g. "2020-01-28"

    public DateTime Date => DateTime.Parse(fileContent);

    public Calendar()
    {
        fileContent = File.ReadAllText("C:\\Temp\\calendar.txt");
    }
}

fileContent should have been an encapsulated field, but I've opened it up because of your suggestion. Let's see where that takes us.

Our developers have been using this calendar. Let's look at Bob's library and John's library:

public class BobsLibrary
{
    // ...

    public void WriteToFile(string content)
    {
        var filename = _calendar.fileContent + ".txt"; // timestamp in filename
        var filePath = $"C:\\Temp\\{filename}";

        File.WriteAllText(filePath, content);
    }
}

Bob has used Calendar.fileContent, the field that should've been encapsulated, but wasn't. But his code works and the field was public after all, so there's no issue right now.

public class JohnsLibrary
{
    // ...

    public void WriteToFile(string content)
    {
        var filename = _calendar.Date.ToString("yyyy-MM-dd") + ".txt"; // timestamp in filename
        var filePath = $"C:\\Temp\\{filename}";

        File.WriteAllText(filePath, content);
    }
}

John has used Calendar.Date, the property that should always be exposed. At first glance, you'd think John is doing unnecessary work by converting the string to a DateTime and back to a string. But his code does work, so no issue is raised.

Today, we have learned something that will save us a lot of money: you can get the current date from the internet! We no longer have to hire an intern to update our calendar file every midnight. Let's change our Calendar class accordingly:

public class Calendar
{
    public DateTime Date { get; }

    public Calendar()
    {
        Date = GetDateFromTheInternet("http://www.whatistodaysdate.com");
    }
}

Bob's code has broken! He no longer has access to the fileContent, since we're no longer parsing our date from a string.

John's code, however, has kept working and does not need to be updated. John used Date, the intended public contract for the calendar. John did not build his code to rely on implementation details (i.e. the fileContent from which we parsed the date in the past), and therefore his code can effortlessly handle changes to the implementation.

This is why encapsulation matters. It allows you to disconnect your consumers (Bob, John), from your implementation (the calendar file) by having an intermediary interface (the DateTime Date). As long as the intermediary interface is untouched, you can change the implementation without affecting the consumers.

My example is a bit simplified; you'd more likely use an interface here and swap out the concrete class that implements the interface for another class that implements the same interface. But the issue I pointed out remains the same.
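The interface-based variant mentioned above could look roughly like this. It is a sketch under assumptions: ICalendar, StringCalendar, FixedCalendar and FileNames are hypothetical names that do not appear in the original example, and the string-backed class stands in for the file-based calendar so that the code runs without a file or a network call.

```csharp
using System;
using System.Globalization;

// The public contract: consumers depend on this and nothing else.
public interface ICalendar
{
    DateTime Date { get; }
}

// Hypothetical stand-in for the file-based calendar: parses a date string.
public sealed class StringCalendar : ICalendar
{
    private readonly string _content; // encapsulated, invisible to consumers

    public StringCalendar(string content) { _content = content; }

    public DateTime Date =>
        DateTime.ParseExact(_content, "yyyy-MM-dd", CultureInfo.InvariantCulture);
}

// Hypothetical replacement implementation that holds a DateTime directly.
public sealed class FixedCalendar : ICalendar
{
    public DateTime Date { get; }

    public FixedCalendar(DateTime date) { Date = date; }
}

public static class FileNames
{
    // Written like John's code: relies only on the Date contract.
    public static string For(ICalendar calendar) =>
        calendar.Date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture) + ".txt";
}

public static class Program
{
    public static void Main()
    {
        ICalendar before = new StringCalendar("2020-01-28");
        ICalendar after = new FixedCalendar(new DateTime(2020, 1, 28));

        Console.WriteLine(FileNames.For(before)); // 2020-01-28.txt
        Console.WriteLine(FileNames.For(after));  // 2020-01-28.txt
    }
}
```

Because FileNames.For only sees ICalendar, swapping one implementation for the other requires no change on the consumer side.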

Flater
  • 1
    Thank you for taking the time answering my question thoroughly. It really helped with the example you provided. What a really great example! :D – Christopher Trotter Jan 28 '20 at 11:14
  • 8
    calling `Level2` immutable is a lie, because it's exposing a mutable `Level1` – Caleth Jan 28 '20 at 11:50
  • @Caleth: To each class their own responsibility, and `Level2`'s properties and fields _are_ immutable. To be fair, if the law of Demeter had been followed the issue wouldn't have been as glaring since `Level2` wouldn't expose direct access to `Level1` (but that precludes the issue I was trying to highlight), but the point of the matter is that it's a fool's errand to try and ensure the immutability of an entire codebase. If someone makes one adjustment in a single class, that could lead to an entire assembly worth of classes becoming mutable without anyone noticing it. – Flater Jan 28 '20 at 11:55
  • 59
    Encapsulation on the source-code level has nothing to do with data security or privacy. – Christian Hackl Jan 28 '20 at 12:28
  • @ChristianHackl: I didn't claim that it is a privacy or data security measure (because of reflection, for one), I said it was a safety guard against unwanted behavior. – Flater Jan 28 '20 at 14:15
  • @ChristianHackl If you are running untrusted code within an application, say in an applet or PAAS, for example, encapsulation can be used for security. In practice, however, I would mostly agree due to the level of difficulty of being successful in this approach. – JimmyJames Jan 28 '20 at 18:29
  • @Caleth It sure is immutable. But not transitively immutable, if that matters. And it might. – Deduplicator Jan 28 '20 at 22:59
  • 11
    "preventing unauthorized parties' direct access to them" is not the reason why you want to have encapsulation. – Lie Ryan Jan 29 '20 at 06:48
  • 4
    `Level2` is a good example of how `readonly` *isn't* a synonym for immutable – Caleth Jan 29 '20 at 08:28
  • Coming from google, I did not find `http://www.whatistodaysdate.com`, and that made me sad. Still, +1, best answer this question could have. – bracco23 Jan 29 '20 at 09:43
  • "Level2's properties are readonly, which ostensibly makes it immutable.". Goodness no. That's a dangerous thing to conflate (this example actually already shows why it's not true). C# is an unfortunate choice of language since it makes it very hard to create immutable objects. I'd recommend looking at for example D to see how a language can actually guarantee immutability. Had to downvote for this even if I agree with the rest (encapsulation to hide implementation details is something immutability simply can't offer) – Voo Jan 29 '20 at 09:49
  • In FP, your Level2 example is a non-starter when everything is immutable by default. Now, if you’re managing hidden state behind the scenes, well, ostensibly that's private. – D. Ben Knoble Jan 29 '20 at 13:19
  • 19
    I think the whole first part of your answer is a red herring. We are talking about Encapsulation in Source Code - if someone is compiling your source code, nothing is private for him. He could just fork your project and change visibility, or use reflection, or something else. - Your second part about api-contract makes a lot more sense – Falco Jan 29 '20 at 13:41
  • 1
    @Falco: Taking the example I used of Bob and John, these could be coworkers who both have access to the same source files, and both have started working in the codebase when the calendar already existed. Keeping the implementation details private, even if John or Bob could technically change the access modifier on `fileContent`, still has a purpose as it **documents** how the calendar _should_ be used. – Flater Jan 29 '20 at 17:20
  • 3
    @Flater yes. That is what the second part of the answer talks about and I said I agree with it. But the first part about data privacy/security is a false trail and will likely inspire wrong implications. Just taking your first example: making a field private is like your doctor prefixing the public post with "please don't read this" - the examples are bad! – Falco Jan 29 '20 at 17:45
  • 1
    @Falco "Safety guard for unwanted behavior" is not synonymous with "data privacy/security". Those are two very different things that should not be conflated (e.g. I wear a motorcycle helmet to prevent unwanted injuries - not to remain unidentifiable). Nothing is private to any consumer of either the source code _or even the assembly_, since you can always access things using e.g. reflection. – Flater Jan 29 '20 at 17:48
  • 7
    @Flater exactly! We're completely on the same page. Now explain to me how the doctor example provides any indication to understanding the point you just made. I think the doctor example and the whole first part of this answer will steer people to misunderstand this important point. – Falco Jan 29 '20 at 18:02
  • 1
    @Falco The posted question is rooted in the assumption that read access without write access (even if you presume it's implemented flawlessly) is harmless, which it is not. Hence my counterexamples to prove the point. Yes, that's not a technical argument, but it gets at the heart of the question by countering its core assertion. – Flater Jan 29 '20 at 22:41
  • 3
    We are saying that the counterexamples do *not* prove the point, and they do *not* get at the heart of the question. A compiler or interpreter processing source code according to predefined language rules which may cause an error message to be emitted or a build to fail is nothing like strangers getting into your house or a doctor posting medical records to Facebook. – Christian Hackl Jan 30 '20 at 06:21
  • @ChristianHackl: You're not seeing the wood for the trees here. I am not stating that encapsulation is done because of data privacy. I am countering the assertion that read access is harmless without write access using counterexamples. Furthermore, encapsulation is not about "a compiler or interpreter processing source code according to predefined language rules", encapsulation is about **sensibly managing access to certain data**. Not as a measure of security or data privacy, but as a measure of clean coding practice by ensuring that you only expose a limited relevant contract to consumers. – Flater Jan 30 '20 at 07:45
  • 5
    @Flater: You keep saying in your comments that encapsulation has nothing to do with privacy or security, yet you (inexplicably for me) still insist on your doctor example. – Christian Hackl Jan 30 '20 at 16:48
  • @ChristianHackl: It is an example of read access without write access that shows that read access without write access can still be harmful and undesirable. – Flater Feb 01 '20 at 01:24
33

Encapsulation could mean that you hide the actual storage of immutable data.

E.g.:

class Color
{
  private readonly uint argb;
  public byte Blue => (byte)(argb & 0xFF);

  public Color(byte red, byte green, byte blue, byte alpha)
  {
    argb = ((uint)alpha << 24) | ((uint)red << 16) | ((uint)green << 8) | blue;
  }
}

The interface (the constructor and byte Blue) hides the fact that the actual blue data is stored in the last byte of a uint. This uint could be changed later on to rgba or an array of bytes WITHOUT breaking the interface:

class Color
{
  private readonly byte[] argb;
  public byte Blue => (byte)argb[3];

  public Color(byte red, byte green, byte blue, byte alpha)
  {
    argb = new []{alpha, red, green, blue};
  }
}

In this case properly encapsulating the (immutable) data allows changes to the implementation without breaking the public interface.

So, yes it might be worth it to think about encapsulation, even when data is immutable.

Deduplicator
Emond
8

If you can guarantee Immutability, do you need to think about Encapsulation?

Perhaps, but probably not in a way most people think.


First, realize that encapsulation (the practice of hiding the internal structure of data) is not a goal. It is a means to decoupling (which in turn is a means to abstraction).

Next, you can achieve decoupling in ways that don't involve declaring private members on your types. The most common example of this in functional programming is closures. Consider the following code in C#:

var lowest = int.Parse(someString);
var filtered = collection.Where<Elem>(x => x >= lowest);

The function Where filters results based on a predicate that in turn depends on a value that Where knows nothing about. As long as the closure exposes the interface of accepting an object of type Elem and returning a bool, everything works.

It could be the case that the type Elem has all of its members fully exposed publicly, and yet Where is fully capable of utilizing that type as part of a filter and yet remaining fully decoupled from that type. The decoupling is so complete that both Where and Elem could be defined in entirely separate assemblies that know nothing about each other, and the snippet above could be in a third assembly referencing both, and nothing changes about whether or how it works.

Thus, hiding the structure of your data is not a necessary precondition for achieving decoupling.
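As a self-contained sketch of that point: the Elem type below is hypothetical and exposes its only field publicly, and the lambda compares x.Value rather than x itself, which is an assumption of this sketch and differs slightly from the snippet above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical element type with everything public: no encapsulation at all.
public class Elem
{
    public int Value;

    public Elem(int value) { Value = value; }
}

public static class Program
{
    public static void Main()
    {
        var collection = new List<Elem> { new Elem(1), new Elem(5), new Elem(9) };
        var lowest = int.Parse("5");

        // Where only receives a Func<Elem, bool>; it never reads Elem.Value itself.
        var filtered = collection.Where(x => x.Value >= lowest);

        Console.WriteLine(string.Join(", ", filtered.Select(x => x.Value))); // 5, 9
    }
}
```

Only the lambda knows about Value and lowest. Where depends on nothing except the shape of the predicate.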

  • 1
    I disagree, encapsulation should be a goal, and in your example Where may be allowed blissful ignorance, but the developer isn’t so lucky. The developer needs to know that Elem has either an implicit or explicit comparison/conversion operator for int and what value that represents. That is unavoidable in this case, but you shouldn’t require a developer to learn any more about a class than is absolutely required to use it. In particular it’s an absolute waste to require that they learn “don't use X”. – jmoreno Jan 31 '20 at 00:45
  • I don't see how the `Where` in this example is decoupled from the type; there is a contract that `Elem` must meet for it to continue working. It only appears to be decoupled from the implementation because the assumption it makes about `Elem` is so simple (that it implements `>=`); but it's conceptually no different from the assumption that it's safe and appropriate to call `x.value.toInt()`. Complete decoupling is impossible: you need to make _some_ assumption about how two modules will interact; but encapsulation allows you to _limit_ that coupling to an agreed contract. – IMSoP Jan 31 '20 at 13:09
  • @IMSoP `Where` is decoupled from `Elem`. The lambda passed to `Where` obviously isn't, as you said, but I wasn't claiming that. – Theodoros Chatzigiannakis Jan 31 '20 at 17:21
  • OK, but it _is_ coupled to a particular contract for the closure, and the implementation of that closure hides (i.e. encapsulates) details that are irrelevant to that contract. Using a closure is just another form of encapsulation, not an alternative to it. Indeed, a closure can be modelled as an object with a single method, and a private member for each captured value; conversely, an object can be modelled as a closure whose arguments are a method name and its parameters. – IMSoP Jan 31 '20 at 17:43
  • @IMSoP Yeah, but at the point where we're talking about a function, I think it doesn't make sense to talk about encapsulation versus not. Very rarely do you expect a function from A to B to expose the internal structure of how it goes from A to B (the exception being representations like expression trees). But your mileage may vary. – Theodoros Chatzigiannakis Jan 31 '20 at 18:44
  • If the question was "do you need to think about encapsulation in a functional programming language", that might be a reasonable response; but given a completely free choice, _the fact that you've encapsulated something in a function in the first place_ is already a decision, showing that you value encapsulation. In some languages, using local rather than global variables within that function is also a decision; again, encapsulation is an explicit reason to choose local. – IMSoP Jan 31 '20 at 23:54
  • I think a function (any function, not just a closure) meets definition (b) as well: it is decoupled from its caller only to the extent that it hides its internal structure. For instance, a global variable in a function is like a public property on an object: it can be seen and manipulated from outside that code unit, breaking encapsulation. Local variables and private properties are doing the same job in different contexts. – IMSoP Feb 02 '20 at 11:35
3

The short answer is: the two properties are unrelated, and taking care of one does not ensure that the other is taken care of. Encapsulation is a property of the code you write, so it is implemented in the coding phase. Immutability is a property of runtime objects. It means ensuring that the state of the objects in memory is not modified after they are instantiated.

So, to recap, encapsulation is meant to decouple different parts of the code, in such a way that if the developer changes their mind and modifies the names of some variables, the rest of the code is not broken. The most obvious use case is API development: nothing that is changed within the API should break its dependents.

Immutability, by contrast, is only about the values of the properties set at runtime, when an object instance is created. The decoupling is the other way around: a library creating an instance should be able to pass it to several consumers without worrying that one of them might change the state of the instance.
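A minimal sketch of that runtime guarantee, assuming a hypothetical Settings class (the name, its properties and the WithRetries helper are illustrative and not from the answer):

```csharp
using System;

// Hypothetical immutable object created once and handed to several consumers.
public sealed class Settings
{
    public string Name { get; }
    public int Retries { get; }

    public Settings(string name, int retries)
    {
        Name = name;
        Retries = retries;
    }

    // "Changing" a value returns a new instance; the original is untouched.
    public Settings WithRetries(int retries) => new Settings(Name, retries);
}

public static class Program
{
    public static void Main()
    {
        var shared = new Settings("default", 3);

        // One consumer derives its own copy...
        var tweaked = shared.WithRetries(10);

        // ...and every other consumer still observes the original state.
        Console.WriteLine(shared.Retries);  // 3
        Console.WriteLine(tweaked.Retries); // 10
    }
}
```

Note that Name and Retries are fully public here. The object is immutable without hiding anything, which illustrates that the two properties are independent.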

FluidCode
  • Thanks for the summary. The misconception of Immutability and Encapsulation became clear when Flater provided an example which highlighted the differences. The feedback from others also helped me understand the different use cases and perspectives. Often when reading about Encapsulation, the topic of Immutability was discussed, which might be the reason for my misconception. Thank you for taking the time to answer and summarize. – Christopher Trotter Jan 29 '20 at 08:05
2

I mainly agree with Karl Bielefeldt’s answer, but believe the analogy is a bit off.

Encapsulation is not security; its purpose is to avoid information overload. Consider the various flavors of autocompletion/auto suggest/IntelliSense: on systems with reflection, suggesting private methods would make reflection easier and less error prone. Would you want your IDE to suggest those outside the class itself?

There is a limit to what you can keep in mind at one time. In the old days we used to have global variables, hundreds of them, possibly even thousands, and most of them were used in relatively few places. When you used them, you had to determine which one to use, whether it already existed or not, where it was being changed and where it was being read from, and where in the flow of the application the read/write was happening. It was hard. Working with such code, it was not uncommon to use the wrong variable or to declare a duplicate that was used in different places, nor was it uncommon to change the value inappropriately between usages. It’s impossible to think about hundreds of variables at once.

Encapsulation means that state is exposed where it should be changed, and that actions are exposed where they can be used. If you have a class Foo which has a private method CalcBarXSetY that calculates an intermediate value used in private method BarX which is used in the method Bar, as a user of the class Foo, I shouldn’t be burdened with the knowledge that CalcBarXSetY exists, because I have no idea what it does and can’t reliably use it. Knowing that Foo has a method named CalcBarXSetY is approximately as useful to me as a user of the class as the number of lines used to create the class or whether Bar uses a variable named i or x.

jmoreno
  • Limiting the information you _need_ to know is _one_ purpose of encapsulation, but it is not _the only_ purpose. The last example in [flater's answer](https://softwareengineering.stackexchange.com/a/404373/96713) shows a different purpose: sometimes the hidden information _would_ be useful to you, but the author of the encapsulated module wants the freedom to _change_ that detail, so is _deliberately hiding it from you_. – IMSoP Jan 31 '20 at 12:52
  • @IMSoP: that’s the other side of the same coin — if it’s changing, it’s not useful to know it. Not exposing unnecessary details gives the freedom to change them. – jmoreno Jan 31 '20 at 13:11
  • I think that framing misses the point. It's not that knowing it isn't useful, it's that _even if I do know it, I should avoid acting on that knowledge_. If I can write `goto foo.php line 1234`, the problem is not that I _know_ what's on line 1234, it's that I am _using_ that information in a dangerous way. – IMSoP Jan 31 '20 at 13:20
0

Let me disagree with all answers so far.

IMHO immutability instead of encapsulation is good for simple classes and bad for complex classes.

Justification:

  • By a simple class I mean something with 1-3 fields, meant to obtain and provide them. Most people would use a struct for that. Writing accessors for that is pure boilerplate code.
  • Constness makes reasoning about the program behaviour much simpler.
  • On the other hand, a complex class with internal logic must most definitely be completely encapsulated, just as other answers advise. A pImpl idiom could very well be suitable as well.
  • Case study: Python does not feature access restrictions, yet it is still extremely popular.
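
The "simple class" case from the first bullet might look like this in C#, as a sketch only: Point is a hypothetical type, and readonly struct assumes C# 7.2 or later.

```csharp
using System;

// A "simple class" in the sense above: two fields, no logic, immutable,
// and no accessors to write.
public readonly struct Point
{
    public readonly int X;
    public readonly int Y;

    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }
}

public static class Program
{
    public static void Main()
    {
        var p = new Point(2, 3);
        Console.WriteLine(p.X + p.Y); // 5

        // p.X = 10; // does not compile: X is readonly
    }
}
```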
Vorac