47

In legacy code I occasionally see classes that are nothing but wrappers for data. something like:

class Bottle {
   int height;
   int diameter;
   Cap capType;

   getters/setters, maybe a constructor
}

My understanding of OO is that classes are structures for data and the methods of operating on that data. This seems to preclude objects of this type. To me they are nothing more than structs and kind of defeat the purpose of OO. I don't think it's necessarily evil, though it may be a code smell.

Is there a case where such objects would be necessary? If this is used often, does it make the design suspect?

Michael K
  • 15,539
  • 9
  • 61
  • 93
  • 1
    This doesn't quite answer your question but seems relevant nonetheless: http://stackoverflow.com/questions/36701/struct-like-objects-in-java – Adam Lear Dec 27 '10 at 14:39
  • 3
    I think the term you are looking for is POD (Plain Old Data). – Gaurav Dec 27 '10 at 14:51
  • 10
    This is a typical example of structured programming. Not necessarily bad, just not object oriented. – Konrad Rudolph Dec 27 '10 at 15:02
  • 1
    shouldn't this be on stack overflow? – Muad'Dib Dec 27 '10 at 15:42
  • @Muad'Dib: No, it is not a 'how do I code this' question. – Michael K Dec 27 '10 at 15:46
  • @michale true, but it is a question about "programming" not "programmers" – Muad'Dib Dec 27 '10 at 17:27
  • 8
    @Muad'Dib: technically, it *is* about programmers. Your compiler doesn't care if you use plain old data structures. Your CPU probably enjoys it (in the "I love the smell of data fresh from the cache" sort of way). It's *people* who get hung up on "does this make my methodology less pure?" questions. – Shog9 Dec 27 '10 at 18:42
  • @Muad'dib: From the FAQ, "Programmers - Stack Exchange is for expert programmers who are interested in subjective discussions on software development." The questions here aren't limited to things that are about programmers. Programming in general is fair game as well, under certain conditions. – Adam Lear Dec 27 '10 at 21:41
  • @Gaurav and others - There is no real *POD* in Java. It's a C/C++ concept (even if C++0x changes it somewhat). POD classically refers to a struct which has no user-defined constructors/destructors and no members with restricted access (the variables in OP's example are package private, hence no POD). Therefore, the title is also wrong. – Felix Dombek Dec 27 '10 at 22:39
  • This can be the DTO pattern, which is useful for sharing data between layers (think between your DAL and your business objects, for example). – Andy Aug 14 '16 at 15:00
  • Possible duplicate of [What is the point of using DTO (Data Transfer Objects)?](http://softwareengineering.stackexchange.com/questions/171457/what-is-the-point-of-using-dto-data-transfer-objects) – gnat Dec 15 '16 at 11:38
  • In my programming class about OO we learned that we should do this. Man thats a throwback – SirHawrk May 31 '22 at 07:12

10 Answers10

75

Definitely not evil and not a code smell in my mind. Data containers are a valid OO citizen. Sometimes you want to encapsulate related information together. It's a lot better to have a method like

public void DoStuffWithBottle(Bottle b)
{
    // do something that doesn't modify Bottle, so the method doesn't belong
    // on that class
}

than

public void DoStuffWithBottle(int bottleHeight, int bottleDiameter, Cap capType)
{
}

Using a class also allows you to add an additional parameter to Bottle without modifying every caller of DoStuffWithBottle. And you can subclass Bottle and further increase the readability and organization of your code, if needed.

There are also plain data objects that can be returned as a result of a database query, for example. I believe the term for them in that case is "Data Transfer Object".

In some languages there are other considerations as well. For example, in C# classes and structs behave differently, since structs are a value type and classes are reference types.

Adam Lear
  • 31,939
  • 8
  • 101
  • 125
  • 3
    True. And it would make it easier to add methods to `Bottle` later if you need to, right? – Michael K Dec 27 '10 at 15:02
  • OO is meant to make working with objects easier. Objects should be in parts of work. Everything to do with the inner workings of `Bottle` should be in the Bottle class. If, at a later date, new functionality needs to be added to Bottle, it is very easy. Containers, in all forms, are another important aspect of OO that should not be ignored. – IAbstract Dec 27 '10 at 15:19
  • @Michael Yes, definitely. – Adam Lear Dec 27 '10 at 15:22
  • 27
    Nah. `DoStuffWith` should be a method *of* the `Bottle` class in OOP (and should most probably be immutable, too). What you have written above is not a good pattern in an OO language (unless you are interfacing with a legacy API). It *is*, however, a valid design in a non-OO environment. – Konrad Rudolph Dec 27 '10 at 16:00
  • 3
    If "DoStuffWith" were "Display" ? – Murph Dec 27 '10 at 16:04
  • @Konrad Rudolph: just because Java is mostly an OO Language, it doesn't mean OOP is always the best answer. – Javier Dec 27 '10 at 16:06
  • 11
    @Javier: Then Java isn’t the best answer either. Java’s emphasis on OO is overwhelming, the language is basically no good at anything else. – Konrad Rudolph Dec 27 '10 at 16:09
  • 5
    @Konrad: depends on what you're doing. It might not make sense to have a Crush() method on Bottle. It may make more sense to have a Crusher class that can crush objects, and you may have a CrushBottle method on _that_. Ultimately, the correct way to do it is the one that makes most sense, whether or not that agrees with some arbitrary (and flawed) definition of what OOP _is_ – JohnL Dec 27 '10 at 16:41
  • 11
    @JohnL: I was of course simplifying. But in general, objects in OO encapsulate *state*, not data. This is a fine but important distinction. The point of OOP is precisely not to have a lot of data sitting around. It is *sending messages* between states that create a new state. I fail to see how you can send messages either to or from a methodless object. (And *this* is the original definition of OOP so I wouldn’t consider it flawed.) – Konrad Rudolph Dec 27 '10 at 16:53
  • 15
    @Konrad Rudolph: This is why I explicitly made the comment inside the method. I agree that methods that affect instances of Bottle should go in that class. But if another object needs to modify its state based on the information in Bottle, then I think my design would be fairly valid. – Adam Lear Dec 27 '10 at 17:12
  • 3
    @Konrad, sure but just because it was the original definition doesn't make it perfect. Situations arise in practice that were never considered in the theory, which makes the theory flawed to some extent. You can twist he situation to try and fit the theory, but as always it's better to adapt the theory to the situation. You can start adding methods to the object purely to fit the "objects must have methods" rule, but that wouldn't be appropriate because those methods don't really need to be there. – JohnL Dec 27 '10 at 17:26
  • 14
    @Konrad, I disagree that doStuffWithBottle should go in the bottle class. Why should a bottle know how to do stuff with itself? doStuffWithBottle indicates that something *else* will do something with a bottle. If bottle had that in it, that would be tight coupling. However, if the Bottle class had a isFull() method, that would be totally appropriate. – Nemi Dec 27 '10 at 18:31
  • 4
    @Konrad: I prefer the concept of *information*, not *state*, as the alternative to *data*, because then we can look at data as an (ideally minimal) encoding of the information, and understand its purpose as the retention of information for the time lag between its entrance to the channel and its exit. – Mike Dunlavey Dec 27 '10 at 18:57
  • 6
    @Konrad If that would be the case, then **all** OO systems would have only one class named "System" with all the methods in it. OO is about objects, not state, state is just one feature objects have. There is also a very important concept in OO which is **abstraction**. You don't have to list all the object characteristics if you don't need them. Depending on the scenario you might or not need an operation for `Bootle` – OscarRyz Dec 27 '10 at 21:07
  • 2
    @Anna: hmm, I had misunderstood that comment … in hindsight it makes sense. – Konrad Rudolph Dec 28 '10 at 09:42
30

Data classes are valid in some cases. DTO's are one good example mentioned by Anna Lear. In general though, you should regard them as the seed of a class whose methods haven't yet sprouted. And if you are running into a lot of them in old code, treat them as a strong code smell. They are often used by old C/C++ programmers who have never made the transision to OO programming and are a sign of procedural programming. Relying on getters and setters all the time (or worse yet, on direct access of non-private members) can get you into trouble before you know it. Consider an example of an external method that needs information from Bottle.

Here Bottle is a data class):

void selectShippingContainer(Bottle bottle) {
    if (bottle.getDiameter() > MAX_DIMENSION || bottle.getHeight() > MAX_DIMENSION ||
            bottle.getCapType() == Cap.FANCY_CAP ) {
        shippingContainer = WOODEN_CRATE;
    } else {
        shippingContainer = CARDBOARD_BOX;
    }
}

Here we've given Bottle some behavior):

void selectShippingContainer(Bottle bottle) {
    if (bottle.isBiggerThan(MAX_DIMENSION) || bottle.isFragile()) {
        shippingContainer = WOODEN_CRATE;
    } else {
        shippingContainer = CARDBOARD_BOX;
    }
}

The first method violates the Tell-Don't-Ask principle, and by keeping Bottle dumb we have let implicit knowledge about bottles, such as what makes one fragle (the Cap), slip into logic that is outside the Bottle class. You have to be on your toes to prevent this sort of 'leakage' when you habitually rely on getters.

The second method asks Bottle only for what it needs to do its job, and leaves Bottle to decide whether it is fragile, or larger than a given size. The result is a much looser coupling between the method and Bottle's implementation. A pleasant side-effect is that the method is cleaner and more expressive.

You'll rarely make use of this many fields of an object without writing some logic that ought to reside in the class with those fields. Figure out what that logic is and then move it to where it belongs.

Mike E
  • 419
  • 4
  • 2
  • 2
    Can't believe this answer has no votes (well, you have one now). This might be a simple example, but when OO is abused enough you get service classes that turn into nightmares containing tons of functionality that should have been encapsulated in classes. – Alb Feb 28 '12 at 17:21
  • "by old C/C++ programmers who have never made the transision (sic) to OO" ? C++ programmers usually are pretty OO, as it's an OO language, i.e. the whole point of using C++ instead of C. – nappyfalcon Apr 15 '18 at 21:06
  • 1
    @nappyfalcon Writing code in an OO language doesn't necessarily make it OO code; many C++ programmers sure *try* for OO, but it should be clear that they don't automatically succeed. Also note C++ is general-purpose paradigm, not OO exclusively. – geometrian Oct 22 '22 at 12:50
  • @nappyfalcon C++ is also C and those old programmers actually made a transition from C to C++ by either transforming existing C code, at least to some degree, or starting new C++ projects with experience, habits, and a mindset rooting in C programming. A lot of beginner level C++ books started with the basics of C before they got to C++. There is nothing in the C++ language that forces you to write OO code. Even languages like Java just force you to put everything into classes, which doesn't automatically lead to OO code. On the other hand there is C code that is very much OO. – BlackJack Dec 05 '22 at 14:14
9

If this is the sort of thing you need, that's fine, but please, please, please do it like

public class Bottle {
    public int height;
    public int diameter;
    public Cap capType;

    public Bottle(int height, int diameter, Cap capType) {
        this.height = height;
        this.diameter = diameter;
        this.capType = capType;
    }
}

instead of something like

public class Bottle {
    private int height;
    private int diameter;
    private Cap capType;

    public Bottle(int height, int diameter, Cap capType) {
        this.height = height;
        this.diameter = diameter;
        this.capType = capType;
    }

    public int getHeight() {
        return height;
    }

    public void setHeight(int height) {
        this.height = height;
    }

    public int getDiameter() {
        return diameter;
    }

    public void setDiameter(int diameter) {
        this.diameter = diameter;
    }

    public Cap getCapType() {
        return capType;
    }

    public void setCapType(Cap capType) {
        this.capType = capType;
    }
}

Please.

compman
  • 1,387
  • 13
  • 21
  • ++ Amen. See my answer. – Mike Dunlavey Dec 27 '10 at 18:33
  • 14
    The only problem with this is that there is no validation. Any logic as to what a valid Bottle is should be in the Bottle class. However, using your proposed implementation, I can have a bottle with a negative height and diameter - there's no way to enforce any business rules on the object without validating every time the object is used. By using the second method, I can ensure that if I have a Bottle object, it was, is, and always will be a valid Bottle object according to my contract. – Thomas Owens Dec 27 '10 at 20:35
  • 7
    That's one area where .NET has a slight edge over properties, since you can add a property accessor with validation logic, with the same syntax as if you were accessing a field. You can also define a property where classes can get the property value, but not set it. – JohnL Dec 27 '10 at 22:04
  • 1
    @Thomas: I'm not saying there aren't times when you could have code maliciously misusing code you've written, but it seems that you shouldn't spend all of your time acting like your code other code is out to attack the code you're working on at the moment. Ultimately, you have to trust your code. If you can't trust it, fix it. Perhaps we should have debugging tools that keep track of information (like the stack trace) about where the current value of a variable came from, and maybe the same information about previous values of a variable. Thoughts anyone? – compman Dec 28 '10 at 02:08
  • 3
    @user9521 If you are sure that your code will not cause a fatal error with "bad" values, then go for your method. However, if you need further validation, or the ability to use lazy loading, or other checks when data is read or written, then using explicit getters and setters. Personally I tend to keep my variables private and use getters and setters for consistency. This way all my variables are treated the same, regardless of validation and/or other "advanced" techniques. – Jonathan Dec 28 '10 at 04:29
  • 1
    @user9521 After I'm done testing, debugging, verifying, and validating, I trust my code. But I don't trust clients. You should always write code as if the client is malicious or is going to attempt to put the application into an illegal state in an attempt to exploit it. – Thomas Owens Dec 28 '10 at 09:46
  • 2
    The advantage of using the constructor is that it makes it much easier to make the class immutable. This is essential if you wish to write multi-threaded code. – Fortyrunner Dec 29 '10 at 17:36
  • 8
    I would make the fields final whenever possible. IMHO I would have preferred fields to be final by default and have a keyword for mutable fields. e.g. var – Peter Lawrey Dec 30 '10 at 01:08
  • @user9521: Absolutely. People should realise that field names are *already* functions. In fact, they're called projections. The representation of a product type is *already* abstract. – Yttrill Jan 22 '11 at 17:14
  • When you're maintaining a really large program of variable quality, being able to put a breakpoint on a getter or setter or being able to add validation logic really pays off. Not using getters/setters is only sensible for private local classes. There's no performance benefit to not doing so (see http://stackoverflow.com/questions/23931546/java-getter-and-setter-faster-than-direct-access ), and project Lombok Getter and Setter annotations can save typing. (But yes, C# properties win here for syntactic sugar). – MZB Dec 29 '16 at 14:35
6

As @Anna said, definitely not evil. Sure you can put operations (methods) into classes, but only if you want to. You don't have to.

Permit me a small gripe about the idea that you have to put operations into classes, along with the idea that classes are abstractions. In practice, this encourages programmers to

  1. Make more classes than they need to (redundant data structure). When a data structure contains more components than minimally necessary, it is un-normalized, and therefore contains inconsistent states. In other words, when it is altered, it needs to be altered in more than one place in order to remain consistent. Failure to perform all coordinated changes makes it inconsistent, and is a bug.

  2. Resolve problem 1 by putting in notification methods, so that if part A is modified, it tries to propagate necessary changes to parts B and C. This is the primary reason why it is recommended to have get-and-set accessor methods. Since this is recommended practice, it appears to excuse problem 1, causing more of problem 1 and more of solution 2. This results not only in bugs due to incompletely implementing the notifications, but to a performance-sapping problem of runaway notifications. These are not infinite computations, merely very long ones.

These concepts are taught as good things, generally by teachers who haven't had to work within million-line monster apps riddled with these issues.

Here's what I try to do:

  1. Keep data as normalized as possible, so that when a change is made to the data, it is done at as few code points as possible, to minimize the likelihood of entering an inconsistent state.

  2. When data must be un-normalized, and redundancy is unavoidable, do not use notifications in an attempt to keep it consistent. Rather, tolerate temporary inconsistency. Resolve inconsistency with periodic sweeps through the data by a process that does only that. This centralizes the responsibility for maintaining consistency, while avoiding the performance and correctness problems that notifications are prone to. This results in code that is much smaller, error-free, and efficient.

Mike Dunlavey
  • 12,815
  • 2
  • 35
  • 58
4

Agree with the Anna Lear,

Definitely not evil and not a code smell in my mind. Data containers are a valid OO citizen. Sometimes you want to encapsulate related information together. It's a lot better to have a method like...

Sometimes people forget to read the 1999 Java Coding Conventions which make it very plain that this kind of programming is perfectly fine. In fact if you avoid it, then your code will smell! (too many getters/setters)

From Java Code Conventions 1999: One example of appropriate public instance variables is the case where the class is essentially a data structure, with no behavior. In other words, if you would have used a struct instead of a class (if Java supported struct), then it's appropriate to make the class's instance variables public. http://www.oracle.com/technetwork/java/javase/documentation/codeconventions-137265.html#177

When used correctly, PODs (plain old data structures) are better than POJOs just like POJOs are often better than EJBs.
http://en.wikipedia.org/wiki/Plain_Old_Data_Structures

3

Structs have their place, even in Java. You should only use them if the following two things are true:

  • You just need to aggregate data that doesn't have any behavior, e.g. to pass as a parameter
  • It doesn't matter one bit what sort of values that aggregate data has

If this is the case, then you should make the fields public and skip the getters/setters. Getters and setters are clunky anyway, and Java is silly for not having properties like a useful language. Since your struct-like object shouldn't have any methods anyway, public fields make the most sense.

However, if either one of those do not apply, you're dealing with a real class. That means all fields should be private. (If you absolutely need a field at a more accessible scope, use a getter/setter.)

To check if your supposed-struct has behavior, look at when the fields are used. If it seems to violate tell, don't ask, then you need to move that behavior into your class.

If some of your data shouldn't change, then you need to make all those fields final. You might consider making your class immutable. If you need to validate your data, then provide validation in the setters and constructors. (A useful trick is to define a private setter and modify your field within your class using only that setter.)

Your Bottle example would most likely fail both tests. You could have (contrived) code that looks like this:

public double calculateVolumeAsCylinder(Bottle bottle) {
    return bottle.height * (bottle.diameter / 2.0) * Math.PI);
}

Instead it should be

double volume = bottle.calculateVolumeAsCylinder();

If you changed the height and diameter, would it be the same bottle? Probably not. Those should be final. Is a negative value ok for the diameter? Must your Bottle be taller than it is wide? Can the Cap be null? No? How are you validating this? Assume the client is either stupid or evil. (It's impossible to tell the difference.) You need to check these values.

This is what your new Bottle class might look like:

public class Bottle {

    private final int height, diameter;

    private Cap capType;

    public Bottle(final int height, final int diameter, final Cap capType) {
        if (diameter < 1) throw new IllegalArgumentException("diameter must be positive");
        if (height < diameter) throw new IllegalArgumentException("bottle must be taller than its diameter");

        setCapType(capType);
        this.height = height;
        this.diameter = diameter;
    }

    public double getVolumeAsCylinder() {
        return height * (diameter / 2.0) * Math.PI;
    }

    public void setCapType(final Cap capType) {
        if (capType == null) throw new NullPointerException("capType cannot be null");
        this.capType = capType;
    }

    // potentially more methods...

}
Eva
  • 254
  • 2
  • 8
3

This kind of classes are quite useful when you are dealing with mid-size/big applications, for some reasons:

  • it's quite easy to create some test cases and ensure that data is consistent.
  • it holds all kind of behaviors that involve that information, so data bug tracking time is reduce
  • Using them should keep method args lightweight.
  • When using ORMs , this classes gives flexybility and consistency. Adding a complex attribute that's calculated based on simple information already in the class, resutls in writing one simple method. This is quite more agile and productive that having to check your database and ensure all databases are patched with new modification.

So to sum up, in my experience they usually are more useful than annoying.

guiman
  • 2,088
  • 13
  • 17
3

With game design, the overhead of 1000's of function calls, and event listeners can sometimes make it worth it to have classes that only store data, and have other classes that loop through all the data only classes to perform the logic.

0

IMHO there often aren't enough classes like this in heavily object-oriented systems. I need to carefully qualify that.

Of course if the data fields have wide scope and visibility, that can be extremely undesirable if there are hundreds or thousands of places in your codebase tampering with such data. That's asking for trouble and difficulties maintaining invariants. Yet at the same time, that doesn't mean that every single class in the entire codebase benefits from information hiding.

But there are many cases where such data fields will have very narrow scope. A very straightforward example is a private Node class of a data structure. It can often simplify the code a great deal by reducing the number of object interactions going on if said Node could simply consist of raw data. That serves as a decoupling mechanism since the alternative version might require bidirectional coupling from, say, Tree->Node and Node->Tree as opposed to simply Tree->Node Data.

A more complex example would be entity-component systems as used often in game engines. In those cases, the components are often just raw data and classes like the one you've shown. However, their scope/visibility tends to be limited since there are typically only one or two systems that might access that particular type of component. As a result you still tend to find it quite easy to maintain invariants in those systems and, moreover, such systems have very few object->object interactions, making it very easy to comprehend what's going on at the bird's eye level.

In such cases you can end up with something more like this as far as interactions go (this diagram indicates interactions, not coupling, since a coupling diagram might include abstract interfaces for the second image below):

enter image description here

... as opposed to this:

enter image description here

... and the former type of system tends to be a lot easier to maintain and reason about in terms of correctness in spite of the fact that the dependencies are actually flowing towards data. You get much less coupling primarily because many things can be turned into raw data instead of objects interacting with each other forming a very complex graph of interactions.

-1

There are even much strange creatures exist in the OOP world. I've once created a class, which is abstract, and contains nothing, it just for to be a common parent of other two classes... it's the Port class:

SceneMember 
Message extends SceneMember
Port extends SceneMember
InputPort extends Port
OutputPort extends Port

So SceneMember is the base class, it does all the job, which required to appear on the scene: add, delete, get-set ID, etc. Message is the connection between ports, it has its own life. InputPort and OutputPort contain the i/o specific functions of their own.

Port is empty. It is just used to group InputPort and OutputPort together for, say, a portlist. It's empty because SceneMember contains all stuff which are for acting as a member, and InputPort and OutputPort contain the port-specified tasks. InputPort and OutputPort are such different, that they have no common (except SceneMember) functions. So, Port is empty. But it's legal.

Maybe it's an antipattern, but I like it.

(It's like the word "mate", which is used for both "wife" and "husband". You never use the "mate" word for a concrete person, because you know the sex of him/her, and it does not matter if he/she is married or not, so you use "someone" instead, but it still exists, and is used in rare, abstract situations, e.g. a law text.)

ern0
  • 1,035
  • 3
  • 8
  • 14
  • 1
    Why do your ports need to extend SceneMember? why not create the port classes to operate on SceneMembers? – Michael K Dec 27 '10 at 15:11
  • 1
    So why not use the standard marker interface pattern? Basically identical to your empty abstract base class, but it's a much more common idiom. – TMN Dec 27 '10 at 18:38
  • 1
    @Michael: Only for theoretical reasons, I reserved Port for future use, but this future hasn'nt arrived yet, and maybe never will. I didn't realised that they have no common at all, unlike their names suppose. I will pay compensation for everyone who suffered any loss occurred by that empty class... – ern0 Dec 27 '10 at 22:03
  • 1
    @TMN: SceneMember (derivative) types have getMemberType() method, there're several cases when Scene is scanned for subset of SceneMember objects. – ern0 Dec 27 '10 at 22:10
  • 4
    This does not answer the question. – Eva Jan 28 '13 at 13:57