How should an `Employee` class be designed?

Question

I am trying to create a program for managing employees. I cannot, however, figure out how to design the Employee class. My goal is to be able to create and manipulate employee data on the database using an Employee object.

The basic implementation I thought of, was this simple one:

class Employee
{
    // Employee data (let's say, dozens of properties).

    Employee() {}
    Create() {}
    Update() {}
    Delete() {}
}

Using this implementation, I encountered several issues.

The ID of an employee is given by the database, so if I use the object for describing a new employee, there will be no ID to store yet, while an object representing an existing employee will have an ID. So I have a property that sometimes describes the object and sometimes doesn't (Which could indicate that we violate SRP? Since we use the same class for representing new and existing employees...).
The Create method is supposed to create an employee on the database, while the Update and Delete are supposed to act on an existing employee (Again, SRP...).
What parameters should the 'Create' method have? Dozens of parameters for all of the employee data or maybe an Employee object?
Should the class be immutable?
How will the Update work? Will it take the properties and update the database? Or maybe it will take two objects - an "old" one and a "new" one, and update the database with the differences between them? (I think the answer has to do with the answer about the mutability of the class).
What would be the responsibility of the constructor? What would be the parameters it takes? Would it fetch employee data from the database using an id parameter and them populate the properties?

So, as you can see, I have a bit of a mess in my head, and I am very confused. Could You please help me understand how such a class should look like?

Please notice that I do not want opinions, just to understand how such a frequently used class is generally designed.

Your curent violation of the SRP is that you have a class representing both an entity and being responsible for CRUD logic. If you separate it, that CRUD operations and the entity structure will be different classes, then **1.** and **2.** do not break SRP. **3.** should take an `Employee` object to provide abstraction, questions **4.** and **5.** are generally unanswerable, depend on your needs, and if you separate the structure and CRUD operations into two classes, then it's quite clear, the constructor of the `Employee` cannot fetch data from db anymore, so that answers **6.** — Andy, Jul 03 '16 at 16:03
Do not, I repeat, **do not** have your ctor reaching out to the database. Doing so tightly couples the code to the database and makes things god awful hard to test (even manual testing gets harder). Look into the repository pattern. Think about it for a second, do you `Update` an employee, or do you update an employee record? Do you `Employee.Delete()`, or does a `Boss.Fire(employee)`? — RubberDuck, Jul 03 '16 at 16:54
Aside from what has been already been mentioned, does it make sense to you that you need an employee to create an employee? In active record, it might make more sense to new up an Employee and then call Save on that object. Even then, though, you now have a class that is responsible for business logic as well as its own data persistence. — Mr Cochese, Jul 04 '16 at 08:47

score 10 · Accepted Answer · edited Jun 16 '20 at 10:01

_{This is a more well-formed transcription of my initial comment under your question. The answers to questions addressed by the OP may be found at the bottom of this answer. Also please check the important note located at the same place.}

What you are currently describing, Sipo, is a design pattern called Active record. As with everything, even this one has found its place among programmers, but has been discarded in favour of repository and the data mapper patterns for one simple reason, scalability.

In short, an active record is an object, which:

represents an object in your domain (includes business rules, knows how to handle certain operations on the object, such as if you can or cannot change a username and so forth),
knows how to retrieve, update, save and delete the entity.

You address several issues with your current design and the main problem of your design is addressed in the last, 6th, point (last but not least, I guess). When you have a class for which you are designing a constructor and you do not even know what the constructor should do, the class is probably doing something wrong. That happened in your case.

But fixing the design is actually pretty simple by splitting the entity representation and CRUD logic into two (or more) classes.

This is what your design looks like now:

Employee - contains information about the employee structure (its attributes) and methods how to modify the entity (if you decide to go the mutable way), contains CRUD logic for the Employee entity, can return a list of Employee objects, accepts an Employee object when you want to update an employee, can return a single Employee through a method like getSingleById(id : string) : Employee

_{Wow, the class seems huge.}

This will be the proposed solution:

Employee - contains information about the employee structure (its attributes) and methods how to modify the entity (if you decide to go the mutable way)
EmployeeRepository - contains CRUD logic for the Employee entity, can return a list of Employee objects, accepts an Employee object when you want to update an employee, can return a single Employee through a method like getSingleById(id : string) : Employee

Have you heard of separation of concerns? No, you will now. It is the less strict version of the Single Responsibility Principle, which says a class should actually have only one responsibility, or as Uncle Bob says:

A module should have one and only one reason to change.

It is quite clear that if I was able to clearly split your initial class into two which still have a well rounded interface, the initial class was probably doing too much, and it was.

What is great about the repository pattern, it not only acts as an abstraction to provide a middle layer between database (which can be anything, file, noSQL, SQL, object-oriented one), but it does not even need to be a concrete class. In many OO languages, you can define the interface as an actual interface (or a class with a pure virtual method if you are in C++) and then have multiple implementations.

This completely lifts the decision whether a repository is an actual implementation of you are simply relying on the interface by actually relying on a structure with the interface keyword. And repository is exactly that, it is an fancy term for data layer abstraction, namely mapping data to your domain and vice versa.

Another great thing about separating it into (at least) two classes is that now the Employee class can clearly manage its own data and do it very well, because it does not need to take care of other difficult things.

Question 6: So what should the constructor do in the newly created Employee class? It is simple. It should take the arguments, check if they are valid (such as an age shouldn't probably be negative or name shouldn't be empty), raise an error when the data was invalid and if the validation passed assign the arguments to private variables of the entity. It now cannot communicate with the database, because it simply has no idea how to do it.

Question 4: Cannot be answered at all, not generally, because the answer heavily depends on what exactly you need.

Question 5: Now that you have separated the bloated class into two, you can have multiple update methods directly on the Employee class, like changeUsername, markAsDeceased, which will manipulate the data of the Employee class only in RAM and then you could introduce a method such as registerDirty from the Unit of Work pattern to the repository class, through which you would let the repository know that this object has changed properties and will need to be updated after you call the commit method.

Obviously, for an update an object requires to have an id and thus be already saved, and it's the repository's responbitility to detect this and raise an error when the criteria is not met.

Question 3: If you decide to go with the Unit of Work pattern, the create method will now be registerNew. If you do not, I would probably call it save instead. The goal of a repository is to provide an abstraction between the domain and the data layer, because of this I would recommend you that this method (be it registerNew or save) accepts the Employee object and it is up to the classes implementing the repository interface, which attributes they decide to take out of the entity. Passing an entire object is better so you do not need to have many optional parameters.

Question 2: Both methods will now be a part of the repository interface and they do not violate the single responsibility principle. The responsibility of the repository is to provide CRUD operations for the Employee objects, that is what it does (besides Read and Delete, CRUD translates to both Create and Update). Obviously, you could split the repository even further by having an EmployeeUpdateRepository and so forth, but that is rarely needed and a single implementation can usually contain all CRUD operations.

Question 1: You ended up with a simple Employee class which will now (among other attributes) have id. Whether the id is filled or empty (or null) depends on whether the object has been already saved. Nonetheless, an id is still an attribute the entity owns and the responsibility of the Employee entity is to take care of its attributes, hence taking care of its id.

Whether an entity does or does not have an id does not usually matter untill you try to do some persistence-logic on it. As mentioned in the answer to the question 5, it is the repository's responsibility to detect you aren't trying to save an entity which has already been saved or trying to update an entity without an id.

Important note

Please be aware that although separation of concerns is great, actually designing a functional repository layer is quite a tedious work and in my experience is a bit more difficult to get right than the active record approach. But you will end up with a design which is far more flexible and scalable, which may be a good thing.

@Ewan I didn't downvote your answer, but I can see why some may have. It does not directly answer some of the OP's questions and some of your suggestions seem unfounded. — Andy, Jul 03 '16 at 18:25
Nice and comprenhensive answer. Hits the nail on the head with the separaton of concern. And I like the warning that pintpoints the important choice to make between a perfect complex design and a nice compromise. — Christophe, Jul 03 '16 at 19:06
when first create a new employee object , there will be no value to ID. id field can leave with null value but it will cause that the employee object is in invalid state ???? — Susantha7, Mar 30 '19 at 18:20

Ewan · Answer 2 · 2016-07-04T08:00:12.173

2

First create an employee struct containing the properties of the conceptual employee.

Then create a database with matching table structure, say for example mssql

Then create a employee repository For that database EmployeeRepoMsSql with the various CRUD operations you require.

Then create an IEmployeeRepo interface exposing the CRUD operations

Then expand your Employee struct to a class with a construction parameter of IEmployeeRepo. Add the various Save/Delete etc methods you require and use the injected EmployeeRepo to implement them.

When it cones to Id I suggest you use a GUID which can be generated via code in the constructor.

To work with existing objects your code can retrieve them from the database via the repository before calling their Update Method.

Alternatively you can opt for the frowned upon (but in my view superior) Anemic Domain Object model where you don't add the CRUD methods to your object, and simply pass the object to the repo to be updated/saved/deleted

Immutability is a design choice which will depend on your patterns and coding style. If you are going all functional then try to be immutable as well. But if you are unsure a mutable object is probably easier to implement.

Instead of Create() I would go with Save(). Create works with the immutability concept, but I always find it useful to be able to construct an object which is not yet 'Saved' eg you have some UI which allows you to populate an employee object or objects and then verify them again some rules before saving to the database.

***** example code

public class Employee
{
    public string Id { get; set; }

    public string Name { get; set; }

    private IEmployeeRepo repo;

    //with the OOP approach you want the save method to be on the Employee Object
    //so you inject the IEmployeeRepo in the Employee constructor
    public Employee(IEmployeeRepo repo)
    {
        this.repo = repo;
        this.Id = Guid.NewGuid().ToString();
    }

    public bool Save()
    {
        return repo.Save(this);
    }
}

public interface IEmployeeRepo
{
    bool Save(Employee employee);

    Employee Get(string employeeId);
}

public class EmployeeRepoSql : IEmployeeRepo
{
    public Employee Get(string employeeId)
    {
        var sql = "Select * from Employee where Id=@Id";
        //more db code goes here
        Employee employee = new Employee(this);
        //populate object from datareader
        employee.Id = datareader["Id"].ToString();

    }

    public bool Save(Employee employee)
    {
        var sql = "Insert into Employee (....";
        //db logic
    }
}

public class MyADMProgram
{
    public void Main(string id)
    {
        //with ADM don't inject the repo into employee, just use it in your program
        IEmployeeRepo repo = new EmployeeRepoSql();
        var emp = repo.Get(id);

        //do business logic
        emp.Name = TextBoxNewName.Text;

        //save to DB
        repo.Save(emp);

    }
}

edited Jul 04 '16 at 08:00

answered Jul 03 '16 at 15:57

Ewan

70,664
5
76
161

1

Anemic Domain Model has very little to do with CRUD logic. It is a model which, although belongs to the domain layer, has no functionality and all functionality is served through services, to which this domain model is being passed as a parameter. – Andy Jul 03 '16 at 16:07
Exactly, in this case the repo is the service and the functions are the CRUD operations. – Ewan Jul 03 '16 at 16:11
@DavidPacker are you saying the Anemic Domain Model is a good thing? – candied_orange Jul 03 '16 at 16:38
1

@CandiedOrange I haven't expressed my opinion in the comment, but no, if you decide to go as far as diving your application to layers where one layer is responsible only for business logic, I am with Mr. Fowler that an anemic domain model is in fact an anti-pattern. Why should I need a `UserUpdate` service with a `changeUsername(User user, string newUsername)` method, when I can just as well add the `changeUsername` method to the class `User` directly. Creating a service for that is a non-sense. – Andy Jul 03 '16 at 16:52
1

I think in this case injecting the repo just to put the CRUD logic on the Model isnt optimal. – Ewan Jul 03 '16 at 16:52
Interesting answer. I don't understand the downvotes here. IMO The choice of creating a GUID in the constructor is valid, but the principle of getting it from DB is a frequent constraint in transactional systems (not speaking of other data that might be initialised/changed per db trigger) so that I wouldn't really question it. Cold you clarify if the EmployeeRepo represents the employee in the repository (i.e. bound to a specific employee), or the db connection for employee data (shared by all the employees, not bound to a specific employee) ? – Christophe Jul 03 '16 at 19:21
EmployeeRepo encapsulates the data access layer, ie all your insert into Employee (id,name..... Etc see davids link – Ewan Jul 03 '16 at 21:16
downvotes probably due to my answer really addressing.your first question 'how do i design an object' rather than your issues encountered? Not sure – Ewan Jul 03 '16 at 21:21
@Ewan - I believe so too. But for me, it is a helpful answer so I upvoted it. I need to read the other answers before I could choose one. Thanks. – Michael Haddad Jul 04 '16 at 07:38
Downvote was due to an odd wording and misunderstanding of @Ewan's intent. It's been rectified. I'd just add that not knowing how to save itself doesn't *necessarily* make the model anemic. The employee could have lots of methods, it might not. It's also wise to remember what we mean by "model". Often, we're either speaking of a "data model" or the "business logic model", which are two extremely different things. A business model could be an entire subsystem of interacting objects, while a data model really might be be a simple struct. – RubberDuck Jul 04 '16 at 12:43
If the Employee is a domain entity, I really do not like it knowing about the repository, as you demonstrate in your example. I would not do that. I just realized you may think that is what I meant when I criticized the anemic domain model. I do not think the repository should be inside the model. – Andy Jul 04 '16 at 13:27
well here we get into an argument about what ADM 'means' and if it includes CRUD operations. I guess we just have to hunt MF down and shine a light into his eyes until he tells us!! :) – Ewan Jul 04 '16 at 15:22

Christophe · Answer 3 · 2016-07-03T20:02:20.277

Review of your design

Your Employee is in reality a kind of proxy for an object managed persistently in the database.

I therefore suggest to think to the ID as if it were a reference to your database object. With this logic in mind you can continue your design as you would do for non database objects, the ID allowing you to implement the traditional composition logic:

If the ID is set, you have a corresponding database object.
If the ID isn't set, there is no corresponding database object: Employee may yet to be created, or it could just have been deleted.
You need some mechanism to initiate the relationship for exising employees and exising database records that are not yet loaded in memory.

You'd also need to manage a status for the object. For example:

when an Employee is not yet linked with a DB object either via create or data retrieval, you should not be able to perform updates or deletes
is the Employee data in the object in sync with the database or are there changes made ?

WIth this in mind, we could opt for:

class Employee
{
    ...
    Employee () {}       // Initialize an empty Employee
    Load(IDType ID) {}   // Load employee with known ID from the database
    bool Create() {}     // Create an new employee an set its ID 
    bool Update() {}     // Update the employee (can ID be changed?)
    bool Delete() {}     // Delete the employee (and reset ID because there's no corresponding ID. 
    bool isClean () {}   // true if ID empty or if all properties match database
}

In order to be able to manage your object status in a reliable manner, you must ensure a better encapsulation by making the properties private and give access only via getters and setters that setters update status.

Your questions

I think the ID property doesn't violate the SRP. Its single responsibility is to refer to a database object.
Your Employee as a whole isn't compliant with the SRP, because it is responsible for the link with the database, but also for holding temporary changes, and for all the transactions happening to that object.

Another design could be to keep the changeable fields in another object that would be loaded only when fields need to be accessed.

You could implement the database transactions on the Employee using the command pattern. This kind of design also would ease the decoupling between your business objects (Employee) and your underlying database system, by isolating database specific idioms and APIs.
I wouldn't add dozen of parameters to Create(), because the business objects could evolve and make all this very difficult to maintain. And the code would become unreadable. You have 2 choices here: either passing a minimalistic set of parameters (no more than 4) that are absolutely required for creating an employee in database and perform the remaining changes via update, OR you pass an object. By the way, in your design I understand that you've already chosen: my_employee.Create().
Should the class be immutable ? See discussion above: in your original design no. I would opt for an immutable ID but not an immutable Employee. An Employee evolves in real life (new job position, new address, new matrimonial situation, even new names...). I think it will be easier and more natural to work with this reality in mind, at least in the business logic layer.
If you consider using a commands for update and a distinct objects for (GUI ?) for holding desired changes you could opt for old/new approach. In all other cases, I'd opt for updating a mutable object. Attention: the update could trigger database code so that you should make sure after an update the object is still really in sync with the DB.
I think that fetching an employee from DB in the constructor is not a good idea, because fetching could go wrong, and in many languages, it's difficult to cope with failed construction. Constructor should initialize object (especially the ID) and its status.

How should an `Employee` class be designed?

3 Answers3

Important note

Linked