This is a more well-formed transcription of my initial comment under your question. The answers to questions addressed by the OP may be found at the bottom of this answer. Also please check the important note located at the same place.
What you are currently describing, Sipo, is a design pattern called Active Record. Like every pattern, it has found its place among programmers, but it has been discarded in favour of the repository and data mapper patterns for one simple reason: scalability.
In short, an active record is an object, which:
- represents an object in your domain (includes business rules, knows how to handle certain operations on the object, such as whether you can change a username, and so forth),
- knows how to retrieve, update, save and delete the entity.
You raise several issues with your current design, and the main problem is the one in your last, 6th, point (last but not least, I guess). When you are designing a constructor for a class and you do not even know what the constructor should do, the class is probably doing something wrong. That is what happened in your case.
But fixing the design is actually pretty simple: split the entity representation and the CRUD logic into two (or more) classes.
This is what your design looks like now:
Employee
- contains information about the employee structure (its attributes) and methods to modify the entity (if you decide to go the mutable way), contains CRUD logic for the Employee entity, can return a list of Employee objects, accepts an Employee object when you want to update an employee, can return a single Employee through a method like getSingleById(id : string) : Employee
Wow, the class seems huge.
This will be the proposed solution:
Employee
- contains information about the employee structure (its attributes) and methods to modify the entity (if you decide to go the mutable way)
EmployeeRepository
- contains CRUD logic for the Employee entity, can return a list of Employee objects, accepts an Employee object when you want to update an employee, can return a single Employee through a method like getSingleById(id : string) : Employee
Have you heard of separation of concerns? If not, you will now. It is the less strict version of the Single Responsibility Principle, which says a class should have only one responsibility, or as Uncle Bob says:
A module should have one and only one reason to change.
It is quite clear that if I was able to clearly split your initial class into two which still have a well rounded interface, the initial class was probably doing too much, and it was.
What is great about the repository pattern is that it not only acts as an abstraction, a middle layer between your domain and the database (which can be anything: a file, NoSQL, SQL, an object-oriented one), but it does not even need to be a concrete class. In many OO languages, you can define the interface as an actual interface (or a class with pure virtual methods if you are in C++) and then have multiple implementations.
This completely removes the need to decide whether a repository is an actual implementation or whether you are simply relying on the interface: you rely on a structure declared with the interface keyword. And a repository is exactly that; it is a fancy term for a data layer abstraction, namely mapping data to your domain and vice versa.
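To make the "one interface, several implementations" idea concrete, here is a minimal sketch in Python. Python has no interface keyword, so an abstract base class stands in for it. The two implementations (InMemoryEmployeeRepository and CsvEmployeeRepository), the employee_names function and the snake_case method names are my own assumptions for illustration, not something from your code.

```python
import csv
import io
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Employee:
    id: str
    name: str


class EmployeeRepository(ABC):
    """The interface the rest of the application depends on."""

    @abstractmethod
    def get_single_by_id(self, id: str) -> Optional[Employee]: ...

    @abstractmethod
    def get_all(self) -> List[Employee]: ...


class InMemoryEmployeeRepository(EmployeeRepository):
    """Hypothetical implementation backed by a plain dictionary."""

    def __init__(self, employees: List[Employee]) -> None:
        self._by_id: Dict[str, Employee] = {e.id: e for e in employees}

    def get_single_by_id(self, id: str) -> Optional[Employee]:
        return self._by_id.get(id)

    def get_all(self) -> List[Employee]:
        return list(self._by_id.values())


class CsvEmployeeRepository(EmployeeRepository):
    """Hypothetical implementation reading CSV text with 'id' and 'name' columns."""

    def __init__(self, csv_text: str) -> None:
        self._csv_text = csv_text

    def get_all(self) -> List[Employee]:
        reader = csv.DictReader(io.StringIO(self._csv_text))
        return [Employee(id=row["id"], name=row["name"]) for row in reader]

    def get_single_by_id(self, id: str) -> Optional[Employee]:
        for employee in self.get_all():
            if employee.id == id:
                return employee
        return None


def employee_names(repository: EmployeeRepository) -> List[str]:
    # Depends only on the interface, never on a concrete storage class.
    return [employee.name for employee in repository.get_all()]
```

The calling code (employee_names here) cannot tell which storage it is talking to, which is the whole point of the abstraction.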
Another great thing about separating it into (at least) two classes is that now the Employee class can clearly manage its own data and do it very well, because it does not need to take care of other difficult things.
Question 6: So what should the constructor do in the newly created Employee class? It is simple. It should take the arguments, check whether they are valid (an age probably shouldn't be negative, a name shouldn't be empty), raise an error when the data is invalid and, if the validation passes, assign the arguments to private variables of the entity. It now cannot communicate with the database, because it simply has no idea how to do it.
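A constructor along those lines could look roughly like this in Python. The name and age attributes are taken from the examples above; the exact validation rules and the use of ValueError are my assumptions.

```python
from typing import Optional


class Employee:
    def __init__(self, name: str, age: int, id: Optional[str] = None) -> None:
        # Validate first, so an invalid Employee can never exist.
        if not name.strip():
            raise ValueError("name must not be empty")
        if age < 0:
            raise ValueError("age must not be negative")
        # Validation passed: store the values in private attributes.
        self._id = id
        self._name = name
        self._age = age

    @property
    def id(self) -> Optional[str]:
        return self._id

    @property
    def name(self) -> str:
        return self._name

    @property
    def age(self) -> int:
        return self._age
```

Note that nothing in this class knows about a database; it only guards its own data.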
Question 4: Cannot be answered at all, not generally, because the answer heavily depends on what exactly you need.
Question 5: Now that you have separated the bloated class into two, you can have multiple update methods directly on the Employee class, like changeUsername and markAsDeceased, which will manipulate the data of the Employee class only in RAM. You could then add a method such as registerDirty from the Unit of Work pattern to the repository class, through which you let the repository know that this object has changed properties and will need to be updated once you call the commit method.
Obviously, to be updated an object needs to have an id and thus be already saved, and it is the repository's responsibility to detect this and raise an error when that criterion is not met.
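Here is a rough Python sketch of that flow. The dictionary standing in for the database table, the stored column names and the snake_case spellings of changeUsername, markAsDeceased and registerDirty are assumptions made for the example; a real Unit of Work would also track new and removed objects.

```python
from typing import Dict, List, Optional


class Employee:
    def __init__(self, username: str, id: Optional[str] = None) -> None:
        self.id = id
        self.username = username
        self.deceased = False

    def change_username(self, username: str) -> None:
        if not username:
            raise ValueError("username must not be empty")
        self.username = username  # only the in-memory object changes

    def mark_as_deceased(self) -> None:
        self.deceased = True


class EmployeeRepository:
    def __init__(self, table: Dict[str, dict]) -> None:
        self._table = table  # stand-in for a real database table
        self._dirty: List[Employee] = []

    def register_dirty(self, employee: Employee) -> None:
        # An update only makes sense for an entity that was saved before.
        if employee.id is None:
            raise ValueError("cannot update an employee that was never saved")
        if employee not in self._dirty:
            self._dirty.append(employee)

    def commit(self) -> None:
        # Write every registered change in one go, then forget them.
        for employee in self._dirty:
            self._table[employee.id] = {
                "username": employee.username,
                "deceased": employee.deceased,
            }
        self._dirty.clear()
```

Nothing reaches the storage until commit is called, so several changes to the same employee end up as a single write.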
Question 3: If you decide to go with the Unit of Work pattern, the create method will now be registerNew. If you do not, I would probably call it save instead. The goal of a repository is to provide an abstraction between the domain and the data layer. Because of this, I would recommend that this method (be it registerNew or save) accept the Employee object and leave it up to the classes implementing the repository interface which attributes they take out of the entity. Passing an entire object is better, so you do not need to have many optional parameters.
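As an illustration, a save method that takes the whole entity might look like this in Python. The username and full_name attributes, the counter used to generate ids and the row_for helper are hypothetical; they only exist to keep the sketch self-contained.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Employee:
    username: str
    full_name: str
    id: Optional[str] = None  # stays None until the entity is saved


class InMemoryEmployeeRepository:
    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, str]] = {}
        self._ids = itertools.count(1)

    def save(self, employee: Employee) -> None:
        # The caller hands over the entity; no long list of optional parameters.
        if employee.id is not None:
            raise ValueError("employee has already been saved")
        employee.id = str(next(self._ids))
        # This implementation decides which attributes it persists.
        self._rows[employee.id] = {
            "username": employee.username,
            "full_name": employee.full_name,
        }

    def row_for(self, id: str) -> Dict[str, str]:
        # Hypothetical helper to inspect what was stored.
        return dict(self._rows[id])
```

If Employee later gains a new attribute, the signature of save stays the same; only the implementations that care about the attribute change.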
Question 2: Both methods will now be a part of the repository interface and they do not violate the single responsibility principle. The responsibility of the repository is to provide CRUD operations for the Employee objects, and that is what it does (besides Read and Delete, CRUD covers both Create and Update). Obviously, you could split the repository even further by having an EmployeeUpdateRepository and so forth, but that is rarely needed and a single implementation can usually contain all CRUD operations.
Question 1: You ended up with a simple Employee class which will now (among other attributes) have id. Whether the id is filled or empty (or null) depends on whether the object has been already saved. Nonetheless, an id is still an attribute the entity owns and the responsibility of the Employee entity is to take care of its attributes, hence taking care of its id.
Whether an entity does or does not have an id usually does not matter until you try to do some persistence logic on it. As mentioned in the answer to question 5, it is the repository's responsibility to make sure you aren't trying to save an entity which has already been saved or trying to update an entity without an id.
Important note
Please be aware that although separation of concerns is great, actually designing a functional repository layer is quite tedious work and, in my experience, a bit more difficult to get right than the active record approach. But you will end up with a design which is far more flexible and scalable, which may be a good thing.