15

Can I get some thoughts on the entity design?

Let's say I have an entity called Book. Let's say I create a specific instance of that book. It has a title A and an author B. Since there can be multiple books, should the amount be included in the entity OR should I instead create a separate entry for each of the same book in my database ?

So in each case, if the amount of books were 30, database would look respectively:

id = 1, title = A, author = B, amount = 30;

vs

id = 1, title = A, author = B; 
id = 2, title = A, author = B;
...
id = 30, title = A, author = B;

Which method (if any) is considered a good practice? Since in this case there may be multiple identical books and I may want to update them, having one entry in database which would include the amount seems easier to update than n amount of db entries which only differ by id (every other data for that specific book being identical).

I am developing using Java and Spring Boot if that matters.

EDIT: all the answers I have received were very helpful to me. Too bad I can't accept all of them as an answer so I will have to go with the seniority!

Doc Brown
  • 199,015
  • 33
  • 367
  • 565
Kleronomas
  • 169
  • 1
  • 6
  • Just a quick confirmation: when you say amount of books, you mean the quantity, not the price, don't you? – Christophe Mar 20 '21 at 00:11
  • 1
    You seem to be missing everything that matters. id = 1, title = A, author = B, amount = 30 Is by no means “vs” anything. All that matters is that your other examples include less detail. – Robbie Goodwin Mar 20 '21 at 20:44
  • @RobbieGoodwin I am not sure I understand you. My examples are potential representations in a database. First one contains amount, which is here set to 30. If I wanted to get amount in the second example, I would use a query to get an amount of the records. – Kleronomas Mar 20 '21 at 21:10
  • My question is why do you need to duplicate the same entity? for example if you want to record book sales, you can make an additional table for orders containing the id of the book and its price at the moment of the sale. – Alex Mar 21 '21 at 12:10
  • What you propose would be a violation of 3NF. – user207421 Mar 22 '21 at 00:37
  • I think you're looking for the [Fungible](https://en.wikipedia.org/wiki/Fungibility) concept. For fungible items, you track amounts. – MSalters Mar 22 '21 at 14:33

4 Answers4

99

This is not a question of "good practice", but a question of the requirements of the system. For example:

  • Let's say your system is for a library. If the library has several instances of the same book, each copy will have a individual library id and individual attributes like its age, who borrowed it at what date, which condition the copy has, and maybe some more.

  • Let's say your system is for an online book shop, mainly selling new books. Hundreds of the same copy of the book are sold daily, and the copies have all the same price and are exchangeable. Then the system will probably not give each book an own identity, and it is not important to keep track of each individual item. So storing an amount for each book will probably way more useful within the system.

Often people run into the common misconception that there are "best practices" which can replace a thorough requirements analysis - do yourself a favor and don't fall into that trap. Data models are not living in "thin air", one needs context to make the right modeling decisions.

Doc Brown
  • 199,015
  • 33
  • 367
  • 565
  • 22
    @Polygorial The two models here are those that OP proposed, and Doc Brown illustrated how both are valid and none is strictly better than the other in general. Model 1 may not be the ideal model for the suggested context, but that's not the point of the answer. The point is that OP has to think about their own context, and doing so, they will hopefully recognize and consider the issue you pointed out (if it is relevant in their context). – Silly Freak Mar 20 '21 at 19:08
  • 1
    There is a lot of confusion throughout comment and answers about what "book" *even means*. Yours is the one that best addresses that distinction, but I feel there is room to be even more direct. That is, it's not just about the "system", but of what the *word* is supposed to refer to. Despite having the same name, the "book" that is a physical object is not the same as the "book" that is just the content and metadata as information. Independently of the system as a whole, if "book" means "object", quantity does not fit. If that's required, it belongs elsewhere. – TheRubberDuck Mar 22 '21 at 16:06
21

It depends on the identity of your items:

  • In a book store, books having the same title and same author all look identical and interchangeable. You would have one instance, holding the total quantity.
  • In a second-hand book store, books having the same title and the same author do not look identical: each has a different origin, is more or less used state, may have distinctive marks. Not only could you distinguish each of them, but moreover, you may even want to have an individualized price or you want to track where each of it is in the shop. You would a separate instance for each distinct individual.
  • In an art book store, you could have fine art books printed in limited editions, with a serial number on each item. There are two common strategies for this: one is to have multiple instances, each bearing the serial number of the individual book. However, in large scale industrial or commercial applications, the strategy is often to manage one singe instance of the book, and have a separate associated entity with distinct instances for the serial numbers. The business process then foresee an extra step to ask for serial numbers when ever you reduce or increase the quantity.
  • You may have different sets of books sharing the same title and the same authors, but with different editions. Or you have several sets of books purchased at different times: the older ones are dusty and passed color due to exposure to light. You may then want to manage different batches of the same book. You would have a book entity associated with batch entities (lots): one book instance could be associated with many batch instances with the quantity maintained at the batch level. This is not really common for books, but for pharmaceuticals and chemical products it's always like this (batches have in this case also expiration dates).
  • You could also for practical reasons decide to manage arbitrary sets of books. For example if you have a huge warehouse dominated by robots, with boxes of the same book on different shelves. You could then either manage stock keeping units (i.e. manage individual boxes of books referring to the same book instance), or manage each box as if it were a batch of books.

Several scenarios above can be combined. And there are other variants as well. In a typical ERP you'd have a product (book) with only the attributes of the product (title, author, ISBN, publisher, year), and then you would have entities for managing the copies of a book in warehouses (e.g. book B in quantity x in warehouse Y of subsidiary Z), and probably as well entities for managing the product in the stores (e.g. book B in quantity x at store/point of sale W of the subsidiary Z at a selling price s).

So in the end, it's not about the best approach, but the most suitable one. The fundamental question is what is the identifiable items that you are interested in or that you need to track individually.

Glorfindel
  • 3,137
  • 6
  • 25
  • 33
Christophe
  • 74,672
  • 10
  • 115
  • 187
1

It depends on what are you going to do with it.

If every entity needs to be distinguished from the others or needs to have its own history, then every entity should be represented by a separate record in the database, and the amount should not be its attribute. Examples: You are renting cars, or books, or you are selling tickets for flights and every ticket has its specific seat number.

If entities don't need to be distinguished and have no own history, then there is no sense to represent every entity by a separate record, a single record for all entities of this type will be sufficient, and the amount will be one of attributes of such entity. Examples: You are selling books, selling museum entry tickets.

mentallurg
  • 854
  • 4
  • 9
-3

You have a million copies of the latest Harry Potter book. Does it seem at all reasonable to have a million rows in the database?

You’d have one table with book ids, authors and titles (which is probably not enough to uniquely identify a book). And the you have a second table where each row contains an id for the row, and a book Id and an amount of books.

Why not one table with author, title and amount? Because you are stuck when you have a use of a book that doesn’t involve amounts.

gnasher729
  • 42,090
  • 4
  • 59
  • 119
  • 1
    Could you give an example of a book that doesn't involve amounts? – ojs Mar 20 '21 at 20:42
  • @ojs: One example would be for books that are permanently out of print, as opposed to books that are just temporarily out of stock. Both technically have `amount=0`. Using hacks like `amount=-1` to mean "out of stock" is asking for problems. – MSalters Mar 22 '21 at 14:30