194

I was involved in a programming discussion today where I made some statements that basically assumed axiomatically that circular references (between modules, classes, whatever) are generally bad. Once I got through with my pitch, my coworker asked, "what's wrong with circular references?"

I've got strong feelings on this, but it's hard for me to verbalize concisely and concretely. Any explanation that I may come up with tends to rely on other items that I too consider axioms ("can't use in isolation, so can't test", "unknown/undefined behavior as state mutates in the participating objects", etc.), but I'd love to hear a concise reason for why circular references are bad that don't take the kinds of leaps of faith that my own brain does, having spent many hours over the years untangling them to understand, fix, and extend various bits of code.

Edit: I am not asking about homogenous circular references, like those in a doubly-linked list or pointer-to-parent. This question is really asking about "larger scope" circular references, like libA calling libB which calls back to libA. Substitute 'module' for 'lib' if you like. Thanks for all of the answers so far!

dash-tom-bang
  • 2,543
  • 2
  • 16
  • 11
  • Does circular reference pertain to libraries and header files? In a workflow, new ProjectB code will be processing a file that's output from legacy ProjectA code. That output from ProjectA is a new requirement driven by ProjectB; ProjectB has a code that facilitates generically determining which fields go where, etc. The point being, legacy ProjectA could **reuse code** in new ProjectB, and ProjectB would be foolish not to reuse utility code in legacy ProjectA (e.g.: character set detection and transcoding, record parsing, data validation and transformation, etc.). – Luv2code Oct 02 '15 at 16:11
  • 1
    @Luv2code It only becomes foolish if you cut and paste the code between projects or possibly when both projects compile and link in the same code. If they're sharing resources like this, put them into a library. – dash-tom-bang Feb 11 '16 at 01:50

14 Answers14

262

There are a great many things wrong with circular references:

  • Circular class references create high coupling; both classes must be recompiled every time either of them is changed.

  • Circular assembly references prevent static linking, because B depends on A but A cannot be assembled until B is complete.

  • Circular object references can crash naïve recursive algorithms (such as serializers, visitors and pretty-printers) with stack overflows. The more advanced algorithms will have cycle detection and will merely fail with a more descriptive exception/error message.

  • Circular object references also make dependency injection impossible, significantly reducing the testability of your system.

  • Objects with a very large number of circular references are often God Objects. Even if they are not, they have a tendency to lead to Spaghetti Code.

  • Circular entity references (especially in databases, but also in domain models) prevent the use of non-nullability constraints, which may eventually lead to data corruption or at least inconsistency.

  • Circular references in general are simply confusing and drastically increase the cognitive load when attempting to understand how a program functions.

Please, think of the children; avoid circular references whenever you can.

Aaronaught
  • 44,005
  • 10
  • 92
  • 126
  • 37
    I particularly appreciate the last point, "cognitive load" is something that I am very conscious of but never had a great concise term for it. – dash-tom-bang Oct 14 '10 at 16:40
  • 6
    Good answer. It would be better if you said something about testing. If modules A and B are mutually dependent, they must be tested together. This means they are not really separate modules; together they are one broken module. – kevin cline Oct 29 '12 at 18:35
  • @kevincline: While that's a true statement, I haven't been able to convince myself that it's germane to *circular* references; even a one-way module dependency implies that at least one module must be tested together with another (dependent) module, unless some type of abstraction is used - in which case, the same abstraction would make the circular reference testable. If I'm overlooking something, can you clarify with a specific example? – Aaronaught Oct 29 '12 at 22:46
  • @Aaronaught Circular referenced objects are really hard to serialize. Add this one too :). – AnyOne Oct 30 '12 at 10:56
  • @Aaronaught: if A depends on B, then you can test B alone. If they are mutually dependent, you can't test either alone, so they aren't separate in any useful way. – kevin cline Oct 30 '12 at 15:23
  • @kevincline: If A and B are classes then I suppose I'd say yes (mostly)... but if A and B are interfaces and the methods themselves aren't circular references then both are still testable; and if A and B are modules, assemblies, etc., then they may both still be testable. No question these represent design problems, but to say that they are meaningfully the same entity implies that the conflict should be solved by merging, whereas typically I find that it's better separation (i.e. through abstraction) that's needed. – Aaronaught Oct 31 '12 at 00:40
  • 5
    Dependency injection is not impossible with circular references, even with automatic DI. One will just have to be injected with a property rather than as a constructor parameter. – BlueRaja - Danny Pflughoeft Apr 29 '14 at 15:24
  • 3
    @BlueRaja-DannyPflughoeft: I consider that an anti-pattern, as do many other practitioners of DI, because (a) it's not clear that a property is actually a dependency, and (b) the object being "injected" can't easily keep track of its own invariants. Worse, many of the most sophisticated/popular frameworks like Castle Windsor can't give useful error messages if a dependency can't be resolved; you end up with an annoying null reference instead of a detailed explanation of exactly which dependency in which constructor couldn't be resolved. Just because you *can*, doesn't mean you *should*. – Aaronaught Apr 30 '14 at 01:53
  • 3
    I wasn't claiming it's a good practice, I was just pointing out it's not impossible as claimed in the answer. – BlueRaja - Danny Pflughoeft Apr 30 '14 at 03:09
  • 1
    The definition of naive algorithms is that they crash in edge cases. This isn't the fault of the edge case, its the fault of the naive algorithm. – B T Jul 11 '15 at 08:11
  • @BT more generally I don't see the problem with a design where the developer might be more likely to code a solution that crashes a program. Crashing errors at runtime are not an issue, they are loud and you just need to fix them. – DPM Feb 18 '18 at 13:39
  • Why do you say dependency injection is impossible? There are multiple ways to perform dependency injection, you do not have to always use the constructor. – Ini Jun 23 '18 at 11:51
  • And even using constructor injection, there's possibility to pass Lazy as constructor argument, like `public ClassB(Lazy classA)`. See https://stackoverflow.com/a/25671236/4535033 – Mikhail Tumashenko Nov 20 '18 at 12:04
  • Circular object references also cause issues with refcounting-based GCs. – Solomon Ucko Oct 19 '19 at 18:12
  • Another kind of circular reference can be found in automated package building , when used with tools like minVer. This tool uses tag information on commits to determine which is the actual version of the package. This makes packaging process of the sources depend on the GIT repository, which itself manage history of the source files. This makes a circular reference, and your software is unable to build properly a package without the git repository. – Anthony Brenelière Mar 08 '21 at 16:56
25

A circular reference is twice the coupling of a non-circular reference.

If Foo knows about Bar, and Bar knows about Foo, you have two things that need changing (when the requirement comes that Foos and Bars must no longer know about each other). If Foo knows about Bar, but a Bar doesn't know about Foo, you can change Foo without touching Bar.

Cyclical references can also cause bootstrapping problems, at least in environments that last for a long time (deployed services, image-based development environments), where Foo depends on Bar working in order to load, but Bar also depends on Foo working in order to load.

Frank Shearar
  • 16,643
  • 7
  • 48
  • 84
17

When you tie two bits of code together, you effectively have one large piece of code. The difficulty of maintaining a bit of code is at least the square of its size, and possibly higher.

People often look at single class (/function/file/etc.) complexity and forget that you really should be considering the complexity of the smallest separable (encapsulatable) unit. Having a circular dependency increases the size of that unit, possibly invisibly (until you start trying to change file 1 and realize that also requires changes in files 2-127).

Alex Feinman
  • 5,792
  • 2
  • 27
  • 48
15

They may be bad not by themselves but as an indicator of a possible poor design. If Foo depends on Bar and Bar depends on Foo, it is justified to question why they are two instead of a unique FooBar.

mouviciel
  • 15,473
  • 1
  • 37
  • 64
13

Hmm... that depends on what you mean by circular dependence, because there are actually some circular dependencies which I think are very beneficial.

Consider an XML DOM -- it makes sense for every node to have a reference to their parent, and for every parent to have a list of its children. The structure is logically a tree, but from the point of view of a garbage collection algorithm or similar the structure is circular.

Billy ONeal
  • 8,073
  • 6
  • 43
  • 57
  • 1
    wouldn't that be a tree? – Conrad Frix Oct 14 '10 at 01:31
  • @Conrad: I suppose it could be thought of as a tree, yes. Why? – Billy ONeal Oct 14 '10 at 02:14
  • 1
    I don't think of tree's as circular because you can navigate down its children and will terminate (regardless of the parent reference). Unless a node had a child that was also a ancestor which in my mind makes it a graph and not a tree. – Conrad Frix Oct 14 '10 at 15:21
  • 5
    A circular reference would be if one of the children of a node looped back to an ancestor. – Matt Olenik Oct 14 '10 at 17:04
  • This isn't _really_ a circular dependency (at least not in a way that causes any issues). For example, imagine that `Node` is a class, which has other references to `Node` for the children inside itself. Because it's only referencing itself, the class is completely self-contained and isn't coupled to anything else. --- With this argument, you could argue that a recursive function is a circular dependency. It _is_ (at a stretch), but not in a bad way. – byxor Nov 05 '16 at 14:00
9

Is like the Chicken or the Egg problem.

There are many cases in which circular reference are inevitable and are useful but, for example, in the following case it doesn't work:

Project A depends on project B and B depends on A. A needs to be compiled to be used in B which requires B to be compiled before A which requires B to be compiled before A which ...

Victor Hurdugaci
  • 3,243
  • 18
  • 20
6

While I agree with most of the comments here I would like to plead a special case for the "parent"/"child" circular reference.

A class often needs to know something about its parent or owning class, perhaps default behavior, the name of the file the data came from ,the sql statement that selected the column, or, the location of a log file etc.

You can do this without a circular reference by having a containing class so that what was previously the "parent" is now a sibling, but it is not always possible to re-factor existing code to do this.

The other alternative is to pass all the data a child might need in its constructor, which end up being just plain horrible.

James Anderson
  • 18,049
  • 1
  • 42
  • 72
  • On a related note, there are two common reasons X might hold a reference to Y: X might want to ask Y to do things on X's behalf, or Y might be expecting X to do things to Y, on Y's behalf. If the only references that exist to Y are for the purpose of other objects wanting to do things on Y's behalf, then the holders of such references should be told that Y's services are no longer needed, and that they should abandon their references to Y at their convenience. – supercat Oct 15 '18 at 21:53
5

In database terms, circular references with proper PK/FK relationships make it impossible to insert or delete data. If you can't delete from table a unless the record is gone from table b and you can't delete from table b unless the record is gone from table A, you can't delete. Same with inserts. this is why many databases do not allow you to set up cascading updates or deletes if there is a circular reference because at some point, it becomes not possible. Yes you can set up these kind of relationships with out the PK/Fk being formally declared but then you will (100% of the time in my experience) have data integrity problems. That's just bad design.

HLGEM
  • 28,709
  • 4
  • 67
  • 116
5

I'll take this question from modelling point of view.

As long as you don't add any relationships that aren't actually there, you are safe. If you do add them, you get less integrity in data (cause there is a redundancy) and more tightly coupled code.

The thing with the circular references specifically is that I haven't seen a case where they would be actually needed except one - self reference. If you model trees or graphs, you need that and it is perfectly all right because self-reference is harmless from the code-quality point of view (no dependency added).

I believe that at the moment you start to need a not-self reference, immediately you should ask if you can't model it as a graph (collapse the multiple entities into one - node). Maybe there is a case in between where you make a circular reference but modelling it as graph is not appropriate but I highly doubt that.

There is a danger that people think that they need a circular reference but in fact they don't. The most common case is "The-one-of-many case". For instance, you have got a customer with multiple addresses from which one should be marked as the primary address. It is very tempting to model this situation as two separate relationships has_address and is_primary_address_of but it is not correct. The reason is that being the primary address is not a separate relationship between users and addresses but instead it is an attribute of the relationship has address. Why is that? Because its domain is limited to the user's addresses and not to all the addresses there are. You pick one of the links and mark it as the strongest (primary).

(Going to talk about databases now) Many people opt for the two-relationships solution because they understand to "primary" as being a unique pointer and a foreign key is kind of a pointer. So foreign key should be the thing to use, right? Wrong. Foreign keys represent relationships but "primary" is not a relationship. It is a degenerated case of an ordering where one element is above all and the rest is not ordered. If you needed to model a total ordering you would of course consider it as a relationship's attribute because there is basically no other choice. But at the moment you degenerate it, there is a choice and quite a horrible one - to model something that is not a relationship as a relationship. So here it comes - relationship redundancy which is certainly not something to be underestimated. The uniqueness requirement should be imposed in another way, for instance by unique partial indexes.

So, I wouldn't allow a circular reference to occur unless it is absolutely clear that it comes from the thing I am modelling.

(note: this is slightly biased to database design but I would bet it is fairly applicable to other areas too)

clime
  • 1,564
  • 13
  • 15
  • In your opinion should we use circular reference for domain models? Consider this case, there are two entities, Student and Course. The relationship between them is many-to-many. When I fetch the Student entity, I want to know all the courses a student has enrolled in. When I fetch the Course, I want to know all the students who have enrolled in this course. In this case, both the Student and the Course will have a reference to each other. What if I want to support some complex requirement like when I fetch a student, I want to show the other students(coursemates) enrolled in a course? – Hem Bhagat Jul 10 '23 at 06:38
  • In a real case, you will probly end up with a third object like "enrollment" that carries a link to a course and a student + additional info like when the student has practice lessons or e.g. grades. If there is no additional info needed for the relation then probly there should be only a link Student -> Course in db explicitly (of course the opposite relation follows implicitly). If there is a need for speed (e.g. to quickly get students for a course), then one can cache the queries. But this is just from my POV ;). – clime Jul 10 '23 at 08:33
  • I agree. A link entity makes sense in the case of a many-to-many relationship. But how would you design models in case of a one-to-many relationship? The one-to-many relationship also introduces circular reference in models. Example: One User has Many Orders. When fetching a User, I need all its Orders and when fetching an Order I need the User associated with it. Would you prefer doing a separate query for any one of the Entities and keep the link in the other one? Or would you go with two different domain models for each entity, one with the link and one without the link? – Hem Bhagat Jul 10 '23 at 09:42
2

I'd answer that question with another question:

What situation can you give me where keeping a circular reference model is the best model for what you're trying to build?

From my experience, the best model will pretty much never involve circular references in the way I think you mean it. That being said, there are a lot of models where you use circular references all the time, it's just extremely basic. Parent -> Child relationships, any graph model, etc, but these are well known models and I think you're referring to something else entirely.

Joseph
  • 533
  • 3
  • 8
  • 1
    It MAY be that a circular linked list (single-linked or double-linked) would be an excellent data structure for the central event queue for a program that's supposed to "never stop" (stick the important N things on the queue, with a "do not delete" flag set, then simply traverse the queue until empty; when new tasks (transient or permanent) are needed, stick them in a suitable place on the queue; whenever you serve an even without the "do not delete" flag, do it then take it off the queue). – Vatine Oct 14 '10 at 14:11
1

Some garbage collectors have trouble cleaning them up, because each object is being referenced by another.

EDIT: As noted by the comments below, this is true only for an extremely naive attempt at a garbage collector, not one that you would ever encounter in practice.

shmuelp
  • 286
  • 1
  • 4
  • 11
    Hmm.. any garbage collector tripped up by this isn't a true garbage collector. – Billy ONeal Oct 14 '10 at 00:37
  • 11
    I don't know of any modern garbage collector which would have problems with circular references. Circular references are a problem if you're using reference counts, but most garbage collectors are tracing style (where you start with the list of known references and follow them to find all others, collecting everything else). – Dean Harding Oct 14 '10 at 00:40
  • 4
    See http://sct.ethz.ch/teaching/ws2005/semspecver/slides/takano.pdf who explains the drawbacks to various types of garbage collectors -- if take mark and sweep and start optimizing it to reduce the long pause times (e.g. creating generations), you start to have problems with circular structures (when circular objects are in different generations). If you take reference counts and start fixing the circular reference problem, you end up introducing the long pause times are characteristic of mark and sweep. – Ken Bloom Oct 14 '10 at 13:45
  • If a garbage collector looked at Foo and deallocated its memory which in this example references Bar it should handle the removal of Bar. Thus at this point there is no need for garbage collector to go ahead and remove bar because it already did. Or vice versa, if it removes Bar which references Foo it shuold remove Foo too and thus it will not need to go remove Foo because it did so when it removed Bar? Please correct me if I am wrong. – Chris Oct 14 '10 at 13:46
  • (PDF article here: http://portal.acm.org/citation.cfm?id=1035292.1028982) – Ken Bloom Oct 14 '10 at 13:46
  • However, probably in today's split-heap garbage collectors, circular structures will *eventually* migrate to the same generation, at which point they can be collected, unlike pure reference counting where circular structures will *never* be collected. – Ken Bloom Oct 14 '10 at 13:49
  • Thanks everyone for the referesher. I suppose that is my punishment for not thinking about garbage collection for years. (Day job uses C++.) – shmuelp Oct 14 '10 at 13:59
  • 1
    In objective-c, circular references make it so the ref count doesn't hit zero when you release, which trips up the garbage collector. – Dexter Oct 14 '10 at 14:06
  • Only a problem for ref-counting garbage collectors and they have all sorts of interesting problems (like being unable to deal with a double-linked list or a tree with parent-pointers as well as child-pointers). – Vatine Nov 29 '13 at 10:34
  • And now entering the scene... Swift, with Active Reference Counting and strong reference cycles. I guess we will be encountering this problem in practice after all. – Michael Jul 20 '16 at 17:16
1

Circular references in data structures is sometimes the natural way of expressing a data model. Coding-wise, it's definitely not ideal and can be (to some extent) solved by dependency injection, pushing the problem from code to data.

Vatine
  • 4,251
  • 21
  • 20
1

A circular reference construct is problematic, not just from a design standpoint, but from an error catching standpoint as well.

Consider the possibility of a code failure. You haven't placed proper error catching in either class, either because you haven't developed your methods that far yet, or you're lazy. Either way, you don't have an error message to tell you what transpired, and you need to debug it. As a good program designer, you know what methods are related to what processes, so you can narrow it down to those methods relevant to the process that caused the error.

With circular references, your problems have now doubled. Because your processes are tightly bound, you have no way of knowing which method in which class might have caused the error, or from whence the error came, because one class is dependent on the other is dependent on the other. You now have to spend time testing both classes in conjunction to find out which one is really responsible for the error.

Of course, proper error catching resolves this, but only if you know when an error is likely to occur. And if you're using generic error messages, you're still not much better off.

Zibbobz
  • 1,522
  • 3
  • 13
  • 20
-3

In my opinion having unrestricted references makes program design easier, but we all know that some programming languages lack support for them in some contexts.

You mentioned references between modules or classes. In that case it's a static thing, predefined by the programmer, and it's clearly possible for the programmer to search for a structure that lacks circularity, though it might not fit the problem cleanly.

The real problem comes in circularity in run time data structures, where some problems actually can't be defined in a way that gets rid of circularity. In the end though - it's the problem that should dictate and requiring anything else is forcing the programmer to solve an unnecessary puzzle.

I'd say that's a problem with the tools not a problem with the principle.

gnat
  • 21,442
  • 29
  • 112
  • 288
Josh S
  • 9
  • Adding a one sentence opinion doesn't significantly contribute to the post or explain the answer. Could you elaborate upon this? –  May 11 '14 at 02:11
  • Well two points, the poster actually mentioned references between modules or classes. In that case it's a static thing, predefined by the programmer, and it's clearly possible for the programmer to search for a structure that lacks circularity, though it might not fit the problem cleanly. The real problem comes in circularity in run time data structures, where some problems actually can't be defined in a way that gets rid of circularity. In the end though - it's the problem that should dictate and requiring anything else is forcing the programmer to solve an unnecessary puzzle. – Josh S May 11 '14 at 06:37
  • I have found that it makes it easier to get your program up and running but that generally speaking it ultimately makes it harder to maintain the software since you find that trivial changes have cascading effects. A makes calls into B which makes calls back to A which makes calls back to B... I've found it's tough to truly understand the effects of changes of this nature, especially when A and B are polymorphic. – dash-tom-bang May 14 '14 at 20:06