101

Is it an anti-pattern or code smell to put "general use" functions (examples below) into a catch-all file named "helpers" or "utils"?

It's a pattern I've seen quite a lot in the JavaScript world, which I've also seen with other languages.

By "general use" functions, I mean functions which abstract away some common, shared functionality used in an application/library and make them available for use in a general way. Some examples I've seen include returning a copy of an object with some keys omitted, transforming all null-ish values in a JavaScript object (eg. "", null, {}, []) to undefined, constructing a URL from a struct of parameters, transforming strings in some fashion, etc, etc.

I often come across applications or libraries with one (or more!) util.js or helpers.ts files which just seem to be, in my opinion, a dumping ground for unrelated functions. In my opinion, code is more readable and discoverable if it's named semantically.

If working with the examples above, I'd place them all in their own files (omit.js, null-to-undefined.js, url-builder.ts, etc), or if they are related functionality, group them (eg, deepClone, withoutKeys, shallowClone, in clone.js or similar).

I struggle to articulate why this seems like a code smell or anti-pattern to me. I think there are various things at play: creating the wrong abstraction or abstracting at the wrong time (YAGNI) meaning code is just dumped "somewhere", lack of foresight for future maintainability.

On the other hand, I've seen proponents of this pattern argue that it's a well-used pattern, and thus it has merit simply because it's well-used. Does this argument stand?

Interesting in anyone's thoughts on this, whether you think it's good or bad, and if you can explain the pros/cons better than I can!

old greg
  • 909
  • 2
  • 4
  • 6
  • 14
    I think when you are first starting out I see no harm in this. As your project grows, you might see the need for more organization which would mean multiple files and folders. At that time a refactor may be necessary if the helpers file grows to a point of not being manageable. – Jon Raynor Mar 02 '21 at 14:10
  • 7
    Do _not_ just make a file called `utils.ext` or `utilities.ext` and stick it in your main folder with all your other files. Segregate the utilities by "kind" and give the file a proper name. Alternatively, perhaps better, do call it "utils" but stick it in a subdirectory with a proper name. You will figure out why the second someone else (or you, 2 months from now) wants a _different_ "utils" file ... – davidbak Mar 02 '21 at 18:35
  • 9
    I would see it as bad code if same/ similar code was in more than one place – tgkprog Mar 02 '21 at 18:56
  • 8
    I personally use a util **folder** instead of a util file. Not every module need every util. So for example I have `util/formatters.js` and `util/strings.js` – slebetman Mar 03 '21 at 01:59
  • 2
    Related: https://stackoverflow.com/questions/40996614/whats-the-best-practice-for-creating-stateless-utility-classes-in-java – VGR Mar 03 '21 at 13:05
  • You seem to be somewhat aware of this, but JavaScript is often perceived as missing expected features and requiring mild hacks to make certain things act intuitively. I see a difference between a general "filling in perceived-missing language features and alternatives for functions perceived as broken" (which you might reasonably call "util" or "helpers"), and "random helper functions". – Esther Mar 04 '21 at 08:42
  • I use these classes when I'm writing small projects quickly. For instance, some JavaScript projects where I need a couple of geometry functions in a few different places (e.g. distance between points, radians to degrees, etc.). They tend to get dumped into utils. It's usually pretty minimal in size, tho. If this got bloated, or the project keeps getting bigger, it makes much more sense to put them in a shared file which is indicative of what it does. E.g. I'd rename that class `Geometry` and might have another for formatting text, etc. – AJFaraday Mar 04 '21 at 09:25
  • I think you're asking the "wrong" question, which is why the answers are all over the place. Obviously, you need to keep all your code structured, this isn't unique to an utils file. But what does structured mean? Is it better to put two separate but related pieces of code into one single file, or in two files? If the latter, do you put it in a separate folder or not? Do you refactor your code into files and folders as you go or do you try to get it "right" from the start? You're basically asking all these things at once, but specifically for utils files. Too many questions for one answer. – FrederikVds Mar 04 '21 at 09:49
  • huh, same question, no interest at all: https://stackoverflow.com/questions/61849056/is-a-utils-py-or-utils-hpp-cpp-file-evidence-of-a-poor-design – innisfree Mar 11 '21 at 04:43

9 Answers9

101

Both Java and C# have the existence proof that this works: the Math class.

The danger is the "kitchen drawer" problem, where a class simply gathers new kitchen tools and implements, many of which never get used, or only get used once.

So as long as you can keep such classes tightly focused on a theme (like the Math class is), I don't see a problem. In fact, static methods in classes of this kind tend to be easier to write, maintain and test, so long as they're always "pure" methods (i.e. referentially transparent, and you don't store static state in them).

Robert Harvey
  • 198,589
  • 55
  • 464
  • 673
  • 8
    Thanks for the comment! I agree that classes like `Math` which contain related functionality are useful, however I think the problem I am working against is a bit different. My problem is that arbitrary, unrelated code is just "dumped" into a `helpers.js` file because the author couldn't think of a better than or more logical place for the code to live. – old greg Mar 02 '21 at 20:02
  • 43
    That's a developer discipline problem, not a best practices problem. – Robert Harvey Mar 02 '21 at 20:34
  • 5
    @jimmysumshugar You could try a refactoring exercise of trying to create 3 (or some other number that you think is better) categories for the functions in the "helpers" file, and then see how many functions can be moved to more specific "helper" files. – FrustratedWithFormsDesigner Mar 02 '21 at 21:46
  • 12
    @RobertHarvey I read the question differently and I think this is what OP's comment is getting at -- Java/C#'s `Math` class is what OP believes is better. They're suggesting (or asking rather) that files of *unrelated* functions are bad. I.e., if all of Java's classes that have only static methods were in a single class called `Utils` or something. – Captain Man Mar 02 '21 at 22:05
  • 13
    This answer gives an impression like "this is good, so long as you keep it properly focused and give it descriptive name like 'math' for math helpers", but the question is asking about catch-all (not focused on a theme) files with names like "utils" or "helpers" which give only the loosest of hints about what's going on inside them. (Maybe this answer is next-level human-chess plays? Reel in the kind of people who put stuff in a "utils" file with affirmation, and then inject them with some actual wisdom about what they should be doing while they're receptive?) – mtraceur Mar 03 '21 at 03:37
  • 3
    Thanks for the comments folks. I agree this is a dev discipline problem! And yes, I am talking about the situation where there are unfocused 'util' classes. I am searching for eloquent answers as to _why_ this is harmful to a codebase and how better organisation around focused modules helps readability and maintenance. My problem is that I am rebuffed whenever I try to discuss this because it's either "unimportant" or "that's just how it's done" – old greg Mar 03 '21 at 10:53
  • 3
    @oldgreg: Respectfully, you're not going to win this argument with your colleagues by arguing "best practices." Lazy is as lazy does. – Robert Harvey Mar 03 '21 at 15:01
  • For a good example of theme-grouped "utils" in a mature project, see [Laravel's support directory](https://github.com/laravel/framework/tree/22b229bc644d3800903ca74f674409b9972bf674/src/Illuminate/Support). There is still a miscellaneous `helpers.php` containing a dozen or so very simple pure functions, but most of the reusable code is grouped by theme, eg. Str.php, ProcessUtils.php, InteractsWithTime.php, etc. All of the function names are descriptive and usage-agnostic. – Mikkel Mar 05 '21 at 16:59
41

My criterion is: If you needed a function from the "helpers" file, would you find it? Or would you write your own version, use it, and put it into the "helpers" file together with two other almost identical functions?

So this needs to be organised, and should be done in such a way that your development tools can help you. For example, with Xcode / Swift you can add extensions to existing classes, and methods in these extensions will be used as code suggestions and are reasonably easy to find. On the other hand, if you add a standalone function, there are so many standalone function, you have no chance to find it unless it is named really well. Other development systems might want different approaches.

gnasher729
  • 42,090
  • 4
  • 59
  • 119
  • 5
    C# also has "Extension Methods." They are essentially syntactic sugar over ordinary static methods, although they do associate the methods with the type they operate on and work well with Intellisense in Visual Studio. – Robert Harvey Mar 02 '21 at 23:03
  • Kotlin has extension methods, too, with (what sound like) the same advantages.  In my experience converting Java apps to Kotlin, most helper methods fit very naturally as extension methods, with very few stand-alone functions needed.  — This suggests that in other languages, stand-alone functions could be organised according to the type they deal with: StringUtils, CharUtils, CollectionUtils, ArrayUtils,… to keep them focussed and easy to locate. – gidds Mar 04 '21 at 21:52
35

Putting commonly used functions into a single file is not on its own an antipattern. But there are many antipatterns associated with this:

  • Grouping functions whose only common property is reuse is not very maintainable. Usually grouping should be based on some stronger relatedness criteria. For example, the API part of an application might have utility methods for returning common responses, which is completely fine.
  • Overly generic names, like an isEmpty which only really checks whether a string value is empty.
  • Overly specific names, where nobody would guess that isFooCompatibleString really should be isNotEmptyString.
  • Lack of testing, spreading subtle bugs until some code inevitably starts relying on the buggy behaviour.
  • "Utility" functions which are only used in one place should not be moved into a utility file - YAGNI.
  • In interpreted languages like JavaScript I believe importing a single function from a file results in the whole file being loaded into memory (somebody will be able to correct me on this). When writing an application where performance or size is paramount this could be a problem.
l0b0
  • 11,014
  • 2
  • 43
  • 47
  • 2
    For the last point, JavaScript is client-side, so by definition it MUST load the whole file before the browser even tries to parse it. By extension, this means that every tiny little thing included in JS will have to all be loaded before it runs. If any of them returns a 404 or some other network error, your page stops loading from that point onward. If it happens early enough you get an entirely blank page (e.g. a failed jQuery include in the `header` section. – Nelson Mar 03 '21 at 04:35
  • 18
    @Nelson JS hasn't been purely client side for over a decade. It's also false to say that a webpage immediately stops loading once the first linked script or resource 404s. – coagmano Mar 03 '21 at 06:34
  • Regarding the 'overly generic names' - some languages (e.g. C#, C++) have the concept of overloading, where different functions have the same name and (hopefully) the same semantics, but work on different types. Then you can actually keep adding functions to extend the functionality while keeping them all in the same place because logically they are the same. – Asaf Mar 03 '21 at 08:01
  • 1
    In JavaScript, loading a module will parse everything, but I think the objects that aren't imported or referenced from the objects you import will quickly become garbage. The same with using `from module import ...` in Python. – Barmar Mar 03 '21 at 15:06
  • 3
    `isFooCompatibleString` is different from `isNotEmptyString`, as the definition of ‘foo-compatible’ may potentially change over time (depending on what foo is), while non-empty strings will always be the same thing until the end of time. – user3840170 Mar 04 '21 at 09:01
  • 1
    @user3840170 That's a good comment, but it's also important to note that what the answer is saying is that sometimes a function *should* be called `is_not_empty_string`, not `is_foo_compatible_string`, because that is not the appropriate place to abstract the details of what makes a string foo-compatible, but the unfocused concept soup that is "utils" files makes it easy to end up in a situation where it's hard to even determine if the function should current be called that, or should have ever been called that. – mtraceur Mar 04 '21 at 23:24
  • @Barmar unfortunately, that won't happen in Python unless the runtime can prove that the imported pieces are not using the unimported pieces. I'm not sure that's even allowed by the semantics: once a module is imported, it's cached in memory by the module loading system - when you do `from foo import bar`, `foo` never becomes unreachable! `foo` is stored in some dictionary of loaded modules somewhere, and everything within `foo` is thus reachable for the lifetime of the program as well. But conceptually you're right: whole-program analysis by sufficiently advanced optimization could do that. – mtraceur Mar 04 '21 at 23:30
  • @mtraceur Yeah, forgot about the module dictionary. – Barmar Mar 04 '21 at 23:43
  • @mtraceur Then the problem is not ‘overly specific name’, but ‘unfocused concept soup’, and yet this point of the answer does not actually say anything about that. – user3840170 Mar 05 '21 at 12:49
  • 1
    @user3840170 and @mtraceur are both correct. `isFooCompatibleString` doesn't belong in a general utils file/class, but it _should_ be abstracted somewhere adjacent to `Foo`, even if it (currently) only calls out to `isNotEmptyString`. – Mikkel Mar 05 '21 at 16:50
  • @user3840170 [What Mikkel said], with the caveat that every abstraction and indirection has a mental cost to readers of the code who are not already familiar with the abstraction. Sometimes it is better to just show people that you're checking for emptiness rather than some abstract notion of "`foo`-compatibility", so they can more easily mentally model what the code will actually *do* for some inputs. For example, if `is_foo_compatible_string` is only used in one constructor of the `Foo` class, then that function name is just redundant and obfuscating in that context. – mtraceur Mar 05 '21 at 20:34
  • That said, I agree with @user3840170 in that the answer is ambiguous in this regard - I thought it was obvious what it meant, but I can now see how that point in the answer can entail a failure to recognize when `is_foo_compatible_string` is more appropriate than `is_not_empty_string`. – mtraceur Mar 05 '21 at 20:40
  • `isEmpty` could also apply to Stomach, PetrolTank and Soul objects. – JBRWilkinson Mar 11 '21 at 09:29
16

It's all about the balance and good taste in scope.

It's perfectly fine to have a ConsoleUtils module that contains all Utilities that are helpful when working in console. Those can range from displaying tables, through creating users, to statistics code.

On the other hand, on one side lies the Kitchen Sink, on the other lies Functional Decomposition (Litter is my pet name for it) - thousands of modules each containing 1-2 methods, essentially re-creating procedural programming, but with lots of namespaces.

Having said above - reason why you run into this anti-pattern a lot in JavaScript - is because it's a scripting language that was not designed and is not meant to be used for programming anything else but simple forms. Language that until recently didn't even have iterators, classes, or anything really that modern programming languages have.

It's standard lib was (and still is really) also incredibly poor, hence the rise of "JavaScript frameworks" and having libraries to pad a string (something that's a part of stdlib even in C). Lots of those issues are still present in Node.js too.

Pang
  • 313
  • 4
  • 7
12

It's not a terrible thing, but if you can group them a little tighter than just "util"/"helpers", that's better. For example, shallowClone and clone and other related functionality sound like they should be in a clone-utils.js file.

FrustratedWithFormsDesigner
  • 46,105
  • 7
  • 126
  • 176
11

Its definitely a code smell, just because its such a common thing to collect together random unrelated code into a single "helper" library which everything then depends on and becomes a blocker to refactoring.

ie. you cant change helper.isNull() to return true for some new case, say a blank string because its so widely used you can't tell what it might break.

Also with a large collection of functions, other projects which use the library tend not to bother upgrading unless the function they use changes. Thus you end up with many projects using old very old versions which can't be updated due to cumulative changes resulting in large breaking changes to the library's use.

Vaguely related 'official' libraries such as Math avoid this problem because the functions in them are so rigorously defined that they never change.

But even in these cases we have issues, such as when cryptography specifications change or are interpreted differently on different platforms.

Ewan
  • 70,664
  • 5
  • 76
  • 161
9

It is not necessarily an anti-pattern per se. But it quickly devolves into one if care is not taken.

First things first: Going the complete opposite isn't good, either. If you have a folder util/ and put every function in its own file, like deepClone.ts, shallowClone.ts, then that is not any better. Instead of a file, your dumping ground becomes a directory. A directory has the added disadvantage that it blows up your import statement significantly.

So basically, that leaves us with the middle ground: Put some stuff together, and some stuff apart. Again, you could do this with one file per function, like util/clone/deep.ts, util/clone/shallow.ts. That way, you get to group semantically similar functions together. But you still blow up imports. So putting it into a single util/clone.ts file seems appropriate.

But... that is not what a single helpers.ts file is, is it? A single helpers.ts file often is a kitchen sink gobbling up completely unrelated functions. Its a mess, and nobody really knows what goes in there and what doesn't.

So, my answer would be: Utility files that group together static functions aren't bad. Java's Math class has already been given as an example. Apache Commons has IOUtils. But notice something? Both have a clearly defined theme/topic around which functions are grouped. They precisely aren't kitchen sink collections of completely unrelated stuff.

With the examples you have given, I would probably organize it like this:

.
├── util
│   ├── url.ts
│   ├── clone.ts
│   │   ├── shallowClone
│   │   └── deepClone
│   └── string.ts

I've only added a couple of actual functions, but you get the gist.

Polygnome
  • 2,039
  • 15
  • 15
4

Indeed "util.js" or "helper.js" tells you very little about what you can expect to find.

Will it help you compute a Fourier transform? Who knows?

If working with the examples above, I'd place them all in their own files (omit.js, null-to-undefined.js, url-builder.ts, etc), or if they are related functionality, group them (eg, deepClone, withoutKeys, shallowClone, in clone.js or similar).

One the other hand, a directory with a file for every function is bit unnavigable.

The answer is Hierarchy.

Code should be arrange in a hierarchical manner (directories, files, namespaces, classes, functions) so that someone looking for something can navigate there, making educated guesses about where the code would lie, if it existed.


Some examples I've seen

Were this my project, I'd put these in three files.

include returning a copy of an object with some keys omitted

transforming all null-ish values in a JavaScript object (eg. "", null, {}, []) to undefined

object.js

constructing a URL from a struct of parameters

url.js

transforming strings in some fashion, etc, etc.

string.js


But again, the key is organize with a non-extreme amount of content at each level.

Should I have a 5000-line or 50-function object.js file, I'd think about whether there are any sensible subdivisions of that.


EDIT: And there may be other considerations. For example, JavaScript bundlers may have a tough time tree-shaking unused code from files. You may choose to divide code into more files if this is the case.

Paul Draper
  • 5,972
  • 3
  • 22
  • 37
4

No, they're not an anti pattern.

However, if this file grows bigger than what you're comfortable with and there are actually big sets of functions with related functionalities, then it does become a smell.

By itself though, there's nothing wrong with having a random utility tool belt, if it's useful and it works, then don't let anyone else tells you that it's not fine.

However, you should regularly refactor the file and move functions that have outgrown the file into their own proper modules, otherwise there's a tendency that you'll just leave the functions there growing related functions without ever graduating into a proper module.

Putting single functions into dozens of files just to avoid having to have utilities file would've been maintenance nightmare. It'll just make it much harder to find what you need, adding additional complexity with very little benefit.

Utility functions often start out as utilities because you weren't sure what kind of related utilities you're going to be needing in the future. In such case, putting the function into their own files too early may mean the you'll end up with module names that may not make sense with whatever future functions that you end up creating. So putting it in the toolbelt is actually the less bad.

However, there's a couple additional considerations for certain scenario's.

First consideration is in front-end languages, such util files will eventually end up in a file that the user needs to download and will use up memory. If your util file ends up containing large number of functions that some large sections that your users never need to use, that's a practical problem you'll also need to address. You may opt for a compiler that can automatically remove unused functions, or you may want to do the reorganization manually, but either way any functions that the user have to download and execute should be either something that the user needs either now or they need it very soon.

Second consideration is if you're writing a library and you package and publish it for large number of other people to use, having such utilities would be more problematic because people will depend on them. Avoid having public functions in a util file/module. If a utility becomes so useful in your library that many people are seeking to use it, put it into a proper module.

Lie Ryan
  • 12,291
  • 1
  • 30
  • 41