I'm a self-taught, novice-ish coder, so I apologize if I don't nail the programmer lingo.

I'm working on a project in which I am providing data, which will be continually updated, to developers who will essentially create a tool to generate reports from queries on the data.

It seems that everyone involved thinks that they need to hard-code data values (not the schema, but the domains/values themselves) into the report-generation program.

For example, suppose we were reporting on personnel. The report would be split into categories: a heading for each department, subheadings for job titles under each department, and a list of employees under each subheading. The developers want to hard-code the departments and job titles. I would have thought they could instead query those values at runtime, sort the records by them, and generate the report headers dynamically from whatever values are present.

Since the list of potential values will change over time (e.g., departments will be created/renamed, new job titles will be added), the code will need to be continually updated. It seems to me that we could skip the code maintenance steps and dynamically organize the reports.
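
To make the dynamic approach concrete, here is a minimal Python sketch. It assumes the records arrive as dictionaries with `department`, `job_title`, and `name` keys; those field names, the sample rows, and the `build_report` function are all hypothetical, and the real tool may be structured quite differently.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows; in practice these would come from a query.
employees = [
    {"department": "Sales", "job_title": "Account Manager", "name": "Ana"},
    {"department": "IT", "job_title": "Developer", "name": "Raj"},
    {"department": "IT", "job_title": "Analyst", "name": "Mei"},
    {"department": "IT", "job_title": "Developer", "name": "Lou"},
]


def build_report(rows):
    """Return report lines whose headings are derived from the data itself."""
    lines = []
    # groupby only groups adjacent rows, so sort by the grouping keys first.
    ordered = sorted(rows, key=itemgetter("department", "job_title", "name"))
    for department, dept_rows in groupby(ordered, key=itemgetter("department")):
        lines.append(department)
        for title, title_rows in groupby(dept_rows, key=itemgetter("job_title")):
            lines.append(f"  {title}")
            for row in title_rows:
                lines.append(f"    {row['name']}")
    return lines


print("\n".join(build_report(employees)))
```

No department or job title appears in the code, so a new or renamed department shows up in the report as soon as it shows up in the data.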

Since I am not a developer, I'm wondering what I'm missing. What advantages might there be to hard-coding values into a tool like this? Is this typically how programs are designed?

Tom
  • possible duplicate of [Removing hard-coded values and defensive design vs YAGNI](http://programmers.stackexchange.com/questions/56282/removing-hard-coded-values-and-defensive-design-vs-yagni) – gnat Aug 09 '16 at 18:12
  • Are the reports cross-tabs, meaning values in rows should appear as columns? – Tulains Córdova Aug 09 '16 at 18:17
  • The question is not clear enough, especially paragraphs 4 and 5. – Tulains Córdova Aug 09 '16 at 18:45
  • Think I get it but please correct me: Example: they want a report of total hours spent grouped per department. Is that the idea? In that case you will need to implement aggregates, group by statements for example. Then you will be able to build the report dynamically. Other question: How do they get the data? Do they get flat data or more a relational database to build their report on? – Luc Franken Aug 09 '16 at 19:01
  • @LucFranken, I edited the question to clarify. As for flat/relational, I've provided sample data both ways. Still waiting on the decision. – Tom Aug 09 '16 at 19:18
  • Sadly computers lack the extrasensory perception ability. When things change (department names, etc); if you hard-code values then you have to change source code, and if you don't hard-code values then you have to change something else (a text file, a database, etc). In both cases you still have to change something, maintain something, etc. However; if it's not hard-coded then you need additional code to obtain the information from wherever it is, so there's additional code to write and maintain. – Brendan Aug 10 '16 at 06:39
  • @Tom, the same applies in program design that has applied to organizing data in spreadsheets for at least 2 decades. If at all possible, you want to make use of native lists rather than hardcoding anything. Just as if data from another source was pulled into a spreadsheet with OLE, when the source changes, your finished sheet already has current info. The same is true with your example of personnel, departments, titles, personnel. That data exists in some native location and format in HR/wherever. There shouldn't be any need today to carry a floppy from HR just to get an employee list. – David C. Rankin Aug 10 '16 at 08:45
  • @Brendan - If you hard code values in the report, you'll need to change the list in TWO places (data source and the report) whereas if the report is dynamic, you only need to change it in one location (the data source). – kwah Aug 10 '16 at 09:40
  • @kwah: If you hard code the values in the source code (and use them in multiple places - the code that generates the report, any/all code that uses the data for other things) then it'll be in ONE place. If you have a copy of the values in a "report config text file" and another copy in a database and another copy somewhere else, then it'll be in THREE places! Weeeeee... – Brendan Aug 10 '16 at 09:53
  • @Brendan why would you end up with three locations? Perhaps my understanding is incorrect but I'm envisioning an sql query to fetch data from a database, then the application will aggregate/group the returned values by, e.g., the department. If you're willing to have the overhead of multiple db queries, you could select distinct departments/role titles if you really want to. At no point is the data existing in more than one location - the report is being driven by the data. – kwah Aug 10 '16 at 10:03
  • @Tom: Mostly what I'm saying (above) is that where something is (hard-coded in the source or somewhere else) depends on who ends up needing to change it. If the programmers themselves are the only people that need to change something then there's no problem hard-coding it. If the end user needs to change something then it shouldn't be hard-coded and probably needs a fancy user interface (wizard, dialog box, settings web page, ...) with decent error handling, etc; and (even for "bare minimum user-friendliness") this ends up being an order of magnitude more painful for the programmers. – Brendan Aug 10 '16 at 10:04
  • @Brendan I also disagree with your definition of it being in one place - the way you describe it it's in multiple locations, scattered throughout the source code. – kwah Aug 10 '16 at 10:06
  • @kwah: What I'm suggesting (in a sarcastic way) is that the same value/s can be in one place or multiple places, regardless of whether they're hard-coded or not; and implying that "when it's hard-coded it's in multiple places" is dishonest at best. – Brendan Aug 10 '16 at 10:08
  • @kwah: If I do `#define MY_VALUE 123.456` in a C or C++ header file and use that in 100 different places; then it's in one place (but used in 100 different places). If I create a "GlobalConstants" class or interface in Java and use it in 100 different places; then it's in one place (but used in 100 places). If I create a variable in Python and declare it as `global` everywhere it's used; then it's in one place (but used in lots of places). – Brendan Aug 10 '16 at 10:16
  • @Brendan Having given it further thought, it depends *how* you define it. If it's a collection of values which is iterated over dynamically then I'd accept that as being defined in one location within the source code (with a second location being the data store itself). If it's a series of variables e.g. `column1title`, `column2title` etc then when adding an additional column or role etc. you are required to edit multiple locations within the source code to accommodate these changes. Both of these require recompilation and distribution of updated versions (not inherently a bad thing). – kwah Aug 10 '16 at 10:42
  • Why is this code even being written? There are hundreds of reporting applications available. What requirements does your business have that are not met by any of the existing applications? – kevin cline Aug 11 '16 at 19:18

4 Answers

Really? No Possible Valid Use Cases?

While I agree that hard-coding is generally an anti-pattern or at least a very bad code smell, there are plenty of cases where it makes sense. Here are a few.

  • Simplicity / YAGNI.

  • Real constants that actually never change.

    i.e., the constant represents a natural or business constant, or an approximation of one (e.g. 0, PI, ...).

  • Hardware or software environmental constraints

    In embedded software, memory and allocation constraints come to mind.

  • Secure software

    These values are not meant to be available and/or easy to decode or reverse-engineer, e.g. cryptographic tokens and salts. (Note that keeping them hard-coded does have obvious downsides as well...)

  • Generated code

    Your preprocessor or generator is configurable, but spits out code with hard-coded values. Not unusual for code that relies on rule engines, or if you have a model-driven architecture.

  • High-performance code

    In a way this is "generated" code, although even more specialized. e.g. a pre-determined lookup/computation table with unlikely changes. This isn't unusual at all in graphics programming for instance.

  • Configuration and Fallbacks

    Both in your actual code and in your configuration files, you are likely to have configuration values, and fallbacks for several cases (when the configuration is unavailable, when a component doesn't respond as expected, etc...). Still, it's generally best to keep these values outside of your code and look them up, but there might be cases where you absolutely want to have a specific value/reaction to a specific action/issue/situation.

  • And probably a few more...
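
For the "Configuration and Fallbacks" case in the list above, a minimal Python sketch might look like the following. It assumes a JSON settings file; the `DEFAULTS` values, the key names, and the `load_settings` function are hypothetical and only illustrate the idea of hard-coded values acting as a safety net.

```python
import json
from pathlib import Path

# Hard-coded fallbacks, used only when the external configuration is
# missing, unreadable, or incomplete. Names and values are hypothetical.
DEFAULTS = {"page_size": 50, "report_title": "Personnel Report"}


def load_settings(path):
    """Merge settings from a JSON file over the hard-coded defaults."""
    settings = dict(DEFAULTS)
    try:
        loaded = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # File missing or unreadable, or its content is not valid JSON.
        return settings
    if isinstance(loaded, dict):
        # Only accept keys we know about; ignore anything else.
        settings.update({k: v for k, v in loaded.items() if k in DEFAULTS})
    return settings
```

The program keeps working with sensible values when the file is absent, while anyone who needs different values can still change them without touching the source.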

Still an Anti-Pattern? So is Over-Engineering! It's about your Software's Life Expectancy!!

Not that I'm saying these are all great reasons, and generally I'd balk at hard-coded values. But some can easily get a pass for valid reasons.

And don't overlook the first one regarding simplicity/YAGNI either by thinking it's trivial: there's probably no reason to implement a crazy parser and value checker for a simple script that does one job for a narrow use case very well.

xkcd: The General Problem - https://xkcd.com/974/

It's difficult to find the balance. Sometimes you don't foresee that a piece of software will grow and last longer than the simple script it started as. Oftentimes though, it's the other way around: we over-engineer things, and a project gets shelved faster than you can read the Pragmatic Programmer. You wasted time and effort on things that an early prototype did not need.

That's the mean thing about Anti-Patterns: they're present at both extremes of the spectrum, and their appearance depends on the sensitivity of the person reviewing your code.


Personally, I would tend to always go the generic route, as soon as I see that something might change or if I've had to do it more than once. But a more precise approach would be to evaluate carefully the cost of hard-coding vs generating or generifying your code for that specific situation. It's the same as determining if a task is worth automating as opposed to doing it manually. Just take into consideration time and cost.

xkcd: Is It Worth the Time? - https://xkcd.com/1205/

haylem
  • That's funny, because I piloted this myself, and it was much easier and faster and cleaner for me to handle the values dynamically. I did it in Python, whereas I believe the end product will be coded in Java--if this makes a difference. It felt over-engineered when I hard-coded the values, because each incoming column had to be tracked in multiple places. – Tom Aug 09 '16 at 21:02
  • @Tom: You're saying it was easier and faster to implement (or even reuse) a configuration lookup library than to use a hard-coded value? Great for you. Also, I don't see how your last sentence fits the definition of over-engineering. It would feel obviously messy, and obviously if it's hard-coded and duplicated it's even worse (which was not the point of your question; I probably misunderstood, but it seemed to me like you meant the value was not hard-coded in place every time, but in a single point in the program). – haylem Aug 09 '16 at 21:16
  • Anyways, I'm just pointing out cases where it'd be valid. I'm also pointing out that it'd be controversial in my last sentence. You can't please everybody and teams have people with varying skill levels. – haylem Aug 09 '16 at 21:18
  • Like I said, I'm a self-taught hack, so I probably misused the term over-engineered. I just meant that it took far fewer lines of code when values weren't hard-coded. My explanation is probably lacking due to a lack of a background in this field and my insufficient terminology--e.g., I have no idea what a configuration lookup library is. – Tom Aug 09 '16 at 22:18
  • @Tom, don't sell yourself too short. You're definitely on to something. It sounds easier and less time consuming to write a quick algorithm to organize the data by looking at the Department and Job Title fields as opposed to hard coding `Department = ['IT', 'Sales', 'Operations', 'HR', 'Finance']`. It would also be much more difficult to maintain the hard coded array in the event that a new Department or Title was introduced. – Chris G Aug 09 '16 at 22:32
  • @Tom: I think I misunderstood your question. I did not understand your use case and the duplication of values. There's still the case of code generators that could lead to this and not necessarily be bad, but in that case I'd definitely favor the generic approach then. – haylem Aug 09 '16 at 22:52
  • You can have more complex things that are still sensible to hardcode. One that comes to mind that I wrote a few years back was all possible permutations of a set of values. I needed to find a random valid direction; picking a random permutation and then taking the first valid result was by far the most efficient solution, and since it was in an O(N^3) loop, efficiency mattered. – Loren Pechtel Aug 11 '16 at 20:20
  • I've hard-coded a 135-line database table that never changes and is, and only ever will be, used in a select * where 1 query in one place. I've yet to see the downside. – Jacob Lee Jul 31 '20 at 17:50

Wikipedia:

Hard coding (also hard-coding or hardcoding) refers to the software development practice of embedding what may, perhaps only in retrospect, be considered an input or configuration data directly into the source code of a program or other executable object, or fixed formatting of the data, instead of obtaining that data from external sources or generating data or formatting in the program itself with the given input.

Hard-coding is considered an antipattern.

Considered an anti-pattern, hard coding requires the program's source code to be changed any time the input data or desired format changes, when it might be more convenient to the end user to change the detail by some means outside the program.

Sometimes you cannot avoid it but it should be temporary.

Hard coding is often required. Programmers may not have a dynamic user interface solution for the end user worked out but must still deliver the feature or release the program. This is usually temporary but does resolve, in a short term sense, the pressure to deliver the code. Later, softcoding is done to allow a user to pass on parameters that give the end user a way to modify the results or outcome.

  • Hardcoding of messages makes it hard to internationalize a program.
  • Hardcoding paths make it hard to adapt to another location.

The only advantage of hardcoding seems to be fast delivery of code.

Tulains Córdova
  • OK, but the "only advantage" is often hugely important. Design decisions in programming are often about the trade-off between future proofing and quick delivery now, and as such, hard coding can be a perfectly acceptable choice. Sometimes **not** hard coding is a bad design choice. –  Aug 10 '16 at 07:37
  • -1 I don't think this is a helpful answer. It essentially says 'embedding values into source code inappropriately' is inappropriate. I think the OP wants guidance about when things may belong in source code and therefore fall outside your Wikipedia definition. – Nathan Cooper Aug 11 '16 at 22:14
  • Hard coding should be a vital part of your process, and considering it an anti-pattern is outdated in the age of micro-services, with the Angular Tour of Heroes tutorial being a high-profile example of a huge software house directly encouraging or even mandating it as an intermediate step. What is more, when you move to dynamic data you should still retain some hard coded data as a fall-back, perhaps controlled by an environment variable or even a boolean toggle in the code itself, so bugs and security issues can be properly isolated down the line. – Peter David Carter Mar 27 '18 at 18:50

There are times it's OK to hard-code values. For example, there are some numbers like 0, 1, or the various 2^n - 1 values used for bitmasks that need to be certain values for algorithmic purposes. Allowing such values to be configurable has no value and only opens up the possibility of issues. In other words, if changing a value would only break things, it should probably be hardcoded.
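
As a small illustration of the bitmask case, here is a Python sketch (the function and constant names are made up for this example). A mask covering the lowest n bits is fixed by arithmetic at 2^n - 1, so exposing it as a setting would only create a way to break the code.

```python
def low_bits_mask(n):
    """Return a mask with the lowest n bits set, i.e. 2**n - 1."""
    return (1 << n) - 1


# The mask for one byte is 0xFF by definition; any other value would be a bug.
BYTE_MASK = low_bits_mask(8)


def low_byte(value):
    """Keep only the lowest 8 bits of a non-negative integer."""
    return value & BYTE_MASK
```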

In the example you gave, I don't see where hard-coding would be useful. Everything you mention would/should already be in the database including headings. Even things that drive the presentation (such as sort order) can be added if they aren't there.
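
A rough sketch of that idea, using Python's built-in `sqlite3` module with an in-memory database: the table layout, the `sort_order` column, and the sample rows are assumptions made for illustration, not a description of the actual data.

```python
import sqlite3

# Hypothetical schema: presentation order lives in the data, not in the code.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (name TEXT PRIMARY KEY, sort_order INTEGER);
    CREATE TABLE employee (name TEXT, department TEXT, job_title TEXT);
    INSERT INTO department VALUES ('Operations', 1), ('IT', 2);
    INSERT INTO employee VALUES
        ('Mei', 'IT', 'Analyst'),
        ('Ana', 'Operations', 'Planner'),
        ('Raj', 'IT', 'Developer');
""")


def report_rows(connection):
    """Fetch (department, job_title, employee) rows in presentation order."""
    query = """
        SELECT d.name, e.job_title, e.name
        FROM employee AS e
        JOIN department AS d ON d.name = e.department
        ORDER BY d.sort_order, e.job_title, e.name
    """
    return connection.execute(query).fetchall()


for department, job_title, employee in report_rows(conn):
    print(department, job_title, employee, sep=" / ")
```

Reordering the departments in the report then becomes an `UPDATE` on the `sort_order` column rather than a code change.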

JimmyJames
  • Thanks. Sort order was the one concern I had. However, in our case it doesn't matter, and I didn't even consider that it could be added as another table in the database. – Tom Aug 09 '16 at 20:59
  • I should note that managing all of this in the DB is one option. You could also use configuration files or other solutions but hardcoding appears to be a poor choice. The DB option is often used because it's easy to create an interface to allow the options to be managed by users. There are also tools like [this](https://zookeeper.apache.org/) which are specifically designed for this purpose. – JimmyJames Aug 10 '16 at 14:29

Implementing a robust solution that allows for values that might otherwise have been hard-coded to instead be configurable by the end users demands robust validation of those values. Did they put in an empty string? Did they put in something non-numeric where it should have been a number? Are they doing SQL injection? Etc.

Hard-coding avoids a lot of these risks.

Which isn't to say that hard-coding is always, or even often, a good idea. This is just one of the factors to take into account.
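
To give a sense of the validation work involved, here is a minimal Python sketch covering the three checks mentioned above. The function names, the limits, and the `department` table are hypothetical; a real tool would need checks suited to its own fields.

```python
import sqlite3

MAX_NAME_LENGTH = 100  # hypothetical limit


def clean_department_name(raw):
    """Reject empty or oversized names; return the trimmed value."""
    name = raw.strip()
    if not name:
        raise ValueError("department name must not be empty")
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError("department name is too long")
    return name


def parse_page_size(raw):
    """Reject non-numeric or out-of-range input for a numeric setting."""
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"page size must be a whole number, got {raw!r}") from None
    if not 1 <= value <= 500:
        raise ValueError("page size must be between 1 and 500")
    return value


def add_department(connection, raw_name):
    """Insert via a bound parameter so the text is never parsed as SQL."""
    name = clean_department_name(raw_name)
    connection.execute("INSERT INTO department (name) VALUES (?)", (name,))
    return name
```

None of this code is needed when the value is a constant in the source, which is the trade-off being weighed here.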