How do you tackle really bizarre errors that keep you puzzled for more than 10 hours?

Question

You know them, those errors that make NO sense. Where it seems like a gremlin just jumped deep inside your chips and messed up something. Do you take a walk, write stuff, call an uncle?

Good-News Update: I (we) got it fixed! Thanks to you all! The walk helped, and so did closely examining the important files and googling. I will write down all your great advice for when I get another challenge! Thank You So Much — Caffeinated, Sep 03 '11 at 09:33
That said, I do have one new discovery - it may help to jump in an IRC channel and ask stuff. — Caffeinated, Sep 03 '11 at 15:43
Why did you decide on a threshold of 10 hours? That's way too long - if you don't have a good idea of what's causing unexpected behavior within an hour or two, you are in trouble. — Vector, Sep 05 '11 at 20:37
@Mikey, yes. Well, I was thinking of non-contiguous programming time, i.e., with breaks & naps & chatting. But we can call it 2 hours really, even 30 minutes for advanced coders. — Caffeinated, Sep 06 '11 at 17:06
"When the going gets tough, the tough go to sleep and let the subconscious work on it." -- anon — Michael Easter, Sep 06 '11 at 23:52
@BlackJack, Yeah my uncle would launch into "when i was your age, back in the USSR...." — Caffeinated, Sep 07 '11 at 03:13
1. Get someone to help. Two people is a must. 2. Narrow it down using excessive amounts of debug statements. There was a file where every single line was preceded by a debug macro to pinpoint the one that segfaulted. — SF., Sep 07 '11 at 09:27
I just usually get away from it for some time, and then have a fresh look at it again after that. Usually helps when you're in those situations :) — Michell Bak, Sep 07 '11 at 11:22
@SF - "There was a file where every single line was preceded by a debug macro to pinpoint the one that segfaulted." I've used that technique many times - very effective. — Vector, Sep 07 '11 at 16:16
1) Rebuild 2) Look at an older commit to figure out what part of the code has the problem 3) Make sure I understand what does and doesn't trigger the bug and what the bug is actually doing 4) Isolate the code 5) Fix. The steps are always in this order, but not every step must be done. — , Aug 13 '13 at 10:28
"If you don't have a good idea of what's causing unexpected behavior within an hour or two, you are in trouble": I once had a bug in legacy code and it took me 8 days to find the problem (and 10 minutes to fix it). — Giorgio, Dec 08 '14 at 10:34
@Giorgio: I think Vector is assuming that you know how to create the problem in the first place. Being able to recreate the problem on demand is 95% of the solution (at least). I've seen many bugs over the years that have stumped people for days, weeks or a month because they couldn't figure out how to recreate the problem. — Dunk, Dec 08 '14 at 20:01
@Dunk: In my particular example I could recreate the bug at will (open that particular data set, then do X, then do Y, then close the application --> crash) but I had to spend lots of hours in the debugger to find out the exact place in the code where the bug originated. It was pretty messy legacy code. — Giorgio, Dec 08 '14 at 20:58

score 79 · Answer 1 · answered Sep 03 '11 at 02:07

Quit. No, not your job! Just get up and go home. You're done for the day or the weekend. 19 times out of 20 when you come back to the problem next, the solution will present itself within an hour.

Dave Nay

Hmm, I see. However, I'm right here looking at a problem I was looking at yesterday! But yes, I know - walk away a bit. :) – Caffeinated Sep 03 '11 at 02:08
You could also try rubber ducking it. http://en.wikipedia.org/wiki/Rubber_duck_debugging – Dave Nay Sep 03 '11 at 02:11
19 out of 20, yes. My worst one never did get solved, only worked around. No test environment ever showed it, only the full production environment in operation--we couldn't even reproduce it after hours. – Loren Pechtel Sep 03 '11 at 02:14
Getting away from something that's irritating you can be really hard - but I've found over the years that it's always the best thing to do. The subconscious mind can work on the problem while you eat, sleep, goof off, watch TV... and next day (or the day after) things go better. One word of warning though: gather information before you walk away... Walking away is not the same as ignoring it and pretending it's not there. You still need hard work! – quickly_now Sep 03 '11 at 03:10
I do not know about an hour. I generally solve most of these types of problems in the shower when I get up in the morning. The second most frequent will be when I am nearly asleep at night and have finally allowed myself to stop thinking about it. – SoylentGray Sep 06 '11 at 18:25
There was a fascinating NOVA Science NOW hosted by Neil deGrasse Tyson which talked about the science of sleep. In it was discussed the phenomenon of banging your head on a problem for hours, going to sleep, and waking up and solving it right away. When we sleep, our brain turns the events of our day over and over and over, analyzing it from many different angles. What it leaves behind are new neural pathways that can actually help us see the problem in a whole new way subconsciously, and then actually solve the problem. Pretty awesome. – Byrne Reese Sep 07 '11 at 00:05
@Byrne: http://www.pbs.org/wgbh/nova/body/sleep.html – Dave Nay Sep 07 '11 at 00:57
I can usually solve one programming problem while I sleep or at least come up with a new angle to try. Hard to solve bugs are often caused by something very simple or an unusual combination of conditions. Think about the bug without the code to try to come up with possible causes. – Michael Shopsin Sep 07 '11 at 17:02

score 44 · Answer 2 · answered Sep 03 '11 at 02:49

Before ten hours go by, I would get some help.

Describe the problem to someone else, anyone else, even your rubber duck.
Ask someone else to take a look at the code, or step through it with them.
Isolate it. Delete a bunch of stuff, then bring it back bit by bit until the problem reappears.
Get some sleep!
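
The "isolate it" step can be done mechanically when the pieces are easy to switch on and off. A minimal sketch of the idea, assuming exactly one piece is at fault and the problem reproduces reliably; `find_culprit`, `parts` and `fails` are hypothetical names used only for illustration:

```python
def find_culprit(parts, fails):
    """Return the one part whose presence makes fails(...) return True.

    Assumption: exactly one culprit, and fails() is deterministic.
    """
    if not fails(parts):
        raise ValueError("the full set does not reproduce the problem")
    candidates = list(parts)
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        # Keep whichever half still reproduces the problem.
        candidates = half if fails(half) else candidates[len(half):]
    return candidates[0]


# Hypothetical example: five modules, and the problem appears whenever "d" is loaded.
modules = ["a", "b", "c", "d", "e"]
culprit = find_culprit(modules, lambda loaded: "d" in loaded)
print(culprit)
```

Halving the set each round means a hundred suspects need about seven checks rather than a hundred. When the suspects are commits rather than modules, `git bisect` applies the same idea.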

kevin cline

+1 for deleting everything until the problem disappears. – Jonah Sep 03 '11 at 03:05
You should do one of those things before 1 hour goes by. The more you stare the less likely you are to achieve your epiphany. I normally solve a problem just by talking it through with someone. – Ben Sep 03 '11 at 21:43
Spot on. Often I figure out the problem, (or get close to it) by first describing the problem. Frequently this occurs while writing a description of the problem for a StackOverflow question. Which also requires a reduction (isolation) and then failing that, a wait period where you step away from the problem and let the SO answers come rolling in. – sholsinger Sep 07 '11 at 14:31

score 17 · Answer 3 · answered Sep 03 '11 at 02:15

One word: timebox. Set a limited amount of time to work on something, and if it isn't solved, move on to something else and come back to it the next day with a fresh perspective.

That, and another set of eyes, is always worth more than any time you can waste staring at something.

I would never spend more than 45 minutes to an hour trying to solve something in one sitting; past that point you run into the law of diminishing returns.

Thank You So Much - I read the timebox article on wikipedia, very useful. — Caffeinated, Sep 03 '11 at 03:21

score 9 · Accepted Answer · answered Sep 03 '11 at 19:41

For those really horrible problems my strategy usually goes as follows.

Experiment and google. Keep trying to solve the problem. Most of the time this solves the problem in an hour or less.
So that hasn't worked. Take a break. Have a coffee, talk to a colleague about something unrelated. Push the problem out of your mind. When you look at the problem 5 or 10 minutes later you are looking at it from a slightly different perspective. Most of the time this works.
In this case it hasn't. So spend another 10 - 30 minutes looking at it. Then call in a colleague. But before you do, make some notes; you want to demonstrate the problem, reproduce it, then list the things you have tried, and most importantly prove that you have tried them. So do a dry run first. Set some bookmarks in the code, close any superfluous open documents etc. This way you may either solve the problem yourself, or when you do demonstrate the problem you won't be wasting their time.
Ask your colleague to make you prove all your assumptions. Is that setter actually being invoked? Is that method really returning what you claim it is? You think that object isn't null - show them it isn't null.
Most of the time, either demonstrating the problem will make you realise that you haven't tried all the possibilities or your colleague will see your mistake.
If that doesn't work, it's time to get serious. Document exactly what you are trying to do, what you have tried, and why it didn't work. Email this to all your colleagues. Post it on SO. At this point the document should be a perfect SO question.
While you wait for responses, google google google. Try every permutation of the question you have. Open up a bunch of tabs. You probably aren't going to get an answer by this point, but you're looking for ideas, possibilities, different ways of approaching the problem.
Do something else. If you've spent 5 hours on a problem, it's time to leave it for another day. Maybe you will get a useful response. Maybe when you attack the problem the next day it will be obvious.
If none of that works, it's time to look for a different solution. Maybe you can use a different method, a different technology. Maybe you should consider abandoning the feature for now. Are you billing the client by the hour? Are you working for a company on an internal app? You need to escalate this to the owner and tell them "look, I've spent x hours on this and made no progress, is the cost benefit worth it?". You don't want to go to your boss and tell them you spent 16 hours on a problem only for them to turn around and say it's not that important, skip it for this release. You need to find that out earlier.
And if that doesn't work? Well, your only options are to keep hammering away at the problem or seek out industry expertise. Ask technology experts on Twitter. Email your technology provider.
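
The "prove all your assumptions" step can be turned into something you actually run during the dry run, rather than something you assert out loud. A rough sketch only; `check` and the `Account` class are hypothetical stand-ins, not anything from a real codebase:

```python
def check(description, condition):
    """Print whether an assumption really holds, instead of trusting it."""
    status = "OK" if condition else "FALSE"
    print(f"[{status}] {description}")
    return bool(condition)


class Account:
    """Hypothetical object under suspicion."""

    def __init__(self):
        self.balance = None
        self.setter_calls = 0

    def set_balance(self, value):
        self.setter_calls += 1
        self.balance = value


account = Account()
account.set_balance(10)

results = [
    check("setter was actually invoked", account.setter_calls > 0),
    check("object is not None", account is not None),
    check("balance holds the value we claim", account.balance == 10),
]
```

Any line that prints FALSE is the assumption to dig into first, and the printed list doubles as the "things I have tried" notes to show the colleague.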

score 7 · Answer 5 · answered Sep 04 '11 at 03:14

Explain the problem to someone else.

By explaining the problem to someone else, you have to clarify it: this often lets you see the solution.

(One of the UK professional computer magazines once proposed selling life-size cardboard cut-outs of a senior programmer specifically for this purpose.)

I find sleeping on a problem (sometimes for a couple of days) can also help.

MZB

The "someone else" need not be human. Sometimes I explain things to the cat, and aha! I find the problem. – DarenW Sep 04 '11 at 04:49
I really should buy a cat too. I'd train it to scratch my head on demand. – Caffeinated Sep 06 '11 at 17:18
Someone really ought to make a life size cardboard cut-out of Jon Skeet. – Don Roby Sep 06 '11 at 22:10

score 5 · Answer 6 · answered Sep 03 '11 at 18:30

I have a three step plan:

Get a coffee or other tasty beverage.
Work on something else for the rest of the day.
"Phone a friend" and doodle on the whiteboard.

Each stage is an escalation if the previous step failed. There's almost always something else productive I can work on at stage 2.

Flexo

Nice advice! So "Phone a friend" is quoted because it should be limited to 60 seconds, like on Millionaire, yeah? I like the whiteboard idea too. – Caffeinated Sep 06 '11 at 17:09
I find the whiteboard really helps me think it through methodically. The quotes were because often the friend is in the same office, so actually calling would be odd. But it sort of felt like a lifeline from the TV show. – Flexo Sep 06 '11 at 19:38

score 4 · Answer 7 · answered Sep 03 '11 at 19:46

Sleep on it.

Otherwise, call someone nearby and ask them to take a quick look over the code.

Often errors that would take you a long time to find (since it's your code) are found very easily by others.

Akash

score 3 · Answer 8 · answered Sep 03 '11 at 03:22

You could see if getting up, pacing around, and thinking about the problem helps you find a solution. Whether or not you're actually standing or pacing, try getting away from the computer while you're thinking.

compman

score 3 · Answer 9 · answered Sep 05 '11 at 18:58

I generally do one of the three:

Take a walk/bike ride... something that gets you away from the computer.
Play with my dog or cat
If you have a hobby, work on that for a while.

Any of the three does a good job of distracting me from the situation at hand. I find the distractions let my subconscious brain chew on something for a while. After an hour or so of this, bam, there's the solution :-).

score 3 · Answer 10 · answered Sep 06 '11 at 19:04

Build a test harness to target that exact defect and isolate it.

Just keep eliminating good code while replicating the defect, until you target the exact piece of code causing the error. Then trace the code.

Recommended reading: The Pragmatic Programmer, specifically section 10, "Tracer Bullets".
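
A small harness along these lines can also help when the defect only shows up now and then. This is a sketch under assumptions, not a recipe: it supposes the suspect code can be driven from a seeded random source, and `find_failing_seed` and `flaky_average` are hypothetical names invented for the example.

```python
import random


def find_failing_seed(run_once, attempts=1000):
    """Call run_once(rng) with seeds 0..attempts-1; return the first seed that raises."""
    for seed in range(attempts):
        rng = random.Random(seed)  # same seed, same inputs, every time
        try:
            run_once(rng)
        except Exception:
            return seed
    return None


def flaky_average(rng):
    """Hypothetical stand-in for the code under suspicion."""
    values = [rng.randint(0, 9) for _ in range(rng.randint(0, 3))]
    # Raises ZeroDivisionError whenever the list happens to be empty.
    return sum(values) / len(values)


seed = find_failing_seed(flaky_average)
print("first failing seed:", seed)
```

Once a failing seed is known, `flaky_average(random.Random(seed))` replays the same failure on demand, which is the point where eliminating good code and tracing become practical.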

edited Sep 06 '11 at 19:10

answered Sep 06 '11 at 19:04

Morons

All this is well and good, but it takes for granted that the bug has been and can be reproduced. What if the 19 hours spent thus far went to just that... attempting to find a means to reproduce the problem in a deterministic and systematic manner... what then? To me THAT is the essence of the question here! – Newtopian Sep 07 '11 at 04:58
The Pragmatic Programmer is excellent – Caffeinated Sep 07 '11 at 19:02