12

I'm responsible for maintaining a satellite whose ADC is failing in an unusual manner. When the raw voltage is in a certain range, the output toggles back and forth between two values. I'm working on getting the raw counts, but the processed data looks something like this:

Sample image

Note that the blue line means something else (Essentially, the software is trying to make the red line match the blue line).

Normally, the steps are quite small, as can be seen in the few small bumps on the left of the larger square wave. However, the steps become quite large once the signal drops below a certain value. While I don't have the raw count data yet, I do know this behavior is reflected in the counts.

What I'm trying to understand is how this ADC is failing.

I'm guessing the following, but I would like to get some analysis of this idea:

  1. In the linear region, each change of delta v in the analog voltage changes the counts by one.
  2. In the non-linear region, a change of delta v in the analog voltage produces a much larger jump in counts.
  3. It is possible that the delta v in case 2 is larger than in case 1, but it is still much smaller than the size of the jump would normally predict.
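The guess above can be written down as a toy transfer function. This is only a sketch under stated assumptions: the 10-bit resolution and 5 V reference are hypothetical, the gap edges are illustrative defaults, and snapping to the nearer edge is just one possible way a dead band could behave.

```python
def ideal_counts(volts, v_ref=5.0, bits=10):
    """Ideal quantizer: one count per LSB-sized step in voltage.

    v_ref and bits are hypothetical; the real converter may differ.
    """
    full_scale = (1 << bits) - 1
    clamped = min(max(volts, 0.0), v_ref)
    return round(clamped / v_ref * full_scale)


def faulty_counts(volts, gap_low=374, gap_high=421, **adc):
    """Same quantizer, but codes strictly inside the gap never appear.

    Assumption: an input that should land inside the gap snaps to the
    nearer gap edge.
    """
    code = ideal_counts(volts, **adc)
    if gap_low < code < gap_high:
        return gap_low if code - gap_low <= gap_high - code else gap_high
    return code


lsb = 5.0 / 1023  # one ideal step in volts for the assumed converter
# Two inputs one LSB apart, on either side of the middle of the gap.
print(faulty_counts(397 * lsb), faulty_counts(398 * lsb))
```

In this model an input hovering near the middle of the dead band flips between the two edge codes, which would look like the square wave in the processed data even though the analog signal barely moves.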

Remember, this is a satellite, so I can't bring it back to the lab for testing. Any thoughts?

EDIT: Here are the raw counts for such an episode (sampled at a lower frequency). Also, the ADC is a space-rated part about 15-20 years old. I don't have a part number on hand, but I'll see if I can get it. It was probably around in 1993, and might be FPGA based. As far as I can tell, the gap in counts runs from 374 to 421 (might be off by a few counts). In binary:

374 101110110
421 110100101
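One quick check on the two edge values above is to see which bit positions differ between them. The helper below is a minimal sketch (the name `differing_bits` is made up for illustration); it only compares the two reported edges and says nothing about the codes in between.

```python
def differing_bits(a, b):
    """Return the bit positions (0 = LSB) where a and b differ."""
    diff = a ^ b
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


low_edge, high_edge = 374, 421  # gap edges as reported in the question
print(f"{low_edge:09b}")
print(f"{high_edge:09b}")
print(differing_bits(low_edge, high_edge))  # [0, 1, 4, 6, 7]
```

The edges differ in five bit positions, so a single stuck or flipped output bit would not map one edge onto the other. Since the edges themselves may be off by a few counts, this is suggestive rather than conclusive.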

Image: raw ADC counts during one such episode

Part of the reason I think it might be the ADC is that multiple sensors show similar gaps. I'm working on quantifying this better, but here's an example plot. Note that the points are the actual measured values, and the lines simply connect consecutive samples of the same data point. All of these values are read by the same ADC.

Image: example plot of several sensors read by the same ADC

Furthermore, here is a plot of every value read by the ADC over a period of about 24 hours. There are many lines (about 20 in all). I believe the gaps represent a dead zone in the ADC or related circuitry. The y axis in this plot is the ADC output value. Any largely vertical line seems to mark a region where the ADC cannot record a value.

Image: every ADC output value recorded over about 24 hours

The ADC is an ADC0808 and the analog multiplexor is an HCF4051BM1, at least according to the schematics I can find. It's possible a change was made at some point.

EDIT (more of an update): There are 3 analog multiplexors that feed into the ADC. I wanted to see whether one of them had this issue while the others didn't. There isn't much evidence for that, however; see below. There are many gaps like this; I just chose to show one.

Count   #tot    #mux1   #mux2   #mux3
557     3360    1336    68      1956
558     252     128     4       120
577     684     292     4       388
578     964     480     8       476
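To compare the multiplexors more directly, the per-mux counts in the table can be normalized by each row's total. The sketch below uses only the numbers from the table; the helper name `mux_shares` is made up for illustration.

```python
# Per-mux hit counts copied from the table above: (mux1, mux2, mux3).
rows = {
    557: (1336, 68, 1956),
    558: (128, 4, 120),
    577: (292, 4, 388),
    578: (480, 8, 476),
}


def mux_shares(counts):
    """Fraction of a row's total contributed by each multiplexor."""
    total = sum(counts)
    return tuple(round(c / total, 3) for c in counts)


for code, counts in rows.items():
    print(code, sum(counts), mux_shares(counts))
```

If a single multiplexor were responsible, its share would be expected to collapse near the gap edges while the others stayed populated. Reading the shares this way is a rough heuristic, not a statistical test.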
PearsonArtPhoto
  • 3
    Part number and data sheet for the part? – Brian Carlton Nov 12 '12 at 22:01
  • It's difficult to tell from the processed data. How is it processed? Is the ADC an off the shelf component? Is it sigma-delta or SAR? – Samuel Nov 12 '12 at 22:15
  • 1
    Posting the actual values may help. It could be that somehow the MSB and LSB are not being read *together*, in the sense that if the value is changing for example from `0x00FF` to `0x0100` (small change) you may be getting `0x01FF` or `0x0000` (big change). – apalopohapa Nov 12 '12 at 22:36
  • Have you been able to observe the entire range for this effect? What is the full scale, +/-10? The resolution appears to be much smaller than the skipped zone from -0.5 to +1.5, and there is also an offset. I assume you are converting this to analog using software and not DAC hardware. Correct? – Tony Stewart EE75 Nov 12 '12 at 23:24
  • The output values I listed are converted using software. I'm trying to get the part number and the raw values as well, will post them as soon as I have them. – PearsonArtPhoto Nov 12 '12 at 23:25
  • It appears from your full scale of +/-10 that the ripple is +/-0.1 and the dead spot has a range of +/-1 with an offset of 0.5. What kind of ADC is it? SAR or I&D or Sigma Delta? What resolution? – Tony Stewart EE75 Nov 12 '12 at 23:29
  • Was there a loss of data between peaks of the linear triangle wave, shown by interpolation? – Tony Stewart EE75 Nov 13 '12 at 02:55
  • @Richman: The data is gathered at a pretty infrequent rate (Say, once every 30 seconds or so). There is no loss of data other than is normal. – PearsonArtPhoto Nov 13 '12 at 02:57
  • Are the linear triangles normal signals? Test patterns? It looks as though, when the signal drops below, say, -0.5, it inverts up to +1.1 and back. – Tony Stewart EE75 Nov 13 '12 at 03:02
  • In this case, it is raw battery current. The current should be even at this point in time. The expected signal is the blue line. – PearsonArtPhoto Nov 13 '12 at 03:04
  • 3
    Wait, this is a satellite? Like, it's in space *now*? I hope you bought radiation-hardened parts. – Connor Wolf Nov 13 '12 at 05:20
  • Well Pearson, other than the triangle waves it looks like alias noise with 1Vpp pulses, asynchronous and undersampled. That could be noise ingress on the current sensor when certain loads are active, i.e. -20 dB SNR. What about an instability between charging and discharge under low levels? – Tony Stewart EE75 Nov 13 '12 at 05:41
  • Was it ever working? Or has it always done this? – Jim Paris Nov 13 '12 at 05:45
  • It is also entirely possible that the red line is accurate, and it's the blue line that's a fantasy. Considering that the line is battery current, it's not hard to believe that something (such as the ADC doing the measurement, or the processor reading it, or the radio transmitting it) is drawing current that you didn't expect. – Theran Nov 13 '12 at 05:53
  • 2
    It has worked in the past, it started to fail after ~10 years of continual use. I've seen similar behavior from temperature and pressure sensors, not to mention battery voltage, I just happened to post current. – PearsonArtPhoto Nov 13 '12 at 11:17
  • Okay, I've posted more information, including the approximate date of manufacture of the ADC, raw count posts, etc. – PearsonArtPhoto Nov 13 '12 at 14:33
  • 2
    You got a cool job. – Ktc Nov 13 '12 at 14:55
  • 1
    The ADC0808 is an 8-bit ADC, but the raw counts you show aren't in the range [0, 255]. – Theran Nov 15 '12 at 22:02
  • Strange... Guess they must have updated the part after the schematics that I included... Hmmmm... – PearsonArtPhoto Nov 15 '12 at 22:53
  • When you say it's worked in the past do you mean you have actually seen the current draw be close to or exactly what we see in the blue line? Is the similar behavior you've seen due to monitoring temp, pressure, and voltage on different channels of the same ADC that you think is failing now? – Littleman Nov 19 '12 at 22:02
  • 1
    Yes, essentially there was a full range of the ADC once upon a time, and there isn't now. All of the various sensors have jumps in them, which didn't exist some time ago. Hence why I'm trying to learn what the failure mechanisms are for ADCs. – PearsonArtPhoto Nov 19 '12 at 22:04
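The torn-read scenario raised in the comments (high and low bytes taken from different samples) can be illustrated with a short sketch. The 8-bit split point and the helper name `torn_reads` are assumptions for illustration; the actual readout path on the satellite is not known here.

```python
def torn_reads(old, new):
    """Values seen if the high and low bytes come from different samples.

    Assumes a 16-bit word read as two 8-bit halves.
    """
    old_hi, old_lo = old >> 8, old & 0xFF
    new_hi, new_lo = new >> 8, new & 0xFF
    return {(new_hi << 8) | old_lo, (old_hi << 8) | new_lo}


# The example from the comment: a small step across a byte boundary.
print(sorted(hex(v) for v in torn_reads(0x00FF, 0x0100)))
```

For the gap reported in the question, 374 is 0x176 and 421 is 0x1A5, so both edges share the same high byte. A tear at an 8-bit boundary would therefore not by itself carry a reading from one edge to the other; a different split point would need to be checked separately.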

2 Answers

4

Is there a reason to suspect the ADC over everything else in the system? Anything between the battery and the ground station could be causing what you see. A good fault tree will consider other causes besides the ADC.

  • The analog front-end
    • radiation effects on op-amps and analog switches
    • op-amps getting stuck at incorrect values
    • transmission gates not opening/closing, or only the N or P side working
    • thermal cycling causing intermittent opens
    • metal whiskers causing intermittent shorts
  • The ADC itself
    • single-bit error
    • data becoming out of sync with the clock (skipped/skewed bits)
    • some other failure mode specific to the type of ADC
  • The digital logic/microprocessor
    • failing to configure the loads as expected
    • not configuring or reading ADC properly
    • incorrectly packing data for transmission
  • Other loads in the system
    • subsystems turning on when not commanded to
    • unexpectedly high power draw from damaged loads
Theran
  • Added more information as to why I think it is the ADC. Essentially, all values read by the same ADC seem to have a similar region of missing counts. – PearsonArtPhoto Nov 13 '12 at 21:57
  • Is the purple channel read immediately after the pink channel? It looks a bit like the sample and hold for the purple channel sometimes only conducts one way. – Theran Nov 13 '12 at 23:51
  • Honestly, I don't know and I don't even know if there's a way to figure it out... But I'll see what I can do to get it figured out. It is interesting that they are the same signal level, but notice that it doesn't show up when the purple is at the higher level at all. – PearsonArtPhoto Nov 14 '12 at 00:38
  • I'm guessing that what we are seeing is a half-dead CMOS transmission gate where only one of the two transistors is conducting. It's charging the sample and hold capacitor but not discharging it when the purple channel is active. – Theran Nov 14 '12 at 01:07
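Taking the half-dead transmission gate idea in the comment above at face value, its effect on a multiplexed sample-and-hold can be sketched as follows. This is a deliberately crude model under stated assumptions: the hold capacitor is ideal (no leakage), the faulty gate can only raise the held voltage, and the function name and input values are made up for illustration.

```python
def sample_sequence(inputs, one_way=False, start=0.0):
    """Voltages held for a sequence of channel inputs read in order.

    one_way=True models a gate that can charge the hold capacitor
    but cannot discharge it (assumed ideal capacitor, no leakage).
    """
    held = start
    out = []
    for v in inputs:
        if one_way:
            held = max(held, v)  # can only move upward
        else:
            held = v             # healthy gate tracks the input
        out.append(held)
    return out


channels = [3.0, 1.0, 2.0]  # hypothetical inputs, read in this order
print(sample_sequence(channels))
print(sample_sequence(channels, one_way=True))
```

In this model a channel read after a higher one reports the higher value, which is the kind of dependence on read order the comments ask about. A real gate with one working transistor would behave less cleanly than this.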
2

Using the detailed information I am collecting, I am noticing the following trends:

  1. There don't appear to be any complete gaps in the ADC range, except for areas where it appears there just wasn't any input signal.
  2. There are a number of regions that look like the data below, where values in a small window are almost never read, with huge numbers of readings just before and after. The first column is the output from the ADC; the second is the number of occurrences, across multiple object types.

The data is:

350 253
351 106
354 1
357 1
359 2
360 183
361 270


375 288
376 188
392 1
409 1
424 762
425 1058
  3. These measurements cover a wide variety of inputs, but there are several very small scale jumps, including in things that shouldn't jump quickly, like temperature, battery pressure, battery voltage, etc.
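The sparse windows in the occurrence counts above can be located mechanically. The sketch below treats the two excerpts separately, since the listing skips the codes between them, and uses an arbitrary threshold of 100 hits to call a code "busy"; both the threshold and the helper name `sparse_windows` are assumptions for illustration.

```python
# Occurrence counts copied from the two excerpts above: code -> hits.
excerpt_a = {350: 253, 351: 106, 354: 1, 357: 1, 359: 2, 360: 183, 361: 270}
excerpt_b = {375: 288, 376: 188, 392: 1, 409: 1, 424: 762, 425: 1058}


def sparse_windows(hist, busy=100):
    """Return (last_busy, next_busy, hits_between) for each sparse run.

    A code is 'busy' when it has at least `busy` hits (arbitrary cutoff).
    """
    codes = sorted(hist)
    busy_codes = [c for c in codes if hist[c] >= busy]
    windows = []
    for lo, hi in zip(busy_codes, busy_codes[1:]):
        if hi - lo > 1:
            between = sum(hist[c] for c in codes if lo < c < hi)
            windows.append((lo, hi, between))
    return windows


print(sparse_windows(excerpt_a))  # [(351, 360, 4)]
print(sparse_windows(excerpt_b))  # [(376, 424, 2)]
```

Run over the full histogram, a list like this would give the edges and widths of every sparse window, which could then be compared in binary to look for a common pattern.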

Given all of this, I would have to say that the ADC or its supporting circuits can fail in such a way that they have limited capacity to measure small-scale phenomena. Furthermore, it seems like these are just step functions.

I'm still trying to figure out how these jumps are connected, but failing to get the full picture...

PearsonArtPhoto