7

I am considering whether to use parity or not with my UART. It is a board level high speed (upwards of 115,200 baud) signal. The traces are very short (less than 2 cm), MCU to DSP, but they could pick up noise. During a logic analysing session using my logic sniffer I noticed one byte was captured incorrectly, it was just a single error. I couldn't replicate it. Now I'm wondering whether I should include parity.

My application is somewhat safety critical in that a failure would lead to a graphical error on a HUD/OSD which could provide incorrect information to someone piloting a model aircraft. However, the HUD is updated at 30 frames per second, so any glitch would be temporary. One problem that could happen is it could send a command which put the OSD into an incorrect state where it did not display anything, leaving the pilot blind about 25 km away from home... which is not good.

Will including parity protect me against common glitches or will it just make the protocol slower? Why is it that most UART protocols do not have parity? And is there a reason to select odd or even parity over each other?

Thomas O
  • 1
    Beyond designing for reliability, if you have a critical system which you envision could get into a non-functional state that doesn't resolve itself, perhaps it should have a user-accessible (display) reset/refresh button and be designed to quickly return to functionality when that's pressed. **Especially** if it's a prototype. Solving the problem the right way *is* important, but holding out for the right solution while users have to resort to pulling battery packs until you complete, qualify, and distribute a firmware update isn't right either. – Chris Stratton Aug 01 '11 at 16:15

9 Answers

13

In my experience, almost all the equipment and systems I've ever worked with skip parity, and just use message checksums or CRC's to detect errors.

JustJeff
  • I was thinking of combining the two into some kind of message format which used both parity and CRC8 or CRC16. The PIC24F I'm using has a built in CRC generator which is very useful for me. – Thomas O Jan 16 '11 at 22:06
  • 3
    It is very rare, almost impossible, to have an error that the parity will catch and the CRC won't catch. – markrages Jan 17 '11 at 03:41
  • 1
    @markrages, for 32-bit CRC, you have 1/8 billion chance to have same crc & parity, provided that you have multi-bit errors. – BarsMonster Jan 17 '11 at 08:47
  • @BarsMonster, How did you get that figure? – tyblu Jan 18 '11 at 02:46
  • 2
    For multibit errors, probability to get same CRC - 1/2^32. Probability to get same parity = 1/2. Multiply and here we go. – BarsMonster Jan 18 '11 at 04:02
  • Ah, I see, so `<#bits/packet>[bits/packet]*2^33[packets/error]/1Mbaud[bps]=<#bits/packet>*2.4hours/error`, or about 2.4 hours per error per bit, or ~102 days for 1 kbit (1024-bit) packets. – tyblu Jan 18 '11 at 06:47
  • For small controllers with limited memory, consider the Fletcher checksums, which are almost as good as CRC and do not require lookup tables. – Mike DeSimone Jul 31 '11 at 12:25
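
For anyone who wants to see what the Fletcher suggestion looks like in practice, here is a minimal Fletcher-16 sketch. It is written in Python for readability rather than for a small controller, and the function name `fletcher16` is just an illustrative choice; the point is that it needs only two running sums and no lookup table.

```python
def fletcher16(data: bytes) -> int:
    """Fletcher-16: two running sums, each taken modulo 255."""
    sum1 = 0  # simple sum of the bytes
    sum2 = 0  # sum of the running values of sum1 (adds position sensitivity)
    for byte in data:
        sum1 = (sum1 + byte) % 255
        sum2 = (sum2 + sum1) % 255
    return (sum2 << 8) | sum1

print(hex(fletcher16(b"abcde")))  # 0xc8f0
```

On a real MCU the modulo would normally be deferred or replaced by a fold-and-add step, but that is an optimisation and not shown here.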
9

If you have the capability of interrupting a message, a parity error can be used to kill a bad transmission faster than a CRC check, especially for large packet sizes. Otherwise, a CRC check will catch anything a parity check will, and more. If you are really concerned, you can use additional software detection methods, such as context checking and message mirroring. Timeouts can be used to prevent the OSD from falling into an incorrect state permanently.

bt2
  • 4
    +1 on the timeouts. On an application like this, you need to enable your watchdog and test to make sure it is working properly. – markrages Jan 17 '11 at 03:40
7

Odd vs Even Parity

This will depend slightly on what communication you are using. I know you said you are using UART, but I am going to answer a bit broader. If you do decide to go with parity, select the option that will cause your "idle" state to require the parity bit to switch.

If you have an active high system, then all 0's is idle. So make it odd parity so that the parity bit will have to change state from idle.

In an active low system you should look at how many bits the parity will be over. If it is 8 bits, then even parity will result in a 0 parity bit, which follows the idea of forcing a change in state for the parity.
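
As a rough illustration of the rule above, the sketch below computes the parity bit for the two idle patterns mentioned. The helper `parity_bit` is a hypothetical name, and it works on logic values, not line voltages, so check how your own UART maps logic levels to the wire.

```python
def parity_bit(data: int, odd: bool, width: int = 8) -> int:
    """Return the parity bit for the low `width` bits of `data`."""
    ones = bin(data & ((1 << width) - 1)).count("1")
    even_bit = ones & 1  # the bit that makes the total number of 1s even
    return even_bit ^ 1 if odd else even_bit

# All-zeros idle pattern: odd parity forces the parity bit to 1.
print(parity_bit(0x00, odd=True))   # 1
# All-ones idle pattern over 8 bits: even parity gives a 0 parity bit.
print(parity_bit(0xFF, odd=False))  # 0
```

In both cases the parity bit differs from the idle level, which is the property the answer is asking for.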

Should you use Parity?

Well, this is a bit of a difficult question. In general we like to model the noise as Gaussian noise, which means that the bit errors will be completely random. In actuality, noise that has an effect on our system is not always random. The reason for this is that the things that can cause errors on a PCB are radiators from something else. If you think about it, in order for a trace that short to have enough noise to cause a bit error, it had to be something rather extreme. When you have a noise source like this, there is a decent chance that you will flip more than one bit. Parity is useless against an even number of bit errors. Without diving into the math, parity will help, but doesn't help tons. If you can't afford to do much processing then parity may be the best that you can do.
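
To make the "even number of bit errors" point concrete, here is a small sketch, assuming a 9-bit frame of 8 data bits plus one even-parity bit. The helper names `make_frame` and `even_parity_ok` are made up for this example.

```python
def make_frame(data: int) -> int:
    """Pack 8 data bits plus an even-parity bit (bit 8) into a 9-bit frame."""
    parity = bin(data & 0xFF).count("1") & 1
    return (parity << 8) | (data & 0xFF)

def even_parity_ok(frame: int) -> bool:
    """A frame passes if the total number of 1s, parity bit included, is even."""
    return bin(frame & 0x1FF).count("1") % 2 == 0

frame = make_frame(0x5A)
one_flip = frame ^ 0b000000100   # a single corrupted bit
two_flips = frame ^ 0b000010100  # two corrupted bits in the same frame

print(even_parity_ok(frame))      # True
print(even_parity_ok(one_flip))   # False: the error is detected
print(even_parity_ok(two_flips))  # True: the error slips through
```

A burst that corrupts two bits of the same character is exactly the case where per-byte parity gives a false sense of security.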

Why use a CRC?

First of all, you say you have a built-in CRC generator, which means it should be very easy for you to compute. CRCs are much better at catching multiple bit errors. In an environment where you want a very low chance of getting any errors, you really want to use a CRC. In fact, go with the biggest CRC you can afford in your system. One little trick that I know works for at least CRC16, if not others: if you run the CRC over the received message with its CRC still attached, you should get 0 as your answer. If you have hardware to compute the CRC then this is a very efficient way of both generating and checking the CRC.
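
Here is a minimal sketch of that zero-residue trick. It assumes the CRC-16/XMODEM parameters (polynomial 0x1021, initial value 0, no final XOR) with the CRC appended most significant byte first; the PIC24F hardware generator may well be configured differently, so treat the parameters as an assumption, not as what the asker's hardware does.

```python
def crc16_xmodem(data: bytes, crc: int = 0x0000) -> int:
    """Bitwise CRC-16 with polynomial 0x1021, MSB first, no final XOR."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

payload = b"123456789"
crc = crc16_xmodem(payload)
framed = payload + crc.to_bytes(2, "big")  # message with its CRC appended

print(hex(crc))              # 0x31c3
print(crc16_xmodem(framed))  # 0: intact message plus CRC leaves no residue

corrupted = bytes([framed[0] ^ 0x01]) + framed[1:]
print(crc16_xmodem(corrupted) != 0)  # True: a flipped bit leaves a residue
```

For CRC variants that apply a final XOR or use bit reflection, the residue of a good message is a fixed constant that is not necessarily zero, so check the definition of the variant you actually use.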

Kellenjb
  • I would like to mention that some systems, when they calculate parity bits they look at what the logic is and not the actual voltage. My answer still holds true, just make sure that how ever the parity works out, it requires a change in voltage for the idle state. – Kellenjb Jan 19 '11 at 14:52
5

I would work on shielding this wire first of all.

Put ground planes around it and/or bury it in inner layers between ground planes. Then rely on CRC. Use parity if it's free.

BarsMonster
5

If you use parity, think about how you will manage the error in your code. If you do not have a graceful way of managing the error, detecting it will not help much.

russ_hensel
  • This is a very important point that people who make theoretical/armchair arguments often forget to consider. In some systems, where output integrity is more important than output production, just setting off alarm bells and refusing to go further may be right. But in a lot of cases, allowing an error to cause denial of service may be far more costly than allowing it to pass by - it's application dependent. – Chris Stratton Aug 01 '11 at 16:14
4

Parity is useless, in my opinion.

With short chip-to-chip connections on a single board like this, as long as you properly drive the lines, it should be practically noise-free -- at least compared to, say, your DRAM. Some people expect one soft bit error, per day, per gigabyte of DRAM. How do you know the error you saw wasn't caused by such a soft bit error, rather than noise on the communication wires? If you have induced noise big enough to flip a bit on a properly-driven PCB trace, you probably have other bigger problems to worry about. (By "properly driven", I mean either actively drive each line all the time, or use a pull-up resistor or pull-down resistor to set the state between messages).

With longer-range off-board connections, it's often unavoidable to occasionally have floating wires that present an input pin with more-or-less continuous random noise, and even when you are transmitting a message you often get bit errors. In this case, parity is better than nothing. As bt2 mentioned, you'll want to use CRC so you can catch many errors that parity completely misses, and once you have CRC, adding parity to it doesn't help significantly.

If it's possible to put your system in a "bad state", try to design things such that it returns to a good state in a reasonable amount of time. Use communication timeouts and watchdog timers as markrages and bt2 mentioned. Periodically re-transmit the initialization commands so that no matter what weird state the receiver is in, it gets reset to the correct state.
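
One way to picture the timeout idea is the receiver-side sketch below. The class name `OsdLink`, the state strings, and the 0.5 s timeout are all hypothetical; time is passed in explicitly so the logic is easy to test, where real firmware would read a hardware timer.

```python
class OsdLink:
    """Hypothetical OSD receiver that falls back to a safe state on silence."""

    def __init__(self, timeout_s: float):
        self.timeout_s = timeout_s
        self.last_valid = None   # time of the last packet that passed its check
        self.state = "default"   # safe state: normal display

    def on_valid_packet(self, now: float, new_state: str) -> None:
        """Call only for packets that passed the CRC/checksum."""
        self.last_valid = now
        self.state = new_state

    def tick(self, now: float) -> str:
        """Call periodically; reverts to the safe state after a timeout."""
        if self.last_valid is None or now - self.last_valid > self.timeout_s:
            self.state = "default"
        return self.state

link = OsdLink(timeout_s=0.5)
link.on_valid_packet(0.0, "blank")  # a stray command blanks the display
print(link.tick(0.1))  # blank
print(link.tick(1.0))  # default: nothing refreshed the state, so it reverts
```

With this arrangement the transmitter has to keep re-sending any non-default state, which is the same idea as periodically re-transmitting the initialization commands.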

"Static in the satellite affected its computer which read the noise as a command to shut down. To overcome the problem, controllers sent a continuous stream of ON commands to the satellite to keep it turned on." -- Amateur Radio Satellites: OSCAR-6

davidcary
3

You don't want parity. You want a more reliable communication.

The reason parity is little used is that it's expensive in terms of data throughput. It starts by making a frame 10% longer: start bit + 8 data bits + parity + stop bit = 11 bits instead of 10. But it's far worse than that. If you have a way to tell if you received the data correctly you have the duty to do something with that. Simply ignoring the erroneous communication won't do; the transmitter has to send it again. So it needs to know whether it was received well. You'll have to send an acknowledge (ACK/NAK) after each byte, and the transmitter can't send the next byte before it has received the ACK. If you use ASCII codes that's 11 return bits. So this halves the throughput, and we already lost 10%, so we're now at a 36% payload efficiency, down from 80%. And that's the reason why nobody is really fond of parity.

Notes:
1. You don't need to acknowledge the receipt of an acknowledge; the Hamming distance between the ASCII codes for ACK and NAK is 3 (with parity even 4), so an error in the reception of ACK/NAK can be not only detected, but also corrected.
2. Many UARTs can work with data lengths down to 5 bits, and it's possible to switch to 5 bits for sending the ACK, but this is mere window-dressing, and it only complicates communication.
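
The figures above can be checked with a few lines of arithmetic. This is only a sketch of the numbers in this answer; the helper names are made up, and the ACK/NAK values are the standard ASCII codes 0x06 and 0x15.

```python
DATA_BITS = 8

plain = DATA_BITS / 10            # start + 8 data + stop
with_parity = DATA_BITS / 11      # start + 8 data + parity + stop
with_ack = DATA_BITS / (11 + 11)  # plus an 11-bit ACK/NAK frame coming back

print(f"{plain:.1%} {with_parity:.1%} {with_ack:.1%}")  # 80.0% 72.7% 36.4%

ACK, NAK = 0x06, 0x15  # 7-bit ASCII control codes

def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

def with_even_parity(code7: int) -> int:
    """Add an even-parity bit as bit 7 of a 7-bit code."""
    return code7 | ((bin(code7).count("1") & 1) << 7)

print(hamming_distance(ACK, NAK))                                      # 3
print(hamming_distance(with_even_parity(ACK), with_even_parity(NAK)))  # 4
```

A distance of 3 is what allows a single-bit error in the ACK/NAK byte to be corrected by picking the nearer of the two codes.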

A better solution can be to use a CRC at the end of each block. CRCs are better than parity bits at capturing multiple errors (yet they still can't correct them). Improved efficiency can only be obtained for long blocks; if a block consists of only 2 bytes it's no use to add an 8-bit CRC.
Another disadvantage would be that you still have to acknowledge correct reception. So that's probably not it either.

How about self-correcting codes? Hamming codes add little overhead, and allow you to correct 1 erroneous bit yourself, so there is no more need for acknowledging. Like CRCs, Hamming codes are more efficient on longer blocks; the number of additional bits H is the smallest value that satisfies

\$N + H < 2^H\$

where N = number of data bits, and H = number of Hamming bits. So to correct 1 bit in an 8-bit communication you need to add 4 Hamming bits; a fifth Hamming bit is only required from 12 data bits. This is the most efficient way of error detection/correction on short messages (a few bytes), though it requires some juggling with your data: the Hamming bits have to be inserted at specific positions between your data bits.
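
As a rough sketch of how this works, the code below picks the number of Hamming bits from the inequality above and builds a single-error-correcting codeword with the check bits at the power-of-two positions. The function names are illustrative, bits are held in plain Python lists, and even parity is assumed for every check bit.

```python
def hamming_bits(n_data: int) -> int:
    """Smallest H with N + H < 2**H."""
    h = 0
    while n_data + h >= (1 << h):
        h += 1
    return h

def encode(data_bits: list) -> list:
    """Insert check bits at positions 1, 2, 4, 8, ... (1-based)."""
    h = hamming_bits(len(data_bits))
    total = len(data_bits) + h
    code = [0] * (total + 1)       # index 0 unused so positions are 1-based
    bits = iter(data_bits)
    for pos in range(1, total + 1):
        if pos & (pos - 1):        # not a power of two: a data position
            code[pos] = next(bits)
    for i in range(h):
        p = 1 << i
        parity = 0
        for pos in range(1, total + 1):
            if pos & p:            # check bit p covers positions with bit p set
                parity ^= code[pos]
        code[p] = parity
    return code[1:]

def correct(code: list) -> tuple:
    """Return (corrected codeword, syndrome); syndrome 0 means no error seen."""
    syndrome = 0
    for pos, bit in enumerate(code, start=1):
        if bit:
            syndrome ^= pos        # XOR of the positions of all set bits
    fixed = list(code)
    if 0 < syndrome <= len(code):  # syndrome is the position of the bad bit
        fixed[syndrome - 1] ^= 1
    return fixed, syndrome

def extract(code: list) -> list:
    """Drop the check bits and return the data bits."""
    return [b for pos, b in enumerate(code, start=1) if pos & (pos - 1)]

data = [1, 0, 1, 1, 0, 0, 1, 0]
word = encode(data)                # 12 bits: 8 data + 4 check
damaged = list(word)
damaged[6] ^= 1                    # flip the bit at position 7
fixed, syndrome = correct(damaged)
print(hamming_bits(8), hamming_bits(12))      # 4 5
print(syndrome, extract(fixed) == data)       # 7 True
```

Note that this plain code only corrects single-bit errors; a double-bit error produces a wrong "correction", which is why an extra overall parity bit is often added when double errors must at least be detected.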

Now before you add Hamming error correction codes it's worth looking into your setup. You can expect errors on a 100m line running between heavy machinery, but you shouldn't have errors on a 2cm line. If it picks up noise it may be too high impedance. Are the drivers push/pull? If so they should be able to give you fast edges, except if your "cable" is capacitive, which it won't be at this short distance. Are there high current traces running parallel to the data lines? They could induce noise. Do you really need this high speed, and do clocks on both sides match closely enough? Slowing down to 57600 bits per second may solve the problem.

stevenvh
1

Despite the plethora of answers I'll add my piece. Parity works on a byte level, and a CRC works (generally) on a packet level. Packet schemes are in my opinion superior to raw byte-type communications. If your hardware rejects a single byte based on parity, that's great for a raw bytestream, but not so good for a packet. All of a sudden, boom, you've missed a byte in the middle of a packet, and that screws up your parsing. In the best case, your whole packet is gone, but if your packet-finding algorithm isn't good, you could potentially lose more (and I've seen some brain-dead packet detection algorithms used in real products). Use packets and CRCs as was suggested.
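
To illustrate what a reasonably robust packet finder might look like, here is a sketch with an entirely made-up frame format: a start byte 0x7E, a one-byte length, the payload, and a 16-bit CRC sent high byte first. It uses Python's `binascii.crc_hqx` (the CCITT polynomial 0x1021) for the CRC. The important part is the resync rule: when the CRC fails, advance by one byte and search again, so a bad length field is never trusted.

```python
import binascii

SOF = 0x7E  # hypothetical start-of-frame marker

def build_packet(payload: bytes) -> bytes:
    """SOF, length, payload, CRC-16 over (length + payload)."""
    body = bytes([len(payload)]) + payload
    crc = binascii.crc_hqx(body, 0)
    return bytes([SOF]) + body + crc.to_bytes(2, "big")

def extract_packets(stream: bytes) -> list:
    """Return the payloads of all packets in `stream` whose CRC checks out."""
    packets = []
    i = 0
    while i + 1 < len(stream):
        if stream[i] != SOF:
            i += 1
            continue
        length = stream[i + 1]
        end = i + 2 + length + 2
        if end <= len(stream):
            body = stream[i + 1 : i + 2 + length]
            crc = int.from_bytes(stream[i + 2 + length : end], "big")
            if binascii.crc_hqx(body, 0) == crc:
                packets.append(body[1:])
                i = end        # consume the whole packet
                continue
        i += 1                 # bad CRC or bogus length: resync one byte on
    return packets

first = build_packet(b"alt=120")
second = build_packet(b"hdg=270")
damaged = first[:4] + first[5:]  # one byte lost in the middle of a packet

print(extract_packets(first + second))
print(extract_packets(damaged + second))
```

In the second call the damaged packet fails its CRC and is dropped, while the scan continues byte by byte and should still pick up the following packet. A real receiver would work on a streaming buffer and keep incomplete tails for the next call, which this sketch leaves out.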

Toby Jaffey
AngryEE
0

Parity is generally useless with "normal" async-style communication, but it may be useful in some other contexts. For example, some differential coding schemes require that every character have an even number of line transitions so that the final state of the line will match the initial state. Code 39 barcodes use parity, sort of(*), to validate individual characters, and generally eliminate the need for a checksum character.

If single-bit errors were much more common than multi-bit errors or framing errors, per-byte parity checking could be a useful adjunct to a checksum or CRC, since locating a byte with an error would allow the error to be corrected. In most practical situations involving UARTs, however, many line glitches cause framing errors, making data recovery in many cases impossible with or without per-byte parity.

(*) A valid Code 39 character must have an odd number of wide spaces (1 or 3) and an even number of wide bars (0 or 2).

supercat
  • @supercat, many people use parity to just know that the byte is bad, not because they need to correct it. That is an ECC problem. – Kortuk Mar 05 '11 at 03:12
  • @supercat, by adding spacing so that every data word has an even number of transitions you have implemented parity; it will take at least a bit of wasted space to implement this. Barcodes are the end-all of redundancy though, using a very large portion of their space for redundancy. – Kortuk Mar 05 '11 at 03:18
  • @Kortuk: What advantage is there to knowing that e.g. the sixth byte in a message is bad, as opposed to merely knowing that something is bad, if one isn't going to attempt error correction? Sometimes it can be useful to have a protocol where each packet has a short checksum, and each group of packets has a long one. If a bad packet is caught, just that packet needs retransmission. If a bad packet gets through without a checksum error, the longer group checksum will show something's wrong, but it will be necessary to retransmit the whole group. – supercat Mar 05 '11 at 21:53
  • @Kortuk: I'm not sure that principle would apply to parity-checking individual bytes, though, unless one had a means of requesting that only specific bytes needed retransmission. There would be ways of handling that, but I can't think of any communications contexts where it would really help. – supercat Mar 05 '11 at 21:54
  • @supercat, an easy example: you put parity on the output from a keypad. If you get a bad byte you ignore it instead of using that number. I think there would be many, many situations where dropping a bad byte does no harm, all of them completely transparent to the user. You could even have a byte-by-byte acknowledgment. I can understand why this seems pointless to you, but if a byte coming in is a command, you might be better off without the command than flushing the buffer of data or running whatever other command was falsely called. – Kortuk Mar 06 '11 at 00:30
  • @Kortuk: It may be useful to use parity with a keypad, granted, though preferred behavior would be to request a retransmission rather than ignoring a keystroke, and single-bit parity is a little wimpy (too much likelihood of undetected error). – supercat Mar 06 '11 at 14:21
  • @supercat, I am not saying that your logic is not solid, I am trying to say that you are thinking of cases where single bit parity makes sense. In high error situations you may want a hamming code. How useful is a CRC in a higher error rate channel? It is not because you will end up having an error every time and just get stuck in a retransmit loop. ECC and error detection need to be chosen on a case by case basis. – Kortuk Mar 07 '11 at 02:29