How to interface to a PDM audio source?

Question

I hear about this technology, I read about it, I even see a bunch of MEMS microphones with PDM output. I can perfectly understand how it works (I have worked with DSP). However, I can't make heads or tails of how to deal with it in practice.

I expect that there would be Digital Signal Processors that have built-in (hardware-based) PDM inputs; I would even expect microcontrollers to have that (since MEMS microphones typically used for mobile devices seem to use PDM interface quite often).

However, when I search Digikey, not a single part lists PDM among its included features or interfaces (not a single DSP, not a single MCU). I did find a PDM-to-I²S chip (in an 8-pin BGA, *ugh*), but it doesn't make sense that that would be the only option; in that case, I might as well get a microphone with I²S output!

I think I understand enough of the math that I would be able to implement in software a bit-banged PDM decoder --- however, I also understand enough to know that that is unthinkable in practice.

I guess my concrete question for the audience is: what's up with that?

Pulse-density modulation, wouldn't you just need a lowpass filter to convert that to an analog signal? I'd say just sample it at a sufficiently high frequency and then do your processing on that. — Hearth, Oct 31 '18 at 23:18
I think in theory, yes --- but I don't think in practice it would work well, since you don't necessarily have precise/guaranteed values (actual analog voltage) for the 1's and 0's. Maybe if you buffer it first, and even then I expect that it would be hard. But either way, if I'm going to an MCU or DSP and I ultimately want the signal in digitized form, it seems gratuitous and inefficient to have to convert it back to analog, right? — Cal-linux, Oct 31 '18 at 23:22
I'm not suggesting you convert it back to analog, though I did word that a bit confusingly. What I'm saying is just sample the PDM output into memory and then post-process it. As long as you sample frequently enough that you don't miss (too many) transitions you should be fine. — Hearth, Oct 31 '18 at 23:25
I think what is being suggested is to oversample by a huge margin, like audio sigma-delta ADCs do. Sample at 1 MHz, then do a running average of 'n' samples. At some point it will be analog again even if you store it as digital files. An LPF is mandatory at some point. I do not know that you can get 130 dB dynamic range out of this setup though, even if the mic is rated for it. — , Oct 31 '18 at 23:38
Ok, but that is precisely what I was referring to by "a bit-banged PDM decoder" in software. However, the sampling rates I've seen are in the order of MHz, so doing that by software seems unthinkable. You mention sampling at 1MHz, in which case I'll be missing lots of samples and yes, the SNR will suffer (I think it will be far worse than "you won't get 130dB dynamic range", though). But even then, having to do operations every microsecond seems heavy, no? (I know it's just one IO read, one addition, one subtraction, and one pointer adjustment for a running average --- still) — Cal-linux, Nov 01 '18 at 11:47

il--ya · Answer 1 · 2019-04-05T14:00:37.437

To avoid bit-banging, I think you can use SPI input to capture PDM data, but that would only work for one channel (mono input). This Texas Instruments app note describes something along these lines.

After a brief search I found these off-the-shelf products that support PDM:

Codecs: Cirrus CS53L30, Maxim MAX9888, TI's programmable TLV320AIC3253

Micros with integrated PDM inputs: Silabs Giant Gecko range, Maxim MAX32666

Analog Devices PDM to I2S converter

Those are just a few examples; I'm sure there are plenty of other products.

score 1 · Answer 2 · answered Sep 27 '21 at 18:57

I'm still trying to understand this as well.

The STM32 processors seem to have PDM hardware processing. I think the ESP32 also has it. So if you use the right processor, the hardware will solve this for you. And there are I2S microphones with the conversion from PDM to PCM built into the microphone itself. But if we don't have a processor with the needed hardware support, are there software options for slower microcontrollers?

Here's an app note from ST that has lots of good data about PDM microphones and how to use them with their processors:

Interfacing PDM digital microphones using STM32 MCUs and MPUs

They also have a software package that converts raw single-bit PDM data streams into PCM format so it seems they either have very fast processors or very clever software. Or maybe both. I have not found any reference to the limits of real-time processing by this software so it seems the software can keep up even at the high speeds PDM microphones send data (up to around 4 MHz).

My limited understanding of the conversion from the high-speed 1-bit PDM format to a conventional lower-rate multi-bit PCM format is this: the 1-bit data is run through a digital low-pass filter that operates at the required output resolution (say 16 bits), producing a high-speed 16-bit data stream. That stream is then "decimated", meaning you simply keep one out of every N samples and ignore the rest, to end up with (for example) a 16 kHz sample rate at 16 bits per sample. The filtering is required to prevent high-frequency content from aliasing down into the low frequencies.
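As a rough illustration of "low-pass filter, then decimate", here is a minimal sketch that uses the crudest filter there is: a plain average (boxcar) over each block of 64 PDM bits. This is not what ST's library or any particular chip does; the function names and the decimation factor of 64 are assumptions chosen for the example.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

/* Count the 1 bits in a 64-bit word (portable, no compiler builtins). */
static unsigned popcount64(uint64_t x)
{
    unsigned n = 0;
    while (x) {
        x &= x - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/*
 * Boxcar "filter + decimate by 64": each 64-bit block of PDM data becomes
 * one signed 16-bit sample. The density of ones maps linearly onto the
 * output range: 0 ones -> -32768, 32 ones -> 0, 64 ones -> 32767 (clamped).
 */
static void boxcar_decimate64(const uint64_t *blocks, size_t n_blocks,
                              int16_t *out)
{
    assert(blocks != NULL && out != NULL);
    for (size_t i = 0; i < n_blocks; i++) {
        int32_t ones = (int32_t)popcount64(blocks[i]);   /* 0..64 */
        int32_t v = (ones - 32) * 1024;                  /* -32768..32768 */
        if (v > INT16_MAX) {
            v = INT16_MAX;                               /* all-ones case */
        }
        out[i] = (int16_t)v;
    }
}
```

A boxcar only resolves 65 distinct levels per block and has poor stop-band rejection, which is why real decoders use a better-shaped filter; the structure (weight the bits, sum, emit one sample per block) stays the same.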

Though the above logic can be fairly simple using a recursive low-pass filter, I still don't get how these processors can run fast enough to keep up with a real-time 4 MHz sample rate.

OK, I just found this Adafruit code that gives me some real insight into how this can be done. (Thank you, Adafruit.)

Adafruit_ZeroPDM/examples/pdm_analogout_dma/pdm_analogout_dma.ino

She's using a 64-sample windowed-sinc low-pass digital filter, which requires a sum-product of 64 input samples to compute each output sample. That seems like way too much math to do for each bit of the PDM input, but there's an obvious trick here that I hadn't understood. Because it's an FIR filter and not a recursive filter, each output depends only on input samples, not on earlier outputs, so you don't need to compute the outputs you will be throwing away. The math is only done for each output sample you keep.
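To convince myself of the "skip the discarded outputs" point, here is a small sketch comparing the two approaches on the same data. The 8-tap filter, the decimation factor of 8, and all function names are hypothetical and only keep the example short; the input is stored one bit per byte for readability.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

#define TAPS  8   /* hypothetical short filter, for illustration only */
#define DECIM 8   /* keep one output out of every DECIM */

/* One FIR output: dot product of the TAPS coefficients with the TAPS
 * most recent input bits, ending at index `end` (inclusive). */
static int32_t fir_at(const uint8_t *bits, size_t end, const int16_t *coeff)
{
    int32_t acc = 0;
    for (size_t k = 0; k < TAPS; k++) {
        acc += coeff[k] * bits[end - k];
    }
    return acc;
}

/* Wasteful: compute an output for every input bit, keep every DECIM-th. */
static size_t filter_then_discard(const uint8_t *bits, size_t n,
                                  const int16_t *coeff, int32_t *out)
{
    assert(n >= TAPS);
    size_t n_out = 0;
    for (size_t i = TAPS - 1; i < n; i++) {
        int32_t y = fir_at(bits, i, coeff);
        if ((i + 1) % DECIM == 0) {
            out[n_out++] = y;          /* everything else is thrown away */
        }
    }
    return n_out;
}

/* Efficient: evaluate the filter only at the positions that are kept. */
static size_t filter_only_kept(const uint8_t *bits, size_t n,
                               const int16_t *coeff, int32_t *out)
{
    assert(n >= TAPS);
    size_t n_out = 0;
    for (size_t i = DECIM - 1; i < n; i += DECIM) {
        if (i < TAPS - 1) {
            continue;                  /* not enough input history yet */
        }
        out[n_out++] = fir_at(bits, i, coeff);
    }
    return n_out;
}
```

Both functions return identical samples, but the second does 1/DECIM of the work. A recursive (IIR) filter cannot take this shortcut directly, because each output feeds into the next one.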

The code above is configured to produce 16-bit resolution output values, at a rate of 16,000 samples a second. It runs the PDM microphone at 64 times that frequency, which is 1.024 MHz. So she's using a 64 coefficient size filter, to convert each block of 64 PDM bits, to one 16 bit output sample.

The code uses an I2S interface that reads the bits in 16-bit blocks and uses DMA to put them into memory for you (no I/O reads in a loop needed). She reads four 16-bit words to get 64 bits of PDM data, then converts them to one 16-bit output value with a sum-product of those bits against the constants in her filter. Because the filter length matches the decimation factor, everything lines up nicely: every 64 input bits produce one 16-bit output sample.

Normally, a filter like this would be done in floating point to reduce accumulated rounding error, but she's using integers here. I don't know enough about this to understand the possible loss of dynamic range this creates, but since 16 bits is already a 96 dB resolution and the input samples are just 0 or 1, there's no loss of precision from the input samples. And the accumulated error is only for the (at most) 64 additions for each output sample, not from sample to sample. So the loss of resolution is worst case 64 * 0.5, or ±32, for each 16-bit sample, I guess? That reduces the resolution from 16 bits to 12 bits worst case? So the dynamic range might be 12 bits instead of 16 because of the use of 16-bit integers? Which means about 70 dB instead of 96 dB? And if the coefficients were rounded to integers in a more clever way, the worst-case error accumulation might be less, so maybe an 80 dB result? (I'm no expert on any of this.)
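For reference, the bits-to-decibels conversion used in these guesses is the usual rule of thumb of roughly 6 dB per bit. This is only the ideal quantization figure, not a measurement of the Adafruit code:

```latex
\mathrm{DR}(N) \approx 20\log_{10}\left(2^{N}\right) \approx 6.02\,N\ \mathrm{dB}
\qquad\Rightarrow\qquad
\mathrm{DR}(16) \approx 96.3\ \mathrm{dB},\quad
\mathrm{DR}(12) \approx 72.2\ \mathrm{dB},\quad
\mathrm{DR}(11) \approx 66.2\ \mathrm{dB}
```

A worst-case error of ±32 counts is 2^5, which by the same rule would leave about 16 - 5 = 11 clean bits, a little more pessimistic than the 12-bit guess above. Treat both as rough bounds rather than measured figures.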

So since the sum-product is just multiplying the coefficients by 1 or 0, all she had to code was a test if a bit was turned on, and add the matching coefficient to a running sum for the bits that are on.

So the code reads 64 bits from the PDM mic, then for each bit, she adds the corresponding filter value or not. The sum of these values is the 16-bit output sample.

So the only processing required per PDM bit is a one-bit test (sample & 0x01), one 16-bit add through a pointer (runningsum += *sinc_ptr), which only happens for about half the bits, then a bit shift (sample >>= 1) and one pointer increment (sinc_ptr++). In other words, this code is repeated 64 times for each block of 64 PDM bits read into memory:

if (sample & 0x1) {
    runningsum += *sinc_ptr;
}
sinc_ptr++;
sample >>= 1;

So if your processor can read PDM data and run this much code for each bit, you can process the PDM data in software in real-time to convert it to 16-bit samples. But what you do with the output data at that rate, is another issue.
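Putting the pieces together, a self-contained version of that inner loop might look like the sketch below. The function name, the macro names, and the assumption that the earliest PDM bit sits in the least significant bit of each word are mine; the coefficient table is passed in by the caller and is not Adafruit's actual filter.

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

#define PDM_BITS_PER_SAMPLE 64
#define WORDS_PER_SAMPLE    4    /* 4 x 16-bit words = 64 PDM bits */

/*
 * Convert one block of 64 PDM bits (four 16-bit words, as a DMA buffer
 * might hold them) into one output sample by adding the filter
 * coefficient for every bit that is set.
 * Assumption: bit 0 of each word is the earliest bit in time.
 */
static uint16_t pdm_block_to_sample(const uint16_t words[WORDS_PER_SAMPLE],
                                    const uint16_t coeff[PDM_BITS_PER_SAMPLE])
{
    assert(words != NULL && coeff != NULL);

    uint32_t runningsum = 0;
    const uint16_t *sinc_ptr = coeff;

    for (size_t w = 0; w < WORDS_PER_SAMPLE; w++) {
        uint16_t sample = words[w];
        for (int b = 0; b < 16; b++) {
            if (sample & 0x1) {
                runningsum += *sinc_ptr;   /* "multiply" by 1 */
            }                              /* a 0 bit contributes nothing */
            sinc_ptr++;
            sample >>= 1;
        }
    }
    /* Fits in 16 bits as long as the coefficients sum to <= 65535. */
    return (uint16_t)runningsum;
}
```

Scaling the coefficients so that they sum to at most 65535 means an all-ones block maps to full scale without overflow.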

I see there's also the possibility to trade off memory for CPU by using lookup tables. For example, each 8-bit byte convolution with the filter could be pre-computed and turned into a lookup table with 256 entries so the conversion of 64 input bits to a 16-bit output could be done with 8 table lookups summed together. It requires 8 different tables each with 256 16-bit entries so it's 4K of lookup tables but I would expect it to use 1/4 the CPU as her code. So if you are short on CPU but have the memory to waste, that could be an option.
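A minimal sketch of that lookup-table idea, under the same assumptions as before (64 coefficients supplied by the caller, earliest bit in the least significant position); the names `build_lut` and `pdm_block_to_sample_lut` are hypothetical:

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

#define N_BYTES 8   /* 64 PDM bits = 8 bytes */

/* lut[j][v] = partial sum contributed by byte j of a block when that
 * byte has the value v. 8 tables x 256 entries x 2 bytes = 4096 bytes. */
static uint16_t lut[N_BYTES][256];

/* Precompute the tables once from the 64 filter coefficients. */
static void build_lut(const uint16_t coeff[64])
{
    assert(coeff != NULL);
    for (size_t j = 0; j < N_BYTES; j++) {
        for (unsigned v = 0; v < 256; v++) {
            uint32_t sum = 0;
            for (unsigned b = 0; b < 8; b++) {
                if (v & (1u << b)) {
                    sum += coeff[8 * j + b];
                }
            }
            lut[j][v] = (uint16_t)sum;
        }
    }
}

/* One output sample per 64-bit block: eight lookups and eight adds. */
static uint16_t pdm_block_to_sample_lut(const uint8_t bytes[N_BYTES])
{
    assert(bytes != NULL);
    uint32_t sum = 0;
    for (size_t j = 0; j < N_BYTES; j++) {
        sum += lut[j][bytes[j]];
    }
    return (uint16_t)sum;
}
```

The result is bit-for-bit the same as the per-bit loop; how much CPU it actually saves depends on memory speed and caching on the target, so the 1/4 figure should be measured rather than assumed.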

In the comments above, the idea of averaging the bits was mentioned. This "correct" low-pass filter code is not any harder than a running average, but should get accurate 16-bit results. It will still use up a lot of CPU, even on a fast processor, to do the conversion in real time at 1 MHz per bit. I would say that if you want to do anything with the data other than saving a limited sample to memory, you need hardware to do this conversion for you (or you need to dedicate an entire small microcontroller to the job).
