How to get fundamental frequency of a signal using autocorrelation?

Question

I am trying to get the fundamental frequency of a signal that only has a single pitch. I coded out the autocorrelation function using FFT and already got the autocorrelation result. Unfortunately, I don't know how to get the fundamental frequency from the autocorrelation result. Can someone help me? My code is below:

public double getPitch(double[] buffer, int firstSample, int lastSample, double sampleRate)
{
    int lengthOfFFTWindow=lastSample-firstSample;
    double[] input_buffer=new double[lengthOfFFTWindow];
    DoubleFFT_1D fft = new DoubleFFT_1D(lengthOfFFTWindow);
    double[] autocorrelation_values=new double[lengthOfFFTWindow];
    double[] fftData = new double[lengthOfFFTWindow * 2];
    double max=-1;
    double max_i=-1;
    //FFT on each sample in each window
    for (int i = 0; i < lengthOfFFTWindow; i++) {
        // copying audio data to the fft data buffer, imaginary part is 0
        fftData[2 * i] = buffer[i+firstSample];
        fftData[2 * i + 1] = 0;
    }
    fft.complexForward(fftData);
    for (int i = 0,j=0; i < fftData.length; i += 2,j++) {
        // complex numbers -> vectors, so we compute the length of the vector, which is sqrt(realpart^2+imaginarypart^2)
        autocorrelation_values[j] = Math.sqrt((fftData[i] * fftData[i]) + (fftData[i + 1] * fftData[i + 1]));
    }
    fft.complexInverse(fftData, false);
    for(int i=0;i<fftData.length;i++)
    {
        if(max<fftData[i])
        {
            max=fftData[i];
            max_i=i;
        }
    }
    return (max_i * 44100 )/ lengthOfFFTWindow;
}

Is it correct to return the maximum autocorrelation value divide by 2 as the fundamental frequency? I keep getting wrong answers when I do that.

EDIT: An example of the single pitch test file: http://dl.dropbox.com/u/24274475/testfile_A4.wav

I suggest dsp.stackexchange.com would be a good place for your questions — geometrikal, Mar 30 '13 at 12:48
You now say this is a recording of a violin note. That makes it misleading to say it has a "single pitch". It certainly will have harmonics. Can you provide a link to a WAV file or something? Then we can see what the raw input really looks like. — Olin Lathrop, Mar 30 '13 at 15:35
@geometrikal generally you should flag it for migration, not suggest the site, as it leads to issues with crossposting. — Kortuk, Apr 04 '13 at 16:05

Olin Lathrop · Answer 1 · 2013-03-31T12:31:34.857

Autocorrellation is one way to help find the dominant frequency of a signal, but I don't see what a FFT has to do with that. The autocorrellation will produce peaks with the period of any strong frequency components. If you then take the FFT of that to find the frequency of those peaks, you might as well take the FFT of the original signal in the first place.

Instead of showing us code, show us the data at various stages of your process. The details of the code are your issue, and are separate from the conceptual processes of going thru the various convolutions, filters, or whatever.

You say your signal only has a single pitch, meaning it's a pure sine wave. In that case, I really don't see the advantage of a autocorrellation pass. You can find the period directly by looking at the time between zero crossings.

In the past I've had to find the fundamental frequency of mostly repeating signals with significant noise on them. What I usually did was apply several stages of low pass filtering. One big advantage of digital filtering is that smaller signals out of a filter don't mean less signal to noise ratio as long as you keep adding the necessary bits at the low end. Using floating point, for example, does this automatically. You can then aggressively low pass filter a signal such that it would be only µV in analog, but still have the same meaningful bits left at the end.

Each LPF pass attenuates the harmonics relative to the fundamental. After enough passes, you are left with mostly the fundamental. Once you have attenuated the harmonics enough to guarantee only two zero crossings per cycle, you look at the zero crossing period, perhaps apply a little low pass filtering to successive ones, and infer the frequency from there.

Added:

Now that you have provided some data we can see what is really going on:

It looks like you have about a 440 Hz signal, but this is clearly far from a "single pitch" as the shape is far from a sine. Just from inspection we can see that the second harmonic is particularly strong. It may be so strong that this "note" is perceived to be 880 Hz instead of the fundamental of 440 Hz.

In this case, what is it you want the answer to be, 440 Hz or 880 Hz? With enough low pass filtering, eventually you get mostly the fundamental and measuring 440 Hz shouldn't be that hard. If you want the answer to be the possibly perceptual tone of 880 Hz, then things get a lot more complicated. One possibility would be to identify the fundamental in all cases. Once you have that, it's easy to find the relative amplitude of the first few harmonics. Then you can decide based on the strength of those harmonics whether you want to report one of them or the fundamental.

My signal only has 1 pitch (but also has multiple frequencies). It is not a pure sine wave. I recorded a music note by playing on the violin. — Sakura, Mar 30 '13 at 15:27
@Sakura: Then saying it has a "single pitch" is misleading. If your signal is not a pure sine, then it has harmonics by definition. In any case, try the LPF method. It shouldn't take too many passes to reduce the harmonics such that you reliably get just two zero crossings per cycle. — Olin Lathrop, Mar 30 '13 at 15:33
The note is about A=440Hz. A violin can't produce 44Hz. The second harmonic is normally stronger than the first in many musical instruments including the violin. This won't cause misperception of the pitch, due to the harmonic structure. — user207421, Mar 30 '13 at 23:31
@EJP: Doh! I don't know how I manged to slip a decimal point like that. You are right, the graph clearly shows 440 Hz fundamental, not 44 Hz. I have edited the answer to fix that. — Olin Lathrop, Mar 31 '13 at 12:32
The 'possible perceptual tone' is still 440Hz. The ear does an FFT itself and would perceive the 880 and 1320 harmonics as separate tones if the 440Hz was suppressed enough, not as oart of a 'perceptual' 880Hz. — user207421, Apr 01 '13 at 01:45

The Photon · Answer 2 · 2013-03-30T16:18:56.850

3

I see a couple of things that might be problems in your code.

First, to use the FFT to calculate an autocorrelation, there are three steps:

transform the input signal, \$F_R(f) = \mathrm{FFT}(x[t])\$
compute the power spectrum, \$S(f) = F_R(f) F_R^*(f)\$
inverse transform to get the autocorrelation, \$R[\tau]=\mathrm{IFFT}(S(f))\$

I see you doing step 1 and step 2, but then you do something completely different in step 3.

Note that if you did take the autocorrelation, the peaks would indicate the period of your signal, not the frequency.

What your code actually seems to be trying to do is take the FFT, then search the bins for the highest peak, and take that as the frequency of the signal. You also calculate the IFFT, but then throw away the result of that calculation.

A peak-search in the power spectrum can be used to estimate the signal frequency, but there you have another, more basic bug:

for(int i=0;i<autocorrelation_values.length;i++)
{
    if(max<autocorrelation_values[i])
    {
        max=autocorrelation_values[i];
        max_i=i;
    }
}
return max/2;

Your max is storing the maximum amplitude of the power spectrum values (which you confusingly named autocorrelation_values). max_i is storing the bin number where the maximum amplitude was found. You should be returning something based on max_i rather than max. You need to use the sampling rate and number of samples used in the FFT to scale the bin number into an actual frequency.

Edit I'd also recommend only searching the first half of the power spectrum for peaks.

for(int i=0; i < 0.5 * autocorrelation_values.length; i++)
   ...

The higher bins correspond to an alias of the spectrum in the negative frequencies, so don't contain any new information.

edited Mar 30 '13 at 16:18

answered Mar 30 '13 at 15:47

The Photon

126,425
3
159
304

Thank you for your suggestions! In that case, should I return samplingrate/(max_i*2)? – Sakura Mar 30 '13 at 16:00
@Sakura, I think it should be `max_i * samplingrate / lenthOfFFTWindow`. I don't think you need the factor of 2. Once you get close to the right result you should be able to see immediately if you're off by 2x though. – The Photon Mar 30 '13 at 16:04
I have updated the code accordingly but I keep getting 18935Hz instead of 440Hz. It's more than twice greater. Did I do something wrong? – Sakura Mar 30 '13 at 16:09
If you're testing on the wav file you linked, it doesn't sound at all like a single tone to me. Maybe your data doesn't match your assumptions? Try plotting the power spectrum (the array you call `autocorrelation_values`) and see if the highest peak is really where you think it is. – The Photon Mar 30 '13 at 16:14
I plot the spectrum using Audacity and get the fundamental frequency (440Hz) as one of the peaks. – Sakura Mar 30 '13 at 16:17
Is it the *highest* peak? Because your code finds the highest peak only. – The Photon Mar 30 '13 at 16:19
It's the second highest peak. The highest peak is 880Hz instead of 440Hz. – Sakura Mar 30 '13 at 16:22
Next thing I'd do is output and plot your `autocorrelation_values` array ... make sure it is what you think it is...Audacity might be playing a lot of tricks to make a nice spectrum display, not trying to be mathematically correct. – The Photon Mar 30 '13 at 16:46

score 0 · Answer 3 · edited May 07 '14 at 17:30

I want to find fundametal frequency with autocorrelation method. I read your code but I think that the last part of code could be like this:

int max=0; 

for (int i=0; i < autocorrelation_value.length; i++) {
  if (autocorrelation_value[i]>autocorrelation_value[max]) {
    max = i;
  }
  return 1/autocorrelation_value[max]
}

because the maximum of autocorrelation is the fundamental period and so in the "return" I calculate the inverse of autocorrelation's max.

And after I don't understand what means "max_i".

score 0 · Answer 4 · answered Mar 30 '13 at 13:03

0

Why don't you do it the old tried and tested way by using a time-delayed version of the signal. Multiply the signal by delayed version of itself and store the result. For a given window in time containing (say) 1000 samples, you'll need to store the biggest result of the individual multiplications.

Then re-test with a range of different delays. The result that is numerically the greatest corresponds with the period of the signal. Be aware that once the delay has found the "period" of the signal, it will also find an "image" period at twice the delay (because the signal peaks coincide once more).

Alternatively use a zero-cross detector and measure the time. You said the signal has only a single pitch so this should be pefectly ok - watch out for zero-crossing noise by averaging a few cycles - the more the better.

answered Mar 30 '13 at 13:03

Andy aka

434,556
28
351
777

How is what you describe in the first two paragraphs not a autocorrellation? – Olin Lathrop Mar 30 '13 at 13:06
@Olin Lathrop - It is auto correlation - I was saying it so he could design his own algorithm if he wanted. – Andy aka Mar 30 '13 at 13:09
I took the question to mean he already knows what autocorrellation is, but doesn't know how to apply the result or use it in the overall goal of determining the fundamental frequency. – Olin Lathrop Mar 30 '13 at 13:19
@OlinLathrop - I didn't understand his "reliance" on FFT so I assumed he didn't know what the ins and outs were and thought I'd spell it out a bit. – Andy aka Mar 30 '13 at 13:22
@OlinLathrop - I see you might have thought the same. – Andy aka Mar 30 '13 at 13:23
Yeah, I don't get what FFT has to do with anything here if trying to use the autocorrellation method. – Olin Lathrop Mar 30 '13 at 13:41
Hi. I used FFT to calculate autocorrelation so that the speed of the algorithm will be faster. By multiplying the signal with its time-shift signal, the speed will be much slower. – Sakura Mar 30 '13 at 15:25

How to get fundamental frequency of a signal using autocorrelation?

4 Answers4

Added: