By "averaging", you apparently really mean low pass filtering. Averaging, or more specifically box filtering, is a form of low pass filter, but not a particularly good one for most uses.
Yes, low pass filtering to reduce the high frequencies you know aren't relevant increases the signal to noise ratio. You can look at this in the frequency domain and try to eliminate everything above the highest frequency you care about. Or, you can look at this in the time domain and throw as much filtering as possible at the signal while making sure the step response still settles sufficiently within the duration of the shortest possible level (one bit time).
True averaging is really only useful if you know when the bit times are. Then you can average with as many samples as you can manage during one bit time. The average is then the best indication of the measured bit level. Put another way, this is really synchronous averaging, and can be quite effective.
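As a rough illustration of synchronous averaging, here is a minimal Python sketch. It assumes the first sample lines up with a bit boundary and that every bit spans exactly `samples_per_bit` samples. The function name, the 0.5 slicing threshold, and the noise level are made up for the example.

```python
import random

def synchronous_average(samples, samples_per_bit, threshold=0.5):
    """Average each bit period, then slice the average against a threshold.

    Assumes samples[0] is the first sample of a bit and that every bit
    spans exactly samples_per_bit samples (assumptions of this sketch).
    """
    bits = []
    for start in range(0, len(samples) - samples_per_bit + 1, samples_per_bit):
        window = samples[start:start + samples_per_bit]
        mean = sum(window) / samples_per_bit
        bits.append(1 if mean > threshold else 0)
    return bits

# Hypothetical test signal: levels 0 and 1 with Gaussian noise added.
random.seed(1)
sent = [1, 0, 1, 1, 0, 0, 1, 0]
samples_per_bit = 64
noisy = [bit + random.gauss(0.0, 0.5)
         for bit in sent
         for _ in range(samples_per_bit)]

decoded = synchronous_average(noisy, samples_per_bit)
print(decoded)
```

Averaging N samples of uncorrelated noise reduces the noise standard deviation by a factor of sqrt(N), so 64 samples per bit buys a factor of 8 here. That only works because the averaging window is lined up with the bit.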
Unsynchronized averaging is just silly. If you don't know where the bit boundaries are, then an IIR (recursive, difference-equation based) low pass filter will be better.
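One common IIR choice is a single-pole low pass filter, sketched below in Python. This is only one possible filter, not necessarily the best for a given signal. The names `iir_low_pass`, `step_settle_fraction`, and `ff` (the filter fraction) are assumptions for the example. The second function ties back to the time-domain criterion above: it tells you how far a step has settled after a given number of samples, so you can pick the heaviest filtering that still settles within one bit time.

```python
def iir_low_pass(samples, ff, initial=0.0):
    """Single-pole low pass filter: filt += ff * (new - filt).

    ff is the filter fraction, 0 < ff <= 1. Smaller values filter more
    heavily but respond more slowly.
    """
    filt = initial
    out = []
    for x in samples:
        filt += ff * (x - filt)
        out.append(filt)
    return out

def step_settle_fraction(ff, n):
    """Fraction of a unit step reached after n samples, starting from 0."""
    return 1.0 - (1.0 - ff) ** n

# Hypothetical case: the shortest level lasts 16 samples.
samples_per_bit = 16
ff = 0.25
step = iir_low_pass([1.0] * samples_per_bit, ff)
print(step[-1], step_settle_fraction(ff, samples_per_bit))
```

With these example numbers the filter gets to within about 1% of the final value by the end of the shortest level. If that is more settling than you need, lower `ff` for more noise reduction.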