The three headphone buttons you referred to do not pass information digitally. Each press is signaled electrically by a change of resistance, as explained here: How do volume control headphones work?
In the case of the headphone buttons, you can say that information about pressed buttons is passed "out-of-band": it doesn't occupy the audio input or output channels (except maybe for the moments during which a button is held down).
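To illustrate the resistance-based signaling described above, here is a minimal sketch that treats the button as one leg of a voltage divider on the microphone line. All component values, button names and the tolerance are made up for illustration, and the model ignores the microphone's own impedance, so treat it as a sketch of the idea rather than of any real headset.

```python
# Hypothetical values for illustration only; real headsets and phones differ.
V_BIAS = 2.0      # volts, assumed microphone bias voltage
R_BIAS = 2200.0   # ohms, assumed bias resistor inside the phone
BUTTONS = {       # assumed nominal resistance to ground for each button, in ohms
    "play_pause": 0.0,
    "volume_up": 220.0,
    "volume_down": 470.0,
}
TOLERANCE = 60.0  # ohms, assumed matching window around each nominal value


def line_voltage(r_button):
    """Voltage on the mic line while a button of resistance r_button is pressed."""
    return V_BIAS * r_button / (R_BIAS + r_button)


def decode_button(v_measured):
    """Map a measured line voltage back to a button name, or None if nothing matches."""
    if v_measured >= V_BIAS:
        return None  # line sits at the bias voltage, so no button is pulling it down
    # Invert the divider equation to estimate the resistance to ground.
    r_est = R_BIAS * v_measured / (V_BIAS - v_measured)
    # Pick the button whose nominal resistance is closest to the estimate.
    name, r_nominal = min(BUTTONS.items(), key=lambda kv: abs(kv[1] - r_est))
    return name if abs(r_nominal - r_est) <= TOLERANCE else None
```

The phone only has to watch a slowly changing DC level, which is why this signaling does not need any of the audio bandwidth.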
The answer by "cagrigurleyuk" above is incorrect, in addition to being incomplete.
Yes, the Shannon-Hartley theorem places an upper bound on the bit rate that can be transferred without error within a given bandwidth, at a given SNR.
The quantization noise for a digitized channel is modeled as an SNR of \$2^Q\$ (where \$Q\$ is the number of bits of resolution), in which case \$B\cdot\log_2 \left( 1+\mathrm{SNR} \right)\approx B\cdot Q\$. For the numbers mentioned above (20 kHz bandwidth and 12 bits), the theoretical upper bound would therefore be 20 kHz × 12 = 240 kbps (not 500 kbps).
In reality, you have a bit more bandwidth (up to 24 kHz, for a sample rate of 48 kHz) and less quantization noise (as the maximal resolution is often 16 bits), but you also need to take into account other noise sources on the line that lower the total SNR (the worst of them probably being the amplifier).
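The arithmetic above can be checked with a few lines of Python. This is only a sketch: it takes the SNR = 2^Q quantization model used in this answer as given (a different noise model would change the numbers), and the function names are my own.

```python
import math


def shannon_capacity_bps(bandwidth_hz, snr_linear):
    """Shannon-Hartley upper bound C = B * log2(1 + SNR), in bits per second."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)


def quantization_limited_capacity_bps(bandwidth_hz, bits):
    """Upper bound when quantization is the only noise, assuming SNR = 2**bits."""
    return shannon_capacity_bps(bandwidth_hz, 2.0 ** bits)


# 20 kHz bandwidth at 12 bits: about 240 kbps, matching B * Q.
c_12bit = quantization_limited_capacity_bps(20_000, 12)

# 24 kHz bandwidth at 16 bits: about 384 kbps, before any other noise is counted.
c_16bit = quantization_limited_capacity_bps(24_000, 16)
```

Any extra noise on the line (the amplifier, for instance) enters through a smaller `snr_linear` passed to `shannon_capacity_bps`, which pulls the bound down from these quantization-only figures.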
Finally, the number that comes out of the Shannon-Hartley theorem is, again, just a theoretical upper bound.
The actual bitrate you can achieve depends on the modulation and encoding schemes you use, which in turn depend on your program's efficiency and on the processing power available to you (because everything has to run in real time). Both are unknown to us, which is why we can't fully answer this question. Also unknown are the capabilities of your buttons, or transmitters.
I can give you some pointers, though:
- With a properly designed communication protocol, the buttons probably need not send data while they are not being pressed, so you are theoretically limited by the number of buttons pressed at once, rather than by the total number of buttons. E.g., your average keyboard has hundred-something keys, but only uses a transfer rate of around 12 kbps.
- The spectral efficiency (a figure of how many bps of throughput you can cram into each Hz of bandwidth) of various digital modulation (or keying) methods is listed in a table on page 17 of this PDF file, for your reference. The higher the spectral efficiency, the more complex the algorithm, and accordingly the more processing power it requires to run in real time.
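The two pointers above can be combined into a rough budget: first estimate the bit rate of an event-driven button protocol, then divide by the spectral efficiency to see how much bandwidth it needs. The sketch below is only illustrative. The event rate and the 8-bit event format are hypothetical, and the `log2(M)` efficiency is the textbook ideal for M-PSK or M-QAM with ideal Nyquist filtering, not a value taken from the PDF table.

```python
import math


def event_bit_rate_bps(events_per_second, bits_per_event):
    """Bit rate when data is sent only on button press/release events."""
    return events_per_second * bits_per_event


def ideal_spectral_efficiency(m):
    """Idealized bps/Hz of an M-ary scheme (M-PSK or M-QAM): log2(M)."""
    return math.log2(m)


def required_bandwidth_hz(bit_rate_bps, spectral_efficiency_bps_per_hz):
    """Bandwidth needed to carry bit_rate_bps at the given spectral efficiency."""
    return bit_rate_bps / spectral_efficiency_bps_per_hz


# Hypothetical protocol: 8 bits per event (button id, press/release flag, framing),
# with at most 100 events per second across all buttons.
rate = event_bit_rate_bps(events_per_second=100, bits_per_event=8)

# Hypothetical choice of QPSK (M = 4), ideally 2 bps/Hz.
bandwidth = required_bandwidth_hz(rate, ideal_spectral_efficiency(4))
```

With these made-up numbers the link needs 800 bps and about 400 Hz, a small fraction of the audio band. A real modem would need more than the ideal figure, and a denser constellation would trade bandwidth for processing effort and noise margin.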