All you need is a capacitor sized so that the time constant T = RC exceeds the bounce time, using the large internal pull-up as R.
The contact closure discharges the cap quickly (a low-RC attack time), and on release the cap recharges slowly through the pull-up, so it acts as a sample-and-hold for as many milliseconds as you need to cover the transition time.
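As a rough sizing sketch in Python: the 40 kΩ pull-up and 5 ms bounce time below are assumed example values (check your MCU datasheet and your switch), and the `margin` factor is my own safety allowance, not a rule.

```python
def min_debounce_cap(r_pullup_ohms, bounce_time_s, margin=2.0):
    """Capacitance in farads such that R*C = margin * bounce time."""
    return margin * bounce_time_s / r_pullup_ohms

# Assumed values: 40 kOhm internal pull-up, 5 ms worst-case bounce.
c = min_debounce_cap(40e3, 5e-3)
print(f"{c * 1e6:.2f} uF")  # 0.25 uF
```

Internal pull-ups have a wide tolerance, so size against the lowest pull-up value the datasheet allows.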
Logic or software can do the same thing using edge detection plus a timeout.
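A minimal software sketch of that idea, mimicking the fast attack and slow release of the RC version. The class name `HoldDebouncer` and the millisecond timestamps are my own assumptions for illustration; on real hardware you would call `update()` from a timer tick or polling loop.

```python
class HoldDebouncer:
    """Assert immediately on contact, release only after a quiet timeout."""

    def __init__(self, hold_ms):
        self.hold_ms = hold_ms          # must exceed the worst-case bounce time
        self.pressed = False
        self._last_active_ms = None

    def update(self, raw_active, now_ms):
        if raw_active:
            # Fast attack: any closed-contact sample asserts the output
            # and restarts the release timeout.
            self.pressed = True
            self._last_active_ms = now_ms
        elif self.pressed and now_ms - self._last_active_ms >= self.hold_ms:
            # Slow release: contact has been open for the whole timeout.
            self.pressed = False
        return self.pressed


deb = HoldDebouncer(hold_ms=10)
# (time in ms, raw contact state) with bounce at 3 ms and 4 ms
for t, raw in [(0, True), (3, False), (4, True), (13, False), (14, False)]:
    print(t, deb.update(raw, t))
```

The output stays asserted through the bounce and drops only once the contact has been open for the full 10 ms.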
After new info
But if you introduce a series string of binary-weighted resistors, the MSB resistor's value will swamp the values of all the others.
The only solution then is to use software averaging and choose analog thresholds to discriminate which combination of switches is active. You then have to deglitch, either by averaging or by waiting for successive values that stay within a window of 5% of full scale for longer than the maximum transition time.
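A sketch of the window-based deglitch followed by a threshold decode. Everything specific here is assumed for illustration: a 10-bit ADC (full scale 1023), three consecutive samples standing in for "longer than the transition time", and the `LEVELS` table, whose numbers are made up and would have to come from your actual resistor values.

```python
class StableWindow:
    """Report a reading only after enough samples stay inside the window."""

    def __init__(self, full_scale, min_samples, window_frac=0.05):
        self.window = window_frac * full_scale
        self.min_samples = min_samples
        self._run = []

    def update(self, sample):
        """Feed one ADC sample; return the run average once stable, else None."""
        self._run.append(sample)
        if max(self._run) - min(self._run) > self.window:
            # Spread exceeds the window: restart the run from this sample.
            self._run = [sample]
        if len(self._run) >= self.min_samples:
            self._run = self._run[-self.min_samples:]
            return sum(self._run) / len(self._run)
        return None


def decode(value, levels):
    """Return the switch combination whose expected level is nearest."""
    return min(levels, key=lambda combo: abs(levels[combo] - value))


# Hypothetical expected ADC counts per switch combination (bit mask -> counts).
LEVELS = {0b00: 1023, 0b01: 700, 0b10: 400, 0b11: 250}

win = StableWindow(full_scale=1023, min_samples=3)
for sample in [1023, 600, 690, 705, 700]:
    stable = win.update(sample)
    if stable is not None:
        print(f"{stable:.1f} -> switches {decode(stable, LEVELS):02b}")
```

Set `min_samples` from your sample rate so the run spans more than the longest transition you expect, and keep the expected levels spaced well beyond the 5% window so adjacent combinations cannot be confused.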