Consider bandlimited continuous real white noise, and increase its double-sided spectral width while keeping the PSD constant. Since the variance, RXX(τ=0) = σ², is the bandwidth of the noise multiplied by the PSD of the noise, the variance grows at a rate of 2B, where B is the baseband bandwidth. The autocorrelation function is a sinc of height σ² = N0B/2, and the PSD has magnitude N0/4 = σ²/2B. As the bandwidth goes to infinity, the variance goes to infinity. Alternatively, if you fix the variance and produce a true continuous process with that fixed variance, you end up with an infinite spectral width and a PSD of σ²Ts, which goes to 0 as Ts → 0, such that the area under the PSD is still the power (the fixed variance). For this reason a true continuous white noise process must have infinite variance for the PSD to be non-zero. In a real-world situation the PSD of the thermal noise of a resistor is flat at kBT (Boltzmann constant multiplied by noise temperature) but tapers to 0 at higher frequencies, which is why the thermal noise of a resistor has finite power. When this is sampled at the Nyquist rate and reconstructed, the PSD is kBT. If it is sampled at double the Nyquist rate, the thermal noise remains within the same bandwidth and the PSD doesn't change, unlike quantisation noise. This is because the continuous thermal noise is not a true continuous process: oversampling it does not introduce more random variables; the extra samples are just samples of a series of continuous linear functionals of a fixed number of random variables. The extra samples of quantisation noise, by contrast, are treated as individual random variables, so that noise does spread equally over the whole sampling-frequency bandwidth. The bandwidth of noise depends on the spacing of the random variables.
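A small simulation can illustrate the fixed-PSD case: with the PSD level held constant, doubling the bandwidth of the noise doubles its variance. This is only a sketch, using an FFT brick-wall filter as the bandlimiter; the `bandlimit` helper and its `frac` parameter are illustrative names, not from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 2**16
w = rng.standard_normal(N)              # unit-variance white noise, flat PSD

def bandlimit(x, frac):
    """Brick-wall lowpass keeping the fraction `frac` of the double-sided band."""
    X = np.fft.fft(x)
    f = np.fft.fftfreq(len(x))          # normalised frequency, cycles/sample
    X[np.abs(f) > frac / 2] = 0.0       # zero everything outside the band
    return np.fft.ifft(X).real

# Variance = PSD level x bandwidth, so doubling the bandwidth
# doubles the variance while the PSD stays the same.
v1 = bandlimit(w, 0.25).var()
v2 = bandlimit(w, 0.50).var()
assert abs(v1 - 0.25) < 0.03            # PSD (= 1) x fractional bandwidth
assert abs(v2 / v1 - 2) < 0.1           # twice the bandwidth, twice the power
```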
A continuous process has infinite bandwidth, but its PSD then also tends to 0, which is why the variance of true white noise must be allowed to tend to infinity for the PSD to stay finite.
The PSD = kBT doesn't depend on the duration of the noise either. Because noise is a stochastic signal, the discrete PSD of sampled noise is always the variance, regardless of how many samples there are or how long the noise lasts; the reconstructed noise PSD is therefore always σ²Ts regardless of duration, i.e. always N0/4. Note the difference between N0, N0/2 and N0/4. Take the periodogram of a deterministic realisation of the noise process that happens to have the expected value of the noise energy but not the expected distribution of that energy over frequency. If a random energy must be randomly distributed over discrete frequencies, then the expected value at each frequency is E divided by the number of frequencies. One such realisation is all time samples equal to the standard deviation of the noise, forming a discrete rect window. On the periodogram, the amplitude at 0 rad frequency is σ²N and the other bins are 0. It has the expected energy σ²N, and the expected energy at each frequency is σ²N/N = σ². This does not change regardless of how many samples there are, i.e. regardless of the duration of the noise in time.
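The all-samples-equal-to-σ example can be checked numerically. A minimal sketch, with the 1/N periodogram normalisation assumed and arbitrary values for σ and N:

```python
import numpy as np

sigma, N = 1.5, 64
x = np.full(N, sigma)                 # every sample equals the standard deviation
P = np.abs(np.fft.fft(x))**2 / N      # periodogram with 1/N normalisation

# All the energy lands in the zero-frequency bin: P[0] = sigma^2 * N,
# every other bin is 0; the average energy per bin is sigma^2,
# independent of how many samples (how much time) there are.
assert np.isclose(P[0], sigma**2 * N)
assert np.allclose(P[1:], 0.0)
assert np.isclose(P.mean(), sigma**2)
```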
When you sample white noise at the Nyquist rate of the noise, the autocorrelation function is sampled and becomes σ²δ[n], so RXX(0) = σ². The PSD = σ² = N0B/2. As mentioned, the continuous white noise power spectrum does not change at all when the process is truncated in time, but when the bandwidth is increased the variance (power in the time domain) increases at a rate of 2B while the PSD does not change. Infinite-bandwidth finite-PSD white noise has infinite variance; when it is sampled, the discrete PSD becomes the variance, which is infinite. This is explained by the overlapping spectral densities of the aliased images, since any finite rate is below the Nyquist rate of such noise. So long as you sample at at least the Nyquist rate of bandlimited noise, there is no overlap. If you sample at half the Nyquist rate, the PSD of course doubles where the images overlap, and the samples are still uncorrelated (the discrete autocorrelation function is still 0 at all non-zero lags). But if you sample at a non-integer fraction of the Nyquist rate, e.g. 1/2.5, you end up with a square-wave-shaped PSD because of the way the images overlap, which reflects that the discrete autocorrelation function is now non-zero at lags other than 0 in the time domain.
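The Nyquist-rate case can be sketched in simulation: bandlimited noise has a sinc autocorrelation whose zero crossings sit every 1/(2B), so samples taken at the Nyquist rate 2B land exactly on those zeros and are uncorrelated, while samples taken faster are not. This approximates the ideal bandlimiting with an FFT brick-wall filter; the `rho` helper is an illustrative name.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 2**16
w = rng.standard_normal(N)

# Brick-wall lowpass to B = fs/4 (so fs is 2x the Nyquist rate for B).
X = np.fft.fft(w)
f = np.fft.fftfreq(N)
X[np.abs(f) > 0.25] = 0.0
x = np.fft.ifft(X).real

def rho(x, m):
    """Normalised autocorrelation estimate at lag m."""
    return np.dot(x[:-m], x[m:]) / np.dot(x, x)

# Oversampled: adjacent samples sit between sinc zeros, rho(1) ~ sinc(1/2) = 2/pi.
assert abs(rho(x, 1) - 2 / np.pi) < 0.05
# Decimating to the Nyquist rate 2B lands on the sinc zero crossings:
# the retained samples are uncorrelated.
y = x[::2]
assert abs(rho(y, 1)) < 0.05
```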
In the sampled bandlimited (to 2B) white noise case, the PSD is σ², but when the signal is reconstructed using Ts-width pulses, the PSD of the new continuous white noise process becomes σ²Ts, which is σ²/2B. The variance remains the same, but the PSD is now σ²Ts = N0/4 instead of σ² = N0B/2 (at least I think that's what this definition is). This continuous white noise process is formed from a discrete number of linear functionals of the discrete sample values, so it is effectively still a discrete process and not a true continuous process.
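The two conventions above are mutually consistent, which a line of arithmetic confirms: with σ² = N0B/2 and Nyquist-rate sampling, Ts = 1/(2B), the reconstructed PSD σ²Ts equals N0/4 for any N0 and B. The numeric values below are arbitrary.

```python
# Consistency check of the conventions used in the text:
# sigma^2 = N0*B/2 and Ts = 1/(2B)  =>  sigma^2 * Ts = N0/4.
N0, B = 3.0, 8.0          # arbitrary illustrative values
sigma2 = N0 * B / 2       # variance of the bandlimited noise
Ts = 1 / (2 * B)          # Nyquist-rate sampling period
assert abs(sigma2 * Ts - N0 / 4) < 1e-12
```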
Note that the PSD is the expected value of the periodogram. The periodogram is of a single record / realisation of the random process. A single realisation is a deterministic signal, and it is what's often shown when you see the 'PSD of white noise': the plot isn't perfectly smooth because it is the periodogram of a single realisation. If you sample continuous white noise at less than the Nyquist rate, there will be overlap for every realisation, with an expected value of 2σ² wherever just two images overlap. All of the power in the original signal (divided by the sampling period Ts) is always present within the sampling bandwidth, so the SNR doesn't change.
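The "PSD is the expected value of the periodogram" point can be demonstrated by averaging: any single periodogram of white noise is rough, but the average over many realisations converges to the flat level σ² at every bin. A simulation sketch with arbitrary σ, record length, and trial count:

```python
import numpy as np

rng = np.random.default_rng(2)
sigma, N, trials = 2.0, 256, 2000
P = np.zeros(N)
for _ in range(trials):
    x = sigma * rng.standard_normal(N)
    P += np.abs(np.fft.fft(x))**2 / N   # one rough periodogram per realisation
P /= trials                             # ensemble average over realisations

# The average flattens out towards the true PSD, sigma^2, at every bin.
assert abs(P.mean() - sigma**2) < 0.05
assert np.max(np.abs(P - sigma**2)) < 0.8
```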
SNR is defined as (RMS_S)² / (RMS_N)² for a deterministic signal and E[S(t)²] / E[N(t)²] for a WSS stochastic signal, where E is the ensemble average (a further time average is irrelevant because the mean and variance are the same for every variable in the process). Since E[S(t)²] = (μ_S)² + (σ_S)², the definition is ((μ_S)² + (σ_S)²) / ((μ_N)² + (σ_N)²), where (σ_S)² is the variance of every variable in the random process. This simplifies to (σ_S)² / (σ_N)² if the noise and the signal both have zero ensemble mean, and to (μ_S)² / (σ_N)² if the signal has no ensemble variance, i.e. it is a constant. Notation-wise, σ and μ should refer to the ensemble mean and variance, not to the RMS (temporal standard deviation) and temporal mean of a deterministic signal (I've also seen μ used for RMS and μ² for the mean square). Since the signal is usually deterministic and the noise random, the notation should be (RMS_S)² / E[N(t)²]. The notation E[X(t)] is interpreted as the ensemble mean, and is therefore equal to X(t) if X(t) is a deterministic signal; the temporal mean of a signal should instead be represented with a bar, though I think I've seen the expectation operator used for that too.
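The mixed case described above, a deterministic signal in zero-mean random noise, can be checked numerically as (RMS_S)² / E[N(t)²]. A sketch with an arbitrary sine amplitude and noise level; for a sine of amplitude A, (RMS_S)² = A²/2.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 200_000
A, sigma_n = 2.0, 0.5

t = np.arange(N)
s = A * np.sin(2 * np.pi * 0.01 * t)    # deterministic signal
n = sigma_n * rng.standard_normal(N)    # zero-mean WSS noise

# Deterministic signal: temporal mean square (RMS_S)^2.
# Zero-mean noise: E[N(t)^2] = sigma_n^2, estimated by the sample mean square.
snr_est = np.mean(s**2) / np.mean(n**2)
snr_theory = (A**2 / 2) / sigma_n**2    # (RMS of a sine)^2 = A^2 / 2
assert abs(snr_est / snr_theory - 1) < 0.05
```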