I will preface this by saying that I have quite likely misunderstood how Harvard architecture works, but I cannot understand how an 8-bit instruction set, such as that of the ATmega128, can encode 133 instructions along with the addresses of its 32 8-bit registers inside a single 8-bit instruction.
If all 8 bits were dedicated to selecting one of the 133 instructions, where would the operands for the instruction go? I don't really understand how the addressing modes work: do they reduce the number of instructions needed, since there are fewer duplicate instructions? Or is it because the operands are contained in a different instruction? If so, surely this makes it a 16-bit processor?
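
To make the arithmetic behind my confusion concrete, here is a minimal sketch of the bit count as I understand it. It assumes a naive encoding in which every instruction gets its own fixed-width opcode field and a two-register instruction carries two full register fields. That layout and the `bits_needed` helper are my own assumptions for illustration, not something taken from the datasheet.

```python
# Naive bit budget for one instruction, under the assumptions stated above.

NUM_INSTRUCTIONS = 133  # instruction count quoted for the ATmega128
NUM_REGISTERS = 32      # 32 general-purpose 8-bit registers


def bits_needed(n: int) -> int:
    """Smallest number of bits that can distinguish n different values."""
    return (n - 1).bit_length()


opcode_bits = bits_needed(NUM_INSTRUCTIONS)   # bits to pick one instruction
register_bits = bits_needed(NUM_REGISTERS)    # bits to pick one register

# Hypothetical two-operand instruction: one opcode field plus two register fields.
naive_total = opcode_bits + 2 * register_bits

print(f"opcode field:   {opcode_bits} bits")
print(f"register field: {register_bits} bits each")
print(f"naive total:    {naive_total} bits")
```

Under these assumptions the opcode alone already uses all 8 bits, and a two-register instruction would need 18, which is why I cannot see how it fits in 8.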