"16-bit byte-addressable memory"
What do you mean by that? Either it is a 16 bit wide memory, or an 8 bit, or an 11 bit, or a 72 bit, or...
Modern computers can't write bytes or 16 bit quantities to memory; they generally write/read 32 or 64 bits, or 40/72 if ecc is involved (or 33 or 65 with parity, or other sizes).
When your processor gives the illusion of writing a byte, the 32 or 64 bit bus goes to the decoder with a byte lane enable. The cache is probably 32 or 64 bits wide or wider, and you can only read/write that sram in those sizes, otherwise it wouldn't have that definition. If it is a 16 bit wide sram then you can't write bytes, you write 16 bit quantities. If you want your program to write a byte then the memory controller in front of the sram does a read-modify-write: it reads 16 bits since it is a 16 bit sram, modifies one half of it (assuming a byte is 8 bits in this processor you are creating; bytes are not always 8 bits) and then writes the 16 bit quantity back.
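A minimal sketch of that read-modify-write, as a software model rather than real hardware. The names `Sram16` and `write_byte` are made up for illustration, and it assumes 8 bit bytes with the even byte address in the low half of the word (little endian):

```python
class Sram16:
    """Model of a 16 bit wide sram: every access moves a whole 16 bit word."""

    def __init__(self, words):
        self.cells = [0] * words

    def read(self, word_addr):
        return self.cells[word_addr]

    def write(self, word_addr, value):
        self.cells[word_addr] = value & 0xFFFF


def write_byte(sram, byte_addr, value):
    """Byte write as done by a (hypothetical) memory controller.

    Assumption: little endian, so an even byte address is the low half.
    """
    word_addr = byte_addr >> 1
    word = sram.read(word_addr)                         # read all 16 bits
    if byte_addr & 1:
        word = (word & 0x00FF) | ((value & 0xFF) << 8)  # modify high half
    else:
        word = (word & 0xFF00) | (value & 0xFF)         # modify low half
    sram.write(word_addr, word)                         # write all 16 bits back


sram = Sram16(4)
sram.write(0, 0xABCD)
write_byte(sram, 1, 0x12)
print(hex(sram.read(0)))  # 0x12cd
```

The sram itself never sees anything but 16 bit accesses; only the controller knows a byte was involved.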
If you want to deal with each byte separately then implement it using 8 bit wide srams. You can use two 8 bit srams such that you can access them separately or in parallel. You can of course take two 8 bit srams and wrap them with something that looks like a 16 bit wide sram with an extra couple of control signals, so that you can do either of the halves or the whole 16 bits.
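Roughly what that wrapper looks like, again as a software model only. `Sram8`, `Banked16` and the `we_low`/`we_high` signal names are assumptions for this sketch, not taken from any particular part:

```python
class Sram8:
    """Model of an 8 bit wide sram."""

    def __init__(self, size):
        self.cells = [0] * size

    def read(self, addr):
        return self.cells[addr]

    def write(self, addr, value):
        self.cells[addr] = value & 0xFF


class Banked16:
    """Two 8 bit srams presented as one 16 bit port with two write enables."""

    def __init__(self, words):
        self.low = Sram8(words)    # holds bits 7..0 of each word
        self.high = Sram8(words)   # holds bits 15..8 of each word

    def read(self, word_addr):
        # Reads always return the full width; both srams are read in parallel.
        return (self.high.read(word_addr) << 8) | self.low.read(word_addr)

    def write(self, word_addr, data, we_low=True, we_high=True):
        # Each sram is written only if its enable is asserted.
        if we_low:
            self.low.write(word_addr, data & 0xFF)
        if we_high:
            self.high.write(word_addr, (data >> 8) & 0xFF)


mem = Banked16(4)
mem.write(0, 0xABCD)                 # both halves
mem.write(0, 0x1200, we_low=False)   # high half only
print(hex(mem.read(0)))  # 0x12cd
```

No read-modify-write is needed here because the half that is not enabled is simply never written.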
In any scheme you decide what endianness you want to use, and thus which half of a 16 bit wide sram, or which of the two 8 bit srams if you use two, holds the byte at a given address.
As commented though, design your processor and its bus(ses) and the memory controller separately; you can have one processor design and multiple memory/peripheral solutions.
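To make the endianness choice concrete, a small sketch; `half_for_byte` is a made-up helper and it assumes 8 bit bytes in a 16 bit word:

```python
def half_for_byte(byte_addr, little_endian=True):
    """Return which half of the 16 bit word holds the byte at byte_addr."""
    odd = byte_addr & 1
    if little_endian:
        # Lowest address holds the least significant byte.
        return "high" if odd else "low"
    # Big endian: lowest address holds the most significant byte.
    return "low" if odd else "high"


for addr in (0x1000, 0x1001):
    print(hex(addr), half_for_byte(addr, True), half_for_byte(addr, False))
```

Either choice works; what matters is that the processor, the bus and the memory side all agree on it.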
Think 8086 vs 8088.
How your bus works is up to you and how complicated you want this first design to be. Naturally you should not try to hit a home run the first time; take many attempts, many different designs with different goals.
Some schemes have a byte lane enable (for writes; for reads you generally read the whole width of the bus, 16 or 32 or 64, whatever). When writing a byte to address 0x1000 or 0x1001 you can put 0x1000, or even better 0x800, on the address bus, then use the byte lane enables to indicate which lanes are being written: one or the other or both. The bus/memory controller on the other end, which is a separate design, reacts to those control signals on the processor side and then deals with whatever memory or peripherals it has on its side. Some buses have a length or other fields in addition to the address, for example a "32 bit" processor using a 64 bit bus with the length in 32 bit units. Current designs like arm and others have separate read address, read data, write address and write data busses and multiple transactions in flight.
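As a sketch of the processor side of such a scheme: the function below turns a byte address and access size into a word address plus lane enables. `encode_write` and the enable encoding (bit 0 = low lane, bit 1 = high lane) are assumptions for illustration, with little endian lanes and no unaligned 16 bit accesses:

```python
def encode_write(byte_addr, size):
    """Return (word_addr, lane_enables) for a 1 or 2 byte write.

    lane_enables: bit 0 = low lane, bit 1 = high lane (little endian).
    """
    word_addr = byte_addr >> 1          # 0x1000 and 0x1001 both become 0x800
    if size == 2:
        if byte_addr & 1:
            raise ValueError("unaligned 16 bit write not handled in this sketch")
        return word_addr, 0b11          # both lanes
    if size == 1:
        return word_addr, (0b10 if byte_addr & 1 else 0b01)
    raise ValueError("size must be 1 or 2")


for addr, size in ((0x1000, 1), (0x1001, 1), (0x1000, 2)):
    word_addr, lanes = encode_write(addr, size)
    print(hex(addr), size, hex(word_addr), format(lanes, "02b"))
```

What the memory side does with those enables (separate 8 bit srams, or a read-modify-write on a 16 bit one) is then its own business.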
Starting over: if your question is about integrating the sram and the processor into one design, then either pick a 16 bit wide sram, or two 8 bit wide srams, or wrap some logic around two 8 bit wide srams such that you have a 16 bit wide bus with byte controls. Or use a 16 bit wide sram that you wrap with some logic to perform the read-modify-write as needed and present byte lane controls to the processor bus. Then when implementing the various 8 and 16 bit loads and stores, you set the bus controls accordingly to match the sram.
Or you leave it as a 16 bit wide memory and the processor logic takes care of the read-modify-write if you want to change less than 16 bits in one instruction/operation. (Some cpus have set bit(s) and clear bit(s) instructions.)
EDIT
Or do you already have this sram in your simulator and want to know how it works? If that is the case simply try it; spend the few minutes to bang out some signals and see what happens. It should be trivial to figure out.
Bottom line is that 16 bit wide srams do not have byte controls, so your question is very confusing; with byte controls they would be 8 bit wide srams, not 16. And/or this is not the sram interface but something wrapped around one or more sram interfaces.