You don't need an arbitrary lane-alignment barrel shifter in a 10GBASE-R MAC or PCS, on either the TX or the RX side. As an optimisation you can support two lane-alignment positions in the TX PCS, if you want to use a running IPG that can start the next packet on a 4-lane boundary rather than an 8-lane one and you have a MAC that can emit with the half alignment. But that costs only one layer of 2-input muxing, and it saves you just 33 UI of latency, and only for back-to-back packets.
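To make the "one layer of 2-input muxing" concrete, here is a rough behavioural sketch in Python, not RTL. The names `half_align`, `IDLE` and `SAVING_UI` are made up for illustration, and it assumes an 8-lane (64-bit) datapath where the only two choices are "pass straight through" or "delay by 4 lanes". A real design would make that choice per packet rather than for the whole stream.

```python
IDLE = 0x07  # XGMII idle control character

# Half a word is 4 lanes x 8 bits = 32 data bits; at the 66/64 line rate
# that is 33 unit intervals on the wire.
SAVING_UI = 4 * 8 * 66 // 64

def half_align(words, shift):
    """Model the two supported alignments of an 8-lane word stream.

    words: list of 8-tuples (lane 0 first).
    shift: False passes words through unchanged; True delays the stream
           by 4 lanes, so each output lane is a 2:1 choice between the
           current word and the half-word carried from the previous cycle.
    """
    if not shift:
        return list(words)
    out = []
    carry = (IDLE,) * 4              # upper half held over from the previous word
    for w in words:
        out.append(carry + w[:4])    # lanes 0-3 from carry, lanes 4-7 from current
        carry = w[4:]
    out.append(carry + (IDLE,) * 4)  # flush the last half-word, pad with idles
    return out
```

Each output lane only ever sees one of two sources, which is why no general barrel shifter is needed.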
On the RX side there's a shifter of sorts in the RX gearboxing, but that is usually hard logic in the PMA (below the 64/66), so you generally don't need to worry about it. The lowest (serially clocked) part usually just skips a bit when looking for block alignment, rather than doing any muxing. The higher conversions toward the block-lock side, 32:66 or 40:66, are more involved shifters, but again they usually come in the hard PCS.
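The "skip a bit" approach to block alignment can be sketched as below. This is a simplified Python model, not what any particular hard PMA does: `make_stream` and `find_block_lock` are hypothetical names, the payload is random bits standing in for scrambled data, and the lock rule is reduced to "64 valid sync headers in a row, slip one bit on any invalid header".

```python
import random

BLOCK_BITS = 66
LOCK_COUNT = 64  # consecutive valid sync headers before declaring lock (simplified rule)

def make_stream(n_blocks, offset, seed=0):
    """Test bitstream: `offset` junk bits followed by n_blocks 66-bit blocks."""
    rng = random.Random(seed)
    bits = [rng.randint(0, 1) for _ in range(offset)]
    for _ in range(n_blocks):
        bits.extend(rng.choice([(0, 1), (1, 0)]))          # valid header: the two bits differ
        bits.extend(rng.randint(0, 1) for _ in range(64))  # stand-in for scrambled payload
    return bits

def find_block_lock(bits):
    """Return the bit offset (mod 66) at which lock is reached, or None."""
    pos = 0
    good = 0
    while pos + BLOCK_BITS <= len(bits):
        if bits[pos] != bits[pos + 1]:
            good += 1
            if good == LOCK_COUNT:
                return pos % BLOCK_BITS
            pos += BLOCK_BITS
        else:
            good = 0
            pos += BLOCK_BITS + 1  # slip: skip one extra bit instead of muxing
    return None
```

With a stream built as `make_stream(1000, 17)`, the search is expected to settle on offset 17. The point is that alignment is found by dropping a bit and trying again, so the serial side needs no shifter at all.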
For 64/66 encoding, each of the 64 data bits appears in only three possible bit positions in the output, so the muxing for each bit packs really nicely into a single 6-LUT. As ever, if timing permits you can merge the shift with the surrounding logic, limited only by your imagination and the part.
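One way to read the 6-LUT remark: a three-way choice needs 3 data inputs plus 2 select inputs, which is 5 of the 6 LUT inputs, leaving one spare for merged logic. The sketch below builds the 64-entry truth table for such a 3:1 mux. The input ordering, the handling of the unused select code and the names `mux3_lut_init` and `lut6` are my own assumptions, not any vendor's primitive definition.

```python
def mux3_lut_init():
    """64-bit truth table for a 3:1 mux mapped onto a 6-input LUT.

    Assumed address bits: [0..2] = data d0..d2, [3..4] = select, [5] = unused.
    Select code 3 is never used, so its output is arbitrarily set to 0.
    """
    init = 0
    for addr in range(64):
        d = [(addr >> i) & 1 for i in range(3)]
        sel = (addr >> 3) & 3
        out = d[sel] if sel < 3 else 0
        init |= out << addr
    return init

def lut6(init, addr):
    """Evaluate a 6-LUT: the output is bit `addr` of the truth table."""
    return (init >> addr) & 1
```

The spare sixth input is what gives room to fold a little of the surrounding logic into the same LUT when timing allows.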