Regarding "why not directly connect the transistors' emitters to ground so DTR and RTS would drive the pins independently" - I cannot speak for the designers, but I do have an educated guess about why they used cross-coupled transistors. It has to do with what happens when the USB-serial chip is not powered or is otherwise disabled.

It is possible, and often desirable, to power the ESP32 independently from USB. In that case, if the ESP32 is running and the USB cable is not plugged in, you do not want the ESP32 to be reset (or GPIO0 to be forced low) in that state. Nor do you necessarily want it to be reset when you plug in USB, when the USB enumeration process wakes up the USB-serial chip, or when a program connects to the chip. You want the program to decide what to do, rather than have something happen automatically before the program takes control of the serial port.

It is hard to say what happens to the RTS and DTR lines during all of those transitions - are they high or low, and do they change when power is applied? But one thing is fairly likely: whatever happens, the RTS and DTR lines do the same thing. In the cross-coupled circuit each transistor's emitter goes to the other control line instead of ground, so a transistor can only conduct when the two lines differ, and equal levels leave both pins alone. So with the cross-coupled circuit, you have at least a fighting chance of disconnecting and reconnecting USB, and all that entails, without crashing an already-running program.
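To make the argument concrete, here is a rough logic-level sketch in Python comparing the two wirings. It is only an idealized model, not the designers' own analysis. It assumes the commonly seen arrangement in which one NPN transistor has its base on DTR, its emitter on RTS, and its collector on EN, while the other has its base on RTS, its emitter on DTR, and its collector on GPIO0, with pull-ups on both EN and GPIO0. The function names are made up for illustration, and `1`/`0` mean the voltage level at the USB-serial chip's output pins.

```python
from typing import Tuple


def npn_conducts(base: int, emitter: int) -> bool:
    """Idealized NPN switch: conducts only when base is high and emitter is low."""
    return base == 1 and emitter == 0


def cross_coupled(dtr: int, rts: int) -> Tuple[int, int]:
    """Return (EN, GPIO0) for the assumed cross-coupled wiring.

    Assumption: Q1 base=DTR, emitter=RTS, collector=EN;
                Q2 base=RTS, emitter=DTR, collector=GPIO0;
                both pins are pulled high when their transistor is off.
    """
    en = 0 if npn_conducts(base=dtr, emitter=rts) else 1
    gpio0 = 0 if npn_conducts(base=rts, emitter=dtr) else 1
    return en, gpio0


def grounded_emitters(dtr: int, rts: int) -> Tuple[int, int]:
    """Return (EN, GPIO0) for the hypothetical variant with both emitters grounded."""
    en = 0 if npn_conducts(base=dtr, emitter=0) else 1
    gpio0 = 0 if npn_conducts(base=rts, emitter=0) else 1
    return en, gpio0


# Print both truth tables side by side for comparison.
for dtr in (0, 1):
    for rts in (0, 1):
        print(
            f"DTR={dtr} RTS={rts}  "
            f"cross-coupled (EN, GPIO0)={cross_coupled(dtr, rts)}  "
            f"grounded (EN, GPIO0)={grounded_emitters(dtr, rts)}"
        )
```

In this model, the cross-coupled wiring leaves EN and GPIO0 both high whenever DTR and RTS are equal, whether both are high or both are low. With grounded emitters, both lines going high together would pull EN and GPIO0 low at the same time, which is exactly the unwanted reset during power-up or enumeration.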