How do I decide the timing variables for a circuit implementation? I
understand it's application-specific, but how do I know if I should
choose components with relatively low delay or if I can afford high
delays? How do I quantify my delay requirements? Maybe I start with,
say, what I want my throughput to be at the circuit output, and work
backwards down to the transistor level?
Within electrical engineering, there are many sub-disciplines. Some of these may be covered in classes you take; others you just have to learn yourself through experience and/or self-teaching. The answers to these questions usually come with application-specific knowledge.
For example, one of these disciplines is analog signal conditioning. Not all good EEs are well-versed in it. Some expert analog designers know the answers to all of your questions when working with BJTs and op-amps, but can't answer them easily when it comes to high-speed digital circuits. The way you speak of delays seems to imply a digital context, but delay matters in analog work too: phase shift is a type of delay, and if the phase shift in the feedback loop of an amplifier reaches 180° while the loop gain is still greater than 1, the amplifier will oscillate.
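To make the phase-shift point concrete, here is a minimal sketch that estimates phase margin for a feedback loop. It assumes a deliberately simplified loop gain with a DC gain and real poles only; the function names and the example pole frequencies are made up for illustration and do not describe any particular amplifier.

```python
import math

def loop_gain_magnitude(f_hz, dc_gain, poles_hz):
    """|L(jw)| for an assumed loop gain with a DC gain and real poles only."""
    mag = dc_gain
    for fp in poles_hz:
        mag /= math.hypot(1.0, f_hz / fp)
    return mag

def loop_phase_deg(f_hz, poles_hz):
    """Total phase lag in degrees; each real pole contributes -atan(f / fp)."""
    return -sum(math.degrees(math.atan(f_hz / fp)) for fp in poles_hz)

def crossover_frequency(dc_gain, poles_hz, f_lo=1e-3, f_hi=1e12):
    """Frequency where |L| = 1, found by bisection on a log scale.

    Assumes dc_gain > 1 and at least one pole, so the magnitude falls
    monotonically through 1 somewhere inside [f_lo, f_hi].
    """
    for _ in range(200):
        f_mid = math.sqrt(f_lo * f_hi)
        if loop_gain_magnitude(f_mid, dc_gain, poles_hz) > 1.0:
            f_lo = f_mid
        else:
            f_hi = f_mid
    return math.sqrt(f_lo * f_hi)

def phase_margin_deg(dc_gain, poles_hz):
    """180 degrees plus the loop phase at the unity-gain crossover."""
    fc = crossover_frequency(dc_gain, poles_hz)
    return 180.0 + loop_phase_deg(fc, poles_hz)

if __name__ == "__main__":
    # Hypothetical single-pole loop: comfortable margin.
    print(phase_margin_deg(1000.0, [10.0]))
    # Hypothetical three-pole loop with high gain: negative margin.
    print(phase_margin_deg(1e5, [10.0, 1e4, 1e4]))
```

A single-pole loop can never accumulate more than 90° of lag, so its margin stays near 90°. The three-pole example accumulates more than 180° of lag before the gain drops below 1, which is the oscillation condition described above.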
Once you have a specific type of circuit to design, there will also be literature with "working examples" available. Look at any switching power supply's datasheet, for example. Here's a random datasheet to look at (http://cds.linear.com/docs/en/datasheet/8711f.pdf). It has plenty of application circuits which will give you a starting point for selecting components, as well as PCB layout considerations.
Related to the timing thing, how do I know what scale I should use for
my transistors? My assumption is it's some combination of allowable
space, monetary cost, and the delay tolerances needed for the
application. That seems like an awful lot of things to juggle. Are
there any rules for determining this?
Here is another example, following the switching PSU theme. When selecting a MOSFET as the switch for a power supply, you typically want one that is marketed specifically for this purpose. That significantly narrows your search and helps you work out which of the dozens of parameters describing the transistor are relevant to your application.
Finally, what about non-ideal effects like input capacitance? It seems
like I'd just use this as a guide to make sure that my desired clock
frequency does not suffer severe attenuation from parasitic effects.
Or do these non-ideal effects have a much different role in choosing
parameters for the circuit?
Again, this knowledge comes with experience, talking to people more experienced than you, reading datasheets, etc.
To address parasitic capacitance specifically, here are some things to consider:
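Parasitic capacitance slows down switching times, which can needlessly increase power consumption: a transistor that switches slowly spends longer in the transition region, where it dissipates the most power.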
Whatever output pin is driving the input with the parasitic capacitance needs to be able to provide enough current to charge that capacitance quickly enough. This is why "gate drivers" exist: a little MCU GPIO pin is not equipped to drive the large capacitance of a power transistor's gate. See the related question: What is the purpose of "MOSFET driver" IC's
Capacitances (parasitic or otherwise) can interact with inductances (parasitic or otherwise) and cause your digital signal edges to "ring" or "overshoot", which is bad for signal integrity and can even damage components. This type of problem is often best analyzed in terms of transmission lines and impedance matching rather than in terms of lumped capacitances and inductances.
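As a rough way to put numbers on the points above, here is a minimal back-of-the-envelope sketch. It assumes the gate is charged with a constant average current (I = Q / t), that gate drive power is charge times drive voltage times switching frequency, and that ringing is approximated by an undamped lumped LC resonance. All function names and component values are hypothetical and chosen only for illustration; real numbers come from the datasheet.

```python
import math

def gate_drive_current(q_gate_c, t_switch_s):
    """Average current needed to move gate charge Q in time t (I = Q / t)."""
    return q_gate_c / t_switch_s

def gate_charge_time(q_gate_c, i_drive_a):
    """Time to move gate charge Q with a fixed average drive current."""
    return q_gate_c / i_drive_a

def gate_drive_power(q_gate_c, v_drive_v, f_sw_hz):
    """Average power spent charging the gate each cycle (P = Q * V * f)."""
    return q_gate_c * v_drive_v * f_sw_hz

def lc_ring_frequency(l_h, c_f):
    """Resonant frequency of an ideal, undamped lumped LC pair."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_h * c_f))

if __name__ == "__main__":
    q_gate = 50e-9  # assumed total gate charge: 50 nC

    # Current needed to switch in an assumed 50 ns.
    print("drive current (A):", gate_drive_current(q_gate, 50e-9))

    # Switching time if an assumed 20 mA GPIO pin drives the gate directly.
    print("GPIO switch time (s):", gate_charge_time(q_gate, 20e-3))

    # Gate drive power at an assumed 10 V drive and 100 kHz switching.
    print("gate drive power (W):", gate_drive_power(q_gate, 10.0, 100e3))

    # Ring frequency for an assumed 10 nH of stray inductance and 1 nF.
    print("ring frequency (Hz):", lc_ring_frequency(10e-9, 1e-9))
```

With these assumed values, switching in 50 ns needs about 1 A, while a 20 mA pin would take about 2.5 µs, which is the gap a gate driver fills. The LC estimate ignores damping and trace length, so treat it only as a first check before moving to a transmission-line view.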