In the past, I've worked in a variety of environments. Desktop apps, games, embedded stuff, web services, command line jobs, web sites, database reporting, and so on. All of these environments shared the same trait: no matter their complexity, no matter their size, I could always run a subset or slice of the application on my machine or in a dev environment to test against.
Today I do not. Today I find myself in an environment whose primary focus is scalability. Reproducing the environment is prohibitively costly. Taking a slice of the environment, while plausible (some of the pieces would need to be simulated, or run in a single-instance mode they're not designed for), kind of defeats the purpose, since it obscures the concurrency and load that the real system encounters. Even a small "test" system has its flaws: things behave differently with 2 nodes than with 64.
My usual approach to optimization (measure, try something, verify correctness, measure differences, repeat) doesn't really work here, since I can't effectively do steps 2 and 3 for the parts of the problem that matter (concurrency robustness and performance under load). This scenario doesn't seem unique, though. What is the common approach to doing this sort of task in this sort of environment?
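For concreteness, here is roughly what that loop looks like for me when I *can* run a slice of the system locally. The functions and data are just illustrative stand-ins, not my actual workload:

```python
import timeit

# Hypothetical stand-ins: in practice these would be the existing code
# path and the candidate optimization being evaluated.
def baseline(data):
    return sorted(data)

def candidate(data):
    return sorted(data, reverse=True)[::-1]  # illustrative "optimized" variant

def measure(fn, data, repeats=5):
    # Take the best of several runs to reduce measurement noise.
    return min(timeit.repeat(lambda: fn(list(data)), number=1, repeat=repeats))

data = list(range(100_000, 0, -1))

before = measure(baseline, data)            # 1. measure
new_result = candidate(list(data))          # 2. try something
assert new_result == baseline(list(data))   # 3. verify correctness
after = measure(candidate, data)            # 4. measure differences

print(f"baseline {before:.4f}s, candidate {after:.4f}s, delta {before - after:+.4f}s")
# ...then repeat.
```

On a single box that loop is cheap and trustworthy. In the distributed system, the interesting failures and slowdowns only show up under real concurrency and real load, which is exactly what I can't reproduce.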
There are some related questions: