TBM 34/53: Better Experiments

Note: I recently wrote a short list post on Pandemic, Teams, Health, and Self-Care.

Here’s a tool/activity I have been using recently to help teams design internally-focused (non-product) experiments.

During the pandemic, teams are running experiments to counter burnout and to adapt to remote work. Hopefully, this activity can help your team design better experiments.

My general observation is that teams either 1) don’t really think through their experiments or 2) are overly biased toward certain types of experiments. An organization might always try highly localized, low-risk experiments, yet never figure out how to “scale” those experiments. Or, an org might turn every experiment into a big-bang, long-duration program.

The activity is relatively simple. 

  1. Set aside ~90 minutes.

  2. Pick a problem or observation. 

  3. Read and discuss the dimensions described below. For each dimension, brainstorm example experiments representing the “extremes”. These don’t need to be real. Have fun.

  4. Optionally (as demonstrated with L+ and R+), chat about how the extremes could be considered positive.

  5. Return to the problem or observation. Ask individuals to brainstorm 1-3 candidate experiments to address that problem or observation. 

  6. Ask team members to individually describe each candidate experiment using the ranges below.

  7. As a group, discuss each experiment, and where each team member placed each experiment.

  8. Finally, ask team members to dot vote on the best-fit experiment (for the given context). Discuss ranking. Ideally, pick an experiment to try.
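
If your team records the activity in a shared doc or spreadsheet, the placement and dot-voting steps can be tallied with a few lines of code. The sketch below is only an illustration, not part of the activity itself. The 1-5 scale (1 = left extreme, 5 = right extreme), the experiment names, the participant names, and the helper functions `summarize` and `tally_votes` are all hypothetical assumptions. The idea is that a large spread on a dimension points to a disagreement worth discussing as a group.

```python
from collections import Counter
from statistics import mean, pstdev

# Assumed scale: 1 = left extreme (e.g. Local), 5 = right extreme (e.g. Global).
# Only three of the dimensions are shown to keep the example short.
DIMENSIONS = ["Local | Global", "Flexible | Rigid", "Short Duration | Long Duration"]

# Hypothetical data: placements[experiment][person] holds one score per
# dimension, in the same order as DIMENSIONS.
placements = {
    "No-meeting Wednesdays": {
        "Ana": [2, 4, 3],
        "Ben": [4, 4, 2],
        "Chi": [1, 5, 3],
    },
    "Async standups": {
        "Ana": [1, 2, 2],
        "Ben": [1, 2, 1],
        "Chi": [2, 1, 2],
    },
}

# Hypothetical dot votes: one entry per dot placed.
dot_votes = ["Async standups", "No-meeting Wednesdays", "Async standups", "Async standups"]


def summarize(scores_by_person):
    """Return (dimension, mean, spread) per dimension.

    Spread is the population standard deviation of the team's scores;
    a high value suggests the team sees the experiment differently.
    """
    rows = []
    for i, dimension in enumerate(DIMENSIONS):
        scores = [person_scores[i] for person_scores in scores_by_person.values()]
        rows.append((dimension, round(mean(scores), 2), round(pstdev(scores), 2)))
    return rows


def tally_votes(votes):
    """Count dot votes, ordered from most to fewest."""
    return Counter(votes).most_common()


for experiment, scores_by_person in placements.items():
    print(experiment)
    for dimension, average, spread in summarize(scores_by_person):
        print(f"  {dimension}: mean={average}, spread={spread}")

print(tally_votes(dot_votes))
```

The numbers are a conversation starter, not a decision rule. The discussion about why people placed an experiment differently is the point.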

Why is this helpful? First, the activity helps the team build a common vocabulary around experimentation. Second, it helps elicit better options given your context.

Below I briefly describe each dimension, and provide a sample of “positives” for each extreme (L+ for the left extreme, R+ for the right). I suggest brainstorming your own.


Local | Global

How containable (or localized) is the experiment?

L+: Localized impact, fewer dependencies, less visibility/oversight/meddling.

R+: Broader impact, more support, more visibility.

Flexible | Rigid

Will it be possible to pivot the experiment on the fly?

L+: May be easier to sustain. More adaptable to changing environments and new information.

R+: May be easier to understand, teach, support, and promote.

Short Duration | Long Duration

How long must the experiment last to provide meaningful information?

L+: Less disruptive. Easier to pitch. Faster feedback.

R+: More time to “simmer” and pick up steam. Commitment.

Invitation | Imposition

Will the participants be invited to take part in the experiment, or will the experiment be imposed?

L+: More intrinsic motivation. More vested in outcome. “Advocates for life!”

R+: Speed. Less need to “sell” change.

Small Shift | Large Shift

Will the experiment represent a small change from how things currently work, or will it feel foreign and new? Perhaps different participants will experience different degrees of change.

L+: Easier. Less disruptive. More potential to “pick up momentum”.

R+: “Get it over with”. Less chance of getting stuck in a local maximum.

Self-powering | Requires “fuel” & external support

Can the experiment sustain itself without outside support and resources, or will it require external support?

L+: Independent. Easier. Can be sustained indefinitely.

R+: Involves and “vests” broader group in the effort. 

Value in 2nd/3rd order effects | Risk in 2nd/3rd order effects

Second and third order effects are common when running an experiment. Is the experiment expected to “throw off” potentially valuable 2nd/3rd order effects? 

L+: Discover valuable things!

R+: Risk may be necessary to explore new areas of uncertainty.

Fewer dependencies, lower blast radius | More dependencies, higher blast radius

How independent/dependent is the experiment on other things (people, projects, systems, processes, etc.) in the org?

L+: Independent. More degrees of freedom. Less constrained.

R+: Potentially more impactful. Potentially more involvement and support.

Shorter feedback loops | Longer feedback loops

How easily and quickly can we get feedback?

L+: Can respond more quickly. Can pivot experiment more quickly.

R+: May be less noisy. May provide “deeper” or more cohesive information.

Low threat to formal structures/incentives | Challenges formal structures/incentives

Does the experiment represent a threat to formal power/incentive structures?

L+: Can fly under the radar. Considered “safe” and non-threatening.

R+: May be more likely to test (and change) formal power/incentive structures.


I hope you found this helpful.