

Test Kitchen/Decision Records/Remove 24h requirement

From Wikitech

Decision Record: Remove 24-Hour Lead Time for Experiment Activation

Date: 27 October 2025

Context

The xLab UI currently requires experiments to be "turned on" 24 hours before their scheduled start date. This has created confusion and friction:

  • Users must understand that "turning on" an experiment doesn't actually start it: the experiment becomes active only when the start date has been reached AND the experiment has been toggled on
  • Multiple teams have forgotten this step, blocking their experiments from running
  • The terminology is confusing because several concepts overlap: "active," "turn on/off," "activate," "started"
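
The two-part activation condition described above can be sketched as a single boolean check. This is a minimal illustration, not the xLab implementation; the function name `is_collecting_data` and its parameters are hypothetical.

```python
from datetime import datetime, timezone


def is_collecting_data(toggled_on: bool, start: datetime, now: datetime) -> bool:
    """Hypothetical helper: an experiment is active only when BOTH hold.

    1. It has been toggled on in the UI.
    2. Its scheduled start date has been reached.
    """
    return toggled_on and now >= start


# Illustrative dates only; they do not come from the decision record.
start = datetime(2025, 11, 3, tzinfo=timezone.utc)
day_before = datetime(2025, 11, 2, 12, 0, tzinfo=timezone.utc)
day_of = datetime(2025, 11, 3, 12, 0, tzinfo=timezone.utc)

# Toggled on, but the start date has not been reached: not active yet.
toggled_but_early = is_collecting_data(True, start, day_before)
# Start date reached, but nobody toggled it on: the "forgotten step" case.
started_but_not_toggled = is_collecting_data(False, start, day_of)
# Both conditions met: the experiment runs.
running = is_collecting_data(True, start, day_of)
```

The second case is the one that blocked teams: the start date passes, but the experiment never runs because the toggle was forgotten.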

The 24-hour buffer was originally implemented as a conservative approach to ensure Varnish nodes have experiment configuration with ample lead time.

Technical Background

  • Varnish nodes fetch new configuration every minute and store it locally on disk. Network jitter exists, but 3 minutes is a reasonable propagation estimate based on observations from the A/A tests we completed in FY24/25 SDS2.4 and the first A/B tests run by teams.
  • The 14:30 UTC deployment window already provides built-in lead time. Against a propagation estimate of about 3 minutes, the 24-hour requirement is far more conservative than necessary.
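
To put the figures above side by side, the following sketch compares the 3-minute propagation estimate with the 24-hour lead time. The three durations come from this record; the constant and variable names are illustrative only.

```python
from datetime import timedelta

FETCH_INTERVAL = timedelta(minutes=1)        # Varnish nodes fetch configuration every minute
PROPAGATION_ESTIMATE = timedelta(minutes=3)  # observed estimate, including network jitter
OLD_LEAD_TIME = timedelta(hours=24)          # the requirement this record removes

# How many fetch cycles fit inside the propagation estimate.
fetch_cycles_covered = PROPAGATION_ESTIMATE // FETCH_INTERVAL

# How many times larger the old lead time is than the propagation estimate.
lead_time_ratio = OLD_LEAD_TIME / PROPAGATION_ESTIMATE

print(f"Fetch cycles within the estimate: {fetch_cycles_covered}")
print(f"Old lead time vs. estimate: {lead_time_ratio:.0f}x")
```

Under these numbers, the estimate already allows for three fetch cycles, and the old requirement is 480 times longer than the estimate.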

Decision

Remove the 24-hour lead time requirement for experiment activation. Experiments can be turned on and start collecting data on the same day, provided the activation occurs before the 14:30 UTC deployment window.

Implications

User Experience

  • Simplified mental model: experiments can be started when needed (respecting the deployment window)
  • Reduced friction and forgotten activations
  • Better alignment with GrowthBook's approach (which has simple "Start Experiment" functionality)

Operational Constraints

  • Users must still respect the 14:30 UTC deployment window
  • If an experiment is turned on after 14:30 UTC (e.g., 15:05 UTC), it will need to wait until the next day's deployment window
  • The 3-minute propagation time for Varnish nodes remains acceptable
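
The deployment-window constraint above can be sketched as a small helper that returns the window an activation will be picked up in. This is a sketch under stated assumptions, not the xLab implementation: the name `next_deployment_window` is hypothetical, the window is assumed to recur every calendar day, and an activation at exactly 14:30 UTC is assumed to miss that day's window.

```python
from datetime import datetime, time, timedelta, timezone

DEPLOYMENT_WINDOW_UTC = time(14, 30)  # from this record: the 14:30 UTC deployment window


def next_deployment_window(activated_at: datetime) -> datetime:
    """Return the deployment window that picks up an activation (hypothetical helper).

    Assumptions: the window recurs daily, and an activation at exactly
    14:30 UTC is treated as too late for that day's window.
    """
    if activated_at.tzinfo is None:
        raise ValueError("activated_at must be timezone-aware")
    activated_utc = activated_at.astimezone(timezone.utc)
    window_today = datetime.combine(
        activated_utc.date(), DEPLOYMENT_WINDOW_UTC, tzinfo=timezone.utc
    )
    if activated_utc < window_today:
        return window_today
    return window_today + timedelta(days=1)


# Turned on at 10:00 UTC: picked up by the same day's window.
same_day = next_deployment_window(datetime(2025, 10, 27, 10, 0, tzinfo=timezone.utc))
# Turned on at 15:05 UTC (the example above): waits for the next day's window.
next_day = next_deployment_window(datetime(2025, 10, 27, 15, 5, tzinfo=timezone.utc))
```

Requiring a timezone-aware input avoids silently interpreting a naive timestamp as local time, which matters because the window is defined in UTC.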

Future Considerations

  • Align with GrowthBook terminology when implementing SDS2.3
  • Investigate how GrowthBook models phase changes and whether event tagging adjustments are needed
  • Consider measuring Varnish propagation time directly

TODO

  1. Ticket to implement removal of the 24-hour requirement: T408233
    1. Update documentation to reflect new timing requirements
    2. Clarify that the deployment window constraint remains in effect
  2. Ticket to measure Varnish propagation time: T408236