💡 Post-training alignment in 7 sentences: a one-page summary of the interview essentials (see §2–§9 for derivations). RLHF pipeline (Ouyang et al. 2022, InstructGPT): SFT → RM (Bradley-Terry pairwise) → PPO + ...
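As a rough illustration of the "RM (Bradley-Terry pairwise)" step, here is a minimal sketch of the pairwise preference loss on scalar rewards. The function names `bradley_terry_loss` and `preference_probability` are hypothetical, chosen for this example only. It assumes the reward model has already produced one scalar score per response. It is not the InstructGPT implementation.

```python
import math


def preference_probability(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry probability that the chosen response is preferred.

    Equals sigmoid(r_chosen - r_rejected), computed without overflow.
    """
    margin = r_chosen - r_rejected
    if margin >= 0:
        return 1.0 / (1.0 + math.exp(-margin))
    e = math.exp(margin)
    return e / (1.0 + e)


def bradley_terry_loss(r_chosen: float, r_rejected: float) -> float:
    """Negative log-likelihood of the observed preference.

    -log(sigmoid(margin)) rewritten as log(1 + exp(-margin)) for stability.
    """
    margin = r_chosen - r_rejected
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    # For negative margins, factor out exp(-margin) to avoid overflow.
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Hypothetical reward scores for one (chosen, rejected) pair.
    print(preference_probability(1.5, 0.3))
    print(bradley_terry_loss(1.5, 0.3))
```

The loss shrinks as the reward model scores the chosen response further above the rejected one, and equals log 2 when the two scores are tied.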
# Default S3 URL for frame-to-burst mapping (static frame definitions, no sensing times)
DEFAULT_FRAME_TO_BURST_URL = "https://opera-ancillaries.s3.us-west-2 ...