indirect_caption — side-by-side compare reports

Trained vs. base model: for each benchmark, a random sample plus the top-by-output-tokens rows, drawn from the rows where the two models differ, with embedded images.

Source CSV: projects/indirect-caption/data/eval_results.csv on exp-record.
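
The row selection described above can be sketched roughly as follows. This is a minimal illustration, not the script that produced the reports. It assumes "differing" means the trained and base models received different scores on a row. The column names (benchmark, base_correct, trained_correct, output_tokens), the sample sizes, and the helper name select_rows are all hypothetical; the real schema of eval_results.csv is not shown here.

```python
import random
from collections import defaultdict


def select_rows(rows, n_random=5, n_top=5, seed=0):
    """Pick differing rows per benchmark: a random sample plus the longest outputs.

    `rows` is a list of dicts. The keys used here (benchmark, base_correct,
    trained_correct, output_tokens) are hypothetical, not the real CSV schema.
    """
    rng = random.Random(seed)  # fixed seed so the random sample is reproducible
    by_benchmark = defaultdict(list)
    for row in rows:
        # Keep only rows where the two models disagree (assumed meaning of "differing").
        if row["base_correct"] != row["trained_correct"]:
            by_benchmark[row["benchmark"]].append(row)

    selected = {}
    for benchmark, differing in by_benchmark.items():
        # Random sample, capped at the number of differing rows available.
        random_pick = rng.sample(differing, min(n_random, len(differing)))
        # Rows with the most output tokens first; CSV values arrive as strings.
        top_pick = sorted(
            differing, key=lambda r: int(r["output_tokens"]), reverse=True
        )[:n_top]
        selected[benchmark] = {
            "random": random_pick,
            "top_by_output_tokens": top_pick,
        }
    return selected
```

The rows could be loaded with csv.DictReader from the standard library, which yields one dict per CSV row with string values. That is why output_tokens is converted with int() before sorting.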