Over 40% returns. Sharpe 6.8. Foundation models aren’t ready, until you pre-train them from scratch.
A new paper benchmarks large-scale Time Series Foundation Models (TSFMs) across 34 years, 94 countries, and 2 billion+ return observations.
The result?
Off-the-shelf TSFMs underperform CatBoost.
Fine-tuning helps a little.
But pre-training from scratch on financial data yields a Sharpe of up to 5.4.
And when you add synthetic data + JKP factors + tuning... → TSFMs hit a 41.9% return and a 6.78 Sharpe.
Still not plug-and-play — but close.
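For readers less familiar with the headline metric: a Sharpe ratio is mean excess return divided by its volatility, usually annualized. Here is a minimal sketch of that calculation. It assumes monthly returns and a zero risk-free rate, neither of which is stated in this post, and the function name `annualized_sharpe` is hypothetical. It is not the paper's evaluation code.

```python
import math
import statistics


def annualized_sharpe(returns, periods_per_year=12, risk_free_per_period=0.0):
    """Annualized Sharpe ratio from a list of per-period simple returns.

    Assumptions (not from the post): monthly data by default and a
    zero risk-free rate unless one is passed in.
    """
    # Excess return over the risk-free rate for each period.
    excess = [r - risk_free_per_period for r in returns]
    mean_excess = statistics.fmean(excess)
    # Sample standard deviation (n - 1 denominator); needs >= 2 observations.
    volatility = statistics.stdev(excess)
    if volatility == 0:
        raise ValueError("Sharpe ratio is undefined when volatility is zero")
    # Scale the per-period ratio by sqrt(periods per year) to annualize.
    return mean_excess / volatility * math.sqrt(periods_per_year)


if __name__ == "__main__":
    # Made-up monthly returns, for illustration only.
    sample = [0.02, -0.01, 0.03, 0.01, 0.00, 0.02]
    print(round(annualized_sharpe(sample), 2))
```

Under this definition, a Sharpe of 6.78 means the strategy's average excess return is several times its annualized volatility, which is why the number stands out.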
arxiv.org