Improvement to Samudra v2: Larger effective resolutions via increased dilation schedule by alxmrs · Pull Request #753 · m2lines/Samudra

alxmrs · 2026-06-01T23:49:42Z

This hyperparameter config improves rollouts in the deep ocean, with some regression at upper-level depths. The gain is clearest at 1/2°; at 1°, upper and mid R² regress.

Metric	Paper Samudra-2 1°	E1 dense+dilated 1°	Δ (E1 − paper)	Winner
Niño 3.4 R²	0.9331	0.9792	+0.046	E1 ⭐
Niño 3.4 RMSE (°C)	0.2222	0.1238	−44 %	E1 ⭐
Niño 3.4 corr	0.9675	0.9896	+0.022	E1
Upper R² (0–700 m)	0.8787	0.4850	−0.394	Paper
Mid R² (700–2000 m)	−1.579	−4.1949	−2.62	Paper
Deep R² (2000–7000 m)	−16.170	−15.021	+7 % error reduction	≈ tie
2.5 m snapshot corr	0.6842	0.5513	−0.133	Paper
2.5 m snapshot RMSE	0.4594	0.5690	+0.110	Paper
700 m snapshot corr	0.2839	0.2785	−0.005	≈ tie
700 m snapshot RMSE	0.2291	0.2863	+0.057	Paper
2000 m snapshot corr	0.3370	0.2435	−0.094	Paper
2000 m snapshot RMSE	0.0433	0.0615	+0.018	Paper

Metric	Paper Samudra-2 1/2°	E1 dense+dilated 1/2°	Δ (E1 − paper)	Winner
Niño 3.4 R²	0.9333	0.9810	+0.0477	E1 ⭐
Niño 3.4 RMSE (°C)	0.2217	0.1173	−47.1 %	E1 ⭐
Niño 3.4 corr	0.9662	0.9906	+0.0244	E1
Upper R² (0–700 m)	0.8435	0.8059	−0.0376	Paper
Mid R² (700–2000 m)	−2.017	−0.196	+1.821	E1 ⭐
Deep R² (2000–7000 m)	−13.449	−6.666	+50 % error reduction	E1 ⭐
2.5 m snapshot corr	0.6795	0.6704	−0.0091	≈ tie
2.5 m snapshot RMSE	0.4721	0.4890	+0.0169	≈ tie
700 m snapshot corr	0.2975	0.3457	+0.0482	E1 ⭐
700 m snapshot RMSE	0.2500	0.2646	+0.0146	≈ tie
2000 m snapshot corr	0.4076	0.3834	−0.0242	Paper
2000 m snapshot RMSE	0.0450	0.0480	+0.0030	≈ tie
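
The percentage entries in the Δ column can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the evaluation code used for these runs: the helper names `pct_change` and `unexplained_reduction` are hypothetical, and it assumes R² follows the usual definition R² = 1 − SS_res/SS_tot, so that 1 − R² is the error variance normalised by the target variance. Under that assumption there are two reasonable readings of "error reduction" for a negative R², and they give somewhat different numbers.

```python
def pct_change(paper: float, e1: float) -> float:
    """Change of E1 relative to the magnitude of the paper value, in percent."""
    return 100.0 * (e1 - paper) / abs(paper)


def unexplained_reduction(paper_r2: float, e1_r2: float) -> float:
    """Percent reduction in 1 - R² (assumes R² = 1 - SS_res / SS_tot)."""
    return 100.0 * (1.0 - (1.0 - e1_r2) / (1.0 - paper_r2))


# Half-degree values copied from the table above.
nino_rmse_pct = pct_change(0.2217, 0.1173)
deep_r2_pct = pct_change(-13.449, -6.666)
deep_unexplained = unexplained_reduction(-13.449, -6.666)

print(f"Niño 3.4 RMSE change:            {nino_rmse_pct:+.1f} %")
print(f"Deep R² change relative to |R²|: {deep_r2_pct:+.1f} %")
print(f"Deep reduction in 1 - R²:        {deep_unexplained:.1f} %")
```

The relative change in |R²| matches the "+50 %" entry in the table; measured as a reduction in 1 − R², the same pair of values gives roughly 47 %.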

Full experiment results

Rank	Experiment	Block	Spatial	Resolution	Params	Niño R²	Niño RMSE	Niño corr	Upper R²	Mid R²	Deep R²	2.5 m corr/RMSE	700 m corr/RMSE	2000 m corr/RMSE
1 ⭐	E1 dense+dilated	dense ConvNeXt	k=3, d=[1, 8, 16, 16]	1/2°	84.04 M	+0.9810	0.1173	+0.9906	+0.8059	−0.196	−6.666	0.6704 / 0.4890	0.3457 / 0.2646	0.3834 / 0.0480
2	E1 dense+dilated	dense ConvNeXt	k=3, d=[1, 8, 16, 16]	1°	84.04 M	+0.9792	0.1238	+0.9896	+0.4850	−4.1949	−15.0211	0.5513 / 0.5690	0.2785 / 0.2863	0.2435 / 0.0615
3	kernel_peakmid_wide	depthwise TrueConvNeXt	k=[7, 13, 21, 31], d=1	1°	~12 M	+0.9788	0.1244	+0.9899	−4.380	−39.84	−103.91	0.6428 / 0.4922	0.2755 / 0.2299	0.2540 / 0.0476
4	kernel_peakmid_seed42	depthwise TrueConvNeXt	k=[7, 13, 21, 31], d=1 (seed 42)	1°	~12 M	+0.9648	0.1612	+0.9885	−3.168	−51.62	−144.61	0.6482 / 0.4833	0.3358 / 0.2097	0.1985 / 0.0485
5	E16 v3 multiscale-dw	depthwise + parallel dilations	k=3, d=[1, 2, 4, 16] + parallels	1°	13.40 M	+0.9554	0.1832	+0.9796	−5.061	−32.22	−83.73	0.5736 / 0.5466	0.3195 / 0.2293	0.2657 / 0.0488
6	Paper Samudra-2	dense ConvNeXt (V2)	k=3, d=[1, 2, 4, 8]	1/2°	84.09 M	+0.9333	0.2217	+0.9662	+0.8435	−2.0166	−13.4489	0.6795 / 0.4721	0.2975 / 0.2500	0.4076 / 0.0450
7	Paper Samudra-2	dense ConvNeXt (V2)	k=3, d=[1, 2, 4, 8]	1°	84.09 M	+0.9331	0.2222	+0.9675	+0.8787	−1.5793	−16.1695	0.6842 / 0.4594	0.2839 / 0.2291	0.3370 / 0.0433
8	kernel_control	dense ConvNeXt	k=7, d=1 (ConvNeXt default)	1°	~84 M	+0.9320	0.2243	+0.9686	−2.354	−27.67	−119.41	0.6220 / 0.5057	0.3260 / 0.2256	0.1827 / 0.0524
9	kernel_peakmid_v1	depthwise TrueConvNeXt	k=[7, 13, 21, 31], d=1	1°	~12 M	+0.9314	0.2258	+0.9670	−1.551	−17.23	−75.74	0.6419 / 0.4810	0.3792 / 0.2058	0.2666 / 0.0479
10	kernel_peakmid_reparam	depthwise + reparam branch	k=[7, 13, 21, 31] + 3×3 reparam	1°	~12 M	+0.8729	0.3062	+0.9346	−2.217	−22.68	−59.75	0.6060 / 0.5103	0.3332 / 0.2312	0.3225 / 0.0462
11	large_kernel_v4	depthwise TrueConvNeXt	k=[7, 13, 21, 31] reverse	1°	~12 M	+0.8438	0.3389	+0.9203	+0.5090	−13.67	−44.54	0.6107 / 0.5012	0.2845 / 0.2173	0.2655 / 0.0463
❌	E1 dense+dilated (broken)	dense ConvNeXt	k=3, d=[1, 8, 16, 16]	1/4°	84.04 M	−0.3338	0.9914	−0.0651	−324.6	−3397.8	−27992.6	0.1496 / 3.906	0.0091 / 0.900	0.0232 / 0.486

_These are ranked by Niño R²._

(Full one degree viz comparison in this folder)

One degree training: https://wandb.ai/ocean_emulators/default/runs/4jvm2k1z?nw=nwuseralxmrs
One degree eval: https://wandb.ai/ocean_emulators/default/runs/epb3vf1l?nw=nwuseralxmrs
Half-degree training: https://wandb.ai/ocean_emulators/default/runs/qkhejvyl?nw=nwuseralxmrs
Half-degree eval: https://wandb.ai/ocean_emulators/default/runs/iateffe3?nw=nwuseralxmrs

I have not yet run this at quarter-degree resolution, but I can. Happy to add more results / collect more data to see if this is something we want to merge! Full experiment configurations can be found in #727. This was the most effective result, so it is the only one we'll merge. My goal is to land the learnings from these experiments, as I am now beyond my time box for them.


jder · 2026-06-15T19:42:43Z

Thanks, @alxmrs. My first impression is that this is a pretty mixed bag for 1-degree and I don't understand why we'd want a dilation of 16 for the coarsest level. That gives a 33x33 receptive field. At 1 degree, the grid is 11x22 at that point so the latitudes are all padding and the longitudes are wrapping around the earth more than once.

1/2 degree results do look better, but the coarsest half-degree level is still 22x45 which (with a 33x33 receptive field) still gets mostly vertical padding.
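
The receptive-field argument above can be checked numerically. This is a minimal sketch under stated assumptions: the coarsest-level grid sizes (11x22 and 22x45) are taken from the comment rather than derived from the model, latitude is assumed to use non-periodic padding, the kernel size is odd, and the helper names are hypothetical. A k=3 convolution with dilation d spans d·(k−1)+1 cells, and the sketch counts how many off-centre taps along latitude land on real cells instead of padding.

```python
def dilated_extent(kernel_size: int, dilation: int) -> int:
    """Number of cells spanned by one dilated kernel along a single axis."""
    return dilation * (kernel_size - 1) + 1


def in_grid_tap_fraction(size: int, kernel_size: int, dilation: int) -> float:
    """Fraction of off-centre taps that hit real cells on a non-periodic axis."""
    half = kernel_size // 2
    offsets = [dilation * i for i in range(-half, half + 1) if i != 0]
    hits = sum(0 <= row + off < size for row in range(size) for off in offsets)
    return hits / (size * len(offsets))


# Coarsest-level grid sizes (lat, lon) as quoted in the comment above.
coarsest = {"1 degree": (11, 22), "1/2 degree": (22, 45)}

extent = dilated_extent(kernel_size=3, dilation=16)
for name, (n_lat, n_lon) in coarsest.items():
    frac = in_grid_tap_fraction(n_lat, kernel_size=3, dilation=16)
    print(
        f"{name}: extent {extent} vs grid {n_lat}x{n_lon}, "
        f"in-grid latitude taps {frac:.0%}, "
        f"longitude wraps: {extent > n_lon}"
    )
```

With these inputs, no off-centre latitude tap lands on a real cell at 1 degree, and about a quarter do at 1/2 degree, which is consistent with "all padding" and "mostly vertical padding" respectively.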

Basically, I would feel better about one of:

- Having this be our default for 1/2 or 1/4 degree only,
- Seeing results with narrower final convolutions, or
- Getting these runs evaluated with the new metric suite we're going after.

What do you think?

alxmrs · 2026-06-16T21:05:04Z

I like your analysis, Jesse. Yeah, I'd be comfortable landing this change for 1/2 and 1/4 degree only. Maybe it'd be in our interest to tune the last hparam to be specific to each resolution. My Claude did warn me of the vertical padding vs. receptive field issue, but it passed it off with the idea that the NN would be able to learn to deal with these boundary conditions.

> Getting these runs evaluated with the new metric suite we're going after.

I'd like to double down on this point: my plan for landing this will be to pause my experiments until I can rigorously evaluate these hparam changes with @YuanYuan98's new evaluation system. I see this change as a useful test case for how well we can automate the metrics system.

> Seeing results with narrower final convolutions

I also agree with this pushback in principle. I was experimenting with a really wide breadth of levers in my initial investigation, and I could see this branch becoming a narrower search just on the optimal dilation hparams for each resolution. I mentioned this somewhere in chat, but one experiment I am especially curious about is to see if we could tune the dilations to mimic GraphCast's hierarchical icosahedral grid (in receptive field coverage).
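
One possible starting point for a per-resolution search is to bound the final dilation by the coarsest grid height. This is only a sketch of that idea, not something implemented in this PR: the function name `max_dilation_that_fits` is hypothetical, the coarsest-level heights (11 and 22 rows) are the ones quoted in the review above, and "fits" is taken to mean that the kernel extent d·(k−1)+1 does not exceed the number of latitude rows.

```python
def max_dilation_that_fits(size: int, kernel_size: int = 3) -> int:
    """Largest dilation d such that d * (kernel_size - 1) + 1 <= size."""
    return max(1, (size - 1) // (kernel_size - 1))


# Coarsest-level latitude row counts, as quoted in the review discussion.
coarsest_rows = {"1 degree": 11, "1/2 degree": 22}

for name, rows in coarsest_rows.items():
    d = max_dilation_that_fits(rows)
    extent = d * (3 - 1) + 1
    print(f"{name}: max dilation {d} (extent {extent} of {rows} rows)")
```

Under this criterion, the final dilation would be capped at 5 for 1 degree and 10 for 1/2 degree, both well below the 16 used in this config. Whether such a cap helps or hurts would still need to be measured.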

In sum, thanks for your review, and I'm happy to postpone landing this change until later. If you'd like this PR to be merged sooner, I could make a narrower contribution to make this specific to 1/2 degree Samudra.

jder · 2026-06-17T13:21:54Z

OK, thanks for the detailed response. I think I'm in favor of using this as a test for our new metrics, so let's pause for now and move forward once we have those results.

Commit d9ff988: Improvement to Samudra v2: Larger effective resolutions via adjustments to dilations schedule.

alxmrs requested a review from jder June 2, 2026 21:05

alxmrs marked this pull request as ready for review June 2, 2026 21:05

jder marked this pull request as draft June 17, 2026 13:22

alxmrs wants to merge 1 commit into main from u/alxmrs/bigger-dilations
