Improvement to Samudra v2: Larger effective resolutions via increased dilation schedule #753

Draft
alxmrs wants to merge 1 commit into main from u/alxmrs/bigger-dilations

alxmrs commented Jun 1, 2026

Member

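This hyperparameter config improves rollouts in the deep ocean, with some regression at upper-level depths. The gain is clearest at 1/2°; at 1° the deep ocean is roughly a tie while the upper and mid levels regress.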

| Metric | Paper Samudra-2 1° | E1 dense+dilated 1° | Δ (E1 − paper) | Winner |
| --- | --- | --- | --- | --- |
| Niño 3.4 R² | 0.9331 | 0.9792 | +0.046 | E1 |
| Niño 3.4 RMSE (°C) | 0.2222 | 0.1238 | −44 % | E1 |
| Niño 3.4 corr | 0.9675 | 0.9896 | +0.022 | E1 |
| Upper R² (0–700 m) | 0.8787 | 0.4850 | −0.394 | Paper |
| Mid R² (700–2000 m) | −1.579 | −4.1949 | −2.62 | Paper |
| Deep R² (2000–7000 m) | −16.170 | −15.021 | +7 % error reduction | ≈ tie |
| 2.5 m snapshot corr | 0.6842 | 0.5513 | −0.133 | Paper |
| 2.5 m snapshot RMSE | 0.4594 | 0.5690 | +0.110 | Paper |
| 700 m snapshot corr | 0.2839 | 0.2785 | −0.005 | ≈ tie |
| 700 m snapshot RMSE | 0.2291 | 0.2863 | +0.057 | Paper |
| 2000 m snapshot corr | 0.3370 | 0.2435 | −0.094 | Paper |
| 2000 m snapshot RMSE | 0.0433 | 0.0615 | +0.018 | Paper |

| Metric | Paper Samudra-2 1/2° | E1 dense+dilated 1/2° | Δ (E1 − paper) | Winner |
| --- | --- | --- | --- | --- |
| Niño 3.4 R² | 0.9333 | 0.9810 | +0.0477 | E1 |
| Niño 3.4 RMSE (°C) | 0.2217 | 0.1173 | −47.1 % | E1 |
| Niño 3.4 corr | 0.9662 | 0.9906 | +0.0244 | E1 |
| Upper R² (0–700 m) | 0.8435 | 0.8059 | −0.0376 | Paper |
| Mid R² (700–2000 m) | −2.017 | −0.196 | +1.821 | E1 |
| Deep R² (2000–7000 m) | −13.449 | −6.666 | +50 % error reduction | E1 |
| 2.5 m snapshot corr | 0.6795 | 0.6704 | −0.0091 | ≈ tie |
| 2.5 m snapshot RMSE | 0.4721 | 0.4890 | +0.0169 | ≈ tie |
| 700 m snapshot corr | 0.2975 | 0.3457 | +0.0482 | E1 |
| 700 m snapshot RMSE | 0.2500 | 0.2646 | +0.0146 | ≈ tie |
| 2000 m snapshot corr | 0.4076 | 0.3834 | −0.0242 | Paper |
| 2000 m snapshot RMSE | 0.0450 | 0.0480 | +0.0030 | ≈ tie |

Full experiment results

| Rank | Experiment | Block | Spatial | Resolution | Params | Niño R² | Niño RMSE | Niño corr | Upper R² | Mid R² | Deep R² | 2.5 m corr/RMSE | 700 m corr/RMSE | 2000 m corr/RMSE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | ⭐ E1 dense+dilated | dense ConvNeXt | k=3, d=[1, 8, 16, 16] | 1/2° | 84.04 M | +0.9810 | 0.1173 | +0.9906 | +0.8059 | −0.196 | −6.666 | 0.6704 / 0.4890 | 0.3457 / 0.2646 | 0.3834 / 0.0480 |
| 2 | E1 dense+dilated | dense ConvNeXt | k=3, d=[1, 8, 16, 16] | | 84.04 M | +0.9792 | 0.1238 | +0.9896 | +0.4850 | −4.1949 | −15.0211 | 0.5513 / 0.5690 | 0.2785 / 0.2863 | 0.2435 / 0.0615 |
| 3 | kernel_peakmid_wide | depthwise TrueConvNeXt | k=[7, 13, 21, 31], d=1 | | ~12 M | +0.9788 | 0.1244 | +0.9899 | −4.380 | −39.84 | −103.91 | 0.6428 / 0.4922 | 0.2755 / 0.2299 | 0.2540 / 0.0476 |
| 4 | kernel_peakmid_seed42 | depthwise TrueConvNeXt | k=[7, 13, 21, 31], d=1 (seed 42) | | ~12 M | +0.9648 | 0.1612 | +0.9885 | −3.168 | −51.62 | −144.61 | 0.6482 / 0.4833 | 0.3358 / 0.2097 | 0.1985 / 0.0485 |
| 5 | E16 v3 multiscale-dw | depthwise + parallel dilations | k=3, d=[1, 2, 4, 16] + parallels | | 13.40 M | +0.9554 | 0.1832 | +0.9796 | −5.061 | −32.22 | −83.73 | 0.5736 / 0.5466 | 0.3195 / 0.2293 | 0.2657 / 0.0488 |
| 6 | Paper Samudra-2 | dense ConvNeXt (V2) | k=3, d=[1, 2, 4, 8] | 1/2° | 84.09 M | +0.9333 | 0.2217 | +0.9662 | +0.8435 | −2.0166 | −13.4489 | 0.6795 / 0.4721 | 0.2975 / 0.2500 | 0.4076 / 0.0450 |
| 7 | Paper Samudra-2 | dense ConvNeXt (V2) | k=3, d=[1, 2, 4, 8] | | 84.09 M | +0.9331 | 0.2222 | +0.9675 | +0.8787 | −1.5793 | −16.1695 | 0.6842 / 0.4594 | 0.2839 / 0.2291 | 0.3370 / 0.0433 |
| 8 | kernel_control | dense ConvNeXt | k=7, d=1 (ConvNeXt default) | | ~84 M | +0.9320 | 0.2243 | +0.9686 | −2.354 | −27.67 | −119.41 | 0.6220 / 0.5057 | 0.3260 / 0.2256 | 0.1827 / 0.0524 |
| 9 | kernel_peakmid_v1 | depthwise TrueConvNeXt | k=[7, 13, 21, 31], d=1 | | ~12 M | +0.9314 | 0.2258 | +0.9670 | −1.551 | −17.23 | −75.74 | 0.6419 / 0.4810 | 0.3792 / 0.2058 | 0.2666 / 0.0479 |
| 10 | kernel_peakmid_reparam | depthwise + reparam branch | k=[7, 13, 21, 31] + 3×3 reparam | | ~12 M | +0.8729 | 0.3062 | +0.9346 | −2.217 | −22.68 | −59.75 | 0.6060 / 0.5103 | 0.3332 / 0.2312 | 0.3225 / 0.0462 |
| 11 | large_kernel_v4 | depthwise TrueConvNeXt | k=[7, 13, 21, 31] reverse | | ~12 M | +0.8438 | 0.3389 | +0.9203 | +0.5090 | −13.67 | −44.54 | 0.6107 / 0.5012 | 0.2845 / 0.2173 | 0.2655 / 0.0463 |
| | E1 dense+dilated (broken) | dense ConvNeXt | k=3, d=[1, 8, 16, 16] | 1/4° | 84.04 M | −0.3338 | 0.9914 | −0.0651 | −324.6 | −3397.8 | −27992.6 | 0.1496 / 3.906 | 0.0091 / 0.900 | 0.0232 / 0.486 |

_These are ranked by Niño R²._
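For readers puzzling over the large negative R² values next to positive correlations, here is a minimal sketch assuming the textbook definitions of R², RMSE, and Pearson correlation. The PR does not state the exact formulas or averaging used by the evaluation code, so the function names and the toy series below are illustrative assumptions only.

```python
import math


def r2_score(truth, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean = sum(truth) / len(truth)
    ss_res = sum((t - p) ** 2 for t, p in zip(truth, pred))
    ss_tot = sum((t - mean) ** 2 for t in truth)
    return 1.0 - ss_res / ss_tot


def rmse(truth, pred):
    """Root-mean-square error between two equal-length series."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth))


def pearson_corr(truth, pred):
    """Pearson correlation; insensitive to a constant offset in pred."""
    mt = sum(truth) / len(truth)
    mp = sum(pred) / len(pred)
    cov = sum((t - mt) * (p - mp) for t, p in zip(truth, pred))
    var_t = sum((t - mt) ** 2 for t in truth)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov / math.sqrt(var_t * var_p)


# Toy series (not data from this PR): the prediction tracks the truth
# perfectly in shape but carries a constant offset, like a drifting rollout.
truth = [0.0, 1.0, 2.0, 3.0]
biased = [t + 5.0 for t in truth]

print(pearson_corr(truth, biased))  # 1.0
print(rmse(truth, biased))          # 5.0
print(r2_score(truth, biased))      # -19.0
```

Under these definitions, an offset that is large relative to the variance of the truth drives R² far below zero even when correlation stays high, which is one plausible reading of the deep-ocean rows.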

[Figure: timeseries_grid_shallow_all_vars]

(Full one-degree viz comparison in this folder)

I have not yet run this at quarter-degree resolution, but I can. Happy to add more results or collect more data to see if this is something we want to merge! Full experiment configurations can be found in #727. This was the most effective result, so it's the only one we'll merge. My goal is to land the learnings from these experiments, since I am now past my time box for them.

@alxmrs alxmrs requested a review from jder June 2, 2026 21:05
@alxmrs alxmrs marked this pull request as ready for review June 2, 2026 21:05

jder commented Jun 15, 2026

Member

Thanks, @alxmrs. My first impression is that this is a pretty mixed bag for 1-degree, and I don't understand why we'd want a dilation of 16 for the coarsest level. With k=3, that gives a 33x33 receptive field ((3 − 1) × 16 + 1 = 33). At 1 degree, the grid is 11x22 at that point, so the off-center latitude taps are all padding and the longitudes are wrapping around the earth more than once.

1/2 degree results do look better, but the coarsest half-degree level is still 22x45 which (with a 33x33 receptive field) still gets mostly vertical padding.
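The receptive-field arithmetic above can be sketched roughly as follows. This assumes an odd, centered kernel with "same" padding along latitude and periodic wrapping along longitude; the PR does not confirm the padding mode, and the function names are hypothetical. The coarsest grid sizes (11x22 and 22x45) are taken from the comment itself.

```python
def effective_kernel_extent(kernel_size: int, dilation: int) -> int:
    """Span, in grid cells, covered by one dilated kernel along an axis."""
    return dilation * (kernel_size - 1) + 1


def padded_tap_fraction(axis_len: int, kernel_size: int, dilation: int) -> float:
    """Fraction of kernel taps that read padding along a non-periodic axis.

    Assumes an odd, centered kernel with 'same' padding (an assumption,
    not something the PR states).
    """
    half = kernel_size // 2
    offsets = [dilation * (i - half) for i in range(kernel_size)]
    outside = sum(
        1
        for pos in range(axis_len)
        for off in offsets
        if not 0 <= pos + off < axis_len
    )
    return outside / (axis_len * kernel_size)


extent = effective_kernel_extent(kernel_size=3, dilation=16)
print(f"effective extent: {extent}x{extent}")

# Coarsest-level grid sizes as quoted in the review comment.
for label, n_lat, n_lon in [("1 degree", 11, 22), ("1/2 degree", 22, 45)]:
    frac = padded_tap_fraction(n_lat, kernel_size=3, dilation=16)
    wraps = extent > n_lon  # kernel spans more than one full circle of longitude
    print(f"{label}: padded latitude taps = {frac:.2f}, wraps longitude = {wraps}")
```

With these assumptions, every off-center latitude tap reads padding on the 11-row grid (2 of 3 taps at each row), and 32 of the 44 off-center taps do so on the 22-row grid, which matches the "mostly vertical padding" concern.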

Basically, I would feel better about one of:

  • Having this be our default for 1/2 or 1/4 degree only
  • Seeing results with narrower final convolutions, or
  • Getting these runs evaluated with the new metric suite we're going after.

What do you think?

alxmrs commented Jun 16, 2026

Member Author

I like your analysis, Jesse. Yeah, I'd be comfortable landing this change for 1/2 and 1/4 degree only. Maybe it'd be in our interest to tune the last hparam to be specific to each resolution. My Claude did warn me of the vertical padding vs. receptive field issue, but it passed it off with the idea that the NN would be able to learn to deal with these boundary conditions.
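One way the per-resolution idea could look, as a rough sketch: the two schedules are the ones listed in the experiment table, but the constant names, the function, and the 0.5-degree cutoff are hypothetical and do not correspond to anything in the Samudra codebase.

```python
# Hypothetical names for illustration; only the schedules come from the PR.
PAPER_DILATIONS = (1, 2, 4, 8)    # Paper Samudra-2 schedule
WIDE_DILATIONS = (1, 8, 16, 16)   # E1 dense+dilated schedule


def dilations_for_resolution(degrees: float) -> tuple:
    """Pick a dilation schedule from the grid spacing in degrees.

    Assumed policy from the discussion: the wider schedule only for
    1/2 degree and finer, the paper schedule for coarser grids.
    """
    if degrees <= 0:
        raise ValueError("grid spacing must be positive")
    return WIDE_DILATIONS if degrees <= 0.5 else PAPER_DILATIONS


for spacing in (1.0, 0.5, 0.25):
    print(spacing, dilations_for_resolution(spacing))
```

The 1/4 degree branch follows the discussion rather than a validated run, since the only quarter-degree row in the results table is marked broken.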

> Getting these runs evaluated with the new metric suite we're going after.

I'd like to double down on this point: my plan for landing this will be to pause my experiments until I can rigorously evaluate these hparam changes with @YuanYuan98's new evaluation system. I see this change as a useful test case for how well we can automate the metrics system.

> Seeing results with narrower final convolutions

I also agree with this pushback in principle. I was experimenting with a really wide breadth of levers in my initial investigation, and I could see this branch becoming a narrower search just on the optimal dilation hparams for each resolution. I mentioned this somewhere in chat, but one experiment I am especially curious about is to see if we could tune the dilations to mimic GraphCast's hierarchical icosahedral grid (in receptive field coverage).

In sum, thanks for your review, and I'm happy to postpone landing this change until later. If you'd like this PR to be merged sooner, I could make a narrower contribution that is specific to 1/2 degree Samudra.

jder commented Jun 17, 2026

Member

OK, thanks for the detailed response. I think I'm in favor of using this as a test for our new metrics so let's pause for now and we can move forward once we have those results.

@jder jder marked this pull request as draft June 17, 2026 13:22