Commit 10de688
Fix a bug where the agent would be rewarded for the wrong action
Because reward is delayed until the other player has taken action, I
need to actually choose a given agent’s action *after* that
retrospective reward is given.

1 parent 0eb494f commit 10de688
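The ordering problem described in the commit message can be illustrated with a minimal sketch. The changed lines themselves are not reproduced here, so everything below is hypothetical: the `Agent` class, its `choose_action` and `give_reward` methods, and the two `take_turn_*` helpers are assumed names for illustration, not the repository's actual code. The sketch only shows why a delayed reward must be credited before the agent's stored action is overwritten.

```python
class Agent:
    """Hypothetical agent that remembers its most recent action."""

    def __init__(self):
        self.last_action = None
        self.credited = []  # (action, reward) pairs, kept for inspection

    def choose_action(self, action):
        # Choosing an action overwrites the remembered one.
        self.last_action = action

    def give_reward(self, reward):
        # The reward is credited to whichever action is currently remembered.
        if self.last_action is not None:
            self.credited.append((self.last_action, reward))


def take_turn_buggy(agent, new_action, delayed_reward):
    # Wrong order: the new action replaces the old one first, so the
    # reward earned by the old action is credited to the new action.
    agent.choose_action(new_action)
    agent.give_reward(delayed_reward)


def take_turn_fixed(agent, new_action, delayed_reward):
    # Right order: settle the retrospective reward for the previous
    # action, then choose the next action.
    agent.give_reward(delayed_reward)
    agent.choose_action(new_action)


buggy = Agent()
buggy.choose_action("a")            # previous turn
take_turn_buggy(buggy, "b", 1.0)    # reward for "a" arrives this turn

fixed = Agent()
fixed.choose_action("a")
take_turn_fixed(fixed, "b", 1.0)
```

In this sketch `buggy.credited` ends up as `[("b", 1.0)]`, while `fixed.credited` is `[("a", 1.0)]`, which is the behavior the commit message says the fix restores.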
1 file changed
Lines changed: 3 additions & 3 deletions
| Hunk | Original file lines | New file lines | Change |
|---|---|---|---|
| 1 | 171-178 | 171-177 | 2 lines removed (original 174-175), 1 line added (new 174) |
| 2 | 547-555 | 546-555 | 1 line removed (original 550), 2 lines added (new 549 and 552) |