Remove incorrect push! and pop! gradients #1025
After fiddling a bit, I'm not really sure this can be fixed. Here's the core problem:
pushing to / popping from
Yeah, seems like you'd need element-level tracking to make sure things lined up when accumulating, even if we did have some padding hack for the size mismatch.
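The size-mismatch and alignment concerns above can be illustrated with a small toy. This is only a sketch in plain Python, not Zygote's implementation: the helpers `accumulate`, `pad`, and `accumulate_by_id`, and the element labels `"a"` through `"d"`, are hypothetical names made up for the illustration. It assumes a vector is used once before and once after being mutated, so two cotangents have to be summed.

```python
def accumulate(a, b):
    """Elementwise sum of two cotangents; requires equal lengths."""
    if len(a) != len(b):
        raise ValueError(f"size mismatch: {len(a)} vs {len(b)}")
    return [x + y for x, y in zip(a, b)]


def pad(g, n):
    """Zero-pad cotangent g to length n (the 'padding hack')."""
    return g + [0.0] * (n - len(g))


def accumulate_by_id(uses):
    """Sum cotangents per element label instead of per position."""
    total = {}
    for ids, grads in uses:
        for label, g in zip(ids, grads):
            total[label] = total.get(label, 0.0) + g
    return total


# Case 1: one use with 3 elements, then a push, then a use with 4 elements.
g_before = [1.0, 1.0, 1.0]
g_after = [2.0, 2.0, 2.0, 2.0]
try:
    accumulate(g_before, g_after)
    mismatch = False
except ValueError:
    mismatch = True
# Padding repairs the sizes, and positions happen to line up here.
padded = accumulate(pad(g_before, 4), g_after)

# Case 2: use [a, b, c], then pop c and push d, then use [a, b, d].
# Sizes agree, but position 3 refers to different elements in each use.
g1 = [1.0, 1.0, 1.0]
g2 = [2.0, 2.0, 2.0]
naive = accumulate(g1, g2)  # silently merges c and d
tracked = accumulate_by_id([(["a", "b", "c"], g1), (["a", "b", "d"], g2)])
```

In the second case the positional sum raises no error yet credits one slot with contributions from two different elements, which is the situation that per-element tracking would be needed to avoid.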
It's possible that if you always wrote But without something like that, we should probably just turn these back into errors.
Rolling back the clock a bit, is there a way we could tackle the original issue in #37 (comment) without something like #876? As noted in FluxML/Flux.jl#1614 (comment), that would get us back to a known point that is less buggy than what exists now.
That does sound like a good path. #37 doesn't have a stacktrace, so I can block out time on my calendar to compile the DiffEqverse and try it sometime soon...
Partially answering my own question now that this is up for review, @ChrisRackauckas's comment at #876 (comment) suggests that permitting
Ok, then you've got more out of these scattered threads than I did. If the goal is to make certain objects in DataStructures like The The code in #37 (comment) leads me (eventually) to this error in all 4 cases, from which I learn nothing about what's desired:
The new CI error on 1.3 is not caused by this PR. It's a bad interaction between complex numbers in the gradient of
Apparently my attempt to re-run CI was a bust. I guess this needs a rebase? Rolling things back sucks, but it'll buy us some time to ponder how best to address
Co-authored-by: Brian Chen <[email protected]>
This is a replacement for #876, which doesn't seem to get the right gradients at all.
It also looks into fixing #992, where similar gradients appear never to be called with nontrivial input.