
Remove incorrect push! and pop! gradients #1025

Merged · 9 commits into FluxML:master · Sep 28, 2021
Conversation

@mcabbott (Member) commented on Jul 12, 2021

This is a replacement for #876, which doesn't seem to get the right gradients at all.

It also looks into fixing #992, where similar gradients appear never to be called with nontrivial input.

@mcabbott (Member, Author)
After fiddling a bit, I'm not really sure this can be fixed. Here's the core problem:

julia> using Zygote: @showgrad

julia> gradient([1,2,3], 4) do xs, y
             a = sum(@showgrad(xs))
             b = sum(push!(@showgrad(xs), y))
             c = sum(@showgrad xs)
             a+b+c
           end
∂(xs) = Fill(1, 4)  # c, final size
∂(xs) = Fill(1, 3)  # b, size before push!
∂(xs) = Fill(1, 4)  # a, surprising, a bug?
ERROR: DimensionMismatch("arrays could not be broadcast to a common size; got a dimension with lengths 3 and 4")
Stacktrace:
...
  [8] accum(::FillArrays.Fill{Int64, 1, Tuple{Base.OneTo{Int64}}}, ::FillArrays.Fill{Int64, 1, Tuple{Base.OneTo{Int64}}}, ::FillArrays.Fill{Int64, 1, Tuple{Base.OneTo{Int64}}})

Pushing to or popping from xs means that the same variable corresponds to arrays of different sizes at different points in the program, and the gradients being accumulated for it would somehow need to keep track of that. One could hack accum not to throw an error, but that would not make it do the right thing. Whether the deep internals can keep track of this, I'm not quite sure.
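A rough sketch of the accumulation step, assuming array gradients are combined elementwise (which is what the broadcast error above suggests; accum_sketch is an illustrative stand-in, not Zygote's actual accum):

accum_sketch(dx::AbstractArray, dy::AbstractArray) = dx .+ dy   # combine two gradient contributions

accum_sketch(ones(4), ones(4))   # fine: both contributions have the final length
accum_sketch(ones(3), ones(4))   # DimensionMismatch, as in the stacktrace above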

@darsnack (Member)
Yeah, it seems like you'd need element-level tracking to make sure things line up during accumulation, even if we did have some padding hack for the size mismatch.

@mcabbott (Member, Author)
It's possible that if you always wrote xs = push!(xs, y), then Zygote would understand that the label xs is attached to different-length arrays at different points in the code, and accumulate their gradients correctly. Maybe some attempt was made to automate that when trying to allow mutation via @adjoint! etc.? I've never looked that deep into the internals. Maybe @simeonschaub knows things.

But without something like that, we should probably just make these back into errors.
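For what it's worth, a non-mutating rewrite of the example above already works, since vcat is differentiable; a minimal sketch, not part of this PR:

using Zygote

gradient([1.0, 2.0, 3.0], 4.0) do xs, y
    ys = vcat(xs, y)       # build a new, longer array instead of mutating xs with push!
    sum(xs) + sum(ys)      # xs and ys are distinct variables, so there is no size clash
end
# expected: a length-3 gradient for xs and a scalar gradient for y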

@ToucheSir (Member)
Rolling back the clock a bit: is there a way we could tackle the original issue in #37 (comment) without something like #876? As noted in FluxML/Flux.jl#1614 (comment), that would get us back to a known point that is less buggy than what exists now.

@mcabbott (Member, Author)
That does sound like a good path. #37 doesn't have a stacktrace, but I can block out time on my calendar to compile the DiffEqverse and try it sometime soon...

@ToucheSir (Member)
Partially answering my own question now that this is up for review: @ChrisRackauckas's comment at #876 (comment) suggests that permitting push!/pop! for AbstractVector wouldn't have done much for pop!(::BinaryHeap) anyhow. This seems to be corroborated by the DataStructures.jl method as well, which not only calls push! but also performs scalar array mutation in a loop here.

@mcabbott (Member, Author) commented on Sep 5, 2021

OK, then you've got more out of these scattered threads than I did. If the goal is to make certain objects in DataStructures.jl, like BinaryHeap, AD-able, then I think someone should state that directly on an issue there, and we can discuss.

The percolate_up! and heappush! functions you link to don't look like good candidates, for the same reason as push!(::Vector, ...) above, and for the more general reason that allowing f!(x) assumes that no other operation has closed over x in order to re-use it in the backward pass. But maybe something can be attached to a higher-level function. Or maybe the bad things can have errors attached to them, to leave a safe happy path.
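To make that closure point concrete, here is a minimal sketch (sum_abs2_with_pullback and back are illustrative names, not Zygote internals):

# The pullback for sum(abs2, x) closes over x and reads it again on the
# backward pass, so mutating x in between corrupts the gradient.
function sum_abs2_with_pullback(x)
    y = sum(abs2, x)
    back(ȳ) = ȳ .* 2 .* x          # uses x as it is when back is called
    return y, back
end

x = [1.0, 2.0]
y, back = sum_abs2_with_pullback(x)
push!(x, 3.0)                      # some later operation mutates x in place
back(1.0)                          # [2.0, 4.0, 6.0]: length 3, not the correct [2.0, 4.0]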

The code in #37 (comment) leads me (eventually) to this error in all 4 cases, from which I learn nothing about what's desired:

ERROR: MethodError: no method matching fast_materialize(::Vector{Float64})
Closest candidates are:
  fast_materialize(::Base.Broadcast.Broadcasted{S}) where S at /Users/me/.julia/packages/FastBroadcast/WP7Ws/src/FastBroadcast.jl:18
Stacktrace:
  [1] macro expansion
    @ ~/.julia/packages/Zygote/nsu1Y/src/compiler/interface2.jl:0 [inlined]
  [2] _pullback(ctx::Zygote.Context, f::typeof(FastBroadcast.fast_materialize), args::Vector{Float64})
    @ Zygote ~/.julia/packages/Zygote/nsu1Y/src/compiler/interface2.jl:9

@mcabbott (Member, Author) commented on Sep 5, 2021

The new CI error on 1.3 is not caused by this PR. It's a bad interaction between complex numbers in the gradient of ^ and Dual numbers from hessian. It's fixed (well, skipped) here: https://github.com/FluxML/Zygote.jl/pull/1044/files#diff-3483c521f73b08fdb6b00f014614cc0d69f87ea7b098a24aff838cbdc812704dR25
So it would be simplest to merge that first.

@mcabbott changed the title from "Fix push! and pop! gradients" to "Remove incorrect push! and pop! gradients" on Sep 10, 2021
@ToucheSir closed this on Sep 28, 2021
@ToucheSir reopened this on Sep 28, 2021
@ToucheSir (Member)
Apparently my attempt to re-run CI was a bust; I guess this needs a rebase?

Rolling things back sucks, but it'll buy us some time to ponder how best to address push!, pop! and co. without more issues like #992 (comment) popping up. So if the tests pass and there aren't any unrelated regressions, this is probably good to go.

Co-authored-by: Brian Chen <[email protected]>
@mcabbott merged commit 4e43922 into FluxML:master on Sep 28, 2021
@mcabbott deleted the pushpop branch on September 28, 2021 at 23:57