Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backwards differentiation #142

Open
Gertian opened this issue Sep 24, 2024 · 7 comments
Open

backwards differentiation #142

Gertian opened this issue Sep 24, 2024 · 7 comments

Comments

@Gertian
Copy link

Gertian commented Sep 24, 2024

Firstly, thanks the developers for the great package !

I tried to use the package with Zygote.jl (for automatic differentiation) but sadly this failed due to the usage of =. in the definition of pfaffian(A).

The usual fix for this is to define the backwards derivative manually, in this case I got :

using Zygote
using ChainRulesCore
using ChainRules

function ChainRules.rrule(::typeof(pfaffian), x::Union{Number, AbstractMatrix})
    Ω = pfaffian(x)
    function det_pullback(ΔΩ)
        ∂x = x isa Number ? zero(ΔΩ) : #=0.5 *=# inv(x)' * dot(Ω, ΔΩ)  #we removed the 0.5 because changing element ij immediatelyu changes element ji. 
        return (NoTangent(), ∂x)
    end
    return Ω, det_pullback
end

which works well from the test :

for _ in 1:100
    n = rand(1:10)

    A = rand(2*n,2*n) 
    A = A - transpose(A)
    A = SkewHermitian(A)  

    epsi = 1e-6

    function make_zero_but_ij(i,j)
        out = zero(A)
        if i != j 
            out[i,j] =  1.   #note, the one half here is because the pfaffian is antisymmetric in the rows and columns. 
        end
        return out
    end

    numerical = gradient(x->pfaffian(x), A)[1]
    analytical = [(pfaffian(A + epsi*make_zero_but_ij(i,j)  )-pfaffian(A))/epsi for i  1:2*n, j  1:2*n]

    @show analytical  numerical
end

would it be desirable if I made a pull request with this code ?

All the best,
Gertian

@stevengj
Copy link
Member

stevengj commented Sep 24, 2024

A pull request with a chain rule would be welcome. You'll want to do it as a package extension, similar to how e.g. SpecialFunctions.jl does it. This is a case where a custom rrule would probably be much more efficient than AD anyway, even if Zygote worked on this code.

(dot(Ω, ΔΩ) looks a bit weird to me since they are both real scalars, in which case it is just a multiplication. But the formula looks correct to me.)

(What you are calling "numerical" and what you are calling "analytical" in your test looks backwards to me.)

@Gertian
Copy link
Author

Gertian commented Sep 24, 2024

Hi,

Thanks for the quick reply ! I agree that this is indeed surely faster with a custom rrule so I'm happy I got forced into finally reading the zygote.jl manual.

As for the (dot(Ω, ΔΩ) I agree that this is a bit strange. However, I based based my code of the rrule for det in
https://github.com/JuliaDiff/ChainRules.jl/blob/e245d50a1ae56ce46fc8c1f0fe9b925964f1146e/src/rulesets/LinearAlgebra/dense.jl#L129
and made minimal changes since it's my first time implementing an rrule if you are certain that this can be dropped I'd be happy to do so !

As for the naming conventions in my test. Thanks for pointing that out. I'll be smarter in my pull request ;)

@stevengj
Copy link
Member

dot is harmless, since for two real scalars it simply calls *, so I would leave it as-is.

@stevengj
Copy link
Member

(Out of curiosity, in what context did you encounter a need to differentiate a Pfaffian?)

@Gertian
Copy link
Author

Gertian commented Sep 25, 2024

Thanks for your interest :)

I'm currently writing code to optimize Gaussian matrix product states so that they approximate groundstates of quantum many body problems. For now I'm just reproducing this work by Norbert Schuch and Bela Bauer , but in the future I plan to go beyond their approach.

More specifically, in this context the state is parametrized by a covariance matrix and the expectation value of operators (such as the Hamiltonian=energy) is calculated as the pfaffian of a submatrix of this covariance matrix. For more details I refer to this paper by Brevyi.

@smataigne
Copy link
Collaborator

Thank you @Gertian for your interest in this package and your PR! Thank you also @stevengj! I am personnally not familiar with AD in Julia but you are welcome to contribute for anything that you find useful.

@Gertian
Copy link
Author

Gertian commented Sep 26, 2024

Thank you @smataigne. We've currently got a pull request which implements the desired functionality as a package extension.

I'm still testing it but it seems like we will be able to merge soon :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants