Hi,
There appears to be an edge case in the current Adam implementation. If the optimizer contains non-real variables, i.e., quaternions or complex numbers, the variance estimation might not be what we want.
The Adam implementation computes the second-moment estimate as

$$v_p \leftarrow \beta_2 \, v_p + (1 - \beta_2) \cdot \texttt{dr.sqr}(g_p),$$

where $g_p$ is the parameter gradient. If a parameter is a quaternion or complex number, the gradient will be of the same type. Then `dr.sqr` will compute a quaternion or complex product, which is different from the element-wise product that we would expect here.

While there is some literature on this subject (PyTorch GitHub issue, PyTorch docs, https://arxiv.org/pdf/0906.4835.pdf), the easiest practical solution is to just optimize using `Vector2f` or `Vector4f` instead.
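To make the discrepancy concrete, here is a small plain-Python sketch (no Dr.Jit required) comparing the Hamilton product of a quaternion with itself against the element-wise square. The gradient value is made up for illustration; note the quaternion square can even have a negative component, which is clearly wrong for a variance estimate.

```python
def quat_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

# Hypothetical quaternion-valued gradient (illustration only)
g = (1.0, 2.0, 3.0, 4.0)

# What a quaternion sqr computes: the quaternion product g * g
quat_square = quat_mul(g, g)           # -> (-28.0, 4.0, 6.0, 8.0)

# What Adam's variance estimate needs: the element-wise square
elementwise = tuple(c * c for c in g)  # -> (1.0, 4.0, 9.0, 16.0)
```

The negative first component of the quaternion square would then flow into `sqrt(v_p)` in the Adam step, so the resulting update is not just biased but numerically invalid.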
Maybe Mitsuba should raise an error if a quaternion or complex value is assigned to the optimizer?