
I'm wondering how the following code could be made faster. At the moment it seems unreasonably slow, and I suspect I may be using the autograd API incorrectly. The output I expect is the Jacobian of f with respect to the parameters, with one row per element of timeline. I do get that output, but it takes a long time:

import numpy as np
from autograd import jacobian


def f(params):
    mu_, log_sigma_ = params
    Z = timeline * mu_ / log_sigma_
    return Z


timeline = np.linspace(1, 100, 40000)

gradient_at_mle = jacobian(f)(np.array([1.0, 1.0]))

I would expect the following:

  1. jacobian(f) returns a function that computes the Jacobian (the gradient of each output) w.r.t. the parameters.
  2. jacobian(f)(np.array([1.0, 1.0])) is that Jacobian evaluated at the point (1, 1). To me, this should behave like a vectorized numpy function, so it should execute very fast, even for a 40k-element array. However, this is not what is happening. (A hand-derived sketch of the output I expect is below.)
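
For reference, this is roughly the output I would expect, written out by hand (just an illustration of the expected shape and values, not part of my actual code):

import numpy as np

timeline = np.linspace(1, 100, 40000)
mu_, log_sigma_ = 1.0, 1.0

# f(params) = timeline * mu_ / log_sigma_, so the Jacobian is a 40000 x 2 matrix:
#   df/dmu_        =  timeline / log_sigma_
#   df/dlog_sigma_ = -timeline * mu_ / log_sigma_**2
expected_jacobian = np.column_stack([
    timeline / log_sigma_,
    -timeline * mu_ / log_sigma_**2,
])
print(expected_jacobian.shape)  # (40000, 2)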

Even something like the following has the same poor performance:

import numpy as np
from autograd import jacobian


def f(params, t):
    mu_, log_sigma_ = params
    Z = t * mu_ / log_sigma_
    return Z


timeline = np.linspace(1, 100, 40000)

gradient_at_mle = jacobian(f)(np.array([1.0, 1.0]), timeline)
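
For what it's worth, here is roughly how I am timing it (a minimal timing harness, not part of the original code):

import time

import numpy as np
from autograd import jacobian


def f(params, t):
    mu_, log_sigma_ = params
    return t * mu_ / log_sigma_


timeline = np.linspace(1, 100, 40000)

# time a single evaluation of the Jacobian at (1, 1)
start = time.perf_counter()
gradient_at_mle = jacobian(f)(np.array([1.0, 1.0]), timeline)
print("jacobian(f) took %.2f seconds" % (time.perf_counter() - start))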


1 Answer


From https://github.com/HIPS/autograd/issues/439 I gathered that there is an undocumented function, autograd.make_jvp, which calculates the Jacobian using fast forward-mode differentiation.

The link states:

Given a function f, vectors x and v in the domain of f, make_jvp(f)(x)(v) computes both f(x) and the Jacobian of f evaluated at x, right multiplied by the vector v.

To get the full Jacobian of f you just need to write a loop to evaluate make_jvp(f)(x)(v) for each v in the standard basis of f's domain. Our reverse mode Jacobian operator works in the same way.

From your example:

import autograd.numpy as np
from autograd import make_jvp

def f(params):
    mu_, log_sigma_ = params
    Z = timeline * mu_ / log_sigma_
    return Z

timeline = np.linspace(1, 100, 40000)

gradient_at_mle = make_jvp(f)(np.array([1.0, 1.0]))

# loop through each basis vector
# [1, 0] gives (f(x), first column of the Jacobian at x)
# [0, 1] gives (f(x), second column of the Jacobian at x)
for basis in (np.array([1, 0]), np.array([0, 1])):
    val_of_f, col_of_jacobian = gradient_at_mle(basis)
    print(col_of_jacobian)

Output:

[  1.           1.00247506   1.00495012 ...  99.99504988  99.99752494
 100.        ]
[  -1.           -1.00247506   -1.00495012 ...  -99.99504988  -99.99752494
 -100.        ]

This runs in ~0.005 seconds on Google Colab.
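
If you want the full 40000 x 2 Jacobian as a single array rather than printing the columns, you can stack them; a minimal sketch following the same pattern:

import autograd.numpy as np
from autograd import make_jvp


def f(params):
    mu_, log_sigma_ = params
    return timeline * mu_ / log_sigma_


timeline = np.linspace(1, 100, 40000)

jvp_at_mle = make_jvp(f)(np.array([1.0, 1.0]))

# one column of the Jacobian per basis vector, stacked side by side
columns = [jvp_at_mle(basis)[1] for basis in (np.array([1.0, 0.0]), np.array([0.0, 1.0]))]
full_jacobian = np.column_stack(columns)
print(full_jacobian.shape)  # (40000, 2)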

Edit:

Forward-mode (JVP) rules aren't defined for functions like cdf yet, but you can use another undocumented function, make_jvp_reversemode, which does cover it. Usage is similar, except that the output is only the column of the Jacobian, not the value of the function as well:

import autograd.numpy as np
from autograd.scipy.stats.norm import cdf
from autograd.differential_operators import make_jvp_reversemode


def f(params):
    mu_, log_sigma_ = params
    Z = timeline * cdf(mu_ / log_sigma_)
    return Z

timeline = np.linspace(1, 100, 40000)

gradient_at_mle = make_jvp_reversemode(f)(np.array([1.0, 1.0]))

# loop through each basis
# [1, 0] evaluates first column of jacobian
# [0, 1] evaluates second column of jacobian
for basis in (np.array([1, 0]), np.array([0, 1])):
    col_of_jacobian = gradient_at_mle(basis)
    print(col_of_jacobian)

Output:

[0.05399097 0.0541246  0.05425823 ... 5.39882939 5.39896302 5.39909665]
[-0.05399097 -0.0541246  -0.05425823 ... -5.39882939 -5.39896302 -5.39909665]

Note that make_jvp_reversemode will be slightly faster than make_jvp by a constant factor due to its use of caching.
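
If you want to check the speed difference yourself, a rough timing comparison on the cdf-free f could look like this (timings will vary by machine; this assumes make_jvp_reversemode returns just the Jacobian-vector product, as described above):

import timeit

import autograd.numpy as np
from autograd import make_jvp
from autograd.differential_operators import make_jvp_reversemode


def f(params):
    mu_, log_sigma_ = params
    return timeline * mu_ / log_sigma_


timeline = np.linspace(1, 100, 40000)
x = np.array([1.0, 1.0])
basis_vectors = (np.array([1.0, 0.0]), np.array([0.0, 1.0]))

# make_jvp(f)(x)(v) returns (f(x), J @ v); make_jvp_reversemode(f)(x)(v) returns just J @ v
forward = make_jvp(f)(x)
reverse = make_jvp_reversemode(f)(x)

print("make_jvp:             %.4f s" % timeit.timeit(lambda: [forward(v)[1] for v in basis_vectors], number=100))
print("make_jvp_reversemode: %.4f s" % timeit.timeit(lambda: [reverse(v) for v in basis_vectors], number=100))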

  • @Cam.Davidson.Pilon hold on, I absolutely don't want to be pressuring you into anything. If you need more please let me know !!! – Primusa Feb 05 '19 at 03:01
  • Well, there is this case I can't figure. I'm hitting an error: https://gist.github.com/CamDavidsonPilon/82f24833e1c965c1beee65bdd574335a -- the only addition is the norm.cdf – Cam.Davidson.Pilon Feb 06 '19 at 04:09
  • It appears to be from ` raise NotImplementedError("JVP of {} wrt argnums {} not defined" .format(name, parent_argnums))` in https://github.com/HIPS/autograd/blob/master/autograd/core.py#L113 – Primusa Feb 06 '19 at 04:18
  • Here the `cdf` gets defined for `VJP` but not `JVP` https://github.com/HIPS/autograd/blob/master/autograd/scipy/stats/norm.py – Primusa Feb 06 '19 at 04:21
  • @Cam.Davidson.Pilon, see edit, it should be working. I've got your back :) – Primusa Feb 06 '19 at 04:28
  • how are you so smart at this?! :) – Cam.Davidson.Pilon Feb 06 '19 at 04:41
  • @Cam.Davidson.Pilon haha :) I'm just a hs student who can google things and is compelled by internet points. Just realized that you wrote that awesome github book on bayesian methods so now I need to get your comment framed! Anyway let me know if anything else comes up I'll try to help the best I can :) – Primusa Feb 06 '19 at 04:57
  • We all have our niches. Good for you for being young and already wanting to help others. Looking forward to your “autograd for hackers” book – Cam.Davidson.Pilon Feb 06 '19 at 13:52