Infinite gradient in simple network Latent Space Model in RStan

Question

I am trying to fit a Latent Space Model in RStan, a well established type of (social) network model. (Meaning this MAY be a question for CrossValidated, but I have reasons to believe it's not, yet.) I know packages are available, but I am working towards building something that no package implements, and want to first make sure I can make a verifiable model work.

To give a quick intuition, a LSM takes a network, and tries to find a layout for the nodes, such that close-together nodes have edges, and far-apart nodes don't. The positions of the nodes are latent variables, which the model attempts to estimate (along with an intercept, corresponding to the overall propensity to form an edge).

My code (all below, in two files) generates a simple and small barbell graph (dense on the ends, sparse in the middle), from a data generating process of the type a LSM assumes. It then passes that data both to RStan, which finds an infinite gradient, and the package latentnet, which works fine.

The error suggests I start with a good initialization, so I provide the TRUE generating values, and the sampler still finds an infinite gradien. It also suggests loosening the constraints on variables where possible, but my constraints are only lower-bounds on variance parameters. This leads me to believe my problem is elsewhere, but I am not yet familiar enough with Stan to be able to see where. Any help, including suggestions of tests to run, would be much appreciated.

First, the R:

# GENERATING LSM DATA #
# The sparse path connecting the dense ends
w <- cbind(0.5*0:8-2, rep(0,9))
# The two dense barbell ends
x <- mvrnorm(10, mu=c(-2,0), Sigma=array(c(.2,0,0,.2),dim=c(2,2)))
y <- mvrnorm(10, mu=c(2,0), Sigma=array(c(.2,0,0,.2),dim=c(2,2)))
z <- rbind(w,x,y)

d <- as.matrix(dist(z))
a <- 1

logit <- function(x) 1/(1+exp(-x))

for(i in 1:N){
    for(j in 1:N){
        # This is what a LSM assumes
        X[i,j] <- (logit(a - d[i,j]) > .5)+0
    }
}
mean(X)

N <- nrow(X)

lsm.data <- list(N=N,
                 edges=X,
                 sigma_alpha=10,
                 mu_z=c(0,0),
                 sigma_z=array(c(1,0,0,1),dim=c(2,2)))

# This initialization provides the TRUE values of the parameters
lsm.init <- function(){
    list(alpha=a, z=z)
}

library(rstan)
lsm.fit <- stan("lsm.stan",
                model_name="lsm",
                data=lsm.data,
                init=lsm.init,
                verbose=T)

# Here is an independent verification of the data -- this model fits fine,
# and a plot of the result looks exactly as I think it should (a barbell).
library(latentnet)
lsm.ergmm <- ergmm(X ~ euclidean(d=2))
plot(lsm.ergmm)

And here, the model in Stan:

data {
    int<lower=0> N;   // number of nodes
    int<lower=0> edges[N,N]; //for now non-blank => 1
    real<lower=0> sigma_alpha; // alpha is the density
    vector[2] mu_z;             // mean of the latent positions
    matrix<lower=0>[2,2] sigma_z; // variance of the latent positions
}

parameters {
    real alpha;     // density intercept
    vector[2] z[N]; // latent positions
}


model {

    real d; // just a convenience variable, to split up a long line

    // prior on alpha
    alpha ~ normal(0, sigma_alpha);

    // latent variables
    for (i in 1:N) {
        z[i] ~ multi_normal(mu_z, sigma_z);
    }

    for(i in 1:N){
        for(j in 1:N){
            d = sqrt(pow(z[i,1]-z[j,1],2) + pow(z[i,2]-z[j,2],2));
            edges[i,j] ~ bernoulli_logit(alpha - d);
        }
    }
}

What led you to conclude there was an infinite gradient? You could use `d = hypot(z[i,1] - z[j,1], z[i,2] - z[j,2]);` You want to vectorize the multi-normal, but really you just want to use a regular normal since `sigma_z` is a unit matrix. That'd just be `for (k in 1:2) z[ , k] ~ normal(mu_z[k], 1);`. — Bob Carpenter, Nov 24 '17 at 19:09
The error claims an infinite gradient explicitly. Whether that's true or not, I didn't ascertain -- a solution was to use the squared L2 distance between latent positions (i.e. remove the sqrt), presumably because it removed sharp points from the surface — one_observation, May 04 '18 at 12:35

Infinite gradient in simple network Latent Space Model in RStan

0 Answers0