Defining a simple neural network in mxnet error

Question

I am building a simple neural network with MXNet, but I am running into a problem with the step() method.

x1.shape=(64, 1, 1000)
y1.shape=(64, 1, 10)

net = nn.Sequential()
net.add(nn.Dense(H,activation='relu'),nn.Dense(90,activation='relu'),nn.Dense(D_out))

for t in range(500):
    #y_pred = net(x1)

    #loss = loss_fn(y_pred, y)
    #for i in range(len(x1)):

    with autograd.record():
        output=net(x1)
        loss =loss_fn(output,y1)
    loss.backward()
    trainer.step(64)
    if t % 100 == 99:
        print(t, loss)
        #optimizer.zero_grad()

UserWarning: Gradient of Parameter dense30_weight on context cpu(0) has not been updated by backward since last step. This could mean a bug in your model that made it only use a subset of the Parameters (Blocks) for this iteration. If you are intentionally only using a subset, call step with ignore_stale_grad=True to suppress this warning and skip updating of Parameters with stale gradient

Welcome to StackOverflow. Please read and follow the posting guidelines in the help documentation, as suggested when you created this account. [Minimal, complete, verifiable example](https://stackoverflow.com/help/minimal-reproducible-example) applies here. We cannot effectively help you until you post your MCVE code and accurately specify the problem. We should be able to paste your posted code into a text file and reproduce the problem you specified. Your code is not a MCVE, and you haven't specified a problem. — Prune, Aug 30 '19 at 20:35

score 0 · Accepted Answer · answered Sep 22 '19 at 15:59

The warning indicates that the parameters you passed to the trainer are not part of the computational graph used in this iteration. You need to initialize the parameters of your model and define the trainer from those same parameters. Unlike PyTorch, you don't need to call zero_grad in MXNet, because by default new gradients overwrite the old ones rather than accumulating. The following code shows a simple neural network implemented using MXNet's Gluon API:

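import mxnet as mx
from mxnet import autograd, gluon, nd

# Context and problem sizes (num_examples is an example value; real_fn below uses 2 inputs)
model_ctx = mx.cpu()
num_inputs = 2
num_examples = 10000

# Define model
net = gluon.nn.Dense(1)
net.collect_params().initialize(mx.init.Normal(sigma=1.), ctx=model_ctx)
square_loss = gluon.loss.L2Loss()
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.0001})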

# Create random input and labels
def real_fn(X):
    return 2 * X[:, 0] - 3.4 * X[:, 1] + 4.2

X = nd.random_normal(shape=(num_examples, num_inputs))
noise = 0.01 * nd.random_normal(shape=(num_examples,))
y = real_fn(X) + noise

# Define DataLoader
batch_size = 4
train_data = gluon.data.DataLoader(gluon.data.ArrayDataset(X, y), batch_size=batch_size, shuffle=True)
num_batches = num_examples // batch_size

for e in range(10):
    # Reset the running loss at the start of each epoch
    cumulative_loss = 0

    # Iterate over training batches
    for i, (data, label) in enumerate(train_data):

        # Load data on the CPU
        data = data.as_in_context(mx.cpu())
        label = label.as_in_context(mx.cpu())

        # Record the forward pass so gradients can be computed
        with autograd.record():
            output = net(data)
            loss = square_loss(output, label)

        # Backpropagation
        loss.backward()
        trainer.step(batch_size)

        cumulative_loss += nd.mean(loss).asscalar()

    print("Epoch %s, loss: %s" % (e, cumulative_loss / num_examples))
