Q1: I am trying to implement an autoencoder, and I have data like this:
800 300 1 100000 -0.1
789 400 1.6 100500 -0.4
804 360 1.2 100420 -0.2
....
How am I supposed to normalize this data so it can be used for training?
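For example, I imagined something like column-wise standardization (z-scoring), but I am not sure it is the right approach; the delimiter and loading here are just guesses from my setup:

import numpy as np

# My attempt: standardize each column independently, since the columns
# are on very different scales (a guess, not a known-correct method).
data = np.loadtxt('test.csv', delimiter=',')  # assuming a plain CSV of floats
mean = data.mean(axis=0)
std = data.std(axis=0)
std[std == 0] = 1.0                           # guard against constant columns
normalized = (data - mean) / std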
Q2: Since I didn't know how to do the normalization, I skipped it and fed the raw data to the autoencoder for training, but the gradient becomes NaN after several iterations. Here is the code:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import torch.utils.data as Data

BATCH_SIZE = 1
BETA = 3           # sparsity penalty weight (not used in this snippet)
INPUT = 89
HIDDEN = 64
EPOCHS = 1
LR = 0.01
RHO = 0.1          # target sparsity (not used in this snippet)

raw_data = Loader('test.csv')       # my custom loader, returns a (2700, 89) array
print(np.shape(raw_data))
raw_data = torch.Tensor(raw_data)

# Autoencoder: input and target are the same tensor
train_dataset = Data.TensorDataset(raw_data, raw_data)
train_loader = Data.DataLoader(dataset=train_dataset, batch_size=BATCH_SIZE, shuffle=True)

model = SparseAutoEncoder(INPUT, HIDDEN)   # my model class (not shown)
optimizer = optim.Adam(model.parameters(), lr=LR)
loss_func = nn.MSELoss()

for epoch in range(EPOCHS):
    for b_index, (x, _) in enumerate(train_loader):
        x = x.view(-1, INPUT)
        encoded, decoded = model(x)
        loss = loss_func(decoded, x)   # reconstruction loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print("Epoch: [%3d], Loss: %.4f" % (epoch + 1, loss.item()))
raw_data has shape (2700, 89): each row has 89 dimensions, and the values are on very different scales (as mentioned in Q1).
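If normalization is the fix, is this the right place to apply it, i.e. on the array before building the tensors? A sketch assuming the z-scoring from Q1 (Loader and the rest are as in my code above):

raw_data = Loader('test.csv')               # (2700, 89), mixed scales
raw_data = np.asarray(raw_data, dtype=np.float32)
mean = raw_data.mean(axis=0)
std = raw_data.std(axis=0)
std[std == 0] = 1.0                         # avoid division by zero
raw_data = (raw_data - mean) / std          # zero mean, unit variance per column
raw_data = torch.Tensor(raw_data)           # then build the TensorDataset as above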