
I have large count data with 65 feature variables, Claims as the outcome variable, and Exposure as an offset variable. I want to implement the Poisson loss function in a neural network using Python. I wrote the following code; is this the right or wrong way to do it?

import numpy as np
import tensorflow as tf

LOGEXP01 = np.log(EXP01).reshape(-1, 1)  # natural log of the exposure variable, as a column

# Define the model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(66, activation='linear', input_shape=(66,)),
    tf.keras.layers.Dense(1, activation='relu')
])

# Compile the model with Poisson loss
model.compile(optimizer='Adam', loss="poisson")

# x1s holds the 65 feature variables; append log-exposure as a 66th column
X_exp = np.concatenate((x1s, LOGEXP01), axis=1)

# Train the model (y01 is the Claims response variable)
model.fit(X_exp, y01, epochs=10, batch_size=32)
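For comparison, here is my understanding of a GLM-style offset in Keras, where log(Exposure) is added to the linear predictor with a fixed coefficient of 1 instead of entering as a trainable 66th feature. This is my own sketch, with made-up data standing in for my real x1s/EXP01/y01, so it may also be wrong:

```python
import numpy as np
import tensorflow as tf

# Hypothetical data with the same shapes as mine: 1000 rows, 65 features
rng = np.random.default_rng(0)
x1s = rng.normal(size=(1000, 65)).astype("float32")
EXP01 = rng.uniform(0.1, 1.0, size=(1000, 1)).astype("float32")
y01 = rng.poisson(1.0, size=(1000, 1)).astype("float32")

feat_in = tf.keras.Input(shape=(65,))        # features
off_in = tf.keras.Input(shape=(1,))          # log(exposure), natural log

eta = tf.keras.layers.Dense(1)(feat_in)      # linear predictor from features only
# Offset: add log(exposure) with no trainable weight, then exponentiate
added = tf.keras.layers.Add()([eta, off_in])
rate = tf.keras.layers.Activation("exponential")(added)

model = tf.keras.Model([feat_in, off_in], rate)
model.compile(optimizer="adam", loss="poisson")
model.fit([x1s, np.log(EXP01)], y01, epochs=1, batch_size=32, verbose=0)
```

The exponential output guarantees a strictly positive rate, which the built-in Poisson loss needs because it takes log of the prediction.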

I also tried another way to use exposure as an offset, with the following code:

from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

inputs = Input(shape=(X_train.shape[1],))
x = Dense(65, activation="relu")(inputs)
x = Dense(45, activation="relu")(x)
x = Dense(10, activation="relu")(x)
rate = Dense(1, activation=tf.exp)(x)
p_y = tfp.layers.DistributionLambda(tfd.Poisson)(rate)
model_p = Model(inputs=inputs, outputs=p_y)

def NLL(y_true, y_hat, exposure):
    return -y_hat.log_prob(y_true, exposure)

model_p.compile(Adam(learning_rate=0.01), loss=NLL(E_train)) 

# X_train: 65 feature variables; y_train: Claims response; E_train: Exposure offset

hist_p = model_p.fit(x=X_train, y=y_train, validation_split=0.2, epochs=10, verbose=1, sample_weight=E_train)
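I also sketched a variant where exposure multiplies the predicted rate directly (equivalent to adding log(Exposure) on the log scale), so the loss function needs no extra exposure argument at all. Here I replaced the TFP DistributionLambda with an explicit Poisson negative log-likelihood, and the data are made up for illustration, so I am not sure this is the intended pattern either:

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-ins for my real X_train / E_train / y_train
rng = np.random.default_rng(1)
X_train = rng.normal(size=(500, 65)).astype("float32")
E_train = rng.uniform(0.1, 1.0, size=(500, 1)).astype("float32")
y_train = rng.poisson(1.0, size=(500, 1)).astype("float32")

feat_in = tf.keras.Input(shape=(65,))
exp_in = tf.keras.Input(shape=(1,))                     # raw exposure, not logged

h = tf.keras.layers.Dense(45, activation="relu")(feat_in)
mu = tf.keras.layers.Dense(1, activation="exponential")(h)  # rate per unit exposure
rate = tf.keras.layers.Multiply()([mu, exp_in])         # expected count = mu * exposure

def poisson_nll(y_true, y_pred):
    # Poisson negative log-likelihood up to the constant log(y!)
    return y_pred - y_true * tf.math.log(y_pred + 1e-8)

model_p2 = tf.keras.Model([feat_in, exp_in], rate)
model_p2.compile(tf.keras.optimizers.Adam(learning_rate=0.001), loss=poisson_nll)
model_p2.fit([X_train, E_train], y_train, epochs=1, batch_size=64, verbose=0)
```

Since both mu and exposure are strictly positive, the predicted rate is always positive and the log in the loss is safe.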

Is the above code using exposure as an offset in the NLL the right way to fit the model?
