I am trying to train a masked language model from scratch. I use the code below to create the RoBERTa model architecture, but when I compared it with RobertaLM, I found that it does not have the GELU activation layer. Could someone help explain how to do this correctly? Thanks.
from transformers import RobertaConfig, RobertaForMaskedLM

# Configuration similar to roberta-base (12 layers, 12 attention heads)
config = RobertaConfig(
    vocab_size=50265,
    max_position_embeddings=514,
    num_attention_heads=12,
    num_hidden_layers=12,
    type_vocab_size=1,
)

# Randomly initialized model for masked language modeling
model = RobertaForMaskedLM(config=config)
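For reference, this is roughly how I am doing the comparison; a minimal sketch, assuming the pretrained roberta-base checkpoint as the reference model:

from transformers import RobertaForMaskedLM

# Pretrained reference model to compare the from-scratch architecture against
reference = RobertaForMaskedLM.from_pretrained("roberta-base")

# Print both module trees and compare them side by side
print(reference)
print(model)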