Disclaimer: I am very new to neural networks and TensorFlow.
I am trying to create a QA application where the user asks a question and the application gives the answer. Most of the traditional methods I tried did not work, were not accurate enough, or required manual intervention. While researching unsupervised QA applications, I came across BERT.
BERT, as Google claims, is a state-of-the-art neural network model that achieved the highest score on the SQuAD 2.0 leaderboard. I wish to use this model for my application and test its performance.
I have created a Windows 2012 Datacenter edition virtual machine in Compute Engine. I have created a Cloud TPU using ctpu.
I have the BERT large uncased model in Cloud Storage.
How do I train the BERT large uncased model on SQuAD 2.0?
Please feel free to correct me if I am wrong: my understanding is that a Cloud TPU is just a device, like a CPU or GPU. However, if you read this, it describes Cloud TPU as if it were a virtual machine ("On Cloud TPU you can run with BERT-Large as...").
Where do I run run_squad.py, as mentioned here?
python run_squad.py \
--vocab_file=$BERT_LARGE_DIR/vocab.txt \
--bert_config_file=$BERT_LARGE_DIR/bert_config.json \
--init_checkpoint=$BERT_LARGE_DIR/bert_model.ckpt \
--do_train=True \
--train_file=$SQUAD_DIR/train-v2.0.json \
--do_predict=True \
--predict_file=$SQUAD_DIR/dev-v2.0.json \
--train_batch_size=24 \
--learning_rate=3e-5 \
--num_train_epochs=2.0 \
--max_seq_length=384 \
--doc_stride=128 \
--output_dir=gs://some_bucket/squad_large/ \
--use_tpu=True \
--tpu_name=$TPU_NAME \
--version_2_with_negative=True
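To make the questions below concrete, here is a minimal sketch of how I imagine the command above being assembled in Python. The bucket paths, the TPU name "my-tpu", and the helper build_run_squad_command are hypothetical placeholders of mine, and the sketch assumes (unverified on my side) that gs:// paths can be passed directly to the flags and that $TPU_NAME holds a name rather than an IP address.

```python
import shlex

# Hypothetical placeholders, not values from my actual setup.
BERT_LARGE_DIR = "gs://some_bucket/bert_large"   # assumed location of the BERT-Large files
SQUAD_DIR = "gs://some_bucket/squad"             # assumed location of the SQuAD 2.0 JSON files
OUTPUT_DIR = "gs://some_bucket/squad_large/"
TPU_NAME = "my-tpu"                              # assumption: a TPU name, not an external IP


def build_run_squad_command(bert_dir, squad_dir, output_dir, tpu_name):
    """Return the run_squad.py invocation from the question as an argument list."""
    # Values are kept as strings so they appear exactly as in the original command.
    flags = {
        "vocab_file": f"{bert_dir}/vocab.txt",
        "bert_config_file": f"{bert_dir}/bert_config.json",
        "init_checkpoint": f"{bert_dir}/bert_model.ckpt",
        "do_train": "True",
        "train_file": f"{squad_dir}/train-v2.0.json",
        "do_predict": "True",
        "predict_file": f"{squad_dir}/dev-v2.0.json",
        "train_batch_size": "24",
        "learning_rate": "3e-5",
        "num_train_epochs": "2.0",
        "max_seq_length": "384",
        "doc_stride": "128",
        "output_dir": output_dir,
        "use_tpu": "True",
        "tpu_name": tpu_name,
        "version_2_with_negative": "True",
    }
    return ["python", "run_squad.py"] + [f"--{name}={value}" for name, value in flags.items()]


command = build_run_squad_command(BERT_LARGE_DIR, SQUAD_DIR, OUTPUT_DIR, TPU_NAME)

# Print the command in a form that could be pasted into a shell.
print(" ".join(shlex.quote(arg) for arg in command))
```

This only builds and prints the command; it does not launch anything. Whether it should then be run from the Compute Engine VM, and whether the placeholder values have the right form, is exactly what I am unsure about.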
How do I access the storage bucket files from the virtual machine for the vocab_file argument?
Is the external IP address the value for the $TPU_NAME environment variable?