I'm working on a stance detection project on Brexit tweets using the Universal Sentence Encoder (USE). For my code, I'm trying to adapt this notebook: https://www.kaggle.com/kshitijmohan/sentiment-analysis-universal-sentence-encoder-91
When I embed the training texts, I get this error:
input must be a vector, got shape: []
(t_stance is the stance label: 0 = remain, 1 = leave, 2 = neutral.)
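import numpy as np
import pandas as pd
import tensorflow as tf
import tensorflow_hub as hub
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
from tqdm import tqdm

RANDOM_SEED = 42
np.random.seed(RANDOM_SEED)
tf.random.set_seed(RANDOM_SEED)

# base_dir is defined earlier in my notebook
train = pd.read_csv(base_dir + "k500_train.csv")

module_url = "https://tfhub.dev/google/universal-sentence-encoder/4"
use = hub.load(module_url)

# One-hot encode the stance labels
type_one_hot = OneHotEncoder(sparse=False).fit_transform(
    train.t_stance.to_numpy().reshape(-1, 1)
)

train_reviews, test_reviews, y_train, y_test = train_test_split(
    train.text,
    type_one_hot,
    test_size=.4,
    random_state=42
)

# Embed each tweet one at a time
X_train = []
for r in tqdm(train_reviews):
    emb = use(r)  # the error below is raised here, on the first iteration
    review_emb = tf.reshape(emb, [-1]).numpy()
    X_train.append(review_emb)
X_train = np.array(X_train)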
0%| | 0/210000 [00:01<?, ?it/s]
InvalidArgumentError: input must be a vector, got shape: []
[[{{node StatefulPartitionedCall/StatefulPartitionedCall/text_preprocessor/tokenize/StringSplit/StringSplit}}]] [Op:__inference_restored_function_body_10218]
Function call stack:
restored_function_body
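
A likely reading of the traceback, as far as I can tell: the encoder's tokenizer expects a 1-D batch of strings (shape [n]), while use(r) passes a single Python string, which becomes a scalar tensor of shape []. Below is a minimal sketch of batching the texts as lists before calling the encoder. The names embed_in_batches and fake_encoder are hypothetical and not from the notebook; fake_encoder only imitates the shape check and the 512-dimensional output of USE v4 so the sketch runs without downloading the model.

```python
import numpy as np


def embed_in_batches(texts, encoder, batch_size=256):
    """Embed strings by passing lists (1-D batches) to the encoder.

    `encoder` is any callable that maps a list of n strings to an
    (n, embedding_dim) array-like, e.g. the model returned by hub.load(...).
    """
    texts = list(texts)  # works for a pandas Series as well as a plain list
    chunks = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]    # a list, never a bare string
        chunks.append(np.asarray(encoder(batch)))  # shape (len(batch), embedding_dim)
    return np.concatenate(chunks, axis=0)


def fake_encoder(batch):
    """Hypothetical stand-in for the USE model; it only mimics the shape contract."""
    arr = np.asarray(batch)
    if arr.ndim != 1:
        # A bare string has shape (), which is what triggers the error in question
        raise ValueError(f"input must be a vector, got shape: {list(arr.shape)}")
    return np.zeros((len(arr), 512), dtype=np.float32)


# Three example tweets, embedded in batches of two
X = embed_in_batches(
    ["leave means leave", "remain", "not sure"], fake_encoder, batch_size=2
)
print(X.shape)
```

With the real model, the call would presumably be X_train = embed_in_batches(train_reviews, use). The smallest change to the original loop would be emb = use([r]), which wraps each tweet in a one-element list, though batching should be considerably faster over 210000 rows.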