I'm using a ColumnTransformer in my Python script to one-hot encode a categorical variable in a dataset before fitting a linear regression model. Based on its output, the transformer appears to be working correctly. However, when I fit the transformed data to a LinearRegression model, I get the error ValueError: could not convert string to float: 'New Hampshire'. I suspect the ColumnTransformer is not converting the categorical variable to a numerical format, but I'm not sure how to fix this. Any suggestions on how to resolve this error would be greatly appreciated.
Link to the dataset: https://www.kaggle.com/datasets/justin2028/unemployment-in-america-per-us-state
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
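# Load the dataset
data_set = pd.read_csv('Unemployment in America Per US State.csv')
# Features: every column except the last; target: the last column
X = data_set.iloc[:, :-1]
y = data_set.iloc[:, -1]
# Hold out 30% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# One-hot encode the column at position 1 and pass the remaining columns through unchanged
ct = ColumnTransformer([('one_hot_encoder', OneHotEncoder(sparse=False), [1])], remainder='passthrough')
X_train = ct.fit_transform(X_train)
X_test = ct.transform(X_test)
# Fit the regression model (this is where the ValueError is raised)
regressor = LinearRegression()
regressor.fit(X_train, y_train)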
y_pred = regressor.predict(X_test)
I am using a ColumnTransformer and have applied one-hot encoding to the categorical variable. I expected the categorical data to be converted to numbers, but instead I get the error ValueError: could not convert string to float.
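
To narrow down which columns still reach the model as strings, a check along these lines may help. It is only a minimal sketch on a made-up frame: the column names and values are hypothetical and not taken from the Kaggle file. It lists the positions of the non-numeric feature columns so they can be compared with the [1] passed to the ColumnTransformer.

```python
import pandas as pd

# Hypothetical toy data standing in for the real CSV (names and values are made up)
toy = pd.DataFrame({
    "code": [1, 2, 3, 4],
    "state": ["Alabama", "New Hampshire", "Alabama", "New Hampshire"],
    "population": ["2,605,000", "1,100,000", "2,610,000", "1,105,000"],
    "rate": [6.6, 4.1, 6.5, 4.0],
})

# Same split as in the question: every column except the last is a feature
X_toy = toy.iloc[:, :-1]

# Positions of feature columns that pandas does not treat as numeric
string_positions = [
    i for i, col in enumerate(X_toy.columns)
    if not pd.api.types.is_numeric_dtype(X_toy[col])
]
print(string_positions)  # [1, 2]
```

In this toy frame two columns hold strings, so encoding only position 1 would still pass the other one through unchanged. Running the same check on X from the real file should show whether the list passed to the ColumnTransformer covers every non-numeric column.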