I am trying to build a linear regression model that predicts a son's height from his father's height:
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
%matplotlib inline
from sklearn.linear_model import LinearRegression
Headings_cols = ['Father', 'Son']
df = pd.read_csv('http://www.math.uah.edu/stat/data/Pearson.txt',
                 delim_whitespace=True, names=Headings_cols)
X = df['Father']
y = df['Son']
model2 = LinearRegression()
model2.fit(y, X)
plt.scatter(X, y,color='g')
plt.plot(X, model2.predict(X),color='g')
plt.scatter(y, X, color='r')
plt.plot(y, X, color='r')
When I run this, I get the following error:
ValueError: could not convert string to float: 'Father'
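A likely cause, assuming the first line of Pearson.txt is a header row containing the words Father and Son: passing names= without header=0 makes pandas read that line as data, so both columns end up holding strings. The sketch below illustrates this with a small inline sample in place of the URL (the numbers are made up for illustration, not taken from the real file). It also passes the predictor as a 2-D object and calls fit(X, y) in that order, which is what LinearRegression expects.

```python
import io

import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up sample standing in for Pearson.txt; it assumes the real file
# begins with a "Father Son" header line.
sample = io.StringIO(
    "Father Son\n"
    "65.0 59.8\n"
    "63.3 63.2\n"
    "65.0 63.3\n"
    "65.8 62.8\n"
    "61.1 64.3\n"
)

# header=0 marks the first line as a header, so names= replaces it
# instead of the header text being parsed as a data row.
df = pd.read_csv(sample, sep=r"\s+", header=0, names=["Father", "Son"])

X = df[["Father"]]  # 2-D predictor with shape (n_samples, 1)
y = df["Son"]       # 1-D target

model = LinearRegression()
model.fit(X, y)     # predictor first, target second
predicted = model.predict(X)

print(model.coef_, model.intercept_)
```

With the real file, the same read_csv arguments could be applied to the URL, and plt.plot(df['Father'], predicted) would then draw the fitted line over the scatter plot.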
My second question: how do I calculate the average height of the sons and the standard error of the mean?
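For reference, the standard error of the mean is usually defined as the sample standard deviation divided by the square root of the sample size. A minimal sketch of both quantities, assuming the Son column has already been loaded as numbers (the values here are made up for illustration):

```python
import numpy as np
import pandas as pd

# Made-up values standing in for df['Son'].
sons = pd.Series([59.8, 63.2, 63.3, 62.8, 64.3])

# Average of the sons' values.
mean_son = sons.mean()

# Standard error of the mean: sample standard deviation (ddof=1)
# divided by the square root of the number of observations.
sem_manual = sons.std(ddof=1) / np.sqrt(len(sons))

# pandas offers the same calculation directly (ddof=1 by default).
sem_pandas = sons.sem()

print(mean_son, sem_manual, sem_pandas)
```

The manual formula and Series.sem() should agree; the manual version is shown only to make the definition explicit.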