I am trying to construct a Poisson regression model, and running the code below raises a Patsy error: Number of rows mismatch between data argument and type (29 versus 1):

import pandas as pd
from patsy import dmatrices
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

df = pd.read_csv('ships.csv', header=0, infer_datetime_format=True, parse_dates=[0], index_col=[0])

mask = np.random.rand(len(df)) < 0.8
df_train = df[mask]
df_test = df[~mask]
print('Training data set length='+str(len(df_train)))
print('Testing data set length='+str(len(df_test)))

expr = """ damage ~ type + construction + operation + months """

y_train, X_train = dmatrices(expr, df_train, return_type='dataframe')
y_test, X_test = dmatrices(expr, df_test, return_type='dataframe')

Output:

PatsyError: Number of rows mismatch between data argument and type (29 versus 1)
    damage ~ type + construction + operation + months
             ^^^^

Could anybody please help me with this?

Thanks

Zerone

1 Answer

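Thanks all, but I have sorted it out. The issue was with the read_csv call. Reading the file with df = pd.read_csv('ships.csv', header=0) works; the remaining arguments (infer_datetime_format, parse_dates, index_col) were unnecessary, as the data contains no datetime column.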
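
For anyone hitting the same error, here is a minimal sketch of what was probably going on. It assumes that type is the first column of ships.csv; the rows below are made-up stand-in data, not the real file. With index_col=[0], that first column becomes the DataFrame index and is no longer a regular column, so Patsy most likely resolves the name type in the formula to Python's built-in type instead, which would explain the "29 versus 1" mismatch.

```python
import io

import pandas as pd
from patsy import dmatrices

# Hypothetical stand-in for ships.csv: invented rows, and the column
# order (with 'type' first) is an assumption.
csv_text = """type,construction,operation,months,damage
A,60,60,127,0
A,60,75,63,0
B,65,60,1095,3
B,70,75,13099,44
C,65,75,1948,1
C,75,75,2161,12
"""

expr = "damage ~ type + construction + operation + months"

# With index_col=[0] the first column ('type') becomes the index,
# so it is no longer listed among the DataFrame's columns.
df_bad = pd.read_csv(io.StringIO(csv_text), header=0, index_col=[0])
type_is_column_bad = "type" in df_bad.columns
print("index name:", df_bad.index.name)
print("'type' in columns with index_col=[0]:", type_is_column_bad)

# Without index_col, 'type' stays a regular column and the formula
# can find it in the data.
df_ok = pd.read_csv(io.StringIO(csv_text), header=0)
y, X = dmatrices(expr, df_ok, return_type="dataframe")
print("'type' in columns without index_col:", "type" in df_ok.columns)
print("design matrix rows:", len(X))
```

If the first column really is needed as the index, calling df.reset_index() before building the design matrices should also make it available to the formula again.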

Thanks

Zerone