
I am creating different polynomial regression models by passing different powers of the same feature.

So if I want a polynomial model of degree 3 in the feature 'x', I pass x^1, x^2 and x^3 to the regression model as the features.

The following function creates an SFrame of powers of 'x' from the values of 'x' passed to it and the highest degree that needs to be created.

import graphlab

def polynomial_sframe(feature, degree):

    # assume that degree >= 1
    # initialize the SFrame:
    poly_sframe = graphlab.SFrame()

    # set poly_sframe['power_1'] equal to the passed feature
    poly_sframe['power_1'] = feature

    # first check if degree > 1
    if degree > 1:

        # then loop over the remaining degrees:
        # range(2, degree+1) starts at 2 and stops at degree
        for power in range(2, degree+1):

            # give the column a name:
            name = 'power_' + str(power)

            # then assign poly_sframe[name] to the appropriate power of feature
            poly_sframe[name] = feature.apply(lambda x: x**power)

    return poly_sframe

Using the SFrame generated by the above function, I am able to fit polynomial models of different degrees in x, as shown in the following code.

poly3_data = polynomial_sframe(sales['sqft_living'], 3)

my_features = poly3_data.column_names() # get the name of the features

poly3_data['price'] = sales['price'] # add price to the data since it's the target

model3 = graphlab.linear_regression.create(poly3_data, target = 'price', features = my_features, validation_set = None)

GraphLab is able to generate a model up to degree 4. Beyond that it fails, and the following code reports that an overflow error has occurred.

poly15_data = polynomial_sframe(sales['sqft_living'], 5)

my_features = poly15_data.column_names() # get the name of the features

poly15_data['price'] = sales['price'] # add price to the data since it's the target

model15 = graphlab.linear_regression.create(poly15_data, target = 'price', features = my_features, validation_set = None)

---------------------------------------------------------------------------
ToolkitError                              Traceback (most recent call last)
<ipython-input-76-df5cbc0b6314> in <module>()
      2 my_features = poly15_data.column_names() # get the name of the features
      3 poly15_data['price'] = sales['price'] # add price to the data since it's the target
----> 4 model15 = graphlab.linear_regression.create(poly15_data, target = 'price', features = my_features, validation_set = None)

C:\Users\mk\Anaconda2\envs\dato-env\lib\site-packages\graphlab\toolkits\regression\linear_regression.pyc in create(dataset, target, features, l2_penalty, l1_penalty, solver, feature_rescaling, convergence_threshold, step_size, lbfgs_memory_level, max_iterations, validation_set, verbose)
    284                         step_size = step_size,
    285                         lbfgs_memory_level = lbfgs_memory_level,
--> 286                         max_iterations = max_iterations)
    287 
    288     return LinearRegression(model.__proxy__)

C:\Users\mk\Anaconda2\envs\dato-env\lib\site-packages\graphlab\toolkits\_supervised_learning.pyc in create(dataset, target, model_name, features, validation_set, verbose, distributed, **kwargs)
    451     else:
    452         ret = _graphlab.toolkits._main.run("supervised_learning_train",
--> 453                                            options, verbose)
    454         model = SupervisedLearningModel(ret['model'], model_name)
    455 

C:\Users\mk\Anaconda2\envs\dato-env\lib\site-packages\graphlab\toolkits\_main.pyc in run(toolkit_name, options, verbose, show_progress)
     87         _get_metric_tracker().track(metric_name, value=1, properties=track_props, send_sys_info=False)
     88 
---> 89         raise ToolkitError(str(message))

ToolkitError: Exception in python callback function evaluation: 
OverflowError('long too big to convert',): 
Traceback (most recent call last):
File "graphlab\cython\cy_pylambda_workers.pyx", line 426, in graphlab.cython.cy_pylambda_workers._eval_lambda
File "graphlab\cython\cy_pylambda_workers.pyx", line 171, in graphlab.cython.cy_pylambda_workers.lambda_evaluator.eval_simple
File "graphlab\cython\cy_flexible_type.pyx", line 1193, in graphlab.cython.cy_flexible_type.process_common_typed_list
File "graphlab\cython\cy_flexible_type.pyx", line 1138, in graphlab.cython.cy_flexible_type._fill_typed_sequence
File "graphlab\cython\cy_flexible_type.pyx", line 1385, in graphlab.cython.cy_flexible_type._ft_translate
OverflowError: long too big to convert

Is this error because my computer lacks memory to compute the regression model? How would this error be fixed?

Mustafa Khan

1 Answer


It seems like you have a typo at the end of this line: poly15_data = polynomial_sframe(sales['sqft_living'], 5)

Change 5 to 15 and it should work.
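
If the error persists after that change, one thing worth checking is whether the feature column holds integers. The message 'long too big to convert' would be consistent with an integer power growing past what a signed 64-bit integer can store, which is a numeric range limit and not a shortage of memory. The sketch below uses plain Python only, under that assumption; the sample values and the helper names largest_safe_base and float_powers are made up for illustration and are not part of GraphLab.

```python
INT64_MAX = 2**63 - 1  # largest value a signed 64-bit integer can hold


def largest_safe_base(power):
    """Return the largest non-negative integer b with b**power <= INT64_MAX."""
    lo, hi = 0, INT64_MAX
    # binary search over candidate bases; Python integers never overflow,
    # so the comparison itself is always exact
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid**power <= INT64_MAX:
            lo = mid
        else:
            hi = mid - 1
    return lo


def float_powers(values, degree):
    """Build {'power_k': [...]} columns, converting to float before raising
    to a power so the results stay within floating-point range."""
    return {'power_' + str(p): [float(v)**p for v in values]
            for p in range(1, degree + 1)}


# hypothetical square-footage values, not taken from the real data set
sample = [1180, 2570, 13540]

for p in (4, 5):
    limit = largest_safe_base(p)
    too_big = [v for v in sample if v > limit]
    print(p, limit, too_big)

columns = float_powers(sample, 5)
```

If this turns out to be the cause, the analogous change in polynomial_sframe would be to convert each value before exponentiating, for example lambda x: float(x)**power, so that the new columns are stored as floats.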