
I am trying to implement the Fibonacci sequence in Python with machine learning. I want my program to predict the next 5 numbers after the given input. For example, if I pass [0,1,1], it should predict and return [2,3,5,8,13]. However, I can't find a way to do this; my program can currently predict only the next number. Yes, I could hard-code it, updating the array with the new outputs, but I don't want to do that. My code:

#! /usr/bin/python3
from sklearn import svm
from sklearn.linear_model import LinearRegression

features = [
 [0,1,1],
 [2,3,5],
 [8,13,21],
 [34,55,89],

 ]
labels = [2,8,34,144]

clf = LinearRegression()
clf.fit(features, labels)
test = [[144, 233, 377]]
print(clf.predict(test))

Any help?

shamilpython
  • You can append the output to the current input and feed it back to the model (in your case, you need to iterate 5 times). – Sociopath Aug 03 '18 at 05:19
  • That's what I said, I don't want to hard-code it. – shamilpython Aug 03 '18 at 06:33
  • As I'm sure you're aware, Linear Regression is not the best way to go about this. But, if you must, you will need to regress over a longer sequence than three numbers to get a relevant line. For example, in this article they regress over 100 numbers in the sequence: https://medium.com/@curiousily/predicting-the-next-fibonacci-number-with-linear-regression-in-tensorflow-js-f62a9230b133 Finally, you're importing svm (support vector machine) but not using it, at least in the code you have currently posted. So, you can remove that import. – davecom Aug 03 '18 at 06:35
  • sorry, forgot to remove that import. – shamilpython Aug 03 '18 at 07:02
  • But what would you recommend instead of Linear Regression? – shamilpython Aug 03 '18 at 07:03
  • Would you like to try sequence to sequence generation? https://machinelearningmastery.com/sequence-prediction/ – Ankita Mehta Aug 03 '18 at 08:59
  • hmm, looks fun. I'll check it – shamilpython Aug 03 '18 at 09:08

2 Answers

2

This might help you out; see the notes in the code

from sklearn.linear_model import LinearRegression

#define your inputs
features = [ [0,1,1],
             [2,3,5],
             [8,13,21],
             [34,55,89] ]

labels = [2,8,34,144]

# create your linear regression extrapolator
clf = LinearRegression()
clf.fit(features, labels)

# create a simple function to find the next number in the fibonacci sequence
def find_next(feat_list):
    # feat_list is a list holding one row of input numbers, e.g. [[144, 233, 377]]
    result = clf.predict(feat_list)
    # round instead of truncating: predictions are floats and may land just below the integer
    result = [int(round(x)) for x in result.tolist()]
    return result

# create one more function to iterate and add as many numbers to the sequence as you want
def find_next_numbers(feat_list, how_many):
    # feat_list is your input list of numbers
    # how_many is the number of numbers you want to append
    window = [list(feat_list[0])]  # copy so the caller's list is not modified
    result = []
    for i in range(how_many):
        nextnum = find_next(window)
        result = result + nextnum
        # remove the oldest number and add the number you just found
        # before you iterate again using this new list as input
        window[0] = window[0][1:] + nextnum
    return result


# test it
test = [[144, 233, 377]]    
print(find_next_numbers(test, 5))
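
The feed-back loop above does not depend on scikit-learn; it works with any function that predicts one value from a window. A minimal sketch of that idea, where `roll_forecast` and `fib_rule` are hypothetical names introduced here and the stand-in predictor simply applies the exact Fibonacci rule instead of a trained model:

```python
from typing import Callable, List, Sequence


def roll_forecast(predict_one: Callable[[Sequence[float]], float],
                  window: Sequence[float],
                  steps: int) -> List[float]:
    """Predict `steps` values, feeding each prediction back into the window."""
    current = list(window)  # copy so the caller's list is not modified
    out = []
    for _ in range(steps):
        nxt = predict_one(current)
        out.append(nxt)
        current = current[1:] + [nxt]  # drop the oldest value, append the newest
    return out


def fib_rule(w):
    # Stand-in predictor (assumption): the exact rule, not a fitted model
    return w[-2] + w[-1]


forecast = roll_forecast(fib_rule, [0, 1, 1], 5)
print(forecast)  # [2, 3, 5, 8, 13]
```

With the fitted `clf` from this answer, the predictor could be something like `lambda w: round(clf.predict([w])[0])`.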
jberrio
  • But the problem is, as I mentioned, I don't want to hard-code it. I would like a solution that can predict the next five numbers _without_ coding it explicitly. – shamilpython Aug 03 '18 at 07:05
  • That is a perfect solution. Do you have any problem with it? – jberrio Aug 03 '18 at 07:34
2

If you want multiple outputs from your model, you have to train it that way. This then becomes a multi-output problem, where you give 3 features and want to predict 5 outputs.

Look at my answer here for some description about this.

Currently you are training it to predict a single value, so the model will always predict a single value. Train the model with multiple values per label instead.

Something like this:

from sklearn.linear_model import LinearRegression

# Three features per row
features = [[0,   1,  1],
            [2,   3,  5],
            [8,  13, 21],
            [34, 55, 89]]

# This changed.
# Now a single label consists of a list of output values to be predicted
# 5 outputs per row
labels = [[2,     3,   5,   8,  13], 
          [8,    13,  21,  34,  55], 
          [34,   55,  89, 144, 233], 
          [144, 233, 377, 610, 987]]

clf = LinearRegression()
clf.fit(features, labels)
test = [[144, 233, 377]]
print(clf.predict(test))

# Output
# array([[ 610.,  987., 1597., 2584., 4181.]])

But note that, as I mentioned in my linked answer, not all scikit-learn estimators are capable of predicting multiple outputs.
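
For an estimator without native multi-output support, one option is scikit-learn's `MultiOutputRegressor` wrapper, which fits one copy of the base estimator per output column. A minimal sketch, assuming `SVR` with a linear kernel as the base estimator (chosen here only for illustration, not taken from this answer); its predictions are approximate and should not be expected to match the Fibonacci numbers exactly:

```python
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

features = [[0, 1, 1], [2, 3, 5], [8, 13, 21], [34, 55, 89]]
labels = [[2, 3, 5, 8, 13],
          [8, 13, 21, 34, 55],
          [34, 55, 89, 144, 233],
          [144, 233, 377, 610, 987]]

# SVR predicts a single target, so wrap it: one SVR is fitted per output column
model = MultiOutputRegressor(SVR(kernel="linear"))
model.fit(features, labels)

prediction = model.predict([[144, 233, 377]])
print(prediction.shape)  # one row holding five predicted values
```

The fitted per-column models are available afterwards as `model.estimators_`.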

Vivek Kumar