
My input values are 1, 2, 3, 4, ... and my output values are 1*1, 2*2, 3*3, 4*4, ... My code looks like this:

require_once __DIR__ . '/vendor/autoload.php';
use Phpml\Regression\LeastSquares;

$reg = new LeastSquares();

$samples = array();
$targets = array();
for ($i = 1; $i < 100; $i++)
{  
  $samples[] = [$i];
  $targets[] = $i*$i;
}

$reg->train($samples, $targets);
  
echo $reg->predict([5])."\n";
echo $reg->predict([10])."\n";

I expect it to output roughly 25 and 100. But I get:

-1183.3333333333
-683.33333333333

I also tried to use SVR instead of LeastSquares but the values are strange too:

2498.23
2498.23

I am new to ML. What am I doing wrong?

Olivier
zomega
  • [`LeastSquares`](https://php-ml.readthedocs.io/en/latest/machine-learning/regression/least-squares/) is for **linear** regression. How do you expect it to work here? – Olivier Jan 09 '23 at 08:29
  • @Olivier I also tried [SVR](https://php-ml.readthedocs.io/en/latest/machine-learning/regression/svr/). How to fix it? – zomega Jan 09 '23 at 11:31
  • Do you know what SVR is? – Olivier Jan 09 '23 at 19:31
  • @Olivier Not really. I thought AI is like a black box which is a neural network. You give it inputs and outputs and in this training process it learns. That is what I wanted to do. But now it seems I have to choose the right model based on my input data. That is new to me because I thought the neural network does everything by itself. – zomega Jan 09 '23 at 19:51
  • Least squares doesn't use a neural network. I'm not familiar with SVR but I highly doubt it uses a neural network. – Olivier Jan 09 '23 at 20:45
  • And even for techniques based on neural networks: there are many different types of NN, and many possible configurations (number of neurons, number of layers, transfer function...). You need knowledge to design a good NN. It's not magical. – Olivier Jan 09 '23 at 20:52
  • I think `PolynomialRegression` or `Multilayer Perceptron` is more suitable for this case – executable Jan 11 '23 at 11:25

1 Answer


As others have pointed out in the comments, LeastSquares fits a linear model to your data (the training examples).

Your data set (target = sample^2) is inherently non-linear. If you picture the best possible line (in the least-squares sense, i.e. minimising the sum of squared residuals) through a quadratic curve, that line crosses the y-axis below zero, so it has a negative y-intercept (a sketch of this below):

[Image: sketch of a straight line fitted to a quadratic curve, with a negative y-intercept]

You've trained your linear model on data up to x=99, y=9801, so the fitted line is steep and its y-intercept is large and negative. Down at x=5 or x=10 the line is still far below zero, which is the large negative value you found.
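To make this concrete, here is a minimal sketch that fits the same line by hand with the textbook closed-form formulas for simple linear regression (slope = covariance of x and y divided by variance of x). It does not use php-ml, and `fitLine()` is a hypothetical helper written only for this illustration. It assumes PHP 7.4 or later for the arrow function.

```php
<?php
// Closed-form ordinary least squares for a single input variable.
// fitLine() is a hypothetical helper, not part of php-ml.
function fitLine(array $xs, array $ys): array
{
    $n = count($xs);
    $meanX = array_sum($xs) / $n;
    $meanY = array_sum($ys) / $n;

    $sxy = 0.0; // sum of (x - meanX) * (y - meanY)
    $sxx = 0.0; // sum of (x - meanX)^2
    for ($k = 0; $k < $n; $k++) {
        $sxy += ($xs[$k] - $meanX) * ($ys[$k] - $meanY);
        $sxx += ($xs[$k] - $meanX) ** 2;
    }

    $slope = $sxy / $sxx;
    $intercept = $meanY - $slope * $meanX;

    return [$slope, $intercept];
}

// Same data as in the question: x = 1..99, y = x^2.
$xs = range(1, 99);
$ys = array_map(fn ($x) => $x * $x, $xs);

[$slope, $intercept] = fitLine($xs, $ys);

printf("slope = %.2f, intercept = %.2f\n", $slope, $intercept);
printf("f(5)  = %.2f\n", $slope * 5 + $intercept);
printf("f(10) = %.2f\n", $slope * 10 + $intercept);
```

The fitted line comes out as roughly y = 100x - 1683.33, which gives about -1183.33 at x=5 and -683.33 at x=10, the same values as in the question. For inputs 1..n with targets x^2, the best-fit slope works out to n+1 and the intercept to -(n+1)(n+2)/6, so the intercept gets more negative the more data you train on.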

If you use support vector regression with a degree-2 polynomial kernel, it does a good job of capturing the pattern in your data:

<?php
require_once __DIR__ . '/vendor/autoload.php';
use Phpml\Regression\SVR;
use Phpml\SupportVectorMachine\Kernel;

$samples = array();
$targets = array();
for ($i = 1; $i <= 100; $i++)
{  
  $samples[] = [$i];
  $targets[] = $i*$i;
}

// Polynomial kernel; the second constructor argument is the polynomial degree.
$reg = new SVR(Kernel::POLYNOMIAL, 2);
$reg->train($samples, $targets);

echo $reg->predict([5])."\n";
echo $reg->predict([10])."\n";
?>

Returns:

25.0995
100.098

From your response in the comments it's clear that you're looking to apply a neural network so that you don't have to worry about what degree of model to fit to your data. In principle, a neural network with a single hidden layer can approximate any continuous function on a bounded range arbitrarily well, given enough hidden nodes and enough training data.

Unfortunately php-ml doesn't seem to have an MLP (multilayer perceptron, a common type of neural network) for regression available out of the box. I'm sure you could build one from appropriate layers, but if your goal is to get up and running with training regression models quickly, it might not be the best approach.

kabdulla