The following warning appears when I run my code:
PHP Warning: A non-numeric value encountered in /php-ai/php-ml/src/Regression/LeastSquares.php
What I am trying to do is scrape some prices for a product and use a regression approach to make a prediction. This is a first attempt to test the idea; I will refactor the function afterwards.
I cannot work out exactly what causes this warning.
Maybe I made a mistake somewhere.
My function:
use Phpml\Dataset\ArrayDataset;
use Phpml\Regression\LeastSquares;
use Phpml\CrossValidation\RandomSplit;
use Phpml\Metric\Accuracy;
use Phpml\Math\Average;
use Phpml\Math\StandardDeviation;
use Phpml\NeuralNetwork\MultilayerPerceptron;

function predict_price($data, $data_type) {
    try {
        // Validate input parameters
        if (!is_array($data)) {
            throw new InvalidArgumentException('Data parameter must be an array');
        }
        if (!in_array($data_type, ['csv', 'array', 'database'])) {
            throw new InvalidArgumentException('Invalid data type: ' . $data_type . '. Allowed data types: csv, array, database');
        }

        // create a new linear regression model
        $regression = new LeastSquares();

        // load the dataset
        if ($data_type === 'array') {
            if (!isset($data['samples']) || !is_array($data['samples'])) {
                throw new InvalidArgumentException('Data array parameter must contain a "samples" array');
            }
            if (!isset($data['labels']) || !is_array($data['labels'])) {
                throw new InvalidArgumentException('Data array parameter must contain a "labels" array');
            }
            $samples = array_map(function($sample) {
                if (!is_array($sample) || count($sample) != 2) {
                    throw new InvalidArgumentException('Each sample in the "samples" array must be an array with 2 values');
                }
                $timestamp = strtotime($sample[0]);
                if ($timestamp === false) {
                    throw new InvalidArgumentException('Invalid timestamp: ' . $sample[0] . '. Timestamp should be in the format yyyy-mm-dd');
                }
                return [$timestamp, $sample[1]];
            }, $data['samples']);
            $labels = $data['labels'];
            $dataset = new ArrayDataset($samples, $labels);
        }

        // split the dataset into training and testing sets
        $randomSplit = new RandomSplit($dataset, 0.3);
        $trainingSamples = $randomSplit->getTrainSamples();
        $trainingLabels = $randomSplit->getTrainLabels();
        $testingSamples = $randomSplit->getTestSamples();
        $testingLabels = $randomSplit->getTestLabels();

        // train the model on the training set
        $regression->train($trainingSamples, $trainingLabels);

        // make predictions on the testing set
        $predictions = $regression->predict($testingSamples);

        // evaluate the performance of the model
        $metric = new Accuracy();
        $accuracy = $metric->score($testingLabels, $predictions);

        // calculate the confidence of the prediction
        $confidence = $regression->getCoefficients()[0];

        // return the accuracy of the model, the prediction for a new data point, and the confidence of the prediction
        $newSample = [date('yyyy-mm-dd'), 1];
        $newPrediction = $regression->predict([$newSample]);
        return array('accuracy' => $accuracy, 'prediction' => $newPrediction[0], 'confidence' => $confidence);
    } catch (InvalidArgumentException $e) {
        // handle input validation errors
        return array('error' => 'Invalid input: ' . $e->getMessage());
    } catch (Exception $e) {
        // handle other exceptions and return an error message
        return array('error' => 'Unexpected error: ' . $e->getMessage());
    }
}
Below is the example I run:
$historical_prices = array(
    array('2022-01-01', 10),
    array('2022-01-02', 11),
    array('2022-01-03', 12),
    array('2022-01-04', 13),
    array('2022-01-05', 14),
    array('2022-01-06', 15),
    array('2022-01-07', 16),
    array('2022-01-08', 17),
    array('2022-01-09', 18),
    array('2022-01-10', 19),
);

$data = array(
    'samples' => $historical_prices,
    'labels' => array(11, 12, 13, 14, 15, 16, 17, 18, 19, 20),
);

$result = predict_price($data, 'array');
echo 'Accuracy: ' . $result['accuracy'] . "<br>";
echo 'Predicted price: ' . $result['prediction'] . "<br>";
echo 'Confidence: ' . $result['confidence'] . "<br>";
echo '-------<br>';
Result:
Accuracy: 0
Predicted price: 2.9258048044576
Confidence: 1.1612428352237E-8
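To narrow down where the non-numeric value comes from, one option might be to check every value before it reaches train() or predict(). The sketch below is only an illustration under assumptions: find_non_numeric() is a hypothetical helper (it is not part of php-ml), and it assumes the samples are a plain array of rows like the ones built in predict_price().

```php
<?php
// Hypothetical helper (not part of php-ml): returns one [row, column, value]
// entry for every value that is_numeric() rejects.
function find_non_numeric(array $rows): array
{
    $bad = [];
    foreach ($rows as $i => $row) {
        // Cast so that a scalar row is treated as a one-element row
        foreach ((array) $row as $j => $value) {
            if (!is_numeric($value)) {
                $bad[] = [$i, $j, $value];
            }
        }
    }
    return $bad;
}

// A sample built the same way as in the array_map callback
$trainingStyle = [[strtotime('2022-01-01'), 10]];

// A sample built the same way as $newSample
$newStyle = [[date('yyyy-mm-dd'), 1]];

var_dump(find_non_numeric($trainingStyle));
var_dump(find_non_numeric($newStyle));
```

If either call reports an entry, that value would probably be the first thing to look at, since LeastSquares does arithmetic on every feature it receives.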