I am implementing a linear regression algorithm to generate market price predictions using some price values that the program is reading from a csv file. However, the output is showing -nan(ind) errors for some of the generated values.

B0 represents the value of Y when X=0 and B1 is the regression coefficient (the change in the dependent variable per unit change in the independent variable). The error value is the error in the predicted value of B0 and B1.

The LinearRegression class takes in vectors created by another class from a csv file and assigns the price values from the vectors to the X and Y arrays.

Product: BTC/USDT
Final Values are: B0=-nan(ind) B1=-nan(ind) error=-5352

Product: DOGE/BTC
Final Values are: B0=2.85961e-07 B1=9.47116e-14 error=-2.63192e-08

Product: DOGE/USDT
Final Values are: B0=0.00144125 B1=2.39817e-06 error=-0.00022323

Product: ETH/BTC
Final Values are: B0=0.0189528 B1=0.000414906 error=-0.00296171

Product: ETH/USDT
Final Values are: B0=-nan(ind) B1=-nan(ind) error=-117.329

LinearRegression.cpp

#include "LinearRegression.h"
#include <iostream>
#include <algorithm>
#include <cmath> // std::abs overloads for double
#include <vector>

/*sorts by absolute value of the error, smallest first*/
bool LinearRegression::custom_sort(double a, double b) 
{
    double ax = std::abs(a-0);
    double bx = std::abs(b-0);
    return ax < bx;
}

void LinearRegression::gradientDescent(std::vector<OrderBookEntry>& ordersX, std::vector<OrderBookEntry>& ordersY)
{
    /*Initialization Phase*/
    double err; // for calculating error on each stage
    double b0 = 0; // represents the value of y when x = 0
    double b1 = 0; // represents the change in y based on the unit change in x
    double alpha = 0.01; // learning rate
    std::vector<double>error; // array to store all error values

    /*Training Phase*/
    for (int i = 0; i < 200; i++) // loop 200 times since there are 50 values and I want 4 epochs
    {
        int index = i % 50; // for accessing index after every epoch
        double p = b0 + b1 * ordersX[index].price; // calculate prediction
        err = p - ordersY[index].price; // calculate error
        b0 = b0 - alpha * err;
        b1 = b1 - alpha * err * ordersX[index].price;
        error.push_back(err);
    }

    std::sort(error.begin(), error.end(), &LinearRegression::custom_sort); // sorting based on error values
    std::cout << "Final Values are: " << "B0=" << b0 << " " << "B1=" << b1 << " " << "error=" << error[0] << std::endl;
    std::cout << std::endl;
}
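
The sort in `gradientDescent` is only used to report the error closest to zero, and its comparator is handed NaN values once the parameters stop being finite. A minimal sketch of an alternative that skips non-finite values and needs no sort is given here. `smallestAbsError` is a hypothetical helper, not part of the original class, and it only affects what is printed; it does nothing about the NaN values in B0 and B1 themselves.

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical helper (not in the original code): returns the finite error
// with the smallest magnitude, or NaN when no finite value is present.
double smallestAbsError(const std::vector<double>& errors)
{
    double best = std::numeric_limits<double>::quiet_NaN();
    for (double e : errors)
    {
        if (!std::isfinite(e))
            continue; // skip NaN and +/-inf so they cannot affect the result
        if (std::isnan(best) || std::abs(e) < std::abs(best))
            best = e;
    }
    return best;
}
```

Under that assumption, the final print could use `smallestAbsError(error)` in place of `error[0]`.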

The values appear to overflow.

Product: BTC/USDT
B0=53.52 B1=286918 error=-5352
B0=-1.53829e+07 B1=-8.24746e+10 error=1.53829e+09
B0=4.42182e+12 B1=2.37074e+16 error=-4.42183e+14
B0=-1.27144e+18 B1=-6.8188e+21 error=1.27144e+20
B0=3.65976e+23 B1=1.96425e+27 error=-3.65977e+25
B0=-1.05437e+29 B1=-5.65961e+32 error=1.05437e+31
B0=3.03864e+34 B1=1.63144e+38 error=-3.03865e+36
B0=-8.75964e+39 B1=-4.70329e+43 error=8.75967e+41
B0=2.52572e+45 B1=1.35635e+49 error=-2.52573e+47
B0=-7.28392e+50 B1=-3.91165e+54 error=7.28395e+52
B0=2.10128e+56 B1=1.12879e+60 error=-2.10129e+58
B0=-6.0662e+61 B1=-3.26004e+65 error=6.06622e+63
B0=1.75227e+67 B1=9.41843e+70 error=-1.75227e+69
B0=-5.06252e+72 B1=-2.72117e+76 error=5.06254e+74
B0=1.4633e+78 B1=7.86881e+81 error=-1.4633e+80
B0=-4.23258e+83 B1=-2.27668e+87 error=4.23259e+85
B0=1.22524e+89 B1=6.59392e+92 error=-1.22525e+91
B0=-3.54875e+94 B1=-1.90989e+98 error=3.54876e+96
B0=1.02847e+100 B1=5.53833e+103 error=-1.02848e+102
B0=-2.98238e+105 B1=-1.60601e+109 error=2.98239e+107
B0=8.64847e+110 B1=4.65727e+114 error=-8.6485e+112
B0=-2.50817e+116 B1=-1.35078e+120 error=2.50818e+118
B0=7.27685e+121 B1=3.92018e+125 error=-7.27688e+123
B0=-2.11186e+127 B1=-1.1377e+131 error=2.11187e+129
B0=6.13003e+132 B1=3.30292e+136 error=-6.13005e+134
B0=-1.78027e+138 B1=-9.59566e+141 error=1.78028e+140
B0=5.17428e+143 B1=2.79014e+147 error=-5.17429e+145
B0=-1.5051e+149 B1=-8.11909e+152 error=1.50511e+151
B0=4.38024e+154 B1=2.36314e+158 error=-4.38025e+156
B0=-1.27505e+160 B1=-6.87965e+163 error=1.27505e+162
B0=3.715e+165 B1=2.0061e+169 error=-3.71501e+167
B0=-1.08336e+171 B1=-5.85049e+174 error=1.08336e+173
B0=3.16242e+176 B1=1.70942e+180 error=-3.16243e+178
B0=-9.24162e+181 B1=-4.9963e+185 error=9.24166e+183
B0=2.70125e+187 B1=1.46044e+191 error=-2.70126e+189
B0=-7.89927e+192 B1=-4.2726e+196 error=7.89929e+194
B0=2.31098e+198 B1=1.24997e+202 error=-2.31098e+200
B0=-6.76224e+203 B1=-3.65832e+207 error=6.76226e+205
B0=1.97988e+209 B1=1.07151e+213 error=-1.97988e+211
B0=-5.79898e+214 B1=-3.13841e+218 error=5.799e+216
B0=1.69851e+220 B1=9.19241e+223 error=-1.69852e+222
B0=-4.97593e+225 B1=-2.69352e+229 error=4.97595e+227
B0=1.45827e+231 B1=7.89507e+234 error=-1.45827e+233
B0=-4.27794e+236 B1=-2.31801e+240 error=4.27796e+238
B0=1.25859e+242 B1=6.83369e+245 error=-1.2586e+244
B0=-3.71086e+247 B1=-2.0151e+251 error=3.71088e+249
B0=1.09502e+253 B1=5.95039e+256 error=-1.09502e+255
B0=-3.23726e+258 B1=-1.76121e+262 error=3.23727e+260
B0=9.60347e+263 B1=5.23657e+267 error=-9.6035e+265
B0=-2.85728e+269 B1=-1.55905e+273 error=2.85729e+271
B0=8.35796e+274 B1=4.48066e+278 error=-8.35799e+276
B0=-2.40228e+280 B1=-1.28797e+284 error=2.40229e+282
B0=6.90536e+285 B1=3.70228e+289 error=-6.90539e+287
B0=-1.98555e+291 B1=-1.06486e+295 error=1.98556e+293
B0=5.71528e+296 B1=3.06749e+300 error=-5.7153e+298
B0=-1.64656e+302 B1=-8.83838e+305 error=1.64656e+304
B0=inf B1=inf error=-inf
B0=-nan(ind) B1=-nan(ind) error=inf
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
Product: ETH/USDT
B0=1.17329 B1=137.662 error=-117.329
B0=-159.298 B1=-18703.8 error=16047.1
B0=21805 B1=2.56029e+06 error=-2.19643e+06
B0=-2.98522e+06 B1=-3.50584e+08 error=3.00702e+08
B0=4.08941e+08 B1=4.8046e+10 error=-4.11926e+10
B0=-5.60438e+10 B1=-6.5845e+12 error=5.64527e+12
B0=7.68169e+12 B1=9.02644e+14 error=-7.73773e+14
B0=-1.05514e+15 B1=-1.24231e+17 error=1.06282e+17
B0=1.45221e+17 B1=1.70978e+19 error=-1.46276e+19
B0=-1.99914e+19 B1=-2.35429e+21 error=2.01366e+21
B0=2.75449e+21 B1=3.2459e+23 error=-2.77448e+23
B0=-3.80181e+23 B1=-4.4849e+25 error=3.82935e+25
B0=5.25454e+25 B1=6.20037e+27 error=-5.29256e+27
B0=-7.26757e+27 B1=-8.57946e+29 error=7.32012e+29
B0=1.01069e+30 B1=1.19916e+32 error=-1.01796e+32
B0=-1.41271e+32 B1=-1.67607e+34 error=1.42281e+34
B0=1.97455e+34 B1=2.34266e+36 error=-1.98868e+36
B0=-2.76135e+36 B1=-3.27792e+38 error=2.7811e+38
B0=3.8645e+38 B1=4.58828e+40 error=-3.89212e+40
B0=-5.4218e+40 B1=-6.45205e+42 error=5.46045e+42
B0=7.64965e+42 B1=9.13339e+44 error=-7.70386e+44
B0=-1.08302e+45 B1=-1.2932e+47 error=1.09066e+47
B0=1.53631e+47 B1=1.83788e+49 error=-1.54714e+49
B0=-2.1836e+49 B1=-2.61243e+51 error=2.19897e+51
B0=3.10432e+51 B1=3.71452e+53 error=-3.12616e+53
B0=-4.41571e+53 B1=-5.2858e+55 error=4.44675e+55
B0=6.29338e+55 B1=7.54515e+57 error=-6.33754e+57
B0=-8.98433e+57 B1=-1.07722e+60 error=9.04726e+59
B0=1.28377e+60 B1=1.54054e+62 error=-1.29276e+62
B0=-1.83724e+62 B1=-2.20625e+64 error=1.85008e+64
B0=2.63373e+64 B1=3.16576e+66 error=-2.6521e+66
B0=-3.78114e+66 B1=-4.54731e+68 error=3.80748e+68
B0=5.44207e+68 B1=6.55779e+70 error=-5.47988e+70
B0=-7.85387e+70 B1=-9.47068e+72 error=7.90829e+72
B0=1.13452e+73 B1=1.36839e+75 error=-1.14237e+75
B0=-1.63956e+75 B1=-1.97795e+77 error=1.65091e+77
B0=2.37008e+77 B1=2.85942e+79 error=-2.38648e+79
B0=-3.42728e+79 B1=-4.13605e+81 error=3.45098e+81
B0=4.96088e+81 B1=5.99092e+83 error=-4.99515e+83
B0=-7.19026e+83 B1=-8.68868e+85 error=7.23987e+85
B0=1.04287e+86 B1=1.26026e+88 error=-1.05006e+88
B0=-1.51315e+88 B1=-1.82918e+90 error=1.52358e+90
B0=2.1965e+90 B1=2.65557e+92 error=-2.21163e+92
B0=-3.18977e+92 B1=-3.85755e+94 error=3.21173e+94
B0=4.63367e+94 B1=5.60388e+96 error=-4.66557e+96
B0=-6.73142e+96 B1=-8.14093e+98 error=6.77776e+98
B0=9.78307e+98 B1=1.18366e+101 error=-9.85038e+100
B0=-1.42254e+101 B1=-1.72127e+103 error=1.43232e+103
B0=2.06894e+103 B1=2.50376e+105 error=-2.08316e+105
B0=-3.01155e+105 B1=-3.64699e+107 error=3.03224e+107
B0=4.24917e+107 B1=4.98438e+109 error=-4.27928e+109
B0=-5.81026e+109 B1=-6.82206e+111 error=5.85275e+111
B0=7.95277e+111 B1=9.33795e+113 error=-8.01088e+113
B0=-1.08878e+114 B1=-1.27866e+116 error=1.09673e+116
B0=1.4915e+116 B1=1.75235e+118 error=-1.50239e+118
B0=-2.04404e+118 B1=-2.40152e+120 error=2.05896e+120
B0=2.80168e+120 B1=3.29215e+122 error=-2.82212e+122
B0=-3.84833e+122 B1=-4.53097e+124 error=3.87635e+124
B0=5.29652e+124 B1=6.23596e+126 error=-5.335e+126
B0=-7.29132e+126 B1=-8.58662e+128 error=7.34428e+128
B0=1.00462e+129 B1=1.18385e+131 error=-1.01192e+131
B0=-1.3866e+131 B1=-1.63574e+133 error=1.39665e+133
B0=1.91645e+133 B1=2.26142e+135 error=-1.93032e+135
B0=-2.65065e+135 B1=-3.12912e+137 error=2.66981e+137
B0=3.68623e+137 B1=4.3736e+139 error=-3.71274e+139
B0=-5.15246e+139 B1=-6.11302e+141 error=5.18932e+141
B0=7.20164e+141 B1=8.54422e+143 error=-7.25316e+143
B0=-1.00713e+144 B1=-1.19553e+146 error=1.01433e+146
B0=1.40947e+146 B1=1.67345e+148 error=-1.41954e+148
B0=-1.97745e+148 B1=-2.35321e+150 error=1.99155e+150
B0=2.79e+150 B1=3.33115e+152 error=-2.80977e+152
B0=-3.95e+152 B1=-4.71657e+154 error=3.9779e+154
B0=5.60326e+154 B1=6.70318e+156 error=-5.64276e+156
B0=-7.96409e+156 B1=-9.52811e+158 error=8.02013e+158
B0=1.13222e+159 B1=1.35477e+161 error=-1.14018e+161
B0=-1.61051e+161 B1=-1.92785e+163 error=1.62183e+163
B0=2.29534e+163 B1=2.75188e+165 error=-2.31144e+165
B0=-3.27679e+165 B1=-3.92887e+167 error=3.29974e+167
B0=4.6822e+167 B1=5.61868e+169 error=-4.71497e+169
B0=-6.70082e+169 B1=-8.0467e+171 error=6.74764e+171
B0=9.6058e+171 B1=1.15462e+174 error=-9.67281e+173
B0=-1.37907e+174 B1=-1.6585e+176 error=1.38867e+176
B0=1.98485e+176 B1=2.39177e+178 error=-1.99864e+178
B0=-2.86448e+178 B1=-3.45417e+180 error=2.88433e+180
B0=4.13784e+180 B1=4.99081e+182 error=-4.16649e+182
B0=-5.97986e+182 B1=-7.21402e+184 error=6.02124e+184
B0=8.64423e+184 B1=1.04289e+187 error=-8.70403e+186
B0=-1.25001e+187 B1=-1.50851e+189 error=1.25865e+189
B0=1.80934e+189 B1=2.18502e+191 error=-1.82184e+191
B0=-2.62245e+191 B1=-3.16896e+193 error=2.64054e+193
B0=3.80358e+193 B1=4.59646e+195 error=-3.8298e+195
B0=-5.5188e+195 B1=-6.67144e+197 error=5.55683e+197
B0=8.01114e+197 B1=9.68548e+199 error=-8.06633e+199
B0=-1.16338e+200 B1=-1.40693e+202 error=1.17139e+202
B0=1.69e+202 B1=2.04386e+204 error=-1.70164e+204
B0=-2.4551e+204 B1=-2.96918e+206 error=2.472e+206
B0=3.5681e+206 B1=4.31706e+208 error=-3.59266e+208
B0=-5.18832e+208 B1=-6.27787e+210 error=5.224e+210
B0=7.54589e+210 B1=9.13177e+212 error=-7.59777e+212
B0=-1.09838e+213 B1=-1.33014e+215 error=1.10593e+215
B0=1.54977e+215 B1=1.81792e+217 error=-1.56075e+217
B0=-2.11913e+217 B1=-2.48816e+219 error=2.13463e+219
B0=2.90056e+219 B1=3.40576e+221 error=-2.92175e+221
B0=-3.97101e+221 B1=-4.66356e+223 error=4.00001e+223
B0=5.43984e+223 B1=6.3912e+225 error=-5.47955e+225
B0=-7.45508e+225 B1=-8.75887e+227 error=7.50948e+227
B0=1.02184e+228 B1=1.20072e+230 error=-1.02929e+230
B0=-1.40357e+230 B1=-1.65255e+232 error=1.41379e+232
B0=1.93176e+232 B1=2.27439e+234 error=-1.9458e+234
B0=-2.65931e+234 B1=-3.13173e+236 error=2.67863e+236
B0=3.6641e+236 B1=4.31778e+238 error=-3.69069e+238
B0=-5.05726e+238 B1=-5.96592e+240 error=5.0939e+240
B0=6.98972e+240 B1=8.24789e+242 error=-7.0403e+242
B0=-9.6675e+242 B1=-1.14126e+245 error=9.7374e+244
B0=1.34445e+245 B1=1.59515e+247 error=-1.35412e+247
B0=-1.87922e+247 B1=-2.22955e+249 error=1.89266e+249
B0=2.6266e+249 B1=3.11627e+251 error=-2.64539e+251
B0=-3.67322e+251 B1=-4.36038e+253 error=3.69948e+253
B0=5.14066e+253 B1=6.10345e+255 error=-5.17739e+255
B0=-7.21221e+255 B1=-8.58267e+257 error=7.26362e+257
B0=1.01758e+258 B1=1.21495e+260 error=-1.02479e+260
B0=-1.44065e+260 B1=-1.72024e+262 error=1.45083e+262
B0=2.04363e+262 B1=2.4448e+264 error=-2.05804e+264
B0=-2.90468e+264 B1=-3.47512e+266 error=2.92512e+266
B0=4.12944e+266 B1=4.94115e+268 error=-4.15849e+268
B0=-5.87388e+268 B1=-7.0313e+270 error=5.91518e+270
B0=8.37161e+270 B1=1.00367e+273 error=-8.43035e+272
B0=-1.19512e+273 B1=-1.43295e+275 error=1.20349e+275
B0=1.7077e+275 B1=2.04926e+277 error=-1.71966e+277
B0=-2.44394e+277 B1=-2.93481e+279 error=2.46102e+279
B0=3.50345e+279 B1=4.21118e+281 error=-3.52789e+281
B0=-5.02977e+281 B1=-6.04894e+283 error=5.06481e+283
B0=7.23918e+283 B1=8.72333e+285 error=-7.28948e+285
B0=-1.04474e+286 B1=-1.25981e+288 error=1.05198e+288
B0=1.50916e+288 B1=1.82026e+290 error=-1.51961e+290
B0=-2.18099e+290 B1=-2.63111e+292 error=2.19608e+292
B0=3.15275e+292 B1=3.80367e+294 error=-3.17456e+294
B0=-4.55906e+294 B1=-5.50188e+296 error=4.59058e+296
B0=6.59908e+296 B1=7.96927e+298 error=-6.64467e+298
B0=-9.56466e+298 B1=-1.15579e+301 error=9.63065e+300
B0=1.38725e+301 B1=1.67643e+303 error=-1.39682e+303
B0=-2.01283e+303 B1=-2.43322e+305 error=2.0267e+305
B0=2.92184e+305 B1=3.53251e+307 error=-2.94197e+307
B0=-inf B1=-inf error=inf
B0=-nan(ind) B1=-nan(ind) error=-inf
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)
B0=-nan(ind) B1=-nan(ind) error=-nan(ind)

Sample data

ahmadalibin
  • maybe not the problem, i am not sure if `abs(a) < abs(b)` is a strict weak ordering. (if it isnt then it can explain the problem) – 463035818_is_not_an_ai Dec 09 '20 at 15:30
  • largest_prime_is_463035818 is correct, because NaN values mess it up. Need to check for a NaN and handle that case special. – Eljay Dec 09 '20 at 15:33
  • However, without NaN values, then `abs(a) < abs(b)` is a strict weak ordering effectively. And even with NaN, it seems to me that it is still a strict weak ordering, as Nan values are incomparable, even between them – Damien Dec 09 '20 at 15:37
  • What is `ordersX.size()` in your test case? Also, is `int index = i % 5;` supposed to only run through the first `5` elements of the size 50 arrays? – dxiv Dec 10 '20 at 03:41
  • @dxiv `ordersX.size()` is 50. `int index = i % 5;` is a mistake, it's supposed to be `int index = i % 50;`. I have edited it. – ahmadalibin Dec 10 '20 at 05:13
  • @ahmadalibin You are overflowing the `double` range because of the `err * x[index]` term which grows exponentially. You'll need to review the algorithm, or otherwise normalize the ranges beforehand. – dxiv Dec 10 '20 at 06:01
  • @dxiv Is this how I should normalize the range [link](https://stackoverflow.com/questions/10376600/normalizing-a-list-of-doubles-to-range-1-to-1-or-0-255)? Do I normalize the range of `err * x[index]` or another variable? – ahmadalibin Dec 10 '20 at 06:50
  • @ahmadalibin You would normalize one or both of the `orderX`, `orderY` vectors, calculate your estimates, then "reverse" the normalization at the end. The details of how to do that depend, however, on what the original ranges are, and what you are actually calculating since that's not obvious from the code. – dxiv Dec 10 '20 at 07:07
  • @dxiv Can I ask if this is the proper way to normalize vectors [link](https://wtools.io/paste-code/b2Sm)? Also, how do I reverse the normalization after I have calculated the estimates? – ahmadalibin Dec 10 '20 at 12:54
  • @ahmadalibin You didn't answer this part: "*the details of how to do that depend, however, on what the original ranges are, and what you are actually calculating since that's not obvious from the code*". Try the code with some simple small cases, for example a 2-point problem with `orderX = { 1, 1000 }` and `orderY = { 1, 1000 }`. It's hard to tell what the calculated `b0`, `b1` and `err` are supposed to be. – dxiv Dec 10 '20 at 17:56
  • @dxiv The values for the vectors are taken from here [cryptocurrency asking prices csv](https://wtools.io/paste-code/b2S8). `ordersX` contains the prices of the products from time stamp 17:01:24 while `ordersY` contains the prices of the products from time stamp 17:01:30. B0 represents the value of Y when X=0 and B1 is the Regression Coefficient (this represents the change in the dependent variable based on the unit change in the independent variable). The error value is the error in the predicted value of B0 and B1. – ahmadalibin Dec 11 '20 at 01:28
  • @ahmadalibin The code doesn't do what you describe. Did you try that 2-point example? You say you should be getting `b0 = 0`, `b1 = 1`, `err = 0` but the iteration gets nowhere near that. – dxiv Dec 11 '20 at 03:03
  • @dxiv I got the code from here [link](https://www.analyticsvidhya.com/blog/2020/04/machine-learning-using-c-linear-logistic-regression/). The code is supposed to create a linear regression slope that will be used to predict values. B0 would be the intercept point of the Y axis and B1 is the slope. I think the explanation I posted wasn't clear before. – ahmadalibin Dec 11 '20 at 03:43
  • @ahmadalibin Not all random code from the Internet is ready and fit for your purposes. In this case, you'll need to change/fix the algorithm so that it actually works, and only worry about the implementation later. – dxiv Dec 11 '20 at 05:00
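
The normalize, estimate, then reverse sequence suggested in the comments above can be sketched as follows. This is one possible reading, not the commenter's own code: it assumes z-score standardization (subtract the mean, divide by the population standard deviation) of both vectors, takes plain `std::vector<double>` inputs instead of `OrderBookEntry`, and uses the hypothetical names `Fit`, `meanAndStd` and `fitStandardized`. The learning rate and number of passes are parameters because the 200 steps at `alpha = 0.01` from the question are not necessarily enough to converge.

```cpp
#include <cmath>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

struct Fit { double b0; double b1; }; // intercept and slope in original units

// Mean and population standard deviation of a non-empty sample.
static void meanAndStd(const std::vector<double>& v, double& mean, double& sd)
{
    mean = std::accumulate(v.begin(), v.end(), 0.0) / v.size();
    double ss = 0.0;
    for (double a : v) ss += (a - mean) * (a - mean);
    sd = std::sqrt(ss / v.size());
}

// Runs the same per-sample update as the question, but on standardized data.
Fit fitStandardized(const std::vector<double>& x, const std::vector<double>& y,
                    double alpha, int epochs)
{
    if (x.empty() || x.size() != y.size())
        throw std::invalid_argument("x and y must be non-empty and equal in size");
    double mx, sx, my, sy;
    meanAndStd(x, mx, sx);
    meanAndStd(y, my, sy);
    if (sx == 0.0) throw std::invalid_argument("x is constant; slope undefined");
    if (sy == 0.0) sy = 1.0; // constant y: avoid dividing by zero

    double c0 = 0.0, c1 = 0.0; // coefficients in standardized units
    for (int e = 0; e < epochs; ++e)
        for (std::size_t i = 0; i < x.size(); ++i)
        {
            const double xs = (x[i] - mx) / sx;
            const double ys = (y[i] - my) / sy;
            const double err = c0 + c1 * xs - ys;
            c0 -= alpha * err;
            c1 -= alpha * err * xs;
        }

    // Reverse the normalization: y = my + sy * (c0 + c1 * (x - mx) / sx)
    Fit f;
    f.b1 = c1 * sy / sx;
    f.b0 = my + sy * c0 - f.b1 * mx;
    return f;
}
```

For the 2-point case from the comments (`{1, 1000}` for both vectors), calling `fitStandardized(x, y, 0.1, 200)` should approach `b0 = 0`, `b1 = 1`. Standardized inputs are of order 1, which keeps each step from overshooting for typical data, but this is not a guarantee for every dataset or learning rate.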

1 Answer

(Too long for a comment.)

OP's question is about fixing the implementation, but the actual problem is with the algorithm, or perhaps with some unstated assumptions that the algorithm makes.

As mentioned in a comment, the code is borrowed from Machine Learning using C++: A Beginner’s Guide to Linear and Logistic Regression. The claim that it models a linear regression is based on the following example.

For the scope of this tutorial, we’ll use this dataset:

x y
1 1
2 3
3 3
4 2
5 5
6 5

We’ll train our dataset on the first 5 values and test on the last value:

[ ... ]

We’ll enter the test value which is 6. The answer we get is 4.9753 which is quite close to 5. Congratulations! We just completed building a linear regression model with C++, and that too with good parameters.

A minor problem is that the posted code has a typo where double x[] = {1, 2, 4, 3, 5}; is mistakenly used instead of double x[] = {1, 2, 3, 4, 5};. Once the typo is corrected, the estimated value drops to 4.852, which is still close to 5, just not the same value as quoted.

The real problem, however, is that the algorithm does not perform nearly as well for other sample data. Suppose we take the same dataset as above, and just rescale the x axis by a factor of 10.

x y
10 1
20 3
30 3
40 2
50 5
60 5

One would expect the estimate for x = 60 to still be near y = 5. Instead, the iterations diverge, and the end result is:

The value predicted by the model= 5.74768e+09
dxiv
  • Ok thanks for your help but now I'm back to square one since I have no idea how to write code to generate predictions. Do you have any tips on how to do that? – ahmadalibin Dec 11 '20 at 09:20
  • @ahmadalibin [Simple linear regressions](https://en.wikipedia.org/wiki/Simple_linear_regression) is a well studied problem. You first need to decide on *your* goals and choose an algorithm, then - and only then - worry about the code. – dxiv Dec 11 '20 at 17:18