2

I have a simple table called LOGENTRY with fields called "DATE" and "COST". Example:

+--------------+-------+
| DATE         | COST  |
+--------------+-------+
| MAY 1 2013   | 0.8   |
| SEP 1 2013   | 0.4   |
| NOV 1 2013   | 0.6   |
| DEC 1 2013   | 0.2   |
+--------------+-------+

I would like to find the slope of the COST field over time (a range of rows selected), resulting in SLOPE=-0.00216 (This is equivalent to Excel's SLOPE function, aka linear regression).

Is there a simple way to SELECT the slope of COST? If I do the math in the calling language (php) I can find slope as:

SLOPE =  (N * Sum_XY - Sum_X * Sum_Y)/(N * Sum_X2 - Sum_X * Sum_X);

I saw some similar questions posted but they are more complex. I'm trying to strip this example down to the simplest situation - so I can understand the answer :) Here's as close as I got...but MYSQL complains about the syntax near: 'float)) AS Sum_X, SUM(CAST(LOGENTRY.DATE as float) * CAST(LOGENTRY.DATE'

SELECT 
  COUNT( * ) AS N, 
  SUM( CAST( LOGENTRY.DATE AS FLOAT ) ) AS Sum_X, 
  SUM( CAST( LOGENTRY.DATE AS FLOAT ) * CAST( LOGENTRY.DATE AS FLOAT ) ) AS Sum_X2, 
  SUM( LOGENTRY.COST ) AS Sum_Y, SUM( LOGENTRY.COST * LOGENTRY.COST ) AS Sum_Y2, 
  SUM( CAST( LOGENTRY.DATE AS FLOAT ) * LOGENTRY.COST ) AS Sum_XY
FROM LOGENTRY
TSG
  • 4,242
  • 9
  • 61
  • 121
  • with this information no one can answer. When ever you post a question please post sample data , structure and desired output – Muhammad Raheel Feb 15 '13 at 17:13
  • Possible duplicate: http://stackoverflow.com/questions/9084761/how-to-calculate-the-slope-in-sql – sgeddes Feb 15 '13 at 17:16
  • I saw posting 9084761 but it is more complicated (additional field KEYWORD). I'm trying to keep it simple so I and others can understand it better. – TSG Feb 15 '13 at 17:23
  • Do you desperately need it done within MySQL? Most likely a lot easier in whatever is receiving the data. – Orbling Feb 15 '13 at 17:30
  • Data is being returned to PHP. I *thought* it was always more efficient to do it in the SQL – TSG Feb 15 '13 at 17:30
  • Well, it is if the data quantity is huge, otherwise perhaps not. – Orbling Feb 15 '13 at 17:31
  • Possible duplicate: http://stackoverflow.com/questions/4400440/linear-regression-finding-slope-in-mysql – Orbling Feb 15 '13 at 17:32
  • Put parentheses around the sub-select in the FROM clause, and a comma after before it (to separate the tables, or use a JOIN). – Orbling Feb 15 '13 at 17:34
  • Take a look at this fiddle: http://sqlfiddle.com/#!2/601fa5/9 -- the slope result shows as 0 I imagine due to mantissa underflow, needs to be in a double. Hence why I added the numerator and denominator to the output. Though no quicker than doing what you have done in your answer. – Orbling Feb 15 '13 at 18:02

1 Answers1

4

It seems that MySQL cannot cast a date as float (as per the other examples in stackoverflow). Perhaps the other examples refer to another database. So by converting dates to unix_timestamps I am able to get an answer...with the final calculation in PHP. If this is WRONG...please post and I will remove answer...

SELECT
        COUNT(*) AS N,
        SUM(UNIX_TIMESTAMP(LOGENTRY.DATE)) AS Sum_X,
        SUM(UNIX_TIMESTAMP(LOGENTRY.DATE) * UNIX_TIMESTAMP(LOGENTRY.DATE)) AS Sum_X2,
        SUM(LOGENTRY.COST) AS Sum_Y,
        SUM(LOGENTRY.COST*LOGENTRY.COST) AS Sum_Y2,
        SUM(UNIX_TIMESTAMP(LOGENTRY.DATE) * LOGENTRY.COST) AS Sum_XY
    FROM LOGENTRY
TSG
  • 4,242
  • 9
  • 61
  • 121
  • That's not too bad, though check your data to make sure you're getting a good answer out of it. – Orbling Feb 15 '13 at 18:03