0

I have 2 csv datasets with 1 column 'v'. The data in these 2 files are exactly same. The column 'v' contains values that would be converted to decimal? before performing stdev.

sum(v), avg(v) values are same in both the datasets, but stdev values are not matching. How is it even possible ?

Here is the code,

@ds1 =
EXTRACT v decimal?            
FROM @ds1_path
USING Extractors.Csv(skipFirstNRows : 1);

@ds2 = EXTRACT v decimal?            
FROM @ds2_path
USING Extractors.Csv(skipFirstNRows : 1);

@data =
SELECT STDEV(v) AS stdev,
       SUM(v) AS sum,
       AVG(v) AS avg,
       VAR(v) AS vari,
       "ds1" AS type
FROM @ds1
UNION ALL
SELECT STDEV(v) AS stdev,
       SUM(v) AS sum,
       AVG(v) AS avg,
       VAR(v) AS vari,
       "ds2" AS type
FROM @ds2;

This gives the following output. If you notice sum, avg values are exactly same, but VAR and STDEV values are not matching.

Can somebody please help ?

output

  • can you try to see whether you get same stdev for same CSV file in the both parts of UNION ALL. I think the ds1, ds2 are different data sets – Venkataraman R Mar 30 '20 at 04:22
  • Venkataraman R - the ds1 and ds2 are identical datasets, both are having same number of records and same values. I expecting to see matching stdev from both the datsets (just like sum and average ) .. Please refer the output attached. – Pasupathy Devaraj Mar 30 '20 at 05:17

0 Answers0