
I am trying to load a SQL query result into a new table after some transformation, but even the simplest script is failing.

   DECLARE @inquery nvarchar(max) = N'
   SELECT TOP 2000000 * FROM SQL Table';

   DECLARE @Rscript nvarchar(max) = N'
       sqlConnString <- "Driver={SQL Server};SERVER='+@@SERVERNAME+N';DATABASE='+DB_NAME()+N';Trusted_Connection=True;"
       outTabName <- "OutputTable"
       outTabDS <- RxSqlServerData(table = outTabName, connectionString = sqlConnString)
       rxDataStep(inData = InputDataSet, outFile = outTabDS, maxRowsByCols = NULL, rowsPerRead = 500000)
   ';
   EXEC sp_execute_external_script @language = N'R'
       , @script = @Rscript
       , @input_data_1 = @inquery
   WITH RESULT SETS NONE;

When I run this with 1M rows it works, but it fails to write with 2M rows. Since RevoScaleR functions process the data in chunks, why is a larger number of rows a problem? The same query returns results fine in SQL Server itself. The max memory percentage is also set to 50% of 32 GB RAM.

Ashish Tripathi
  • How big is the data you are trying to load? Check this link: https://support.microsoft.com/en-us/help/3171055/fix-cannot-use-r-functions-to-import-data-when-data-frame-is-larger-th — and why can't you allocate more than 32GB? You might not be allocating enough resources; try increasing the allocated memory to 40GB and re-run the SQL script. – CR241 Oct 18 '18 at 21:20
  • The issue should be something else; for 2M rows even 8GB of memory should be more than enough, and moreover RevoScaleR processes the data in chunks. When I check the size of the data, it is 1.2GB. And I have already set the max memory percent to 50. – Ashish Tripathi Oct 19 '18 at 00:08

1 Answer


RevoScaleR functions process the data in chunks when using data sources such as RxSqlServerData. In this case, if you use RxSqlServerData for the input data instead of a data frame, you should not hit the memory limits. Here is an example:

DECLARE @Rscript nvarchar(max) = N'
    sqlConnString <- "Driver={SQL Server};SERVER='+@@SERVERNAME+N';DATABASE='+DB_NAME()+N';Trusted_Connection=True;"
    inTabName <- "InputTable"
    outTabName <- "OutputTable"
    inTabDS <- RxSqlServerData(table = inTabName, connectionString = sqlConnString)
    outTabDS <- RxSqlServerData(table = outTabName, connectionString = sqlConnString)
    rxDataStep(inData = inTabDS, outFile = outTabDS, maxRowsByCols = NULL, rowsPerRead = 500000)
';
EXEC sp_execute_external_script @language = N'R'
    , @script = @Rscript
WITH RESULT SETS NONE;
Arun Gurunathan
  • Thank you Arun for the quick response. I had tried that way as well before: I was able to read all the data without error, but it failed when writing into a table. For example, with the same code you have written, I get the error: [Microsoft][ODBC SQL Server Driver]String data, right truncation. But when I take the outFile out of rxDataStep, I get the message: Rows Read: 10, Total Rows Processed: 10, Total Chunk Time: 0.069 seconds — so I know the read worked. (It also worked for the entire 2M rows; here, for testing, I was using only 10 rows and even that was failing.) – Ashish Tripathi Oct 18 '18 at 23:48
  • With the error message [Microsoft][ODBC SQL Server Driver]String data, right truncation: when I check the database, the new output table gets created with only the header and no data. – Ashish Tripathi Oct 18 '18 at 23:56
  • Based on the error, the input query length may be an issue (due to the number of columns?). Can you troubleshoot by using the sqlQuery parameter instead of table for RxSqlServerData? – Arun Gurunathan Oct 22 '18 at 17:43
  • I have used the sqlQuery parameter, not table, for inTabDS, as I am taking only part of the data from a table (select * from table where year = 2015; this has around 120 columns with 2M rows). It looks huge, but the size is 1GB and it should at least not fail — it may just take more time. – Ashish Tripathi Oct 22 '18 at 20:45
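
For later readers: the "String data, right truncation" error usually means a string value is wider than the column it is being written into. When rxDataStep auto-creates the output table, the inferred character column widths can be too narrow for some rows. One possible workaround (untested here; table and column names below are hypothetical placeholders — substitute your own schema) is to pre-create the output table with explicitly sized columns and have rxDataStep append into it:

    -- Hypothetical schema: adjust column names and types to match your query.
    CREATE TABLE dbo.OutputTable (
        [Year]     int,
        [LongText] nvarchar(4000)   -- sized generously to avoid right truncation
    );

    DECLARE @Rscript nvarchar(max) = N'
        sqlConnString <- "Driver={SQL Server};SERVER='+@@SERVERNAME+N';DATABASE='+DB_NAME()+N';Trusted_Connection=True;"
        inDS  <- RxSqlServerData(sqlQuery = "SELECT [Year], [LongText] FROM dbo.InputTable WHERE [Year] = 2015",
                                 connectionString = sqlConnString)
        outDS <- RxSqlServerData(table = "dbo.OutputTable", connectionString = sqlConnString)
        # Append into the pre-created table instead of letting rxDataStep create it
        rxDataStep(inData = inDS, outFile = outDS, append = "rows", rowsPerRead = 500000)
    ';
    EXEC sp_execute_external_script @language = N'R'
        , @script = @Rscript
    WITH RESULT SETS NONE;

With the table created up front, the ODBC driver only has to fit values into columns you sized yourself, rather than into the defaults chosen during auto-creation.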