0

I'm using dplyr in my stored procedure. I'm passing in a dataset that's returned by querying responses based on a provided input parameter, StudyID. The goal is to return the results of dplyr from the R script back to the stored procedure. However, I receive the error in my question title.

My stored procedure is as follows:

ALTER PROCEDURE [dbo].[spCodeMeans]
-- Add the parameters for the stored procedure here
@StudyID int,
@StudyID_outer int
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;

-- Insert statements for procedure here

BEGIN TRY

        exec sp_execute_external_script
        @language = N'R',
        @script = N'
            # Summary Mean
            #
            # Calculates the mean of all independent variables in a table of data
            # grouped by \code{code}. Note that independent variables are
            # identified as all columns matching the following pattern: the letter
            # "c" followed by a one-or-more digit number.
            #
            # @param x Table to summarize
            # @importFrom rlang .data
            #
            # @return Summary table where each distinct \code{code} value is
            #   represented by one row with columns for the respective means of
            #   each independent variable.
            #install.packages("dplyr")

            code_mean <- function(x) {

              dplyr::group_by(x, .data$code) %>%
                dplyr::summarize_at(dplyr::vars(dplyr::matches("c\\d+")), mean)

            }

            cmresult <- code_mean(x = input_data_1)

            # Summary Standard Deviation
            #
            # Calculates the standard deviation of all independent variables in a
            # table of data grouped by \code{code}. Note that independent variables
            # are identified as all columns matching the following pattern: the
            # letter "c" followed by a one-or-more digit number.
            #
            # @param x Table to summarize
            # @importFrom rlang .data
            #
            # @return Summary table where each distinct \code{code} value is
            #   represented by one row with columns for the respective standard
            #   deviations of each independent variable.

            code_sd <- function(x) {

              dplyr::group_by(x, .data$code) %>%
                dplyr::summarize_at(dplyr::vars(dplyr::matches("c\\d+")), stats::sd)

            }

            sdresult <- code_sd(x = input_data_1)
        ',
    @input_data_1 = N'
        Select Responses = 
        c.StudyID, c.RespID, c.ProductNumber, c.ProductSequence, c.BottomScaleValue, 
        c.BottomScaleAnchor, c.TopScaleValue, c.TopScaleAnchor, c.StudyDate,
        c.DayOfWeek, c.A, c.B, c.C, c.D, c.E, c.F,
        c.DependentVarYN, c.VariableAttributeID, c.VarAttributeName, c.[1] as c1, 
        c.[2] as c2, c.[3] as c3, c.[4] as c4, c.[5] as c5, c.[6] as c6, c.[7] as c7, c.[8] as c8
        from ClosedStudyResponses c
        where DependentVarYN = 0
    ',
    @params = N'@StudyID int',
    @StudyID = @StudyID_outer
    --@output_data_1_name = N'dfcm',
    --@output_data_2_name = N'dfsd'

WITH RESULT SETS (
(cmsresult varchar(MAX)),
(sdresult varchar(MAX)))
END TRY

BEGIN CATCH
    THROW;
END CATCH
END

I call spCodeMeans per below:

DECLARE @out varchar(MAX), @out1 varchar(MAX);

EXEC    spCodeMeans
        @StudyID = 21,
        @StudyID_outer = 21

SELECT   @out, @out1;

GO

Error syntax:

Msg 39004, Level 16, State 20, Line 4
A 'R' script error occurred during execution of 'sp_execute_external_script' 
with HRESULT 0x80004004.
Msg 39019, Level 16, State 1, Line 4
An external script error occurred: 
Error in dplyr::group_by(x, .data$code) %>% 
dplyr::summarize_at(dplyr::vars(dplyr::matches("c\\d+")),  : 
could not find function "%>%"
Calls: source -> withVisible -> eval -> eval -> code_mean

Error in execution.  Check the output for more information.
Error in eval(ei, envir) : 
Error in execution.  Check the output for more information.
Calls: source -> withVisible -> eval -> eval -> .Call
Execution halted

How do I remedy the error such that the query results are passed into the R Script and both mean and standard deviation are returned?

Update Per comments, I added to the stored procedure so that it is now as follows:

    BEGIN TRY

        exec sp_execute_external_script
        @language = N'R',
        @script = N'
            # Summary Mean
            #
            # Calculates the mean of all independent variables in a table of data
            # grouped by \code{code}. Note that independent variables are
            # identified as all columns matching the following pattern: the letter
            # "c" followed by a one-or-more digit number.
            #
            # @param x Table to summarize
            # @importFrom rlang .data
            #
            # @return Summary table where each distinct \code{code} value is
            #   represented by one row with columns for the respective means of
            #   each independent variable.
            #install.packages("dplyr")

            code_mean <- function(x) {

              `%>%` = magrittr:: `%>%`
              dplyr::group_by(x, .data$code) %>%
                dplyr::summarize_at(dplyr::vars(dplyr::matches("c\\d+")), mean)

            }

            cmresult <- code_mean(x = input_data_1)

            # Summary Standard Deviation
            #
            # Calculates the standard deviation of all independent variables in a
            # table of data grouped by \code{code}. Note that independent variables
            # are identified as all columns matching the following pattern: the
            # letter "c" followed by a one-or-more digit number.
            #
            # @param x Table to summarize
            # @importFrom rlang .data
            #
            # @return Summary table where each distinct \code{code} value is
            #   represented by one row with columns for the respective standard
            #   deviations of each independent variable.

            code_sd <- function(x) {

              `%>%` = magrittr:: `%>%`
              dplyr::group_by(x, .data$code) %>%
                dplyr::summarize_at(dplyr::vars(dplyr::matches("c\\d+")), stats::sd)

            }

            sdresult <- code_sd(x = input_data_1)
        ',
    @input_data_1 = N'
        Select Responses = 
        c.StudyID, c.RespID, c.ProductNumber, c.ProductSequence, c.BottomScaleValue, 
        c.BottomScaleAnchor, c.TopScaleValue, c.TopScaleAnchor, c.StudyDate,
        c.DayOfWeek, c.A, c.B, c.C, c.D, c.E, c.F,
        c.DependentVarYN, c.VariableAttributeID, c.VarAttributeName, c.[1] as c1, 
        c.[2] as c2, c.[3] as c3, c.[4] as c4, c.[5] as c5, c.[6] as c6, c.[7] as c7, c.[8] as c8
        from ClosedStudyResponses c
        where DependentVarYN = 0
    ',
    @params = N'@StudyID int',
    @StudyID = @StudyID_outer
    --@output_data_1_name = N'dfcm',
    --@output_data_2_name = N'dfsd'

WITH RESULT SETS (
(cmsresult varchar(MAX)),
(sdresult varchar(MAX)))
END TRY

BEGIN CATCH
    THROW;
END CATCH
END

The original error is resolved and is replaced by the following:

Msg 39004, Level 16, State 20, Line 4
A 'R' script error occurred during execution of 'sp_execute_external_script' 
with HRESULT 0x80004004.
Msg 39019, Level 16, State 1, Line 4
An external script error occurred: 
Error in dplyr::group_by(x, .data$code) : object 'input_data_1' not found
Calls: source ... eval -> code_mean -> %>% -> eval -> eval -> <Anonymous>

Error in execution.  Check the output for more information.
Error in eval(ei, envir) : 
Error in execution.  Check the output for more information.
Calls: source -> withVisible -> eval -> eval -> .Call
Execution halted
SidC
  • 3,175
  • 14
  • 70
  • 132
  • 1
    you have to either load the library tidyverse to be able to use `%>%` or you can load the `library(magrittr)` where `%>%` is unpacked from. In your execution aboad, you are just loading the specific functions from their packages without attaching the whole package and thus you miss to use the other functions available within the package – Onyambu Apr 09 '18 at 00:54
  • @Onyambu Is this carried out by running install.packages('tidyverse') from the command line in the C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\R_SERVICES\bin folder? – SidC Apr 09 '18 at 01:04
  • within your `code_mean` and `code_sd`functions include `\`%>%\`=magrittr::\`%>%\`` before you start the code – Onyambu Apr 09 '18 at 01:08
  • Can you add this as your answer? – SidC Apr 09 '18 at 01:10
  • If it works I would be glad to add it as an answer.. Maybe there is another underlying problem.. But that is the one I have seen so far – Onyambu Apr 09 '18 at 01:11

0 Answers0