12

We have a client application accessing a SQL Server database (mirrored and clustered) through a C# dll with retry logic on specific error numbers.

We are having issues during fail overs where transient errors are being thrown by the .dll where catching them in retry logic would have allowed the client application to continue elegantly after the fail over.

Here is a list of errors we currently catch in the retry logic:

0 
-2
-1
2 
53
64
233
596
924
1205
1222
2801
4060
6005
10053
10054
10060
40143
40197
40501
40613

Does anyone know of a more comprehensive list of errors which the DB could throw during a fail over, which is recoverable once the fail over is complete?

Their must be loads of software out their that has had to deal with these, but I can't seem to find a decent list.

Thanks, Chris.

ChrisMurray
  • 577
  • 1
  • 4
  • 21
  • I'm assuming your using a Microsoft SQL server. In that case the meaning of those numbers can be found [here](https://technet.microsoft.com/en-us/library/cc645603(v=sql.105).aspx) the negative numbers are listed in the 1-1000 section. By default in C# you get a exception, those tend to contain more information then just the error number. – Nick Otten Apr 22 '16 at 14:09
  • Hi Nick, thanks for your comment. The .dll was written in house, so the errors behind these error codes are well known. My concern is that we have missed some error codes off. I am hoping someone may have a more exhaustive list of transient error codes we should be retrying on. – ChrisMurray Apr 22 '16 at 14:27
  • 1
    See MS SQL Server error codes for example here: https://azure.microsoft.com/en-us/documentation/articles/sql-database-develop-error-messages/ – Alex Kudryashev Apr 25 '16 at 15:35
  • All the error messages are in `sys.messages` so you could query that for likely keywords. – Martin Smith Apr 26 '16 at 19:40
  • I know about the big list of errors listed on the [Microsoft website](https://msdn.microsoft.com/en-us/library/cc645611.aspx?f=255&MSPPError=-2147217396) what I'm interested in seeing is the list of SQLException error codes that users catch and retry on within their applications - rather than just retrying on every SQLException thrown by the connection. – meh-uk Apr 27 '16 at 16:23
  • I think this question might already be answered here http://stackoverflow.com/a/32401432/201648, although this doesn't cover the negative error codes. – Aaron Newton Apr 28 '16 at 10:46
  • It doesn't - what I'm looking for is a list of sensible error codes to catch in the application. I'd much rather give someone the bounty than not - so if someone can post an answer that gives such a sensible list I'll award the bounty. – meh-uk Apr 30 '16 at 16:38

3 Answers3

4

Due to the apparent lack of a universal list, we have gone down the route of retrying on all errors which leave the connection in a broken state.

ChrisMurray
  • 577
  • 1
  • 4
  • 21
1

I think this guys here had a similar problem, might want to check.

Is there an overview of all SQL Server 2012 error codes?

Community
  • 1
  • 1
Rafael
  • 1,495
  • 1
  • 14
  • 25
  • This isn't really what I'm looking for - I'm well aware Microsoft has a big list of error codes, I'm curious as to which error codes people catch and retry on in their applications. – meh-uk Apr 28 '16 at 11:00
  • Well, I guess this would be sort of personal as to the company, since most of them behave on specific mistakes they might just create a try-catch for the common mistakes they have, as for that it might even be a crime to put the list over here, due to copyright blablabla. Anyhow, I don't think that this is the idea of Stack Overflow, really, because then it would be more of a personal based decision than a solution itself. – Rafael Apr 28 '16 at 12:15
  • 1
    What I'm really looking for is a sensible list of common mistakes. – meh-uk Apr 28 '16 at 13:50
-2
Create PROC uspErrorLog
 ( 
@userId varchar(15)  ,
@ExcType varchar(255)='',
@ExcMessage varchar(255)='',
@ExcSource varchar(255)='',
@ExcStackTrace varchar(255)=''
               ,@pageUrl varchar(150)=''
               ,@methodName varchar(150)=''
               ,@lineNo int
               ,@timeZone varchar(150)
                 )
AS
BEGIN
        BEGIN TRY
        BEGIN TRANSACTION

INSERT INTO [Common].[ErrorLogs]
           ([userId]
           ,[ExceptionType]
           ,[ExceptionMessage]
           ,[ExceptionSource]
           ,[ExceptionStackTrace]
           ,[pageUrl]
           ,[MethodName]
           ,[LineNumber]
           ,[Timezone]
           ,[LogDate])
     VALUES
           ( @userId
           ,@ExcType
           ,@ExcMessage
           ,@ExcSource
           ,@ExcStackTrace
           ,@pageUrl
           ,@methodName
           ,@lineNo
           ,@timeZone
           ,getdate()
           )
        COMMIT TRAN
    END TRY

    BEGIN CATCH
        IF @@TRANCOUNT > 0

        SELECT NULL AS ID
            ,'Cant Perform Insert Action.Error:' +  Error_message()  AS Message
            ,ERROR_LINE() AS linentto

        ROLLBACK TRAN --RollBack in case of Error                                                      

    END CATCH
END



 And call this procedure in your stored procedures in {catch} by passing the variables.Like this

BEGIN CATCH

IF @@TRANCOUNT > 0
        SELECT NULL AS ID
            ,'Cant perform operation.Error:' + Error_message() AS Message
            ,ERROR_LINE() AS linoo

    ROLLBACK TRAN --RollBack in case of Error                              

    DECLARE @methodname VARCHAR(200)
        ,@msg VARCHAR(max)
        ,@lineno INT

    SET @methodname = (
            SELECT Error_procedure()
            )
    SET @msg = (
            SELECT Error_message()
            )
    SET @lineno = (
            SELECT Error_line()
            )

    EXEC [dbo].[uspErrorLog] @CreatedBy
        ,''
        ,@msg
        ,''
        ,''
        ,''
        ,@methodname
        ,@lineno
        ,''

END CATCH

  • 2
    We already have methods of logging the error codes set up. The problem is when we encounter one which we do not have retry logic on it is stopping the client application, so we are only finding these transient errors one at a time. – ChrisMurray Apr 27 '16 at 14:08