When developers use Laravel to query an MSSQL database that uses the default SQL collation, and the vast majority of table columns are of the varchar data type (ASCII), the bundled Eloquent ORM generates queries whose bound parameters are of the nvarchar data type, i.e. Unicode/UTF-8. The values passed in therefore have the wrong data type for where clauses (predicate evaluation) against the varchar data in the columns. This mismatch forces an implicit conversion in the execution plan, rendering the indexes non-searchable: SQL Server falls back to a seek/scan of the clustered index, its default heuristic behavior when no index can be used. That may not be an issue on small data sets, but on tables with billions of rows and monthly partitioning it can mean SQL scanning hundreds of GB of data, or every partition, to find rows that an index seek would return in milliseconds rather than minutes. The issue is not specific to the Laravel Eloquent ORM; it crops up in many ORMs that assume a UTF-8 character set when the database was designed with English as the only language and so uses ASCII varchar data types.
I found some similar questions on Stack Overflow ("Laravel, SQL varchar") that referenced the SQL grammar file, https://github.com/laravel/framework/blob/9.x/src/Illuminate/Database/Schema/Grammars/SqlServerGrammar.php, and suggested changing the data type declarations in it from nvarchar to varchar, but they seemed to be talking about migrations, not the queries generated by the Eloquent ORM itself. Whether changing the grammar file also results in queries being generated with varchar bound parameters is not apparent from those answers, nor from the ones I found on Laracasts. What I am looking for is a definitive answer on how to handle the data type mismatches generated by the ORM.
Here is how the code looks.
The code generated by Eloquent is as follows; the parameters @P1, ... are all passed in to MSSQL as nvarchar(4000), making it necessary to add a CONVERT in the where clause (Geohash7 = CONVERT(VARCHAR(7), @P1)) via a raw SQL overload.
declare @p1 int set @p1=1 exec sp_prepexec @p1 output,N'@P1 nvarchar(4000),@P2 nvarchar(4000),@P3 nvarchar(4000),@P4 nvarchar(4000) ... .. .. ,N'select GMTDate date, SUM(Scans) scans from DB1.dbo.Geohash with (nolock) where (Geohash7 = CONVERT(VARCHAR(7), @P1) or Geohash7 = CONVERT(VARCHAR(7), @P2) or Geohash7 = CONVERT(VARCHAR(7), @P3) or Geohash7 = CONVERT(VARCHAR(7), @P4) ... .. .. ,N'9vffnth',N'9vffnsu',N'9vffnt5',N'9vffnsg'....
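For context, the raw overload that produces those CONVERT calls can be sketched like this. geohashPredicate is a hypothetical helper name, not code from the app; whereRaw is Eloquent's real raw-where method, shown only in a comment since the framework is not loaded here:

```php
<?php
// Sketch of the raw SQL overload described above. Builds one
// CONVERT-wrapped comparison per geohash so SQL Server compares
// varchar to varchar and can seek the index on Geohash7.
function geohashPredicate(array $geohashes): array
{
    $clause = implode(' or ', array_fill(
        0,
        count($geohashes),
        'Geohash7 = CONVERT(VARCHAR(7), ?)'
    ));
    return [$clause, $geohashes];
}

[$sql, $bindings] = geohashPredicate(['9vffnth', '9vffnsu']);
// The pieces are then handed to Eloquent, e.g.:
//   DB::table('Geohash')->whereRaw('(' . $sql . ')', $bindings)
echo $sql . "\n";
```

The drawback, of course, is that every string predicate in the application has to be hand-wrapped this way.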
Without the CONVERT added as a raw SQL overload, the code passed in, directly from Eloquent, looks like this.
declare @p1 int set @p1=1 exec sp_prepexec @p1 output,N'@P1 nvarchar(4000),@P2 nvarchar(4000),@P3 nvarchar(4000),@P4 nvarchar(4000) ... .. .. ,N'select GMTDate date, SUM(Scans) scans from DB1.dbo.Geohash with (nolock) where (Geohash7 = @P1 or Geohash7 = @P2 or Geohash7 = @P3 or Geohash7 = @P4 ... .. .. ,N'9vffnth',N'9vffnsu',N'9vffnt5',N'9vffnsg'....
The SQL execution plan generated by the first set of code will use an index to seek/scan for records. The plan for the second set will not: SQL Server implicitly converts the varchar column to match the nvarchar(4000) parameter, which rules out any index use, so it always does a clustered index seek/scan, which is not an efficient or viable plan when the table has billions of rows. The goal is for Eloquent to pass the parameters not as nvarchar(4000) but as varchar(XXX), or even varchar(4000), so that SQL Server can use the indexes built on the underlying varchar column data.
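For completeness, one avenue I have looked at, though whether it is sufficient for Eloquent's bound parameters is exactly what I am asking: the pdo_sqlsrv driver exposes a PDO::SQLSRV_ATTR_ENCODING attribute, and setting it to PDO::SQLSRV_ENCODING_SYSTEM is supposed to send string parameters in the system (non-Unicode) encoding rather than as nvarchar. A sketch of wiring that through the connection options in config/database.php, assuming Laravel forwards the 'options' array to the PDO constructor:

```php
// config/database.php (sketch, pdo_sqlsrv driver assumed).
// PDO::SQLSRV_ATTR_ENCODING / PDO::SQLSRV_ENCODING_SYSTEM are real
// pdo_sqlsrv constants; whether this alone changes the parameter
// declarations Eloquent emits is the open question.
'sqlsrv' => [
    'driver'   => 'sqlsrv',
    'host'     => env('DB_HOST'),
    'database' => env('DB_DATABASE'),
    'username' => env('DB_USERNAME'),
    'password' => env('DB_PASSWORD'),
    'options'  => [
        PDO::SQLSRV_ATTR_ENCODING => PDO::SQLSRV_ENCODING_SYSTEM,
    ],
],
```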