sql server filestream folder count not matching table count, caused by duplicates?

Question

I have a table (myTable) in which I have a field flagged as being a Filestream, on this server it is the only filestream and it saves to the filestream location of F:\foo

SELECT COUNT(1) FROM myTable results in 37,314 but the folder properties of F:\foo are 36,358 files. All the rows in myTable have data in the Filestream column, does that mean 956 were complete duplicates?

If so, how does SQL Server determine what is and what is not a duplicate (is it a complete binary compare? as I don't think it would be worth SQL Storing data at a block-differential level)? As I can't seem to find any information SQL Server consolidating duplicate records for filestreams.

Additionally when I re-save many of the same records again (making the count say 45,000) the total files in F:\foo increase which to me indicates that the duplicate checking (if there is any such thing) is not perfect.

Does SQL Server consolidate similar files in filestreams together or not? Is there a stored procedure that can be executed to cause SQL to re-scan the filestream filegroup and look for further duplicates to consolidate existing space?

Server in question is SQL Server 2012 Enterprise with SP1 but has also happened on our UAT SQL Server 2012 Standard Edition with SP1 box.

sql server filestream folder count not matching table count, caused by duplicates?

0 Answers0