In your experience, what's the best option for performing semantic searches over HTML, Word, and PDF files? Should the files be stored in a varchar(max) column or directly on disk (in a FileTable)?

File sizes aren't restricted to a predefined maximum, but most files are expected to be larger than 1 or 2 MB. We expect several thousand documents a month (possibly more than 200k files uploaded a year), and we want search results from all three file types.

What are your recommendations?

Thanks guys!

Luis

marc_s
Luis Abreu
  • There are just too many options here for a reasonable answer. But certainly don't store your files in a varchar(max) column. You will not be able to retrieve them intact. If you store in a persistent table you should use varbinary(max). Storing the file name in the database with the file on disc is a third option. They all have merits and problems. – Sean Lange Jul 14 '17 at 14:40
  • Hello Sean. Yep, my mistake... Initially the idea was to keep only html files, thus the varchar max column... – Luis Abreu Jul 14 '17 at 15:42
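To illustrate the point in the first comment about binary files not surviving a character column, here is a minimal Python sketch. It is only an analogy: the sample bytes, the function names, and the choice of ASCII as the text encoding are assumptions made for the example, and the exact behavior in SQL Server depends on the column collation and the client driver.

```python
# Hypothetical stand-in for the first bytes of a PDF: a header line followed
# by a comment line made of high (non-ASCII) bytes.
original = b"%PDF-1.7\n%\xe2\xe3\xcf\xd3\n"


def roundtrip_as_text(data: bytes, encoding: str = "ascii") -> bytes:
    """Mimic a character column: decode bytes to text, then encode them back.

    Bytes that are not valid in the encoding are replaced, so they are lost.
    """
    text = data.decode(encoding, errors="replace")
    return text.encode(encoding, errors="replace")


def roundtrip_as_binary(data: bytes) -> bytes:
    """Mimic a binary column: the bytes are stored and returned unchanged."""
    return bytes(data)


if __name__ == "__main__":
    print("text round trip intact:  ", roundtrip_as_text(original) == original)
    print("binary round trip intact:", roundtrip_as_binary(original) == original)
```

Under these assumptions the text round trip alters the non-ASCII bytes while the binary round trip returns them unchanged, which is the reason varbinary(max) is suggested over varchar(max) for Word and PDF files.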
