3

I have a table definition as below

CREATE TABLE [dbo].[Dialogs](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [DiscussionID] [int] NOT NULL,
    [ApprovedByUserID] [int] NULL,
    [AddedByUserID] [int] NULL,
    [Text] [nvarchar](max) NULL,
    [ApprovalStatus] [int] NULL,
    [ApprovedOn] [datetime] NULL,
    [AddedOn] [datetime] NOT NULL,
 CONSTRAINT [PK_dbo.Dialogs] PRIMARY KEY CLUSTERED 
(
    [ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

Text column contains html entered by user. i want to create full text index on Text column, i also need support for html filter so that when any user type <div>,<p> or any other html tag then it do not return any results.

To create index i use below SQL

CREATE FULLTEXT INDEX ON [Dialogs]
 ( 
  [Text] TYPE COLUMN '.html'
 ) 
  KEY INDEX [PK_dbo.Dialogs]
      ON AOPRDefault; 

but SQL Server throws error

Incorrect syntax near '.html'.

can any one please give a example of how to specify TYPE COLUMN option when creating full text index.

Arjun Vachhani
  • 1,761
  • 3
  • 21
  • 44

1 Answers1

5

Firstly, the problem is that you should be referring to a column where you have '.html' rather than a literal, so you might have something like:

-- ADD COMPUTED COLUMN TO STORE FILE TYPE
ALTER TABLE dbo.Dialogs ADD FileExtension AS '.html'; 

CREATE FULLTEXT INDEX ON dbo.Dialogs ([Text] TYPE COLUMN FileExtension)
    KEY INDEX [PK_dbo.Dialogs] ON AOPRDefault;

However you are misunderstanding the purpose of the TYPE COLUMN property, according to the documentation:

TYPE COLUMN type_column_name

Specifies the name of a table column, type_column_name, that is used to hold the document type for a varbinary(max) or image document. This column, known as the type column, contains a user-supplied file extension (.doc, .pdf, .xls, and so forth). The type column must be of type char, nchar, varchar, or nvarchar.

Specify TYPE COLUMN type_column_name only if column_name specifies a varbinary(max) or image column, in which data is stored as binary data; otherwise, SQL Server returns an error.

Note

At indexing time, the Full-Text Engine uses the abbreviation in the type column of each table row to identify which full-text search filter to use for the document in column_name. The filter loads the document as a binary stream, removes the formatting information, and sends the text from the document to the word-breaker component. For more information, see Configure and Manage Filters for Search

Since your index is on a text column this is not applicable and the create index statement would return an error. Even if you were storing the html document as binary data, then it would still not work as you intended, what you are after is html parsing, which is a separate issue from full text indexing.

Community
  • 1
  • 1
GarethD
  • 68,045
  • 10
  • 83
  • 123
  • your solution solves my problem if i change my column data type to `varbinary(max)`. but i also need to display records where if user entered 3 random word from dialog then it should display dialog with those word as result, for that i need full text. any other alternative to full text to filter result? – Arjun Vachhani Dec 16 '15 at 10:17
  • I am not 100% sure I am afraid, I have done very little work with full text indexes. The best solution I can think of is a computed column that parses your html, then a full text index on this, but it seems like a fair bit of overhead, so you would have to decide if the juice is worth the squeeze. – GarethD Dec 16 '15 at 12:37
  • thanks, i am also thinking full text index on computed column is good solution. – Arjun Vachhani Dec 16 '15 at 12:41