I am working on manipulating data from particular column. I have 1 table and it has varchar(max) column in it. Following is the input in that column and want following output.
requirements : 1) Remove all html content 2) Output contains only message body and attachment links. 3) Need SQL query or stored Procedure
Input :
Sample message body.
Lorem Ipsum is simply dummy text of the printing and typesetting industry.
Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.
------=_NextPart_001_0026_01C44313.1C14C7A0
Content-Type: text/html; charset="us-ascii"
Content-Transfer-Encoding: 8bit
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML xmlns="http://www.w3.org/TR/REC-html40" xmlns:o =
"urn:schemas-microsoft-com:office:office" xmlns:w =
"urn:schemas-microsoft-com:office:word">
<HEAD>
</HEAD>
<BODY lang=EN-US style="tab-interval: 36.0pt" vLink=purple link=blue>
<p>This is html content.</p>
<p>Html content should be removed.</p>
</BODY></HTML>
------=_NextPart_001_0026_01C44313.1C14C7A0--
------=_NextPart_000_001B_01C44313.1B899EA0
Content-Type: application/vnd.ms-excel; name="attachment1.xls"
Content-Disposition: attachment; filename="attachment1.xls"
Content-Transfer-Encoding: base64
G:\fakepath\Attach\attachment1.xls
------=_NextPart_000_001B_01C44313.1B899EA0
Content-Type: application/msword; name="attachment2.DOC"
Content-Disposition: attachment; filename="attachment2.DOC"
Content-Transfer-Encoding: base64
G:\fakepath\Attach\attachment2.DOC
------=_NextPart_000_001B_01C44313.1B899EA0--
Output :
Sample message body.
Lorem Ipsum is simply dummy text of the printing and typesetting industry.
Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged.
G:\fakepath\Attach\attachment1.xls
G:\fakepath\Attach\attachment2.DOC
Note: There are several records in this table with similar kind of value in the column. (I mean some of the records have 1 attachment link and some have 2 or more.)
Here is what I have done so far :
SELECT STUFF(columnName,CHARINDEX('<!DOCTYPE HTML',columnName),CHARINDEX('</html>',columnName) - CHARINDEX('<!DOCTYPE HTML',columnName) + 7,'') AS removeHTML from TableName;
SELECT CASE WHEN CHARINDEX('Content-Transfer-Encoding: base64', columnName) > 0 THEN SUBSTRING(columnName, CHARINDEX('Content-Transfer-Encoding: base64', columnName) + 33, LEN(columnName))ELSE columnName END AS attachmentLinks from TableName