0

I have a database full of emails. It contains the to, from, subject, body, etc as columns. I am trying to output essentially the same format, but for the application I will be using the output file for, the body text cannot be longer than about 300 characters. I don't want to just grab the first 300 characters, because I need all the text. What I want to do is chunk the text up such that if an email body is say, 900 characters, to return 3 separate rows, each containing a third of the message, and have the rows all contain the to, from, subject, etc information.

i.e.) I have this:

sender1      recipient1       subject1       body1(600 characters)
sender2      recipient2       subject2       body2(150 characters)

I want this:

sender1     recipient1        subject1       body1(first 300 characters)
sender1     recipient1        subject1       body1(next 300 characters)
sender2     recipient2        subject2       body2(150 characters)

Thanks!

1 Answers1

0

Here's an example of an approach:

SELECT p.i, SUBSTR(e.body,p.i*n.len+1,n.len) AS bodypart
  FROM ( SELECT 'abcdefghijklmnopqrstuvwxyz' AS body
       ) e
 CROSS
  JOIN ( SELECT 10 AS len
       ) n
  JOIN ( SELECT 0 AS i UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
                       UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
                       UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
       ) p
    ON p.i <= LENGTH(e.body) DIV n.len

That query splits out up to ten (10) parts up to n.len characters in length.

To get parts up to 300 characters in length, change the literal 10 to 300, or whatever max size you want. To get more parts returned, you would modify the inline view aliased as p to return a set of integers 0 thru the maximum number of parts you need. To get this to run against your table, you would replace the inline view aliased as e with your table containing the body column, and return the other columns you want.

Something like this:

SELECT e.sender
     , e.recipient
     , e.subject
     , SUBSTR(e.body,p.i*n.len+1,n.len) AS bodypart
     , p.i AS bodypart_number
  FROM myemailtable e
 CROSS
  JOIN ( SELECT 300 AS len
       ) n
  JOIN ( SELECT 0 AS i UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
                       UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
                       UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
       ) p
    ON p.i <= LENGTH(e.body) DIV n.len

Update

That query has a corner case, when LENGTH(e.body) is an even multiple of n.len, it will return an "extra" empty piece. One ugly-ish fix is to change the predicate in the ON clause:

    ON p.i < LENGTH(e.body) DIV n.len 
    OR (p.i = LENGTH(e.body) DIV n.len AND LENGTH(e.body) MOD n.len > 0)

Also note that if you store the result from the bodypart expression in a VARCHAR column, any trailing space(s) will be removed from each part.

spencer7593
  • 106,611
  • 15
  • 112
  • 140
  • I included the column `i` in the resultset, because I thought you might want to keep track of the order of the pieces extracted from the body. Note that there is a "corner" case where an extra "empty" piece will be returned in `LENGTH(e.body)` is an even multiple of `n.len`. – spencer7593 Feb 05 '13 at 23:41