I'm creating CSV and TSV files using AWS Data Pipeline. The files are created just fine, but I can't figure out how to include a header row with the column names.

At first, I expected the headers to be generated automatically from the SQL query I'm running to produce the export. That didn't happen, which was fine.

Then I added a list of column definitions to the "column" attribute of the DefaultDataFormat3 feature/node.

{
  "escapeChar": "\\",
  "name": "DefaultDataFormat3",
  "column": [
    "id INT",
    "field1 STRING",
    "field2 STRING"
  ],
  "columnSeparator": "|",
  "id": "DataFormatId_jEXqL",
  "type": "TSV",
  "recordSeparator": "\\n"
}

I still just get CSVs and TSVs with no header row in the export.

T. Brian Jones

1 Answer

I ran across a blog post explaining a solution for this. If you are using a query to select your data, you can return the column names as the first row of the result:

SELECT 'firstName', 'lastName', 'email'
UNION ALL
SELECT firstName, lastName, email
FROM users
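
Two details may matter in practice: some database engines reject a UNION ALL whose halves have different column types (for example the INT id column from the question next to a string literal), and SQL does not guarantee row order without an ORDER BY. Here is a minimal sketch of the same idea that addresses both, run against an in-memory SQLite database purely for illustration. The table contents, the `sort_key` column, and the use of SQLite are assumptions of this sketch, not part of Data Pipeline; the casting syntax may differ on your database.

```python
import csv
import io
import sqlite3

# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, firstName TEXT, lastName TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "Ada", "Lovelace"), (2, "Alan", "Turing")],
)

# The header is a row of string literals. The numeric id is cast to text so
# both halves of the UNION ALL have matching column types, and sort_key
# (a made-up helper column) forces the header row to sort first.
query = """
SELECT id, firstName, lastName FROM (
    SELECT 0 AS sort_key, 'id' AS id, 'firstName' AS firstName, 'lastName' AS lastName
    UNION ALL
    SELECT 1, CAST(id AS TEXT), firstName, lastName FROM users
)
ORDER BY sort_key
"""
rows = conn.execute(query).fetchall()

# Write the result with the same "|" separator used in the question's format.
buf = io.StringIO()
csv.writer(buf, delimiter="|", lineterminator="\n").writerows(rows)
output = buf.getvalue()
print(output)
```

The trade-off is that every column in the export becomes text, which is usually acceptable for a CSV or TSV file.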
MrHen