In general, there's no built-in functionality for processing XML files in Greenplum. The thing you are talking about is the gpfdist client-side transformation that allows you to convert your XML to CSV and then process it inside of the database.
The best option in processing the XMLs inside of the Greenplum would be to decompose them to the relational model and load as a set of tables. If you need to process just a general XML files, you can write a function in PL/Java or PL/Python that would parse XML for you and extract the information you need.
Here's an example of the function:
create or replace function parse (inxml varchar, node varchar) returns varchar as $BODY$
from xml.etree import ElementTree
def getText (node, subnode):
res = None
if node is not None:
sn = node.find(subnode)
if sn is not None:
res = sn.text
else:
for el in node:
res2 = getText(el, subnode)
if res2 is not None:
res = res2
break
return res
et = ElementTree.fromstring(inxml.strip())
return getText(et, node)
$BODY$
language plpythonu
volatile;
And here's an example of its invocation:
select parse (
'<?xml version="1.0" encoding="utf-8"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Dont forget me this weekend!</body>
<attachments>
<attachment>file1</attachment>
<attachment>file2</attachment>
<attachment>file3</attachment>
</attachments>
</note>',
'attachment');