I was able to accomplish this, assuming you are talking about StreamSets Data Collector. The idea is to copy the file name from the record header into a field; after that, it is just a matter of parsing that string in the Jython Evaluator to grab the parts you need.
Set up a Pipeline:
(Directory Origin) -> (Expression Evaluator) -> (Jython Evaluator) -> (Trash)
Configuration:
Directory Origin:
    File Name Pattern: ddsample_*
    First File to Process: ddsample_20211203
Expression Evaluator:
    Field Expressions:
        Output Field: /filename_from_header
        Field Expression: ${record:attribute('filename')}
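Jython Evaluator:
    Script:
        for record in sdc.records:
            try:
                # 'ddsample_' is 9 characters long, so txt[9:] keeps
                # everything after the prefix
                txt = record.value['filename_from_header']
                record.value['filename_from_header'] = txt[9:]
                sdc.output.write(record)
            except Exception as e:
                # Records that fail (e.g. the field is missing) go to the error stream
                sdc.error.write(record, str(e))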
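Then click Preview and select the Jython Evaluator stage. In its output, filename_from_header should hold only the part of the file name that follows the ddsample_ prefix.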
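
If you want to check the parsing logic outside of Data Collector first, here is a minimal sketch in plain Python. The helper names strip_prefix and date_part are hypothetical (they are not part of any StreamSets API), and the ".csv" file name is only an assumed example of a file with an extension. It computes the prefix length instead of hard-coding 9, and leaves names without the prefix unchanged.

```python
import os

# Assumption: the prefix matches the File Name Pattern "ddsample_*" used above.
PREFIX = "ddsample_"


def strip_prefix(filename, prefix=PREFIX):
    """Return the part of filename after prefix.

    If filename does not start with prefix, it is returned unchanged.
    """
    if filename.startswith(prefix):
        return filename[len(prefix):]
    return filename


def date_part(filename, prefix=PREFIX):
    """Strip the prefix and then drop any file extension."""
    return os.path.splitext(strip_prefix(filename, prefix))[0]


print(strip_prefix("ddsample_20211203"))   # same result as txt[9:] for this name
print(date_part("ddsample_20211203.csv"))  # hypothetical name with an extension
```

The same slicing could be dropped into the evaluator script in place of txt[9:], which avoids having to recount characters if the prefix ever changes.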
