
I am using Spring Cloud Stream to read a file, split it with the file splitter, and emit each line as a message, using the Java DSL. The file I am reading has a header row. Is there an easy way to skip the header row before or after reading?

Any help is appreciated.

Here is what my splitter and integration flow look like:

  return IntegrationFlows
            .from("....")
            .split(Files.splitter(true, true)
                            .charset(StandardCharsets.UTF_8)
                            .applySequence(true), // emit sequenceNumber as a header
                    e -> e.id("fileSplitter")
            );


    IntegrationFlow integrationFlow = integrationFlowBuilder
            .<Object, Class<?>>route(Object::getClass, m -> m
                    .channelMapping(FileSplitter.FileMarker.class, "markers.input")
                    .channelMapping(String.class, "lines.input"))
            .get();
Suresh Naik

2 Answers


If I read this right, you are using one of our out-of-the-box apps, the file source (https://github.com/spring-cloud-stream-app-starters/file/blob/master/spring-cloud-starter-stream-source-file/README.adoc), and deploying it with the Spring Cloud Data Flow DSL, something like `stream create file --file.consumer.mode=lines --file.directory=/tmp/ | sink`. Is that correct?

If so, messages carry a special header called sequenceNumber when you read files in lines mode. You can add a filter between the source and the sink that drops the header-row message based on a header expression.

Vinicius Carvalho
  • Thanks for the response. I don't use Data Flow; I run separate source and sink apps. Here's what the splitter looks like: `.from(s -> s.file(new File(fileDir)).filter(getFileFilter(fileName)), e -> e.poller(poller)).split(Files.splitter(true, true).charset(StandardCharsets.UTF_8).applySequence(true), /* emit sequenceNumber as a header */ e -> e.id("fileSplitter")...` and – Suresh Naik Aug 29 '17 at 23:15
  • The channel mapping looks like this: `.<Object, Class<?>>route(Object::getClass, m -> m.channelMapping(FileSplitter.FileMarker.class, "markers.input").channelMapping(String.class, "skus.input")).get();` – Suresh Naik Aug 29 '17 at 23:23
  • That's not readable. Consider editing your question to include the code. The answer is correct, though: you really just have to skip that row downstream using a filter. – Artem Bilan Aug 30 '17 at 01:21
  • @ArtemBilan - I have updated my question with the code snippet I am using. The answer looks correct to me, and I am emitting sequenceNumber for that reason, but I couldn't figure out the actual filter expression to use. Can you give a sample filter with a header expression that skips the message with sequenceNumber=0? – Suresh Naik Aug 30 '17 at 02:07
  • `.filter("headers.sequenceNumber != 0")` – Artem Bilan Aug 30 '17 at 02:09
  • @ArtemBilan - Thanks, that helped. The sequenceNumber starts at 1, and since I am emitting markers as well, the first row in the file gets sequenceNumber=2: `.filter("headers.sequenceNumber != 2")` – Suresh Naik Aug 30 '17 at 02:44
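
To make the off-by-one in the last comment concrete, here is a minimal plain-Java model of the numbering described there. It is not Spring Integration code: `HeaderSkipSketch`, `Emission`, `split`, and `skipHeader` are hypothetical names, and the model simply assumes what the comment reports, namely that sequence numbers start at 1 and that a start marker, when enabled, takes the first number.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the numbering described in the comments above.
// This only illustrates the idea; it is not how FileSplitter is implemented.
class HeaderSkipSketch {

    // One emitted item: its 1-based sequence number, its payload, and
    // whether it is a marker rather than a line of the file.
    static final class Emission {
        final int sequenceNumber;
        final String payload;
        final boolean marker;

        Emission(int sequenceNumber, String payload, boolean marker) {
            this.sequenceNumber = sequenceNumber;
            this.payload = payload;
            this.marker = marker;
        }
    }

    // Numbers every emission in order; with markers enabled the lines are
    // wrapped in START and END emissions (an assumption of this model).
    static List<Emission> split(List<String> lines, boolean markers) {
        List<Emission> out = new ArrayList<>();
        int seq = 0;
        if (markers) {
            out.add(new Emission(++seq, "START", true));
        }
        for (String line : lines) {
            out.add(new Emission(++seq, line, false));
        }
        if (markers) {
            out.add(new Emission(++seq, "END", true));
        }
        return out;
    }

    // Keeps only data lines: drops markers and the emission whose sequence
    // number belongs to the header row (2 with markers, 1 without).
    static List<String> skipHeader(List<String> lines, boolean markers) {
        int headerSeq = markers ? 2 : 1;
        List<String> kept = new ArrayList<>();
        for (Emission e : split(lines, markers)) {
            if (!e.marker && e.sequenceNumber != headerSeq) {
                kept.add(e.payload);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> file = List.of("id,name", "1,apple", "2,pear");
        System.out.println(skipHeader(file, true));
        System.out.println(skipHeader(file, false));
    }
}
```

In this model both calls in `main` keep only the two data rows. In the real flow the markers are already routed to their own channel by the router in the question, so the filter on the lines channel only needs the sequence-number check.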

Spring Integration 5.1.5 solution:

@Bean
public MessageSource<File> sourceDirectory() {
    FileReadingMessageSource messageSource = new FileReadingMessageSource();
    messageSource.setDirectory(new File("./data/input"));
    return messageSource;
}

@Bean
public IntegrationFlow folderFlow() {
    FileSplitter fileSplitter = new FileSplitter();
    fileSplitter.setFirstLineAsHeader("columns");
    return IntegrationFlows.from(sourceDirectory(), configurer -> configurer.poller(Pollers.fixedDelay(1000)))
            .split(fileSplitter)
            .handle(System.out::println)
            .get();
}
Pavel