I am using the SFTP Source in Spring Cloud Data Flow, and it works for fetching files from the directory defined in sftp:remote-dir:/home/someone/source. I now have many subfolders under the remote-dir, and I want to recursively get all the files under this directory that match a pattern. I am trying to use filename-regex:, but so far it only works on one level. How do I recursively get the files I need?

Gary Russell
gamepop

1 Answer

The inbound channel adapter does not support recursion. Instead, use a custom source built on the outbound gateway with an MGET command and the recursive option (-R).

The docs were missing that option; this has been fixed in the current docs.

I opened an issue to create a standard app starter.

EDIT

With the Java DSL...

@SpringBootApplication
@EnableBinding(Source.class)
public class So44710754Application {

    public static void main(String[] args) {
        SpringApplication.run(So44710754Application.class, args);
    }

    // should store in Redis or similar for persistence
    private final ConcurrentMap<String, Boolean> processed = new ConcurrentHashMap<>();

    @Bean
    public IntegrationFlow flow() {
        return IntegrationFlows.from(source(), e -> e.poller(Pollers.fixedDelay(30_000)))
                .handle(gateway())
                .split()
                .<File>filter(p -> this.processed.putIfAbsent(p.getAbsolutePath(), true) == null)
                .transform(Transformers.fileToByteArray())
                .channel(Source.OUTPUT)
                .get();
    }

    private MessageSource<String> source() {
        return () -> new GenericMessage<>("foo/*");
    }

    private AbstractRemoteFileOutboundGateway<LsEntry> gateway() {
        AbstractRemoteFileOutboundGateway<LsEntry> gateway = Sftp.outboundGateway(sessionFactory(), "mget", "payload")
                .localDirectory(new File("/tmp/foo"))
                .options(Option.RECURSIVE)
                .get();
        gateway.setFileExistsMode(FileExistsMode.IGNORE);
        return gateway;
    }

    private SessionFactory<LsEntry> sessionFactory() {
        DefaultSftpSessionFactory sf = new DefaultSftpSessionFactory();
        sf.setHost("10.0.0.3");
        sf.setUser("ftptest");
        sf.setPassword("ftptest");
        sf.setAllowUnknownKeys(true);
        return new CachingSessionFactory<>(sf);
    }

}

And with Java config...

@SpringBootApplication
@EnableBinding(Source.class)
public class So44710754Application {

    public static void main(String[] args) {
        SpringApplication.run(So44710754Application.class, args);
    }

    @InboundChannelAdapter(channel = "sftpGate", poller = @Poller(fixedDelay = "30000"))
    public String remoteDir() {
        return "foo/*";
    }

    @Bean
    @ServiceActivator(inputChannel = "sftpGate")
    public SftpOutboundGateway mgetGate() {
        SftpOutboundGateway sftpOutboundGateway = new SftpOutboundGateway(sessionFactory(), "mget", "payload");
        sftpOutboundGateway.setOutputChannelName("splitterChannel");
        sftpOutboundGateway.setFileExistsMode(FileExistsMode.IGNORE);
        sftpOutboundGateway.setLocalDirectory(new File("/tmp/foo"));
        sftpOutboundGateway.setOptions("-R");
        return sftpOutboundGateway;
    }

    @Bean
    @Splitter(inputChannel = "splitterChannel")
    public DefaultMessageSplitter splitter() {
        DefaultMessageSplitter splitter = new DefaultMessageSplitter();
        splitter.setOutputChannelName("filterChannel");
        return splitter;
    }

    // should store in Redis, Zookeeper, or similar for persistence
    private final ConcurrentMap<String, Boolean> processed = new ConcurrentHashMap<>();

    @Filter(inputChannel = "filterChannel", outputChannel = "toBytesChannel")
    public boolean filter(File payload) {
        return this.processed.putIfAbsent(payload.getAbsolutePath(), true) == null;
    }

    @Bean
    @Transformer(inputChannel = "toBytesChannel", outputChannel = Source.OUTPUT)
    public FileToByteArrayTransformer toBytes() {
        FileToByteArrayTransformer transformer = new FileToByteArrayTransformer();
        return transformer;
    }

    private SessionFactory<LsEntry> sessionFactory() {
        DefaultSftpSessionFactory sf = new DefaultSftpSessionFactory();
        sf.setHost("10.0.0.3");
        sf.setUser("ftptest");
        sf.setPassword("ftptest");
        sf.setAllowUnknownKeys(true);
        return new CachingSessionFactory<>(sf);
    }

}
Gary Russell
  • I am trying to model it on [link](https://github.com/spring-cloud/spring-cloud-stream-app-starters/blob/master/sftp/spring-cloud-starter-stream-source-sftp/src/main/java/org/springframework/cloud/stream/app/sftp/source/SftpSourceConfiguration.java) but am having trouble configuring `Consumer()`. Can you suggest how I can structure it so that it works like sftpSource? Thanks. – gamepop Jun 23 '17 at 03:40
  • See the edit to my answer for a couple of solutions: one with the Java DSL, the other with Java config. – Gary Russell Jun 23 '17 at 14:24
  • Thanks, this works perfectly. I also want to delete each file from the remote SFTP location once it has been picked up, but the delete-remote-files option is also missing. What is the best way to achieve this? Do I use an outbound gateway again and execute an rm command? – gamepop Jun 23 '17 at 18:16
  • Yes; see the [Spring Integration ftp sample](https://github.com/spring-projects/spring-integration-samples/blob/master/basic/ftp/src/test/resources/META-INF/spring/integration/FtpOutboundGatewaySample-context.xml). It uses XML config, but the same techniques apply; see the `expression` on the RM gateway. However, that particular expression won't work for MGET (that sample uses LS/split/GET). With MGET, you could retain the remote directory structure, so you could rebuild the remote file path for the RM gateway: use `.localDirectoryExpression("'/tmp/' + #remoteDirectory")` in the MGET gateway. – Gary Russell Jun 23 '17 at 18:43
  • The local file is available in the `file_originalFile` header after the transformer. I opened [INT-4304](https://jira.spring.io/browse/INT-4304). – Gary Russell Jun 23 '17 at 18:48
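
As a rough illustration of the path-rebuilding idea in the comments above: assuming MGET mirrors the remote tree under a local root (for example via the `localDirectoryExpression` shown there), the remote path for an RM command could be recovered by relativizing the local file against that root. `RemotePaths` and `toRemotePath` are hypothetical names, not part of Spring Integration, and the `/tmp` root is only an assumption taken from the example expression.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class RemotePaths {

    // Hypothetical helper: maps a file fetched by MGET back to its remote path,
    // assuming the local tree under localRoot mirrors the remote tree.
    public static String toRemotePath(Path localRoot, Path localFile) {
        Path relative = localRoot.toAbsolutePath().normalize()
                .relativize(localFile.toAbsolutePath().normalize());
        // SFTP paths use '/', whatever the local separator is,
        // so join the path elements explicitly.
        StringBuilder remote = new StringBuilder();
        for (Path part : relative) {
            if (remote.length() > 0) {
                remote.append('/');
            }
            remote.append(part.toString());
        }
        return remote.toString();
    }

    public static void main(String[] args) {
        // Local copy /tmp/foo/sub/data.csv corresponds to remote foo/sub/data.csv
        System.out.println(toRemotePath(Paths.get("/tmp"), Paths.get("/tmp/foo/sub/data.csv")));
    }
}
```

The resulting string could then be supplied as the payload or header that the RM gateway's expression evaluates; how it is wired into the flow depends on the rest of the application.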