
After connecting to a remote FTP or SFTP server programmatically in Java, is it possible to read the files under /home/www-data/content/ without writing them to a file on the local system? Basically, I want to extract file metadata from that path with Apache Tika, without downloading the files.

UPDATE:

I have tried connecting with JSch, which is an implementation of SSH2.

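// Open an SSH session; the SFTP channel is opened on top of it.
JSch jsch = new JSch();
Session session = jsch.getSession(SFTPUSER, SFTPHOST, SFTPPORT);
session.setPassword(SFTPPASS);
java.util.Properties config = new java.util.Properties();
// Disables host key verification; acceptable for testing only.
config.put("StrictHostKeyChecking", "no");
session.setConfig(config);
session.connect();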
user850234
  • It definitely is possible, but you will have to provide more details on how you connect to the remote FTP/SFTP server from your Java code (which library you use, etc.) – rootkit Feb 18 '13 at 19:52
  • @rootkit007: I have updated my question with what I have tried so far. – user850234 Feb 18 '13 at 20:00

2 Answers


For SFTP with the JSch library, use the ChannelSftp.get() method and supply an OutputStream instance that does not write to disk (e.g., a ByteArrayOutputStream). See the stock JSch example here:

http://www.jcraft.com/jsch/examples/Sftp.java.html

And JavaDoc for get() method:

http://epaul.github.com/jsch-documentation/javadoc/com/jcraft/jsch/ChannelSftp.html#get(java.lang.String,java.io.OutputStream,com.jcraft.jsch.SftpProgressMonitor,int,long)

For FTP you will have to use a different library, as JSch only supports the SFTP protocol.
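The in-memory approach described above can be sketched with the standard library alone. This is only an illustration, not JSch's own API: the `RemoteReader` interface, the `InMemoryFetch` class, and the example path are hypothetical names, with `RemoteReader` standing in for a call such as `ChannelSftp.get(remotePath, out)`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class InMemoryFetch {

    /** Hypothetical stand-in for a library call that copies a remote file into a stream. */
    @FunctionalInterface
    interface RemoteReader {
        void copyTo(String remotePath, OutputStream out) throws Exception;
    }

    /** Buffers the remote file in memory and exposes it as an InputStream; nothing touches the disk. */
    static InputStream fetchToMemory(RemoteReader reader, String remotePath) throws Exception {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        reader.copyTo(remotePath, buffer);
        return new ByteArrayInputStream(buffer.toByteArray());
    }

    public static void main(String[] args) throws Exception {
        // A fake reader so the sketch runs without a server; the path below is made up.
        RemoteReader fake = (path, out) -> out.write("demo bytes".getBytes(StandardCharsets.UTF_8));
        InputStream in = fetchToMemory(fake, "/home/www-data/content/example.txt");
        System.out.println(in.available() + " bytes buffered in memory");
    }
}
```

Assuming `sftp` is a connected ChannelSftp, the reader could be written as `(path, out) -> sftp.get(path, out)`, and the returned InputStream passed on to Tika. Note that the whole file is held in memory, so this suits small and medium files better than very large ones.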

rootkit

You might want to try out Apache Commons VFS (Virtual File System).

They have a pretty decent example of a simple SFTP file download; in your case you could just change the process() method of that example and let it parse the file with Tika.

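With Commons VFS you can just work with a FileObject. Calling getContent().getInputStream() on the FileObject gives you an InputStream, which you can hand over to Tika for further processing. (doGetInputStream() is a protected method of the underlying implementation classes, so it is not meant to be called directly.)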

Jeroen