To retrieve data from the PubMed dataset (NCBI), I used the FireFTP add-on for Firefox to download the XML, PDF, and TXT content [http://www.ncbi.nlm.nih.gov/pmc/tools/ftp/]. I have also installed Apache Flume successfully.
My main objective is to connect the FTP server to Flume and store the resulting dataset in Cassandra.
Can anyone explain how to connect an FTP source to Flume?
Thank you so much in advance.