59

I want to create a file in HDFS and write data to it. I used this code:

Configuration config = new Configuration();     
FileSystem fs = FileSystem.get(config); 
Path filenamePath = new Path("input.txt");  
try {
    if (fs.exists(filenamePath)) {
        fs.delete(filenamePath, true);
    }

    FSDataOutputStream fin = fs.create(filenamePath);
    fin.writeUTF("hello");
    fin.close();
} catch (IOException e) {
    e.printStackTrace();
}

It creates the file, but it doesn't write anything to it. I searched a lot but didn't find anything. What is my problem? Do I need any permission to write to HDFS?

Thanks.

iggymoran
csperson

4 Answers

75

As an alternative to @Tariq's answer, you could pass the URI when getting the filesystem:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.Progressable;
import java.io.BufferedWriter;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.net.URI;

Configuration configuration = new Configuration();
FileSystem hdfs = FileSystem.get( new URI( "hdfs://localhost:54310" ), configuration );
Path file = new Path("hdfs://localhost:54310/s2013/batch/table.html");
if ( hdfs.exists( file )) { hdfs.delete( file, true ); } 
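OutputStream os = hdfs.create( file,
    new Progressable() {
        public void progress() {
            // Progress callback. bytesWritten is not defined in this snippet;
            // you have to track the count yourself if you want to report it.
            System.out.println("...bytes written: [ "+bytesWritten+" ]");
        } });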
BufferedWriter br = new BufferedWriter( new OutputStreamWriter( os, "UTF-8" ) );
br.write("Hello World");
br.close();
hdfs.close();
Miguel Pereira
  • How to get variable 'bytesWritten'? – Wei Lin Apr 19 '16 at 02:19
  • Try looking at the OutputStream docs? ex: https://docs.oracle.com/javase/7/docs/api/java/io/DataOutputStream.html – Miguel Pereira Apr 20 '16 at 12:07
  • import statements would be helpful... Where is Configuration coming from in particular? – Dan Ciborowski - MSFT Dec 01 '16 at 21:17
  • Configuration and many others come from org.apache.hadoop.* in the 'org.apache.hadoop:hadoop-common:jar:X.X.X' library that you pick – pleonasmik May 11 '17 at 14:31
  • import statements in case anyone is wondering: ```import org.apache.hadoop.fs.FileSystem import org.apache.hadoop.conf.Configuration import java.net.URI import org.apache.hadoop.fs.Path import org.apache.hadoop.util.Progressable import java.io.BufferedWriter import java.io.OutputStreamWriter ``` – user238607 Dec 04 '20 at 19:44
  • A general word of advice: be careful with the `FileSystem.close()` call. The FS may be cached somewhere in the system (see the description of [`FileSystem.get()`](https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/fs/FileSystem.html#get-java.net.URI-org.apache.hadoop.conf.Configuration-)), and there will be issues when something else tries to use that cached instance after it was explicitly closed like that. – Johnny Baloney Nov 18 '21 at 12:12
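Regarding the 'bytesWritten' question in the comments: the linked DataOutputStream docs describe a `size()` method that returns the number of bytes written so far. Below is a minimal sketch using only the JDK, so it runs without a cluster. The class name `BytesWrittenDemo` and the method `writeAndCount` are made up for illustration, and an in-memory stream stands in for the HDFS stream. Hadoop's `FSDataOutputStream` extends `DataOutputStream` (and also offers `getPos()`), so the same idea should carry over if `os` is declared with that type instead of plain `OutputStream`.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hadoop or of the answer above.
class BytesWrittenDemo {

    // Writes the text as UTF-8 into an in-memory sink and returns the
    // byte count that DataOutputStream has tracked.
    static int writeAndCount(String text) {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
            return out.size(); // bytes written to this stream so far
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bytes written: " + writeAndCount("Hello World"));
    }
}
```

Note that in the answer's code the text goes through a BufferedWriter, so the count on the underlying stream only moves once the writer is flushed or closed.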
24

Either set the HADOOP_CONF_DIR environment variable to your Hadoop configuration folder, or add the following two lines to your code:

config.addResource(new Path("/HADOOP_HOME/conf/core-site.xml"));
config.addResource(new Path("/HADOOP_HOME/conf/hdfs-site.xml"));

If you don't add this, your client will try to write to the local FS, which results in the permission denied exception.
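To see why the missing site files matter: the value the client needs from core-site.xml is `fs.defaultFS`, and when no loaded file sets it, Hadoop's shipped default is `file:///`, i.e. the local filesystem. The sketch below is a rough, JDK-only way to check what a given core-site.xml would set. It does not use Hadoop's own `Configuration` loader; the class `DefaultFsLookup` and the method `defaultFs` are hypothetical names, and it assumes well-formed `<property>` entries with a `<name>` and a `<value>`.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for inspecting a core-site.xml; not a Hadoop API.
class DefaultFsLookup {

    // What Hadoop falls back to when no site file sets fs.defaultFS.
    static final String LOCAL_DEFAULT = "file:///";

    // Returns the fs.defaultFS value found in the given XML text,
    // or the local-filesystem default if the property is absent.
    static String defaultFs(String coreSiteXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(coreSiteXml.getBytes(StandardCharsets.UTF_8)));
            NodeList properties = doc.getElementsByTagName("property");
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                String name = property.getElementsByTagName("name").item(0).getTextContent().trim();
                if (name.equals("fs.defaultFS")) {
                    return property.getElementsByTagName("value").item(0).getTextContent().trim();
                }
            }
            return LOCAL_DEFAULT;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse configuration XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<configuration><property><name>fs.defaultFS</name>"
                + "<value>hdfs://localhost:9000</value></property></configuration>";
        System.out.println(defaultFs(xml));
        System.out.println(defaultFs("<configuration/>"));
    }
}
```

With the first input the lookup yields the HDFS URI; with an empty configuration it yields `file:///`, which matches the behaviour described above where the client ends up on the local FS.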

OneCricketeer
Tariq
1

This should do the trick:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.*;
import java.nio.charset.StandardCharsets;

public static void writeFileToHDFS() throws IOException {
    Configuration configuration = new Configuration();
    configuration.set("fs.defaultFS", "hdfs://localhost:9000");
    configuration.addResource(new Path("/HADOOP_HOME/conf/core-site.xml"));
    configuration.addResource(new Path("/HADOOP_HOME/conf/hdfs-site.xml"));
    FileSystem fileSystem = FileSystem.get(configuration);
    // Create a path
    String fileName = "input.txt";
    Path hdfsWritePath = new Path("/user/yourdesiredpath/" + fileName);
    // The second argument (true) overwrites the file if it already exists
    FSDataOutputStream fsDataOutputStream = fileSystem.create(hdfsWritePath, true);

    BufferedWriter bufferedWriter = new BufferedWriter(new OutputStreamWriter(fsDataOutputStream, StandardCharsets.UTF_8));
    bufferedWriter.write("Java API to write data in HDFS");
    bufferedWriter.close();
    fileSystem.close();
}
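
One side note on why this code wraps the stream in a BufferedWriter instead of calling `writeUTF` as in the question: `writeUTF` comes from `java.io.DataOutputStream` (which `FSDataOutputStream` extends) and writes a two-byte length prefix followed by modified UTF-8, so the file does not contain just the text. The sketch below shows the difference with in-memory streams only, so it needs no Hadoop on the classpath. The class and method names (`WriteUtfDemo`, `viaWriteUtf`, `viaWriter`) are made up for the illustration.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it only compares the bytes each approach produces.
class WriteUtfDemo {

    // Bytes produced by DataOutputStream.writeUTF:
    // a 2-byte length prefix followed by the text in modified UTF-8.
    static byte[] viaWriteUtf(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            out.writeUTF(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Bytes produced by a writer with an explicit charset: the text only.
    static byte[] viaWriter(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (BufferedWriter writer = new BufferedWriter(
                new OutputStreamWriter(sink, StandardCharsets.UTF_8))) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println("writeUTF bytes: " + viaWriteUtf("hello").length);
        System.out.println("writer bytes:   " + viaWriter("hello").length);
    }
}
```

For "hello", `writeUTF` yields seven bytes (two for the length, five for the text) while the writer yields five. This is not the cause of the empty file in the question, but it matters if the file is meant to be read as plain text afterwards.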
-2

Please try the approach below.

FileSystem fs = path.getFileSystem(conf);
SequenceFile.Writer inputWriter = new SequenceFile.Writer(fs, conf, path, LongWritable.class, MyWritable.class);
inputWriter.append(new LongWritable(uniqueId++), new MyWritable(data));
inputWriter.close();