     String clientId = "***********";
     String authTokenEndpoint = "***";
     String clientKey = "****";
     AccessTokenProvider provider = new ClientCredsTokenProvider(authTokenEndpoint, clientId, clientKey);
     String accountFQDN = "******";
     ADLStoreClient client = ADLStoreClient.createClient(accountFQDN, provider);

1. How can I use the ADLStoreClient object to upload a file? Is there anything like

s3.putObject(BucketName, FileName, new File(Srcfileloc))

in the Azure SDK?

2. The documentation describes creating a file and writing to it. What if my data is an image or a big tar file?

3. Do I have to follow two separate approaches for ADLS Gen1 and ADLS Gen2, or can the same code be reused?

Pod Mo

1 Answer


Q1: How to upload file to Azure data lake

Based on my tests, we can upload a file from on-premises to Azure Data Lake with the following code. I used an 800 MB tar.gz file for the test.

  1. SDK

     <dependency>
       <groupId>com.microsoft.azure</groupId>
       <artifactId>azure-data-lake-store-sdk</artifactId>
       <version>2.2.3</version>
     </dependency>
     <dependency>
       <groupId>org.slf4j</groupId>
       <artifactId>slf4j-nop</artifactId>
       <version>1.7.21</version>
     </dependency>
  2. Code

     AccessTokenProvider provider = new ClientCredsTokenProvider(authTokenEndpoint, clientId, clientKey);
     ADLStoreClient client = ADLStoreClient.createClient(accountFQDN, provider);

     String filepath = "D:\\download\\ideaIU-2019.3.3.tar.gz";
     Path path = Paths.get(filepath);
     String filename = "test/" + path.getFileName();

     // try-with-resources closes both streams even when an exception is thrown
     try (FileInputStream in = new FileInputStream(filepath);
          ADLFileOutputStream out = client.createFile(filename, IfExists.OVERWRITE)) {
         int bufSize = 4 * 1000 * 1000;   // 4 MB buffer
         out.setBufferSize(bufSize);
         byte[] buffer = new byte[bufSize];
         int n;
         while ((n = in.read(buffer)) != -1) {
             out.write(buffer, 0, n);
         }
     } catch (Exception e) {
         // process exception
         e.printStackTrace();
     }
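The read/write loop above is plain `java.io` streaming, so it can be sanity-checked locally without an Azure account. Here is a minimal sketch where `ByteArrayInputStream`/`ByteArrayOutputStream` stand in for the local file and the ADLS output stream (those stand-ins and the class name are only for this local test, not part of the SDK):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class CopyLoopDemo {
    // Same buffered copy loop as the upload code; returns total bytes copied.
    static long copy(InputStream in, OutputStream out, int bufSize) throws IOException {
        byte[] buffer = new byte[bufSize];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000_000]; // ~10 MB dummy payload
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), sink, 4 * 1000 * 1000);
        System.out.println(copied == data.length && sink.size() == data.length);
    }
}
```

Because the loop only depends on `InputStream`/`OutputStream`, the same code path works for images, tar archives, or any other binary content (question 2): no content-type handling is needed.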


For more details, please refer to the sample.

Q2: How to upload file to Azure data lake Gen2

Based on my tests, we can use the following code.

  1. SDK

     <dependency>
       <groupId>com.azure</groupId>
       <artifactId>azure-storage-file-datalake</artifactId>
       <version>12.0.1</version>
     </dependency>
     <dependency>
       <groupId>com.fasterxml.jackson.core</groupId>
       <artifactId>jackson-core</artifactId>
       <version>2.10.3</version>
     </dependency>
  2. Code

     try {
         StorageSharedKeyCredential sharedKeyCredential =
                 new StorageSharedKeyCredential(accountName, accountKey);
         DataLakeServiceClientBuilder builder = new DataLakeServiceClientBuilder();
         builder.credential(sharedKeyCredential);
         builder.endpoint("https://" + accountName + ".dfs.core.windows.net");
         DataLakeServiceClient client = builder.buildClient();

         String fileSystem = "test";
         DataLakeFileSystemClient fileSystemClient = client.getFileSystemClient(fileSystem);
         DataLakeDirectoryClient directoryClient = fileSystemClient.getDirectoryClient("testFolder");

         String filepath = "D:\\download\\ideaIU-2019.3.3.tar.gz";
         Path path = Paths.get(filepath);
         DataLakeFileClient fileClient = directoryClient.createFile(String.valueOf(path.getFileName()));
         // the second argument allows overwriting an existing file
         fileClient.uploadFromFile(filepath, true);
     } catch (Exception ex) {
         ex.printStackTrace();
     }

For more details, please refer to the document and the sample.
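As for question 3: the Gen1 and Gen2 snippets use different SDKs and different client types, so the upload code cannot be reused as-is. Both cases boil down to "send a local path to a remote path", though, so you could hide the difference behind a small interface of your own. This is only an illustrative sketch under that assumption (the `Uploader` interface and the in-memory fake are hypothetical, not part of either SDK):

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class UploaderDemo {
    // Hypothetical abstraction: the Gen1 (ADLStoreClient) and Gen2
    // (DataLakeFileClient) code paths could each be wrapped behind this.
    interface Uploader {
        void upload(Path localFile, String remotePath) throws Exception;
    }

    // In-memory fake so the sketch runs without Azure credentials.
    static class FakeUploader implements Uploader {
        final Map<String, Path> store = new HashMap<>();
        public void upload(Path localFile, String remotePath) {
            store.put(remotePath, localFile);
        }
    }

    public static void main(String[] args) throws Exception {
        Path local = Paths.get("ideaIU-2019.3.3.tar.gz");
        FakeUploader uploader = new FakeUploader();
        // The same call site would work for a Gen1- or Gen2-backed implementation.
        uploader.upload(local, "test/" + local.getFileName());
        System.out.println(uploader.store.containsKey("test/ideaIU-2019.3.3.tar.gz"));
    }
}
```

The application code that decides *what* to upload then stays the same; only the `Uploader` implementation changes between Gen1 and Gen2.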

Jim Xu