
I have my own S3-compatible store running locally instead of AWS S3. Is there a way to override s3.amazonaws.com?
I have created hive-site.xml and put it in ${HIVE_HOME}/conf/.
This is what I have in the .xml file:

<configuration>
    <property>
        <name>fs.s3n.impl</name>
        <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
    </property>
    <property>
        <name>fs.s3n.endpoint</name>
        <value>local_s3_ip:port</value>
    </property>
    <property>
        <name>fs.s3n.awsAccessKeyId</name>
        <value>VALUE</value>
    </property>
    <property>
        <name>fs.s3n.awsSecretAccessKey</name>
        <value>VALUE</value>
    </property>
</configuration>

Now I want to create a table, and if I put:

LOCATION('s3n://hive/sample_data.csv')

I get this error:
org.apache.hadoop.hive.ql.exec.DDLTask. java.net.UnknownHostException: hive.s3.amazonaws.com: Temporary failure in name resolution

It works for neither s3 nor s3n.

Is it possible to override the default s3.amazonaws.com and use my own S3?

– s_z_p

2 Answers

  1. Switch to the S3A connector (and Hadoop 2.7+ JARs).
  2. Set `fs.s3a.endpoint` to the hostname of your server.
  3. Set `fs.s3a.path.style.access` to `true` (rather than expecting every bucket to have a DNS entry).

Expect to spend time working on authentication options, as signing is always a trouble spot with third-party stores.
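
A quick way to sanity-check these settings before involving Hive is to probe the store from the Hadoop CLI. A minimal sketch, assuming the endpoint, credentials, and `hive` bucket from the question:

    # List the bucket through the local endpoint; -D sets per-command overrides
    hadoop fs \
      -D fs.s3a.endpoint=local_s3_ip:port \
      -D fs.s3a.path.style.access=true \
      -D fs.s3a.access.key=VALUE \
      -D fs.s3a.secret.key=VALUE \
      -ls s3a://hive/

If this lists `sample_data.csv`, the same properties in hive-site.xml should work for table locations.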

– stevel

With this configuration I am able to reach my own s3 endpoint.

<configuration>
    <property>
        <name>fs.s3a.impl</name>
        <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
    </property>
    <property>
        <name>fs.s3a.endpoint</name>
        <value><ip>:<port></value>
    </property>
    <property>
        <name>fs.s3a.path.style.access</name>
        <value>true</value>
    </property>
    <property>
        <name>fs.s3a.access.key</name>
        <value><ak></value>
    </property>
    <property>
        <name>fs.s3a.secret.key</name>
        <value><sk></value>
    </property>
    <property>
        <name>fs.s3a.connection.ssl.enabled</name>
        <value>false</value>
    </property>
</configuration>
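
With the endpoint configured, a table can point at the local store. A minimal sketch, with hypothetical column names (note that LOCATION takes a directory, not a single file):

    -- columns are placeholders; match them to sample_data.csv
    CREATE EXTERNAL TABLE sample_data (
        id INT,
        name STRING
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION 's3a://hive/';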

– s_z_p
  • In my case that wasn't enough. I could create tables with s3a locations, but inserting into them required adding bucket-specific secrets, like `fs.s3a.bucket.MYBUCKET.access.key` – Alexey Apr 28 '23 at 15:26
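
For reference, a minimal sketch of the per-bucket override the comment describes, assuming a bucket named MYBUCKET (per-bucket fs.s3a options take precedence over the global ones for that bucket):

    <property>
        <name>fs.s3a.bucket.MYBUCKET.access.key</name>
        <value><ak></value>
    </property>
    <property>
        <name>fs.s3a.bucket.MYBUCKET.secret.key</name>
        <value><sk></value>
    </property>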