
I started Hive through the Spark Thrift Server and configured HDFS as the storage.

I also copied over hdfs-site.xml and core-site.xml:

core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
            <name>fs.defaultFS</name>
            <value>hdfs://x.x.x.x:9000</value>
    </property>
</configuration>


hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
            <name>dfs.replication</name>
            <value>1</value>
    </property>
</configuration>

Then I created a resource in StarRocks:

+-------+--------------+---------------------+---------------------------+
| Name  | ResourceType | Key                 | Value                     |
+-------+--------------+---------------------+---------------------------+
| hive0 | hive         | hive.metastore.uris | thrift://x.x.x.x:9083     |
+-------+--------------+---------------------+---------------------------+

and created an external table:

CREATE EXTERNAL TABLE `hive_table` (
  `id` int(11) NULL COMMENT ""
) ENGINE=HIVE 
COMMENT "HIVE"
PROPERTIES (
"database" = "rowdata",
"table" = "ht",
"resource" = "hive0",
"hive.metastore.uris"  =  "thrift://x.x.x.x:9083"
);

Querying the table fails with this error:

ERROR 1064 (HY000): Failed to get remote files, msg: com.starrocks.connector.exception.StarRocksConnectorException: Failed to get hive remote file's metadata on path: RemotePathKey{path='file:/opt/zw/spark-3.3.1-bin-hadoop3/sbin/spark-warehouse/rowdata.db/ht', isRecursive=false}. msg: File /opt/zw/spark-3.3.1-bin-hadoop3/sbin/spark-warehouse/rowdata.db/ht does not exist

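It seems StarRocks is not reading from HDFS directly: the path in the error uses the file: scheme and points at the local spark-warehouse directory.

The StarRocks version is 2.5.0.

Does anyone know the reason?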

Angle Tom

1 Answer


I found the reason: it was an error in my Hive configuration. The location recorded for the database in the metastore's DBS table was incorrect. After I fixed that path, the query works.
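
For anyone checking the same thing, here is a minimal sketch of the lookup. It assumes the standard Hive metastore schema (the DBS, TBLS and SDS tables) and that you can run SQL against the metastore's backing database, for example MySQL; your setup may differ. The error above reports a path starting with file:, so the thing to look for is a location that uses file: instead of the hdfs:// address from fs.defaultFS.

```sql
-- Run these against the metastore's backing database (assumed here to be
-- a SQL database such as MySQL), not through Hive or StarRocks.

-- Database-level locations: DB_LOCATION_URI is the value that was wrong.
SELECT DB_ID, NAME, DB_LOCATION_URI
FROM DBS;

-- Table-level locations are stored separately in SDS, so check them too.
SELECT d.NAME AS db_name, t.TBL_NAME, s.LOCATION
FROM TBLS t
JOIN DBS d ON t.DB_ID = d.DB_ID
JOIN SDS s ON t.SD_ID = s.SD_ID
WHERE d.NAME = 'rowdata';
```

Both queries are read-only. If you decide to correct a location by editing these tables, back up the metastore database first, since the connector reads whatever path is stored there.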

Angle Tom