Questions tagged [spring-data-hadoop]

Spring for Apache Hadoop is an open-source project that provides a unified configuration model and easy-to-use APIs for using HDFS, MapReduce, Pig, and Hive, as well as for developing and deploying YARN applications.

Spring for Apache Hadoop simplifies developing Apache Hadoop applications by providing a unified configuration model and easy-to-use APIs for using HDFS, MapReduce, Pig, and Hive. It also integrates with other Spring ecosystem projects, such as Spring Integration and Spring Batch, enabling you to develop solutions for big data ingest/export and Hadoop workflow orchestration.

Home page: http://projects.spring.io/spring-hadoop/

GitHub repo: https://github.com/spring-projects/spring-hadoop

47 questions
1
vote
0 answers

Fix Avro filename with spring data hadoop

My goal is to write a directory on HDFS. For that I use spring-data-hadoop:2.4.0.RELEASE and spring-data-hadoop-store:2.4.0.RELEASE. In my config class I define a bean: @Bean public DataStoreWriter dataStoreWriter() { return new…
vincent
  • 1,214
  • 12
  • 22
1
vote
1 answer

spring hbaseTemplate throws java.lang.IllegalArgumentException: Not a host:port pair: PBUF

I am new to HBase and want to keep using the Spring solution, hBaseTemplate, to access HBase. I have tried many times without success. This is what I did. The sample I am using is:…
user3006967
  • 3,291
  • 10
  • 47
  • 72
1
vote
0 answers

spring distcp creates target folder as filename

I am using Spring DistCp to copy a file within HDFS. My code looks like this: distcp.copy(null, null, null, "/tmp", null, null, null, null, null, null, null, new String[]{"/user/aq728y/publish/test.txt",…
adeelmahmood
  • 2,371
  • 8
  • 36
  • 59
1
vote
1 answer

how to pass parameters from web requests to spring boot yarn application

I'm using spring-boot and spring-boot-yarn to submit yarn applications to a cluster. My use-case is close to the one described in this tutorial https://github.com/spring-guides/gs-yarn-basic. The only difference is that my 'client' is supposed to be…
1
vote
1 answer

running a distcp job from spring in hadoop 2.x

I have been using Spring Data Hadoop in one of my projects and have been able to run distcp jobs on Hadoop 1.x. We recently upgraded to Hadoop 2.x, and for that I upgraded spring-data-hadoop to 2.0.4. Most of the stuff is still working but I am…
adeelmahmood
  • 2,371
  • 8
  • 36
  • 59
1
vote
0 answers

Spring Data Hadoop

I am using Hadoop 2.4.x, Spring 4.0.6, and Spring-Data-Hadoop 2.0.1.RELEASE-hadoop24. Running the application as a standalone jar works without problems, but running it in Tomcat produces errors. My Hadoop config is below:
JSH
  • 141
  • 1
  • 2
  • 5
1
vote
0 answers

Is jar creation necessary to execute MR on a remote cluster

I have been trying Spring Data for Hadoop to execute an MR job from my local Windows STS on a remote Hadoop cluster. The issue I face is described in detail here. There is a similar thread that prompted me to ask the question below. Is it necessary…
Kaliyug Antagonist
  • 3,512
  • 9
  • 51
  • 103
1
vote
4 answers

ClassNotFoundException after job submission

I'm trying out Spring Data - Hadoop for executing the MR code on a remote cluster from my local machine's IDE. //Hadoop 1.1.2, Spring 3.2.4, Spring-Data-Hadoop 1.0.0 Tried with these versions: Hadoop 1.2.1, Spring 4.0.1, Spring-Data-Hadoop…
Kaliyug Antagonist
  • 3,512
  • 9
  • 51
  • 103
1
vote
1 answer

Spring data - hadoop connectivity

I'm trying out Spring Data - Hadoop for executing the MR code on a remote cluster from my local machine's IDE. Versions: Hadoop 1.1.2, Spring 3.2.4, Spring-Data-Hadoop 1.0.0. My bean configuration file, applicationContext.xml, is as follows:
0
votes
0 answers

hadoop configuration error in runtime with openjdk11

I am migrating our application to OpenJDK 11, and with this setup the application throws the error below. Please help. Note: with JDK 1.8 the same code and configuration work fine. Java version: openjdk 11 Springboot-hadoop : 2.4.0…
0
votes
0 answers

Spring Hadoop - Unable to locate Spring NamespaceHandler for XML schema namespace

I want to enable Hive in my Spring Hadoop project. I understand that we can't use JavaConfiguration so I am using XML. I have an error regarding the Spring Namespace. Error starting ApplicationContext. To display the auto-configuration report…
Geoff L.
  • 99
  • 1
  • 2
  • 9
0
votes
1 answer

Issue when writing to HDFS using spring data hadoop

I was trying to write simple text to HDFS using Spring Data Hadoop, but I get an unexpected error when writing. Exception in thread "main" org.springframework.data.hadoop.store.StoreException: Store output context not yet initialized;…
Sachin
  • 1,675
  • 2
  • 19
  • 42
0
votes
2 answers

Issue when executing spring bean

I have a bean named textFileWriter that writes string entities to HDFS. I configured the Spring bean in the bean config file, but I get a NullPointerException during execution. Please help me with this. My bean configuration:
Sachin
  • 1,675
  • 2
  • 19
  • 42
0
votes
1 answer

How to simulate hdfs operations using spring data

I'm new to Spring Data Hadoop and would like to ask one general question. I have files in different formats and would like to extract the useful content with Apache Tika and store it as text files in HDFS. I've gone through the reference documentation…
Sachin
  • 1,675
  • 2
  • 19
  • 42
0
votes
1 answer

How to set up hadoop distributed cache using spring data

I'm new to Spring Data and am trying to distribute all the Spring Data dependencies through the distributed cache, but it's not working and I haven't found any useful resources. My configuration inside application-context.xml:
Sachin
  • 1,675
  • 2
  • 19
  • 42