Questions tagged [sequencefile]

A SequenceFile is a Hadoop binary file containing key/value pairs.

A SequenceFile is a file format used by Hadoop for the efficient storage and retrieval of key/value pairs. Records can optionally be compressed, either individually or in blocks, for more compact storage.
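
To make the idea of binary key/value records concrete, here is a toy Python sketch that stores pairs as length-prefixed records with optional value compression. It is not Hadoop's actual on-disk format (which also carries a header, sync markers, and Writable serialization) and not its API; the names `write_records` and `read_records` and the one-byte compression flag are hypothetical and exist only for illustration.

```python
import io
import struct
import zlib


def write_records(stream, pairs, compress=False):
    """Write (key, value) string pairs as length-prefixed binary records.

    Hypothetical toy layout, NOT the real SequenceFile format:
      1 byte   flag (1 = values are zlib-compressed, 0 = raw)
      then, per record:
        4 bytes key length, 4 bytes value length (big-endian),
        key bytes, value bytes
    """
    stream.write(b"\x01" if compress else b"\x00")
    for key, value in pairs:
        k = key.encode("utf-8")
        v = value.encode("utf-8")
        if compress:
            v = zlib.compress(v)
        stream.write(struct.pack(">II", len(k), len(v)))
        stream.write(k)
        stream.write(v)


def read_records(stream):
    """Yield (key, value) string pairs written by write_records."""
    compressed = stream.read(1) == b"\x01"
    while True:
        header = stream.read(8)
        if len(header) < 8:
            break  # end of stream
        klen, vlen = struct.unpack(">II", header)
        k = stream.read(klen)
        v = stream.read(vlen)
        if compressed:
            v = zlib.decompress(v)
        yield k.decode("utf-8"), v.decode("utf-8")
```

Writing a list of pairs to an `io.BytesIO()` buffer with `write_records`, rewinding it with `seek(0)`, and passing it to `read_records` should yield the same pairs back, with or without `compress=True`.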

For more information, see the API documentation or the Wiki page.

157 questions
0 votes, 1 answer

Fail to write SequenceFile with Pig

I want to store a Pig variable in a Hadoop SequenceFile so that I can run external MapReduce jobs on it. Assume that my data has the (chararray, int) schema: (hello,1) (test,2) (example,3). I wrote this storing function: import java.io.IOException; import…
Aslan986
0 votes, 1 answer

How to change the Hadoop sequence file value to Jackson parser?

I have a problem that I really don't know how to solve. I have a Hadoop sequence file containing links to web pages. In each entry of the sequence file, the key is the URL of one web page and the value is its attributes and its…
user2970089
0 votes, 1 answer

Increase number of splits for SequenceFileInputFormat

I'm using SequenceFileInputFormat as the input for my map, where both the key and the value are text. There are 106 files, each between 500 MB and 750 MB. My logs say that the number of splits is 290. I want to know if…
user3690321
0 votes, 1 answer

ClassCastException when sorting a SequenceFile in Hadoop?

I am following Hadoop: The Definitive Guide, 3rd edition, by Tom White. I have successfully written a SequenceFile into HDFS, following the example the author gives in the book. But when I try to run the sort (pg 138), I get a ClassCastException. The…
eagertoLearn
0 votes, 0 answers

SeqFilesFromDirectory() error on Amazon EMR

I am trying to run a simple program on Amazon EMR which converts text files in a directory into sequence files. The program runs fine on my local machine but gives me the following error on Amazon EMR. Could someone please tell me how to get rid of this…
0 votes, 1 answer

Read SequenceFile in Hadoop 2.0

I am trying to read a SequenceFile in Hadoop 2.0 but cannot get it to work. The code below works perfectly in Hadoop 1.0. Please let me know if I am missing something with respect to 2.0: Configuration conf = new Configuration(); try…
Arjun
0 votes, 2 answers

Exporting a sequence file to Oracle with Sqoop

I have been trying to find documentation on how to export a sequence file to Oracle using Sqoop. Is that possible? Currently I have my files (in HDFS) in a text-based format and I am using Sqoop to export those files to some Oracle's…
dreamer
0 votes, 2 answers

Hive SequenceFile with Java Class; just pass to toString()

I've got a Hadoop SequenceFile where the key is IntWritable and the value is some arbitrary Java class implementing Writable, and with an interesting toString() method. I would love to make a two column Hive table where the first column is the key…
Joseph Victor
0 votes, 1 answer

Error when getting original images from Hadoop SequenceFile

I first pack all my images into a Hadoop SequenceFile: FSDataInputStream in = null; in = fs.open(new Path(uri)); // uri is the image location in HDFS byte buffer[] = new byte[in.available()]; in.read(buffer); context.write(imageID, new…
hakunami
0 votes, 2 answers

Mahout: Cannot convert into sequence file

I'm trying to convert some text files into Mahout sequence files, so I run: mahout seqdirectory -i inputFolder -o outputFolder. But I always get this exception: java.lang.Exception: java.lang.RuntimeException:…
IrishDog
0 votes, 2 answers

If I store all my images in a SequenceFile, how can I design a mapper to process a selection of them?

I have lots of image files and need to store them in HDFS. To avoid the Small Files Problem, I am planning to store my image files using Sequence Files. My problem is that I need to create a MapReduce program that processes only a…
zaz
0 votes, 0 answers

Convert Text File to Sequence File

I am a newbie to Hadoop and Mahout. I want to know how to convert a simple text file containing a set of vectors to a sequence file. I have tried the MR framework and changed the outputFormat to SequenceFileOutputFormat, and I get the following output…
Jayant
0 votes, 0 answers

Store complex data with a Hadoop SequenceFile

My question is how to generate a SequenceFile from text that outputs a format like this: , where the left side is the key and the right side is the value.
Yan
0 votes, 1 answer

SequenceFiles which map a single key to multiple values

I am trying to do some preprocessing on data that will be fed to LucidWorks Big Data for indexing. LWBD accepts SolrXML in the form of SequenceFiles. I want to create a Pig script which will take all the SolrXML files in a directory and output…
0 votes, 0 answers

Cannot write the output of the reducer to a sequence file

I have a Map function and a Reduce function outputting key-value pairs of class Text and IntWritable. This is just the gist of the Map part in the Main function: TableMapReduceUtil.initTableMapperJob( tablename, // input HBase table name …
Pavan