I am getting this error: Can not read value at 0 in block -1 in file file: localdirectory/samplefile.parquet
I have to read a directory of Parquet files from an S3 bucket. To do this, I download the directory from S3 to local storage and read it in a standalone Java project. The code below is used for the download:
MultipleFileDownload multipleFileDownload = transferManager.downloadDirectory(bucketName, keyPrefix, directoryToSaveParquetFiles);
I then tried to read the Parquet files from the local directory. This is the code used to read a Parquet file:
Configuration conf = new Configuration();
conf.set("parquet.avro.readInt96AsFixed", "true");
// try-with-resources closes the reader even if read() throws
try (ParquetReader<GenericRecord> reader = AvroParquetReader.<GenericRecord>builder(new Path(filePath)).withConf(conf).build()) {
    GenericRecord obj = reader.read();
    while (obj != null) {
        // read attributes
        obj = reader.read();
    }
}
Imports used (pulled in via Maven dependencies):
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetReader;
import org.apache.parquet.hadoop.ParquetReader;
import org.apache.commons.io.FileUtils;
JDK used in the project: Amazon Corretto 11 (version 11.0.11)
This works perfectly fine on my local machine (I'm using IntelliJ IDEA), but when deployed to AWS ECS, I get this error:
Can not read value at 0 in block -1 in file file: localdirectory/samplefile.parquet
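One check that might help narrow this down on ECS is confirming that each downloaded file is a complete Parquet file before opening it. The following is a minimal sketch, assuming unencrypted Parquet files, which begin and end with the 4-byte marker PAR1. The names ParquetMagicCheck and hasParquetMagic are hypothetical and are not part of parquet-avro or the AWS SDK. It only detects truncated or non-Parquet files and says nothing about other possible causes.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of any library.
class ParquetMagicCheck {
    // Unencrypted Parquet files start and end with these four bytes.
    private static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the file has the PAR1 marker at both the start and the end.
    static boolean hasParquetMagic(String filePath) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(filePath, "r")) {
            long length = file.length();
            // Smallest possible layout: header marker + 4-byte footer length + footer marker.
            if (length < 12) {
                return false;
            }
            byte[] head = new byte[4];
            byte[] tail = new byte[4];
            file.seek(0);
            file.readFully(head);
            file.seek(length - 4);
            file.readFully(tail);
            return Arrays.equals(head, MAGIC) && Arrays.equals(tail, MAGIC);
        }
    }
}
```

Calling ParquetMagicCheck.hasParquetMagic(filePath) just before building the reader and logging the result together with the file size would show whether the file on ECS is shorter than, or different from, the one read locally.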