Questions tagged [data-ingestion]
248 questions
-1 votes, 1 answer
Apache Doris: Tablet writer failed to write
tablet_id=27306172, txn_id=28573520, err=-235
This error occurred during a data import; I believe something is wrong with my version compaction.

Bill H
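
Err -235 in Doris generally indicates a tablet has accumulated too many unmerged data versions because small imports arrive faster than compaction can merge them; the usual remedy is fewer, larger loads (plus reviewing compaction settings). A minimal sketch of batching rows into a single Stream Load with Python requests, assuming a hypothetical FE host, database, table, and default credentials:

    import requests

    FE = "http://doris-fe:8030"            # assumption: FE HTTP port 8030
    DB, TABLE = "demo_db", "demo_table"    # hypothetical names

    # Accumulate many small rows into one CSV payload instead of issuing
    # one load per row, so fewer versions are created per tablet.
    rows = "\n".join(f"{i},value_{i}" for i in range(100_000))

    resp = requests.put(
        f"{FE}/api/{DB}/{TABLE}/_stream_load",
        data=rows.encode("utf-8"),
        headers={
            "label": "batch_load_0001",    # unique label per load
            "column_separator": ",",
            "Expect": "100-continue",
        },
        auth=("root", ""),                 # assumption: default root user
    )
    print(resp.json())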
-1 votes, 1 answer
15 TB data ingestion from S3 to DynamoDB
I have to ingest 15 TB of data from S3 to DynamoDB. No transformation is required other than adding a new column (insert date).
The data in S3 is in Parquet format with Snappy compression. The data in S3 has a different partition key…

dba
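
For a volume like 15 TB, DynamoDB's bulk import from S3 is worth evaluating first; if the extra column must be added in flight, one programmatic route is streaming Parquet batches with pyarrow and writing through boto3's batch writer. A sketch with hypothetical bucket and table names (note DynamoDB rejects Python floats, so numeric columns may need converting to Decimal):

    import datetime
    import boto3
    import pyarrow.dataset as ds

    table = boto3.resource("dynamodb").Table("target_table")          # hypothetical
    dataset = ds.dataset("s3://my-bucket/prefix/", format="parquet")  # hypothetical

    insert_date = datetime.date.today().isoformat()

    # batch_writer buffers put_item calls into BatchWriteItem requests
    with table.batch_writer() as batch:
        for record_batch in dataset.to_batches():
            for item in record_batch.to_pylist():
                item["insert_date"] = insert_date   # the one added column
                batch.put_item(Item=item)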
-1 votes, 1 answer
Extracting multiple Excel files as pandas DataFrames
I'm trying to create a data ingestion routine that loads data from multiple Excel files, each with multiple tabs and columns, into a pandas DataFrame. The tab structure is the same in every file. Any help would be appreciated!
folder…

Harsh780
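
Since every workbook shares the same tab structure, one straightforward routine is to glob the folder and let pandas return all sheets at once (sheet_name=None yields a dict of DataFrames), then concatenate. The folder name below is a placeholder:

    from pathlib import Path
    import pandas as pd

    frames = []
    for path in Path("excel_folder").glob("*.xlsx"):     # placeholder folder
        # sheet_name=None -> {tab_name: DataFrame} for every tab in the file
        for sheet, df in pd.read_excel(path, sheet_name=None).items():
            df["source_file"] = path.name
            df["source_sheet"] = sheet
            frames.append(df)

    combined = pd.concat(frames, ignore_index=True)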
-1 votes, 1 answer
Ingest unstructured file into Snowflake table
I have a file with 200 rows. When I load it into a Snowflake table it lands as 200 rows, but what I want is one row containing the data for all 200 rows.
create or replace table sample_test_single_col (LOADED_AT timestamp, FILENAME string, single_col…

CodeM
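
One way to land a whole file as a single row is to read it client-side and insert the contents as one value with the Python connector. The sketch below assumes the truncated third column in the DDL is a plain string column and uses placeholder credentials:

    from datetime import datetime, timezone
    from pathlib import Path
    import snowflake.connector   # pip install snowflake-connector-python

    path = Path("sample.txt")          # hypothetical file with 200 rows
    content = path.read_text()         # all 200 lines as one string

    conn = snowflake.connector.connect(
        account="my_account", user="my_user", password="...",  # placeholders
        database="my_db", schema="public",
    )
    conn.cursor().execute(
        "INSERT INTO sample_test_single_col (LOADED_AT, FILENAME, SINGLE_COL) "
        "VALUES (%s, %s, %s)",   # SINGLE_COL is assumed from the truncated DDL
        (datetime.now(timezone.utc), path.name, content),
    )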
-1 votes, 1 answer
Load file from Cloud Storage to BigQuery into a single string column
We are designing a new ingestion framework (Cloud Storage -> BigQuery) using Cloud Functions. However, we receive some files (JSON, CSV) that are corrupted and cannot be inserted as-is (bad field names, missing columns, etc.), not even as external…

a54i
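
A common landing-zone pattern for files that won't parse is to load each line into one STRING column by declaring CSV with a delimiter byte that never occurs and quoting disabled, then repairing the data in SQL afterwards. A sketch with a placeholder URI and table:

    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        field_delimiter="\u00ff",   # a byte that should never appear in the data
        quote_character="",         # disable quoting; keep each line verbatim
        schema=[bigquery.SchemaField("raw_line", "STRING")],
    )
    job = client.load_table_from_uri(
        "gs://my-bucket/incoming/file.csv",    # placeholder URI
        "my_project.my_dataset.raw_landing",   # placeholder table
        job_config=job_config,
    )
    job.result()   # waits for the load to finish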
-2 votes, 1 answer
Data ingestion - TypeError: cannot unpack non-iterable NoneType object
I am getting this error in the data ingestion part of my training pipeline. I am trying to run training_pipeline.py and this error shows up.
Full traceback:
Traceback (most recent call last):
File "src\pipelines\training_pipeline.py", line 12, in…

bhavay bukkal
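
The full traceback is cut off, but this exception almost always means the caller unpacks a tuple from a function that returned None, typically because a return statement is missing on some path. A minimal reproduction, with a hypothetical function name modeled on common training-pipeline tutorials:

    # BUG: the function falls through without a return, so it returns None
    def initiate_data_ingestion():        # hypothetical name
        train_path = "artifacts/train.csv"
        test_path = "artifacts/test.csv"
        # missing: return train_path, test_path

    train_path, test_path = initiate_data_ingestion()
    # TypeError: cannot unpack non-iterable NoneType object

    # Fix: make every code path return the tuple the caller expects:
    #     return train_path, test_path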
-2 votes, 1 answer
How to read CSV files stored on an ADLS path without downloading them locally
The command to find the file is as follows:
hdfs dfs -ls {adls file location path}
What is the command to read the listed file?

Shweta P
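
With the adlfs fsspec driver installed, pandas can read an abfss:// URL in place, with no local download; the account, container, and key below are placeholders:

    import pandas as pd   # also requires: pip install adlfs

    df = pd.read_csv(
        "abfss://mycontainer@myaccount.dfs.core.windows.net/path/data.csv",
        storage_options={"account_name": "myaccount",   # placeholders
                         "account_key": "..."},
    )
    print(df.head())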
-2 votes, 1 answer
Ingesting CSV data into Hadoop
Currently I'm trying to ingest data into HDFS. The data I'm trying to ingest is CSV.
Hadoop 3.1.1 is installed on Ubuntu.
A data sample is stored at /home/hadoop/test.csv.
I've tried:
source1
hadoop@ambari:~$ hdfs dfs -put /home/hadoop/test.csv…

yuliansen
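
The truncated command is already the standard route; a sketch that completes the same put from Python via subprocess, with a hypothetical HDFS target directory and a listing to verify the copy:

    import subprocess

    target = "/user/hadoop/ingest"   # hypothetical HDFS directory

    subprocess.run(["hdfs", "dfs", "-mkdir", "-p", target], check=True)
    # -f overwrites an existing file of the same name
    subprocess.run(
        ["hdfs", "dfs", "-put", "-f", "/home/hadoop/test.csv", target],
        check=True,
    )
    subprocess.run(["hdfs", "dfs", "-ls", target], check=True)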