Questions tagged [kiba-etl]
36 questions
1
vote
1 answer
Is database processing supported for the non-Pro version?
I'm new to Ruby and want to do a proof of concept comparing Kiba with Apache Camel for an ETL project.
The differences from the Pro version regarding database support are not clear to me. So, what database processing can be done with the non-Pro version of Kiba?
Seems like all…

Mozart
- 13
- 2
1
vote
1 answer
Best place to check headers for a CSV file with Kiba ETL
I need to check that:
the header line is present
the header contains a specific set of headers
What is the best place to do that? I have some possible solutions but don't know which is the most idiomatic one
Check before running the full ETL, for example before the…

djtal64
- 425
- 4
- 9
1
vote
3 answers
ETL to CSV files, split up and then pushed to S3 to be consumed by Redshift
Just getting started with Kiba, didn't find anything obvious, but I could be just channeling my inner child (who looks for their shoes by staring at the ceiling).
I want to dump a very large table to Amazon Redshift. It seems that the fastest way…

Ken Mayer
- 1,884
- 1
- 13
- 16
1
vote
1 answer
Is there a sample implementation of a Kiba ETL job using an S3 bucket with CSV files as the source and an S3 bucket as the destination as well?
I have a CSV file in S3 and want to transform some columns and put the result in another S3 bucket, or sometimes in the same bucket but in a different folder. Can I achieve this using Kiba? If possible, do I need to store the CSV data in a database first…

Yakitate
- 81
- 1
- 5
1
vote
1 answer
Kiba-etl Multiple Transformation-Multiple Destination
I am trying to apply multiple transformations and distribute the results to multiple destinations.
For example:
orginal.csv:
title
movies1
movies2
movies3
movies4
adding to the .themoviedb and it is transformed to this…
1
vote
1 answer
Kiba: "Incremental Sync" between tables
My goal is to map entries from a large table to a smaller one, respecting a query based on a GROUP BY statement. I want to sync them through Kiba in an incremental way, i.e., without rewriting unchanged entries.
Is Kiba able to identify and run the…

Ricardo Amigo
- 11
- 2
1
vote
0 answers
ETL flow for taking data from a remote service, transforming it to a local ORM Model, then setting up relationships?
I recently set up my first "etl" flow to take data from a remote service, modify it to fit my local models, then save it. Now that I've finished, it feels rather grotesque for a few reasons:
my source is the JSON from the remote service
my transform…

ElderFain
- 93
- 5
1
vote
2 answers
Is it possible to skip loading a row using the kiba-etl gem?
Is there a way I can skip loading certain rows if I deem the row invalid using the kiba-etl gem?
For example, if there is a validation that must be passed before I load it into the system or errors that occur and I still need to push the data into…

TheAznShumai
- 53
- 5
1
vote
1 answer
How to run Kiba ETL in a Rails environment?
I have to load data into a Spree application. Spree makes use of Rails Engines.
All examples use pretty print or CSV destinations, but I want to use Spree models in the destination, e.g. SpreeModel.create!(row)
I tried to do rails runner "exec('kiba…

mardocp
- 13
- 3
0
votes
1 answer
Can Kiba support "Bulk" destination instead of 1 by 1?
I understand Kiba's core is to process rows 1 by 1. And I want this, until the destination step.
I want to push the transformed data to a Kafka Topic, however it's preferred to do it in Bulk rather than individually. Is this possible?
Assuming we…

Jorge G
- 1
0
votes
1 answer
Reorder rows in a Kiba job
I have a Kiba job that takes a CSV file (with Kiba::Common::Sources::CSV), enriches its data, merges some rows (with the ChainableAggregateDestination destination described here) and saves it to another CSV file (with…

Spone
- 1,324
- 1
- 9
- 20
0
votes
1 answer
kiba-etl Pattern to split transformations into independent pipelines
Kiba is a very small library, and it is my understanding that most of its value is derived from enforcing a modular architecture of small independent transformations.
However, it seems to me that the model of a series of serial transformations does…

mollerhoj
- 1,298
- 1
- 10
- 18
0
votes
1 answer
How to structure a Kiba project that needs to do multiple HTTP calls
I'm looking at writing one of our ETL (or ETL-like) processes in Kiba and I wonder how to structure it. The main question I have is the overall architecture. The process works roughly like this:
Fetch data from an HTTP endpoint.
For each item…

ujh
- 4,023
- 3
- 27
- 31
0
votes
2 answers
Transpose CSV rows and columns during ETL process using Kiba (or plain Ruby)
A third-party system produces an HTML table of parent-teacher bookings:
Blocks Teacher 1 Teacher 2 Teacher 3
3:00 pm Stu A Stu B
3:10 pm Stu B Stu C
...
5:50 pm Stu D Stu A Stu E
The number…

Matthew
- 1,300
- 12
- 30
0
votes
1 answer
Is there a standard pattern for invoking related pipelines in Kiba ETL?
I'm working on an ETL pipeline with Kiba which imports into multiple, related models in my Rails app. For example, I have records which have many images. There might also be collections which contain many records.
The source of my data will be…

edjones
- 3
- 1