I recently set up my first "ETL" flow to pull data from a remote service, transform it to fit my local models, and then save it. Now that I've finished, it feels rather grotesque, for a few reasons:
- My source is the JSON from the remote service.
- My transform replaces each row with a newly defined local model, based on the original source row.
- The transform also looks at various values and defines additional local relationships.
- The destination then calls .save on my newly replaced rows, which are now models in the ORM.
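The replace-row-with-model step can be sketched in plain Ruby (no ORM or ETL gem required). `LocalModel`, the remote field names, and the stub `save` are all hypothetical stand-ins for a real ORM model:

```ruby
# Hypothetical stand-in for an ORM model; in a real app this would be an
# ActiveRecord (or similar) class whose #save persists the record.
LocalModel = Struct.new(:name, :external_id, keyword_init: true) do
  def save
    true # a real model would hit the database here
  end
end

# A transform class whose #process receives a row and returns whatever
# should flow to the next step -- here, the model object itself.
class ReplaceRowWithModel
  def process(row)
    LocalModel.new(
      name:        row["displayName"], # remote field names are invented
      external_id: row["id"]
    )
  end
end
```

From here on, every downstream step receives a model rather than a hash, which is exactly the "supplanting" described above.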
- How am I supposed to create local records based on an external data source? My models don't look like the remote source's. Is it right to supplant a row entry with my new Model object?
- If I am supposed to replace the value of row with my local model, then I presume I should split each subsequent action into a transform on that new row (now a model)?
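One way to split the subsequent actions is to give each transform a single job and let later transforms receive whatever the earlier one returned. A plain-Ruby sketch; `Item`, the two transforms, and the pipeline runner are all invented for illustration:

```ruby
# Hypothetical mutable model standing in for an ORM object.
Item = Struct.new(:price, :category, keyword_init: true)

# Each transform does one thing and passes the (possibly mutated) model on.
class AssignCategory
  def process(item)
    item.category = item.price > 100 ? :premium : :standard
    item
  end
end

class DropFreebies
  # Returning nil filters the row out of the pipeline.
  def process(item)
    item.price.zero? ? nil : item
  end
end

# Chain the transforms by hand, the way an ETL runner would.
def run_pipeline(rows, transforms)
  rows.filter_map do |row|
    transforms.reduce(row) { |r, t| r && t.process(r) }
  end
end
```

Small transforms like these are easier to test individually than one class that does everything.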
In all, my .etl looks like
pre_process do
@some = <Go To DB and fetch data>
@variables = <Setup More Information>
end
source MyRemoteSource
transform DoABunchOfWork, @some, @variables
destination CallSaveOnModels
The DoABunchOfWork class has about six methods that its process method calls to manipulate rows or set up relationships in various ways.
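The shape described for DoABunchOfWork might look like the sketch below. The helper names, the lookup argument, and the relationship logic are all invented; only the `process`-calls-private-helpers structure comes from the description above:

```ruby
# Sketch of a transform whose #process fans out to private helper methods.
class DoABunchOfWork
  def initialize(lookup, options)
    @lookup  = lookup   # e.g. data fetched up front in pre_process
    @options = options
  end

  def process(row)
    row = normalize_keys(row)
    row = attach_owner(row)
    row
  end

  private

  # Invented helper: symbolize the remote JSON's string keys.
  def normalize_keys(row)
    row.transform_keys(&:to_sym)
  end

  # Invented helper: resolve a relationship from pre-fetched data.
  def attach_owner(row)
    row.merge(owner: @lookup[row[:owner_id]])
  end
end
```

Because each helper takes a row and returns a row, any of them could later be promoted to its own transform class without changing its body.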