Questions tagged [luigi]

Luigi is a Python package that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, failure handling, command-line integration, and much more.

For further information, see the documentation at luigi.readthedocs.io.

Getting Luigi

Run pip install luigi to install the latest stable version from PyPI.

For bleeding-edge code, run git clone https://github.com/spotify/luigi and then python setup.py install. Bleeding-edge documentation can be found here.

If you want to run the central scheduler (highly recommended), you need to install Tornado, which is also available from PyPI: pip install tornado.

348 questions
8 votes, 3 answers

What to do when I don't want Luigi to output a file but show the task as complete?

When looping over files with Luigi I do not want to be forced to save empty files just to show that the task was complete, and let the next task check if there are any rows in the txt, etc. How can I have a task showing it succeeded (i.e. the run…
George Pamfilis
8 votes, 2 answers

Where is the luigi config file?

I have installed luigi by pip command and I would like to change the port for the web UI. I tried to find the config file but I couldn't. Do I need to create one?
Iamasupernoob
8 votes, 1 answer

How to enable dynamic requirements in Luigi?

I have built a pipeline of Tasks in Luigi. Because this pipeline is going to be used in different contexts, it may need to include more tasks at the beginning or the end of the pipeline, or even totally different…
Kaleidophon
8 votes, 1 answer

How to Dynamically create a Luigi Task

I am building a wrapper for Luigi Tasks and I ran into a snag with the Register class, which is actually an ABC metaclass and is not picklable when I create a dynamic type. The following code, more or less, is what I'm using to develop the dynamic…
Brian Bruggeman
8 votes, 1 answer

Luigi parameter default values and mocks

I am trying to mock something that supplies a default value for a luigi parameter. A dumb example showing what I'm trying to accomplish: Task under test: import luigi from bar import Bar bar = Bar() class Baz(luigi.Task): qux =…
Don Roby
7 votes, 1 answer

How to handle output with Luigi

I'm trying to grasp how Luigi works, and I get the idea, but actual implementation is a bit harder ;) This is what I have: class MyTask(luigi.Task): x = luigi.IntParameter() def requires(self): return OtherTask(self.x) def…
4c74356b41
7 votes, 1 answer

How to make a Luigi task generate an in-memory list as target

I'm trying to write an ETL pipeline using Luigi. As far as I understand from the documentation, a task in Luigi can generate a target that can be either some type of file storage or a database. To decrease the processing time I would like to have as…
djWann
7 votes, 1 answer

Luigi fails to finish all tasks listed in the require method

Say I have a task with the following dependency structure class ParentTask(luigi.Task): def requires(self): return [ChildTask(classLevel=x) for x in self.class_level_list] def run(self): yadayda The child task runs fine on…
Junchen
6 votes, 1 answer

Luigi: Is there a way to pass 'false' to a bool parameter from the command line?

I have a Luigi task with a boolean parameter that is set to True by default: class MyLuigiTask(luigi.Task): my_bool_param = luigi.BoolParameter(default=True) When I run this task from terminal, I sometimes want to pass that parameter as False,…
DalyaG
6 votes, 1 answer

How to make a Parameter available to all Luigi Tasks?

In the Luigi docs, the use of a luigi.Config class is recommended for global configuration. However, I am running into issues when using such a config class in order to pass a commandline argument to various Tasks in the pipeline. Here's a…
Librarian
6 votes, 2 answers

python luigi localTarget pickle

I am running on Windows 7, Python 2.7 via Anaconda 4.3.17, Luigi 2.4.0, Pandas 0.18, sklearn version 0.18. Per below, I am trying to have a luigi.LocalTarget output be a pickle to store a few different objects (using firstJob) and then read from…
user975
6 votes, 1 answer

Workers dying early due to uneven work distribution in Luigi (2.6.1)

We are trying to run a simple pipeline distributed on a docker swarm cluster. The luigi workers are deployed as replicated docker services. They start successfully and after a few seconds of asking for work to luigi-server, they begin to die due to…
fcisneros
6 votes, 3 answers

Event Handling in Python Luigi

I've been trying to integrate Luigi as our workflow handler. Currently we are using Concourse; however, many of the things we're trying to do are a hassle to get around in Concourse, so we made the switch to Luigi as our dependency manager. No problems…
6 votes, 1 answer

How to configure Luigi task retry correctly?

I am trying to configure Luigi's retry mechanism so that failed tasks will be retried a few times. However, while the task is retried successfully, Luigi exits unsuccessfully: ===== Luigi Execution Summary ===== Scheduled 3 tasks of which: * 2 ran…
Jakabov
6 votes, 2 answers

Organizing files when using Luigi pipeline?

I am using Luigi for my workflow. My workflow is divided into three general parts - import, analysis, export. Within each part, there are multiple Luigi tasks. I could have everything in a single file. But if I want to keep everything separate, as…