"Object of type ... is not JSON serializable" when ListParameters of Task instances #3247

Open
maxgalli opened this issue Aug 10, 2023 · 0 comments

Comments

@maxgalli
Hi, I would like to pass a luigi.ListParameter to a class that inherits from luigi.Task, where the list items are instances of classes that also inherit from luigi.Task, as in the following small reproducer:

import luigi

class SubTask(luigi.Task):
    param = luigi.Parameter()

    def run(self):
        with self.output().open('w') as f:
            f.write(f"SubTask {self.param}\n")

class MainTask(luigi.Task):
    subtasks = luigi.ListParameter()

    def requires(self):
        return self.subtasks

    def run(self):
        with self.output().open('w') as f:
            f.write("MainTask\n")

if __name__ == "__main__":
    subA = SubTask(param='A')
    subB = SubTask(param='B')
    luigi.build([MainTask(subtasks=[subA, subB])], local_scheduler=True)

However, this seems to raise the following error:

...

TypeError: Object of type SubTask is not JSON serializable
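
The wording of this error matches the TypeError that Python's standard json module raises for objects it cannot encode. The sketch below reproduces the message without luigi, on the assumption (not confirmed in this report) that ListParameter serializes its value with json.dumps when a task is instantiated. FakeTask and try_serialize are hypothetical names used only for illustration.

```python
import json


class FakeTask:
    """Hypothetical stand-in for a luigi.Task instance (not luigi's class)."""

    def __init__(self, param):
        self.param = param


def try_serialize(value):
    """Return the JSON string, or the TypeError message if encoding fails."""
    try:
        return json.dumps(value)
    except TypeError as exc:
        return f"TypeError: {exc}"


# A list of plain strings encodes without problems.
strings_result = try_serialize(["A", "B"])
# A list of arbitrary objects does not: json has no default encoding for them.
objects_result = try_serialize([FakeTask("A"), FakeTask("B")])

print(strings_result)
print(objects_result)
```

If that assumption holds, any list whose items are plain JSON types (strings, numbers, lists, dicts) should be accepted, which is why the string-based workaround goes through.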

The only way I can think of to make this work is to instantiate subA and subB before defining MainTask, create a dictionary with the instances as values, and then pass a list of string keys when instantiating MainTask, e.g.:

import luigi

class SubTask(luigi.Task):
    param = luigi.Parameter()

    def run(self):
        with self.output().open('w') as f:
            f.write(f"SubTask {self.param}\n")

subA = SubTask(param='A')
subB = SubTask(param='B')

dct = {
    'A': subA,
    'B': subB
}

class MainTask(luigi.Task):
    subtasks = luigi.ListParameter()

    def requires(self):
        return [dct[i] for i in self.subtasks]

    def run(self):
        with self.output().open('w') as f:
            f.write("MainTask\n")

luigi.build([MainTask(subtasks=['A', 'B'])], local_scheduler=True)

So I have two questions:

  • is there a nicer way to do this that does not require creating instances of SubTask before defining MainTask?
  • would it be a worthwhile addition to luigi to allow running the first snippet without hitting the above-mentioned error?

Thank you,

Massimiliano
