Deleting/Cleanup older TFX runs #69

Open
nadgirsanket opened this issue Sep 15, 2020 · 8 comments

Comments

@nadgirsanket

I'm using MySQL db for storing the artifacts generated via TFX runs.

TFX has been in production for a while. Because many pipelines have run, the MLMD database is filling up, and the large tables have degraded the database's performance.

Is there a way to programmatically and gracefully delete older runs to free up storage and improve DB performance?

@hughmiao
Contributor

Currently no APIs are supported for deleting provenance information, as deleting some of the nodes can easily break the lineage. A best practice is to use different MySQL DB instances to partition those pipeline runs, if they are independent.
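
For illustration, the partitioning suggestion could be sketched as a stable mapping from pipeline name to database name. This is only a sketch under assumptions: `database_for_pipeline`, the `mlmd_shard_` prefix, and the shard count are hypothetical names chosen for the example and are not part of MLMD or TFX. The returned name would then be used as the database in whatever connection settings each pipeline's metadata store is opened with.

```python
import hashlib


def database_for_pipeline(pipeline_name: str,
                          num_databases: int = 8,
                          prefix: str = "mlmd_shard_") -> str:
    """Map a pipeline name to one of `num_databases` database names.

    Hypothetical helper: the prefix and shard count are assumptions.
    A cryptographic hash is used (instead of Python's built-in hash())
    so the mapping stays stable across processes and restarts.
    """
    if num_databases < 1:
        raise ValueError("num_databases must be at least 1")
    digest = hashlib.sha256(pipeline_name.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:8], "big") % num_databases
    return f"{prefix}{shard}"


if __name__ == "__main__":
    for name in ("customer_a_pipeline", "customer_b_pipeline"):
        print(name, "->", database_for_pipeline(name))
```

This only works if the pipelines really are independent, as noted above: a run in one database cannot reference artifacts stored in another.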

qq, how many pipelines are currently stored in your db, and which APIs are most affected in your case?

@nadgirsanket
Author

Thanks for the reply. Would it be a good idea to add this as a feature request? Because at some point, there will be a need to delete runs from the database. Since the data for a single run is stored across multiple tables, it would be a good idea to have a cleanup API.

Currently, I'm using a single database to store all the pipelines (one pipeline per customer). Each pipeline has a unique name. There are more than 15,000 pipelines, each with hundreds of runs of its own.

My requirement is to be able to delete older runs in each pipeline based on some filter criteria.
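
To make "some filter criteria" concrete, here is a rough sketch of one possible policy: select runs older than a given age, but always keep the newest few runs of each pipeline. Everything here is an assumption for illustration. The `(pipeline_name, run_id, created_at)` tuples are a toy stand-in for whatever run records the application keeps, and the function only selects candidates; it does not delete anything from MLMD.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def select_runs_to_delete(runs, now, max_age, keep_latest=1):
    """Return sorted (pipeline_name, run_id) pairs eligible for cleanup.

    `runs` is an iterable of (pipeline_name, run_id, created_at) tuples
    (a hypothetical record shape). A run is selected when it is older
    than `max_age` and is not among the `keep_latest` newest runs of
    its pipeline.
    """
    by_pipeline = defaultdict(list)
    for pipeline_name, run_id, created_at in runs:
        by_pipeline[pipeline_name].append((created_at, run_id))

    selected = []
    for pipeline_name, items in by_pipeline.items():
        items.sort(reverse=True)  # newest first
        for created_at, run_id in items[keep_latest:]:
            if now - created_at > max_age:
                selected.append((pipeline_name, run_id))
    return sorted(selected)


if __name__ == "__main__":
    example_runs = [
        ("p1", "r1", datetime(2020, 1, 1)),
        ("p1", "r2", datetime(2020, 9, 1)),
        ("p2", "r3", datetime(2019, 1, 1)),
    ]
    print(select_runs_to_delete(example_runs,
                                now=datetime(2020, 9, 15),
                                max_age=timedelta(days=30)))
```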

@hughmiao
Contributor

hughmiao commented Sep 16, 2020

> Would it be a good idea to add this as a feature request? Because at some point, there will be a need to delete runs from the database. Since the data for a single run is stored across multiple tables, it would be a good idea to have a cleanup API.

I think it is a good idea to explore the alternatives. Note that pipelines, runs, and how they are used in other runs are defined at the application level, i.e., by TFX in this case. The scope of the subgraph to be deleted needs to be defined carefully. Let's add an FR in TFX and discuss the caveats, alternatives, and potential tooling (e.g., CLI, APIs) for this deployment mode, where all pipelines are kept in a single DB.

> My requirement is, to be able to delete older runs in each pipeline based on some filter criteria.

For example, in this case, abandoning a run of a pipeline may be tricky in TFX: the run may have used an artifact that was generated in a previous run, so we probably need to at least keep the artifacts generated by other runs.
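
A toy sketch of that constraint, assuming a simplified in-memory lineage model rather than the real MLMD schema or API: each run lists the artifact IDs it consumed and produced, and an artifact is only considered safe to delete if no retained run produced or consumed it. The `Run` class and `artifacts_safe_to_delete` are hypothetical names for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """Toy stand-in for a pipeline run (not an MLMD type)."""
    run_id: str
    inputs: frozenset = frozenset()   # artifact IDs the run consumed
    outputs: frozenset = frozenset()  # artifact IDs the run produced


def artifacts_safe_to_delete(runs, run_ids_to_delete):
    """Return artifacts produced by deleted runs that no kept run touches."""
    doomed = set(run_ids_to_delete)

    # Anything a retained run produced or consumed must survive.
    still_needed = set()
    for run in runs:
        if run.run_id not in doomed:
            still_needed |= set(run.inputs) | set(run.outputs)

    # Only outputs of the runs being deleted are candidates at all.
    candidates = set()
    for run in runs:
        if run.run_id in doomed:
            candidates |= set(run.outputs)

    return candidates - still_needed


if __name__ == "__main__":
    runs = [
        Run("run1", outputs=frozenset({"examples", "schema"})),
        Run("run2", inputs=frozenset({"examples"}), outputs=frozenset({"model"})),
    ]
    # "examples" must be kept because run2 still depends on it.
    print(artifacts_safe_to_delete(runs, {"run1"}))
```

Note that deleting `run1` itself would still leave `examples` without a recorded producer, which is the kind of lineage break a real tool would need a policy for.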

/cc some tfx folks: @ruoyu90 , @1025KB

> Due to large tables, the performance of the database has decreased as well.

Note that keeping the runs helps to reason about provenance, e.g., which jobs used a particular dataset.

Apart from using separate DBs to isolate the runs, another alternative is to improve the performance of the APIs that TFX uses. In which phases of your TFX runs have you noticed the performance degradation?

@pselden

pselden commented Oct 27, 2020

I've encountered another issue, #74, about hitting a maximum size limit. I believe isolating the DBs would not help in this case, since the limit is hit by a single pipeline, so some sort of cleanup would be necessary.

@hughmiao
Contributor

@pselden Let's follow up in #74 .

@Bobgy

Bobgy commented Jul 6, 2021

Can we repurpose this issue to be the generic "Deleting/Cleanup old MLMD entries"?
This request is now also coming from Kubeflow Pipelines.

@hughmiao
Contributor

hughmiao commented Jul 9, 2021

/cc KFP folks on the pipeline deletion tools too @neuromage

@rustam-ashurov-mcx

I know that the original issue was created almost 2 years ago, but has there been any progress on this functionality?

5 participants