Support improved transformation metadata from column lineage #2851

Open
davidjgoss opened this issue Jul 8, 2024 · 1 comment

Comments

@davidjgoss
Contributor

davidjgoss commented Jul 8, 2024

The OpenLineage column lineage facet was extended in 1.17.1: each entry in inputFields can now carry a transformations array describing the transformations that apply to that input field in the context of the output field. See OpenLineage/OpenLineage#2756.

Ideally, Marquez should store and serve this data when it is present in OpenLineage events.

Note that the existing transformationType and transformationDescription fields at the output field level still exist but have been deprecated.
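
For illustration, here is a rough sketch of what such a facet could look like and how a consumer might read it. The sample values are made up, and the keys inside each transformation (type, subtype, description, masking) reflect my reading of the 1.17.1 facet, so please check them against the spec. The fallback to the deprecated output-level fields, and the `legacy` marker it sets, are my own assumptions and not something OpenLineage prescribes.

```python
from typing import List

# Hypothetical columnLineage facet payload; dataset and field names are invented.
facet = {
    "fields": {
        "total": {
            "inputFields": [
                {
                    "namespace": "db",
                    "name": "orders",
                    "field": "amount",
                    # New in 1.17.1: transformations scoped to this input field.
                    "transformations": [
                        {
                            "type": "DIRECT",
                            "subtype": "AGGREGATION",
                            "description": "",
                            "masking": False,
                        }
                    ],
                },
                {
                    "namespace": "db",
                    "name": "orders",
                    "field": "customer_id",
                    "transformations": [
                        {
                            "type": "INDIRECT",
                            "subtype": "GROUP_BY",
                            "description": "",
                            "masking": False,
                        }
                    ],
                },
            ],
            # Deprecated output-level fields, still allowed in events.
            "transformationType": "AGGREGATION",
            "transformationDescription": "sum of amount",
        }
    }
}


def transformations_for(output_field: dict, input_field: dict) -> List[dict]:
    """Return per-input transformations, falling back to the deprecated fields."""
    per_input = input_field.get("transformations")
    if per_input:
        return list(per_input)
    legacy_type = output_field.get("transformationType")
    legacy_desc = output_field.get("transformationDescription")
    if legacy_type is None and legacy_desc is None:
        return []
    # Assumed policy: wrap the deprecated pair and flag it as legacy.
    return [{"type": legacy_type, "description": legacy_desc, "legacy": True}]
```

Older producers that only send the deprecated fields would still yield one entry per input field under this policy, which keeps the read path uniform.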

Database

The corresponding table in Marquez would be column_lineage, where each row effectively represents one entry in inputFields. We could either add a second table that joins to it (e.g. column_lineage_transformations) or, perhaps more pragmatically, add a JSON column to the existing table to hold transformations.
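
To make the JSON column option a bit more concrete, here is a minimal sketch. It uses Python's built-in sqlite3 purely so it can run standalone. The table layout is heavily simplified and hypothetical (the real column_lineage table has different columns), and in Marquez's own database the column type and migration would look different.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE column_lineage (
        output_field    TEXT NOT NULL,  -- simplified stand-in for the real keys
        input_field     TEXT NOT NULL,
        transformations TEXT,           -- JSON array; NULL when the event had none
        PRIMARY KEY (output_field, input_field)
    )
    """
)


def save_row(output_field, input_field, transformations=None):
    """Store one inputFields entry, serializing transformations as JSON."""
    payload = json.dumps(transformations) if transformations is not None else None
    conn.execute(
        "INSERT INTO column_lineage VALUES (?, ?, ?)",
        (output_field, input_field, payload),
    )


def load_transformations(output_field, input_field):
    """Read transformations back; rows without any yield an empty list."""
    row = conn.execute(
        "SELECT transformations FROM column_lineage"
        " WHERE output_field = ? AND input_field = ?",
        (output_field, input_field),
    ).fetchone()
    if row is None or row[0] is None:
        return []
    return json.loads(row[0])
```

A nullable column like this would leave existing rows valid without a backfill, which is part of why it seems the more pragmatic of the two options.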

API

The transformations array could be added to the ColumnLineageInputField model which is included in the column lineage response and the dataset response.
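
As a rough sketch of the response shape, the model could gain a list-valued field along these lines. This is a Python stand-in for illustration only: the field set on ColumnLineageInputField is simplified, the Transformation class is a hypothetical name, and `to_response` is not an existing Marquez function.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Transformation:
    # Hypothetical model; attribute names assumed from the OpenLineage facet.
    type: str
    subtype: Optional[str] = None
    description: Optional[str] = None
    masking: Optional[bool] = None


@dataclass(frozen=True)
class ColumnLineageInputField:
    # Simplified stand-in for the existing model, plus the proposed field.
    namespace: str
    dataset: str
    field: str
    transformations: Tuple[Transformation, ...] = ()


def to_response(input_field: ColumnLineageInputField) -> dict:
    """Build a JSON-ready dict; transformations is always a list, possibly empty."""
    body = asdict(input_field)
    body["transformations"] = list(body["transformations"])
    return body


example = ColumnLineageInputField(
    namespace="db",
    dataset="orders",
    field="amount",
    transformations=(Transformation(type="DIRECT", subtype="AGGREGATION"),),
)
print(json.dumps(to_response(example), indent=2))
```

Always emitting the array, even when empty, would let API clients skip a null check. Whether to omit it instead for older events is an open design choice.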

@mattwparas
Contributor

I'd be happy to contribute this change (I would also like to see the feature implemented), but I would probably need a little guidance on how to get started.
