Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#450: Polars spreadsheet IO along with example and test case. #471

Closed

Conversation

swapdewalkar
Copy link
Contributor

@swapdewalkar swapdewalkar commented Oct 18, 2023

PR address #450

Changes

Polars spreadsheet IO along with example and test case.

How I tested this

  • Unit test
  • materialize example
  • script

Notes

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future TODOs are captured in comments
  • Project documentation has been updated if adding/changing functionality.

Comment on lines +37 to +38
from polars.selectors import _selector_proxy_
from polars.datatypes import DataType, DataTypeClass
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@skrawcz Any better way to do this? This is unused import, not adding this import causing issue in materialization step for not finding this classes.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually I don't understand, why are they required if they're not referenced below?


def save_data(self, data: DATAFRAME_TYPE) -> Dict[str, Any]:
data.write_excel(self.workbook, **self._get_saving_kwargs())
return utils.get_file_metadata(self.workbook)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this might not work if it's a workbook object. So probably want to check the type here/pull out some information if possible.

@@ -77,6 +79,24 @@ def test_polars_feather(tmp_path: pathlib.Path) -> None:
assert metadata["path"] == file_path


def test_polars_spreadsheet(df: pl.DataFrame, tmp_path: pathlib.Path) -> None:
file = tmp_path / "test.excel"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will break.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do this this will break?

Copy link
Collaborator

@skrawcz skrawcz Nov 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 I don't recall. 😅 But I think I might have been referring to how the code looked previously? Otherwise from reading things, you can pass a path and it'll save to it -- so this won't break as far as I can tell without running the code :)

@skrawcz
Copy link
Collaborator

skrawcz commented Oct 31, 2023

@swapdewalkar how are things going? Do you need any help?

@swapdewalkar
Copy link
Contributor Author

@skrawcz I will try to finish this week?

@skrawcz
Copy link
Collaborator

skrawcz commented Nov 22, 2023

@swapdewalkar how are things going? Do you need any help?

Should map to https://pola-rs.github.io/polars/py-polars/html/reference/api/polars.DataFrame.write_excel.html
"""

workbook: Union[Workbook, BytesIO, str, Path] = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we shouldn't allow this to be None.
I don't think the default that polars has makes sense here. So I think we want someone to always input something here to direct where to save it.

Comment on lines +85 to +86
kwargs1 = writer._get_saving_kwargs()
writer.save_data(df)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should save the response from save_data() and validate that.

@swapdewalkar
Copy link
Contributor Author

#776

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants