Python: Add doctest to tox #4285
This is great, thanks @Fokko! I left a few comments.
>>> file_content = input_file.open().read() # Read the contents of the PyArrowFile instance
>>> # Read the contents of the PyArrowFile instance
>>> # Make sure that you have permissions to read/write
>>> # file_content = input_file.open().read()
This makes sense for now to get the checkdocs test passing but we talked about either mocking out S3 with some conftest monkeypatches or using a minio instance for more robustness. I'm not exactly sure if there's a way to inject mocks into checkdocs.
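On injecting mocks: one possible approach, sketched here under assumptions, is to build the doctest namespace by hand so that a docstring can refer to a name supplied by the test harness. `FakeInputFile`, `read_first_bytes`, and `run_docstring_with_fakes` are hypothetical names invented for this sketch (they are not the PyArrowFile API), and the in-memory payload stands in for any real S3 or minio access.

```python
import doctest
import io


class FakeInputFile:
    """Hypothetical in-memory stand-in for a file that would live on S3."""

    def __init__(self, payload: bytes):
        self._payload = payload

    def open(self):
        # BytesIO is seekable, readable, and usable as a context manager.
        return io.BytesIO(self._payload)


def read_first_bytes(input_file, n=4):
    """Return the first ``n`` bytes of ``input_file``.

    The name ``input_file`` is not defined in this docstring; the harness
    below injects it into the doctest namespace.

    >>> read_first_bytes(input_file)
    b'PAR1'
    """
    with input_file.open() as stream:
        return stream.read(n)


def run_docstring_with_fakes(func, fakes):
    """Run the examples in ``func``'s docstring with extra names injected."""
    globs = {func.__name__: func, **fakes}
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, globs, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)  # TestResults with .failed and .attempted


result = run_docstring_with_fakes(
    read_first_bytes, {"input_file": FakeInputFile(b"PAR1 and more bytes")}
)
print(result.failed, result.attempted)
```

When doctests are collected through pytest, the built-in `doctest_namespace` fixture in a conftest.py serves a similar purpose. For examples that cannot run in CI at all, the `# doctest: +SKIP` directive is an alternative to commenting the line out.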
Let me dive into this and see if I can get this working 👍🏻
We should also move the setup.py
from setuptools import setup
setup()
Let me know what you think.
Hey @samredai thanks for the feedback. I think we can remove the
Looks great but I left 2 small comments.
python/setup.cfg
Outdated
fastavro>=1.3.2<1.4.0
hmsclient==0.1.1
boto3
pyarrow
These should be just for tests, right? We've discussed keeping hard requirements to a minimum and using primarily optional dependencies. For example, arrow is required if you load ArrowFileIO.
Good point. Then I think it's worth adding this to the extras_require block, so something like this:
[options.extras_require]
arrow =
    pyarrow
This would allow someone to do pip3 install iceberg[arrow], which will provide all of the iceberg core dependencies plus the optional arrow dependencies. As we expand this, something like pip3 install iceberg[arrow,jdbc] would be valid. This still allows you to just do pip3 install iceberg if you already have a specific pyarrow version in your environment that you want to use instead.
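On the code side, an optional dependency is usually paired with a guarded import so that a missing extra produces an actionable error. A minimal sketch follows, assuming a hypothetical helper named `require_optional`; the helper and its error message are illustrative and not part of the iceberg codebase.

```python
import importlib


def require_optional(module_name: str, extra: str):
    """Import an optional dependency, or raise an error naming the extra.

    Hypothetical helper: ``extra`` is the key under [options.extras_require].
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required for this feature; "
            f"install it with: pip3 install iceberg[{extra}]"
        ) from exc


# A module that would load ArrowFileIO could then start with:
#     pyarrow = require_optional("pyarrow", "arrow")
```

Users who install only the core package never import pyarrow, and users who need it see which extra to install.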
Hey @samredai @rdblue Great point, and I agree. Pulling in boto might be too much if you don't use it at all, same for Hive.
PS: Later on we might also want to look into a full dependency management system for the Python codebase, something like Poetry or PDM. We don't really pin package versions, which might cause problems when a dependency is released with a breaking change (which unfortunately still happens quite a lot).
LGTM, thanks @Fokko!
Looks great! Thank you for getting this working, @Fokko! I really like having doctests enforced. This will be useful for the transforms PR.
Closes #4284
While going through the Python code, I noticed that not all the examples are valid. I've added doctest to check the examples in the CI.
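As a small illustration of what such a check catches, the sketch below runs the docstring examples of two hypothetical functions with the standard library's doctest module. The function names and the `run_examples` helper are made up for this example and do not come from the iceberg codebase.

```python
import doctest


def add(a, b):
    """Add two numbers.

    >>> add(2, 3)
    5
    """
    return a + b


def stale_example(a, b):
    """Docstring whose example no longer matches the code.

    >>> stale_example(2, 3)
    6
    """
    return a + b


def run_examples(func):
    """Run the examples in ``func``'s docstring; return (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    results = doctest.DocTestRunner(verbose=False).run(test)
    return results.failed, results.attempted


print(run_examples(add))            # no failures
print(run_examples(stale_example))  # one failure, reported on stdout
```

In CI the same check is typically run over whole modules, so a stale example fails the build instead of sitting unnoticed in the documentation.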
Some observations: