Adds prototype spark caching code #726

Open
wants to merge 2 commits into
base: main

skrawcz (Collaborator) commented Mar 1, 2024

I haven't tested this, but I think it should be
good enough to try things out; you'd just need to
annotate what you want cached.

To use it, pass the Spark session to the constructor's read_kwargs argument under the key "spark_session" so that cached results can be read back.
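
As a rough illustration of the wiring described above, the sketch below uses a hypothetical stand-in class (`CachingAdapterSketch`, not the actual adapter in h_cache) whose constructor accepts `read_kwargs` and forwards them to a reader. The reader function, the parquet format, and the cache path layout are assumptions made for the example; only the `read_kwargs` argument and the `"spark_session"` key come from the description.

```python
from typing import Any, Callable, Dict, Optional


def read_spark_parquet(path: str, **read_kwargs: Any) -> Any:
    """Hypothetical reader: loads a cached DataFrame using the supplied session."""
    spark = read_kwargs.get("spark_session")
    if spark is None:
        raise ValueError('read_kwargs must contain "spark_session" to read Spark caches')
    # SparkSession.read.parquet is the standard PySpark call for loading parquet.
    return spark.read.parquet(path)


class CachingAdapterSketch:
    """Stand-in for a caching adapter; NOT the real h_cache class."""

    def __init__(
        self,
        cache_path: str,
        read_kwargs: Optional[Dict[str, Any]] = None,
        reader: Callable[..., Any] = read_spark_parquet,
    ) -> None:
        self.cache_path = cache_path
        self.read_kwargs = dict(read_kwargs or {})
        self.reader = reader

    def load(self, node_name: str) -> Any:
        # Assumed layout: one parquet directory per cached node.
        path = f"{self.cache_path}/{node_name}.parquet"
        return self.reader(path, **self.read_kwargs)
```

With a real session this would be called as `CachingAdapterSketch("/tmp/cache", read_kwargs={"spark_session": spark}).load("my_node")`, where `spark` is an existing SparkSession and `my_node` is a placeholder node name.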

Changes

  • h_cache

How I tested this

  • I haven't

Notes

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future TODOs are captured in comments
  • Project documentation has been updated if adding/changing functionality.

@skrawcz skrawcz marked this pull request as ready for review March 1, 2024 17:46
Rather than using persist with PySpark, you could just
tell Spark to read from the cache.
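
A minimal sketch of that alternative, assuming parquet as the storage format and a caller-chosen path (neither is stated in the commit message): write the DataFrame out once, then hand back a DataFrame that reads from the stored copy instead of calling `persist()`. The function name is hypothetical.

```python
from typing import Any


def materialize_and_reload(df: Any, spark: Any, path: str) -> Any:
    """Write `df` to `path`, then return a DataFrame backed by the stored files.

    `df` is expected to be a pyspark.sql.DataFrame and `spark` a SparkSession;
    both are typed as Any so the sketch does not need pyspark at import time.
    """
    # Overwrite any previous copy of the cached result.
    df.write.mode("overwrite").parquet(path)
    # Downstream steps read from storage rather than from a persisted DataFrame.
    return spark.read.parquet(path)
```

One trade-off of this approach is that the stored copy survives the Spark application, whereas a persisted DataFrame is tied to the session that created it.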