Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove import * from codebase and documentation #260

Merged
merged 2 commits into from
Oct 5, 2024

Conversation

paulooctavio
Copy link
Contributor

Proposed changes

This PR tackles issue #75 by eliminating wildcard imports (import *) from both codebase and documentation.

Types of changes

What types of changes does your code introduce to quinn?
Put an x in the boxes that apply

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)

@SemyonSinchenko
Copy link
Collaborator

@paulooctavio Thank you for the contribution! Can you, please, reopen thir PR with a target planning-1.0-release, not main? Because it is a breaking change and I think we should make it part of the planning 1.0 release instead of pushing to the current main

@MrPowers
Copy link
Collaborator

LGTM!

@SemyonSinchenko - what part of this PR is breaking?

@SemyonSinchenko
Copy link
Collaborator

SemyonSinchenko commented Sep 16, 2024

LGTM!

@SemyonSinchenko - what part of this PR is breaking? @MrPowers

It seems to me that extensions won't work with the change.

@paulooctavio
Copy link
Contributor Author

Which ones exactly? Also, shouldn't we have some kind of test to account for breaking changes?

@SemyonSinchenko
Copy link
Collaborator

Which ones exactly? Also, shouldn't we have some kind of test to account for breaking changes?

It seems to me that because you removed import * it won't actually execute the line DataFrame.transform = getattr(DataFrame, "transform", _ext_function) and won't patch a DataFrame class.

@paulooctavio
Copy link
Contributor Author

paulooctavio commented Sep 16, 2024

@SemyonSinchenko How exactly can I check that? I tested it using the code below and it seems to work.

Test code:

from pyspark.sql import SparkSession, DataFrame
import warnings

# Define _ext_function (as from the original dataframe_ext.py file)
def _ext_function(spark, f):
    warnings.warn(
        "Extensions may be removed in future versions. Please use explicit functions instead.",
        category=DeprecationWarning,
        stacklevel=2,
    )
    return f(spark)

# Check if DataFrame already has a transform method
if hasattr(DataFrame, 'transform'):
    print("Removing existing 'transform' method from DataFrame for testing purposes.")
    delattr(DataFrame, 'transform')

# Now patch the DataFrame class with the transform method using _ext_function
DataFrame.transform = getattr(DataFrame, "transform", _ext_function)

# Initialize Spark session
spark = SparkSession.builder.master("local").appName("TestSession").getOrCreate()

# Create a simple DataFrame
data = [(1, "Alice"), (2, "Bob")]
columns = ["id", "name"]
df = spark.createDataFrame(data, columns)

# Check if transform method exists after patch
if hasattr(df, 'transform'):
    print(f"After patch: 'transform' method exists on DataFrame")

    # Test invoking the patched transform method
    def dummy_transform(spark):
        print("Transform function works!")
        return df

    df.transform(dummy_transform)
else:
    print("Patch failed, transform method not found on DataFrame.")

Output:

Removing existing 'transform' method from DataFrame for testing purposes.
24/09/16 20:39:29 WARN Utils: Your hostname, ... resolves to a loopback address: ...; using ... instead (on interface lo)
24/09/16 20:39:29 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
24/09/16 20:39:29 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
After patch: 'transform' method exists on DataFrame
/home/paulooctavio/projects/quinn/try.py:38: DeprecationWarning: Extensions may be removed in future versions. Please use explicit functions instead.
  df.transform(dummy_transform)
Transform function works!

@SemyonSinchenko
Copy link
Collaborator

@paulooctavio Sorry, my fault. I was thinking that the patching won't work with that change. I checked, it works fine. Thank you for your patience!

Copy link
Collaborator

@SemyonSinchenko SemyonSinchenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jeffbrennan jeffbrennan merged commit 7c1332a into mrpowers-io:main Oct 5, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants