
Sourcery Starbot ⭐ refactored hasansalimkanmaz/transformers #1

Open · wants to merge 1 commit into base: main

Conversation

SourceryAI

Thanks for starring sourcery-ai/sourcery ✨ 🌟 ✨

Here's your pull request refactoring your most popular Python repo.

If you want Sourcery to refactor all your Python repos and incoming pull requests, install our bot.

Review changes via command line

To manually merge these changes, make sure you're on the main branch, then run:

```
git fetch https://github.com/sourcery-ai-bot/transformers main
git merge --ff-only FETCH_HEAD
git reset HEAD^
```

@SourceryAI left a comment

Sourcery timed out performing refactorings.

Due to GitHub API limits, only the first 60 comments can be shown.

conftest.py Outdated
```diff
-    make_reports = terminalreporter.config.getoption("--make-reports")
-    if make_reports:
+    if make_reports := terminalreporter.config.getoption("--make-reports"):
```

Function pytest_terminal_summary refactored with the following changes:
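This is the named-expression (walrus) pattern: the assignment and the truthiness test collapse into a single `:=` expression, available since Python 3.8. A minimal runnable sketch, where `get_option` and `OPTIONS` are hypothetical stand-ins for `terminalreporter.config.getoption`:

```python
# Hypothetical stand-in for terminalreporter.config.getoption
OPTIONS = {"--make-reports": "reports/"}

def get_option(name):
    return OPTIONS.get(name)

# Before: assign, then test the name on the next line.
make_reports = get_option("--make-reports")
if make_reports:
    print(f"generating reports in {make_reports}")

# After: the walrus operator binds and tests in one expression;
# make_reports stays available inside the block.
if make_reports := get_option("--make-reports"):
    print(f"generating reports in {make_reports}")
```

Both forms behave identically; the walrus version just removes one line per lookup site.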

```diff
@@ -67,6 +67,7 @@
     you need to go back to main before executing this.
     """
+
```


Lines 83-90 refactored with the following changes:

Comment on lines -238 to -227
```diff
-extras = {}
+extras = {"ja": deps_list("fugashi", "ipadic", "unidic_lite", "unidic")}

-extras["ja"] = deps_list("fugashi", "ipadic", "unidic_lite", "unidic")
```

Lines 238-240 refactored with the following changes:

```diff
-    make_reports = terminalreporter.config.getoption("--make-reports")
-    if make_reports:
+    if make_reports := terminalreporter.config.getoption("--make-reports"):
```

Function pytest_terminal_summary refactored with the following changes:

Comment on lines 65 to 68
```diff
-    if os.path.exists(path):
-        with open(path, "r") as f:
-            results = json.load(f)
-    else:
+    if not os.path.exists(path):
         raise ValueError(f"can't find {path}")
+    with open(path, "r") as f:
+        results = json.load(f)
```

Function get_results refactored with the following changes:
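The change inverts the condition into a guard clause: raise on the error path first, then handle the normal path without nesting. A self-contained sketch of the refactored shape (the `get_results` signature comes from the diff; the temp-file scaffolding and returning directly instead of assigning `results` are additions for the demo):

```python
import json
import os
import tempfile

def get_results(path):
    # Guard clause: fail fast, so the happy path stays unindented.
    if not os.path.exists(path):
        raise ValueError(f"can't find {path}")
    with open(path, "r") as f:
        return json.load(f)

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "results.json")
    with open(path, "w") as f:
        json.dump({"eval_accuracy": 0.91}, f)
    print(get_results(path))  # {'eval_accuracy': 0.91}
```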

Comment on lines 340 to 345
```diff
-    mask_indices = np.asarray([self.random_spans_noise_mask(expandend_input_length) for i in range(batch_size)])
+    mask_indices = np.asarray(
+        [
+            self.random_spans_noise_mask(expandend_input_length)
+            for _ in range(batch_size)
+        ]
+    )
```


Function FlaxDataCollatorForT5MLM.__call__ refactored with the following changes:

Comment on lines -290 to +216
```diff
-        else:
-            if self.train_file is not None:
-                extension = self.train_file.split(".")[-1]
-                assert extension in ["csv", "json"], "`train_file` should be a csv or a json file."
-            if self.validation_file is not None:
-                extension = self.validation_file.split(".")[-1]
-                assert extension in ["csv", "json"], "`validation_file` should be a csv or a json file."
-            if self.test_file is not None:
-                extension = self.test_file.split(".")[-1]
-                assert extension in ["csv", "json"], "`test_file` should be a csv or a json file."
+        if self.train_file is not None:
+            extension = self.train_file.split(".")[-1]
+            assert extension in ["csv", "json"], "`train_file` should be a csv or a json file."
+        if self.validation_file is not None:
+            extension = self.validation_file.split(".")[-1]
+            assert extension in ["csv", "json"], "`validation_file` should be a csv or a json file."
+        if self.test_file is not None:
+            extension = self.test_file.split(".")[-1]
+            assert extension in ["csv", "json"], "`test_file` should be a csv or a json file."
```

Function DataTrainingArguments.__post_init__ refactored with the following changes:

Comment on lines 335 to 340
```diff
-    layer_norm_named_params = set(
-        [
-            layer[-2:]
-            for layer_norm_name in layer_norm_candidates
-            for layer in flat_params.keys()
-            if layer_norm_name in "".join(layer).lower()
-        ]
-    )
+    layer_norm_named_params = {
+        layer[-2:]
+        for layer_norm_name in layer_norm_candidates
+        for layer in flat_params.keys()
+        if layer_norm_name in "".join(layer).lower()
+    }
```


Function create_train_state refactored with the following changes:
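`set([...])` materializes a list only to throw it away; a set comprehension builds the set directly and collapses duplicates as it goes. A toy reproduction, with made-up parameter keys standing in for `flat_params.keys()`:

```python
# Made-up flattened parameter keys, standing in for flat_params.keys()
flat_params = {
    ("encoder", "layer_norm", "scale"): 0,
    ("encoder", "LayerNorm", "bias"): 1,
    ("decoder", "dense", "kernel"): 2,
}
layer_norm_candidates = ["layernorm", "layer_norm"]

# Set comprehension: no intermediate list allocation.
layer_norm_named_params = {
    layer[-2:]
    for layer_norm_name in layer_norm_candidates
    for layer in flat_params.keys()
    if layer_norm_name in "".join(layer).lower()
}
print(layer_norm_named_params)
```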

Comment on lines -384 to +296
```diff
-    schedule_fn = optax.join_schedules(schedules=[warmup_fn, decay_fn], boundaries=[num_warmup_steps])
-    return schedule_fn
+    return optax.join_schedules(
+        schedules=[warmup_fn, decay_fn], boundaries=[num_warmup_steps]
+    )
```

Function create_learning_rate_fn refactored with the following changes:

Comment on lines -401 to +312
```diff
-            batch = shard(batch)
-
-            yield batch
+            yield shard(batch)
```

Function train_data_collator refactored with the following changes:

Comment on lines 418 to 415
```diff
-        batch = {k: np.array(v) for k, v in batch.items()}
-
-        yield batch
+        yield {k: np.array(v) for k, v in batch.items()}
```

Function eval_data_collator refactored with the following changes:

```diff
-            and not any(p["offsets"] == (0, 0) for p in predictions)
+            and all(p["offsets"] != (0, 0) for p in predictions)
```

Function postprocess_qa_predictions refactored with the following changes:
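The two predicates are equivalent by De Morgan's law: `not any(P(x) for x in xs)` is the same as `all(not P(x) for x in xs)`, including on an empty iterable (both are vacuously true). A quick check in the shape of the diff:

```python
# Three cases: no null offsets, one null offset, empty list.
for predictions in (
    [{"offsets": (0, 5)}, {"offsets": (6, 9)}],
    [{"offsets": (0, 0)}, {"offsets": (6, 9)}],
    [],
):
    before = not any(p["offsets"] == (0, 0) for p in predictions)
    after = all(p["offsets"] != (0, 0) for p in predictions)
    assert before == after
print("equivalent on all three cases")
```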

Comment on lines 347 to 341
```diff
+                start_index = int(start_indexes[i])
                 for j in range(end_n_top):
-                    start_index = int(start_indexes[i])
```

Function postprocess_qa_predictions_with_beam_search refactored with the following changes:

Comment on lines -313 to +216
```diff
-        else:
-            if self.train_file is not None:
-                extension = self.train_file.split(".")[-1]
-                assert extension in ["csv", "json"], "`train_file` should be a csv or a json file."
-            if self.validation_file is not None:
-                extension = self.validation_file.split(".")[-1]
-                assert extension in ["csv", "json"], "`validation_file` should be a csv or a json file."
+        if self.train_file is not None:
+            extension = self.train_file.split(".")[-1]
+            assert extension in ["csv", "json"], "`train_file` should be a csv or a json file."
+        if self.validation_file is not None:
+            extension = self.validation_file.split(".")[-1]
+            assert extension in ["csv", "json"], "`validation_file` should be a csv or a json file."
```

Function DataTrainingArguments.__post_init__ refactored with the following changes:

Comment on lines 367 to 366
```diff
-        batch = {k: np.array(v) for k, v in batch.items()}
-
-        yield batch
+        yield {k: np.array(v) for k, v in batch.items()}
```

Function data_loader refactored with the following changes:

Comment on lines -350 to +267
```diff
-            batch = shard(batch)
-
-            yield batch
+            yield shard(batch)
```

Function train_data_collator refactored with the following changes:

Comment on lines 364 to 361
```diff
-        batch = {k: np.array(v) for k, v in batch.items()}
-
-        yield batch
+        yield {k: np.array(v) for k, v in batch.items()}
```

Function eval_data_collator refactored with the following changes:

Comment on lines -472 to +369
```diff
-    label_list = list(unique_labels)
-    label_list.sort()
+    label_list = sorted(unique_labels)
```

Function main refactored with the following changes:

This removes the following comments (why?):

# save checkpoint after each epoch and push checkpoint to the hub
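`sorted()` accepts any iterable (here a set of label strings) and returns a new sorted list in one expression, replacing the build-then-mutate pair. A small check with a hypothetical label set standing in for the `unique_labels` computed in `main()`:

```python
# Hypothetical NER-style label set
unique_labels = {"B-PER", "O", "B-LOC", "I-PER"}

# Before: two statements and an in-place mutation.
label_list = list(unique_labels)
label_list.sort()

# After: one expression, same result.
assert label_list == sorted(unique_labels)
print(sorted(unique_labels))  # ['B-LOC', 'B-PER', 'I-PER', 'O']
```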

Comment on lines -242 to +165
```diff
-    schedule_fn = optax.join_schedules(schedules=[warmup_fn, decay_fn], boundaries=[num_warmup_steps])
-    return schedule_fn
+    return optax.join_schedules(
+        schedules=[warmup_fn, decay_fn], boundaries=[num_warmup_steps]
+    )
```

Function create_learning_rate_fn refactored with the following changes:

Comment on lines -20 to +29
```diff
-    if (
+    return (
         (cp >= 0x4E00 and cp <= 0x9FFF)
-        or (cp >= 0x3400 and cp <= 0x4DBF)  #
-        or (cp >= 0x20000 and cp <= 0x2A6DF)  #
-        or (cp >= 0x2A700 and cp <= 0x2B73F)  #
-        or (cp >= 0x2B740 and cp <= 0x2B81F)  #
-        or (cp >= 0x2B820 and cp <= 0x2CEAF)  #
+        or (cp >= 0x3400 and cp <= 0x4DBF)
+        or (cp >= 0x20000 and cp <= 0x2A6DF)
+        or (cp >= 0x2A700 and cp <= 0x2B73F)
+        or (cp >= 0x2B740 and cp <= 0x2B81F)
+        or (cp >= 0x2B820 and cp <= 0x2CEAF)
         or (cp >= 0xF900 and cp <= 0xFAFF)
-        or (cp >= 0x2F800 and cp <= 0x2FA1F)  #
-    ):  #
-        return True
-
-    return False
+        or (cp >= 0x2F800 and cp <= 0x2FA1F)
+    )
```

Function _is_chinese_char refactored with the following changes:

Comment on lines -48 to +47
```diff
-        chinese_word = len(token) > 1 and is_chinese(token)
-        if chinese_word:
+        if chinese_word := len(token) > 1 and is_chinese(token):
             word_set.add(token)
-    word_list = list(word_set)
-    return word_list
+    return list(word_set)
```

Function get_chinese_word refactored with the following changes:

Comment on lines -58 to +53
```diff
-    max_word_len = max([len(w) for w in chinese_word_set])
+    max_word_len = max(len(w) for w in chinese_word_set)
```

Function add_sub_symbol refactored with the following changes:
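Passing a generator expression straight to `max()` avoids allocating the temporary list; `max()` consumes the lengths lazily. A toy example with made-up words:

```python
# Made-up word set, standing in for chinese_word_set
chinese_word_set = {"中国", "北京大学", "你好"}

# Generator expression: no intermediate [len(w) for w in ...] list.
max_word_len = max(len(w) for w in chinese_word_set)
print(max_word_len)  # 4, from "北京大学"
```

Note that both forms raise `ValueError` on an empty set; `max(..., default=0)` guards against that if the set can be empty.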

Comment on lines 234 to 237
```diff
-        bool(training_args.local_rank != -1),
+        training_args.local_rank != -1,
         training_args.fp16,
     )
```


Function main refactored with the following changes:

Comment on lines 67 to 72
```diff
-        output = []
         next(f)  # skip the first line
-        for line in tqdm(f):
-            output.append((" ".join(line[1:5]), line[5], line[6], int(line[-1]) - 1))
+        output = [
+            (" ".join(line[1:5]), line[5], line[6], int(line[-1]) - 1)
+            for line in tqdm(f)
+        ]
```


Function load_rocstories_dataset refactored with the following changes:
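The append loop becomes a single list comprehension. A self-contained sketch with made-up rows in the shape of the parsed ROCStories TSV (header row already consumed, as `next(f)` does in the original):

```python
# Made-up rows mimicking csv.reader output for the ROCStories TSV
rows = [
    ["id1", "s1", "s2", "s3", "s4", "endingA", "endingB", "1"],
    ["id2", "t1", "t2", "t3", "t4", "endingC", "endingD", "2"],
]

# One expression replaces `output = []` plus the append loop:
# (joined story sentences, ending 1, ending 2, zero-based label)
output = [
    (" ".join(line[1:5]), line[5], line[6], int(line[-1]) - 1)
    for line in rows
]
print(output[0])  # ('s1 s2 s3 s4', 'endingA', 'endingB', 0)
```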

Comment on lines -192 to +193
```diff
-        return list(tokenize_and_encode(o) for o in obj)
+        return [tokenize_and_encode(o) for o in obj]
```

Function main refactored with the following changes:

Comment on lines -276 to +271
```diff
-        logger.info("LOOKING AT {} test".format(data_dir))
+        logger.info(f"LOOKING AT {data_dir} test")
```

Function RaceProcessor.get_test_examples refactored with the following changes:
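The `str.format()` calls here convert mechanically to f-strings, which inline the expression at the placeholder; the two forms produce identical output. A quick equivalence check with a hypothetical path:

```python
# Hypothetical data directory
data_dir = "data/RACE"

old_style = "LOOKING AT {} test".format(data_dir)
new_style = f"LOOKING AT {data_dir} test"

assert old_style == new_style
print(new_style)  # LOOKING AT data/RACE test
```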

Comment on lines -289 to +284
```diff
-        files = glob.glob(input_dir + "/*txt")
+        files = glob.glob(f"{input_dir}/*txt")
```

Function RaceProcessor._read_txt refactored with the following changes:

Comment on lines 300 to 296
```diff
-        for _, data_raw in enumerate(lines):
-            race_id = "%s-%s" % (set_type, data_raw["race_id"])
+        for data_raw in lines:
+            race_id = f'{set_type}-{data_raw["race_id"]}'
```

Function RaceProcessor._create_examples refactored with the following changes:

Comment on lines -325 to +320
```diff
-        logger.info("LOOKING AT {} train".format(data_dir))
+        logger.info(f"LOOKING AT {data_dir} train")
```

Function SynonymProcessor.get_train_examples refactored with the following changes:

Comment on lines -330 to +325
```diff
-        logger.info("LOOKING AT {} dev".format(data_dir))
+        logger.info(f"LOOKING AT {data_dir} dev")
```

Function SynonymProcessor.get_dev_examples refactored with the following changes:
