Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for renaming duplicate headers #956

Merged
merged 7 commits into from
Nov 15, 2022

Conversation

fortydegrees
Copy link
Contributor

Added code to automatically rename headers to solve the issue in #129

I tried to add a renameDuplicateHeaders config option but left it out as I am doing the processing in the Parse.parse function where I don't have the ability to read the config. I'm not sure it fits the styling of the codebase to modify the input outside of that function so have left it out.

@pokoli
Copy link
Collaborator

pokoli commented Oct 24, 2022

Hi Thanks for creating the PR.

As starting point we need to have a test case (not a test file). This means creating a function on tests/test-cases.js with a simple file that demostrates the functiona.

I have some concerns about your code:

  • There are prints and other comments that are uselless. Could you please remove them?
  • We have an option to decide if the file has a header or not but your code is always executed no mather if the header is disabled.
  • There is a while loop inside the for that it seems to me that can be simplified.

Could you please address such issues? Thanks

@fortydegrees
Copy link
Contributor Author

fortydegrees commented Oct 25, 2022

Thanks for the quick feedback & sorry for the misunderstanding re: test cases.

I've added a test case, removed the useless comments, and added a conditional based on whether the header is present.

You were right - the renaming code could be refactored. I originally just took the code from that issue thread which worked but was inefficient. I've rewritten it in a simpler way which produces the same result.

Copy link
Collaborator

@pokoli pokoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work on this. Everything looks nice, I just added some comments to improve code redability and ensure everything works as expected. Could you please review them?

papaparse.js Outdated Show resolved Hide resolved
papaparse.js Outdated Show resolved Hide resolved
papaparse.js Outdated Show resolved Hide resolved
tests/test-cases.js Outdated Show resolved Hide resolved
@fortydegrees
Copy link
Contributor Author

Have addressed all comments re: code-cleanup.

Good call on the transformHeader() test. I added it and It originally failed. I added a check for the function when initialising the header in my code. Hopefully this is satisfactory.

I also added an extra test in the case of an existing header matching an intended substitution. An unlikely situation but thought it'd be wise to cover it.

Copy link
Collaborator

@pokoli pokoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everythinkg looks nice now. I leaved some minor issues that should be addressed.

Could you fix them? I think we will be able to merge it once they are addressed.

Thanks for your time on this.

tests/test-cases.js Outdated Show resolved Hide resolved
papaparse.js Outdated
header = config.transformHeader(header, i);
var header_name = header
if (header_map.includes(header_name)){
if (!(header in header_count)){
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its simpler to write as:

count = header_count[header_name] || '1';
header_name = header+separator+count;
header_count[header_name] = count +1;

Removeing the if else lines.

Copy link
Contributor Author

@fortydegrees fortydegrees Nov 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I struggled to make it work without an if statement. The problem being that it would still add the suffix at the end. The best I got was:

let count = header_count[header] || 0
if (count > 0) header_name = header+separator+count;
header_count[header] = count+1

Have committed this for you consideration if you prefer.

papaparse.js Outdated Show resolved Hide resolved
papaparse.js Outdated Show resolved Hide resolved
papaparse.js Outdated Show resolved Hide resolved
@pokoli
Copy link
Collaborator

pokoli commented Nov 14, 2022

@fortydegrees linting is failing, could you have a look?

https:/mholt/PapaParse/actions/runs/3463298598/jobs/5783355405

@fortydegrees
Copy link
Contributor Author

@fortydegrees linting is failing, could you have a look?

https:/mholt/PapaParse/actions/runs/3463298598/jobs/5783355405

Should be fixed now. Thank you for your excellent help in cleaning everything up!

@pokoli pokoli merged commit c1cbe16 into mholt:master Nov 15, 2022
@pokoli
Copy link
Collaborator

pokoli commented Nov 15, 2022

@fortydegrees I've merged your work. Thanks you so much!

@sethmcleod
Copy link

Thank you for resolving this @fortydegrees!

@pokoli Would you mind doing another minor version bump so we can use these changes in production?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants