number of fields in a row has to be based on the header #82

Closed
mykhaylo- opened this issue Apr 9, 2021 · 3 comments
Labels
bug Something isn't working

Comments

@mykhaylo-

To reproduce

Take a CSV file with a three-column header row and two data rows: the first data row has two columns, the second has three. Like this:

First name,Last name,Citizenship
John,Bobkins
Michael,Pepkins,US

When 'readAllWithHeaderAsSequence' is invoked on this file, a CSVFieldNumDifferentException is thrown saying that two columns are expected but three were found. This happens because the 'fieldsNum' variable in CsvFileReader.kt is initialized from the first data row, whereas it should be initialized from the header row.

Expected behavior
The following code should return two rows:

csvReader().open(filePath) {
    readAllWithHeaderAsSequence().forEach {
        . . .
    }
}

Environment

  • kotlin-csv version 0.11.1
  • java version - java8
  • OS: Windows 10
@mykhaylo- mykhaylo- added the bug Something isn't working label Apr 9, 2021
@doyaaaaaken

@mykhaylo-
Thank you for reporting!!
I'll investigate this over the weekend, thanks.

@mykhaylo-

mykhaylo- commented Apr 12, 2021

@doyaaaaaken thank you.
For your convenience, here is the problematic place: https://github.com/doyaaaaaken/kotlin-csv/blob/f4637eb89ed6c1d0d18b23de2405e41294a4c6a9/src/jvmMain/kotlin/com/github/doyaaaaaken/kotlincsv/client/CsvFileReader.kt#L46
Initially the "fieldsNum" variable is null, and when the first data row is processed it is initialized with that row's number of fields. Normally, when the first data row has all the columns set, there is no issue and the parser works well, but when the first row is missing its last column(s), "fieldsNum" is initialized incorrectly. I think the fix is to initialize "fieldsNum" with the header row's number of fields (when "readAllWithHeaderAsSequence" is invoked and a header row is present).
Please see my PR with proposed fix: #83
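
To make the difference concrete, here is a minimal, library-free sketch of the two ways of choosing the expected field count. It is only a simplified model of the behaviour described above, not the actual code in CsvFileReader.kt or in the PR; the names `Mismatch`, `firstMismatch` and `expectFromHeader` are hypothetical and exist only for this illustration.

```kotlin
// Hypothetical model of the field-count check; not the library's real code.
data class Mismatch(val rowIndex: Int, val expected: Int, val actual: Int)

// Returns the first data row whose field count differs from the expected one,
// or null if every row matches. Row indices are 1-based over the data rows.
fun firstMismatch(
    header: List<String>,
    rows: List<List<String>>,
    expectFromHeader: Boolean
): Mismatch? {
    // null means "not known yet", like the initial state of fieldsNum.
    var fieldsNum: Int? = if (expectFromHeader) header.size else null
    for ((index, row) in rows.withIndex()) {
        // If nothing is known yet, the current row defines the expected count.
        val expected = fieldsNum ?: row.size
        fieldsNum = expected
        if (row.size != expected) return Mismatch(index + 1, expected, row.size)
    }
    return null
}

fun main() {
    val header = listOf("First name", "Last name", "Citizenship")
    val rows = listOf(
        listOf("John", "Bobkins"),
        listOf("Michael", "Pepkins", "US")
    )
    // Count taken from the first data row: the complete second row is flagged.
    println(firstMismatch(header, rows, expectFromHeader = false))
    // prints Mismatch(rowIndex=2, expected=2, actual=3)

    // Count taken from the header: the short first row is flagged instead.
    println(firstMismatch(header, rows, expectFromHeader = true))
    // prints Mismatch(rowIndex=1, expected=3, actual=2)
}
```

The first call mirrors the reported error (two fields expected, three found). With the header-based count, the row that stands out is the short one; how the library should then treat such a row is a separate question that this sketch does not try to answer.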

@doyaaaaaken

@mykhaylo-
Fixed in v0.15.2, thanks for your contribution!! 🙇
