-
Notifications
You must be signed in to change notification settings - Fork 196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix issue with duplicate column name #136
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the improvement! Makes sense to produce an error instead of doing nothing if this is unexpected behavior. I left a couple of small hints in the review for touch-ups before we merge.
I wonder how we could help users to load CSV files with duplicate columns in the future.
- should we add an option to force load a CSV with duplicate column names to get the old behavior?
- should we offer to enumerate duplicate column names, like
samename
,samename1
,samename2
?
If we don't offer any resolution at all, users will not be able to resolve duplicate column names. If you get the error, what do you do then? Perform a string replace operation on the first line of text to disambiguate? SwiftCSV would have the power to offer this in a convenient spot since our code is "column" aware.
""" | ||
|
||
XCTAssertThrowsError(try CSV<Named>(string: csvString)) { error in | ||
XCTAssertEqual(error as? CSVParseError, CSVParseError.generic(message: "Duplicate column names found: name")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you try to replace the equality check with a switch-case for this error?
XCTAssertEqual(error as? CSVParseError, CSVParseError.generic(message: "Duplicate column names found: name")) | |
switch error as? CSVParseError { | |
case .generic(message: let message): XCTAssertEqual("Duplicate column names found: name", message) | |
default: XCTFail("Expected CSVParseError.generic") | |
} |
If that works, please revert the Equatable
conformance of the error afterwards.
Motivation: there's just this one place in tests where equatability is used, and the conformance affects public SwiftCSV API. We can't undo that easily after we ship it, and there's no long-term motivation to make the errors equatable in apps. switch-case is the best bet there, too.
SwiftCSV/Parser.swift
Outdated
// Check for duplicate column names | ||
let duplicateColumns = header.duplicates() | ||
if !duplicateColumns.isEmpty { | ||
throw CSVParseError.generic(message: "Duplicate column names found: \(duplicateColumns.joined(separator: ", "))") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. I think this warrants a new error case 👍
case duplicateColumns(columnNames: [String])
or something like it maybe.
SwiftCSV/Extension.swift
Outdated
@@ -0,0 +1,13 @@ | |||
// | |||
// Extension.swift |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rename the file to Array+duplicates.swift
@TomorrowMC I'm preparing a Privacy Manifest release soon. Do you want to finish changes for that? |
@TomorrowMC Friendly ping to see if you want to follow up on this |
Sorry for delay, The problems above have solved and equatable conformance has reverted |
@TomorrowMC Great, thank you! Please add a line to the |
SwiftCSV/ParsingState.swift
Outdated
|
||
|
||
public static func == (lhs: CSVParseError, rhs: CSVParseError) -> Bool { | ||
switch (lhs, rhs) { | ||
case (.generic(let message1), .generic(let message2)): | ||
return message1 == message2 | ||
case (.quotation(let message1), .quotation(let message2)): | ||
return message1 == message2 | ||
case (.duplicateColumns(let columns1), .duplicateColumns(let columns2)): | ||
return columns1 == columns2 | ||
default: | ||
return false | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I believe this is dead code a.t.m. and we can remove this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, has fixed these problems
@DivineDominion Thank you, I'v done these things. It is my first time to contribute to a open source repository, so if there is any I need to do please feel free to reply :) |
The problem of data loss due to duplicate column names has been resolved, now if there is a duplicate column name in the first row, a CSVParseError exception will be thrown indicating the name of the first duplicate column name encountered.