-
Notifications
You must be signed in to change notification settings - Fork 173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix bugs in several calendars with new continuity test #4904
Conversation
@@ -269,7 +269,18 @@ pub(crate) fn midnight<C: ChineseBased>(moment: Moment) -> Moment { | |||
pub(crate) fn new_year_in_sui<C: ChineseBased>(prior_solstice: RataDie) -> (RataDie, RataDie) { | |||
// s1 is prior_solstice | |||
// Using 370 here since solstices are ~365 days apart | |||
// Both solstices should fall in December |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assertion fails for years before about 20,000 BCE. Is this WAI? I'm not sure which factor influences this more:
- Is this Gregorian drift? (some web sites tell me that it drifts by a day every 3216 years)
- Or is this an artifact of floating point error in astronomical calculations?
- Or is this sidereal drift?
This matters because it breaks our invariant that Chinese new year lands on a day that is a positive offset relative to January 19 of the related ISO year. I moved it back to January 1 in this PR, which makes it work for several more millennia, but it still breaks with sufficiently large negative years.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is WAI but we could make the assertion stop failing outside that range, or clamp the Chinese calendar.
AIUI the Gregorian calendar accounts for the precession of the equinoxes (the biggest factor here). There are some other smaller factors that lead to the 3216 number, but the thing is that at that scale the planetary bodies are not that predictable. Basically I'm not convinced there's a sensible way to actually make any predictions about dates that far off for something like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I think this is a combination of Gregorian drift and floating point error.
Tests are failing because I need to regen Chinese cached calendar data (does this break semver or is this new in 1.5?), but I want to wait until we discuss the above question. |
I disagree with the commit changing the bounds of the Chinese calendar. Firstly, the calendar is not that old (even the thing about January 20 in 30 AD is dubious, but that at least works with the math). More importantly, the way the Chinese new year is defined ("such that the winter solstice falls in the eleventh month") mathematically gives you a narrow range of ~30 days in which it may occur. I don't have the math at hand but when implementing this optimization I did the math and. It also matches Aslaksen's The Mathematics of the Chinese Calendar. This is a researched and mathematically accurate set of bounds and I do not think we should get rid of it because floating point error is breaking tests for date ranges that are irrelevant for the calendar. |
It's no longer an optimization since we have unused space there but I'd still prefer to keep it around, if the computer is producing dates outside of that range the computer is probably wrong. |
Please see my comment above regarding the Chinese calendar drift. |
Ah, I was reviewing this on a phone and couldn't see all the comments. I am still leaning towards considering those as cases we shouldn't try to fix. I'm somewhat open to bumping up the new year offset range to allow for more dates, it doesn't cost us anything. We should still document this well and have assertions that between 31 AD and maybe 2000 years from now, the new year falls in the narrower range specified, citing Reingold and Aslasken. |
According to Wolfram Alpha (query):
|
The problem is that everything currently ends up calculating a Should we use a different non-packed intermediate type without so many constraints? |
The range of dates Temporal supports is from -100 million to 100 million days on either side of January 1 1970 (about 273000 AD). |
I understand this. I don't think that affects what I said: These dates are beyond precise predictability and it is not meaningful to talk about them in this way. There is probably going to be a shift in the solstice of that form, but I don't consider those dates as important for the Chinese calendar in the first place.
No, we should not introduce a separate code path. I am somewhat open to relaxing the constraints provided it doesn't have a perf impact (and has the strong assertions for the actual date ranges we care about), but we should not introduce multiple codepaths in calendar code (once a |
I do care about making things work for all dates in the range supported by Temporal. They don't have to be perfectly accurate, but the data model should support them. Even if I increase the date offset to 64 days, we only get to about 20,000 AD/BC which is still only about 10% of the range required by Temporal.
I don't want to introduce another code path, but currently we use |
Brainstorming a few options:
|
Right, I understand your proposal, I don't want to increase the stack size of the Date type here. I think the status quo with an explicit clamp for the solstice and no panic when the years are out of "bounds" is my ideal choice, and I don't think it breaks the data model. However I think expanding to Jan 1 is acceptable too. In general I am fine with restricting our assertions to year ranges we consider relevant. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I prefer the previous solution to this: I was expecting we could bind things without tweaking the algorithm too much. Let's stick to that, but not remove any asserts (just tweak them to only fire for modern year ranges).
let following_solstice = winter_solstice_on_or_before::<C>(prior_solstice + 370); // s2 | ||
debug_assert_eq!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: keep these asserts for modern year ranges (±2000 years?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The asserts are obsolete if I use the bind
functions.
Okay, as demonstrated in #4929, increasing the size of the offset above 32 bits is unavoidable. Given the increased bits, I've now implemented a solution that pins the winter solstice to December 20-23 and allows the new year to land as early as January 19, which is the earliest the second New Moon could occur after December 20. It's not actually possible for a real-life Chinese New Year to land on January 19, because if the Winter Solstice were December 20 and there was a 29-day month starting on December 21, either it or the month after it would be a leap month. However, since we pin the winter solstice, it is possible that there are no winter leap months, and therefore January 19 ends up being a possible output of the algorithm. This solution is still better than the original solution in this PR because it works for a longer range of years and should theoretically be able to work for an arbitrarily long range. |
I already pulled out some of the changes, but if desired, I can split this PR up further. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding the chinese fix: I'm okay both with the current solution and the previous one of using more unused PackedYearInfo bits (provided no stack or data size increases) to set FIRST_NY to 1 and not do any clamping. I'm not sure which I prefer, I think I actually slightly prefer the original approach given that Gregorian drift is real, but either is fine.
components/calendar/src/islamic.rs
Outdated
@@ -221,31 +221,34 @@ impl IslamicTabular { | |||
#[derive(Copy, Clone, Debug, Hash, PartialEq, Eq, PartialOrd, Ord)] | |||
pub(crate) struct IslamicYearInfo { | |||
packed_data: PackedIslamicYearInfo, | |||
/// Is the previous year 355 days (short = 354) | |||
prev_year_long: bool, | |||
prev_year_length: u16, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: this was an attempt to keep stack size small, and similar to the Chinese thing I'm not sure what we gain from breaking the calendrical rule that the calendar is 354 or 355 days (at least in Chinese case it is in part due to Gregorian drift, here this is an actual facet of the calendar's definition)
What years is this failing for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It fails for various years in the islamic-observational calendar in the range under test. I think it always too short (353 days). If you want to keep the bits small, I can compress this down to a 3 or 4 value enum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that would be fine. However, can you post the actual years it fails for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and ideally, I would like to have that assertion stick around for the other calendars
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2024-05-22T23:12:39.835Z WARN [calendrical_calculations::islamic] (ObservationalIslamic) Found year -269 AH with length 353
2024-05-22T23:12:40.852Z WARN [calendrical_calculations::islamic] (ObservationalIslamic) Found year -40 AH with length 353
2024-05-22T23:12:41.775Z WARN [calendrical_calculations::islamic] (ObservationalIslamic) Found year 168 AH with length 353
2024-05-22T23:12:43.965Z WARN [calendrical_calculations::islamic] (ObservationalIslamic) Found year 670 AH with length 353
2024-05-22T23:12:44.374Z WARN [calendrical_calculations::islamic] (ObservationalIslamic) Found year 763 AH with length 353
2024-05-22T23:12:45.868Z WARN [calendrical_calculations::islamic] (ObservationalIslamic) Found year 1107 AH with length 353
2024-05-22T23:12:49.495Z WARN [calendrical_calculations::islamic] (SaudiIslamic) Found year 454 AH with length 353
2024-05-22T23:12:52.285Z WARN [calendrical_calculations::islamic] (SaudiIslamic) Found year 669 AH with length 353
2024-05-22T23:12:55.263Z WARN [calendrical_calculations::islamic] (SaudiIslamic) Found year 899 AH with length 353
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed code that retains the assertion in calendrical_calculations
, and icu_calendar
is fine with all three year lengths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, thanks. these years are in practical range for these calendars so we should ideally figure out what's wrong with the math but til then it's fine to hack it this way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened #4930 and I'll add references in the code
if new_rd - prev_rd == 30 { | ||
lengths[11] = true; | ||
}); | ||
// To maintain invariants for calendar arithmetic, if astronomy finds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: thanks, i was considering a technique like this
enum IslamicYearLength { | ||
L355, | ||
L354, | ||
L353, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: document this as an edge case that shouldn't actually occur (and list the positive years/cals for which it does occur)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
}; | ||
if iso_month < 12 || iso_day < 20 { | ||
#[cfg(feature = "logging")] | ||
log::trace!("({}) Solstice out of bounds: {solstice:?}", C::DEBUG_NAME); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we should upgrade this to a debug assertion for iso_years that are ~±2000years.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kind of feel that the original Jan 1 changes were better, at this point (sorry, I've gotten turned around by the original arguments and the data) but let's first land this.
Branch with individual commits archived at https:/sffc/omnicu/tree/archive/negative-tests |
Fixes #2703
Fixes #4914
Mostly fixes #2713
I wrote a test that checks for invariants on calendar behavior. In addition to ISO round-trip, it tests simple day-based calendar arithmetic, which is supported in all calendars, and it exercises a lot of code paths that aren't otherwise covered, such as month length, year length, and leap year status, which unveiled bugs in multiple calendars (Observational Islamic, Saudi Islamic, Hebrew, Chinese, Coptic, and Ethiopian).
There are several loosely related changes in this PR so I made an effort for it reviewable commit-by-commit.