Leap year questions come up all the time in computing, but if they have a true season, it's now. The end of February is approaching, and developers wonder: does my process know that it's a leap year, and will it behave properly?
People often ask how to use SAS to determine which years are leap years. The complicated answer is:
- Check whether the year is divisible by 4 (MOD function)
- But make an exception when it's divisible by 100 (not a leap year)
- Yeah...except when it's also divisible by 400 (a leap year after all).
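As a rough sketch, those three rules can be combined into one Boolean expression with the MOD function. The data set name leap_by_rule and the variable is_leap are made up for illustration.

```sas
/* Sketch: apply the divisibility rules directly.           */
/* leap_by_rule and is_leap are illustrative names only.    */
data leap_by_rule(keep=year);
  do year=2000 to 2200;
    /* divisible by 4 and not by 100, or divisible by 400 */
    is_leap = (mod(year,4)=0 and mod(year,100) ne 0) or mod(year,400)=0;
    if is_leap then output;
  end;
run;
```

It works, but you have to get every exception right yourself.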
The simple answer is: ask SAS. You can create a SAS date value with the MDY function. Feb 29 is a valid date in leap years; in all other years, MDY returns a missing value (and writes an "Invalid argument" note to the log).
    data leap_years(keep=year);
      length date 8;
      do year=2000 to 2200;
        /* MISSING when Feb 29 not a valid date */
        date=mdy(2,29,year);
        if not missing(date) then output;
      end;
    run;
Here's an excerpt of the result:
2000 2004 2008 2012 2016 2020 2024 /* skip a few */ 2088 2092 2096 2104 2108 2112 2116 2120
Notice how 2000 is included (divisible by 400), but 2100 is not? The year 2100 is not a leap year, and SAS knows it. Did you?
"Leap year considerations" are built into all SAS functions that calculate differences and intervals between date values (such as DATDIF, INTCK, INTNX, etc.).
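Here is a small sketch of what that looks like in practice. The dates and variable names are chosen only for illustration; the functions and their arguments are the standard ones.

```sas
/* Sketch: date functions account for Feb 29 without extra logic. */
data _null_;
  /* Days from Feb 28 to Mar 1 in a leap year and in a non-leap year */
  days_2024 = intck('day', '28FEB2024'd, '01MAR2024'd);      /* 2 */
  days_2023 = intck('day', '28FEB2023'd, '01MAR2023'd);      /* 1 */
  /* Actual number of days from Feb 1 to Mar 1, 2024 */
  feb_days  = datdif('01FEB2024'd, '01MAR2024'd, 'ACT/ACT'); /* 29 */
  /* One year after a leap day, aligned to the same day where possible */
  next_year = intnx('year', '29FEB2024'd, 1, 'same');
  put days_2024= days_2023= feb_days= next_year=date9.;
run;
```

Check the log to see how INTNX handles the year after a leap day, when Feb 29 does not exist.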
More leap year topics
- Read about SAS date values in the doc. One fun excerpt:
Using century dates greater than 4000 might result in incorrect dates. SAS does not consider century years that are divisible by 4000 to be leap years. Computations on dates that use a century date greater than or equal to 4000 might be off by days, depending on the computation. SAS does not consider the years 4000, 8000, 12000, 16000, and 20000 to be leap years.
- For a fun leap year explainer, watch this episode of StarTalk with Neil deGrasse Tyson.
- In the year 9999...: history of leap year and some software bugs
(This article was originally published in 2016, updated in 2024 with new links and additional information.)
8 Comments
Maybe I'm doing something wrong, but the SAS YrDif function doesn't appear to work correctly for persons born on a February 29th. For example, if you run the below cited code, YrDif with a parameter of 'AGE' will return 52 on 28 Feb 2016 for a person born on 29 Feb 1964. The correct age is 51 (they would turn 52 the following day). A slightly more complex routine follows the YrDif function which calculates the age correctly. Output is shown below the code.
    DATA _NULL_;
      birthDate = '29FEB1964'd;
      asOfDate  = '28FEB2016'd;

      /** This simple age calculation will correctly calculate the age of
          everyone *except* those born on a February 29th. **/
      AgePerSAS = INT( yrDif(birthDate, asOfDate, 'AGE') );

      PUT 'NOTE- ';
      PUT 'NOTE- ';
      PUT 'NOTE: BirthDate = ' BirthDate MMDDYY10.;
      PUT 'NOTE- AsOfDate = ' AsOfDate MMDDYY10.;
      PUT 'NOTE- AgePerSAS = ' AgePerSAS;

      /** This more complex age calculation will correctly calculate the age of
          everyone *including* those born on a February 29th. **/
      ActualAge = floor( ( intck('month', birthDate, asOfDate)
                           - ( day(asOfDate) < day(birthDate) ) ) / 12 );

      PUT 'NOTE- ActualAge = ' ActualAge;
      PUT 'NOTE- ';
      PUT 'NOTE- ';
    RUN;
Output:
    NOTE: BirthDate = 02/29/1964
          AsOfDate = 02/28/2016
          AgePerSAS = 52
          ActualAge = 51
    NOTE: DATA statement used (Total process time):
          real time 0.00 seconds
          cpu time 0.00 seconds
Regards,
Jim
Jim, this is working as designed. The YRDIF function was originally designed to calculate "age spans" for use in financial and government applications (and not for annotating birthday cards!), so some differences can be expected. Here's how YRDIF works (and I got this from an internal doc written by the developer, who apparently fields questions from leap-year-pedants all of the time...)
The ACT/ACT algorithm is used for computing age, with the following exceptions:
1) if startyear is a leap year, then use the previous year as the start year
2) if endyear is a leap year, then use the next year as the end year
3) if startdate is 29FEB, then use 28FEB
4) if enddate is 29FEB, then use 28FEB
5) after computing, reduce the resultant year count by 1 or 2 as per items (1) and (2)
Don't like/agree with the result? Then use INTCK as you've done in your example.
Ah. Functioning as designed. And it's pretty darned close. I did a quick run of some "leap baby" years (1960, 1964, and 1968) comparing them to the years 2000 - 2016. You're only off one day per year, February 28th. Impact = small. But perhaps something to be aware of.
Jim
    Obs  birthDate  asOfDate   SASAge  correctAge  Year_Type
      1  29FEB1960  28FEB2000      40          39  Leap Year
      2  29FEB1960  28FEB2001      41          40
      3  29FEB1960  28FEB2002      42          41
      4  29FEB1960  28FEB2003      43          42
      5  29FEB1960  28FEB2004      44          43  Leap Year
      6  29FEB1960  28FEB2005      45          44
      7  29FEB1960  28FEB2006      46          45
      8  29FEB1960  28FEB2007      47          46
      9  29FEB1960  28FEB2008      48          47  Leap Year
     10  29FEB1960  28FEB2009      49          48
     11  29FEB1960  28FEB2010      50          49
     12  29FEB1960  28FEB2011      51          50
     13  29FEB1960  28FEB2012      52          51  Leap Year
     14  29FEB1960  28FEB2013      53          52
     15  29FEB1960  28FEB2014      54          53
     16  29FEB1960  28FEB2015      55          54
     17  29FEB1960  28FEB2016      56          55  Leap Year
     18  29FEB1964  28FEB2000      36          35  Leap Year
     19  29FEB1964  28FEB2001      37          36
     20  29FEB1964  28FEB2002      38          37
     21  29FEB1964  28FEB2003      39          38
     22  29FEB1964  28FEB2004      40          39  Leap Year
     23  29FEB1964  28FEB2005      41          40
     24  29FEB1964  28FEB2006      42          41
     25  29FEB1964  28FEB2007      43          42
     26  29FEB1964  28FEB2008      44          43  Leap Year
     27  29FEB1964  28FEB2009      45          44
     28  29FEB1964  28FEB2010      46          45
     29  29FEB1964  28FEB2011      47          46
     30  29FEB1964  28FEB2012      48          47  Leap Year
     31  29FEB1964  28FEB2013      49          48
     32  29FEB1964  28FEB2014      50          49
     33  29FEB1964  28FEB2015      51          50
     34  29FEB1964  28FEB2016      52          51  Leap Year
     35  29FEB1968  28FEB2000      32          31  Leap Year
     36  29FEB1968  28FEB2001      33          32
     37  29FEB1968  28FEB2002      34          33
     38  29FEB1968  28FEB2003      35          34
     39  29FEB1968  28FEB2004      36          35  Leap Year
     40  29FEB1968  28FEB2005      37          36
     41  29FEB1968  28FEB2006      38          37
     42  29FEB1968  28FEB2007      39          38
     43  29FEB1968  28FEB2008      40          39  Leap Year
     44  29FEB1968  28FEB2009      41          40
     45  29FEB1968  28FEB2010      42          41
     46  29FEB1968  28FEB2011      43          42
     47  29FEB1968  28FEB2012      44          43  Leap Year
     48  29FEB1968  28FEB2013      45          44
     49  29FEB1968  28FEB2014      46          45
     50  29FEB1968  28FEB2015      47          46
     51  29FEB1968  28FEB2016      48          47  Leap Year
Jim, you're correct - the ACT/ACT and AGE algorithms are different. Here's a SAS program that emulates the ACT/ACT algorithm and compares to YRDIF for ACT/ACT and AGE.
    data _null_;
      input start: date9. end: date9.;
      inc365=0;
      inc366=0;
      startyr=year(start);
      endyr=year(end);

      /* days from the start date through the end of the start year */
      yeardays=mdy(12,31,startyr)-mdy(1,1,startyr)+1;
      ndays=mdy(1,1,startyr+1)-start;
      if yeardays=365 then inc365+ndays;
      else inc366+ndays;

      /* whole years between the start year and the end year */
      do i=startyr+1 to endyr-1;
        yeardays=mdy(12,31,i)-mdy(1,1,i)+1;
        if yeardays=365 then inc365+365;
        else inc366+366;
      end;

      /* days from the start of the end year up to the end date */
      yeardays=mdy(12,31,endyr)-mdy(1,1,endyr)+1;
      ndays=end-mdy(1,1,endyr); /* last day not included */
      if yeardays=365 then inc365+ndays;
      else inc366+ndays;

      total=inc365/365+inc366/366;
      by_yrdif=yrdif(start,end,'ACT/ACT');
      by_yrdif_age=yrdif(start,end,'AGE');
      matched_DIF=(total=by_yrdif);
      matched_AGE=(int(total)=by_yrdif_age);
      put start=date9. end=date9. matched_DIF= total= by_yrdif= by_yrdif_age= matched_AGE= ;
    cards;
    29FEB1960 28FEB2000
    29FEB1960 28FEB2001
    29FEB1960 28FEB2002
    29FEB1960 28FEB2003
    29FEB1960 28FEB2004
    29FEB1960 28FEB2005
    29FEB1960 28FEB2006
    29FEB1960 28FEB2007
    29FEB1960 28FEB2008
    29FEB1960 28FEB2009
    29FEB1960 28FEB2010
    29FEB1960 28FEB2011
    29FEB1960 28FEB2012
    29FEB1960 28FEB2013
    29FEB1960 28FEB2014
    29FEB1960 28FEB2015
    29FEB1960 28FEB2016
    29FEB1964 28FEB2000
    29FEB1964 28FEB2001
    29FEB1964 28FEB2002
    29FEB1964 28FEB2003
    29FEB1964 28FEB2004
    29FEB1964 28FEB2005
    29FEB1964 28FEB2006
    29FEB1964 28FEB2007
    29FEB1964 28FEB2008
    29FEB1964 28FEB2009
    29FEB1964 28FEB2010
    29FEB1964 28FEB2011
    29FEB1964 28FEB2012
    29FEB1964 28FEB2013
    29FEB1964 28FEB2014
    29FEB1964 28FEB2015
    29FEB1964 28FEB2016
    29FEB1968 28FEB2000
    29FEB1968 28FEB2001
    29FEB1968 28FEB2002
    29FEB1968 28FEB2003
    29FEB1968 28FEB2004
    29FEB1968 28FEB2005
    29FEB1968 28FEB2006
    29FEB1968 28FEB2007
    29FEB1968 28FEB2008
    29FEB1968 28FEB2009
    29FEB1968 28FEB2010
    29FEB1968 28FEB2011
    29FEB1968 28FEB2012
    29FEB1968 28FEB2013
    29FEB1968 28FEB2014
    29FEB1968 28FEB2015
    29FEB1968 28FEB2016
    run;
When I run the sample SAS code backwards (do year=2200 to 1 by -1;), I notice that the MDY function does not report any leap years earlier than 1584.
The years 1600, 1596, 1592, 1588 and 1584 are apparently leap years, but that's it: no leap years are reported earlier than that.
I'm wondering why.
Is it related to Pope Gregory's edict in 1582?
Indeed! The whole sordid history is chronicled here.
Why do I get an ERROR?
435 %let prodar =2023;
436 %let refar =%eval(&prodar.-1);
451
452 /* 1 Leap day for production year (prodar)
455
456 data _prodleapday_;
457 length prodday 8;
458 prodday=mdy(02,29,&prodar.);
459 if missing(prodday) then call symputx('dat',cats(&prodar.,'0228'));
460 else call symputx('dat',cats(&prodar.,'0229'));
461 run;
NOTE: Invalid argument to function MDY(2,29,2023) at line 458 column 9.
prodday=. _ERROR_=1 _N_=1
NOTE: Mathematical operations could not be performed at the following
places. The results of the operations have been set to missing values.
Each place is given by: (Number of times) at (Line):(Column).
1 at 458:9
NOTE: The data set WORK._PRODLEAPDAY_ has 1 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
462 %put &dat.;
20230228
463
464 /* 2 Leap day for reference year (refar)
465
466 data _refleapday_;
467 length refday 8;
468 refday=mdy(02,29,&refar.);
469 if missing(refday) then call symputx('forradat',cats(&refar.,'0228'));
470 else call symputx('forradat',cats(&refar.,'0229'));
471 run;
NOTE: Invalid argument to function MDY(2,29,2022) at line 468 column 8.
refday=. _ERROR_=1 _N_=1
NOTE: Mathematical operations could not be performed at the following
places. The results of the operations have been set to missing values.
Each place is given by: (Number of times) at (Line):(Column).
1 at 468:8
NOTE: The data set WORK._REFLEAPDAY_ has 1 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
472 %put &forradat.;
20220228
Hi, this isn't an error exactly -- it's a NOTE that indicates the value (02/29/2023) is not a valid argument to MDY...because it's not a valid date in a non-leap-year. As far as I can see this is working as you intend.
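If you would rather not see the NOTE at all, one possible approach is to skip the MDY(2,29,...) test and ask for the last day of February directly. This is only a sketch: it reuses the macro variable names from your program, and the variable lastfeb is made up for illustration.

```sas
/* Sketch: last day of February without calling MDY(2,29,...). */
/* prodar and dat follow the question above; lastfeb is an     */
/* illustrative name.                                           */
%let prodar = 2023;

data _null_;
  /* INTNX with the 'end' alignment returns the last day of the month */
  lastfeb = intnx('month', mdy(2,1,&prodar.), 0, 'end');
  /* YYMMDDN8. writes the date as yyyymmdd with no separators */
  call symputx('dat', put(lastfeb, yymmddn8.));
run;

%put &dat.;
```

Because Feb 1 is always a valid date, MDY never receives an invalid argument here, and for 2023 this should give the same 20230228 value that your current code produces.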