Jedi SAS Tricks – Roll Your Own Function

A SAS user (who lives in the the US) emailed me a question about SAS functions. He was reading UTC (Coordinated Universal Time) datetime values from server logs, and to make future calculations and comparisons easier, he wanted to transform the value to local datetime.  The INTNX() function worked great, but did not account for daylight savings time (DST), and, he wondered, “Does SAS have an interval calculating function that takes DST into account?”  I couldn’t find one, so I finally had to answer "No, it doesn't".  And I really hated having to say that!  I thought “Surely, the power of the SAS can bend this data to my will!”  And off I went to conquer the problem using the SAS Function Compiler procedure, affectionately known as PROC FCMP.

Introduced in SAS 9.1, PROC FCMP lets you to create custom SAS functions and CALL routines.  PROC FCMP syntax is very much like DATA step, and you can leverage most features of Base SAS when defining your routines. The custom functions and CALL routines created are used in subsequent DATA steps or SAS procedures just as you would any standard SAS function or CALL routine.

First I searched for some starter code, and found this excellent tidbit “Sample 24735: Compute daylight saving time”.  I thought I remembered that Canada and the US shared the days and times they shifted to DST. Because the US recently legislated changes to DST, I searched the Internet and found this article which says the US and Canada still shift on the same days. And I read it on the Internet - so it must be true!!  With the truth now firmly on my side, I set out to write a function what would convert DST to local time, and would work for Canada and US time zones.

The first step was writing a data step prototype which would successfully adjust datetime values. Here is the prototype of the function I want to build:

data TEST;
   /* Set values for testing */
   DT='26JUN12:17:00:00'dt;
   FORMAT DT DATETIME. NEWDATE DATETIME.;
   ZONE=-5;
   year=year((datepart(dt)));
   day1=mdy(1,1,year);
   /* Year2006 DST starts 2nd SUN in March, ends 1st SUN in November. */
   else do;
      dst_month_start=intnx('month',day1,2);
      dst_beg=intnx('week.1',dst_month_start
                  , (weekday(dst_month_start) in (2,3,4,5,6,7))+1);
      dst_month_end=intnx('month',day1,10);
      dst_end=intnx('week.1',dst_month_end,(weekday(dst_month_end) ne 1));
   end;
   NEWDATE=dt+(zone*60*60);
   if dst_beg

This worked like a charm, so now it was time to convert my data step code into a function with PROC FCMP. However, it was annoying to have to look up the correct “minus number” for my time zone every time I used the routine while I was testing. I wanted my function to use the text time zone name instead - ‘Eastern’ for example. So I added a new character variable "Z" and a SELECT group do the lookup for the ZONE variable numeric value. The function takes 2 parameters: DT (a datetime value) and Z (the name of the time zone). Here is my code:

proc fcmp outlib=sasuser.myfunc.UTC2LOCAL;
   function UTC2LOC(dt,z $);
      Z=UPCASE(Z);
      select (z);
         when ('NEWFOUNDLAND') ZONE=-3.5;
         when ('ATLANTIC') ZONE=-4;
         when ('EASTERN') ZONE=-5;
         when ('CENTRAL') ZONE=-6;
         when ('MOUNTAIN') ZONE=-7;
         when ('PACIFIC') ZONE=-8;
         when ('SASKATCHEWAN') ZONE=-6;
         when ('ARIZONA') ZONE=-7;
         when ('ALASKA') ZONE=-9;
         when ('HAWAII') ZONE=-11;
         otherwise do;
            zone=0;
            PUT 'Valid zones: NEWFOUNDLAND, ALLANTIC, EASTERN, '
                'CENTRAL, MOUNTAIN, PACIFIC, ALASKA, ARIZONA, '
                'HAWAII or SASKATCHEWAN.';
         end;
      end;
      year=year((datepart(dt)));
      day1=mdy(1,1,year);
      /* Year2006 DST from 2nd SUN in MAR to 1st SUN in NOV */
      else do;
         dst_month_start=intnx('month',day1,2);
         dst_beg=intnx('week.1',dst_month_start
                       ,(weekday(dst_month_start) in (2,3,4,5,6,7))+1);
         dst_month_end=intnx('month',day1,10);
         dst_end=intnx('week.1',dst_month_end,(weekday(dst_month_end) ne 1));
      end;
      NEWDATE=dt+(zone*3600);
      if dst_beg

And now to test my handiwork! This data step will use the new function to convert a UTC datetime value to local time:

options cmplib=sasuser.myfunc;
data TEST;
   do UTC='31JAN12:17:00:00'dt
         ,'31MAR12:17:00:00'dt
         ,'30JUN12:17:00:00'dt
         ,'30NOV12:17:00:00'dt;
      Newfoundland=UTC2LOC(UTC,'Newfoundland');
      Atlantic=UTC2LOC(UTC,'atlantic');
      Eastern=UTC2LOC(UTC,'EASTERN');
      Central=UTC2LOC(UTC,'Central');
      Saskatchewan=UTC2LOC(UTC,'SASKATCHEWAN');
      Mountain=UTC2LOC(UTC,'Mountain');
      Arizona=UTC2LOC(UTC,'Arizona');
      Alaska=UTC2LOC(UTC,'Alaska');
      Hawaii=UTC2LOC(UTC,'Hawaii');
      output;
   end;
   FORMAT UTC -- HAWAII datetime.;
run;
 
PROC PRINT data=test noobs;
   format UTC -- HAWAII tod.;
RUN;

Output:Listing of the data created using the custom function.
It worked just like I had hoped! But I wondered – would it work in PROC SQL? Let’s see:

proc sql;
select UTC
      ,UTC2LOC(UTC,'EASTERN') format=datetime. as Eastern
   from test
;
quit;

Output:Output of an SQL query using the custom function.

Beautiful! I've decided I like PROC FCMP!!

That's all for this episode. Until the next time, may the SAS be with you!
Mark

PS: You can download the package for this blog posting here.

tags: custom, customizing, programming, SAS Jedi, tips and tricks

15 Comments

  1. Posted May 3, 2012 at 11:24 am | Permalink

    Good article. thanks for the FCMP starter code.

    • Mark Jordan Mark Jordan
      Posted May 3, 2012 at 11:45 am | Permalink

      :-) Glad you found it useful!
      Mark

  2. Vikas
    Posted May 3, 2012 at 2:57 pm | Permalink

    Hi Mark, These codes are really very usefull, however, i am still not able to understand some statements of this codes like following, if you could please elaborate it further.

    dst_month_start=intnx('month',day1,3);
    dst_beg=intnx('week.1',dst_month_start,(weekday(dst_month_start) ne 1));
    dst_month_end=intnx('month',day1,9);
    dst_end=intnx('week.1',dst_month_end
    ,(weekday(dst_month_end) in (6,7))+4);

    Please elabore the meaning of 3 in (intnx('month',day1,3);) and 9 in (dst_month_end=intnx('month',day1,9);). Also, what is week.1 in this with complete definition of
    dst_end=intnx('week.1',dst_month_end
    ,(weekday(dst_month_end) in (6,7))+4);.

    I would really be very grateful for this help.

    Thanks,
    Vikas Sinha

    • Mark Jordan Mark Jordan
      Posted May 8, 2012 at 2:06 pm | Permalink

      Vikas,
      Let's take a block from the code to discuss separately:

      data test;
         /* Here is a datetime value to work with */
         do DT='26JUN11:17:00:00'dt, '26JUN12:17:00:00'dt, '26JUN13:17:00:00'dt;
            /* Extract the YEAR from the datetime value */
            year=year((datepart(dt)));
            /* Find the SAS date value for January 1 of that YEAR */
            day1=mdy(1,1,year);
            /* Find the month that DST starts for that year */
            dst_month_start=intnx('month',day1,3);
            /* Adjust the date to the first Sunday of the month, if the first is not Sunday */
            dst_beg=intnx('week.1',dst_month_start,(weekday(dst_month_start) ne 1));
            output;
         end;
         format d: date9. dt datetime.;
      run;

      In the expression dst_month_start=intnx('month',day1,3), the 3 is the third argument to the INTNX function, which specifies the number of intervals to increment the value in the second argument. The first argument specified the interval to use. In this case, we are using INTNX to increment the date (DAY1) by 3 months. If DAY1 was 01JAN2011, the first boundary is crossed at FEB, the next at MAR, the 3rd at APR. So dst_month_start will contain the SAS date for 01APR2011.

      In the expression dst_beg=intnx('week.1',dst_month_start,(weekday(dst_month_start) ne 1)), the 'week.1' interval specification is using a shift index - in this case, "weekly intervals, beginning on Sunday". Intervals and shift indexes are a complex topic - you can read more about this here and here. The third argument to the INTNX function is the boolean expression (weekday(dst_month_start) ne 1). If weekday(dst_month) evaluates to '1' (Sunday), then the whole expression is FALSE - and yields the number 0. Otherwise the expression is true, yielding the number 1. So the INTNX function takes dst_month_start (in this case, 01APR2011), shifts it 1 week forward (to the next Sunday) if dst_month is not already a Sunday or leaves it alone (shifts it 0 intervals) if dst_month is already a Sunday. My calendar shows that 1APR2011 was NOT a Sunday, so the first iteration of the do loop produces a dst_beg value of 03APR2011. In the second iteration, 01APR2012 WAS a Sunday, so the value of dst_beg remains 01APR2012.

      I hope this helps!
      Mark

  3. Posted May 4, 2012 at 6:56 am | Permalink

    I'll have to keep this trick in my toolbag. Thanks!

  4. Tom Vincent
    Posted May 4, 2012 at 1:40 pm | Permalink

    Happy Star Wars Day! May The Fourth Be With You!

    • Mark Jordan Mark Jordan
      Posted May 8, 2012 at 2:59 pm | Permalink

      and also with you!

  5. Posted May 5, 2012 at 12:09 pm | Permalink

    Oh my goodness! Amazing article dude! Many thanks, However I
    am having problems with your RSS. I don't know why I am unable to join it. Is there anybody having identical RSS problems? Anyone who knows the answer can you kindly respond? Thanx!!

    • Mark Jordan Mark Jordan
      Posted May 8, 2012 at 3:03 pm | Permalink

      Get your SAS Training Post RSS feed
      here

  6. Gabe
    Posted May 10, 2012 at 9:41 am | Permalink

    This is totally a minor thing, but I don't think you need the two sets of IF-THEN-DO statements. They can just be combined, like this:

    dst_month_start=intnx('month',day1,2+(year<=2006));
    dst_beg=intnx('week.1',dst_month_start,(weekday(dst_month_start) in (2,3,4,5,6,7))+(year>2006));
    dst_month_end=intnx('month',day1,9+(year>2006));
    dst_end=intnx('week.1',dst_month_end, ((weekday(dst_month_end) in (6,7))+4)*(year<=2006) + (weekday(dst_month_end) ne 1)*(year>2006) );

  7. Frank Poppe
    Posted February 27, 2013 at 5:48 am | Permalink

    Hi,
    This was the example I needed to write a function that converts UTC-time to local time. Thanks!

    I'd like to point a some points I learned and realized while doing it.

    The code to determine the starting and ending days can be simplified using the NWKDOM function. I learned that from Rick Wicklin's blog: http://blogs.sas.com/content/iml/2012/10/29/computing-dst/.

    On the other hand I complicated it by adding code to cater for the European rules for DST. I arrived at this:

    year = year ( datepart ( UTCdt ) ) ;
    select ( DSTmethod ) ;
    when ( "USA" ) do ;
    dst_beg = nwkdom ( 2 , 1 , 3 , year ) ; /*DST begins 2nd Sun in March */
    dst_end = nwkdom ( 1 , 1 , 11 , year ) ; /*DST ends 1st Sun in Nov */
    end ;
    when ( "EUR" ) do ;
    dst_beg = nwkdom ( 5 , 1 , 3 , year ) ; /* last Sunday of March */
    dst_end = nwkdom ( 5 , 1 , 10 , year ) ; /* last Sunday of October */
    end ;
    otherwise put 'ERROR: Error in the code for function UTC2local for DST value' DST ;
    end ;

    In your code you do not address the time at which the switch to and from DST is made. For both methods (USA and EUR) that is 2 o'clock in the morning, local time.
    Since I am working with a continuous stream of measurements I have measurements before and after that time which have to be treated differently. That lead to the following piece of code, changing the date values above to datetime values valid in UTC.

    TimeZone is an input parameter giving the timezone to which the UTCdt has to be converted, with values -12 to +14 (values with fractions are alllowed, so even people in timezones with offsets of 30 or 45 minutes can use it!).

    dst_beg = intnx ( "hour" , dhms ( DST_beg , 2 , 0 , 0 ) , -TimeZone ) ;
    dst_end = intnx ( "hour" , dhms ( DST_end , 2 , 0 , 0 ) , -TimeZone ) ;
    if UTCdt >= DST_beg and UTCdt < DST_end then DSTdiff=1 ;
    LOCdt = intnx ( "hour" , UTCdt , ( TimeZone + DSTdiff ) ) ;

    Since you will now have two different hours with the same datetime-values I choose to return the values as a formatted datetime value, in which I change the colon after the hour into a plus if the value was within the DST period. This will also result in the right sorting if the formatted datetime values are sorted alphabetically (in ASCII).

    • Mark Jordan Mark Jordan
      Posted February 27, 2013 at 8:56 am | Permalink

      Wow, Frank! Nice work - and thanks for sharing :-)
      Mark

  8. Albert
    Posted October 28, 2013 at 4:45 am | Permalink

    Hi, realted with this I have found another problem.
    Function INTCK that calculate an interval diff between two dates. It not taking account the DST, and when the DST transition is between "from date" and "to date" the number of intervals are worng calculate. In DST-March the hours calculate by INTCK between two date(with DST-23h between both date) is +1h than real and in DST-October INTCK gets -1h than reality.

    • Mark Jordan Mark Jordan
      Posted November 11, 2013 at 1:45 pm | Permalink

      Thanks for posting your findings!
      Mark

One Trackback

  1. By Top 10 blogs of 2012 - The SAS Training Post on January 2, 2013 at 9:30 am

    [...] Jedi SAS Tricks – Roll Your Own Function By Mark Jordan [...]

Post a Comment

Your email is never published nor shared. Required fields are marked *

*
*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <p> <pre lang="" line="" escaped=""> <q cite=""> <strike> <strong>