What's your favorite function in SAS?

46

programmingLast time I checked, there are well over 500 functions and call routines in SAS.  I’ve taught SAS programming courses for 15 years, and I’ll admit that occasionally my students will ask me about a particular function that I have honestly never heard of.  I remember the first time this happened, a student told me he thought the SPEDIS function was the greatest thing in SAS.  As a new instructor, I was a bit embarrassed I had never heard of the SPEDIS function, so at a break I asked the other three local instructors, who probably had a combined 50+ years of SAS experience.  I felt a little better when none of them had heard of it either!

With so many functions available, it is easy for a new programmer to get overwhelmed.  I was asked to consolidate the LONG list into a more accessible list of favorites.  After putting the request out to my fellow instructors world-wide, these are our collective favorites:

Category Function Description
Character CAT Does not remove leading or trailing blanks, and returns a concatenated character string.
CATS Removes leading and trailing blanks, and returns a concatenated character string.
CATX Removes leading and trailing blanks, inserts delimiters, and returns a concatenated character string.
COMPBL Removes multiple blanks from a character string.
COMPRESS Returns a character string with specified characters removed from the original string.
FIND Searches for a specific substring of characters within a character string.
LEFT Left-aligns a character string.
LENGTH Returns the length of a non-blank character string, excluding trailing blanks, and returns 1 for a blank character string.
LOWCASE Converts all letters in an argument to lowercase.
PROPCASE Converts all words in an argument to proper case.
SCAN Returns the nth word from a character string.
SUBSTR Extracts a substring from an argument.
TRANWRD Replaces all occurrences of a substring in a character string.
TRIM Removes trailing blanks from a character string, and returns one blank if the string is missing.
UPCASE Converts all letters in an argument to uppercase.
Date & Time DATEPART Extracts the date from a SAS datetime value.
INTCK Returns the number of interval boundaries of a given kind that lie between two dates, times, or datetime values.
INTNX Increments a date, time, or datetime value by a given time interval, and returns a date, time, or datetime value.
MDY Returns a SAS date value from month, day, and year values.
MONTH Returns the month from a SAS date value.
QTR Returns the quarter of the year from a SAS date value.
TODAY Returns the current date as a numeric SAS date value.
WEEK Returns the week-number value.
WEEKDAY From a SAS date value, returns an integer that corresponds to the day of the week.
YEAR Returns the year from a SAS date value.
YRDIF Returns the difference in years between two dates according to specified day count conventions; returns a person’s age.
Descriptive Statistics LARGEST Returns the kth largest nonmissing value.
MAX Returns the largest value.
MEAN Returns the arithmetic mean (average).
MEDIAN Returns the median value.
MIN Returns the smallest value.
N Returns the number of nonmissing numeric values.
NMISS Returns the number of missing numeric values.
SMALLEST Returns the kth smallest nonmissing value.
STD Returns the standard deviation of the nonmissing arguments.
SUM Returns the sum of the nonmissing arguments.
Special INPUT Returns the value that is produced when SAS converts an expression by using the specified informat. (Used for converting character columns to numeric)
PUT Returns a value using a specified format. (Used for converting numeric columns to character)
Truncation CEIL Returns the smallest integer that is greater than or equal to the argument, fuzzed to avoid unexpected floating-point results.
FLOOR Returns the largest integer that is less than or equal to the argument, fuzzed to avoid unexpected floating-point results.
INT Returns the integer value, fuzzed to avoid unexpected floating-point results.
ROUND Rounds the first argument to the nearest multiple of the second argument, or to the nearest integer when the second argument is omitted.

Are we missing any of your favorites?

Share

About Author

Stacey Syphus

Technical Trainer/SAS Enterprise Guide Curriculum Manager

Stacey Syphus is a senior manager and instructor for SAS Education. Her areas of expertise include SAS programming, SAS Enterprise Guide and SAS Studio. Occasionally she gets back to her roots and teaches a statistics course. When not teaching or writing SAS training, she is likely chauffeuring her 3 children to various activities, visiting fun places near her Northern California home, or watching college football or basketball.

Related Posts

46 Comments

  1. Stacey,

    Thanks for a nice blog. I can understand why you didn't include it, but I have always thought the GEODIST function was very cool. For anyone who doesn't know, it returns the geodetic distance in kilometers or miles between two latitude and longitude coordinates. Also the various ZIP functions (ZIPCITY, ZIPCITYDISTANCE, ZIPFIPS, ZIPNAME, ZIPNAMEL, ZIPSTATE) can be very useful. ZIPCITYDISTANCE even gives you the distance between two zip codes.

    Susan

    • FYI ... ZIPCITYDISTANCE gets the lat/long of the zips used in the function from the SASHELP.ZIPCODE data set and uses the same method to calculate distance as the GEODIST function. One think to watch for is that ZIPCITYDISTANCE returns miles while the default for GEODIST (as already stated is kilometers). You can also get miles from GEODIST by using the M option in the function call.

  2. I like PUT and INPUT; although I always get them mixed up and have to look up the documentation to make sure I'm using the correct one.

    • Stacey Syphus

      LOL! For at least a year I had a post-it on my monitor reminding me which function to use for which type of variable conversion... You're not alone :)

    • Hi Adraine,

      I was taught to "use the alphabet" to remember which function to use for the different variable conversion (thanks Sam!)So, using the first letters of the following words; character, numeric, input & put, put them in alphabetic order:
      Character
      Input
      Numeric
      Put
      Character

      The functions are in the middle, of the variable types!
      So input does character to numeric and put does numeric to character.
      There are probably plenty of other ways to remember it but this stuck with me.

    • I just remember that "numeric" is shorter than "character", and PUT is shorter than INPUT. So if you're starting with a numeric to convert to character, it's the shorter one--PUT. And if you're starting with a character to convert to numeric, it's the longer one--INPUT.

    • Stacey Syphus

      I also have Chris to thank for LOTS of things I have learned! I must have missed the UUIDGEN function post, but just checked it out. Very cool!

  3. If I'm allowed more than one I'd have to choose the PRX family of functions. The ability to use Perl Regular Expressions in SAS can be a huge help if you're doing a lot of complex text processing.

  4. Michelle Homes

    I recall this great blog post that outlined how to do fuzzy matching and the spedis function which your readers may like http://blogs.sas.com/content/sgf/2015/01/27/how-to-perform-a-fuzzy-match-using-sas-functions/

    I've always loved the intnx and intck functions for financial date calculations and the Perl regular expression for character manipulation. Handy tip sheet at https://support.sas.com/rnd/base/datastep/perl_regexp/regexp-tip-sheet.pdf

  5. You mention that CAT* functions, which make string processing much easier. Two other newer functions (I started with SAS 79, so they seem newer to me) that I use a lot now are IFC and IFN.

    • Stacey Syphus
      Stacey Syphus on

      Can't say I was doing anything with SAS 79, but I do still think of the CAT functions as new! What a great addition...

  6. No love for coallesce\coallesceC? So much utility, replaces any instance of 'if not missing X then X else if not missing Y then Y... etc'.

    • Peter Lancashire on

      Yes, I was thinking of voting for that. It is invaluable in complex SQL outer joins to get default processing right. Saves a ton of obscure DATA step code.

  7. I absolutely love the SCAN function, especially its macro version (%SCAN). Combined with an iterative step, I can extract elements in a macro variable that represent key words, such as client IDs or variable names.

  8. Otto Schramek on

    Very good choice! :-)
    Some of you already mentioned MISSING and COALESCE[C].
    I want to add just one more: REVERSE
    For example, You can get the last word of a string:
    last_word=reverse(scan(reverse(), 1))

      • Stacey Syphus
        Stacey Syphus on

        REVERSE is cool, but did you know you can use negative numbers with the SCAN function to count from the right? So SCAN(var, -1) will give you the last word. I love that trick!

  9. Prasanna Sondur on

    Good one to refer. Thanks for sharing! Would have wished to see index() and strip() as well in the list.

    • Stacey Syphus
      Stacey Syphus on

      Thanks! I agree I overlooked STRIP(), but I left out INDEX on purpose... Did you know the FIND function was introduced in SAS 9, and does exactly what INDEX does, but has 2 additional arguments that allow you to make the search case insensitive and select a start position? Along with the CAT functions, I consider it to be one of the great "new" additions.

  10. Robert Allison
    Robert Allison on

    I find myself reading lots of oddball data into SAS from text, and the scan() function really comes in handy for parsing the data in various flexible ways.

  11. So many cool functions I've never used!

    Two of my favourites have already been mentioned - GEODIST and PRXMATCH.

    One of my other favourites is SOUNDEX (similar to SPEDIS); however, I have ditched it for COMPGED. If you do a lot of text-based analytics, check out my post https://communities.sas.com/t5/SAS-Communities-Library/PROC-SQL-Continued-Basic-Text-Analytics-Using-Song-Titles/ta-p/241007 where I use SOUNDEX, and then update it with COMPGED.

    Now I'm curious abut UUIDGEN - something to research tomorrow at work :-)
    Chris

  12. My favourites are probably STRIP() and COALESCE(), which I use mostly in PROC SQL, but CHOOSEN() and CHOOSEC() are also useful in PROC SQL instead of the CASE WHEN ELSE END when you have sequential numeric choices.

  13. David Rosenfeld on

    What about the LAG function? When understood and used properly (not inside a test) it makes cross-record comparison easy, especially when used in conjunction with the retain statement, as when one needs to collapse clumps of records while testing for a break in time between the end date of one and the start date of another.
    David R.

  14. I have to give another vote to STRIP and COALESCE(C) as well. Also, I'm a frequent user of NOTDIGIT for checking if it's safe to perform character-to-numeric conversion using PUT.

Leave A Reply

Back to Top