Point/Counterpoint: Symbolic versus mnemonic logical operators in SAS

In SAS, the DATA step and PROC SQL support mnemonic logical operators. The Boolean operators AND, OR, and NOT are used for evaluating logical expressions. The comparison operators are EQ (equal), NE (not equal), GT (greater than), LT (less than), GE (greater than or equal), and LE (less than or equal). These character-based operators are called mnemonic because their names make it easy to remember what the operator does.

Each mnemonic operator in SAS has an equivalent symbolic operator. The Boolean operators are & (AND), | (OR), and ^ (NOT). The comparison operators are = (EQ), ^= (NE), > (GT), < (LT), >= (GE), and <= (LE). The symbol for the NOT and NE operators can vary according to the computer that you use, and the tilde character (~) can be used in place of the caret (^).
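As a minimal sketch of this equivalence, the following DATA step evaluates the same conditions with both sets of operators. The variable names (x, y, m1, s1, m2, s2) are made up for illustration, and the sketch assumes a system on which the caret (^) is the NOT symbol.

```sas
data _null_;
   x = 3;
   y = 5;
   /* mnemonic form, then the symbolic form of the same condition */
   m1 = (x LT y) AND NOT (x EQ y);
   s1 = (x <  y) &   ^   (x =  y);
   m2 = (x GE y) OR (x NE y);
   s2 = (x >= y) |  (x ^= y);
   /* write the 0/1 results to the log for comparison */
   put m1= s1= m2= s2=;
run;
```

Because each pair of statements expresses the same condition, m1 should match s1 and m2 should match s2 in the log.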

Mnemonic operators tend to appear in older languages like FORTRAN, whereas symbolic operators are common in more recent languages like C/C++, although some relatively recent scripting languages like Perl, PHP, and Windows PowerShell also support mnemonic operators. SAS software has supported both kinds of operators in the DATA step since the very earliest days, but the SAS/IML language, which is more mathematically oriented, supports only the symbolic operators.

Functionally, the operators in SAS are equivalent, so which ones you use is largely a matter of personal preference. Since consistency and standards are essential when writing computer programs, which operators should you choose?

The following sections present arguments for using each type of operator. The argument for using the mnemonic operators is summarized by Mnemonic Norman. The argument for using symbols is summarized by Symbolic Sybil. Finally, there is a rejoinder by Practical Priya. Thanks to participants on the SAS-L discussion forum and several colleagues at SAS for sharing their thoughts on this matter. Hopefully Norman, Sybil, and Priya represent your views fairly and faithfully.

Use the mnemonic operators

Hi, I'm Mnemonic Norman, and I've been programming in SAS for more than 30 years. I write a lot of DATA step, SQL, and macro code. I exclusively use the mnemonic operators for the following reasons:

  1. Easy to type. I can touch-type the main alphabet, but I've never mastered typing symbols without looking down at my fingers. In addition, exotic symbols like | (OR) are not usually located in an easy-to-reach location on my keyboard. By using the mnemonic operators, I can avoid hitting the SHIFT key and can write programs faster.
  2. Easy to read. Even complex comparisons are easy to read because they form a sentence in English:
    if x gt 0 AND sex eq "MALE" then ...
  3. Easy to remember. There is a reason why these are called mnemonic operators! I program in several different languages, and each one uses a different NE operator. In SAS it is ^=. In Lua the NE operator is ~=, in Java it is !=, and the ANSI standard for SQL is <>. I use NE so I don't have to remember the correct symbol.
  4. Easy to communicate. My boss and clients are not statisticians. They can understand the mnemonic operators better than abstract symbols.
  5. Easy to see. I don't want to emphasize my age, but statistics show that most people's eyesight begins to diminish after age 40. I find the symbols | and ^ particularly difficult to see.
  6. Easy to distinguish assignment from comparison. I like to distinguish between assignment and logical comparison with equality, but SAS uses the = symbol for both. Therefore I use the equal sign for assignment and use EQ for logical comparison. For example, in the statement
    b = x EQ y;
    it is easy to see that b is a variable that holds the value of a Boolean expression. The equivalent statement
    b = x = y;
    looks strange. (Furthermore, in the C language, this expression assigns the value of y to both b and x.)
  7. Easy to use macro variables. I reserve the ampersand for macro variables. If I see an expression like x&n, I immediately assume that the expression resolves to a name like x1 or x17. To avoid confusion with macro variables, I type x AND n when that is what I intend.
  8. Easy to cut and paste. Because the less-than and greater-than symbols are used to delimit tags in markup languages such as HTML and XML, they can disappear when used in Web pages. In fact, I dare you to try to post this comment to Rick's blog: "I use the expression 0 < x and y > 1." This is what you'll get: "I use the expression 0 1."
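Points 6 and 7 in the list above can be sketched in a few lines. This is only an illustration: the macro variable n and the DATA step variables (x, y, x1, b1, b2, a, b, c) are hypothetical names chosen for the example.

```sas
%let n = 1;   /* hypothetical macro variable */

data _null_;
   x = 2;
   y = 2;
   /* Point 6: both statements store the result of comparing x and y */
   b1 = x EQ y;   /* reads clearly as "b1 gets (x equals y)" */
   b2 = x = y;    /* first = assigns, second = compares */

   /* Point 7: with no space, the ampersand starts a macro reference */
   x1 = 10;
   n  = 0;
   a = x&n;       /* macro processor resolves this to x1 */
   b = x AND n;   /* Boolean AND of the DATA step variables x and n */
   c = x & n;     /* also a Boolean AND: the space blocks macro resolution */
   put b1= b2= a= b= c=;
run;
```

In this sketch b1 and b2 hold the same value, whereas a picks up the value of x1 rather than the result of a logical test, which is the confusion that the mnemonic AND avoids.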

Use the symbolic operators

Hi, I'm Symbolic Sybil, and I've been programming in SAS for a few years. In school I studied math, statistics, and computer science. In addition to SAS, I program in C/C++ and R. I use symbolic operators exclusively, and here are the reasons why:

  1. Consistent with mathematics. When a text book or journal presents an algorithm, the algorithm uses mathematical symbols. If you study Boolean logic, you use symbols. Symbols are a compact mechanism for representing complex logical conditions. Programs that implement mathematical ideas should use mathematical notation.
  2. Consistent with other modern languages. I don't use FORTRAN or SQL. I might write a DATA step to prepare data, then jump into PROC IML to write an analysis. Sometimes I call a package in R or a library in C++. I use symbols because all the languages that I use support them.
  3. Distinguish variables from operators. Symbols are not valid variable names, so it is easy to see which tokens are operators and which are variables. Although Norman claims that symbols are hard to see, I argue that they stand out! If a data set has variables named EQ and LT, the expression EQ > LT is more readable than the equivalent expression EQ GT LT.
  4. Enforce coding discipline. Some of Norman's arguments are the result of lazy programming habits. The only reason he can't remember symbols is because he doesn't use them regularly. If you put spaces around your operators, you will never confuse x&n and x & n. As to remembering which operators are supported by which programming language, that is an occupational hazard. We are highly paid professionals, so learn to live with it. I don't think the solution is to use even more operators!
  5. Easy to communicate. I disagree with Norman's claim that his non-statistical boss and clients will understand character-based operators more easily. How patronizing! Did they drop out of school in the third grade? Furthermore, in the modern world, we need to be inclusive and respectful of different cultures. The character-based operators are Anglocentric and might not be easy to remember if your client is not a native English speaker. In Spanish, "greater than" is "mayor que" and "equal" is "igual". In contrast, mathematical symbols are universal.

Use them both, but be consistent

Hi, I'm Practical Priya. There is no need to start a flame war or to make this an either/or debate. As a famous computer scientist wrote, "the nice thing about standards is that you have so many to choose from."

I used symbols exclusively until I consulted on a project where the client insisted that we use mnemonic operators. Eventually I gained an appreciation for mnemonic operators. I think they are easier to see and are more readable for experts and non-experts alike.

Today I use a combination of symbols and mnemonic operators. Like Norman, I find the logical operators AND, OR, and NOT easier to type and to read than the symbols &, |, and ^. For the relational (comparison) operators, I always use <, >, and =. I learned these symbols in school and they are universally understood.

I argue that a hybrid approach is best: In the DATA step I use mnemonic Boolean operators but use symbols for comparison operators. This presents a clear visual separation between clauses that FORM Boolean expressions and clauses that OPERATE ON Boolean expressions, like this:
if x = 5 AND missing(y) OR y < z then ...

However, I'm embarrassed to admit that I do not consistently use symbols for the comparison operators. I also use NE, which is inconsistent with my scheme but is more readable than ^=. If my keyboard had a "not equals" symbol (≠), I'd use it, but until then I'm sticking with NE.
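Whichever operators you choose, the order of evaluation is the same: SAS evaluates AND before OR. The following sketch, which uses made-up values for x, y, and z and hypothetical result names, compares the hybrid-style expression shown above with two explicitly parenthesized versions.

```sas
data _null_;
   x = 4;
   y = 1;
   z = 2;
   /* no parentheses: AND is evaluated before OR */
   implicit  = (x = 5 AND missing(y) OR y < z);
   /* explicit grouping that matches the default order */
   and_first = ((x = 5 AND missing(y)) OR y < z);
   /* grouping the OR first changes the result for these values */
   or_first  = (x = 5 AND (missing(y) OR y < z));
   put implicit= and_first= or_first=;
run;
```

For these values, implicit and and_first are both 1 and or_first is 0, so adding parentheses makes the intended grouping visible regardless of which style of operator is used.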

Your turn: Which logical operators do you use and why?

Norman, Sybil, and Priya have made some good points. Who do you agree with? What rules do you follow so that your SAS programs use logical operators in a readable, consistent manner? Leave a comment, but as Norman said, be careful typing < and >. You might want to use the HTML entities &lt; and &gt;.

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

22 Comments

  1. I'm like Priya: I like using the logical operators AND, OR, and NOT and symbolic comparison operators. I found it interesting to read that Priya also uses NE, as I too prefer it to the ^= operator.

    I also like to use parentheses to make the code easier to read and to ensure that the evaluation is as expected.

    With the team of awesome alliteration avengers, SAS programmers are guided to the choices available. Ripper, Rick!

  2. I'm actually designing a new language, syntactically based on python but adds rule based programming and many operators for tensor calculus and constructive solid geometry.

    My question is how do people feel about typing non-ASCII characters.

    For example, the union of two spatial domains, should it be "c = union(a, b)" or "c = a \{insert big U symbol here\} b", or how about lambdas, or better yet, circle plus for tensor operations.

    I realize most people here probably don't do much involving many mathematical operations, but what do you guys think about non-ASCII characters in programming languages? What I'm thinking about is having an operator system such that you could type in the statement "c = union (a, b)" in a plain text editor, but the operators would be defined such that they would include some sort of pretty print or symbolic form, so that when viewed in a special editor, they could be viewed as full Unicode glyphs.

  3. Please don't write about what you don't understand. GT is not the same as > in several languages, PERL being one of them. Likewise, there are differences between &, && and AND in some languages.

    The answer to your dumb question is, you use the correct one for the context.

    • Rick Wicklin

      Thanks for your comment. I referred to Perl because it supports AND and OR as Boolean operators. The precedence of operators and the existence of the && operator wasn't relevant for this post.

      • In Perl specifically, the 'and' and 'or' keywords also have different precedence than && and ||. Which one you use and when is very much driven by the idioms of the language more than mere pragmatism.

        For example, consider the statement:

        open my $fh, "file" or die "could not open file";

        That statement works as you would expect (and nearly reads as English, but I digress). If you rewrote it as:

        open my $fh, "file" || die "could not open file";

        it behaves very differently. It never executes the "die" statement even if the 'open' fails.

        (Note: The situation is different in C++, where the keywords 'and', 'or' and 'not' are literally interchangeable with &&, || and !.)

        For the Practical Priyas of the world, you still have to take precedence into account if the language dictates it.

  4. David Williams on

    I use mnemonics mostly due to SAS and many others allowing the '=' to be used as a comparison operator. C had it right with having '==' for the equal comparison. Since I'm using 'EQ', I'm going to be consistent and use 'LE', 'GT', etc.

  5. The "=" sign is a special case, since it is valid syntax in SAS for assignment as well as for logic. This can lead to confusion: a=b is assigning a to have the value that b has, but (a=b) is a logic test with the value 1 if a=b and 0 otherwise.

    To avoid this and related confusion, my style is to reserve "=" for assignment and use mnemonics for all logic.

  6. Just last year I helped draft SAS code guidelines for my division, and this debate of course came up. From experience, I find mnemonic comparison operators hard to read. A lot of our code is from mainframe days, so the variable names are short. This makes them hard to distinguish at a glance from the mnemonic operators.

    So our guidelines sIt might have helped that nobody older than 30 volunteered to be part of the guideline group, but we decided on using symbols unless.

    Now I partially ignore my own rules, like Pragmatic Priya, especially since I moved to programming in Enterprise Guide. EG has syntax highlighting for AND, OR, and NOT, and I never liked using ^ for NOT (too hard to notice). However, I still cringe and hit CTRL-H whenever I see mnemonic comparison operators.

  7. I think you left out that some people use < > as the equivalent of NE, but it can also be used as maximum, IIRC. Because of its dual meaning, it should be avoided, but some people don't realize it has a dual meaning based on context / PROC / DATA step.

  8. I prefer and suggest that people use mnemonic symbols for logical operators. The reason is that I have seen some people completely mix up '=' and 'eq'. Not only would they use '=' in logical operations (which is valid), but they would also use 'eq' as an assignment operator, as in their minds these two are interchangeable. How would you like a statement like this:

    name eq 'Peter';

  9. Pingback: Popular posts from The DO Loop in 2015 - The DO Loop

  10. Abirami Jothiramalingam on

    I would prefer to use mnemonic operators since it stands out and you know where you have used comparisons in your code. And also, it is easier to type than typing the words and you don't have to remember things like "eq" is for "equal" and so on. And it is most practical because when using other programming tools, they all have the standard mnemonics, so it is easier to switch between programming languages.
