Creating strings: Concatenation and substitution

"Convergence after 23 iterations to (1.23, 4.56)."

That's the message that I want to print at the end of a program. The problem, of course, is that when I write the program, I don't know how many iterations an algorithm requires nor the value to which an algorithm converges.

How can you create such a string at run time? There are at least two ways: string concatenation and string substitution.

String Concatenation

Most statistical programmers solve this problem by using string concatenation. In the SAS DATA step, the double bar operator (||) is used for string concatenation. However, in SAS/IML software, that operator is used for the horizontal concatenation of matrices. Instead, PROC IML uses the addition operator (+) to concatenate strings.

As I've described before, you can convert numerical values to character strings by using the STRIP(CHAR()) technique or the PUTN function. For example, the following statements create the desired message:

proc iml;
/** assume some algorithm produces the following
    values for Iter, x, and y **/
Iter = 23;  x = 1.23;  y = 4.56;
 
/** convert values to character strings **/
sIter = strip(char(Iter));
sX = putn(x, "BEST4.");
sY = putn(y, "BEST4.");
 
msg = "Convergence after " + sIter +
      " iterations to (" + sX + ", " + sY + ")";

This creates the desired message. However, to a statistical programmer, there are several problems with the string concatenation approach:

  1. It is hard to determine what the message will look like by reading the program.
  2. If you want to change the message, you have to search through the whole program to find it. It would be better if the message could be defined near the top of the program.
  3. It is prone to error. For example, you might accidentally omit the space character before and after the sIter variable.
  4. In production-quality software, strings are often reviewed by a technical editor and even translated into other languages. Editors and translators might be unable to make sense of the concatenation statement.
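
To make the third problem concrete, here is a minimal sketch of the spacing mistake. The variable names good and bad are only for illustration:

```sas
proc iml;
sIter = "23";

/* intended: a blank inside the quotes on each side of sIter */
good = "Convergence after " + sIter + " iterations";

/* easy-to-miss mistake: the blanks inside the quotes are missing,
   so the words run together around the number */
bad = "Convergence after" + sIter + "iterations";

print good, bad;
```

Both statements run without error, so the only way to catch the mistake is to read the output (or to read the quoted fragments very carefully).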

String Substitution

The string substitution approach addresses all of these concerns. Instead of forming the message by concatenating strings, you define a "template" for the message:

msg = "Convergence after %n iterations to (%x, %y)";

This statement can be placed at the top of the program. It can be easily read by editors and translators. It is clear that the message has the correct spacing, punctuation, and so on.

The template includes "placeholders" for the values that will be computed later in the program. You can use any placeholder that you want. I like to prefix my placeholders with a percent sign (%) because that reminds me of the sprintf function in the C/C++ language.

After the values are computed, you can use the TRANWRD function to replace the placeholders by the actual values. For example, after converting the values to strings, the following statements substitute the string values into the template:

s = tranwrd(msg, "%n", sIter);
s = tranwrd(s, "%x", sX);
s = tranwrd(s, "%y", sY);

The first TRANWRD call replaces "%n" by the value of sIter. The subsequent lines replace "%x" and "%y" by the values of sX and sY, respectively.
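
If a program has many messages, the repeated TRANWRD calls can be wrapped in a small module. The following is only a sketch: SubstituteAll is a hypothetical helper that I made up for this example, not a built-in SAS/IML function, and it assumes that the placeholders and values are passed as row vectors of the same size. I use single quotes around the strings that contain a percent sign as a precaution, so that the SAS macro processor does not try to interpret %n, %x, and %y as macro calls.

```sas
proc iml;
/* Hypothetical helper (not part of SAS/IML): replace each
   placeholder in a template with the corresponding value */
start SubstituteAll(tmpl, placeholders, vals);
   s = tmpl;
   do i = 1 to ncol(placeholders);
      /* elements of a character vector are padded to a common
         length, so strip each value before substituting it */
      s = tranwrd(s, placeholders[i], strip(vals[i]));
   end;
   return( s );
finish;

/* same values as in the earlier example */
Iter = 23;  x = 1.23;  y = 4.56;
sIter = strip(char(Iter));
sX = putn(x, "BEST4.");
sY = putn(y, "BEST4.");

msg = 'Convergence after %n iterations to (%x, %y)';
s = SubstituteAll(msg, {'%n' '%x' '%y'}, sIter || sX || sY);
print s;
```

The module performs the same substitutions as the three explicit TRANWRD calls, one placeholder per loop iteration.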

It is now easy to change the message to use a complete sentence or even to change the order of the parameters. For example, you can change the message template to the following:

msg = "The solution is (%x, %y) after %n iterations.";

The rest of the program does not need to change. The TRANWRD function will substitute the correct values into the correct locations.
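
As a quick check of that claim, the following sketch runs the same three TRANWRD calls against both versions of the template. The templates vector and the loop are my own scaffolding for the comparison. As a precaution, the templates are in single quotes so that the macro processor leaves the percent signs alone.

```sas
proc iml;
Iter = 23;  x = 1.23;  y = 4.56;
sIter = strip(char(Iter));
sX = putn(x, "BEST4.");
sY = putn(y, "BEST4.");

/* two alternative templates; the placeholders appear in a different order */
templates = {'Convergence after %n iterations to (%x, %y)',
             'The solution is (%x, %y) after %n iterations.'};

do i = 1 to nrow(templates);
   /* the substitution code is identical for both templates */
   s = tranwrd(templates[i], '%n', sIter);
   s = tranwrd(s, '%x', sX);
   s = tranwrd(s, '%y', sY);
   print s;
end;
```

Only the template text differs between the two iterations, which is the point: the wording and the order of the values are decided in one place.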

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of PROC IML and SAS/IML Studio. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.
