Initializing vectors by using repetition factors

The SAS/IML language has a curious syntax that enables you to specify a "repetition factor" when you initialize a vector of literal values. Essentially, the language enables you to specify the frequency of an element. For example, suppose you want to define the following vector:

proc iml;
x = {1 2 2 2 2 3 3 3 3 3 4 4 5 5 5 5 5 5};

The vector has one 1, followed by four 2s, followed by five 3s, two 4s, and six 5s. An alternative syntax is to specify the "repetition factor" for each element by using a positive integer enclosed in brackets, like so:

x = {1 [4]2 [5]3 [2]4 [6]5};

You can think of the repetition factor as the frequency or number of occurrences of the value that follows the closing bracket.
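As a quick check of that reading, here is a minimal sketch that builds the same vector a second way, by concatenating constant blocks created with the J function, and then compares the two. The names x1, x2, n, and same are chosen only for this illustration.

```sas
proc iml;
/* repetition-factor literal from the example above */
x1 = {1 [4]2 [5]3 [2]4 [6]5};

/* the same row vector built from constant blocks:
   j(1, k, v) is a 1 x k row vector filled with the value v */
x2 = 1 || j(1,4,2) || j(1,5,3) || j(1,2,4) || j(1,6,5);

n    = ncol(x1);        /* number of elements in the vector */
same = all(x1 = x2);    /* 1 if every element matches       */
print n same;
```

The J-based form is wordier, but the counts could come from variables, whereas a repetition factor is part of a literal.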

Admittedly, this simple example does not save a lot of typing, but if a value is repeated tens or hundreds of times, this syntax not only saves typing, but also is less prone to error and is clearer to read. For example, repetition factors make it easy to specify the genders of 100 subjects:

gender = {[42]"Female" [58]"Male"};
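A small sketch (with illustrative variable names nFemale, nMale, and nTotal) that counts the values to confirm the vector has the intended composition:

```sas
proc iml;
gender = {[42]"Female" [58]"Male"};

/* count the elements that match each level */
nFemale = sum(gender = "Female");
nMale   = sum(gender = "Male");
nTotal  = ncol(gender);
print nFemale nMale nTotal;
```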

I find this syntax interesting because I am not aware of many other languages that support repetition factors like this. FORTRAN has repetition factors for the FORMAT and DATA statements. A similar syntax, in which an integer and an asterisk precede the value, is supported in the ARRAY statement of the SAS DATA step, which obviously preceded and inspired the SAS/IML syntax:

data A;
array x(18) (1 4*2 5*3 2*4 6*5);
run;

Furthermore, the SAS/SCL language had repetition factors for defining arrays. Does anyone know of other languages that support a similar syntax for initializing arrays?

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

7 Comments

  1. When I used Genstat in the mid 1990s I could achieve the same sort of thing using syntax like:

    2(1),4(2) = 1,1,2,2,2,2

    there were other more flexible ways along the lines of:

    2(1,3) = 1,1,3,3
    (1,3)2 = 1,3,1,3
    (1...3) = 1,2,3
    2(1...3) = 1,1,2,2,3,3
    (3...1)2 = 3,2,1,3,2,1
    2(0,2...6) = 0,0,2,2,4,4,6,6

    it may have been possible to have repetition factors at both ends, but I don't recall that.

  2. Hi Rick

    Is there a function in IML that can create a within-scalar repetition factor based on a single character? I.e. say I have index variable i = 1 to 3. I would like to automatically define scalar *, **, *** in a do loop based on the index. I suppose I could initialise the first star and then create a second loop that adds a star to the previous result but that seems clunky. Any ideas welcome.

    Regards

    • Again, solved it. Sorry, I guess posting is acting as a stimulus. I initialised the scalar before the do loop and then added a "*" to it in every loop.

      Ciao
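
      For readers with the same question, one possible alternative to growing the string inside the loop is to take a substring of a string that already holds the maximum number of stars. This is only a sketch; the names allStars and s are made up for the illustration.

      ```sas
      proc iml;
      allStars = "***";               /* longest string that is needed */
      do i = 1 to 3;
         s = substr(allStars, 1, i);  /* the first i characters        */
         print i s;
      end;
      ```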

  3. Pingback: Repetition factors versus frequency variables - The DO Loop

  4. Ian Wakeling on

    The following simple module can be used to generate the more complex vectors in my comment above.

    start rep(a, b, c, d, inc);
        return (  j(1, d, 1) @ do(b, c, inc) @ j(1, a, 1)  );
    finish;

    For example:

    x = rep(2, 1, 3, 1, 1); gives 1,1,2,2,3,3
    x = rep(1, 3, 1, 2, -1); gives 3,2,1,3,2,1
    x = rep(2, 0, 6, 1, 2); gives 0,0,2,2,4,4,6,6

    repetition factors are also possible at both ends, so:

    x = rep(2, 0, 12, 3, 4);

    gives:

    0,0,4,4,8,8,12,12,0,0,4,4,8,8,12,12,0,0,4,4,8,8,12,12
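
    To see why the module works, the two Kronecker products can be applied one at a time. The sketch below reproduces the last example step by step; the names v, each, and whole are illustrative.

    ```sas
    proc iml;
    v     = do(0, 12, 4);        /* the base sequence {0 4 8 12}         */
    each  = v @ j(1, 2, 1);      /* every element repeated twice         */
    whole = j(1, 3, 1) @ each;   /* the whole pattern repeated 3 times   */
    print each, whole;
    ```

    Multiplying on the right by a row of ones repeats each element; multiplying on the left repeats the entire sequence.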
