Large matrices in SAS/IML

Last week, SAS released the 14.1 version of its analytics products, which are shipped as part of the third maintenance release of 9.4. If you run SAS/IML programs from a 64-bit Windows PC, you might be interested to know that you can now create matrices with about 2^31 ≈ 2 billion elements, provided that your system has enough RAM. (On Linux operating systems, this feature has been available since SAS 9.3.)

A numerical matrix with 2 billion elements requires 16 GB of RAM. In terms of matrix dimensions, this corresponds to a square numerical matrix that has approximately 46,000 rows and columns. I've written a handy SAS/IML program to determine how much RAM is required to store a matrix of a given size.
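The estimate above is simple arithmetic: each numeric element occupies 8 bytes, so an r x c matrix needs 8*r*c bytes. The following is a minimal sketch of that calculation, not necessarily the program mentioned above. The module name MatrixSizeGB is hypothetical, and 1 GB is taken to be 2^30 bytes.

```sas
proc iml;
/* Hypothetical helper (not from the original article):
   gigabytes needed to store an nrow x ncol numeric matrix,
   assuming 8 bytes per element and 1 GB = 2**30 bytes */
start MatrixSizeGB(nrow, ncol);
   bytes = 8 # nrow # ncol;
   return( bytes / 2##30 );
finish;

n  = {16000, 25000, 46000};   /* square matrix sizes discussed in this article */
GB = MatrixSizeGB(n, n);
print n GB[format=6.2];
quit;
```

Under these assumptions, the three sizes need roughly 1.9, 4.7, and 15.8 GB, respectively.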

If you are running 64-bit SAS on Windows, this article describes how to set an upper limit for the amount of memory that SAS can allocate for large matrices.

The MEMSIZE option

The amount of memory that SAS can allocate depends on the value of the MEMSIZE system option, which has a default value of 2 GB on Windows. Many SAS sites do not override the default value, which means that SAS cannot allocate more than 2 GB of system memory.

You can run PROC OPTIONS to display the current value of the MEMSIZE option.

proc options option=memsize value;
run;
Option Value Information For SAS Option MEMSIZE
    Value: 2147483648
    Scope: SAS Session
    How option value set: Config File
    Config file name:
            C:\Program Files\SASHome\SASFoundation\9.4\nls\en\sasv9.cfg

The value 2,147,483,648 is shown in the SAS log. The value is unfortunately in bytes. This number corresponds to 2 GB. Unless you change the MEMSIZE option, you will not be able to allocate a square matrix with more than about 16,000 rows and columns. For example, unless SAS can allocate 5 GB or more of RAM, the following SAS/IML program will produce an error message:

proc iml; 
/* allocate 25,000 x 25,000 matrix, which requires 4.7 GB */
x = j(25000, 25000, 0);


ERROR: Unable to allocate sufficient memory.
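The limit of about 16,000 rows and columns can be checked by running the memory arithmetic in reverse. This sketch assumes 8 bytes per element and ignores the memory that SAS needs for everything else, so the practical limit is somewhat smaller. The variable names are illustrative.

```sas
proc iml;
memsize  = 2147483648;             /* default MEMSIZE reported by PROC OPTIONS, in bytes */
maxElems = floor(memsize / 8);     /* 8 bytes per numeric element */
maxN     = floor(sqrt(maxElems));  /* largest n for which an n x n matrix could fit */
print maxElems maxN;
quit;
```

With the default value, this gives an upper bound of 268,435,456 elements, or a 16,384 x 16,384 matrix, which is why a 25,000 x 25,000 matrix cannot be allocated.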

You can use the MEMSIZE system option to permit SAS to allocate a greater amount of system memory. SAS does not grab this memory and hold onto it. Instead, the MEMSIZE option specifies a maximum value for dynamic allocations.

The MEMSIZE option only applies when you launch SAS, so if SAS is currently running, save your work and exit SAS before continuing.

Changing the command-line invocation for SAS

If you run SAS locally on your PC, you can add the -MEMSIZE command-line option to the shortcut that you use to invoke SAS. This example uses "12G" to permit SAS to allocate up to 12 GB of RAM, but you can use different numbers, such as 8G or 16G.

  1. Locate the "SAS 9.4" icon on your Desktop or the "SAS 9.4" item on the Start menu.
  2. Right-click on the shortcut and select Properties.
  3. A dialog box appears. Edit the Target field and insert -MEMSIZE 12G at the end of the current text.
  4. Click OK.

Every time you use this shortcut to launch SAS, the SAS process can allocate up to 12 GB of RAM. You can also specify -MEMSIZE 0, which permits allocations up to 80% of the available RAM. Personally, I do not use -MEMSIZE 0 because it permits SAS to consume most of the system memory, which does not leave much for other applications. I rarely permit SAS to use more than 75% of my RAM.

After editing the shortcut, launch SAS and call PROC OPTIONS. This time you should see something like the following:

Option Value Information For SAS Option MEMSIZE
    Value: 12884901888
    Scope: SAS Session
    How option value set: SAS Session Startup Command Line
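Because the value is reported in bytes, a quick conversion makes it easier to confirm that the new limit took effect. The following DATA step is a small sketch that assumes the GETOPTION function returns the MEMSIZE value as a string of digits (a byte count), as PROC OPTIONS displays it. The variable names are illustrative.

```sas
data _null_;
   /* GETOPTION returns the option value as text; assumed here to be a byte count */
   bytes = input(getoption('memsize'), 32.);
   GB    = bytes / 2**30;            /* 1 GB = 2**30 bytes */
   put bytes= comma20. GB= 6.2;      /* written to the SAS log */
run;
```

For the value 12884901888 shown above, the conversion gives exactly 12 GB.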

SAS configuration files

A drawback of the command-line approach is that it only applies to a SAS session that is launched from the shortcut that you modified. In particular, it does not apply to launching SAS by double-clicking on a .sas or .sas7bdat file.

An alternative is to create or edit a configuration file. The SAS documentation has long and complete instructions about how to edit the sasv9.cfg file that sets the system options for SAS when SAS is launched.

SAS 9 creates two default configuration files during installation. Both configuration files are named SASV9.CFG. I suggest that you edit the one in !SASHOME\SASFoundation\9.4, which on many installations is c:\program files\SASHome\SASFoundation\9.4. By default, that configuration file has a -CONFIG option that points to a language-specific configuration file. Edit the file and put the -MEMSIZE option and any other system options after the -CONFIG option, as follows:

-config "C:\Program Files\SASHome\SASFoundation\9.4\nls\en\sasv9.cfg"
-RLANG
-MEMSIZE 12G

Notice that I also put the -RLANG option in this sasv9.cfg file. The -RLANG system option specifies that SAS/IML software can interface with the R language.

In Windows 10, you might not have permission to save the sasv9.cfg file. In that case, you can right-click on the Notepad icon and select "Run as administrator." Then, from within Notepad, you can navigate to the configuration file and edit it.

If you now double-click on a .sas file to launch SAS, PROC OPTIONS reports the following information:

Option Value Information For SAS Option MEMSIZE
    Value: 12884901888
    Scope: SAS Session
    How option value set: Config File
    Config file name:
            C:\Program Files\SASHome\SASFoundation\9.4\SASV9.CFG
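If the reported configuration file is not the one that you edited, it can help to ask SAS which configuration files it processed at startup. A minimal check, assuming that the CONFIG system option lists those files on your installation:

```sas
/* Display the configuration file(s) used when this SAS session started */
proc options option=config value;
run;
```

The paths in the log indicate which sasv9.cfg files are candidates for the -MEMSIZE setting.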

If you add multiple system options to the configuration file, you might want to go back to the SAS 9.4 Properties dialog box (in the previous section) and edit the Target value to point to the configuration file that you just edited.

Remote SAS servers

If you connect to a remote SAS server and submit SAS/IML programs through SAS/IML Studio, SAS Enterprise Guide, or SAS Studio, a SAS administrator has probably provided a configuration file that specifies how much RAM can be allocated by your SAS process. If you need a larger limit, discuss the situation with your SAS administrator.

Final thoughts on big matrices

You can create SAS/IML matrices that have millions of rows and hundreds of columns. However, you need to recognize that many matrix computations scale cubically with the dimension of the matrix. For example, many computations on an n x n matrix (such as solving a linear system or computing an inverse) require on the order of n^3 floating point operations, so doubling n multiplies the work by roughly eight. Consequently, although you might be able to create extremely large matrices, computing with them can be very time consuming.

In short, allocating a large matrix is only the first step. The wise programmer will time a computation on a sequence of smaller problems as a way of estimating the time required to tackle The Big Problem.
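One way to carry out that advice is sketched below. The program times a matrix inversion for a few modest sizes and then extrapolates to a larger problem by assuming cubic scaling. The sizes, the choice of the INV function, and the variable names are illustrative assumptions, and the extrapolation is only a rough guide because memory effects and threading can change the scaling.

```sas
proc iml;
call randseed(12345);
sizes   = do(500, 2000, 500);          /* n = 500, 1000, 1500, 2000 */
elapsed = j(1, ncol(sizes), .);
do i = 1 to ncol(sizes);
   n = sizes[i];
   A = j(n, n);
   call randgen(A, "Normal");          /* random matrix; nonsingular with probability 1 */
   t0 = time();
   B = inv(A);                         /* work grows roughly like n^3 */
   elapsed[i] = time() - t0;
end;
print (sizes`)[label="n"] (elapsed`)[label="Seconds" format=8.3];

/* extrapolate from the largest timed problem, assuming cubic scaling */
bigN = 25000;
k = ncol(sizes);
estSeconds = elapsed[k] # (bigN / sizes[k])##3;
print estSeconds[format=10.1];
quit;
```

If the measured times do not grow by about a factor of eight when n doubles, the cubic assumption does not hold for that computation and the estimate should be adjusted accordingly.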

About Author

Rick Wicklin

Distinguished Researcher in Computational Statistics

Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of SAS/IML software. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

8 Comments

  1. Does that mean 46000 x 46000 is the maximum even if there is more RAM? My system has 64 GB of RAM and I am trying to work on a 50000 x 50000 matrix and was not able to allocate the memory successfully. I am using IML 14.2 and have set MEMSIZE = 48 GB.

  2. Steve Hillis on

    Hi Rick,
    After editing the config file on my recent Windows 10 desktop with no problem, I tried doing the same thing with my Windows 10 laptop (about a year old) but it didn't have any effect on the memory. Then I noticed in the SAS printout of amount of memory that it said that the appropriate config file was -config "C:\Program Files\SASHome\SASFoundation\9.4\nls\1d\sasv9.cfg" -- note the change from "en" to "1d". So I also edited that file and the memory increased. Can you tell me why the needed files are in different places on my two computers? Thanks.

    Steve

    • Rick Wicklin

      There are three directories in the most recent releases of SAS 9, which configure a SAS session to run with various encodings.

      The en directory is used for the wlatin1 encoding, which is used by many English-speaking users. The u8 directory is for UTF-8 encoding. The 1d directory configures SAS to run with a double-byte character set (DBCS) encoding that supports Asian languages. That configuration supports multi-byte characters, but uses English resources instead of the translated messages and templates. I am not an expert on encodings, but see Bouedo (2020) for an overview. My limited understanding is that most people use UTF-8 for running SAS on data that include multi-byte characters. I don't know why your system is using 1d.
