Related article: Adaptive SAS programming for the Software Development Life Cycle
When new SAS users get introduced to the SAS Business Intelligence software (SAS BI), along with the thrill of a wide spectrum of new and desirable functionality, almost always comes a state of confusion – if not panic.
How do we go about adapting it to our organization's IT guidelines? How do we arrange and support development and production environments? How do we set up our SAS developers so they can collaborate, but not overwrite each other’s work? What are the best practices for setting up SAS BI environments? What the hell are SAS folders? Why do we need them and how are they different from the operating system folders and directories?
Let’s set things straight. This post explains the two distinct dimensions of a SAS BI environment: folder structure and software life cycle environments. It also provides some suggestions on how you can set up your SAS Business Intelligence environment the right way.
What folders are we talking about?
The folder structure dimension is two-fold (pun is accidental). There is a metadata folder structure and a physical folder structure (or directory structure). It is imperative to distinguish between them and to realize that they are separate structures that do not have to mirror each other.
The metadata folder structure is a logical tree created within SAS metadata and is used to store different SAS objects such as library definitions, data table definitions, information maps, SAS web reports, stored process definitions, dashboard definitions, SAS prompt definitions, SAS job definitions and other software elements.
The physical folder/directory structure is located within the file system of an operating system (Windows, UNIX, z/OS) and is used to store any files (SAS code, data tables, Excel files, HTML, PDF, RTF and other files).
When it comes to the physical folders or directories where you typically would store some of your SAS code, SAS data and other data, they can be organized alongside your customized SAS metadata folders, or in any other logical way reflecting your business structure. Keep in mind that some objects in the SAS metadata (such as library definitions) rely on the physical folder structure, so in many cases you cannot change the physical folder structure independently without affecting SAS folder objects.
Here are a few helpful resources:
- For managing SAS metadata folders, use Best Practices for Managing SAS Folders.
- Pay close attention to moving SAS objects between SAS folders as described in the Best Practices for Maintaining Associations among Objects in SAS Folders.
- You can also customize your SAS folders to align them with your organization's needs. See Customizing Your Folder Structure.
What’s the best practice for multiple SAS environments?
Development of SAS BI systems usually employs a variation of the three-tier or three-environment life cycle approach:
- Development/Integration environment
- Staging/Testing environment
- Production/Operation environment
With SAS BI development, each of these environments could be implemented either as a folder structure within a single SAS BI installation or as a completely separate SAS BI installation.
During the systems development life cycle, development activities, along with the corresponding metadata and file system objects, are gradually propagated from the Development/Integration environment to Staging/Testing, and finally to the Production/Operation environment.
Make sure you maintain identical SAS folder and physical folder structures among all the environments for smooth and seamless migration of the metadata and files from one environment to another.
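For the physical side of that advice, a quick structural comparison can catch drift before a migration. The following is only a minimal sketch in Python, not a SAS tool: the function names `relative_dirs` and `structure_diff` are made up for this illustration, and it compares operating system directories only. SAS metadata folders would have to be compared with SAS's own tools.

```python
import os

def relative_dirs(root):
    """Return the set of sub-directory paths under root, relative to root."""
    found = set()
    for current, dirnames, _filenames in os.walk(root):
        for name in dirnames:
            full = os.path.join(current, name)
            # Normalize separators so Windows and UNIX trees compare equal.
            found.add(os.path.relpath(full, root).replace(os.sep, "/"))
    return found

def structure_diff(reference_root, other_root):
    """List directories missing from, or extra in, other_root
    when compared with reference_root (hypothetical helper)."""
    reference = relative_dirs(reference_root)
    other = relative_dirs(other_root)
    return {
        "missing": sorted(reference - other),
        "extra": sorted(other - reference),
    }
```

Calling `structure_diff` with the root directories of two environments (for example, a development root and a production root of your own choosing) returns two lists. Both lists are empty when the directory trees match.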
In the development environment only, it is also possible to create a pre-development or sandbox folder with personal sub-folders corresponding to each member of the development team. The sandbox is where developers can experiment with different coding approaches before moving their code to the development/integration environment.
To encourage code sharing among developers and to protect them from accidentally overwriting each other’s code, these personal folders can be assigned permissions that give the owner of the folder READ and WRITE access and give all other developers READ ONLY access.
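The permission scheme for the personal sandbox folders can be pictured as a small table. The sketch below is an illustration in Python only. It does not call any SAS API, and the names `sandbox_permissions` and `is_allowed` and the developer names are hypothetical. In a real deployment you would grant the permissions through the SAS administration tools.

```python
def sandbox_permissions(developers):
    """Build {folder_owner: {user: permissions}} for personal sandbox folders.

    Each developer owns one folder: the owner gets READ and WRITE,
    every other developer gets READ only.
    """
    table = {}
    for owner in developers:
        table[owner] = {
            user: ({"READ", "WRITE"} if user == owner else {"READ"})
            for user in developers
        }
    return table

def is_allowed(table, user, folder_owner, action):
    """True if `user` may perform `action` on the folder owned by `folder_owner`."""
    return action in table.get(folder_owner, {}).get(user, set())

# Hypothetical team of two developers.
team = sandbox_permissions(["anna", "boris"])
```

With this table, `is_allowed(team, "boris", "anna", "READ")` is true while `is_allowed(team, "boris", "anna", "WRITE")` is false, which is the sharing-without-overwriting behavior described above.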
How did you set up your SAS Business Intelligence environment?
Instead of a conclusion, I would like to invite you to share how you set up the SAS environment in your organization. What are your considerations? What are the pros and cons? What would you advise others?
Thanks for the reply again. I think I left this part out, so let me make it clear. In our environment, new fields with data are added to the actual tables very frequently. If new fields (new columns with data, that is, a change in data structure) are added to the actual physical table, how can we ensure that the tables registered in the SMC get mapped to the new fields in the actual database? If this happens very frequently, is there any way other than admins performing a metadata refresh every time the physical table changes?
Nirmal, if your metadata changes frequently (for example, when new columns are added), I would refer you to PROC METALIB, which you can run automatically whenever a new column is added or on a particular schedule, without manual intervention by a SAS admin. It will allow you to synchronize the metadata with the actual physical tables in a library.
Thanks for the quick reply. We have already registered tables under these libraries through SMC. If the data changes in the actual physical table on the database side, how do these already registered tables, which contain the data under the libraries (and which the user opens to view from SAS Enterprise Guide Explorer), reflect the data change that happened in the physical table? Please let us know.
Nirmal, it seems that you are confusing metadata with data. SAS metadata has nothing to do with the actual data changes. When you explore your data from SAS EG, you read the actual data. In the process of reading that data, SAS goes through the metadata to find the table names, column names and types, and whether you have permission to see (READ) or modify (WRITE) the data.
Leonid, I read your post and it is good to know that it is best practice to register SAS libraries for a pool of users who will access the same content. What if the users need to refresh the metadata as frequently as possible? If the database updates once every 5 minutes, for example, how can admins ensure that the registered library metadata also reflects the changes in the physical tables? Please let us know your thoughts.
Nirmal, you don't need to refresh your metadata every time your data changes. When your data changes, it means you either add new records or modify old records; the data structure (table names, column names and types) stays the same. That is, your metadata stays the same. You only need to change your metadata when you change your data structure, add new users, change their access level, etc. The frequency of your metadata updates has nothing to do with the frequency of your data updates.
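The distinction between a data change and a structure change can be shown with a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for a real database, and a saved list of column names plays the role of the registered metadata. The table `orders` and the helper names are made up for the illustration, and nothing here is a SAS interface.

```python
import sqlite3

def column_names(conn, table):
    """Return the column names of a table, standing in for its structure."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [row[1] for row in rows]  # field 1 of each row is the column name

def structure_changed(registered_columns, conn, table):
    """True when the live table no longer matches the registered column list."""
    return list(registered_columns) != column_names(conn, table)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")

# Snapshot of the structure, playing the role of registered metadata.
registered = column_names(conn, "orders")

# Data change only: a new record, same columns.
conn.execute("INSERT INTO orders VALUES (1, 9.5)")
after_insert = structure_changed(registered, conn, "orders")

# Structure change: a new column appears.
conn.execute("ALTER TABLE orders ADD COLUMN region TEXT")
after_alter = structure_changed(registered, conn, "orders")
```

Here `after_insert` is false and `after_alter` is true: only the added column makes the snapshot stale, which corresponds to the case where the metadata needs updating.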
Great reading all your posts, very informative. I wanted to ask you whether it is best practice to register SAS libraries in SAS Management Console (SMC) for the SAS users at my site. Or is it better if every user has their own user ID and password to connect to their data sources like Oracle, DB2, SQL, etc.?
Please let me know, appreciate your opinion.
Great question. If your users access the same data sources, then the best practice would be to register those data sources as libraries in SAS Management Console. Doing it that way eliminates the need to define those libraries in each user's code and allows for effective use of interactive tools such as SAS Data Integration Studio. It also allows you to control which user or user group has access to which library.
Leonid, as you probably expected, I will react. Let us first clarify the environment.
There are not only two but many more systems that use folders.
- Metadata: which one, as there can be many versions set up (D T A P segregation)
- OS layers: server-side or desktop-side, visible with OS tools or SAS tools
As the desktop may be virtualized and the server can be part of a grid, this can add more instances to consider when placing and handling data/code.
Basically you have to think in three locations (desktop, server and metadata), but replication can add a lot to that. This is very confusing for the intended users.
b/ I always segregate code objects from data objects.
+ Code types go into version management (development only = horizontal view) and release management for life cycle management (ALM = vertical view).
Version management can be as simple as prohibiting simultaneous editing, as the developer group is relatively small and its work is managed (they communicate).
+ Data types do not go into any type of release/version management; often they must be kept segregated.
This approach fulfills all the classical requirements of segregation, testing, etc. Normal programming and ETL development can be handled this way.
This is for the business type of information, as the technical building parts (e.g. SAS) have their own approach. This can be confusing, as SAS gets labeled as a business application :<(
For self-service BI and the analytics field this results in another problem, as they need the real production data and are also developing new reports/views.
With that requirement it can make sense to have two segregated "production" environments: one where the development on production data is done and one where the approved results are placed.
c/ Multiple SAS environments. Let us keep that to environments, as the multi-tier aspect (metadata, web/mid, data, desktop) is used within an environment/level.
You have the following:
-i- The infrastructure environment, where the infra tools (including SAS) and the layout structure for the business data/code are built. This can be done as a proof of concept.
This one will serve as the basic image for all other machines/environments.
It will serve for all technical aspects, including rolling out fixes.
-d- The Development environment for the business (code/data)
-t- The Test environment (system-integration) of what is delivered by developers
As these two environments are often used within the same project team, it is possible to share the same machine, but the data/code and metadata must be kept separate.
-a- User acceptance testing is for validating bigger impacts before all of this is deployed to operations (p). Often a dedicated machine is required, as one aspect can be performance validation.
-p- This is the operational business environment.
As previously indicated, it can consist of two (or more) environments.
It all depends on the maturity level of ICT and on the business requirements/policies.
Thank you, Jaap, for sharing your vision. I agree that I did not address many aspects of the SAS BI environment. I presented only a simplified bird's-eye view intended for novice BI users, and I welcome detailed comments such as yours. Ideally, this post should serve as a seed for growing a broader discussion. We would like to hear our BI customers' views, pains and stories.
Your blog will be very helpful for learning SAS administration. Kindly post a step-by-step process for implementing a SAS EBI environment.
Many Thanks for your kind help.
Thank you, Shafeeque, for your comment. In this post, I just scratched the surface of SAS BI implementation. I don't think, though, that this blog is the right format for a step-by-step process of implementing a SAS EBI environment. There are many resources on this topic, for example SAS® 9.4 Intelligence Platform: Installation and Configuration Guide.