What is the best way to organize your SAS work in a SAS Enterprise Guide project? There are no project templates or enforced structure, really, but isn't there a best practice?
I don't have a single prescription for the best project organization. I believe that it depends on the nature of the work you're doing, how you're sharing projects among team members, and on your own personal preferences and working style. SAS tools are often deliberately flexible, which means the onus is on you to keep the chaos under control.
I can offer some guiding principles, though. These guidelines center on one theme: bring some discipline into your projects. Don't let them get all (and this is a technical term) higgledy-piggledy. Here are my top 10 tips.
1. Arrange process flows (when you have multiple flows in a project) in the order that you expect to run them
Having the process flows arranged in logical order makes it easy to see how the project will be run, and can provide an at-a-glance overview of the project organization.
Note that you don't have to create the flows in the order that they will be run. During the design process, you might begin with sample data to design an analysis and output reports, and then go back later to refine the data import process, which naturally would run first. That's okay. You can always rearrange the flows later by dragging each process flow icon in the Project Tree window to the position where it should be.
2. Assign concise, meaningful names to each of your process flows.
"Process Flow" is not a very descriptive label, so make use of the Rename feature to assign a better name. (Right-click on the name in the Project Tree window and select Rename, or press F2 to go into "rename" mode.)
Some people like to use numbers to indicate the sequence. For example, a project might contain process flows with names like: "1. Import Data", "2. Data description", "3. Basic analysis", and so on. For stable projects, numbered flows can work well. For projects that undergo frequent changes, the practice of numbering the process flows can cause additional maintenance when you need to insert a new process flow between two others, resulting in a renumbering exercise.
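If you do number your flows, one way to soften the renumbering chore is to leave gaps in the sequence so that a new flow can slot in between two existing ones. Here is a minimal sketch of that naming arithmetic in Python. It is only an illustration: nothing here runs inside SAS Enterprise Guide, the helper names numbered_names and name_between are hypothetical, and the flow labels are made up. Zero-padding the prefix is an assumption that helps wherever names are sorted as text.

```python
def numbered_names(labels, step=10, width=2):
    """Prefix each label with a zero-padded sequence number, leaving gaps."""
    return [f"{i * step:0{width}d}_{label}" for i, label in enumerate(labels)]


def name_between(before, after, label, width=2):
    """Build a name whose prefix falls between two existing numbered names."""
    lo = int(before.split("_", 1)[0])
    hi = int(after.split("_", 1)[0])
    if hi - lo < 2:
        # No room left: at this point a renumbering pass is unavoidable.
        raise ValueError("no gap left between the two flows; renumber instead")
    return f"{(lo + hi) // 2:0{width}d}_{label}"


# Hypothetical flow labels, numbered in steps of 10.
flows = numbered_names(["setup", "importData", "prepareForReports"])
print(flows)  # ['00_setup', '10_importData', '20_prepareForReports']

# A new flow added later fits into the gap without touching the others.
new_flow = name_between(flows[1], flows[2], "cleanData")
print(new_flow)  # 15_cleanData

# Sorted as plain text, the names still come out in run order.
print(sorted(flows + [new_flow]))
```

The gaps only postpone the renumbering exercise; once two neighbors are adjacent numbers, you are back to renaming flows by hand.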
3. Don't make the flows too big – if a flow gets really long, see if it makes sense to break it up into multiple flows
If you're a programmer, this concept should make sense. Think of your project as a big SAS program, and the process flows allow you to organize the program into subroutines for easier maintenance.
You can easily move items from one flow to another. Simply right-click on an item to move, and select Move to. To break up an existing process flow, first select File->New->Process Flow to create an empty flow. Then rename the new flow as appropriate, and begin moving items from the too-complex flow into it.
4. Use Note objects, and link them to the nodes they describe
You can never have too much documentation. If your project contains SAS programs, you would use SAS code comments to help make your program more readable and maintainable (unless of course, you are deliberately working to avoid that goal). To document a process flow, add a note with File->New->Note. The note object allows you to add plain text descriptions to your flow. You should also rename the note item to provide a meaningful label within the flow.
To make it obvious which items the note describes, link the note to other items in the flow. (Use right-click->Link To, or "draw" the link by clicking near the border of the note icon, when the cursor appears as cross-hairs, and drag the link arrow to the target item.)
For richer documentation, you can add external documents to your process flow, such as PDF or Microsoft Word files. To add these, select File->Open->Other and browse to the document file to add. Note that this does not embed the document file within your project; it merely adds a reference, making the document easy to access. This means that the document must be present in the referenced location or else you won't be able to open it within the project.
5. Rename tasks, queries, programs using descriptive (but concise) names
If a process flow is like a sentence (as in "language", not as in "prison"), then a task is like a verb, while a result is like a noun. But the default names that SAS Enterprise Guide assigns for tasks don't always describe the actions well. For example, you might use the query builder to calculate the sum of a variable across categories in a data set. But the default label on the task might be something like "Query for DC.PROJECTS". Well, that could be anything.
Rename the query builder node to reflect the action, such as "Count Projects per State". You can still see that it's a query task from the icon that's used within the flow. And if you don't recognize the icon on sight, you can hover the mouse cursor over the icon to see a tooltip that reminds you what type of task is used for the action.
To rename a node, select the node in the flow and press F2 (to go into "rename" mode), or right-click and select Rename.
6. Rename default output data sets from tasks using shorter, meaningful names
Queries and tasks that create output data sets will use auto-generated names, by default. The names are often cumbersome and generic, built by combining the task name with the input data name, such as WORK.QUERY_FOR_MART_VENDORSPEND. Use the options within the task to select a different name that is shorter and more descriptive.
These names aren't used just for display, but are the physical output data sets that are created when the task runs. By using shorter names, you can make it easier to find the output data sets in the file dialog later, or to refer to them within SAS programs.
Note: You cannot rename the output data sets within the process flow. To rename an output data set, you must modify the task that created it, and then re-run the task to refresh the name. The output data set names are usually controlled in the "results" options for the particular task.
7. Turn on Auto Arrange for a good first pass at layout, then turn it off for manual refinements
For small flows, the Auto Arrange feature of the process flow will present a nice, readable layout. But as you add multiple "branches" to your flow, you might notice that the process flow diagram grows vertically with lots of white space in between branches.
You can adjust this by turning off Auto Arrange. Right-click in the process flow and select Auto Arrange, "unchecking" the option from on to off. Then you can select any of the items within the flow and drag them to where you want them to be on the canvas.
If things get crazy, turn Auto Arrange back on to make your nodes all "snap to". But take note: the Auto Arrange setting affects all of the flows within your project, so if you toggle the setting to tweak one flow, you may find that your other flows are also "rearranged" automatically.
8. Change the background color of process flows to make them distinct
Consider using this feature to make it easy to see which flow you're on. To change the color of a process flow, right-click on an empty spot on the flow canvas and select Background color.
9. Configure the first process flow to run automatically ("autoexec")
You can use the Autoexec process flow to enforce initialization of certain libraries, macros, and more. Read more about the Autoexec process flow here.
10. Remember: a project is like a recipe.
It tells you (and SAS Enterprise Guide and SAS) what the ingredients are and how to combine them, but it doesn't always contain the ingredients themselves. The project refers to external pieces, such as data sets, .SAS files (programs) and data files to import.
Keeping your "recipe" clean and organized will increase the chances that you can successfully "cook" with it repeatedly, and that colleagues can use it to repeat your results.
Here are a few other resources that can help you to learn how to leverage the "project" aspect of SAS Enterprise Guide:
- SAS Talks episode: SAS Enterprise Guide for Programmers: What's in it for Me?
- SAS Global Forum paper: Find Out What You're Missing: Programming with SAS Enterprise Guide
- SAS Global Forum paper: SAS Enterprise Guide 4.3: Finally a Programmer's Tool (by Marje Fecht and Rupinder Dhillon)
- Several papers by Andy Ravenna (SAS Education), linked from his sasCommunity.org page
Chris, many thanks for the good tips. Your blog is a must-read for every Enterprise Guide user.
For tip 1, I recommend starting process flow names with numbers like 00_setup, 10_importData, 20_prepareForReports etc. Just in case one has a shaky hand and rearranges the process flows by accident.
I also add a 99_tests flow, where I put stuff for testing that is not really needed in any other process flow.
I am doing something like this too. Best thing is, for large projects, if one numbers them correctly, with one system for the entire project, one can easily make an ordered list that runs all of the processes in the order that they have been numbered, project wide. This is possible because as you create an ordered list, you get to sort the list of node-objects in the Add/Select dialog which also allows multi-selection. It kind-of makes the links inside the process flows less necessary, but they are still important to show dependency and order on a documentative level.
I also use a numbering system, not usually as fleshed out as the one listed by Bruno, above. But as my project grows and various flows develop and finalize, I start prepending numbers to the descriptive labels for each flow. Nothing demonstrates intended order more quickly (to me) than numerical sequence. If I need to insert, then I go with decimal places.
Thanks for the additional tips you shared.
I have about 100 EG projects. Is there an easy, fast way to export all the code in the projects?
Unfortunately, there is no way to automate the "Export All Code" feature from a script using SAS Enterprise Guide automation.
It is possible to use the scripting language to enumerate all of the task and code items within a project, and then export each one individually (using the SaveAs method for each one). But that does not take into account the sequence of items as shown in the process flow, so it's possible (likely) that the items would be exported out of order.
It would also be possible to create a custom task and use the ISASProject interfaces to automate some of this. There is an example that uses the ISASProject interface here: http://support.sas.com/documentation/onlinedoc/guide/customtasks/. See the "Program Manager" example. Creating a custom task requires .NET programming.
99% of the SAS programmers at my workplace hate EG but are being forced to use it because it is company policy. Most are saying that it slows down their productivity by 20+ percent.
Many organizations have successfully transitioned to EG and maintained a happy SAS user community, while also satisfying their IT governance policies. The key to success is planning and education, involving the users before the transition to understand their work practices, and ensuring that they have everything they need (information and infrastructure) to get the job done.
In addition to programming, EG brings lots of capabilities that aren't possible or easily accomplished in traditional SAS. Educating users about these often helps to ease the transition.
SAS has many resources to help, including training, discussion forums, SAS Talks webinars, and more.
I agree with you that it is about education, and I would say SI and people with your expertise should take responsibility for this (how can companies take on this responsibility when we do not know much about it?). I know you have written many posts on EG, but Chris, what the programmers want is a good EG book FOR programmers. As far as I am aware there is only one book on EG, and it is for SAS users, not for SAS programmers. Why is there no such book!!! Surely those at SI who developed EG must know its capabilities, so why not document them?!!! Unless there is such a book, SAS programmers will continue to be sceptical of EG, even programmers at SI. So I pose this challenge to you and your colleagues: write this book!! We want the real stuff in this book and *NOT* the click, click, click... cosmetic stuff!! I look forward to reading this book soon.
John, thanks for the comments. I'm proud of the way that SAS trainers, authors, and members of the user community have stepped up to deliver good information for SAS programmers who use SAS Enterprise Guide. Here is a sampling of the offerings:
SAS Enterprise Guide for Experienced SAS Programmers (training course)
SAS Programming for Enterprise Guide Users (book by trainer Neil Constable)
SAS Enterprise Guide 4.3: Finally a Programmer's Tool (paper by SAS user Marje Fecht)
Becoming a Better Programmer with SAS Enterprise Guide 4.3 (paper by Andy Ravenna)
New Goodies for the SAS Programmer in SAS Enterprise Guide 4.3 (free webinar by Yours Truly)
Nice work on this blog.
I've been linking programs between projects for a long time in SAS, but is there now a way to link one project to another, similar to linking objects within a project? Have you heard of this upgrade or is there an upgrade in the works?
The company I work for has a limitation of 500 resulting datasets per project, which is sometimes confining. Creating and visually seeing the link between programs would be a really handy upgrade for an already powerful tool.
If anyone has a reply I would greatly appreciate your response.
Dear SAS dummy...
Is there a way to copy one process flow and its contents from one egp file to another?
Sincerely, Misplaced Instruction.
Unfortunately, no. You can copy one task at a time. You can also use Task Templates to store your favorite settings for convenient reuse across projects. (See the SAS Enterprise Guide online help and search for "task templates".) A very useful variation of this is in the Query Builder, where you can build complex queries (and subqueries) and put them in templates.
Using Task Templates does seem right in this case. I exported the code from the original file's process flow and used the Program Analyzer to remake the process. I made it work, only by beating down with trial and error. In the end, copying the code one task at a time was the way to go.
...and I use Query Builder and I think it is very useful too.
Is there any way to capture the information displayed in EG "Project Maintenance" in a SAS table or text file? This would be very useful for creating input/output table listings and other project documentation.
There are two tools that might help. First, check out the Project Reviewer task, which captures the lists of tasks in each process flow and can create a report and SAS data sets with the information.
Second, look into the automation APIs, which allow you to write a script that reports on the contents of an EG project.
Getting SAS code from an Enterprise Guide project (.egp)
I am trying to get the SAS program code out of an Enterprise Guide .egp project. How can I do that? I have used the EG export methods "Export all code in Project" and "Export all code in process flow", but they do not give me the complete SAS program. Some of the code is missing: say the actual code has 10 PROC SQL steps; after exporting I can see only 2 of them.
I also tried a method someone suggested on the internet: change the project's .egp extension to .zip and unzip it, and you will get the code in the folder. I tried this with a sample program and it works, but when I tried it with the actual project, I could not find any code in the unzipped folders.
Does anyone have any idea how I can get the complete SAS program code from the EG project?
I would really appreciate all valuable suggestions.
You could accomplish this with the automation API and scripting. See this sasCommunity article and specifically this VB Script example.
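For the rename-to-.zip approach mentioned in the question above, it can help to look at what the archive actually contains before assuming the code is there. The following is a rough Python sketch, not a supported method: it relies on the question's observation that an .egp file can be opened as a zip archive, and it assumes that any stored program code shows up as entries ending in .sas, which may not hold for every project or task type. The function names list_egp_entries and extract_sas_code are hypothetical helpers, not part of any SAS tool.

```python
import zipfile
from pathlib import Path


def list_egp_entries(egp_path):
    """Return the names of all entries stored in the project archive."""
    # ZipFile does not care about the file extension, so no rename is needed.
    with zipfile.ZipFile(egp_path) as archive:
        return archive.namelist()


def extract_sas_code(egp_path, out_dir):
    """Extract every entry whose name ends in .sas; return the names found."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(egp_path) as archive:
        for name in archive.namelist():
            # Assumption: stored program code appears as *.sas entries.
            if name.lower().endswith(".sas"):
                archive.extract(name, str(out_dir))
                extracted.append(name)
    return extracted
```

Calling list_egp_entries on a project file first shows whether any .sas entries exist at all. If the list comes back without them, as the question describes for the larger project, the code is not stored in that form and the automation API route is the better bet.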
I loved your blog article. Really Cool.
I have an EG project with a number of flows. I assume that it is not possible to reference temporary WORK tables across process flows?
E.g., I import data in PF1 and leave the table in WORK; when I am in PF2, I am unable to reference this table because PF2 does not see it in the WORK library.
I was wondering if this is an issue that can be resolved. My workaround is to write the tables I want from PF1 to a permanent location and then add a PROC DATASETS step to remove them in PF2, but ideally I could just reference the same WORK library across process flows.
You should be able to add your WORK data to separate flows by selecting the data from File->Open or from the Server List (navigate to WORK library, and open the data). Even when the WORK data isn't yet created (because you haven't run initial flows that create it) the reference will remain. Just don't try to open tasks/run the flows that depend on that WORK data before it's actually created.
I understand how to rename queries and even datasets created by these queries (through the query builder) in EG. But what I can't seem to figure out is how to change the name of the output datasets of imports - for example importing an access table or excel worksheet. Say I'm importing an excel sheet that's called "AWP". I go through the import wizard and it doesn't have an option to name the output dataset like query builder does - the actual dataset name becomes "EGTASK.AWP" but in the EG project view it is labeled "Data Imported From AWP.xlsx". I want this label in the project view to just say "AWP". How can this be accomplished?
Hi "Arthur", I see that Casey answered your question on the SAS Support Communities.
Is there a way to change the icon size and label font in a process flow? Icons and text overlap when the Auto Arrange option is active.
No, not that I'm aware of. You can adjust your system resolution and DPI settings, which affects text display in all applications, and that might have an effect that you want. But there isn't a setting specific to the EG application.
Is it possible to change the background colour of an Enterprise Guide Project using code instead of right mouse click and change colour?
I run code for testing and production. It would be good to have a different EG colour when the test libraries are selected etc. Also a good warning that you've selected the wrong mode.
David: unfortunately, changing the process flow color is NOT something that can be coded or automated. It's available on the menus only.
After an update of EG I found my flows jumbled: the HTML outputs were smacked on top of the icons of temp files. Even after a re-run this jumble remains. How can I govern this?
Hi Terje, that sounds like a bug (and a little familiar) -- you might open a track with SAS Tech Support on that.
1.) Is it possible to add a diagram to the template - gallery?
2.) Can I plot regression plots with confidence intervals from the WORK.AOUTAutoRegForecasts? (Fitplot would do the job - but the x-Axis enumerates the observations (where I want time) and the legends are not appropriate ...)
You can add new pre-sets for SAS tasks by using Task Templates. Most tasks have a "Save as Task Template" option, which allows you to expand your task menu with presets. Check the EG online help for "task templates". Regarding the regression plots, you can use ODS OUTPUT to capture data, including parameter estimates and CIs, from most regression procedures. If you want to build your own plots, you can do so with PROC SGPLOT. Rick Wicklin has many tips about this on The DO Loop blog -- search for some of the key terms to find what you need.
Thanks for sharing some great best practise tips to try and make EG life easier.
We've recently moved from v4.1 to 7.1 (don't judge on the use of "classic" EG!). One thing I'm wondering is if you can keep some datasets created during the code run in a static position within the Process Flow, without linking them to another script/output?
You could do this in v4.1 by linking a dataset to another and then removing the link, forever locking it in the manually chosen place within the Process Flow.
This is purely to satisfy my OCD on keeping the visual look of the EGP the same for every run!
I think you might have found a "hidden feature" in EG 4.1 -- a behavior that was not exactly intended. However, you should be able to arrange your flow by turning Auto Arrange off (remember, it affects the entire project). Another trick is to add a note (sticky note style) and add a link between it and the data set node -- that might keep things organized the way you like.
Thanks - I'll try the notes and see how I get on.
Is it possible to include the same program in a process flow more than once? We come from SAS under z/OS and we do this in JCL.
We want to include the same program several times in the same process flow.
Luis, you cannot include the same physical program multiple times in a flow. If the program is stored on the server file system and referenced from your flow, you can include multiple code nodes that reference it with %INCLUDE -- of course.
Also, you can reference the same program multiple times by including it in an Ordered List (File->New->Ordered List). Unfortunately, you can't reference a single program multiple times in the *same* ordered list, but you can define multiple ordered lists in your project, and reference that program in each list you create. This might help you to define the sequence of actions that you want.
I understand how to label objects in the EG process flow diagram, except for one: when I have HTML output from a task, the icon label says simply "HTML - [node name]", where [node name] is the node that generated the output. I'd like to label the output's icon with something more descriptive like "Variable selection report". Furthermore, I'd like to do this without using an ODS statement. That helps a little by using a path name that might be descriptive, but it's not what I want.
Good question, but I'm afraid the answer is no -- EG does not allow you to relabel (and keep the new labels) for automatically generated results, because they are simply overwritten the next time that you refresh. Best thing you can do, maybe, is to add a Note (sticky-note style) as a piece of documentation in your flow.