I've been a longtime fan of the WNYC radio show Radiolab. Hosts Robert Krulwich and Jad Abumrad have interviewed famed biologist E.O. Wilson a few times. Listening to him describe his study of ants and how they make tracks and trails is a great story of perseverance and attention to detail. E.O. Wilson had a bit of a head start on us, but here at SAS, we're pretty good at tracking bugs, too.
No one really likes to admit it, but even SAS software has defects, although we do our best to find them all during development. So, when a developer or a tester (or even the occasional manager, pointy-haired or not) finds a bug, they enter it into our bug tracking system. Once the bugs are there, we, well, track them. This is the story of our bug tracker, DEFECTS.
The image to the right shows a small region of our DEFECTS web client that we use for entering new defects. Here, I'm selecting the defect's component JAVA.PROCESS from several alternatives which match what I've typed so far, the input "JAVA.P". The system uses Ajax to provide this incremental completion. The blue asterisks mark required (and predictable) defect description fields. The defect entry web client is implemented using Google Web Toolkit. Click the image for a larger screenshot showing more of the current user interface. In a future entry, I'll discuss the evolution of DEFECTS to show how it came to its current form. (DEFECTS has evolved faster than any species of Formicidae.)
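To give a feel for the incremental completion described above, here is a minimal sketch of the server-side lookup in Python. It is not the DEFECTS implementation (which uses Ajax and Google Web Toolkit on the client); the function name `complete` and every component name other than JAVA.PROCESS are made up for illustration.

```python
from bisect import bisect_left

# Hypothetical component list; only JAVA.PROCESS appears in the post.
COMPONENTS = sorted(["JAVA.PARSER", "JAVA.PLOT", "JAVA.PROCESS", "JAVA.UI"])


def complete(prefix, sorted_names, limit=10):
    """Return up to `limit` names from a sorted list that start with `prefix`."""
    # Binary search finds the first name that is >= the prefix.
    start = bisect_left(sorted_names, prefix)
    matches = []
    for name in sorted_names[start:]:
        # Names sharing a prefix are contiguous in a sorted list,
        # so we can stop at the first non-match.
        if not name.startswith(prefix) or len(matches) >= limit:
            break
        matches.append(name)
    return matches


if __name__ == "__main__":
    print(complete("JAVA.P", COMPONENTS))
```

Each keystroke would send the current input to a lookup like this one, and the client would show the returned names as the list of alternatives.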
The second half of the defects system is the reporting side. We have a number of clients which provide varying degrees of power for tracking defects. The broadest (imagine a young E.O. Wilson looking at ants with his first magnifying glass) lets users perform plain text searches of a repository of HTML-formatted defects via a Google appliance. Moving up to microscopes, users can search for defects by entering product names or specific lower-level component names. The scanning electron microscope tracker lets users enter arbitrary SQL WHERE clauses or full SQL queries to find very precisely specified sets of defects.
Not your average spay/neuter program
Once we have tracked the bugs, the next step is to fix them. No, this is not like fixing a stray dog; I mean repairing the underlying software faults.
The DEFECTS tool plays a crucial role in this process. All code for SAS products is stored in our source management systems; we have several CVS servers for this purpose. Developers implement fixes or new features on their workstations, then test them. When they are confident their changes are correct, they return to the DEFECTS system and change the defect's status to REQUEST. This sends a signal (email) to the testing contacts associated with the software component and defect. The quality assurance team and management review the request, and if they are satisfied that sufficient tests have been run and that the change will not have an adverse effect elsewhere, they change the status to APPROVE. The defect's unique identifier may now be used as a FIXID, an identifier for the fix. Before accepting code changes, our CVS systems validate each commit request. Every request requires a comment that matches the checkin request: the comment must begin with "FIXID <FIXID>:". If this prefix is missing or the defect has not reached APPROVE status, CVS rejects the commit. Once the change is committed, the developer and tester work to validate the fix against the full system software build, and when done, mark the defect as FIXED.
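The commit check described above can be sketched in a few lines of Python. This is only an illustration of the rule, not our actual CVS hook: the function name `validate_commit`, the exact regular expression, the defect identifiers, and the in-memory status dictionary standing in for a DEFECTS lookup are all assumptions.

```python
import re

# Assumed pattern: the post only says the comment must begin with "FIXID <FIXID>:".
FIXID_PREFIX = re.compile(r"^FIXID\s+(?P<fixid>[^\s:]+):")


def validate_commit(comment, defect_status):
    """Return (True, fixid) if the commit may proceed, else (False, reason).

    `defect_status` is a hypothetical stand-in for a DEFECTS lookup:
    a dict mapping a defect identifier to its current status.
    """
    match = FIXID_PREFIX.match(comment)
    if match is None:
        return False, "comment must begin with 'FIXID <FIXID>:'"
    fixid = match.group("fixid")
    status = defect_status.get(fixid)
    if status != "APPROVE":
        return False, f"defect {fixid} is not approved (status: {status})"
    return True, fixid


if __name__ == "__main__":
    statuses = {"S0001": "APPROVE", "S0002": "REQUEST"}  # made-up identifiers
    print(validate_commit("FIXID S0001: handle empty input", statuses))
    print(validate_commit("FIXID S0002: tweak logging", statuses))
    print(validate_commit("quick fix", statuses))
```

In a real deployment the status lookup would query the defects database rather than a dictionary, and a rejection would cause the source management system to refuse the commit.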
Finally, we can query our system and list all the source files that have been changed under a given FIXID, which is a very useful code auditing feature. (Sometimes, this feels like an overly heavyweight process. In the future, I'll discuss Agile software development at SAS.)
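As a rough illustration of that audit query, the sketch below uses Python's built-in sqlite3 module in place of our SAS data sets. The `commits` table, its `fixid` and `path` columns, and the sample rows are hypothetical; the post does not describe the real schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical commits table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE commits (fixid TEXT, path TEXT)")
    conn.executemany(
        "INSERT INTO commits VALUES (?, ?)",
        [
            ("S0001", "src/process/Launcher.java"),
            ("S0001", "src/process/Runner.java"),
            ("S0001", "src/process/Launcher.java"),  # same file committed twice
            ("S0002", "src/ui/Panel.java"),
        ],
    )
    return conn


def files_changed_for(conn, fixid):
    """List the distinct source files committed under one FIXID."""
    rows = conn.execute(
        "SELECT DISTINCT path FROM commits WHERE fixid = ? ORDER BY path",
        (fixid,),
    )
    return [path for (path,) in rows]


if __name__ == "__main__":
    conn = build_demo_db()
    print(files_changed_for(conn, "S0001"))
```

Because every commit comment carries its FIXID, a record like this can be built from the commit log alone.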
A critical tool
An R&D director once told me that introducing new tools into SAS R&D is hard because so much of our workflow and information management is focused on two key tools: email and DEFECTS. In practice, DEFECTS is used for tracking more than bugs. All new features for our products go into the DEFECTS system, as do redesign and refactoring work to make the code perform better or to render it more maintainable or reusable. DEFECTS is also used to track change requests that have potentially wide impact and the related work to support development of SAS' myriad products, such as creating new branches in our source management system, requesting new installers, or adding new third-party binaries in support of our Java and Adobe Flex-based product development.
Almost all changes to our extensive code base and to our development tools and processes can be, and are, tracked with DEFECTS. That's a lot of work being done by the 2,000 or so people doing product development, testing, installation, and deployment. Next, I'll explore this aspect of tracking bugs.
By the numbers
My colleague Chris Hemedinger (The SAS Dummy) mentioned how we "eat our own dogfood": that is, the DEFECTS system, primarily written with SAS, is a great vehicle for validating the SAS platform.
Here's some information from Dick Wiersma, the development manager. On a typical weekday:
- the web server handles about 850,000 HTTP requests.
- the SAS/SHARE server writes more than 7,000,000 lines to the log file (with verbose logging enabled).
- SAS/SHARE processes about 375,000 SQL queries from SAS/IntrNet htmSQL web pages. These are for various reports. We have a library of stored reports for listing defects by product, component, developer, department, or any of a wide number of other categorization fields.
- the server handles about 150,000 separate connections from client applications.
To meet this demand, the DEFECTS system's SAS/SHARE server uses SASFILE statements to load the entire database, about 26 gigabytes, into memory. SAS is run with a MEMSIZE value of 40GB. The tables in the DEFECTS database vary in size from a handful of records to 55 million records. Dick shared some of his team's experience tuning the SAS/SHARE servers in "Performance tuning for SAS/SHARE servers".
The Open/Closed Principle
Software developers may be aware of the Open/Closed Principle, first defined by Bertrand Meyer and later refined by Robert C. Martin. This useful design principle states that software systems should be open for extension or reimplementation, but closed for modification.
An inversion of this principle applies to our DEFECTS system, to our great benefit, but also with some costs. The big benefit of our defects system is that it is open: the SAS data sets which contain the defects data, their schema, and even the server where the data is stored are all open. There are ten key tables which contain the essential data about stored defects, plus several auxiliary tables which serve as lookup tables for input validation. For example, one table lists the components, including JAVA.PROCESS (see the defect entry screen shown above). We can write new tools to query the data and generate reports. This has created an ecosystem of productivity tools: managers can easily find and track defects that they care about. As an example, we have a web tool that lets users create ad-hoc reports (using SAS htmSQL) by choosing key fields to query on (such as product, department, or employee), and users can save those queries for later execution.
Of course, this degree of openness comes at a cost. It is difficult to change the schema because doing so could break tools and reports that so many people depend on. For example, one of the key fields in the defects data set is the text field which holds the name of the software component. Since DEFECTS was designed when the SAS system only supported eight-character variable names, this variable has the name "CMPONENT" instead of "COMPONENT", and we must live with this awkwardness today, even though the SAS data set format has supported longer variable names for many releases. However, this is a cost we are willing to bear because our system is an internal one, and the benefits of openness outweigh the benefits of hiding the schema behind a higher-level abstraction layer.
Designing such a robust, complete, and consistent framework and API is difficult and intricate work, with many benefits, but sometimes we don't have the luxury of designing that way, and forcing all users to use such an API might just make the system harder to work with and less open.
This openness also partially explains why we do not use a commercial or open source software product like Bugzilla, Trac, or JIRA. While these are fine tools, switching to them would be very disruptive: they would be incompatible with so many of our existing tools and reports, and they would not be as open or as extensible. They would also be a different brand of dogfood, so to speak, and we know it's hard to teach an old dog to eat new dog food: it's a classic case of vendor lock-in.
Trail's end
Tracking defects is essential to SAS' quality imperative, and it is an integral part of every developer and tester's experience. I hope this revue has given you a sense of what bug tracking at SAS is like. I think E.O. Wilson might be proud.
Watch the Peer Revue playbill for an upcoming discourse on the evolution of DEFECTS.
