Maps, neural nets and more at the Interface Symposium

The 2011 Interface symposium brought together computational scientists, statisticians and mathematicians for three days of meetings and technical sessions on the interface between computing science and statistics. This year's symposium was held at the new Building C conference center at SAS world headquarters June 1-3. Several SAS and JMP developers were involved in sessions throughout the event.

Overcoming mapping challenges

Hoisting a large inflatable globe Atlas-style, Xan Gregg, Senior Manager of Software Development for JMP, pointed out one of the unique challenges associated with using maps to show geographic data. "The earth is round; computer screens are flat," he said. Gregg was one of several SAS and JMP developers to discuss maps, nets, optimization and profiling at Interface 2011.

If you treat the surface of the globe as a rectangle, as a lot of mapping software does, the relative areas are distorted: the polar extremities look larger and the equatorial regions look smaller. To illustrate this, Gregg compared Greenland to Mexico: Greenland looks much bigger on a flat, unprojected map but in reality is about the same size as Mexico, as he pointed out on the globe.
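The size of that distortion can be estimated with a little trigonometry. The sketch below assumes the simplest "unprojected" map, where longitude and latitude are plotted directly as x and y; on such a map, area at a given latitude is inflated by roughly 1 / cos(latitude). The central latitudes used for Greenland and Mexico are rough, illustrative values and are not from Gregg's talk.

```python
import math

def area_inflation(lat_deg: float) -> float:
    """Factor by which a plain longitude/latitude map inflates area at a latitude."""
    return 1.0 / math.cos(math.radians(lat_deg))

# Rough central latitudes, assumed for illustration only.
greenland_lat = 72.0
mexico_lat = 23.0

greenland_factor = area_inflation(greenland_lat)
mexico_factor = area_inflation(mexico_lat)

print(f"Greenland is inflated about {greenland_factor:.1f}x")
print(f"Mexico is inflated about {mexico_factor:.1f}x")
```

Under these assumptions Greenland is stretched roughly three times as much as Mexico, which is why two regions of similar true size look so different on the flat map.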

Another mapping challenge: area size versus importance. Gregg recalled the maps shown throughout the 2008 US presidential election, where each state was shaded red or blue based on the expected or actual electoral vote. "Those maps distort the data because the area of red outweighs the area in blue," said Gregg. "The state of Montana is bigger than the state of New Jersey in size, but New Jersey has a much greater weight in terms of voting and population." Gregg showed alternative maps created with the pre-release version of JMP to demonstrate a better way of viewing the results.

Executive Vice President of JMP and co-founder of SAS John Sall opened his segment by showing a contour map of Longs Peak, Colorado, to help demonstrate how variance in inputs (e.g., rain or wind) could impact the outcome of an attempt to parachute in at the highest elevation. The best answer, Sall said, might not lie in aiming for the highest peak, but for the highest flat area where a reasonably good landing could be assured. Engineers face this same dilemma in wanting their processes to be robust, he said. "The goal is to make a good product even if there is variance."
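The parachute analogy can be sketched as a small simulation. Everything here is hypothetical: the one-dimensional terrain, the wind drift modeled as Gaussian noise, and the function names are assumptions made for illustration, not anything Sall showed.

```python
import math
import random

def elevation(x: float) -> float:
    """Hypothetical 1-D terrain: a narrow tall peak at x=2, a broad lower plateau near x=7."""
    peak = 10.0 * math.exp(-((x - 2.0) ** 2) / (2 * 0.2 ** 2))
    plateau = 8.0 * math.exp(-((x - 7.0) ** 2) / (2 * 2.0 ** 2))
    return peak + plateau

def expected_elevation(target: float, drift_sd: float, n: int = 20000, seed: int = 0) -> float:
    """Average landing elevation when wind shifts the landing point by Gaussian noise."""
    rng = random.Random(seed)
    total = sum(elevation(target + rng.gauss(0.0, drift_sd)) for _ in range(n))
    return total / n

# With no drift, the peak is the better target.
print(elevation(2.0), elevation(7.0))

# With drift, aiming at the plateau gives the higher average landing.
peak_mean = expected_elevation(2.0, drift_sd=1.0)
plateau_mean = expected_elevation(7.0, drift_sd=1.0)
print(peak_mean, plateau_mean)
```

The narrow peak wins only if the landing is exact; once the input varies, the broad plateau gives the better expected result, which is the robustness trade-off described above.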

The benefits of neural nets

Christopher Gotwalt, JMP Software Development Director, shared his belief in neural networks as a flexible modeling tool, particularly when working with large data sets. Gotwalt used two simulation study examples – building the optimal heat exchanger and finding a pocket of oxygen depletion inside a NASA rocket – to help illustrate how computer experiments can provide an inexpensive alternative when time, expensive materials and other obstacles prevent physical experimentation. "Neural nets scale much better to large data sets, categorical inputs are handled naturally and they place lower demands on the design of the simulation study," he said.

Interactive graphics today and yesterday

Bradley Jones, JMP Principal Research Fellow, made an impression with a PowerPoint presentation he first used 20 years ago and stumbled upon recently when cleaning his office for an upcoming campus move. "I called this an interpret graph when I first came up with it," Jones said. The same visualization is still in use today as the Profiler in JMP, but today's interactive graphics now allow modelers to explore multi-dimensional and multivariate functions. "We're limited by our [human] 'software and hardware' to be able to visualize things in three dimensions. We're not so good beyond three dimensions." Interactive graphics, he said, give us a better view by allowing us to look at relationships dynamically, and with multiple variants, rather than in static fashion. "So this is 20 years ago, this is today," he said. "Life has gotten better for us. We have a lot more control over our graphs."

Social network analysis a popular topic

Several presenters discussed social network analysis, including Director of Research and Development Jin-Whan Jung. "Analyzing and graphing by individuals are no longer enough," he said. "Data are linked so you can capture network information." The problem, however, is that linked data create big data, and the issue becomes how to make those big pieces small, he said.

Dominic Jann, with SAS Solutions OnDemand, pointed to his own name as an example of how linked data can grow exponentially, noting that it is often misspelled in company databases. Cleansing and filtering are important steps in managing big data.

Social network analysis is becoming more important to marketing functions, according to SAS Technical Student Leslie Sall. Marketers in the telecommunications field, for instance, are interested in the information to help with segmentation, retention, cross-sell/up-sell/viral product adoption and acquisition. "Community structure can be studied to determine levels of 'betweenness' and 'closeness' – who has the greatest influence and who can be targeted for special offers to create the greatest impact. For example, if I get an iPhone, it may be more likely that my brother will get an iPhone," she said.
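For readers unfamiliar with these two measures, here is a minimal sketch using only the Python standard library. The friendship network is made up, and the betweenness calculation uses Brandes' accumulation for an unweighted, undirected graph; it illustrates the general concepts, not the SAS tooling discussed in the session.

```python
from collections import deque

# Hypothetical friendship network: two tight groups joined through "dana".
graph = {
    "ann": {"bob", "cal", "dana"},
    "bob": {"ann", "cal"},
    "cal": {"ann", "bob"},
    "dana": {"ann", "eve"},
    "eve": {"dana", "fay", "gus"},
    "fay": {"eve", "gus"},
    "gus": {"eve", "fay"},
}

def bfs(graph, source):
    """Return hop distances, shortest-path counts, predecessors and visit order."""
    dist = {source: 0}
    paths = {node: 0 for node in graph}
    paths[source] = 1
    preds = {node: [] for node in graph}
    order = []
    queue = deque([source])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
            if dist[w] == dist[v] + 1:
                paths[w] += paths[v]
                preds[w].append(v)
    return dist, paths, preds, order

def closeness(graph):
    """(n - 1) divided by total hop distance to all other nodes (connected graph assumed)."""
    result = {}
    for node in graph:
        dist, _, _, _ = bfs(graph, node)
        result[node] = (len(graph) - 1) / sum(dist.values())
    return result

def betweenness(graph):
    """Unnormalized betweenness: how many shortest paths pass through each node."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        _, paths, preds, order = bfs(graph, source)
        delta = {node: 0.0 for node in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1.0 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {node: value / 2.0 for node, value in score.items()}

between = betweenness(graph)
close = closeness(graph)
print(max(between, key=between.get), max(close, key=close.get))
```

In this toy network "dana" scores highest on both measures because every path between the two groups runs through that one person, which is the kind of individual a marketer might single out for a special offer.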

SAS Software Development Manager Dan Kelly discussed the ways social network analysis is contributing to fraud detection in areas such as web account abuse and identity theft, social welfare fraud, banking fraud, and insurance and retail warranty service claims. "In days of old, my risk score would be a function of my transaction history – the number of times I did something or the average amount of dollars I received," he said. "Once you apply the network idea, the question becomes: How do I roll up that information to the group to which I am connected? How does it compare?"
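One simple way to picture the "roll up" Kelly describes is to compare each account's own score with the average score of the accounts it is linked to. The sketch below is only an illustration: the scores, the links and the choice of a plain average are all assumptions, not the method used in SAS products.

```python
from statistics import mean

# Hypothetical risk scores (0 to 1) based on each account's own transaction history.
own_score = {"a": 0.10, "b": 0.15, "c": 0.90, "d": 0.85, "e": 0.20}

# Hypothetical links between accounts (shared address, phone, device and so on).
links = {
    "a": {"b"},
    "b": {"a"},
    "c": {"d", "e"},
    "d": {"c", "e"},
    "e": {"c", "d"},
}

def neighborhood_score(account: str) -> float:
    """Average own-score of the accounts directly linked to this one."""
    neighbors = links[account]
    return mean(own_score[n] for n in neighbors) if neighbors else 0.0

for account in sorted(own_score):
    print(account, own_score[account], round(neighborhood_score(account), 3))
```

Account "e" looks low-risk on its own history, but its connections are high-risk, so the network view flags it for a closer look where the individual view would not.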

SAS Senior Solutions Architect Barry deVille highlighted some case studies of how telecommunications companies are using social network analysis to explore different types of data to more effectively market to new customers and keep existing ones. Companies have historically examined "behavioral data," such as the number of calls or usage statistics from month to month, deVille said, when trying to cultivate new customers or prevent losing existing ones. "Now what we've seen in the last couple of years is a drive to looking at reciprocal relations between communities of users and policies with respect to relating to those users," he said.

Data mining for insurance

In a session on how data mining impacts business knowledge, SAS Research Statistician Developer Billie Anderson gave a presentation on how insurance companies use generalized linear models to predict insurance claims and determine policy rates for customers. Anderson outlined some of the statistical issues surrounding two of the primary models used in the insurance business – count models and pure premium models. Anderson also said a rate-making node will be included in the next release of SAS Enterprise Miner, scheduled for later this year.
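The post does not spell out the models, but the quantities they target follow a textbook relationship: pure premium (expected loss per unit of exposure) equals claim frequency times average claim severity. The sketch below shows that arithmetic on made-up portfolio numbers; it is not a generalized linear model and not Anderson's implementation.

```python
def claim_frequency(claim_count: int, exposure_years: float) -> float:
    """Expected number of claims per policy-year (what a count model predicts)."""
    return claim_count / exposure_years

def claim_severity(total_loss: float, claim_count: int) -> float:
    """Average cost per claim."""
    return total_loss / claim_count

def pure_premium(total_loss: float, exposure_years: float) -> float:
    """Expected loss per policy-year, before expenses and profit loading."""
    return total_loss / exposure_years

# Hypothetical portfolio: 50 claims over 1,000 policy-years costing 250,000 in total.
claims, exposure, losses = 50, 1000.0, 250_000.0

frequency = claim_frequency(claims, exposure)
severity = claim_severity(losses, claims)
premium = pure_premium(losses, exposure)

print(frequency, severity, premium)
```

Multiplying the frequency by the severity recovers the pure premium, so a count model combined with a severity estimate and a direct pure premium model are two routes to the same rate.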

Arati Bechtel, JMP Communications Specialist, and Chad Austin, SAS Communications Specialist, also contributed to this post.

About Author

Becky Graebe

Director, Communications

In addition to traditional employee communication efforts at SAS, Becky Graebe oversees an award-winning global intranet and a variety of enterprise social media channels. Her goal is to create a working environment where SAS employees around the world feel connected and inspired to share fresh ideas, solutions and expertise with colleagues and customers. Having studied at Southern Methodist University and earned her degree from Stetson University, she now serves on the Employee Communications Section board for the National Public Relations Society of America, is an active member of Triangle Women in Communications, and volunteers with Citizen Schools and the Wake County Support Circle Program.
