My last post described my top general business analytics books, those that would appeal to business leaders and analysts alike. This post is a bit more specific, and covers books that will help you to learn for yourself. It is therefore mainly aimed at analysts — but I still hope it will inspire!
It is not designed to replace books on specific issues, such as these SAS publications, but hopefully to start you thinking about some particular areas where you might need to know more about ‘how’.
The Data Warehouse Lifecycle Toolkit – Ralph Kimball, Margy Ross, Warren Thornthwaite, Joy Mundy, Bob Becker
If you are doing analytics professionally, and particularly if you are using it in a production process, you will need a data warehouse or at least a data mart. This book is a basic starter, whether you need a hub that provides a single view of your clients, vendors, and transactions, or a basis for an analytical decision support system. Whichever you want, this book is a classic, and covers three crucial aspects of data warehouse building: ETL development, multi-dimensional modelling, and application layer deployment. Personally, I found this book extremely useful for improving my understanding of, and communication with, data developers and architects. It also helped me to understand the big picture of the whole analytical lifecycle.
While I’m on data warehouses, there are two other important books which I ought to mention, too:
- Kimball R., Ross M., “The Data Warehouse Toolkit. The Definitive Guide to Dimensional Modeling”.
- Inmon B., Hackathorn R., “Using the Data Warehouse”.
These are also useful introductions to the world of data warehouses, and well worth a look.
A Guide to Econometrics – Peter Kennedy
I had read a lot of econometrics books full of mathematical formulas before I read this one about applied econometrics, and never really ‘got’ it. I can honestly say that this book changed my life, because I finally started to understand what applied econometrics is all about. This book will introduce you to the details of econometrics by telling stories, not providing formulas, although there are a few really essential ones in there too. I particularly recommend this book to quantitative analysts and students, because of its very pragmatic and applied approach.
Peter Kennedy has also published a very popular research paper entitled: “Sinning in the Basement: What are the Rules? The Ten Commandments of Applied Econometrics”, in which he provides analysts with ten best-practice rules of applied econometrics. These rules have inspired one of my SAS colleagues to write a blog series on the top ten econometricians’ sins and how to avoid them.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction – Trevor Hastie, Robert Tibshirani, Jerome Friedman
If you want to call yourself a data scientist, you have to read this book. It is an absolute classic for anyone in a predictive analytics, data mining or statistical discipline. It’s heavy on statistical formulas, but then its target group is graduate students. Statistical learning itself never became a buzzword, unlike machine learning. I see machine learning as having more of an engineering and computer science focus than statistical learning, which concentrates on statistical properties such as predictive power and the bias and variance of estimators.
I find the idea of statistical learning more personally appealing, because it answers the question “why?” rather than “how?” (that is, how to implement a method in a programming language), which is more the domain of machine learning. In practice, of course, you need to know both “why?” and “how?”, and this book, which is freely available online, can help you to understand both.
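To make the bias and variance of an estimator a little more concrete, here is a minimal sketch. It is not taken from the book. The sample size, number of trials, and the choice of a standard normal distribution are all assumptions made purely for illustration. It compares two estimators of a population variance: one that divides by n and one that divides by n - 1.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

# Illustrative assumptions: standard normal data, small samples, many trials.
TRUE_VARIANCE = 1.0
SAMPLE_SIZE = 5
N_TRIALS = 20_000

biased, unbiased = [], []
for _ in range(N_TRIALS):
    sample = [random.gauss(0.0, 1.0) for _ in range(SAMPLE_SIZE)]
    biased.append(statistics.pvariance(sample))   # divides by n
    unbiased.append(statistics.variance(sample))  # divides by n - 1

for name, estimates in (("divide by n", biased), ("divide by n - 1", unbiased)):
    # Bias: how far the average estimate sits from the true value.
    bias = statistics.fmean(estimates) - TRUE_VARIANCE
    # Variance: how much the estimate jumps around from sample to sample.
    spread = statistics.pvariance(estimates)
    print(f"{name:>16}: bias = {bias:+.3f}, variance = {spread:.3f}")
```

Under these assumptions, the divide-by-n estimator should show a bias of roughly -0.2 but the smaller variance, while the divide-by-(n - 1) estimator should show a bias close to zero and a larger variance. That trade-off is the “why?” behind many of the modelling choices the book discusses.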
Stay in touch for my next blog post. It'll be focused on domain-specific applications of analytics.