For a more up-to-date treatment, see McLachlan's recent book in the Wiley statistics series. However, this book provides valuable explanations of Bayes decision rules and shows pictorially what the boundaries look like for linear and quadratic classifiers. In fact, I borrowed their pictures in Chapter 2 of my book on bootstrap methods.
My reasons for disappointment with this book are as follows:
Given the 27 years that have elapsed since the publication of the first edition, and the immense progress that has taken place in pattern recognition, machine learning, computational learning theory, grammar inference, statistical inference, algorithmic information theory, and related areas, the revisions and additions in the 2000 edition are essentially patchwork. In my opinion, they do not reflect the current understanding of pattern recognition.
A disproportionate number of pages is devoted to topics like density estimation, even though the work of Vapnik and others has established in recent years that, when data are limited, trying to solve the problem of pattern classification through density estimation is rather futile: density estimation is, in a well-defined sense of the term, a much harder problem than classification itself (a short sketch contrasting the two routes appears after this list of reasons). When modern techniques for learning pattern classifiers from limited data sets (e.g., support vector classifiers) are touched on in the book, the treatment is disappointingly superficial and in some cases misleading.
There is virtually no discussion of problems of learning from large, high-dimensional data sets, incremental refinement of classifiers, learning from sequential data, distributed algorithms, and so on. The treatment of non-numeric pattern recognition techniques (e.g., automata and languages) is extremely superficial. There is almost no discussion of essential aspects such as preprocessing and feature extraction techniques for dealing with variable-length, semistructured, or unstructured patterns.
The book makes very little contact with the large body of pattern recognition algorithms, results, and approaches developed by the machine learning community, with the possible exception of decision tree algorithms.
There is little discussion of the extremely important topics of computational complexity and the data requirements of learning algorithms.
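To illustrate the point about density estimation, here is a rough sketch in Python; it is my own illustration, not anything from the book, and the invented data set, the Gaussian plug-in model, and the linear SVM are stand-ins. The generative route must estimate a full density per class (55 covariance parameters each in just ten dimensions) before the Bayes rule can even be applied, whereas a discriminative method fits only the boundary.

    import numpy as np
    from scipy.stats import multivariate_normal
    from sklearn.datasets import make_classification
    from sklearn.svm import LinearSVC

    # An illustrative, invented data set: 40 cases in 10 dimensions.
    X, y = make_classification(n_samples=40, n_features=10, random_state=1)

    # (a) Generative route: estimate a full Gaussian density for each class.
    #     With roughly 20 cases per class in 10 dimensions, each covariance
    #     alone has 55 free parameters -- the density estimate is the hard part.
    models, priors = {}, {}
    for c in (0, 1):
        Xc = X[y == c]
        priors[c] = len(Xc) / len(X)
        models[c] = multivariate_normal(Xc.mean(axis=0),
                                        np.cov(Xc.T) + 1e-3 * np.eye(10))

    def plug_in_bayes(x):
        # Plug the estimated densities into the Bayes rule.
        return max((0, 1), key=lambda c: priors[c] * models[c].pdf(x))

    # (b) Discriminative route: fit the separating hyperplane directly.
    svm = LinearSVC(C=1.0, max_iter=10000).fit(X, y)

    gen_err = np.mean([plug_in_bayes(x) != t for x, t in zip(X, y)])
    svm_err = np.mean(svm.predict(X) != y)
    print(f"plug-in Bayes error: {gen_err:.2f}   linear SVM error: {svm_err:.2f}")

The errors reported are training errors, so the comparison only shows what each route must estimate from the same 40 cases, not which generalizes better.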
On the positive side, the discussion of most topics originally covered in the 1973 edition has been further refined and, in many cases, made more accessible through the addition of illustrative examples and diagrams. Topics such as Bayesian networks receive an intuitive and accessible treatment. The exercises at the end of each chapter seem useful.
Perhaps it is too difficult for any individual or small group of individuals to write a textbook that reflects the state of the art in pattern recognition. Perhaps my expectations of Duda and Hart (based largely on the extraordinary job they did on the 1973 edition of their book) were too high to have a reasonable chance of being met by the 2000 edition. Perhaps I have come to expect more from graduate-level textbooks after having worked as a researcher and educator in this field for over a decade at a major university.
In short, the book fell significantly short of my expectations.
With this in mind, the authors and their new coauthor David Stork set about the task of providing a revision. True to the goals of the original, the authors undertake to describe pattern recognition under a variety of topics, with several available methods covered for each. Important new areas are added, and old topics now deemed less significant are dropped. Advances in statistical computing, and in computing in general, also dictate the topics. So although the authors are the same and the title is almost the same (note that scene analysis is dropped from the title), it is more like an entirely new book on the subject than a revision of the old one. For a revision, I would expect to see mostly the same chapters with the same titles, with only a few new chapters and expansions of old ones.
Although I view this as a new book, that is not necessarily bad. In fact, it may be viewed as a strength: the book maintains the style and clarity of the original that we all loved while representing the state of the art in pattern recognition at the beginning of the 21st century.
The original had some very nice pictures. I liked some of them so much that I used them, with permission, in the section on classification error rate estimation in my bootstrap book. This edition goes much further, with beautiful graphics including many nice three-dimensional color pictures like the one on the cover.
The standard classical material is covered in the first five chapters, with new material included (e.g., the EM algorithm and hidden Markov models in Chapter 3). Chapter 6 covers multilayer neural networks (an area entirely new to this edition). Nonmetric methods, including decision trees and the CART methodology, are covered in Chapter 8. Each chapter has a large number of relevant references and many homework and computer exercises.
Chapter 9 is "Algorithm-Independent Machine Learning," and it includes the wonderful "No Free Lunch" theorem (Theorem 9.1), a discussion of the minimum description length principle, overfitting issues and Occam's razor, bias-variance tradeoffs, resampling methods for estimation and classifier evaluation, and ideas about combining classifiers.
Chapter 10 is on unsupervised learning and clustering. In addition to the traditional techniques covered in the first edition, the authors include the many advances in mixture models.
I was particularly interested in the resampling part of Chapter 9. The coverage of the topics is good, and the authors provide a number of good references. However, I was a bit disappointed with the cursory treatment of bootstrap estimation of classification accuracy (Section 9.6.3, pages 485-486). I particularly disagree with the simplistic statement "In practice, the high computational complexity of bootstrap estimation of classifier accuracy is rarely worth possible improvements in that estimate (Section 9.5.1)". On the other hand, the book is one of the first to cover the newer and promising resampling approaches called "bagging" and "boosting," which these authors seem to favor.
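For readers unfamiliar with what that quoted remark is weighing, here is a minimal sketch of bootstrap error-rate estimation, including Efron's .632 estimator. The data set, the linear discriminant classifier, and B = 200 replications are illustrative assumptions of mine, not anything taken from the book.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(0)
    # An illustrative, invented data set of 60 cases.
    X, y = make_classification(n_samples=60, n_features=5, random_state=0)
    n, B = len(y), 200  # number of cases and bootstrap replications

    oob_errors = []
    for _ in range(B):
        # Draw a bootstrap sample of the cases, with replacement.
        idx = rng.integers(0, n, size=n)
        oob = np.setdiff1d(np.arange(n), idx)  # cases the sample missed
        clf = LinearDiscriminantAnalysis().fit(X[idx], y[idx])
        # Score the classifier only on the out-of-bootstrap cases.
        oob_errors.append(np.mean(clf.predict(X[oob]) != y[oob]))

    e0 = np.mean(oob_errors)  # out-of-bootstrap ("leave-one-out" bootstrap) error
    resub = np.mean(LinearDiscriminantAnalysis().fit(X, y).predict(X) != y)
    e632 = 0.368 * resub + 0.632 * e0  # Efron's .632 estimator
    print(f"resubstitution: {resub:.3f}   e0: {e0:.3f}   .632: {e632:.3f}")

The cost is B classifier fits, which is exactly the "computational complexity" the quoted sentence refers to; whether that cost buys a better estimate than resubstitution or simple cross-validation is the point in dispute.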
Davison and Hinkley's bootstrap text is mentioned for its practical applications and guidance on bootstrapping. However, the authors overlook Shao and Tu, which offers more in the way of guidance, and they also overlook my own book, which provides some guidance on error rate estimation.
My book also illustrates the limitations of the bootstrap. Phil Good's book provides guidance and is mentioned by the authors, but it is very superficial and overgeneralized with respect to guiding practitioners. For these reasons I held back my enthusiasm and gave this text only four stars.