
Used price: $125.48



The book is divided into 3 sections. The first is on 'information retrieval' (IR), the second on data mining, and the third describes a 'case study.'
According to the authors, IR is engaged in storage, retrieval, organization and display of unstructured or ambiguous file structures. Research is currently engaged in classifying, filtering, modeling, query design and user interface issues. The key question for IR is 'relevance' assessment. Each topic gets at least a few paragraphs, some a few pages.
The authors differentiate data mining from IR in terms of focus. A data mining project is designed specifically for finding hidden structure (whatever that means), while IR might be characterized as the 'quick and dirty query.' This is a bit confusing, but the emphasis on terminology makes it unimportant. Most of the data mining section is a review of various measures used to determine the existence of associations, including some simple formulas. There is also a section on web crawlers and text mining.
Though the book is titled 'mining the www', the largest section is on IR, which most would call 'search engines.' Mining itself gets only about a quarter of the book.
The case study is fairly brief, but outlines a way to structure a simple project.
The book contains a nice bibliography.

about what you can do with the overwhelming World Wide Web.
If you are curious about what is behind those search engines and
how these "things" can get you stuck in front of
your computer around the clock, this is the book for you.
It not only tells you how these "things" work,
but also calms you a little bit by telling you that
those guys who developed these "things" REALLY tried hard to
get you what you want and in the meantime save you some time :)
The best part is that you don't need to know much theory
to get some sense of the devils who drive these engines.
If you are a professional who wants to know where to read about the
"know how", this book could be a good starting point.
It not only gives you a good survey of what is going on,
but also provides you with 286 references that guide you
to what you need to know next.
If you are a graduate student who wants to start a project
on the subject, this book could save you some time.
It takes only a couple of hours to scan through it.
By the end, you would probably know where to dig deeper or
you might get burnt and choose a different subject.
One thing I wondered about was why the authors didn't go further
on many topics. Some subsections have only four or five sentences.
These could be expanded.



A major problem is getting a grasp on the synthesis of these three fields: DM, IR, and WWW technology. Even current research in DM is distributed among groups of people with such diverse backgrounds that effective communication of research results across groups is extremely difficult.
This book has taken the major concepts from these three fields and organized them in outline form. The outline cuts just deep enough to be meaningful and never so deep as to "lose" the reader. For the serious student, this book provides a Christmas tree on which other books can hang like ornaments.
Obviously, I think very highly of this book. It is not the "be all and the end all", but it fills an important niche. ... Almost limits it to library and other institutional purchases. That is a shame, because I'm sure every worker in WWWIR&DM would like to have a copy on their shelves.
BTW, the bibliography isn't bad either, and includes many WWW URLs, a must for any truly useful bibliography in today's environment. The search engines just aren't good enough yet to give you all the URLs you need. But then, improving them is part of why there is so much active research in WWWDM&IR.
Feel free to write the author of this review (Dr. Jack Aiken, PhD)...