Data and Web Mining

Instructor: Salvatore Orlando

Goals

Data Mining comprises a set of techniques and methods for extracting novel knowledge from large databases, which can then be profitably exploited in decision-making processes. Data Mining is one of the main activities in the complex process of Knowledge Discovery in Databases (KDD). The course covers the fundamentals of the subject, focusing on the most important algorithmic techniques.
Moreover, the course uses the Web as a case study, exploring the opportunities for extracting useful knowledge by mining the hyperlink structure of the Web, its contents, and its usage logs.

Contents

Recommended Reading List

Other books

Assessments

Written exam (60%), talk on a scholarly paper (40%).

For an example of a written exam, click here.

Teaching Methods

Class lectures. Discussion of scholarly papers.

Mailing list

Students can subscribe to (or unsubscribe from) the course mailing list at http://listserver.dsi.unive.it/wws/subrequest/datamining. Once subscribed, students can send email to datamining@dsi.unive.it. A confirmation email is required to complete message delivery.


Slides

  1. Introduction
  2. Data
  3. Association Mining
  4. Association Mining - 2
  5. Classification
  6. Classification: Alternative techniques
  7. Clustering
  8. Web Search and Information Retrieval
  9. Link Analysis
  10. Web Usage Mining

Student Seminars

The list of papers available to prepare the seminars is the following:

Before preparing your talk (max 20 min), look at the suggested methods for reading scientific papers, which are well illustrated in these two documents: doc1 and doc2.
Be sure to point out the following items in your talk: (1) General/Specific subject of the paper, (2) Hypothesis, (3) Research Methodology, (4) Results, and finally (5) Summary of key points.


Written exams


MSc Thesis Topics (Under Construction)