Implementations

MTV: Succinctly Summarizing Data with Itemsets

Go to the MTV page

Attribute Clustering

Go to the Attribute Clustering page

Background Models for Itemset Ranking

Go to the Iterative Scaling page

μ-Miner (Low Entropy & Dependency Miner)

The μ-Miner algorithm finds non-redundant dependencies between sets of attributes that have a low entropy and a high dependence. A dependence of 1 means the rule is a functional dependency; a dependence of 0 means the attribute sets are independent. Attributes with a very high or very low support can be pruned, and mining can be restricted to non-derivable attribute sets.

Download C++ source code
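
The description above does not spell out how dependence is computed. As a minimal sketch, assume a normalized measure, 1 - H(Y|X) / H(Y), which equals 1 when X functionally determines Y and 0 when X and Y are independent. The function names and the toy table are hypothetical and are not taken from the C++ source.

```python
from collections import Counter
from math import log2


def entropy(rows, attrs):
    """Shannon entropy (in bits) of the joint distribution of `attrs`."""
    counts = Counter(tuple(row[a] for a in attrs) for row in rows)
    n = len(rows)
    return -sum(c / n * log2(c / n) for c in counts.values())


def dependence(rows, lhs, rhs):
    """Assumed measure: 1 - H(rhs | lhs) / H(rhs)."""
    h_rhs = entropy(rows, rhs)
    if h_rhs == 0:
        # Convention chosen for this sketch: a constant rhs yields 0.
        return 0.0
    # H(rhs | lhs) = H(lhs, rhs) - H(lhs)
    h_cond = entropy(rows, lhs + rhs) - entropy(rows, lhs)
    return 1 - h_cond / h_rhs


# Toy data: c is a copy of a, while b is independent of a.
rows = [
    {"a": 0, "b": 0, "c": 0},
    {"a": 0, "b": 1, "c": 0},
    {"a": 1, "b": 0, "c": 1},
    {"a": 1, "b": 1, "c": 1},
]

dep_a_c = dependence(rows, ["a"], ["c"])  # a determines c
dep_a_b = dependence(rows, ["a"], ["b"])  # a tells nothing about b
```

On this toy table the first value sits at the functional-dependency end of the scale and the second at the independence end.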

SMuRFIG (Simple Multi-Relational Frequent Itemset Generator)

This algorithm generates relational frequent itemsets from a relational database, that is, itemsets whose items come from several entity tables connected by relation tables. The supports are expressed in terms of the keys of these entity tables. The algorithm reverts to a standard frequent itemset miner if the database contains only one entity.

Go to the SMuRFIG webpage.
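
To illustrate what "supports expressed in the keys" could look like, here is a rough sketch under an assumed reading: the support of an itemset in an entity table is the number of distinct keys of that table that occur in a joined row containing the whole itemset. The tables, item names, and `key_supports` function are hypothetical; the actual SMuRFIG definition and algorithm may differ.

```python
# Two hypothetical entity tables (key -> items) and one relation table.
students = {1: {"major=cs"}, 2: {"major=cs"}, 3: {"major=math"}}
courses = {
    "db": {"level=grad"},
    "ml": {"level=grad"},
    "calc": {"level=undergrad"},
}
takes = [(1, "db"), (1, "ml"), (2, "calc"), (3, "db")]


def key_supports(itemset):
    """Count, per entity table, the distinct keys whose joined rows
    contain every item of `itemset` (assumed definition)."""
    itemset = set(itemset)
    student_keys, course_keys = set(), set()
    for s, c in takes:
        joined_items = students[s] | courses[c]
        if itemset <= joined_items:
            student_keys.add(s)
            course_keys.add(c)
    return {"students": len(student_keys), "courses": len(course_keys)}


result = key_supports({"major=cs", "level=grad"})
```

Here the same itemset gets a different support in each entity table: one student key matches, but two course keys do.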

XMiner (eXtensible itemset Miner)

XMiner is an Eclat-based algorithm that generates frequent itemsets from categorical datasets, using an adapted support measure which takes possible missing values (NULLs) into account (see "Mining itemsets in the presence of missing values"). The algorithm reverts to a standard frequent itemset miner when no attribute values are missing.

Download C++ source code
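
The adapted support measure is defined in the cited paper, not on this page. As a hedged sketch of one plausible reading, the function below counts an itemset only against the rows in which none of its attributes is missing, so that it coincides with ordinary relative support when the data has no NULLs. The name `support_with_missing` and the toy rows are assumptions, not XMiner's interface.

```python
def support_with_missing(rows, itemset):
    """Relative support of `itemset` (attribute -> value) over the rows
    that have no missing value (None) on the itemset's attributes."""
    complete = [
        row for row in rows
        if all(row.get(attr) is not None for attr in itemset)
    ]
    if not complete:
        return 0.0
    hits = sum(
        all(row[attr] == value for attr, value in itemset.items())
        for row in complete
    )
    return hits / len(complete)


rows = [
    {"color": "red", "size": "S"},
    {"color": "red", "size": None},  # size is missing here
    {"color": "blue", "size": "S"},
    {"color": "red", "size": "S"},
]

both = support_with_missing(rows, {"color": "red", "size": "S"})
color_only = support_with_missing(rows, {"color": "red"})
```

The row with the missing size is ignored for the two-attribute itemset but still counts for the itemset that only mentions color.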

Eclat implementation in Python

A simple Python implementation of the frequent itemset mining algorithm Eclat. See the papers "New algorithms for fast discovery of association rules" by Zaki et al. and "Fast vertical mining using diffsets" by Zaki & Gouda for details.

Download Python script
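
For readers unfamiliar with Eclat, the core idea can be sketched in a few lines: store for each item the set of transaction ids that contain it, then grow itemsets depth-first by intersecting those sets. This is a minimal illustration using plain tidsets rather than diffsets, and it is not the downloadable script; the function name and toy data are hypothetical.

```python
def eclat(transactions, min_support):
    """Return {itemset (sorted tuple): absolute support} for all
    itemsets occurring in at least `min_support` transactions."""
    # Vertical layout: item -> set of transaction ids containing it.
    tidsets = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            # Intersect with the tidsets of the later items in the order.
            suffix = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    suffix.append((other, shared))
            if suffix:
                extend(itemset, suffix)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items()
         if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend((), start)
    return frequent


transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
frequent = eclat(transactions, min_support=3)
```

With a minimum support of 3, every single item and every pair is frequent in this toy dataset, while the full triple (support 2) is pruned.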

Supporter

Supporter is a simple tool that computes the supports of a collection of (generalized) itemsets in a given dataset. The itemsets can be read from a file or from stdin, and the results are either saved to disk or written to stdout. Hence, Supporter can be used as an interactive itemset support querier.

Download C++ source code
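
The query loop described above might look roughly like the following Python sketch. It handles only plain itemsets (not generalized ones), assumes a whitespace-separated, one-transaction-per-line format, and uses an invented output format; none of this is taken from the C++ tool.

```python
import io


def load_transactions(lines):
    """Parse one whitespace-separated transaction per non-empty line."""
    return [set(line.split()) for line in lines if line.strip()]


def query_supports(transactions, queries, out):
    """Write the absolute support of each queried itemset to `out`."""
    for line in queries:
        items = set(line.split())
        if not items:
            continue  # skip blank query lines
        count = sum(items <= transaction for transaction in transactions)
        out.write(f"{' '.join(sorted(items))} : {count}\n")


# In-memory streams stand in for the dataset file, stdin, and stdout.
dataset = io.StringIO("a b c\na b\nb c\n")
queries = io.StringIO("a b\nc\n")
output = io.StringIO()

query_supports(load_transactions(dataset), queries, output)
report = output.getvalue()
```

Because the function reads queries line by line from any iterable, passing `sys.stdin` and `sys.stdout` instead of the in-memory streams would give an interactive session.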

Random Relational Database Generator

A very simple Python script generating synthetic relational databases.

Download Python script