Software
The PairViz R package
This is an R package by Catherine Hurley and myself, available from
CRAN. Below are some additional
materials on this package that might not be available on CRAN.
Eikosograms
This is an interactive Java application which displays
eikosograms, which are useful
for teaching probability.
Quail
Quail is a free extension to ANSI Common Lisp that runs on Macintoshes and Windows machines.
More info can be had at
the Quail site.
Data sets
- This is the segmentation data from the
UCI Machine Learning
Repository.
I have put this data set into a form that is ready for use in Quail (and in S/R below),
which I
call the pixels dataset:
- A
summary of the data.
Briefly, there are 7 classes, 19 continuous measurements, 210 observations
in the training set and 2100 in the test set.
Each observation is a pixel taken from an image; the measurements are
characteristics of that pixel and its neighbours and the class of the
pixel is the part of the image it comes from (e.g. CEMENT, PATH, FOLIAGE, SKY,
etc.).
- The training data: pixels-train.lsp
- The checker data (see the S language
section below) is used here
in Quail to show how a linear discriminant
can still work on data when the variables are expanded with
appropriate functions of the original explanatory variates.
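The idea in the last item can be sketched in R (the S implementation used further down this page) rather than in Quail. This is only a sketch under assumptions: the data are a synthetic 2 x 2 checker pattern generated on the spot, not the checker data linked here, and the names checker, square, fit.raw and fit.exp are made up for illustration. The lda() function comes from the MASS package.

```r
library(MASS)  # provides lda(); MASS ships with standard R distributions

set.seed(1)
n  <- 1000
x1 <- runif(n, -1, 1)
x2 <- runif(n, -1, 1)

# Hypothetical 2 x 2 checker pattern: the class depends only on the
# sign of the product x1 * x2, so no straight line separates the classes.
square  <- factor(ifelse(x1 * x2 > 0, "black", "white"))
checker <- data.frame(x1 = x1, x2 = x2, square = square)

# Linear discriminant on the original explanatory variates only.
fit.raw <- lda(square ~ x1 + x2, data = checker)
acc.raw <- mean(predict(fit.raw, newdata = checker)$class == checker$square)

# Same method after expanding the variates with the product x1 * x2.
# The discriminant is still linear, but now in (x1, x2, x1 * x2).
fit.exp <- lda(square ~ x1 + x2 + I(x1 * x2), data = checker)
acc.exp <- mean(predict(fit.exp, newdata = checker)$class == checker$square)

# Proportion of training points classified correctly by each fit.
print(c(raw = acc.raw, expanded = acc.exp))
```

On data like this the first fit should do little better than guessing, while the second should classify most points correctly, because the product term turns the checker pattern into a single threshold on one of the expanded variates.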
Code
Directory containing Quail-code described here.
- Quail-code
for class heights and mixtures of univariate Gaussians (normals)
S language
S is a statistical programming language developed at Bell Labs from the 1970s to the
present time (see
history of S).
Splus
and
R
are statistical systems based on separate implementations
of the S language.
The department here has a page containing some useful information on
the language and its R implementation.
Below is a bunch of code written mostly for classroom and course use. It has been written
in the S language
and tested only in the R implementation of S.
For those who are new to S there are a few classic mistakes that are easily made.
Some of these are recorded here. For the more adventurous,
take care to follow the scoping rules of S -- some examples.
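As a small illustration of why the scoping rules matter, here is a sketch of how R (the implementation this code was tested in) resolves free variables. It is not taken from the linked examples; the names x, f, g and make.counter are made up for illustration.

```r
x <- 10

# x is a free variable in f: R looks it up in the environment where f
# was defined (here the global environment), not where f is called from.
f <- function() x

g <- function() {
  x <- 20   # local to g; invisible to f
  f()
}

result <- g()   # f still sees the global x

# A function returned by another function keeps access to the
# environment it was created in, so count persists between calls.
make.counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # <<- assigns to count in the enclosing environment
    count
  }
}

counter <- make.counter()
first   <- counter()
second  <- counter()

print(c(result = result, first = first, second = second))
```

The local assignment inside g does not change what f returns, and each call to make.counter produces a counter with its own private count.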
Data sets
- This is the segmentation data from the
UCI Machine Learning
Repository.
I have put this data set into a form that is ready for use in S/R, which I
call the pixels dataset:
- A
summary of the data.
Briefly, there are 7 classes, 19 continuous measurements, 210 observations
in the training set and 2100 in the test set.
Each observation is a pixel taken from an image; the measurements are
characteristics of that pixel and its neighbours and the class of the
pixel is the part of the image it comes from (e.g. CEMENT, PATH, FOLIAGE, SKY,
etc.).
- The training data: pixels.train
The S dataframe is called pixels.tr and has 210 observations.
- The test data: pixels.test
The S dataframe is called pixels.te and has 2100 observations.
- A
function to produce nuggets data.
- The Gauss2 data from the notes.
- The checker board data from the notes.
Code
Directory containing R-code.