======================================================================
Editor's note: This file was received on 6/29/93 and has not been
updated for subsequently received databases. - David W. Aha
======================================================================

1. Summary Table of Database Statistics

2. Donated by: Peter Turney & Michael Jankulak

As I mentioned, I was working on a table that summarized the UCI
data sets, to help me choose the data sets that are appropriate for my
needs. A summer student -- Michael Jankulak, a recent graduate of
the University of Toronto -- has prepared the table for me. It is
appended below. I would like to donate the table to the UCI
repository, for others who may need such a list.

Best wishes,
Peter.

3. Characteristics Presented

1.	Database Name
2.	Number of Instances (i.e. examples, data points, observations)
3.	Number of Features (i.e. dimensions, attributes)
4.	Number of Classes (assuming a discrete class variable)
5.	Percent of Features that have continuous/integer values
6.	Percent of Features that have nominal values
7.	Missing Features (yes or no)
8.	Highest Reported Accuracy (taken from "Past Usage")
9.	Percent of Instances in the Majority Class (to compare with 8)
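
As an illustration of how characteristics 2-7 and 9 might be derived
from a data set, here is a minimal Python sketch. It is not the
procedure used to prepare the table. The function name `summarize`,
the "?" missing-value code, the toy data, and the choice to leave the
class column out of the feature count are all assumptions made for
this example.

```python
from collections import Counter

MISSING = "?"  # assumption: missing values are coded as "?"


def summarize(name, rows, numeric_flags):
    """Compute summary characteristics for one data set.

    rows          -- list of (feature_tuple, class_label) pairs
    numeric_flags -- one bool per feature, True if continuous/integer
    """
    n_cases = len(rows)
    n_feat = len(numeric_flags)
    class_counts = Counter(label for _, label in rows)
    pct_num = round(100 * sum(numeric_flags) / n_feat)
    has_missing = any(v == MISSING for feats, _ in rows for v in feats)
    pct_major = round(100 * max(class_counts.values()) / n_cases)
    return {
        "name": name,
        "cases": n_cases,
        "feat": n_feat,
        "class": len(class_counts),
        "num": pct_num,
        "symb": 100 - pct_num,
        "miss": "yes" if has_missing else "no",
        "major": pct_major,
    }


# Hypothetical toy data: one numeric feature, one nominal feature.
toy = [
    ((5.1, "red"), "a"),
    ((4.9, "?"), "a"),
    ((6.3, "blue"), "b"),
    ((5.8, "red"), "a"),
]
summary = summarize("toy", toy, [True, False])
print(summary)
```

Characteristic 8 (highest reported accuracy) cannot be computed this
way, since it is taken from each database's "Past Usage" notes.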

4. Table

1.		2.	3.	4.	5.	6.	7.	8.	9.
name		#cases	#feat	#class	%num	%symb	miss	%accur	%major
------------------------------------------------------------------------------
anneal		898	38	6	24	76	yes	-	76
audiology-stan	226	70	24	0	100	yes	-	25
imports-85	205	26	*	62	38	yes	88	*
breast-cancer-w	699	10	2	100	0	yes	94	66
wpbc		198	30	2	100	0	yes	-	76
wdbc		569	30	2	100	0	no	-	63
bridges		108	13	*	0	100	yes	-	25
kr-vs-kp	3196	36	2	0	100	no	-	52
machine		209	9	8	78	22	no	-	58
credit-app	690	15	2	40	60	yes	-	56
echocardiogram	132	13	2	69	31	yes	90	44
flag		194	30	*	33	67	no	-	*
glass		214	10	7	100	0	no	-	36
hayes-roth	132	5	3	0	100	no	-	39
heart-disease	920	13	5	50	50	yes	77	45
hepatitis	155	19	2	32	68	yes	83	79
horse-colic	368	28	*	32	68	yes	-	*
segmentation	2310	19	7	100	0	no	-	14
ionosphere	351	34	2	100	0	no	97	64
iris		150	4	3	100	0	no	-	33
labor		57	16	2	50	50	no	98	55
lenses		24	4	3	0	100	no	-	62
letter		20000	16	26	100	0	no	80	4
liver-disorders	345	6	2	100	0	no	-	60
lung-cancer	32	56	3	0	100	yes	-	40
promoters	106	58	2	0	100	no	-	50
splice		3190	61	3	0	100	no	-	50
monks-1		432	6	2	0	100	no	100	50
monks-2		432	6	2	0	100	no	100	67
monks-3		432	6	2	0	100	no	100	53
mushroom	8124	22	2	0	100	yes	95	52
pima-diabetes	768	8	2	100	0	no	76	65
shuttle-l-c	15	6	2	0	100	yes	-	60
solar-flare	1389	13	*	23	77	no	-	*
soybean-large	683	35	19	0	100	yes	97	13
lrs		531	102	100	99	1	no	-	10
satellite	6435	36	6	100	0	no	-	24
shuttle**	14500	9	7	100	0	no	-	79
vehicle		846	18	4	100	0	no	-	26
new-thyroid	215	5	3	100	0	no	100	70
thyroid0387	9172	29	21	24	76	yes	-	74
hypothyroid	3163	25	2	28	72	yes	-	95
sick-euthyroid	3163	25	2	28	72	yes	-	91
allbp		3772	29	3	24	76	yes	-	96
allhyper	3772	29	5	24	76	yes	-	97
allhypo		3772	29	5	24	76	yes	-	92
allrep		3772	29	4	24	76	yes	-	97
dis		3772	29	2	24	76	yes	-	98
sick		3772	29	2	24	76	yes	-	94
ann-thyroid	7200	21	3	29	71	yes	-	93
tic-tac-toe	958	9	2	0	100	no	99	65
sonar		208	60	2	100	0	no	83	53
vowel		990	10	11	100	0	no	56	9
votes		435	16	2	0	100	yes	95	61
wine		178	13	3	100	0	no	100	40
zoo		101	17	7	12	88	no	-	41


* any of the features can be used as the class feature
** the compressed training set in this directory may be corrupted.
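
Since column 9 is meant to be compared with column 8, the table can be
read programmatically to make that comparison. The following Python
sketch is one possible way to do it, not part of the donated material.
The helper names `parse_row` and `gain_over_majority` are hypothetical,
and the sample rows are copied from the table with the tabs replaced by
spaces. A "-" or "*" entry is treated as unavailable.

```python
def parse_row(line):
    """Split one table row into a dict; '-' and '*' become None."""
    fields = line.split()  # handles tabs and spaces alike
    if len(fields) != 9:
        raise ValueError(f"expected 9 columns, got {len(fields)}: {line!r}")

    def num(token):
        # Non-numeric markers ('-' or '*') mean the value is unavailable.
        return int(token) if token.isdigit() else None

    return {
        "name": fields[0].rstrip("*"),  # drop footnote marks, e.g. shuttle**
        "cases": num(fields[1]),
        "feat": num(fields[2]),
        "class": num(fields[3]),
        "num": num(fields[4]),
        "symb": num(fields[5]),
        "miss": fields[6] == "yes",
        "accur": num(fields[7]),
        "major": num(fields[8]),
    }


def gain_over_majority(row):
    """Reported accuracy minus majority-class percentage, if both exist."""
    if row["accur"] is None or row["major"] is None:
        return None
    return row["accur"] - row["major"]


sample = """\
anneal 898 38 6 24 76 yes - 76
imports-85 205 26 * 62 38 yes 88 *
breast-cancer-w 699 10 2 100 0 yes 94 66
vowel 990 10 11 100 0 no 56 9
"""

rows = [parse_row(line) for line in sample.splitlines()]
for row in rows:
    print(row["name"], gain_over_majority(row))
```

A large gain suggests the reported classifier did much better than
always guessing the majority class; rows without both figures give no
comparison.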




