ordEval {CORElearn} | R Documentation |

The method evaluates the quality of ordered attributes specified by the formula with the ordEval algorithm.

ordEval(formula, data, file=NULL, rndFile=NULL, variant=c("allNear","attrDist1","classDist1"), ...)

| Argument | Description |
|---|---|
| `formula` | Either a formula specifying the attributes to be evaluated and the target variable, or the name of the target variable, or the index of the target variable. |
| `data` | Data frame with evaluation data. |
| `file` | Name of the file to which the evaluation results will be written. |
| `rndFile` | Name of the file to which the evaluation of random normalizing attributes will be written. |
| `variant` | Name of the variant of the ordEval algorithm. Can be any of `"allNear"`, `"attrDist1"`, or `"classDist1"`; the default is `"allNear"`. |
| `...` | Other options specific to ordEval or common to other context-sensitive evaluation methods (e.g., ReliefF). |

The parameter `formula` can be interpreted in three ways, where the formula interface is the most elegant one, but inefficient and inappropriate for large data sets. See also the examples below. As `formula` one can specify:

- an object of class `formula`, used as a mechanism to select features (attributes) and the prediction variable (class). Only simple terms can be used; interactions expressed in formula syntax are not supported. The simplest way is to specify just the response variable: `class ~ .`. In this case all other attributes in the data set are evaluated. Note that the formula interface is not appropriate for data sets with a large number of variables.
- a character vector specifying the name of the target variable; all the other columns in the data frame `data` are used as predictors.
- an integer specifying the index of the target variable in the data frame `data`; all the other columns are used as predictors.
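The three forms of `formula` described above can be sketched as follows. This is a minimal illustration, assuming that the data frame produced by `ordDataGen` names its target column `class` (as the formula `class ~ .` in the example on this page suggests); the variable names `estFormula`, `estName`, `estIndex`, and `classIdx` are made up for the illustration.

```r
library(CORElearn)

# Synthetic ordinal data; the target column is assumed to be named "class"
dat <- ordDataGen(200)

# 1. formula interface: evaluate all other attributes against class
estFormula <- ordEval(class ~ ., dat)

# 2. name of the target variable given as a character string
estName <- ordEval("class", dat)

# 3. integer index of the target column in dat
classIdx <- match("class", names(dat))
estIndex <- ordEval(classIdx, dat)
```

For data sets with many columns, the second and third forms avoid the overhead of the formula interface.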

In the data frame `data`, take care to supply the ordinal data as factors and to provide equal levels for them (this is not necessarily what one gets with `read.table`). See the example below.
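Giving all ordinal columns the same factor levels can be done in base R. The sketch below uses a small hypothetical data frame `raw` on an assumed 1 to 5 scale; the column names and the scale are illustrative only and do not come from the package.

```r
# Hypothetical ordinal scores stored as plain numbers on a 1..5 scale;
# not every column happens to contain every score
raw <- data.frame(
  quality = c(1, 3, 5, 4, 3),
  price   = c(2, 2, 4, 5, 4),
  class   = c(1, 3, 5, 5, 3)
)

# Convert every column to a factor with the same, explicitly listed levels
scoreLevels <- 1:5
dat <- as.data.frame(lapply(raw, factor, levels = scoreLevels))

# All columns now share identical levels, even for scores that never occur
sapply(dat, nlevels)
```

Listing the levels explicitly matters because `factor` otherwise derives them from the values present in each column, which can differ from column to column.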

The output can optionally be written to the files `file` and `rndFile`, in a format used by the visualization methods in `plotOrdEval`.

The variant of the algorithm actually used is controlled by the `variant` parameter, which can have the values "allNear", "attrDist1", and "classDist1". The default value is "allNear", which takes all nearest neighbors into account in the evaluation of attributes. The "attrDist1" variant takes into account only neighbors whose attribute value differs by at most 1 from the current case (for each attribute separately). This makes sense when we want to see the thresholds of reinforcement, and therefore observe just a small change up or down (it makes sense to combine this with `equalUpDown=TRUE` in the `plot.ordEval` function). The "classDist1" variant takes into account only neighbors whose class value differs by at most 1 from the current case. This makes sense if we want to observe strictly small changes in upward/downward reinforcement, and it has little effect in practical applications.
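Selecting a variant amounts to passing one of the three documented strings. The following is a minimal sketch, assuming the synthetic data from `ordDataGen` used in the example on this page; the result names `estAll`, `estAttr`, and `estClass` are illustrative.

```r
library(CORElearn)

dat <- ordDataGen(200)

# Default variant: all nearest neighbors are taken into account
estAll <- ordEval(class ~ ., dat, variant = "allNear")

# Only neighbors whose attribute value differs by at most 1
estAttr <- ordEval(class ~ ., dat, variant = "attrDist1")

# Only neighbors whose class value differs by at most 1
estClass <- ordEval(class ~ ., dat, variant = "classDist1")

# The variant used is stored in the `variant` component of each result
list(estAll$variant, estAttr$variant, estClass$variant)
```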

There are some additional parameters (note the `...` argument), some of which are common with other context-sensitive evaluation methods (e.g., ReliefF). The list of common parameters is available in `helpCore` (see the subsection on attribute evaluation therein). The parameters specific to `ordEval` are:

- ordEvalNoRandomNormalizers
  type: integer, default value: 0, value range: 0, Inf

  number of randomly shuffled attributes for normalization of each attribute (0 = no normalization). This parameter should be set to a reasonably high value (e.g., 200) in order to produce reliable confidence intervals with `plot.ordEval`. The parameters `ordEvalBootstrapNormalize` and `ordEvalNormalizingPercentile` only make sense if this parameter is larger than 0.
- ordEvalBootstrapNormalize
  type: logical, default value: FALSE

  determines whether the features used for normalization are constructed with bootstrap sampling or with random permutation.
- ordEvalNormalizingPercentile
  type: numeric, default value: 0.025, value range: 0, 0.5

  the percentile defines the length of the confidence interval obtained with random normalization. Percentile `t` forms the interval by taking the *nt*-th and *n(1-t)*-th random evaluations as the confidence interval boundaries, thereby forming a *100(1-2t)*% confidence interval (`t` = 0.025 gives a 95% confidence interval). The value *n* is set by the `ordEvalNoRandomNormalizers` parameter.
- attrWeights
  type: character

  a character vector representing a list of attribute weights in the ordEval distance measure.
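The arithmetic behind `ordEvalNormalizingPercentile` can be illustrated in plain R, independently of CORElearn. The sketch below assumes *n* = 200 hypothetical random evaluations drawn from a uniform distribution (a stand-in, not the package's actual normalization scores) and picks the order statistics at positions *nt* and *n(1-t)*; how the package rounds these positions internally is not stated here, so the rounding is an assumption.

```r
set.seed(1)

n   <- 200      # plays the role of ordEvalNoRandomNormalizers
pct <- 0.025    # plays the role of ordEvalNormalizingPercentile (t)

# Stand-in for the n evaluations of randomly shuffled attributes, sorted
rndEval <- sort(runif(n))

# Positions of the interval boundaries among the sorted evaluations
lowIdx  <- max(1, round(n * pct))       # position nt
highIdx <- round(n * (1 - pct))         # position n(1-t)

# Confidence interval boundaries and the nominal coverage 100(1-2t)
ci <- c(low = rndEval[lowIdx], high = rndEval[highIdx])
coverage <- 100 * (1 - 2 * pct)
```

With *n* = 200 and *t* = 0.025 the boundaries are the 5th and 195th sorted evaluations and the nominal coverage is 95%. With a small *n* the two positions lie close together, which is why a reasonably high number of random normalizers is recommended.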

Evaluation of attributes without the specifics of ordered attributes is covered in the function `attrEval`.

The method returns a list with the following components:

| Component | Description |
|---|---|
| `reinfPosAV` | a matrix of positive reinforcement for attributes' values, |
| `reinfNegAV` | a matrix of negative reinforcement for attributes' values, |
| `anchorAV` | a matrix of anchoring for attributes' values, |
| `noAV` | a matrix containing the count for each value of each attribute, |
| `reinfPosAttr` | a vector of positive reinforcement for attributes, |
| `reinfNegAttr` | a matrix of negative reinforcement for attributes, |
| `anchorAttr` | a matrix of anchoring for attributes, |
| `noAVattr` | a vector containing the count of valid values of each attribute, |
| `rndReinfPosAV` | a three dimensional array of statistics for random normalizing attributes' positive reinforcement for attributes' values, |
| `rndReinfNegAV` | a three dimensional array of statistics for random normalizing attributes' negative reinforcement for attributes' values, |
| `rndAnchorAV` | a three dimensional array of statistics for random normalizing attributes' anchoring for attributes' values, |
| `rndReinfPosAttr` | a three dimensional array of statistics for random normalizing attributes' positive reinforcement for attributes, |
| `rndReinfNegAttr` | a three dimensional array of statistics for random normalizing attributes' negative reinforcement for attributes, |
| `rndAnchorAttr` | a three dimensional array of statistics for random normalizing attributes' anchoring for attributes, |
| `attrNames` | the names of attributes, |
| `valueNames` | the values of attributes, |
| `noAttr` | the number of attributes, |
| `ordVal` | the maximal number of attribute values, |
| `variant` | the variant of the algorithm used, |
| `file` | the file to store the results, |
| `rndFile` | the file to store random normalizations. |

The statistics used are median, 1st quartile, 3rd quartile, low and high percentile selected by `ordEvalNormalizingPercentile`, mean, standard deviation, and expected probability according to value distribution. With these statistics we can visualize the significance of reinforcements using an adapted box-and-whiskers plot.
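The returned list can be inspected with ordinary R tools. The following is a minimal sketch, assuming the synthetic data from `ordDataGen` and a nonzero `ordEvalNoRandomNormalizers` so that the random normalization statistics are computed; no particular dimensions are assumed.

```r
library(CORElearn)

dat <- ordDataGen(200)
est <- ordEval(class ~ ., dat, ordEvalNoRandomNormalizers = 100)

# Names of the components of the returned list
names(est)

# Positive reinforcement for attributes' values, documented as a matrix
dim(est$reinfPosAV)

# Statistics of the random normalizing attributes, documented as a
# three dimensional array
dim(est$rndReinfPosAV)
```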

Marko Robnik-Sikonja

Marko Robnik-Sikonja, Koen Vanhoof: Evaluation of ordinal attributes at value level.
*Data Mining and Knowledge Discovery*, 14:225-243, 2007

Marko Robnik-Sikonja, Igor Kononenko: Theoretical and Empirical Analysis of ReliefF and RReliefF.
*Machine Learning Journal*, 53:23-69, 2003

Some of the references are available also from http://lkm.fri.uni-lj.si/rmarko/papers/

`plot.ordEval`, `CORElearn`, `CoreModel`, `helpCore`, `infoCore`.

    library(CORElearn)

    # prepare a data set
    dat <- ordDataGen(200)

    # evaluate ordered features with ordEval
    est <- ordEval(class ~ ., dat, ordEvalNoRandomNormalizers=100)
    # print(est)
    printOrdEval(est)
    plot(est)

[Package *CORElearn* version 1.56.0 Index]