auswahl.benchmarking.util.metrics.DengScore
- class auswahl.benchmarking.util.metrics.DengScore(metric_name: str = 'deng_score')
Wraps the calculation of the selection stability score for randomized selection methods, according to Deng et al. [1]. A detailed overview is provided in the user guide.
- Parameters
- metric_name: str, default='deng_score'
Unique name of the metric.
References
- 1
Bai-Chuan Deng, Yong-Huan Yun, Pan Ma, Chen-Chen Li, Da-Bing Ren and Yi-Zeng Liang, ‘A new method for wavelength interval selection that intelligently optimizes the locations, widths and combination of intervals’, Analyst, 6, 1876-1885, 2015.
- add_stabilities(pod: DataHandler)
Evaluates the stability metric across all datasets and methods in the DataHandler object and extends that object with the results of the evaluation.
- Parameters
- pod: DataHandler
Instance of DataHandler containing the results of the benchmarking procedure.
- evaluate_stability(meta_data: dict, selections: array, features: FeatureDescriptor)
Evaluates the stability of a set of executions of a selector algorithm on one dataset with a specific feature configuration. The executions differ in their data splits and seeds.
- Parameters
- meta_data: dict
Information about the dataset that might be relevant for stability calculations. See get_meta() for the contained data.
- selections: np.ndarray
The selected features of the different executions of the selector algorithm as integer indices of features. Shape (#executions, #features to select)
- features: FeatureDescriptor
FeatureDescriptor describing the configuration of features to be selected
- Returns
- stability: float
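The relationship between a set of executions and a single stability value can be illustrated with a minimal sketch. It assumes that the overall score is the mean of a pairwise similarity over all unordered pairs of executions; that aggregation rule is an assumption and is not confirmed by this page. The names `mean_pairwise_stability` and `overlap_fraction` are hypothetical, and `overlap_fraction` is a simple stand-in, not the score of Deng et al. [1].

```python
from itertools import combinations


def mean_pairwise_stability(selections, pairwise_sim):
    """Average a pairwise similarity over all unordered pairs of executions.

    `selections` is any 2D collection of integer feature indices with one row
    per execution, i.e. shape (n_executions, n_features_to_select).
    Assumption: the overall score is the plain mean over pairs.
    """
    rows = [list(row) for row in selections]
    if len(rows) < 2:
        raise ValueError("at least two executions are required")
    scores = [pairwise_sim(a, b) for a, b in combinations(rows, 2)]
    return sum(scores) / len(scores)


def overlap_fraction(set_1, set_2):
    """Stand-in similarity: number of shared indices divided by the selection size."""
    return len(set(set_1) & set(set_2)) / len(set_1)


# Three executions, each selecting three features.
selections = [[0, 1, 2], [0, 1, 3], [0, 1, 2]]
stability = mean_pairwise_stability(selections, overlap_fraction)
```

Rows of a NumPy integer array can be passed in the same way as the nested lists used here, since the sketch only iterates over rows and builds Python sets from them.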
- pairwise_sim_func(meta_data: dict, set_1: ndarray, set_2: ndarray) → float
Calculates the stability score for a single pair of feature selections.
- Parameters
- meta_data: dict
Dict containing meta information about the dataset for which the stability metric is evaluated. See the documentation of get_meta() for the available data.
- set_1: np.ndarray
Array of integer indices of selected features of shape (n_features_to_select,)
- set_2: np.ndarray
Array of integer indices of selected features of shape (n_features_to_select,)
- Returns
- stability score for the given pair of selections: float
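To make the shape of a pairwise similarity function concrete, the sketch below mirrors the signature documented above but computes the Jaccard index as a stand-in. It is not the formula of Deng et al. [1] and not the library's implementation; the function name `jaccard_pairwise_sim` is hypothetical, and `meta_data` is accepted only to match the signature.

```python
def jaccard_pairwise_sim(meta_data: dict, set_1, set_2) -> float:
    """Stand-in pairwise similarity: Jaccard index of two index selections.

    `meta_data` is unused here; a real metric might read dataset
    information from it (which keys exist is not assumed in this sketch).
    """
    a = {int(i) for i in set_1}
    b = {int(i) for i in set_2}
    union = a | b
    if not union:
        # Two empty selections are treated as identical.
        return 1.0
    return len(a & b) / len(union)


# Two selections of four features that share two indices.
score = jaccard_pairwise_sim({}, [0, 1, 2, 3], [2, 3, 4, 5])
```

The result lies in [0, 1]: identical selections score 1 and disjoint selections score 0. Unlike a chance-corrected measure, the Jaccard index does not account for the overlap expected from random selection.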