Evaluator#
The Evaluator class is the central component that runs the evaluation of a model on a dataset.
It uses a ModelInterface to score the options within a set of answers.
To create an Evaluator for a given model, use the Evaluator.from_model method; the appropriate ModelInterface class is then chosen automatically.
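A minimal sketch of the typical setup, assuming from_model accepts a model name or identifier and that Evaluator is importable from the package root (both the import path and the argument style are assumptions):

```python
# Minimal sketch; the import path and the from_model argument are assumptions.
from evaluator_lib import Evaluator  # hypothetical import path

# from_model selects a suitable ModelInterface for the given model;
# passing a model identifier string is an assumed calling convention.
evaluator = Evaluator.from_model("gpt2")
```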
Evaluator#
Evaluator(*, model_interface: ModelInterface, templater: Optional[Templater] = None)
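Direct construction is also possible when a ModelInterface instance already exists; the interface class used below is purely illustrative:

```python
# Sketch of direct construction; SomeModelInterface stands in for a concrete
# ModelInterface subclass and is not a real class name.
interface = SomeModelInterface("gpt2")
evaluator = Evaluator(model_interface=interface)  # templater is optional and defaults to None
```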
Methods:
| Name | Description |
|---|---|
| evaluate_dataset | Evaluate the model on all relations in the dataset. |
| evaluate_item | Return the scores for each of the answer options. |
evaluate_dataset#
evaluate_dataset(dataset: Dataset, template_index: Union[int, Sequence[int], None] = None, *, subsample: Optional[int] = None, save_path: Optional[PathLike] = None, fmt: InstanceTableFileFormat = None, create_instance_table: bool = True, metric: Optional[MultiMetricSpecification] = None, **kw) -> DatasetResults
Evaluate the model on all relations in the dataset.
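A hedged usage sketch, assuming dataset is an already loaded Dataset instance; the keyword arguments mirror the signature above, and the comments describe their assumed semantics:

```python
# Sketch only: how the Dataset is loaded is not shown here.
results = evaluator.evaluate_dataset(
    dataset,                     # a Dataset instance
    template_index=0,            # use the first template (assumed semantics)
    subsample=100,               # limit the number of evaluated instances (assumed)
    save_path="results/",        # optional: where instance tables are written
    create_instance_table=True,  # keep per-instance results
)
# results is a DatasetResults object.
```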
evaluate_item#
evaluate_item(item: Item, *, template: Literal[None] = None, answers: Literal[None] = None, subject: Literal[None] = None, print_ranking: bool = False, **kw) -> Union[ItemScores, ItemTokenScoresAndRoles]
evaluate_item(item: Union[None, Item, Iterator[Item]] = None, *, template: Optional[str] = None, answers: Optional[Sequence[str]] = None, subject: Optional[str] = None, print_ranking: bool = False, **kw) -> Union[ItemScores, ItemTokenScoresAndRoles, Iterable[ItemScores], Iterable[ItemTokenScoresAndRoles]]
Return the scores for each of the answer options.
This method must be implemented by each of the concrete Evaluator subclasses.
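A sketch of scoring a single set of answer options without constructing an Item, following the second overload above; the template placeholder syntax is an assumption:

```python
# Sketch: the placeholder style in the template string is an assumption.
scores = evaluator.evaluate_item(
    template="The capital of [X] is [Y].",  # hypothetical placeholder syntax
    answers=["Paris", "Berlin", "Rome"],
    subject="France",
    print_ranking=True,                     # print the ranked answer options
)
# scores holds one score per answer option (an ItemScores object).
```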