API Reference#

You can use the API to call the evaluation from a python script. For this, you need to load a dataset (see Data Files for how these should be structured) and then execute the evaluation function using your desired configuration.

Example

from lm_pub_quiz import Dataset, Evaluator

# Load dataset
dataset = Dataset.from_name("BEAR")

# Create Evaluator (and load model)
evaluator = Evaluator.from_model("distilbert-base-cased")

# Run evaluation
result = evaluator.evaluate_dataset(dataset)

# Save result object
result.save("outputs/my_results")