Dataset settings

Rubrix datasets have certain settings that you can configure via the rb.*Settings classes, for example rb.TextClassificationSettings.

Define a labeling schema

You can define a labeling schema for your Rubrix dataset, which fixes the allowed labels for your predictions and annotations. Once a schema is set, Rubrix validates the predictions and annotations of every record you log to that dataset and rejects any record whose labels are not in the schema.

import rubrix as rb

# Define labeling schema
settings = rb.TextClassificationSettings(label_schema=["A", "B", "C"])

# Apply settings to a new or already existing dataset
rb.configure_dataset(name="my_dataset", settings=settings)

# Logging to the configured dataset triggers the validation checks;
# "D" is not in the schema, so the server rejects the record
rb.log(rb.TextClassificationRecord(text="text", annotation="D"), "my_dataset")
# BadRequestApiError: Rubrix server returned an error with http status: 400
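
If you would rather catch out-of-schema labels before the server rejects them, you could run a pre-check on the client. The following is only a minimal sketch in plain Python: the helper `labels_outside_schema` and the `(text, label)` pair layout are hypothetical and not part of Rubrix, and the check does not reproduce the server-side validation.

```python
from typing import Iterable, List, Tuple


def labels_outside_schema(
    pairs: Iterable[Tuple[str, str]], schema: Iterable[str]
) -> List[str]:
    """Return the sorted, de-duplicated labels that are not in the schema.

    Hypothetical helper for illustration; not part of the Rubrix API.
    """
    allowed = set(schema)
    return sorted({label for _, label in pairs if label not in allowed})


# Same schema as in the example above
schema = ["A", "B", "C"]

# Assumed input layout: (text, label) pairs prepared before building records
pairs = [("first text", "A"), ("second text", "D"), ("third text", "B")]

invalid = labels_outside_schema(pairs, schema)
if invalid:
    print(f"Labels not in schema: {invalid}")
```

An empty result means every label is allowed by the schema, so logging the corresponding records should not fail the label check.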