Dataset

Dataset view

The Dataset page is the main page of the Rubrix web app. From here you can access most of Rubrix’s features, like exploring and annotating the records of your dataset.

The page is composed of 4 major components:

Filters

Dataset filters

The filters give you a quick and intuitive way to filter and sort your records by various parameters. You can find more information about how to use the filters in our detailed filter guide.

Note

Not all filters are available for all tasks.

Predictions filter

Predictions filter

This filter allows you to filter records with respect to their predictions:

  • Predicted as: filter records by their predicted labels

  • Predicted ok: filter records whose predictions do, or do not, match the annotations

  • Score: filter records with respect to the score of their prediction

  • Predicted by: filter records by the prediction agent
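As a rough illustration of what these four filters select, the sketch below applies equivalent conditions to plain Python dictionaries. The record layout (the `prediction`, `annotation`, `score` and `prediction_agent` keys) and the `filter_predictions` helper are assumptions made for this example. They are not Rubrix's data model or API.

```python
# Hypothetical records: plain dicts standing in for what the web app shows.
records = [
    {"text": "Great product", "prediction": "positive", "score": 0.92,
     "prediction_agent": "model-v1", "annotation": "positive"},
    {"text": "Not for me", "prediction": "positive", "score": 0.55,
     "prediction_agent": "model-v1", "annotation": "negative"},
    {"text": "It broke quickly", "prediction": "negative", "score": 0.81,
     "prediction_agent": "model-v2", "annotation": None},
]


def filter_predictions(records, predicted_as=None, predicted_ok=None,
                       score_range=None, predicted_by=None):
    """Keep the records that satisfy every condition that is not None."""
    selected = []
    for rec in records:
        # Predicted as: match on the predicted label.
        if predicted_as is not None and rec["prediction"] != predicted_as:
            continue
        # Predicted ok: compare prediction and annotation.
        if predicted_ok is not None:
            # Assumption: records without an annotation are excluded here.
            if rec["annotation"] is None:
                continue
            if (rec["prediction"] == rec["annotation"]) != predicted_ok:
                continue
        # Score: keep predictions whose score lies in the inclusive range.
        if score_range is not None:
            low, high = score_range
            if not low <= rec["score"] <= high:
                continue
        # Predicted by: match on the prediction agent.
        if predicted_by is not None and rec["prediction_agent"] != predicted_by:
            continue
        selected.append(rec)
    return selected


wrong = filter_predictions(records, predicted_ok=False)
confident = filter_predictions(records, score_range=(0.8, 1.0))
```

Combining several arguments narrows the selection, much like activating several filters at once in the web app.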

Annotations filter

Annotation filters

This filter allows you to filter records with respect to their annotations:

  • Annotated as: filter records with respect to their annotated labels

  • Annotated by: filter records by the annotation agent

Status filter

Status filters

This filter allows you to filter records with respect to their status:

  • Default: records without any annotations or edits

  • Validated: records with validated annotations

  • Edited: records with annotations that have not yet been validated

Metadata filter

Metadata filters

This filter allows you to filter records with respect to their metadata.

Hint

Nested metadata will be flattened and the keys will be joined by a dot.
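To make the flattening concrete, here is a minimal sketch of the idea in plain Python. The `flatten_metadata` helper and the example metadata are hypothetical and only illustrate the resulting key names. This is not the code Rubrix runs internally.

```python
def flatten_metadata(metadata, parent_key=""):
    """Flatten nested dicts, joining nested keys with a dot."""
    flat = {}
    for key, value in metadata.items():
        full_key = f"{parent_key}.{key}" if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse into nested dictionaries, carrying the key prefix along.
            flat.update(flatten_metadata(value, full_key))
        else:
            flat[full_key] = value
    return flat


nested = {"source": {"name": "newswire", "year": 2021}, "split": "train"}
flat = flatten_metadata(nested)
# flat == {"source.name": "newswire", "source.year": 2021, "split": "train"}
```

With metadata like this, you would look for the keys `source.name` and `source.year` in the Metadata filter.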

Sort records

Sort filter

With this component you can sort the records by various parameters, such as the predictions, annotations or their metadata.
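Conceptually, sorting orders the records by one chosen field. The sketch below mimics this on plain Python dictionaries. The record layout and field names are assumptions for illustration and do not reflect how the web app implements sorting.

```python
# Hypothetical records with a prediction score and one metadata field.
records = [
    {"text": "a", "score": 0.55, "metadata": {"length": 12}},
    {"text": "b", "score": 0.92, "metadata": {"length": 40}},
    {"text": "c", "score": 0.81, "metadata": {"length": 7}},
]

# Lowest prediction score first, e.g. to review uncertain predictions early.
by_score = sorted(records, key=lambda rec: rec["score"])

# Largest "length" metadata value first.
by_length = sorted(records, key=lambda rec: rec["metadata"]["length"], reverse=True)
```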

Record cards

The record cards are at the heart of the Dataset page and contain your data. There are three different flavors of record cards depending on the task of your dataset. All of them share the same basic structure showing the input text and a vertical ellipsis (or “kebab menu”) on the top right that lets you access the record’s metadata. Predictions and annotations are shown depending on the current mode and task of the dataset.

Check out our exploration and annotation guides to see how the record cards work in the different modes.

Text classification

Text classification view

In this task, predictions are shown as tags below the input text. Each tag contains the label and a percentage score. In Explore mode, annotations are shown as tags on the right, together with a symbol indicating whether the predictions match the annotations. In Annotate mode, predictions and annotations share the same labels (annotation labels are darker).

A text classification dataset can support either single-label or multi-label classification. In other words, records are annotated either with a single label or with several.
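The difference between the two cases can be pictured as the shape of the annotation: one label versus a list of labels. The following sketch assumes this representation for illustration. The `check_annotation` helper is hypothetical and not part of Rubrix.

```python
def check_annotation(annotation, multi_label):
    """Return True if the annotation has the shape expected for the dataset type."""
    if multi_label:
        # Multi-label: a list of label strings (possibly several).
        return isinstance(annotation, list) and all(
            isinstance(label, str) for label in annotation
        )
    # Single-label: exactly one label string.
    return isinstance(annotation, str)


single_ok = check_annotation("sports", multi_label=False)
multi_ok = check_annotation(["sports", "politics"], multi_label=True)
mismatch = check_annotation(["sports", "politics"], multi_label=False)
```

Here `single_ok` and `multi_ok` are `True`, while `mismatch` is `False` because a list of labels does not fit a single-label dataset.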

Token classification

Token classification view

In this task, predictions and annotations are shown as highlights in the input text. Work in progress …

Text2Text

Text2Text view

In this task, predictions and the annotation are shown in a text field below the input text. You can switch between prediction and annotation via the “View annotation”/“View predictions” buttons. For predictions, the associated score appears in the lower left corner. If you have multiple predictions, you can toggle between them using the arrows at the bottom of the record card.