Annotate records#

The Rubrix web app has a dedicated mode to quickly label your data in a very intuitive way, or revise previous gold labels and correct them. Rubrix’s powerful search and filter functionalities, together with potential model predictions, can guide the annotation process and support the annotator.

You can access the Annotate mode via the sidebar of the Dataset page.

Create labels#

Create new label

For the text and token classification tasks, you can create new labels within the Annotate mode. On the right side of the bulk validation bar, you will find a “+ Create new label” button that lets you add new labels to your dataset.

Annotate#

To annotate the records, the Rubrix web app provides a simple and intuitive interface that tries to follow the same interaction pattern as in the Explore mode. As in the Explore mode, the record cards in the Annotate mode are also customized depending on the task of the dataset.

Text Classification#

Multilabel card, validated

When switching in the Annotate mode for a text classification dataset, the labels in the record cards become clickable and you can annotate the records by simply clicking on them. For multi-label classification tasks, you can also annotate a record with no labels by either validating an empty selection or deselecting all labels.

Once a record is annotated, it will be marked as Validated in the upper right corner of the record card.

Token Classification#

Annotate mode for the Token Classification task

For token classification datasets, you can highlight words (tokens) in the text and annotate them with a label. Under the hood, the highlighting takes advantage of the tokens information in the Token Classification data model. You can also remove annotations by hovering over the highlights and pressing the X button.

After modifying a record, either by adding or removing annotations, its status will change to Pending and a Save button will appear. Once a record is saved, its status will change to Validated.

Text2Text#

Text2Text View

For text2text datasets, you have a text box available, in which you can draft or edit an annotation. After editing or drafting your annotation, don’t forget to save your changes.

Bulk annotate#

Bulk annotation bar

For all tasks, you can bulk validate the predictions of the records. You can either select the records one by one with the selection box on the upper left of each card, or you can use the global selection box below the search bar, which will select all records shown on the page. Then you can either Validate or Discard the selected records.

For the text classification task, you can additionally bulk annotate the selected records with a specific label, by simply selecting the label from the “Annotate as …” list.

Validate predictions#

In Rubrix you can pre-annotate your data by including model predictions in your records. Assuming that the model works reasonably well on your dataset, you can filter for records with high prediction scores, and simply validate their predictions to quickly annotate records.

Text Classification#

For this task, model predictions are shown as percentages in the label tags. You can validate the predictions shown in a slightly darker tone by pressing the Validate button:

  • for a single label classification task, this will be the prediction with the highest percentage

  • for a multi label classification task, this will be the predictions with a percentage above 50%

Token Classification#

For this task, predictions are shown as underlines. You can also validate the predictions (or the absence of them) by pressing the Validate button.

Once the record is saved or validated, its status will change to Validated.

Text2Text#

You can validate or edit a prediction, by first clicking on the view predictions button, and then the Edit or Validate button. After editing or drafting your annotation, don’t forget to save your changes.

Search and filter#

Search and filter for annotation view

The powerful search bar allows you to do simple, quick searches, as well as complex queries that take full advantage of Rubrix’s data models. In addition, the filters provide you a quick and intuitive way to filter and sort your records with respect to various parameters, including the metadata of your records. For example, you can use the Status filter to hide already annotated records (Status: Default), or to only show annotated records when revising previous annotations (Status: Validated).

You can find more information about how to use the search bar and the filters in our detailed search guide and filter guide.

Note

Not all filters are available for all tasks.

Progress metric#

Progress metric

From the sidebar you can access the Progress metrics. There you will find the progress of your annotation session, the distribution of validated and discarded records, and the label distribution of your annotations.

You can find more information about the metrics in our dedicated metrics guide.