Annotate records

Rubrix Token Classification (NER) Annotation mode

The Rubrix web app has a dedicated mode to quickly label your data in a very intuitive way, or revise previous gold labels and correct them. Rubrix’s powerful search and filter functionalities, together with potential model predictions, can guide the annotation process and support the annotator.

You can access the Annotate mode via the sidebar of the Dataset page.

Search and filter

Search and filter for annotation view

The powerful search bar allows you to do simple, quick searches, as well as complex queries that take full advantage of Rubrix’s data models. In addition, the filters provide you a quick and intuitive way to filter and sort your records with respect to various parameters, including the metadata of your records. For example, you can use the Status filter to hide already annotated records (Status: Default), or to only show annotated records when revising previous annotations (Status: Validated).

You can find more information about how to use the search bar and the filters in our detailed search guide and filter guide.

Note

Not all filters are available for all tasks.

How to annotate

The Annotation mode enables users to add and modify annotations. This mode follows the same interaction patterns as in the Explore mode.

When choosing this mode, the display of the selected dataset is slightly different. The “Bulk Annotation” bar appears (see below), and records appear editable.

Users can annotate one record by one, or several records in a row, but the annotation will change depending on the task:

  • Text2Text Tasks: Records can be edited, validated or discarded in these tasks.

  • Token Classification Tasks: The record will show different labels on its words. Users can select words or sequences of words (tokens) in order to annotate them with labels, and then, records can be validated or discarded.

  • Text Classification Tasks: A record will be displayed with different labels below. Users have to choose one or more labels (or validate the selected one) and validate the record. Records can be discarded too.

Annotation by different users will be saved with different annotation agents. To setup various users in your Rubrix server, please refer to our user management guide.

Bulk Annotation

With this feature, from 5 to 20 records can be validated or discarded at the same time. In order to use it, users must operate with the bar placed below the search bar and the filters. One or more records can be selected by clicking on specific checkbox or by choosing the Select all checkbox.

After choosing the records to be annotated, a label must be selected on the Annotate as dropdown. After that, these records must be validated or discarded.

It is also posible to create new labels by clicking on Create new label.