Define rules

The Rubrix web app has a dedicated mode for finding good heuristic rules, often called labeling functions, for a weak supervision workflow. As shown in our guide and tutorial, these rules let you quickly annotate your data with noisy labels in a semi-automatic way.

You can access the Define rules mode via the sidebar of the Dataset page.

Note

The Define rules mode is only available for single-label text classification datasets.

Query plus label

A rule in Rubrix applies a chosen label to all records that match a given query, so all you need is a query plus a label. After entering a query in the search bar and selecting a label, you will see some metrics for the rule on the right and the records matching your query in the list below.

If you are happy with the metrics and the matching records, you can save the rule by clicking on “Save rule”. The rule is then stored as part of the current dataset and can be accessed via the Manage rules button.
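
The query-plus-label idea can be illustrated with a minimal, self-contained sketch. It does not use the Rubrix API: the `Record` and `Rule` classes are hypothetical, and a case-insensitive substring match stands in for the search query you would type into the search bar.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    text: str
    # Gold label, present only if the record has been annotated.
    annotation: Optional[str] = None


@dataclass
class Rule:
    # Hypothetical stand-in for a rule: a keyword plays the role of the query.
    query: str
    label: str

    def matches(self, record: Record) -> bool:
        # Simplification: case-insensitive substring match instead of a real search query.
        return self.query.lower() in record.text.lower()

    def apply(self, records: List[Record]) -> List[Optional[str]]:
        # One weak label per record; None means the rule abstains on that record.
        return [self.label if self.matches(r) else None for r in records]


records = [
    Record("Win a free prize now"),
    Record("Meeting moved to 3pm"),
    Record("FREE entry to the draw"),
]
rule = Rule(query="free", label="SPAM")
weak_labels = rule.apply(records)
```

Here the rule labels the first and third records as `SPAM` and abstains on the second, which mirrors what the record list shows for a query in the web app.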

Note

If you want to add labels to the available list of labels, you can switch to the Annotation mode and create labels there.

Rule Metrics

After entering a query and selecting a label, Rubrix provides you with some key metrics about the rule. Some metrics are only available if your dataset also contains annotated records.

  • Coverage: Number (and percentage) of records labeled by the rule

  • Annotated coverage: Number (and percentage) of annotated records labeled by the rule

  • Correct/incorrect: Number of records the rule labeled correctly/incorrectly (if annotations are available)

  • Precision: Percentage of correct labels given by the rule (if annotations are available)
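
As a worked example, the metrics above can be computed by hand for a single rule. The function below is a sketch based on one reading of the definitions in this list, not Rubrix's own implementation; the name `rule_metrics` and the toy data are hypothetical. A weak label of `None` means the rule did not match the record, and an annotation of `None` means the record has not been annotated.

```python
from typing import Dict, List, Optional


def rule_metrics(
    weak_labels: List[Optional[str]],
    annotations: List[Optional[str]],
) -> Dict[str, Optional[float]]:
    total = len(weak_labels)
    # Records the rule assigned a label to.
    covered = sum(1 for w in weak_labels if w is not None)
    # Restrict to records that carry a human annotation.
    annotated = [(w, a) for w, a in zip(weak_labels, annotations) if a is not None]
    annotated_covered = [(w, a) for w, a in annotated if w is not None]
    correct = sum(1 for w, a in annotated_covered if w == a)
    incorrect = len(annotated_covered) - correct
    return {
        "coverage": covered / total if total else None,
        "annotated_coverage": len(annotated_covered) / len(annotated) if annotated else None,
        "correct": correct,
        "incorrect": incorrect,
        # Precision is undefined when the rule labels no annotated record.
        "precision": correct / len(annotated_covered) if annotated_covered else None,
    }


weak_labels = ["SPAM", None, "SPAM", "SPAM", None]
annotations = ["SPAM", "HAM", None, "HAM", None]
metrics = rule_metrics(weak_labels, annotations)
```

In this toy dataset the rule labels 3 of 5 records (coverage 0.6) and 2 of the 3 annotated records (annotated coverage of about 0.67). One of those two labels agrees with the annotation, so correct/incorrect is 1/1 and precision is 0.5.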

Manage rules

Here you will see a list of your saved rules as well as their overall metrics. You can edit a rule by clicking on its name, or delete it by clicking on the trash icon.
