Tasks#

This section gives you ideas about the kinds of tasks you can use Rubrix for. It also describes some of the tasks on our roadmap. If there is a task you want and don’t see here, or you want to contribute a task, file an issue or use the Discussion forum on Rubrix’s GitHub page.

Supported tasks#

Text classification#

According to the amazing NLP Progress resource by Seb Ruder:

Text classification is the task of assigning a sentence or document an appropriate category. The categories depend on the chosen dataset and can range from topics.

Rubrix is flexible with input and output shapes, which means you can model many related tasks.
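As a rough illustration of what flexible input and output shapes can mean, here is a minimal plain-Python sketch. The `ClassificationExample` class, its fields, and the sample texts and labels are hypothetical and are not Rubrix's API; they only show how one record shape can cover a single text, a pair of texts, and multiple labels per record.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ClassificationExample:
    """Hypothetical container (not part of Rubrix): named text inputs plus scored labels."""
    inputs: Dict[str, str]
    predictions: List[Tuple[str, float]] = field(default_factory=list)
    multi_label: bool = False

    def predicted_labels(self, threshold: float = 0.5) -> List[str]:
        """Top label for single-label records; every label at or above threshold otherwise."""
        if not self.predictions:
            return []
        if self.multi_label:
            return [label for label, score in self.predictions if score >= threshold]
        best_label, _ = max(self.predictions, key=lambda pair: pair[1])
        return [best_label]


# One input, one label (made-up sentiment example).
sentiment = ClassificationExample(
    inputs={"text": "I loved this film"},
    predictions=[("positive", 0.9), ("negative", 0.1)],
)

# Two inputs, one label (made-up sentence-pair example).
pair = ClassificationExample(
    inputs={"premise": "A man is sleeping", "hypothesis": "A person is awake"},
    predictions=[("contradiction", 0.8), ("entailment", 0.05), ("neutral", 0.15)],
)

# One input, several labels at once (made-up topic example).
topics = ClassificationExample(
    inputs={"text": "The central bank raised rates before the election"},
    predictions=[("economy", 0.85), ("politics", 0.7), ("sports", 0.02)],
    multi_label=True,
)

for example in (sentiment, pair, topics):
    print(example.inputs, "->", example.predicted_labels())
```

Keeping the inputs as a dictionary of named fields is one possible design choice: it lets the same structure hold one text or several without changing the record type.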

Token classification#

The most well-known task in this category is probably Named Entity Recognition:

Named entity recognition (NER) is the task of tagging entities in text with their corresponding type. Approaches typically use BIO notation, which differentiates the beginning (B) and the inside (I) of entities. O is used for non-entity tokens.
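To make the BIO notation concrete, the sketch below converts a list of BIO tags into entity spans. The function name `bio_to_spans`, the sample sentence, and the `PER`/`LOC` labels are assumptions chosen for illustration; this is plain Python, not a Rubrix function, and treating a stray `I-` tag as the start of a new entity is one lenient convention among several.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Convert BIO tags to (entity_type, start_token, end_token_exclusive) spans."""
    spans = []
    start, current = None, None
    for i, tag in enumerate(tags):
        # An I- tag continues the open entity only if the type matches.
        if tag.startswith("I-") and current == tag[2:]:
            continue
        # Any other tag closes the open entity, if there is one.
        if current is not None:
            spans.append((current, start, i))
            start, current = None, None
        # B- opens a new entity; a stray I- is also treated as an opening (lenient choice).
        if tag.startswith(("B-", "I-")):
            start, current = i, tag[2:]
    # Close an entity that runs to the end of the sequence.
    if current is not None:
        spans.append((current, start, len(tags)))
    return spans


# Made-up example sentence and tags.
tokens = ["Ada", "Lovelace", "visited", "New", "York", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O"]

entities = [(label, " ".join(tokens[s:e])) for label, s, e in bio_to_spans(tags)]
print(entities)
```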

Rubrix is flexible with input and output shapes, which means you can model related tasks, for example:

  • Named entity recognition

  • Part of speech tagging

  • Slot filling

Text2Text#

The most typical and oldest task in this category is probably Machine Translation:

Machine translation is the task of translating a sentence in a source language to a different target language.

The common frame of this category is that the model receives and outputs a sequence of tokens. It encompasses a variety of tasks such as:

  • text summarization

  • machine translation

  • natural language generation

  • paraphrase generation, etc.
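The sequence-in, sequence-out frame shared by these tasks can be sketched as a record holding one input text and several scored candidate outputs. The `Text2TextExample` class, its fields, and the sample translation are hypothetical and only illustrate the shape of the data; they are not Rubrix's API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Text2TextExample:
    """Hypothetical container (not part of Rubrix): one input sequence, scored output sequences."""
    text: str
    candidates: List[Tuple[str, float]] = field(default_factory=list)

    def best(self) -> Optional[str]:
        """Return the highest-scoring candidate, or None if there are no candidates."""
        if not self.candidates:
            return None
        return max(self.candidates, key=lambda pair: pair[1])[0]


# Made-up machine translation example with two candidate outputs.
translation = Text2TextExample(
    text="Hola mundo",
    candidates=[("Hi world", 0.41), ("Hello world", 0.92)],
)
print(translation.best())
```

The same shape would fit summarization or paraphrase generation: only the meaning of the input and output sequences changes.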

Tasks on the roadmap#

Natural language processing#

Computer vision#

  • Image classification

  • Image captioning

Speech#

  • Speech2Text