Overview of the toolbox

The atr-ner-eval package provides an easy way to compute a wide variety of metrics to evaluate automatic workflows for Automatic Text Recognition (ATR) and Named Entity Recognition (NER).

How to use the toolbox?

1. Format your data in BIO2 format

The library expects ground truth and prediction files to be in BIO2 format, for example organized as follows:

└── dataset/
    ├── labels/
    │   ├── file_001.bio
    │   ├── file_002.bio
    │   └── file_003.bio
    └── predictions/
        ├── file_001.bio
        ├── file_002.bio
        └── file_003.bio
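
Before computing metrics, it can help to confirm that every ground truth file has a prediction file and vice versa. The sketch below is not part of atr-ner-eval. It assumes that files are paired by name across the labels/ and predictions/ directories, which is suggested by the layout above but not stated explicitly, and the helper name unmatched_files is hypothetical.

```python
from pathlib import Path


def unmatched_files(dataset_dir):
    """Return two sorted lists of file names:
    labels without a prediction, and predictions without a label.

    Assumption: files are paired by identical name in labels/ and predictions/.
    """
    root = Path(dataset_dir)
    labels = {path.name for path in (root / "labels").glob("*.bio")}
    predictions = {path.name for path in (root / "predictions").glob("*.bio")}
    return sorted(labels - predictions), sorted(predictions - labels)
```

Calling unmatched_files("dataset") on the layout shown above would return two empty lists, since each of the three label files has a prediction file with the same name.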

To ensure that your BIO files are correctly formatted, you can run:

bio-parser validate dataset/*/*.bio
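
For readers unfamiliar with BIO2, the core tagging rule is that every entity starts with a B-<type> tag, continues with I-<type> tags of the same type, and tokens outside any entity are tagged O. The sketch below illustrates only that rule. It is not how bio-parser validate works internally, and it assumes each line of a file holds one token followed by a space and its tag. The helper name bio2_violations and the sample sentence are hypothetical.

```python
def bio2_violations(tags):
    """Return the indices of tags that break the BIO2 tagging rule."""
    violations = []
    previous = "O"
    for index, tag in enumerate(tags):
        if tag == "O":
            pass  # Outside any entity: always allowed.
        elif tag.startswith("B-") and len(tag) > 2:
            pass  # Beginning of an entity: always allowed.
        elif tag.startswith("I-") and len(tag) > 2:
            # An inside tag must continue an entity of the same type.
            entity_type = tag[2:]
            if previous not in ("B-" + entity_type, "I-" + entity_type):
                violations.append(index)
        else:
            violations.append(index)  # Not a recognizable BIO2 tag.
        previous = tag
    return violations


# Assumed line layout: "<token> <tag>".
lines = ["Marie B-PER", "Curie I-PER", "visited O", "Paris B-LOC"]
tags = [line.rsplit(" ", 1)[1] for line in lines]
print(bio2_violations(tags))  # prints []
```

A sequence such as ["O", "I-PER"] would be flagged at index 1, because an I-PER tag cannot open an entity in BIO2.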

2. Compute metrics

To list the available metrics, use the --help option:

atr-ner-eval --help

To learn more about each metric, see the Usage section.