Overview of the toolbox
The atr-ner-eval package provides an easy way to compute a wide variety of metrics to evaluate automatic workflows for Automatic Text Recognition (ATR) and Named Entity Recognition (NER).
How to use the toolbox?
1. Format your data to BIO2 format
The library expects ground truth and prediction files to be in BIO2 format.
└── dataset/
    ├── labels/
    │   ├── file_001.bio
    │   ├── file_002.bio
    │   └── file_003.bio
    └── predictions/
        ├── file_001.bio
        ├── file_002.bio
        └── file_003.bio
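Before computing metrics, it can help to confirm that every ground truth file has a prediction file with the same name. The following is a minimal sketch, assuming the labels/ and predictions/ layout shown above. The helper name find_unpaired is hypothetical and is not part of atr-ner-eval.

```python
from pathlib import Path


def find_unpaired(dataset_dir):
    """Return (missing_predictions, missing_labels) as sorted lists of file names.

    Hypothetical helper, not part of atr-ner-eval. Assumes the layout
    <dataset_dir>/labels/*.bio and <dataset_dir>/predictions/*.bio.
    """
    root = Path(dataset_dir)
    labels = {path.name for path in (root / "labels").glob("*.bio")}
    predictions = {path.name for path in (root / "predictions").glob("*.bio")}
    # Files present on one side only, reported by name.
    return sorted(labels - predictions), sorted(predictions - labels)


if __name__ == "__main__":
    missing_predictions, missing_labels = find_unpaired("dataset")
    print("Labels without a prediction:", missing_predictions)
    print("Predictions without a label:", missing_labels)
```

Two empty lists mean the two folders contain the same set of file names.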
To ensure that your BIO files are correctly formatted, you can run:
bio-parser validate dataset/*/*.bio
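For intuition about what a BIO2 check involves, here is a minimal sketch of the core BIO2 (IOB2) rule: an I-X tag must directly follow a B-X or I-X tag of the same entity type. This is not how bio-parser is implemented, and check_bio2_tags is a hypothetical name. The function works on a plain list of tags and makes no assumption about how .bio files lay out tokens and tags.

```python
def check_bio2_tags(tags):
    """Return the indices of tags that break the BIO2 (IOB2) scheme.

    Hypothetical helper for illustration only; not part of bio-parser.
    """
    errors = []
    previous = "O"  # Treat the start of the sequence as "outside".
    for index, tag in enumerate(tags):
        if tag == "O":
            pass  # Outside any entity: always valid.
        elif tag.startswith("B-") and len(tag) > 2:
            pass  # Beginning of an entity: always valid.
        elif tag.startswith("I-") and len(tag) > 2:
            entity_type = tag[2:]
            # An inside tag must continue an entity of the same type.
            if previous not in (f"B-{entity_type}", f"I-{entity_type}"):
                errors.append(index)
        else:
            errors.append(index)  # Not O, B-<type>, or I-<type>.
        previous = tag
    return errors


if __name__ == "__main__":
    print(check_bio2_tags(["B-PER", "I-PER", "O", "I-LOC"]))
```

An empty list means the tag sequence follows the scheme. Each returned index points at a tag that is malformed or that continues an entity which was never opened.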
2. Compute metrics
To list the available metrics, use the --help option:
atr-ner-eval --help
To learn more about each metric, see the Usage section.