# CER / WER metrics

## atr_ner_eval.metrics.cer

Compute CER and WER from a label/prediction dataset.

### Attributes

- `logger = logging.getLogger(__name__)` (module-attribute)
### Classes

#### TextEval

Bases: `NamedTuple`

Compute text errors between a label and prediction.

##### Attributes

- `label: str` (instance-attribute): Label text.
- `prediction: str` (instance-attribute): Predicted text.
##### char_errors (property)

`char_errors: int`

Compute character errors between the label and prediction.

Returns:

| Type | Description |
| --- | --- |
| `int` | Character errors. |

Examples:

    >>> TextEval("I really like cats", "I love cats").char_errors
    9
##### word_errors (property)

`word_errors: int`

Compute word errors between the label and prediction.

Returns:

| Type | Description |
| --- | --- |
| `int` | Word errors. |

Examples:

    >>> TextEval("I really like cats", "I love cats").word_errors
    2
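Neither property's implementation is shown on this page; both counts are consistent with a plain Levenshtein edit distance, applied to characters for `char_errors` and to words for `word_errors`. A minimal sketch (the `levenshtein` helper below is hypothetical and not part of this module):

```python
# A standard Levenshtein edit distance, shown only to illustrate the
# counts in the doctests above; `levenshtein` is a hypothetical helper,
# not part of atr_ner_eval.
from collections.abc import Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Minimum number of insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,  # deletion of x
                    current[j - 1] + 1,  # insertion of y
                    previous[j - 1] + (x != y),  # substitution
                )
            )
        previous = current
    return previous[-1]


# Character errors operate on the raw strings, word errors on word lists.
assert levenshtein("I really like cats", "I love cats") == 9
assert levenshtein("I really like cats".split(), "I love cats".split()) == 2
```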
##### char_totals (property)

`char_totals: int`

Compute the max number of characters in the label or prediction.

Returns:

| Type | Description |
| --- | --- |
| `int` | Number of characters. |

Examples:

    >>> TextEval("I really like cats", "I love cats").char_totals
    18
##### word_totals (property)

`word_totals: int`

Compute the max number of words in the label or prediction.

Returns:

| Type | Description |
| --- | --- |
| `int` | Number of words. |

Examples:

    >>> TextEval("I really like cats", "I love cats").word_totals
    4
#### TotalScore

`TotalScore()`

Compute total evaluation scores.

Initialize errors and counts.

Examples:

    >>> score = TotalScore()
##### Attributes

- `char_errors = defaultdict(int)` (instance-attribute)
- `word_errors = defaultdict(int)` (instance-attribute)
- `char_totals = defaultdict(int)` (instance-attribute)
- `word_totals = defaultdict(int)` (instance-attribute)
- `count = defaultdict(int)` (instance-attribute)
##### categories (property)

`categories: list[str]`

List of semantic categories for which scores are computed.

Returns:

| Type | Description |
| --- | --- |
| `list[str]` | The list of categories. |

Examples:

    >>> score.categories
    ['total', 'animal']
##### cer (property)

`cer: defaultdict[str, float]`

Compute the Character Error Rate (%) for each category.

Returns:

| Type | Description |
| --- | --- |
| `defaultdict[str, float]` | The Character Error Rate per category. |

Examples:

    >>> score.cer
    {'total': 38.9, 'animal': 0.0}
##### wer (property)

`wer: defaultdict[str, float]`

Compute the Word Error Rate (%) for each category.

Returns:

| Type | Description |
| --- | --- |
| `defaultdict[str, float]` | The Word Error Rate per category. |

Examples:

    >>> score.wer
    {'total': 25.0, 'animal': 0.0}
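Both rates follow the usual errors-over-reference-length definition. A small sketch reproducing the example `cer` output from the accumulated counters (the one-decimal rounding is an assumption inferred from the sample values; `wer` would be analogous with word counts):

```python
# CER per category = 100 * char_errors / char_totals. Rounding to one
# decimal place is an assumption inferred from the example outputs
# above: 7 / 18 * 100 = 38.9 and 1 / 4 * 100 = 25.0.
char_errors = {"total": 7, "animal": 0}
char_totals = {"total": 18, "animal": 4}

cer = {
    category: round(100 * char_errors[category] / char_totals[category], 1)
    for category in char_totals
}
assert cer == {"total": 38.9, "animal": 0.0}
```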
##### update

`update(key, score: TextEval)`

Update the score with the current evaluation for a given key.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `key` | `str` | Category to update. | required |
| `score` | `TextEval` | Current score. | required |
Examples:

    >>> score.update("total", TextEval("I really like cats", "I like cats"))
    >>> score.update("animal", TextEval("cats", "cats"))
    >>> score.char_errors
    defaultdict(<class 'int'>, {'total': 7, 'animal': 0})
    >>> score.word_errors
    defaultdict(<class 'int'>, {'total': 1, 'animal': 0})
    >>> score.char_totals
    defaultdict(<class 'int'>, {'total': 18, 'animal': 4})
    >>> score.word_totals
    defaultdict(<class 'int'>, {'total': 4, 'animal': 1})
    >>> score.count
    defaultdict(<class 'int'>, {'total': 1, 'animal': 1})
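Read together with the attribute list above, the doctest suggests that `update` simply accumulates each pair's errors and totals into the per-category counters. A hypothetical re-creation, not the library's actual source:

```python
from collections import defaultdict


class TotalScoreSketch:
    """Hypothetical re-creation of TotalScore's accumulation, inferred
    from the doctest above; not the library's actual implementation."""

    def __init__(self) -> None:
        self.char_errors = defaultdict(int)
        self.word_errors = defaultdict(int)
        self.char_totals = defaultdict(int)
        self.word_totals = defaultdict(int)
        self.count = defaultdict(int)

    def update(self, key, score) -> None:
        # Fold this label/prediction pair into the category's counters.
        self.char_errors[key] += score.char_errors
        self.word_errors[key] += score.word_errors
        self.char_totals[key] += score.char_totals
        self.word_totals[key] += score.word_totals
        self.count[key] += 1
```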
### Functions

#### format_string_for_wer

`format_string_for_wer(text: str) -> list[str]`

Format string for WER computation.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `text` | `str` | The text to format. | required |

Returns:

| Type | Description |
| --- | --- |
| `list[str]` | A list of words formatted for WER computation. |

Examples:

    >>> format_string_for_wer(text="this is a string to evaluate")
    ['this', 'is', 'a', 'string', 'to', 'evaluate']
    >>> format_string_for_wer(text="this is another string to evaluate")
    ['this', 'is', 'another', 'string', 'to', 'evaluate']
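The examples only exercise whitespace tokenization; a minimal sketch that reproduces them (the real function may also separate punctuation, which these examples cannot confirm):

```python
def format_string_for_wer_sketch(text: str) -> list[str]:
    # Whitespace tokenization reproduces the examples above; any
    # punctuation handling in the real function is not exercised here.
    return text.split()


assert format_string_for_wer_sketch("this is a string to evaluate") == [
    "this", "is", "a", "string", "to", "evaluate",
]
```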
#### format_string_for_cer

`format_string_for_cer(text: str) -> str`

Format string for CER computation.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `text` | `str` | The text to format. | required |

Returns:

| Type | Description |
| --- | --- |
| `str` | The formatted text for CER computation. |

Examples:

    >>> format_string_for_cer(text="this is a string to evaluate")
    'this is a string to evaluate'
    >>> format_string_for_cer(text="this is another string to evaluate")
    'this is another string to evaluate'
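Both examples return the input unchanged, so only whitespace normalization can be inferred; a minimal sketch under that assumption:

```python
def format_string_for_cer_sketch(text: str) -> str:
    # Collapse runs of whitespace to single spaces; on the examples
    # above this is the identity, matching the documented outputs.
    return " ".join(text.split())


assert format_string_for_cer_sketch("this is a string to evaluate") == (
    "this is a string to evaluate"
)
```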
#### make_prettytable

`make_prettytable(score: TotalScore) -> PrettyTable`

Format and display results using PrettyTable.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `score` | `TotalScore` | Total scores. | required |

Returns:

| Type | Description |
| --- | --- |
| `PrettyTable` | The evaluation table formatted in Markdown. |
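The exact columns are not documented on this page; a hypothetical sketch of building a Markdown-styled `PrettyTable` from the example scores above (the column names are assumptions):

```python
from prettytable import MARKDOWN, PrettyTable

# Hypothetical column layout; only the Markdown styling is implied by
# the return description above.
table = PrettyTable()
table.set_style(MARKDOWN)
table.field_names = ["Category", "CER (%)", "WER (%)"]
table.add_row(["total", 38.9, 25.0])
table.add_row(["animal", 0.0, 0.0])
print(table)
```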
#### merge_entities

`merge_entities(entities: list[tuple[str, str]]) -> dict[str, str]`

Iterate over entities and merge the text for each entity type.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `entities` | `list[tuple[str, str]]` | A list of entities. | required |

Returns:

| Type | Description |
| --- | --- |
| `dict[str, str]` | A dictionary with entity types as keys and the corresponding text as values. |
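Neither the tuple order nor the join separator is documented; a hedged sketch assuming `(entity_type, text)` pairs joined with single spaces:

```python
from collections import defaultdict


def merge_entities_sketch(entities: list[tuple[str, str]]) -> dict[str, str]:
    # Assumes (entity_type, text) tuples and a space separator; both
    # are assumptions, since neither is documented above.
    merged = defaultdict(list)
    for entity_type, text in entities:
        merged[entity_type].append(text)
    return {entity_type: " ".join(texts) for entity_type, texts in merged.items()}


assert merge_entities_sketch(
    [("animal", "cats"), ("animal", "dogs"), ("person", "Alice")]
) == {"animal": "cats dogs", "person": "Alice"}
```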
#### compute_cer_wer

`compute_cer_wer(label_dir: Path, prediction_dir: Path, by_category: bool = False) -> None`

Read BIO files and compute Character and Word Error Rates, globally or for each NER category.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `label_dir` | `Path` | Path to the directory of reference BIO files. | required |
| `prediction_dir` | `Path` | Path to the directory of prediction BIO files. | required |
| `by_category` | `bool` | Whether to display CER/WER by category. | `False` |

Returns:

| Type | Description |
| --- | --- |
| `None` | Nothing is returned; a Markdown-formatted table of evaluation results is displayed. |
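A hedged usage sketch; the directory paths are placeholders:

```python
from pathlib import Path

from atr_ner_eval.metrics.cer import compute_cer_wer

# Hypothetical directory layout; compute_cer_wer displays a Markdown
# table of CER/WER scores, optionally broken down by NER category.
compute_cer_wer(
    label_dir=Path("data/labels"),
    prediction_dir=Path("data/predictions"),
    by_category=True,
)
```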