tesseract
5.0.0-alpha-619-ge9db
#include <lm_state.h>
Public Member Functions
LanguageModelNgramInfo (const char *c, int l, bool p, float nc, float ncc)
Public Attributes
STRING | context
int | context_unichar_step_len
bool | pruned
float | ngram_cost
-ln(P_ngram_model(path))
float | ngram_and_classifier_cost
-[ ln(P_classifier(path)) + scale_factor * ln(P_ngram_model(path)) ]
Struct for storing additional information used by the Ngram language model component.
Definition at line 70 of file lm_state.h.
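tesseract::LanguageModelNgramInfo::LanguageModelNgramInfo (const char *c, int l, bool p, float nc, float ncc) [inline]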
Definition at line 71 of file lm_state.h.
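The constructor arguments appear to map one-to-one onto the public attributes. A minimal sketch of that mapping follows, assuming the parameters initialize the members in declaration order. `NgramInfoSketch` is a hypothetical stand-in, not the Tesseract type, and it uses `std::string` in place of `STRING` so that it compiles on its own.

```cpp
#include <string>

// Hypothetical stand-in for tesseract::LanguageModelNgramInfo.
// Assumption: each constructor parameter initializes the member whose
// name it abbreviates (c -> context, l -> context_unichar_step_len, ...).
struct NgramInfoSketch {
  NgramInfoSketch(const char *c, int l, bool p, float nc, float ncc)
      : context(c),
        context_unichar_step_len(l),
        pruned(p),
        ngram_cost(nc),
        ngram_and_classifier_cost(ncc) {}

  std::string context;              // context string
  int context_unichar_step_len;     // context length in utf8 steps
  bool pruned;                      // pruned by the character ngram model
  float ngram_cost;                 // -ln(P_ngram_model(path))
  float ngram_and_classifier_cost;  // combined cost
};
```

For example, `NgramInfoSketch info("ab", 2, false, 1.5f, 3.0f);` would describe a two-character context that has not been pruned.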
STRING tesseract::LanguageModelNgramInfo::context
Context string.
Definition at line 74 of file lm_state.h.
int tesseract::LanguageModelNgramInfo::context_unichar_step_len
Length of the context, measured by advancing through it with UNICHAR::utf8_step(), that is, in UTF-8 characters rather than bytes. It should be at most the order of the character ngram model in use.
Definition at line 77 of file lm_state.h.
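To illustrate what measuring a length in utf8 steps means, the sketch below counts the UTF-8 sequences in a context string. `Utf8StepSketch` and `CountUtf8Steps` are hypothetical helpers written for this example. The first only approximates the behavior of `UNICHAR::utf8_step()` by inspecting the lead byte and is not the Tesseract implementation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for UNICHAR::utf8_step(): byte length of the UTF-8
// sequence introduced by `lead`, or 0 if `lead` cannot start a sequence.
inline int Utf8StepSketch(unsigned char lead) {
  if ((lead & 0x80) == 0x00) return 1;  // 0xxxxxxx: ASCII
  if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
  if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
  if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
  return 0;                             // continuation or invalid byte
}

// Counts the steps needed to walk `context` from start to end.
// Returns -1 if the string is malformed or ends mid-sequence.
inline int CountUtf8Steps(const std::string &context) {
  int steps = 0;
  std::size_t i = 0;
  while (i < context.size()) {
    const int step = Utf8StepSketch(static_cast<unsigned char>(context[i]));
    if (step == 0 || i + step > context.size()) return -1;
    i += static_cast<std::size_t>(step);
    ++steps;
  }
  return steps;
}
```

Under this sketch, a context holding `h` followed by a two-byte `é` occupies three bytes but counts as two steps, which is the quantity compared against the order of the ngram model.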
float tesseract::LanguageModelNgramInfo::ngram_and_classifier_cost
-[ ln(P_classifier(path)) + scale_factor * ln(P_ngram_model(path)) ]
Definition at line 86 of file lm_state.h.
float tesseract::LanguageModelNgramInfo::ngram_cost
-ln(P_ngram_model(path))
Definition at line 84 of file lm_state.h.
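The two cost formulas can be written out directly. The sketch below assumes the path probabilities are already available as plain floats in (0, 1]. The function names and the way `scale_factor` is passed in are hypothetical and are not taken from the Tesseract sources.

```cpp
#include <cassert>
#include <cmath>

// ngram_cost = -ln(P_ngram_model(path)).
// Hypothetical helper; p_ngram must lie in (0, 1].
inline float NgramCostSketch(float p_ngram) {
  assert(p_ngram > 0.0f && p_ngram <= 1.0f);
  return -std::log(p_ngram);
}

// ngram_and_classifier_cost =
//   -[ ln(P_classifier(path)) + scale_factor * ln(P_ngram_model(path)) ].
// Hypothetical helper; both probabilities must lie in (0, 1].
inline float NgramAndClassifierCostSketch(float p_classifier, float p_ngram,
                                          float scale_factor) {
  assert(p_classifier > 0.0f && p_classifier <= 1.0f);
  assert(p_ngram > 0.0f && p_ngram <= 1.0f);
  return -(std::log(p_classifier) + scale_factor * std::log(p_ngram));
}
```

Both values are negative log-probabilities, so lower is better and a probability of 1 contributes a cost of 0. With `scale_factor` set to 0 the combined cost reduces to the classifier term alone, and larger values give the ngram model more weight.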
bool tesseract::LanguageModelNgramInfo::pruned
Paths with pruned set have been pruned out from the perspective of the character ngram model. They are still explored further because they represent a dictionary match or a top choice, so ngram_info is still computed for them in order to calculate the combined cost.
Definition at line 82 of file lm_state.h.