cjkpitch.h
// File:        cjkpitch.h
// Description: Code to determine fixed pitchness and the pitch if fixed,
//              for CJK text.
// Copyright 2011 Google Inc. All Rights Reserved.
// Author: takenaka@google.com (Hiroshi Takenaka)
// Created: Mon Jun 27 12:48:35 JST 2011
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
// http://www.apache.org/licenses/LICENSE-2.0
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//
#ifndef CJKPITCH_H_
#define CJKPITCH_H_

#include "blobbox.h"

// Function to test the "fixed-pitchness" of the input text and to
// estimate character pitch parameters for it, based on a CJK
// fixed-pitch layout model.
//
// This function assumes that fixed-pitch CJK text has the following
// characteristics:
//
// - Most glyphs are designed to fit within the same-sized square
//   (imaginary body). Also they are aligned to the center of their
//   imaginary bodies.
// - The imaginary body is always a regular rectangle.
// - There may be some extra space between character bodies
//   (tracking).
// - There may be some extra space after punctuation.
// - The text is *not* space-delimited. Thus spaces are rare.
// - A character may consist of multiple unconnected blobs.
//
// The function works in two passes. On pass 1, it looks for "good"
// blobs that have the same pitch on both sides and look like
// complete CJK characters. It then estimates the character pitch for
// every row, based on those good blobs. If we cannot find enough good
// blobs for a row, the pitch is estimated from other rows with
// similar character height instead.
//
// Pass 2 is an iterative process to fit the blobs into fixed-pitch
// character cells. Once we have estimated the character pitch, blobs
// that are almost as large as the pitch can be considered to be
// complete characters. And once we know that some blobs are complete
// characters, we can estimate the regions occupied by their
// neighbors. And so on.
//
// We repeat the process until all ambiguities are resolved. Then we
// make the final decision about the fixed-pitchness of each row and
// compute pitch and spacing parameters.
//
// (If a row is considered to be proportional, pitch_decision for the
// row is set to PITCH_CORR_PROP and a later phase
// (i.e. Textord::to_spacing()) should determine its spacing
// parameters.)
//
// This function doesn't provide all the information required by
// fixed_pitch_words(), so the rows need to be processed with
// make_prop_words() even if they are fixed pitch.
void compute_fixed_pitch_cjk(ICOORD page_tr,               // top right
                             TO_BLOCK_LIST *port_blocks);  // input list

#endif // CJKPITCH_H_
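
The two passes described in the header comment can be illustrated with a small, self-contained sketch. This is not the code in cjkpitch.cpp: the `Span` type, the `EstimatePitch` and `AssignCells` functions, and the `tolerance` parameter are hypothetical names introduced only for this example, and the real implementation works on Tesseract's own blob and row types with considerably more logic. The sketch assumes blobs are sorted left to right, takes the median pitch of blobs whose distances to both neighbors agree (pass 1), and then maps every blob to a fixed-pitch cell index relative to one known complete character (a single step of pass 2).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for a blob's horizontal extent (not a Tesseract type).
struct Span {
  int left;
  int right;
  // Twice the center, so that center arithmetic stays in integers.
  int center2() const { return left + right; }
};

// Pass 1 (sketch): a blob counts as "good" when the center-to-center
// distances to its left and right neighbors agree within `tolerance`.
// Returns the median pitch over the good blobs, or 0 if there are none.
int EstimatePitch(const std::vector<Span>& blobs, int tolerance) {
  assert(tolerance >= 0);
  std::vector<int> pitches;
  for (std::size_t i = 1; i + 1 < blobs.size(); ++i) {
    int prev = (blobs[i].center2() - blobs[i - 1].center2()) / 2;
    int next = (blobs[i + 1].center2() - blobs[i].center2()) / 2;
    if (prev > 0 && std::abs(prev - next) <= tolerance) {
      pitches.push_back((prev + next) / 2);
    }
  }
  if (pitches.empty()) return 0;
  auto mid = pitches.begin() + pitches.size() / 2;
  std::nth_element(pitches.begin(), mid, pitches.end());
  return *mid;
}

// Pass 2 (sketch of one step): given a pitch and the index of a blob known
// to be a complete character, assign each blob to the nearest fixed-pitch
// cell. Blobs that land in the same cell are parts of one character.
std::vector<int> AssignCells(const std::vector<Span>& blobs,
                             std::size_t anchor, int pitch) {
  assert(pitch > 0 && anchor < blobs.size());
  std::vector<int> cells;
  for (const Span& b : blobs) {
    double offset = (b.center2() - blobs[anchor].center2()) / 2.0;
    cells.push_back(static_cast<int>(std::lround(offset / pitch)));
  }
  return cells;
}
```

For a row such as `{{0,18},{20,38},{40,58},{60,68},{70,78},{80,98}}`, only the second blob has matching distances on both sides, so the estimated pitch comes from it alone. With that blob as the anchor, the two narrow blobs fall into the same cell, which is how a character made of multiple unconnected blobs would be recognized in this simplified model.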