Add kur with Arabic unicharset (similar to old tessdata) #23

Open
Shreeshrii opened this issue Mar 23, 2018 · 3 comments

@Shreeshrii
Contributor

See related issue in langdata.

tesseract-ocr/langdata#116

@Shreeshrii
Contributor Author

Shreeshrii commented Mar 23, 2018

This is kur_ara.lstm-unicharset in both tessdata_best and tessdata_fast.

141
NULL 0 Common 0
Joined 7 0,255,0,255,0,0,0,0,0,0 Latin 1 0 1 Joined	# Joined [4a 6f 69 6e 65 64 ]a
|Broken|0|1 f 0,255,0,255,0,0,0,0,0,0 Common 2 10 2 |Broken|0|1	# Broken
E 5 0,255,0,255,0,0,0,0,0,0 Latin 96 0 3 E	# E [45 ]A
M 5 0,255,0,255,0,0,0,0,0,0 Latin 97 0 4 M	# M [4d ]A
K 5 0,255,0,255,0,0,0,0,0,0 Latin 100 0 5 K	# K [4b ]A
É 5 0,255,0,255,0,0,0,0,0,0 Latin 130 0 6 É	# É [c9 ]A
I 5 0,255,0,255,0,0,0,0,0,0 Latin 107 0 7 I	# I [49 ]A
H 5 0,255,0,255,0,0,0,0,0,0 Latin 106 0 8 H	# H [48 ]A
N 5 0,255,0,255,0,0,0,0,0,0 Latin 99 0 9 N	# N [4e ]A
D 5 0,255,0,255,0,0,0,0,0,0 Latin 103 0 10 D	# D [44 ]A
Z 5 0,255,0,255,0,0,0,0,0,0 Latin 112 0 11 Z	# Z [5a ]A
Y 5 0,255,0,255,0,0,0,0,0,0 Latin 95 0 12 Y	# Y [59 ]A
A 5 0,255,0,255,0,0,0,0,0,0 Latin 104 0 13 A	# A [41 ]A
V 5 0,255,0,255,0,0,0,0,0,0 Latin 123 0 14 V	# V [56 ]A
R 5 0,255,0,255,0,0,0,0,0,0 Latin 102 0 15 R	# R [52 ]A
; 10 0,255,0,255,0,0,0,0,0,0 Common 16 10 16 ;	# ; [3b ]p
S 5 0,255,0,255,0,0,0,0,0,0 Latin 93 0 17 S	# S [53 ]A
Ê 5 0,255,0,255,0,0,0,0,0,0 Latin 94 0 18 Ê	# Ê [ca ]A
Û 5 0,255,0,255,0,0,0,0,0,0 Latin 101 0 19 Û	# Û [db ]A
Ş 5 0,255,0,255,0,0,0,0,0,0 Latin 121 0 20 Ş	# Ş [15e ]A
Î 5 0,255,0,255,0,0,0,0,0,0 Latin 98 0 21 Î	# Î [ce ]A
. 10 0,255,0,255,0,0,0,0,0,0 Common 22 6 22 .	# . [2e ]p
> 0 0,255,0,255,0,0,0,0,0,0 Common 23 10 86 >	# > [3e ]
L 5 0,255,0,255,0,0,0,0,0,0 Latin 108 0 24 L	# L [4c ]A
T 5 0,255,0,255,0,0,0,0,0,0 Latin 110 0 25 T	# T [54 ]A
C 5 0,255,0,255,0,0,0,0,0,0 Latin 119 0 26 C	# C [43 ]A
X 5 0,255,0,255,0,0,0,0,0,0 Latin 117 0 27 X	# X [58 ]A
U 5 0,255,0,255,0,0,0,0,0,0 Latin 105 0 28 U	# U [55 ]A
B 5 0,255,0,255,0,0,0,0,0,0 Latin 114 0 29 B	# B [42 ]A
O 5 0,255,0,255,0,0,0,0,0,0 Latin 111 0 30 O	# O [4f ]A
Ç 5 0,255,0,255,0,0,0,0,0,0 Latin 116 0 31 Ç	# Ç [c7 ]A
! 10 0,255,0,255,0,0,0,0,0,0 Common 32 10 32 !	# ! [21 ]p
4 8 0,255,0,255,0,0,0,0,0,0 Common 33 2 33 4	# 4 [34 ]0
( 10 0,255,0,255,0,0,0,0,0,0 Common 34 10 36 (	# ( [28 ]p
G 5 0,255,0,255,0,0,0,0,0,0 Latin 113 0 35 G	# G [47 ]A
) 10 0,255,0,255,0,0,0,0,0,0 Common 36 10 34 )	# ) [29 ]p
, 10 0,255,0,255,0,0,0,0,0,0 Common 37 6 37 ,	# , [2c ]p
F 5 0,255,0,255,0,0,0,0,0,0 Latin 126 0 38 F	# F [46 ]A
P 5 0,255,0,255,0,0,0,0,0,0 Latin 125 0 39 P	# P [50 ]A
W 5 0,255,0,255,0,0,0,0,0,0 Latin 109 0 40 W	# W [57 ]A
- 10 0,255,0,255,0,0,0,0,0,0 Common 41 3 41 -	# - [2d ]p
J 5 0,255,0,255,0,0,0,0,0,0 Latin 115 0 42 J	# J [4a ]A
Q 5 0,255,0,255,0,0,0,0,0,0 Latin 120 0 43 Q	# Q [51 ]A
Ü 5 0,255,0,255,0,0,0,0,0,0 Latin 129 0 44 Ü	# Ü [dc ]A
È 5 0,255,0,255,0,0,0,0,0,0 Latin 128 0 45 È	# È [c8 ]A
~ 0 0,255,0,255,0,0,0,0,0,0 Common 46 10 46 ~	# ~ [7e ]
: 10 0,255,0,255,0,0,0,0,0,0 Common 47 6 47 :	# : [3a ]p
\ 10 0,255,0,255,0,0,0,0,0,0 Common 48 10 48 \	# \ [5c ]p
3 8 0,255,0,255,0,0,0,0,0,0 Common 49 2 49 3	# 3 [33 ]0
» 10 0,255,0,255,0,0,0,0,0,0 Common 50 10 84 »	# » [bb ]p
„ 10 0,255,0,255,0,0,0,0,0,0 Common 51 10 51 „	# „ [201e ]p
" 10 0,255,0,255,0,0,0,0,0,0 Common 52 10 52 "	# " [22 ]p
^ 0 0,255,0,255,0,0,0,0,0,0 Common 53 10 53 ^	# ^ [5e ]
Ö 5 0,255,0,255,0,0,0,0,0,0 Latin 127 0 54 Ö	# Ö [d6 ]A
' 10 0,255,0,255,0,0,0,0,0,0 Common 55 10 55 '	# ' [27 ]p
# 10 0,255,0,255,0,0,0,0,0,0 Common 56 4 56 #	# # [23 ]p
/ 10 0,255,0,255,0,0,0,0,0,0 Common 57 6 57 /	# / [2f ]p
1 8 0,255,0,255,0,0,0,0,0,0 Common 58 2 58 1	# 1 [31 ]0
Ù 5 0,255,0,255,0,0,0,0,0,0 Latin 118 0 59 Ù	# Ù [d9 ]A
” 10 0,255,0,255,0,0,0,0,0,0 Common 60 10 60 "	# ” [201d ]p
Ğ 5 0,255,0,255,0,0,0,0,0,0 Latin 133 0 61 Ğ	# Ğ [11e ]A
2 8 0,255,0,255,0,0,0,0,0,0 Common 62 2 62 2	# 2 [32 ]0
“ 10 0,255,0,255,0,0,0,0,0,0 Common 63 10 63 "	# “ [201c ]p
9 8 0,255,0,255,0,0,0,0,0,0 Common 64 2 64 9	# 9 [39 ]0
5 8 0,255,0,255,0,0,0,0,0,0 Common 65 2 65 5	# 5 [35 ]0
Š 5 0,255,0,255,0,0,0,0,0,0 Latin 131 0 66 Š	# Š [160 ]A
_ 10 0,255,0,255,0,0,0,0,0,0 Common 67 10 67 _	# _ [5f ]p
à 5 0,255,0,255,0,0,0,0,0,0 Latin 134 0 68 Ã	# à [c3 ]A
* 10 0,255,0,255,0,0,0,0,0,0 Common 69 10 69 *	# * [2a ]p
+ 0 0,255,0,255,0,0,0,0,0,0 Common 70 3 70 +	# + [2b ]
7 8 0,255,0,255,0,0,0,0,0,0 Common 71 2 71 7	# 7 [37 ]0
Ë 5 0,255,0,255,0,0,0,0,0,0 Latin 132 0 72 Ë	# Ë [cb ]A
® 0 0,255,0,255,0,0,0,0,0,0 Common 73 10 73 ®	# ® [ae ]
& 10 0,255,0,255,0,0,0,0,0,0 Common 74 10 74 &	# & [26 ]p
6 8 0,255,0,255,0,0,0,0,0,0 Common 75 2 75 6	# 6 [36 ]0
@ 10 0,255,0,255,0,0,0,0,0,0 Common 76 10 76 @	# @ [40 ]p
0 8 0,255,0,255,0,0,0,0,0,0 Common 77 2 77 0	# 0 [30 ]0
? 10 0,255,0,255,0,0,0,0,0,0 Common 78 10 78 ?	# ? [3f ]p
İ 5 0,255,0,255,0,0,0,0,0,0 Latin 107 0 79 İ	# İ [130 ]A
$ 0 0,255,0,255,0,0,0,0,0,0 Common 80 4 80 $	# $ [24 ]
§ 10 0,255,0,255,0,0,0,0,0,0 Common 81 10 81 §	# § [a7 ]p
Þ 5 0,255,0,255,0,0,0,0,0,0 Latin 122 0 82 Þ	# Þ [de ]A
% 10 0,255,0,255,0,0,0,0,0,0 Common 83 4 83 %	# % [25 ]p
« 10 0,255,0,255,0,0,0,0,0,0 Common 84 10 50 «	# « [ab ]p
8 8 0,255,0,255,0,0,0,0,0,0 Common 85 2 85 8	# 8 [38 ]0
< 0 0,255,0,255,0,0,0,0,0,0 Common 86 10 23 <	# < [3c ]
[ 10 0,255,0,255,0,0,0,0,0,0 Common 87 10 90 [	# [ [5b ]p
| 0 0,255,0,255,0,0,0,0,0,0 Common 88 10 88 |	# | [7c ]
` 0 0,255,0,255,0,0,0,0,0,0 Common 89 10 89 '	# ` [60 ]
] 10 0,255,0,255,0,0,0,0,0,0 Common 90 10 87 ]	# ] [5d ]p
€ 0 0,255,0,255,0,0,0,0,0,0 Common 91 4 91 €	# € [20ac ]
¬ 0 0,255,0,255,0,0,0,0,0,0 Common 92 10 92 ¬	# ¬ [ac ]
s 3 0,255,0,255,0,0,0,0,0,0 Latin 17 0 93 s	# s [73 ]a
ê 3 0,255,0,255,0,0,0,0,0,0 Latin 18 0 94 ê	# ê [ea ]a
y 3 0,255,0,255,0,0,0,0,0,0 Latin 12 0 95 y	# y [79 ]a
e 3 0,255,0,255,0,0,0,0,0,0 Latin 3 0 96 e	# e [65 ]a
m 3 0,255,0,255,0,0,0,0,0,0 Latin 4 0 97 m	# m [6d ]a
î 3 0,255,0,255,0,0,0,0,0,0 Latin 21 0 98 î	# î [ee ]a
n 3 0,255,0,255,0,0,0,0,0,0 Latin 9 0 99 n	# n [6e ]a
k 3 0,255,0,255,0,0,0,0,0,0 Latin 5 0 100 k	# k [6b ]a
û 3 0,255,0,255,0,0,0,0,0,0 Latin 19 0 101 û	# û [fb ]a
r 3 0,255,0,255,0,0,0,0,0,0 Latin 15 0 102 r	# r [72 ]a
d 3 0,255,0,255,0,0,0,0,0,0 Latin 10 0 103 d	# d [64 ]a
a 3 0,255,0,255,0,0,0,0,0,0 Latin 13 0 104 a	# a [61 ]a
u 3 0,255,0,255,0,0,0,0,0,0 Latin 28 0 105 u	# u [75 ]a
h 3 0,255,0,255,0,0,0,0,0,0 Latin 8 0 106 h	# h [68 ]a
i 3 0,255,0,255,0,0,0,0,0,0 Latin 7 0 107 i	# i [69 ]a
l 3 0,255,0,255,0,0,0,0,0,0 Latin 24 0 108 l	# l [6c ]a
w 3 0,255,0,255,0,0,0,0,0,0 Latin 40 0 109 w	# w [77 ]a
t 3 0,255,0,255,0,0,0,0,0,0 Latin 25 0 110 t	# t [74 ]a
o 3 0,255,0,255,0,0,0,0,0,0 Latin 30 0 111 o	# o [6f ]a
z 3 0,255,0,255,0,0,0,0,0,0 Latin 11 0 112 z	# z [7a ]a
g 3 0,255,0,255,0,0,0,0,0,0 Latin 35 0 113 g	# g [67 ]a
b 3 0,255,0,255,0,0,0,0,0,0 Latin 29 0 114 b	# b [62 ]a
j 3 0,255,0,255,0,0,0,0,0,0 Latin 42 0 115 j	# j [6a ]a
ç 3 0,255,0,255,0,0,0,0,0,0 Latin 31 0 116 ç	# ç [e7 ]a
x 3 0,255,0,255,0,0,0,0,0,0 Latin 27 0 117 x	# x [78 ]a
ù 3 0,255,0,255,0,0,0,0,0,0 Latin 59 0 118 ù	# ù [f9 ]a
c 3 0,255,0,255,0,0,0,0,0,0 Latin 26 0 119 c	# c [63 ]a
q 3 0,255,0,255,0,0,0,0,0,0 Latin 43 0 120 q	# q [71 ]a
ş 3 0,255,0,255,0,0,0,0,0,0 Latin 20 0 121 ş	# ş [15f ]a
þ 3 0,255,0,255,0,0,0,0,0,0 Latin 82 0 122 þ	# þ [fe ]a
v 3 0,255,0,255,0,0,0,0,0,0 Latin 14 0 123 v	# v [76 ]a
ı 3 0,255,0,255,0,0,0,0,0,0 Latin 7 0 124 ı	# ı [131 ]a
p 3 0,255,0,255,0,0,0,0,0,0 Latin 39 0 125 p	# p [70 ]a
f 3 0,255,0,255,0,0,0,0,0,0 Latin 38 0 126 f	# f [66 ]a
ö 3 0,255,0,255,0,0,0,0,0,0 Latin 54 0 127 ö	# ö [f6 ]a
è 3 0,255,0,255,0,0,0,0,0,0 Latin 45 0 128 è	# è [e8 ]a
ü 3 0,255,0,255,0,0,0,0,0,0 Latin 44 0 129 ü	# ü [fc ]a
é 3 0,255,0,255,0,0,0,0,0,0 Latin 6 0 130 é	# é [e9 ]a
š 3 0,255,0,255,0,0,0,0,0,0 Latin 66 0 131 š	# š [161 ]a
ë 3 0,255,0,255,0,0,0,0,0,0 Latin 72 0 132 ë	# ë [eb ]a
ğ 3 0,255,0,255,0,0,0,0,0,0 Latin 61 0 133 ğ	# ğ [11f ]a
ã 3 0,255,0,255,0,0,0,0,0,0 Latin 68 0 134 ã	# ã [e3 ]a
ُ 0 0,255,0,255,0,0,0,0,0,0 Inherited 135 17 135 ُ	# ُ [64f ]
َ 0 0,255,0,255,0,0,0,0,0,0 Inherited 136 17 136 َ	# َ [64e ]
ٍ 0 0,255,0,255,0,0,0,0,0,0 Inherited 137 17 137 ٍ	# ٍ [64d ]
ِ 0 0,255,0,255,0,0,0,0,0,0 Inherited 138 17 138 ِ	# ِ [650 ]
ً 0 0,255,0,255,0,0,0,0,0,0 Inherited 139 17 139 ً	# ً [64b ]
ٌ 0 0,255,0,255,0,0,0,0,0,0 Inherited 140 17 140 ٌ	# ٌ [64c ]
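
A quick way to confirm this is to tally the script column of the listing above. What follows is a minimal sketch, not a Tesseract tool: `script_counts` is a hypothetical helper, and it assumes the field order visible in the dump (unichar, properties, optional comma-separated metrics, script). Only a few lines are copied into the sample; on the full file the scripts should come out as Latin, Common and Inherited, with no Arabic entries.

```python
from collections import Counter

def script_counts(unicharset_text):
    """Return (declared_size, Counter of script names) for unicharset text."""
    lines = unicharset_text.strip().splitlines()
    declared_size = int(lines[0])  # first line holds the number of entries
    counts = Counter()
    for line in lines[1:]:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip anything that does not look like an entry
        # Entries with glyph metrics carry them as a comma-separated third
        # field, which pushes the script name to the fourth field.
        has_metrics = "," in fields[2] and len(fields) > 3
        counts[fields[3] if has_metrics else fields[2]] += 1
    return declared_size, counts

# A few lines copied from the kur_ara.lstm-unicharset listing (comments dropped).
SAMPLE = "\n".join([
    "4",
    "NULL 0 Common 0",
    "E 5 0,255,0,255,0,0,0,0,0,0 Latin 96 0 3 E",
    ". 10 0,255,0,255,0,0,0,0,0,0 Common 22 6 22 .",
    "\u064f 0 0,255,0,255,0,0,0,0,0,0 Inherited 135 17 135 \u064f",
])

declared_size, counts = script_counts(SAMPLE)
print(declared_size, dict(counts))
print("Arabic entries:", counts.get("Arabic", 0))
```

To run it on a real file, read the file as UTF-8 text and pass the contents to `script_counts`.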

@Shreeshrii
Contributor Author

This is the kur.unicharset from tessdata.

88
NULL 0 NULL 0
Joined 7 0,69,188,255,486,1218,0,30,486,1188 Latin 1 0 1 Joined	# Joined [4a 6f 69 6e 65 64 ]a
|Broken|0|1 f 0,69,186,255,892,2138,0,80,892,2058 Common 31 10 31 |Broken|0|1	# Broken
ر 1 0,63,137,224,45,297,0,22,59,244 Arabic 3 13 3 ر	# ر [631 ]x
ه 1 55,123,147,255,35,181,6,64,48,222 Arabic 4 13 4 ه	# ه [647 ]x
س 1 0,64,140,228,123,493,0,50,132,523 Arabic 5 13 5 س	# س [633 ]x
ل 1 0,96,200,255,62,328,0,50,71,332 Arabic 6 13 6 ل	# ل [644 ]x
و 1 0,68,137,238,65,290,0,27,62,256 Arabic 7 13 7 و	# و [648 ]x
د 1 49,123,163,250,43,467,0,70,59,503 Arabic 8 13 8 د	# د [62f ]x
8 8 14,100,204,255,59,218,6,29,78,249 Common 9 2 9 8	# 8 [38 ]0
: 10 12,108,157,255,18,58,11,77,52,193 Common 10 6 10 :	# : [3a ]p
ن 1 0,88,163,255,68,321,0,52,76,354 Arabic 11 13 11 ن	# ن [646 ]x
ێ 1 0,71,188,255,95,253,0,45,103,279 Arabic 12 13 12 ێ	# ێ [6ce ]x
ش 1 0,64,196,255,123,493,0,50,132,523 Arabic 13 13 13 ش	# ش [634 ]x
ی 1 0,71,148,225,95,253,0,45,103,279 Arabic 14 13 14 ی	# ی [6cc ]x
ا 1 26,117,200,255,11,181,7,82,33,222 Arabic 15 13 15 ا	# ا [627 ]x
ب 1 0,71,140,224,113,339,0,50,123,378 Arabic 16 13 16 ب	# ب [628 ]x
م 1 0,64,134,241,51,272,0,46,56,313 Arabic 17 13 17 م	# م [645 ]x
ئ 1 0,100,185,255,95,431,0,45,103,467 Arabic 18 13 18 ئ	# ئ [626 ]x
4 8 17,108,204,255,71,225,1,22,78,249 Common 19 2 19 4	# 4 [34 ]0
ت 1 58,123,170,255,113,339,2,50,123,378 Arabic 20 13 20 ت	# ت [62a ]x
ک 1 47,121,200,255,131,288,0,45,124,305 Arabic 21 13 21 ک	# ک [6a9 ]x
« 10 17,140,166,255,68,161,4,43,78,249 Common 22 10 54 «	# « [ab ]p
گ 1 47,125,208,255,131,289,0,45,132,305 Arabic 23 13 23 گ	# گ [6af ]x
ج 1 0,64,133,255,92,262,2,37,84,290 Arabic 24 13 24 ج	# ج [62c ]x
ۆ 1 0,68,181,255,68,172,0,17,62,193 Arabic 25 13 25 ۆ	# ۆ [6c6 ]x
ڵ 1 2,96,255,255,65,190,0,50,71,193 Arabic 26 13 26 ڵ	# ڵ [6b5 ]x
ز 1 0,63,167,255,45,298,0,22,59,242 Arabic 27 13 27 ز	# ز [632 ]x
ڕ 1 0,33,137,195,65,170,0,12,59,165 Arabic 28 13 28 ڕ	# ڕ [695 ]x
پ 1 0,42,142,217,113,258,2,50,123,288 Arabic 29 13 29 پ	# پ [67e ]x
3 8 14,100,204,255,59,215,4,30,78,249 Common 30 2 30 3	# 3 [33 ]0
| 0 0,88,207,255,6,64,12,82,31,193 Common 31 10 31 |	# | [7c ]
ژ 1 0,63,192,255,71,190,0,22,59,193 Arabic 32 13 32 ژ	# ژ [698 ]x
7 8 17,108,201,255,67,215,4,37,78,249 Common 33 2 33 7	# 7 [37 ]0
- 10 76,186,109,216,42,121,4,55,51,193 Common 34 3 34 -	# - [2d ]p
ة 1 55,123,190,255,40,181,0,60,48,222 Arabic 35 13 35 ة	# ة [629 ]x
ك 1 49,123,203,255,91,451,0,50,103,483 Arabic 36 13 36 ك	# ك [643 ]x
خ 1 0,66,172,255,92,262,2,37,84,290 Arabic 37 13 37 خ	# خ [62e ]x
. 10 12,108,64,140,18,52,9,77,52,193 Common 38 6 38 .	# . [2e ]p
چ 1 0,29,133,213,92,192,4,37,84,213 Arabic 39 13 39 چ	# چ [686 ]x
! 10 12,108,193,255,19,61,15,82,49,193 Common 40 10 40 !	# ! [21 ]p
ح 1 0,64,133,255,92,262,2,37,84,290 Arabic 41 13 41 ح	# ح [62d ]x
ص 1 0,64,143,249,131,619,0,50,132,654 Arabic 42 13 42 ص	# ص [635 ]x
ڤ 1 44,121,224,255,123,339,0,47,123,352 Arabic 43 13 43 ڤ	# ڤ [6a4 ]x
/ 10 12,102,224,255,43,166,0,29,54,193 Common 44 6 44 /	# / [2f ]p
2 8 17,108,204,255,70,215,3,24,78,249 Common 45 2 45 2	# 2 [32 ]0
0 8 14,100,204,255,65,212,6,29,78,249 Common 46 2 46 0	# 0 [30 ]0
1 8 17,108,204,255,42,136,18,43,78,249 Common 47 2 47 1	# 1 [31 ]0
> 0 47,109,188,255,49,222,3,33,78,262 Common 48 10 74 >	# > [3e ]
ە 1 55,121,147,202,35,134,6,64,48,193 Arabic 49 13 49 ە	# ە [6d5 ]x
غ 1 0,64,196,255,98,239,2,37,81,276 Arabic 50 13 50 غ	# غ [63a ]x
ي 1 0,56,148,255,95,431,0,45,103,467 Arabic 51 13 51 ي	# ي [64a ]x
ط 1 40,123,200,255,110,574,0,20,113,610 Arabic 52 13 52 ط	# ط [637 ]x
اَ 1 26,252,200,255,26,497,7,82,33,415 Arabic 15 13 15 اَ	# اَ [627 64e ]x
» 10 17,140,166,255,70,158,4,43,78,249 Common 54 10 22 »	# » [bb ]p
ع 1 0,64,148,255,98,239,2,37,81,276 Arabic 55 13 55 ع	# ع [639 ]x
© 0 12,102,203,255,109,336,0,27,118,330 Common 56 10 56 ©	# © [a9 ]
ف 1 44,125,202,255,113,339,0,47,123,378 Arabic 57 13 57 ف	# ف [641 ]x
، 10 30,121,105,221,20,67,5,72,43,193 Common 58 6 58 ،	# ، [60c ]p
آ 1 26,117,230,255,36,161,0,58,33,198 Arabic 59 13 59 آ	# آ [622 ]x
5 8 14,100,201,255,62,218,6,37,78,249 Common 60 2 60 5	# 5 [35 ]0
= 0 86,150,160,244,90,218,3,33,99,262 Common 61 10 61 =	# = [3d ]
6 8 14,100,204,255,65,215,6,29,78,249 Common 62 2 62 6	# 6 [36 ]0
* 10 79,248,163,255,68,143,0,41,78,193 Common 63 10 63 *	# * [2a ]p
9 8 14,100,204,255,65,212,6,29,78,249 Common 64 2 64 9	# 9 [39 ]0
ق 1 0,79,179,255,84,310,0,52,88,345 Arabic 65 13 65 ق	# ق [642 ]x
رِ 1 0,77,0,224,59,1289,0,22,59,1267 Arabic 3 13 3 رِ	# رِ [631 650 ]x
ؤ 1 0,68,190,255,70,290,0,27,62,266 Arabic 67 13 67 ؤ	# ؤ [624 ]x
ى 1 0,100,148,255,95,431,0,45,103,467 Arabic 68 13 68 ى	# ى [649 ]x
؛ 10 60,119,140,255,20,67,2,72,43,193 Common 69 13 69 ؛	# ؛ [61b ]p
ــ 1 64,121,80,142,52,522,0,0,52,522 Common 80 13 80 ــ	# ــ [640 640 ]x
, 10 0,72,69,140,21,62,8,65,39,193 Common 71 6 71 ,	# , [2c ]p
# 10 17,102,204,255,74,242,0,23,78,249 Common 72 4 72 #	# # [23 ]p
) 10 0,85,197,255,31,108,0,55,63,193 Common 73 10 78 )	# ) [29 ]p
< 0 47,109,188,255,49,218,0,40,78,262 Common 74 10 48 <	# < [3c ]
؟ 10 0,123,188,255,35,181,5,48,69,222 Common 75 13 75 ؟	# ؟ [61f ]p
] 10 0,73,207,255,31,112,0,53,61,193 Common 76 10 77 ]	# ] [5d ]p
[ 10 0,73,209,255,31,112,13,72,61,193 Common 77 10 76 [	# [ [5b ]p
( 10 0,85,197,255,31,108,8,74,63,193 Common 78 10 73 (	# ( [28 ]p
" 10 139,254,204,255,42,128,9,55,64,193 Common 79 10 79 "	# " [22 ]p
ـ 1 64,121,80,142,45,264,0,0,26,261 Common 80 13 80 ـ	# ـ [640 ]x
ــا 1 26,121,80,255,85,744,0,0,85,744 Common 80 13 80 ــا	# ــا [640 640 627 ]x
' 10 139,254,204,255,15,45,6,82,28,193 Common 82 10 82 '	# ' [27 ]p
› 10 17,110,166,237,35,81,6,65,51,193 Common 83 10 83 ›	# › [203a ]p
لا 1 0,117,200,255,104,604,0,50,104,554 Arabic 6 13 6 لا	# لا [644 627 ]x
لاَ 1 0,252,200,255,104,797,0,50,104,747 Arabic 6 13 6 لاَ	# لاَ [644 627 64e ]x
ڵا 1 2,117,200,255,104,465,0,50,104,415 Arabic 26 13 26 ڵا	# ڵا [6b5 627 ]x
ير 1 0,63,137,255,162,756,0,45,162,711 Arabic 51 13 51 ير	# ير [64a 631 ]x
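
The second field of each entry is a hexadecimal bit mask of character properties. Here is a minimal sketch of decoding it, assuming the bit order described in the Tesseract unicharset documentation (alpha, lower, upper, digit, punctuation, starting from the least significant bit). `decode_properties` is a hypothetical helper name. It shows why the Arabic letters above carry `1` (alphabetic, no case) while punctuation carries `10`.

```python
# Bit names from least to most significant bit (assumed order, see the note above).
PROPERTY_BITS = ("alpha", "lower", "upper", "digit", "punctuation")

def decode_properties(hex_mask):
    """Turn the hexadecimal properties field into a list of property names."""
    value = int(hex_mask, 16)
    return [name for bit, name in enumerate(PROPERTY_BITS) if value & (1 << bit)]

# Values taken from the kur.unicharset listing: an Arabic letter, a digit,
# a punctuation mark, and the special |Broken|0|1 entry.
for unichar, mask in [("\u0631", "1"), ("8", "8"), (":", "10"), ("|Broken|0|1", "f")]:
    print(unichar, mask, decode_properties(mask))
```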

Shreeshrii changed the title from "kur_ara does not have Arabic unicharset." to "Add kur with Arabic unicharset (similar to old tessdata)" on Sep 17, 2018
@Shreeshrii
Contributor Author

Kurdish in Latin script is supported as kmr.

Kurdish in Arabic script (which was kur in tessdata) is missing in tessdata_best/tessdata_fast.
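
To check a local copy of a tessdata repository for the same gap, something like the following can be used. It is only a sketch: `missing_models` is a hypothetical helper, the directory path is a placeholder, and it assumes models are stored as `<code>.traineddata` files directly in that directory.

```python
from pathlib import Path

def missing_models(tessdata_dir, codes=("kur", "kur_ara", "kmr")):
    """Return the language codes that have no <code>.traineddata file."""
    root = Path(tessdata_dir)
    return [code for code in codes if not (root / f"{code}.traineddata").is_file()]

if __name__ == "__main__":
    # Placeholder path; point it at a local tessdata_best or tessdata_fast checkout.
    print(missing_models("./tessdata_best"))
```

Note that a file's presence says nothing about which script it covers; that still has to be read from the unicharset inside the model.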

MerlijnWajer added a commit to MerlijnWajer/tesseract that referenced this issue Dec 1, 2020
"kur" no longer exists, might be named "kur_ara" (the old "kur_ara" is
now "kmr", which is actually Latin) now, but "kur" is not present in
tessdata_fast nor in tessdata_best. [1] [2]

"tgl" (Tagalog) is now named "fil" (Filipino) [3]

[1] tesseract-ocr/langdata#124
[2] tesseract-ocr/tessdata_best#23
[3] tesseract-ocr/langdata#84

"kur" no longer exists, might be named "kur_ara" now, but it is not
present in tessdata_fast nor in tessdata_best. "kmr" is the Latin
version (Kurmanji)

"tgl" (Tagalog) is now named "fil" (Filipino)