
Ideas for improvement of the Input Method Editor and the
GUK Input Methods themselves.


This file describes cases where the IM Editor or GUK IM
could be improved in the future, as well as a few cases
where actual bugs exist. Most of the problems, however, only
come from limited flexibility of either the editor or GUK, so I
hope you enjoy the current versions of the editor and GUK, too.


The editor just remembers "option" lines in GIM files and
writes them back when exporting to GIM, but does not do
anything with them. In addition, keycap values that are
equal to send values are treated like nonexistent keycap
values, and special keycap values are rejected. See below.


The editor interprets choice lists as several output values
connected to a single input value, so each choice list
results in several single assignments. The GUK IM cannot yet
process several results for a single final state (it should
store results in Vectors rather than single objects and show
a choice list when a Vector with more than one element is
reached), but it can process sequences with early commitment
(that is, you can assign "fab" to one glyph and "fabo" to
another, but cannot assign "fab" to two different glyphs).
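The suggestion above (store several results per final state and
show a choice list only when there is more than one) could be
sketched roughly as follows. This is a minimal sketch, not GUK
code: the class ChoiceTable and its methods are hypothetical, and
it uses a List in modern Java where the text says Vector.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not GUK code: maps a complete key sequence to
// every glyph assigned to it, instead of to a single glyph.
class ChoiceTable {
    private final Map<String, List<String>> results = new HashMap<>();

    // Add one more glyph for a key sequence; earlier glyphs are kept.
    void assign(String keys, String glyph) {
        results.computeIfAbsent(keys, k -> new ArrayList<>()).add(glyph);
    }

    // All glyphs for a sequence, in assignment order (empty if none).
    List<String> lookup(String keys) {
        return results.getOrDefault(keys, Collections.<String>emptyList());
    }

    // A choice list is only needed when more than one glyph is stored.
    boolean needsChoiceList(String keys) {
        return lookup(keys).size() > 1;
    }
}
```

With such a table, "fab" could be assigned to two glyphs, and the
IM would only have to open a choice list when needsChoiceList
reports true for the sequence that was typed.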


Yudit allows (.kmap) keymaps to SUM UP the glyph number
during composition. The editor cannot parse those files in
any useful manner, and other input methods (like GUK) cannot
process the semantics either. Used, for example, for Hangul:

Hangul2.kmap: "consonant0+vowel+consonant3" (19, 21, 1+27 entries)
Hangul3.kmap: "b+m+e" (19, 29, 1+27 entries)
Unicode.kmap: "unicode+digit1+digit2+digit3+digit4"
(Unicode.kmap allows you to type u1234 to get glyph \u1234)

Mail me if you want me to provide a Perl script that "multiplies
out" the Hangul keymaps into 11k/15k single normal entries:
  eric@coli.uni-sb.de (Eric Auer).
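
The counts 19, 21 and 1+27 match the standard Unicode arithmetic
for precomposed Hangul syllables, where the code point is
0xAC00 + (lead * 21 + vowel) * 28 + tail. Assuming Hangul2.kmap
sums its parts in the same way (an assumption, not checked
against Yudit), "multiplying out" can be sketched as below. The
class HangulMultiplier is hypothetical, and the key strings for
each part are left out because they live in the kmap files.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: enumerates every precomposed Hangul syllable
// using the standard Unicode composition arithmetic.
class HangulMultiplier {
    static final int BASE = 0xAC00;   // first precomposed syllable
    static final int LEADS = 19;      // leading consonants
    static final int VOWELS = 21;     // medial vowels
    static final int TAILS = 28;      // 27 trailing consonants + "none"

    // Code point for the given zero-based part indexes.
    static int compose(int lead, int vowel, int tail) {
        return BASE + (lead * VOWELS + vowel) * TAILS + tail;
    }

    // One code point per (lead, vowel, tail) combination.
    static List<Integer> multiplyOut() {
        List<Integer> all = new ArrayList<>();
        for (int l = 0; l < LEADS; l++) {
            for (int v = 0; v < VOWELS; v++) {
                for (int t = 0; t < TAILS; t++) {
                    all.add(compose(l, v, t));
                }
            }
        }
        return all;
    }
}
```

The loop yields 19 * 21 * 28 = 11172 entries, which agrees with
the "11k" figure for the two-set keymap.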


GUK IM should be flexible with the layout of the virtual
keyboard: I have moved the "what happens to CHAR when shift
changes" logic into ShiftedKeys.java (supports UK and US;
German and Dutch are planned and partially implicitly supported
because Java can upcase/downcase umlauts). This also allowed
cleaner code in KeyboardMap.java and GateIM.java, and at least
UK and US are now also supported in the looks of the keyboard
map. The UK / US flag is controlled by the ShiftedKeys.KMAP
value. German shift / unshift is already there for chars, but
not for key codes, and is not used at all yet in KeyboardMap.
Layout differences between DE and UK / US are explained in
ShiftedKeys.
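
To illustrate the idea of a per-layout shift table, here is a
minimal sketch. It does not reproduce ShiftedKeys.java: the class
ShiftSketch, the Layout enum and the method shifted are
hypothetical, and only three keys that differ between US and UK
are listed. Letters fall back to Character.toUpperCase, which
also covers umlauts.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the real ShiftedKeys.java.
class ShiftSketch {
    enum Layout { US, UK }

    // Only a few keys whose shifted character depends on the layout.
    private static final Map<Character, Character> US_TABLE = new HashMap<>();
    private static final Map<Character, Character> UK_TABLE = new HashMap<>();
    static {
        US_TABLE.put('2', '@');
        US_TABLE.put('3', '#');
        US_TABLE.put('\'', '"');
        UK_TABLE.put('2', '"');
        UK_TABLE.put('3', '\u00A3');   // pound sign
        UK_TABLE.put('\'', '@');
    }

    // Character produced when shift is held while typing c.
    static char shifted(char c, Layout layout) {
        Map<Character, Character> table =
            (layout == Layout.UK) ? UK_TABLE : US_TABLE;
        Character hit = table.get(c);
        if (hit != null) {
            return hit;
        }
        // Letters (including umlauts) are handled by Java itself.
        return Character.toUpperCase(c);
    }
}
```

A German or Dutch layout would be one more table plus one more
enum constant in this design.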


The clipboard tools and the copy to clipboard functions of the
glyph preview and glyph palette crash in Java 1.3.1 for Unix,
and the Java global clipboard is not connected to the global X
clipboard in Java 1.3.1 / Linux X11 4.2.0, which affects the
Input Method Editor. Other versions hopefully work better.


Deactivating the GUK IM is buggy in Java 1.3.1: It will
still read the key events, but Java will ignore the IM output
and behave as if there were no IM. If you try to activate
the virtual keyboard before first activating the GUK IM, the
virtual keyboard will not be visible for the rest of the session,
and in some other cases it will stay iconified at inappropriate
times. Note that the menu item does not follow the virtual
keyboard, but the other way round: Checking the checkbox will
request the keyboard and unchecking the checkbox will ask it
to hide.


The editor and GUK IM only use the C- and A- (alias to M- or
Alt-) prefixes; other modifiers are ignored (e.g. Meta, AltGraph)
or handled implicitly (e.g. Shift). KeyEvents and KeyStrokes
differ in whether they allow AltGraph. The editor parses G-
(AltGraph), but it is often lost in further processing. The XGIM
format is supposed to allow more flexibility. See
AssignObject.java, LocaleHandler.java and other GUK IM files to
add better XGIM support everywhere (none yet in the GUK IM!).
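
Splitting the modifier prefixes off a key string could look
roughly like the sketch below. It is not the editor's parser:
the class PrefixSketch, the Mod enum and the Parsed holder are
hypothetical, and the rule that a prefix only counts when a key
follows it is an assumption made to keep a bare "C-" literal.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch, not the editor's real parser.
class PrefixSketch {
    enum Mod { CTRL, ALT, ALT_GRAPH }

    // Modifiers found at the start of a key string, plus what is left.
    static final class Parsed {
        final Set<Mod> mods;
        final String rest;
        Parsed(Set<Mod> mods, String rest) {
            this.mods = mods;
            this.rest = rest;
        }
    }

    // M- and Alt- are treated as aliases of A-, as described above.
    private static final String[] PREFIXES = { "C-", "A-", "M-", "Alt-", "G-" };
    private static final Mod[] MODS =
        { Mod.CTRL, Mod.ALT, Mod.ALT, Mod.ALT, Mod.ALT_GRAPH };

    static Parsed parse(String keys) {
        Set<Mod> mods = EnumSet.noneOf(Mod.class);
        String rest = keys;
        boolean found = true;
        while (found) {
            found = false;
            for (int i = 0; i < PREFIXES.length; i++) {
                String p = PREFIXES[i];
                // Assumption: a prefix needs a key after it.
                if (rest.startsWith(p) && rest.length() > p.length()) {
                    mods.add(MODS[i]);
                    rest = rest.substring(p.length());
                    found = true;
                    break;
                }
            }
        }
        return new Parsed(mods, rest);
    }
}
```

Keeping ALT_GRAPH as its own value in the result is the point of
the sketch: the G- prefix stays visible to later processing
instead of being dropped.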

A helper for creating XGIM escapes might be useful, but it is
questionable whether you want to assign Unicode characters to
things like "left button + HOME" or "altgraph + F1" at all.
The implementation could be a form with checkboxes and comboboxes,
or something that captures all key events for a while.


The editor does not allow editing the digits option and does
not handle the digit toggle either. Instead, it splits digit
lines into two separate assignments on import. It will not
re-unite them on export. GUK IM (current version) just takes one
half of a digit alternative line and ignores the other half.
Neither the editor nor GUK IM assigns a toggle hotkey for anything.
Example: This would assign the Ctrl-d hotkey to toggle between
ASCII and national digit glyphs and set the toggle to national
on startup. Shown: Header, example digit.
  option digits "C-d" national
  bind "0"        digit 0x0030 0x06F0
The editor will import this as if it were:
  bind "C-d0"     send 0x06F0
  bind "0"        digit 0x0030
The C-d prefix is hardcoded in FileCommands.java, but all GIM
files that I have happen to use "C-d" as the toggle for the
option digits command anyway.
Current GUK IM behaviour is to use ONLY the national glyphs.
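
The import behaviour described above can be sketched as a small
line rewriter. This is not the code in FileCommands.java: the
class DigitSplitSketch and its regular expression are
hypothetical, and the "C-d" prefix is hardcoded here only to
mirror the description.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not the code in FileCommands.java.
class DigitSplitSketch {
    // Matches: bind "KEY" digit ASCII NATIONAL
    private static final Pattern DIGIT_LINE = Pattern.compile(
        "bind\\s+\"([^\"]+)\"\\s+digit\\s+(0x[0-9A-Fa-f]+)\\s+(0x[0-9A-Fa-f]+)");

    // Returns the two replacement lines, or null if the line is
    // not a digit line with two values.
    static String[] split(String line) {
        Matcher m = DIGIT_LINE.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        String key = m.group(1);
        String ascii = m.group(2);
        String national = m.group(3);
        return new String[] {
            "bind \"C-d" + key + "\" send " + national,
            "bind \"" + key + "\" digit " + ascii
        };
    }
}
```

Re-uniting the two halves on export would need the reverse
lookup: find a "C-d" + key send line and a matching digit line
for the same key.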

Used by 2901-94.gim (Persian), arabat.gim (Arabic),
arabmlt.gim (Arabic), arabwin.gim (Arabic),
bengali.gim (Bengali, without "national" flag),
crlpers.gim (Persian), clrurdu.gim (Urdu, which uses Alt-
rather than A-, probably an error), hindiins.gim (Hindi,
without "national" flag) and hindivoa.gim (Hindi, as hindiins).


Neither the editor nor (it seems) GUK IM uses the flag:
  option backspace
which seems to be intended for syllabic languages / scripts.
The idea seems to be that backspace would either go back to
the previous FSM state or remove a whole glyph.
Used by hangul.gim (Korean), viet.gim (Vietnamese)
and vntelex.gim (Vietnamese).


Neither the editor nor (it seems) GUK IM uses resetorsend
(the editor translates it into a simple send):
  bind "\ " resetorsend 0x0020
which should act like send, but only if the FSM is already in
its initial state. Otherwise, it would only reset the FSM to
its initial state. It is usually used for the space key: Either
finish the ongoing composition, or send a space "keypress" if
there is no ongoing composition. GUK IM normally passes
through characters that cannot be interpreted as part of a
sequence anyway. When the editor translates resetorsend into
send, it adds a comment to the file when you re-export it,
so that you can fix the affected line if you want.
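
The intended resetorsend semantics can be sketched with a tiny
state holder. This is not GUK code: the class ResetOrSendSketch
is hypothetical, and it simply discards the pending keys on
reset, since the notes above do not say whether an ongoing
composition should be committed or dropped.

```java
// Hypothetical sketch of the resetorsend semantics, not GUK code.
class ResetOrSendSketch {
    // Keys typed so far in the ongoing composition.
    private final StringBuilder pending = new StringBuilder();

    // A key that is part of a sequence: the FSM leaves its initial state.
    void compose(char key) {
        pending.append(key);
    }

    boolean inInitialState() {
        return pending.length() == 0;
    }

    // resetorsend: send the value only from the initial state;
    // otherwise reset the FSM and send nothing.
    String resetOrSend(String value) {
        if (inInitialState()) {
            return value;
        }
        pending.setLength(0);   // assumption: pending keys are dropped
        return "";
    }
}
```

For the space key this means: the first space after some typed
keys only ends the composition, and a second space is sent as a
real space.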

GIM files that use resetorsend (all with "\ ", 0x0020 context):
hangul.gim (Korean), tcode.gim (Japanese), viet.gim (Vietnamese),
vietalt.gim (Vietnamese: only case where the line is not the
first bind line, by the way) and vntelex.gim (Vietnamese).


Neither the editor nor (it seems) GUK IM uses this option:
  option hanfont japanese
Only used in tcode.gim (Japanese). We use full Unicode fonts
instead, like Cyberbit (free, available as base, CJK, and full)
or Arial Unicode MS. Cyberbit can be included as a resource
and Arial can be loaded from the current directory (arialuni.ttf),
but you can change the details in FontLoader.java of the
Input Method Editor. If you trigger FontLoader (e.g. by creating
an object of that type) before the GUK IM, the GUK IM will be
able to use the font just as if it were a system font.
Otherwise, you have to add FontLoader usage to the GUK IM, or add
some setKeyboardFont to GateIM.java and use it in EditIM.java,
which is probably a much cleaner solution.


GUK IM should be able to load new GIM files dynamically (for now,
you need to mention them in im.list and restart GUK IM), and the
editor should then make use of that feature. For this, GUK IM
needs to have a dynamic locale list (the Java SPI IM framework
allows this). Some user interface in the editor for im.list and
the name header would be nice. For example, in
  inputmethod "Vietnamese" "VIQR Implicit"
the first argument is the language and the second is the variant.
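
Reading the name header for such a user interface could be
sketched like this. The class HeaderSketch and its regular
expression are hypothetical and assume that both arguments are
always double-quoted, as in the example above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: extracts language and variant from the
// inputmethod header line of a GIM file.
class HeaderSketch {
    private static final Pattern HEADER = Pattern.compile(
        "\\s*inputmethod\\s+\"([^\"]*)\"\\s+\"([^\"]*)\"\\s*");

    // Returns {language, variant}, or null if the line is not a
    // name header.
    static String[] parse(String line) {
        Matcher m = HEADER.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }
}
```

The two strings are what an editor form would show and what a
dynamic locale list in GUK IM would have to register.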


The editor does not allow the keycap to differ from
the send value. Most of the time, there is no keycap or the
keycap value equals the send value anyway.

Hebrew is one of the problematic GIM files here:
It has lines of the form send 0x????0x200C keycap 0x????:
bind "."        send 0x05E60x200C       keycap 0x05E5
bind ";"        send 0x05E40x200C       keycap 0x05E3
bind "i"        send 0x05E00x200C       keycap 0x05DF
bind "l"        send 0x05DB0x200C       keycap 0x05DA
bind "o"        send 0x05DE0x200C       keycap 0x05DD

Hindi (hindiins.gim) also has special lines:
bind "#"        send 0x094D0x0930       keycap 0x25CC0x094D0x0930
bind "="        send 0x200D0x090B       keycap 0x25CC0x200D0x0943
bind "@"        send 0x200D0x090D       keycap 0x25CC0x200D0x090D
bind "\\"       send 0x200D0x0911       keycap 0x25CC0x200D0x0911
bind "_"        send 0x0903     keycap 0x25CC0x0903
bind "e"        send 0x200D0x0906       keycap 0x25CC0x200D0x0906
bind "f"        send 0x200D0x0907       keycap 0x25CC0x200D0x0907
bind "g"        send 0x200D0x0909       keycap 0x25CC0x200D0x0909
bind "w"        send 0x200D0x0910       keycap 0x25CC0x200D0x0910
bind "z"        send 0x200D0x090E       keycap 0x25CC0x200D0x090E

Greek (greekwin.gim) only has a single special line:
bind "w"        send 0x03C30x200C       keycap 0x03C2

Special lines in Bengali (bengali.gim) (lines wrapped for readability):
bind "="        send 0x09CD0x098B       keycap 0x25CC0x200D0x09CD0x098B
bind "a"        send 0x09CD0x0993       ...
...  keycap 0x200D0x09CD0x09C70x25CC0x200D0x09CD0x09BE
bind "e"        send 0x09CD0x0986       keycap 0x25CC0x200D0x09CD0x0986
bind "f"        send 0x09CD0x0987       keycap 0x200D0x09CD0x09870x25CC
bind "g"        send 0x09CD0x0989       keycap 0x25CC0x200D0x09CD0x0989
bind "q"        send 0x09CD0x0994       ...
... keycap 0x200D0x09CD0x09C70x25CC0x200D0x09CD0x09D7
bind "r"        send 0x09CD0x0988       keycap 0x25CC0x200D0x09CD0x0988
bind "s"        send 0x09CD0x098F       keycap 0x200D0x09CD0x098F0x25CC
bind "t"        send 0x09CD0x098A       keycap 0x25CC0x200D0x09CD0x098A
bind "w"        send 0x09CD0x0990       keycap 0x200D0x09CD0x09900x25CC

Special lines in Hindi VOA (hindivoa.gim):
bind "-"        send 0x0903     keycap 0x25CC0x0903
# bind "@"       send 0x0945     keycap 0x25CC0x200D0x090D
bind "@"        send 0x200D0x090D       keycap 0x25CC0x200D0x090D
# bind "F"       send 0x0943     keycap 0x25CC0x200D0x090B
bind "F"        send 0x200D0x090B       keycap 0x25CC0x200D0x090B
# bind "I"       send 0x094B     keycap 0x25CC0x200D0x0913
bind "I"        send 0x200D0x0913       keycap 0x25CC0x200D0x0913
# bind "O"       send 0x094C     keycap 0x25CC0x200D0x0914
bind "O"        send 0x0914     keycap 0x25CC0x200D0x0914
# bind "U"       send 0x0942     keycap 0x25CC0x200D0x090A
bind "U"        send 0x200D0x090A       keycap 0x25CC0x200D0x090A
# bind "["       send 0x0947     keycap 0x25CC0x200D0x090F
bind "["        send 0x200D0x090F       keycap 0x25CC0x200D0x090F
# bind "\\"      send 0x093E     keycap 0x25CC0x200D0x0906
bind "\\"       send 0x200D0x0906       keycap 0x25CC0x200D0x0906
# bind "{"       send 0x0948     keycap 0x25CC0x200D0x0910
bind "{"        send 0x200D0x0910       keycap 0x25CC0x200D0x0910
# bind "i"       send 0x093F     keycap 0x25CC0x200D0x0907
bind "i"        send 0x200D0x0907       keycap 0x25CC0x200D0x0907
# bind "o"       send 0x0940     keycap 0x25CC0x200D0x0908
bind "o"        send 0x200D0x0908       keycap 0x25CC0x200D0x0908
# bind "u"       send 0x0941     keycap 0x25CC0x200D0x0909
bind "u"        send 0x200D0x0909       keycap 0x25CC0x200D0x0909
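
Lines like the ones listed above can be found by decoding the
concatenated 0x groups and comparing keycap against send. A
minimal sketch, assuming that every value is a plain run of
0xHHHH groups; the class KeycapSketch and its methods are
hypothetical and not part of the editor:

```java
// Hypothetical sketch: detects keycap values that differ from the
// send value of the same bind line.
class KeycapSketch {
    // Decodes concatenated 0xHHHH groups, e.g. "0x05E60x200C",
    // into a String.
    static String decode(String value) {
        StringBuilder out = new StringBuilder();
        for (String part : value.split("0x")) {
            if (part.isEmpty()) {
                continue;   // nothing before the first 0x
            }
            out.appendCodePoint(Integer.parseInt(part, 16));
        }
        return out.toString();
    }

    // True when a keycap is present and differs from the send value.
    static boolean hasSpecialKeycap(String send, String keycap) {
        return keycap != null && !decode(keycap).equals(decode(send));
    }
}
```

For the Hebrew line bound to ".", hasSpecialKeycap("0x05E60x200C",
"0x05E5") is true, so an editor that kept keycap separate from
send would have to store both values for that key.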

