Schneider Euro PC #1748

Open
Vutshi opened this issue Oct 7, 2023 · 25 comments
Labels
bug Defect in the product

Comments

@Vutshi

Vutshi commented Oct 7, 2023

Description

  • The Euro PC is an IBM XT compatible computer based on the 8088 CPU. It features some nice enhancements over the original.

ELKS needs some adjustments to work better on this machine.

Configuration

  • Keyboard (German) is somehow different from the German layout in ELKS direct console
  • RTC is M3002
  • Floppy Disk Controller: WD37C65/A (seems to be compatible with NEC765)
  • I will add more hardware related information later
@Vutshi Vutshi added the bug Defect in the product label Oct 7, 2023
@Vutshi
Author

Vutshi commented Oct 7, 2023

We begin with the ELKS direct console German layout. Although the German table of characters looks reasonable, ELKS doesn’t show peculiar characters like üöä under root and it does crazy things under toor:
Schneider_direct_console

For comparison here is the actual keyboard:
Schneider kbd

Where does ELKS take its font from? Could it be missing somehow?

BTW, the BIOS console works okay-ish.

@ghaerr
Owner

ghaerr commented Oct 7, 2023

Hello @Vutshi,

Where does ELKS take its font from? Could it be missing somehow?

No. When the console is in text mode, the CGA/EGA/VGA uses the video memory contents as byte indices into a character ROM contained on the video card. (Every other byte is actually a screen attribute, so the memory is displayed per memory word).
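
The character/attribute pairing described above can be sketched in C. This is only an illustration: an ordinary array stands in for video RAM (on real CGA hardware the colour text buffer sits at segment 0xB800), and put_cell and TEXT_COLS are hypothetical names, not ELKS identifiers.

```c
#include <stdint.h>

#define TEXT_COLS 80  /* assumed 80-column text mode */

/* Store one character cell: the code-page byte goes at the even offset,
 * the attribute byte at the odd offset that follows it. */
void put_cell(uint8_t *vram, int row, int col, uint8_t ch, uint8_t attr)
{
    int offset = (row * TEXT_COLS + col) * 2;

    vram[offset]     = ch;    /* index into the character ROM */
    vram[offset + 1] = attr;  /* colour / intensity attribute */
}
```

Whatever byte is stored in the even position is what the card looks up in its ROM, so the glyph shown depends only on the card's code page, not on anything ELKS does.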

There are two things going on here, which likely need to be separated in order to fully understand the problem. There is keyboard input, which produces a byte sequence, and then there is console output, which works as described above, essentially writing the received unconverted byte directly into video RAM and the hardware displaying the character glyph.

The keyboard input works through IRQ 1, which the scancode keyboard driver then looks at with the combinations of shift/ctrl/alt to produce a keyboard input byte and shift status. Thus, the keyboard scancode is converted to an input byte code, which may or may not match the character ROM on the video card. In your use, I am assuming you have configured for German scancode keyboard (CONFIG_KEYBOARD_SCANCODE and CONFIG_KEYMAP_DE, please check). Here are the German tables. These will need to be somewhat painfully checked in order to change the keyboard input (and yes, they're in octal, who knows why).

The issue with the two shells displaying different characters is probably complicated, but I would guess sash may have a bug where a keyboard byte is sign-extended, which IIRC displayed a white box like you're seeing. This can be fixed later. A better way to debug the output/display side of this problem would be to create a binary file (or shell script with quoted characters, possibly harder), then use cat or sh to display the output directly to the screen without using the keyboard. (I may have such a binary file, I'll have to look).

When using the BIOS console rather than direct console, it is entirely possible that the BIOS is translating characters, so we should probably leave that out of the equation for the moment.

I now ask: does ELKS configured for German work well on other PCs? It would be an interesting comparison.

Thank you!

@ghaerr
Owner

ghaerr commented Oct 7, 2023

The mapping of byte codes for the keyboard as well as the console to the displayed glyph is referred to in total as a code page. The code page for most US systems is CP 437, while some Western European systems use CP 850.

These references will come in handy, even for creating a binary file with a few of the different codes in it, to give us an idea of what the Euro PC is using for its output code page. If documentation can be found regarding the code page, that would be great; otherwise the tables will be very handy for resolving many of the problems without having to test each code.

@Vutshi
Author

Vutshi commented Oct 8, 2023

Hi @ghaerr

I now ask: does ELKS configured for German work well on other PCs? It would be an interesting comparison.

In QEMU the German layout works very similar to the Euro PC:
QEMU_direct_console_DE

Here I typed every key in the number and letter rows, with the rows separated by a space.

@Vutshi
Author

Vutshi commented Oct 8, 2023

I think I see what is going wrong. In the ELKS German table there are characters like 'ü' which during compilation are converted according to UTF (or Windows?), in this case into 0xfc. ELKS interprets them according to CP 437, so 0xfc turns into ⁿ, as can be seen in the screenshots above.
I don’t understand how the German encoding table was supposed to work in the first place. But I know how to fix it now ;)
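
To make the mismatch concrete, here is a minimal sketch that maps the ISO 8859-1 (Latin 1) bytes for the German special characters to the positions where CP 437 keeps the same glyphs. The function name latin1_to_cp437_de is hypothetical and nothing like it exists in ELKS; the byte values come from the two published code page tables.

```c
#include <stdint.h>

/* Map the ISO 8859-1 bytes for German special characters to their
 * CP 437 positions. Any other byte is returned unchanged. */
uint8_t latin1_to_cp437_de(uint8_t c)
{
    switch (c) {
    case 0xE4: return 0x84;  /* ä */
    case 0xF6: return 0x94;  /* ö */
    case 0xFC: return 0x81;  /* ü */
    case 0xC4: return 0x8E;  /* Ä */
    case 0xD6: return 0x99;  /* Ö */
    case 0xDC: return 0x9A;  /* Ü */
    case 0xDF: return 0xE1;  /* ß */
    case 0xA7: return 0x15;  /* § */
    case 0xB5: return 0xE6;  /* µ */
    default:   return c;     /* ASCII and everything else */
    }
}
```

A table entry compiled as 0xFC therefore needs to become 0x81 before a CP 437 character ROM will draw ü for it.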

@Vutshi
Author

Vutshi commented Oct 8, 2023

@ghaerr

It seems like elks/arch/i86/drivers/char/KeyMaps/keys-de.h is in Windows (or Western (ISO Latin 1)) encoding. I am going to redo all non standard symbols as hex codes for CP 437 and save the file in UTF encoding. Is it ok? What is the standard encoding for the source files in ELKS?

@ghaerr
Owner

ghaerr commented Oct 8, 2023

It seems like elks/arch/i86/drivers/char/KeyMaps/keys-de.h is in Windows (or Western (ISO Latin 1)) encoding.

Good catch, I hadn't noticed but now see an encoded ö in the file. That's certainly a problem.

I am going to redo all non standard symbols as hex codes for CP 437 and save the file in UTF encoding. Is it ok?

Yes, that would be great to use hex rather than octal. As far as file encoding, straight ASCII or UTF-8. The tables themselves should only use hex, with UTF-8 encodings only used for comments within /* */.

The use of hex within the mapping tables will greatly help comparing the tables with the modern code page tables. I would guess these tables have been around for 20+ years and were created before UTF-8 was standard.
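
As an illustration of the hex-plus-comment convention, here is a hypothetical four-key fragment. The struct layout and the names keymap_entry, de_fragment and lookup_key are assumptions made for this sketch and do not reflect how keys-de.h is actually organised. The scancodes are the standard XT set 1 values for those key positions, and the byte values are CP 437 codes.

```c
#include <stdint.h>

/* Hypothetical table entry: scancode -> CP 437 byte. */
struct keymap_entry {
    uint8_t scancode;
    uint8_t normal;   /* byte produced without Shift */
    uint8_t shifted;  /* byte produced with Shift */
};

/* Values in hex only; the UTF-8 glyphs appear solely in comments. */
static const struct keymap_entry de_fragment[] = {
    { 0x0C, 0xE1, 0x3F },  /* ß ? */
    { 0x1A, 0x81, 0x9A },  /* ü Ü */
    { 0x27, 0x94, 0x99 },  /* ö Ö */
    { 0x28, 0x84, 0x8E },  /* ä Ä */
};

/* Return the byte for a scancode, or 0 if it is not in the fragment. */
uint8_t lookup_key(uint8_t scancode, int shift)
{
    unsigned i;

    for (i = 0; i < sizeof de_fragment / sizeof de_fragment[0]; i++)
        if (de_fragment[i].scancode == scancode)
            return shift ? de_fragment[i].shifted : de_fragment[i].normal;
    return 0;
}
```

Because the source bytes are plain ASCII, the result no longer depends on which encoding an editor or compiler assumes for the file.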

@Vutshi
Author

Vutshi commented Oct 8, 2023

@ghaerr

Another question about expected keyboard behavior.
In DOS ctrl+alt modifies some keys, e.g. to get μ, but keys which have no new meaning in this regime just do not do anything.
In ELKS it seems ctrl+alt returns unmodified key value if there is no third meaning for the key.
Which behavior is more reasonable?

@ghaerr
Owner

ghaerr commented Oct 8, 2023

To be fair, I don't fully know what's best because I don't use DOS and have very infrequently used key combinations to come up with, say, Western European character codes.

As far as the ELKS OS itself goes, the console was coded to use ALT-F1/2/3 to switch between consoles, but I changed that to Alt-1/2/3 to keep function keys out of the equation. kbd-scancode.c, the scancode keyboard driver, also converts arrow, keypad and function keys to ANSI sequences, and Alt-{A-Z} to ESC {A-Z}. There are no other "standard" conversions that I know of; all others are encoded into the country-specific tables.

In DOS ctrl+alt modifies some keys, e.g. to get μ, but keys which have no new meaning in this regime just do not do anything.
In ELKS it seems ctrl+alt returns unmodified key value if there is no third meaning for the key.

The scancode driver is quite complicated, so beware when making changes to things other than table entries. I would say that the DOS behavior seems more reasonable, as it becomes quite clear which special key combinations do something, versus just operating the same as without the combo, and it might be easier to learn. Do you happen to know what Linux does in these cases (I'm on macOS)? You're welcome to submit a change to make operation more like DOS in this respect. It also seems the tables might be easier to maintain.
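
The two policies under discussion can be contrasted in a few lines. This is a sketch only: the function names and the convention that a table value of 0 means "no ctrl+alt mapping" are assumptions for illustration, not the logic in kbd-scancode.c.

```c
#include <stdint.h>

#define NO_KEY 0  /* hypothetical marker: no byte is delivered */

/* DOS-like policy: a key without a ctrl+alt entry produces nothing. */
uint8_t ctrl_alt_strict(uint8_t third_level)
{
    return third_level ? third_level : NO_KEY;
}

/* Fallback policy as described for ELKS in this thread:
 * a key without a ctrl+alt entry yields its unmodified value. */
uint8_t ctrl_alt_fallback(uint8_t third_level, uint8_t plain)
{
    return third_level ? third_level : plain;
}
```

With the strict variant the caller would simply drop the keypress whenever NO_KEY comes back, which makes unmapped combinations visibly inert.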

@Vutshi
Author

Vutshi commented Oct 8, 2023

@ghaerr

We have adjusted the table entries and it seems to work well. Almost. There are at least two issues.
First, kilo editor does not recognise umlaut:
KiloDEtest

Whereas cat shows the resulting content correctly:
CatDEtest

Second problem is the section sign § (code 0x15). It manages to erase the whole line of text preceding it under root. In toor it works as expected.
Importantly, the previously reported difference between root and toor was a mistake. In fact, the only difference we observe now is related to §.

@ghaerr
Owner

ghaerr commented Oct 8, 2023

kilo editor does not recognise umlaut:

Try running edit, the MINIX editor, instead. kilo was actually created by its author more as an exercise in how much could be done with very little code.

section sign § (code 0x15). It manages to erase the whole line of text preceding it under root.

0x15 is also ^U in ASCII. I'm pretty sure ash uses that for clear line (it uses a number of CTRL- combinations to erase words, lines, etc).
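
The collision follows from how ASCII control codes are derived: holding Ctrl keeps only the low five bits of the letter. A tiny sketch (ctrl_code is a hypothetical helper name):

```c
/* ASCII control code produced by Ctrl plus a letter:
 * only the low five bits of the letter are kept. */
unsigned char ctrl_code(char letter)
{
    return (unsigned char)(letter & 0x1F);
}
```

Since 'U' is 0x55, Ctrl-U is 0x15, the very byte at which CP 437 places the § glyph, so a line editor that reads one byte at a time cannot tell the two apart.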

In fact, the only difference we observe now is related to §.

Good. sash doesn't process any input CTRL- combinations separately.

@ghaerr
Owner

ghaerr commented Oct 8, 2023

kilo editor does not recognise umlaut

This also could possibly be an internal sign-extension problem, using char instead of unsigned char when reading single characters out of a buffer. This can be traced down and fixed after you get everything else working. What will be important to know is whether it only fails on glyph codes that are >= 128 (high bit set).
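
For reference, this is what sign extension does to a high-bit byte when it is widened to int. The sketch reads the same buffer through both character types; the helper names are hypothetical, and the -127 result assumes the usual two's complement representation.

```c
/* Read one byte as a signed character: values with the high bit set
 * come back negative once widened to int. */
int read_as_signed(const void *buf, int i)
{
    return ((const signed char *)buf)[i];
}

/* Read the same byte as an unsigned character: always 0..255. */
int read_as_unsigned(const void *buf, int i)
{
    return ((const unsigned char *)buf)[i];
}
```

For the CP 437 ü (0x81) the signed read yields -127 while the unsigned read yields 129, so any later range check or table index behaves differently depending on which type was used.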

@Vutshi
Author

Vutshi commented Oct 8, 2023

Just for fun: the character ROM of the Schneider contains two fonts, CGA and MDA versions.
All enthusiasts of raw bit graphics are welcome to enjoy it:

CGA_MDA_Schneider

@Vutshi
Author

Vutshi commented Oct 8, 2023

kilo editor does not recognise umlaut

This also could possibly be an internal sign-extension problem, using char instead of unsigned char when reading single characters out of a buffer. This can be traced down and fixed after you get everything else working. What will be important to know is whether it only fails on glyph codes that are >= 128 (high bit set).

This is a very likely explanation. I made a binary file with all bytes starting from 0x20 to 0xff and in macOS cat shows me ? starting with 0x80:

cat test_text_v3.bin 
 !"#$%&'()*+,-./
0123456789:;<=>?
@ABCDEFGHIJKLMNO
PQRSTUVWXYZ[\]^_
`abcdefghijklmno
pqrstuvwxyz{|}~
????????????????
????????????????
????????????????
????????????????
????????????????
????????????????
????????????????
????????????????

Here is the bin file for discovering the code page
test_text_v3.bin.txt
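
A file like this can be produced with a few lines of C. The sketch below is an assumption about the layout (sixteen bytes per line, as in the listing above, with LF-only line ends) rather than a description of how the attached file was actually made; fill_test_pattern is a hypothetical name.

```c
#include <stddef.h>

/* Fill buf with the bytes 0x20..0xFF, sixteen per line, each line
 * terminated by LF. Returns the number of bytes written.
 * buf must have room for 224 data bytes plus 14 line ends (238). */
size_t fill_test_pattern(unsigned char *buf)
{
    size_t n = 0;
    int c;

    for (c = 0x20; c <= 0xFF; c++) {
        buf[n++] = (unsigned char)c;
        if ((c & 0x0F) == 0x0F)   /* end of a 16-byte row */
            buf[n++] = '\n';
    }
    return n;
}
```

Writing the buffer out with fwrite on a stream opened in "wb" mode keeps the bytes untouched on hosts that would otherwise translate line ends.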

@ghaerr
Owner

ghaerr commented Oct 8, 2023

Thanks for the binary test file, I'll keep that for debugging when needed.

in macOS cat shows me ? starting with 0x80

I think the reason for that is that the macOS Terminal interprets everything as UTF-8, and those single-byte sequences are invalid UTF-8. ELKS, however, including cat, does no such interpretation, passing everything sequentially to the configured console driver.

Here's your binary test file running on the latest commit under QEMU:
Screen Shot 2023-10-08 at 2 29 37 PM

@ghaerr
Owner

ghaerr commented Oct 8, 2023

And here is kilo editing your binary file on QEMU:
Screen Shot 2023-10-08 at 2 31 38 PM

It would appear it's not (just) a sign extension problem, but something else. Notice the right hand edge, where the CR has been converted to ?, probably not related to the first issue.

Notice that CRLF terminators are not the standard line terminators in Linux or ELKS, and your binary file is using them:
Screen Shot 2023-10-08 at 2 33 41 PM

@Vutshi
Author

Vutshi commented Oct 8, 2023

Notice that CRLF terminators are not the standard line terminators in Linux or ELKS

what should I use?

@ghaerr
Owner

ghaerr commented Oct 8, 2023

Use LF only.

Here's MINIX edit on your file:
Screen Shot 2023-10-08 at 2 36 55 PM

@Vutshi
Author

Vutshi commented Oct 8, 2023

I see. I will make a new Unix friendly test file tomorrow.

@ghaerr
Owner

ghaerr commented Oct 8, 2023

Here you go. Easily converted using :set ff=unix in vi, then :wq.
20-ff-bin.txt
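
For anyone without vi at hand, the same CRLF to LF conversion can be sketched in C. crlf_to_lf is a hypothetical helper, not an ELKS utility; it works in place on a buffer that has already been read from the file.

```c
#include <stddef.h>

/* Convert CRLF pairs to LF in place and return the new length.
 * A CR that is not followed by LF is kept as is. */
size_t crlf_to_lf(unsigned char *buf, size_t len)
{
    size_t in, out = 0;

    for (in = 0; in < len; in++) {
        if (buf[in] == '\r' && in + 1 < len && buf[in + 1] == '\n')
            continue;  /* drop the CR; the LF is copied next iteration */
        buf[out++] = buf[in];
    }
    return out;
}
```

This is safe for the test file because its data bytes start at 0x20, so every 0x0D 0x0A pair in it is a line terminator rather than payload.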

Both edit and kilo now show the right edge correctly, but kilo's problems remain.

Kilo:
kilo

Edit:
edit

@ghaerr
Owner

ghaerr commented Oct 8, 2023

I have identified the display problem in kilo.c:

        /* Handle non printable chars. */
        if (!isprint(*p)) {
            row->hl[i] = HL_NONPRINT;
            p++; i++;
            prev_sep = 0;
            continue;
        }

The code later displays '?' for non printable characters. The problem is in the C library isprint:

int isprint(int c)
{
    return (c & 0x7f) >= 32 && c < 127;
}

The first comparison masks off the high bit, but the c < 127 test then rejects every code from 127 up, so all high-bit glyphs are reported as non-printable, as are all codes below 32, which match the ASCII control characters. I'll look into changing just kilo for now, as isprint itself needs a little more thought as to the formal definition.
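
One possible shape for an editor-local check is sketched below. It is an assumption for illustration, not the change that was actually committed: is_displayable is a hypothetical name, and the rule (everything from 32 up except DEL is shown) is just one reasonable choice for a console whose character ROM has glyphs for the whole upper half.

```c
/* Treat every code from 32 upward as displayable, except DEL (127).
 * Converting to unsigned char first undoes any earlier sign extension. */
int is_displayable(int c)
{
    unsigned char b = (unsigned char)c;

    return b >= 32 && b != 127;
}
```

With a check like this, the CP 437 ü (0x81) passes whether it arrives as 129 or as a sign-extended -127, while newline and other control codes are still rejected.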

@Vutshi
Author

Vutshi commented Oct 16, 2023

@ghaerr

Thank you for fixing isprint, now everything works as expected (well, almost, but we will come back to it later).

At the moment, I am implementing RTC M3002 support in clock.c. It reads the RTC well but there are some subtleties to be added. For this I need to read a byte from memory and also print it for debug purposes. I try to use peekb and printk but something is missing, I get a compile error undefined reference to 'peekb'.

Best

@ghaerr
Owner

ghaerr commented Oct 16, 2023

Thank you for fixing isprint, now everything works as expected

Great! You're still going to submit a PR at some time for the keyboard changes, right?

I try to use peekb and printk but something is missing, I get a compile error undefined reference to 'peekb'.

That's because peekb isn't implemented for user programs, only the kernel. I recommend using a __far pointer, and setting the segment and offset using _MK_FP.

Use something like the following (modified for char rather than short), which is in elkscmd/tui/cons.c:

#define MK_FP(seg,off)  ((void __far *)((((unsigned long)(seg)) << 16) | (off)))

#define poke(s,o,w)     (*((unsigned short __far*)MK_FP((s),(o))) = (w))
#define peek(s,o)       (*((unsigned short __far*)MK_FP((s),(o))))

There's another use example in elkscmd/basic/host.c.
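
A byte-wide variant would follow the same pattern as the macros above, with unsigned char __far * in place of unsigned short __far *. Since __far only exists in the ia16 toolchain, the sketch below instead shows, in portable C, what such a read amounts to: the CPU forms the physical address as segment times 16 plus offset. The names linear_address and sim_peekb are hypothetical, and an ordinary array stands in for real-mode memory.

```c
#include <stdint.h>

/* Real-mode physical address for a segment:offset pair. */
uint32_t linear_address(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}

/* Simulated byte read: mem stands in for the first part of the
 * 1 MiB real-mode address space. */
uint8_t sim_peekb(const uint8_t *mem, uint16_t seg, uint16_t off)
{
    return mem[linear_address(seg, off)];
}
```

For example, 0x0040:0x0017 in the BIOS data area resolves to physical address 0x417. Note that MK_FP does not compute this sum itself; it only packs the segment into the high 16 bits of the far pointer and leaves the address arithmetic to the hardware.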

@ghaerr
Owner

ghaerr commented Feb 6, 2024

Hello @Vutshi,

Would you like to post the source to your modified German keyboard file? I'd be happy to add it so that ELKS is up-to-date with your enhancement. Same thing for RTC M3002 support if you ever finished that.

Thank you!

@Vutshi
Author

Vutshi commented Apr 28, 2024

Hi @ghaerr ,

Apologies for not responding earlier. I’ll definitely add support for the Schneider PC, but it’ll have to wait until summer. Right now, I’m busy working on the future of computing, so the past has to wait ;)

The progress on ELKS is impressive, and I’m eager to try out the new improvements.

Best
