Typing inside of the default WSL terminal feels amazing, why is it better than every other app? #327

nickjj · 2018-12-14T12:23:31Z

Sorry, this isn't an issue, but instead, it's more of a suggestion / request to please not break whatever you did with the default WSL terminal (Ubuntu specifically) being so responsive when it comes to rendering characters on the screen after pressing a key.

Typing in the default WSL terminal feels like you're typing on air. There's a smoothness to it that's not present in any other Windows app, not even notepad.exe. It feels like it has 10ms of input lag instead of 75ms+ for all other Windows apps (and 200-250ms+ for most Electron based apps).

What makes the WSL terminal feel better than notepad.exe and will this UI enhancement make its way to all Windows apps in the future?

Feel free to close this if you don't want to discuss it. I mainly opened it to bring an awareness to how good it is to hopefully prevent some type of regression from happening in the future.


miniksa · 2018-12-14T17:17:48Z

I really do not mind when someone comes by and decides to tell us that we're doing a good job at something. We hear so many complaints every day that a post like this is a breath of fresh air. Thanks for your thanks!

Also, I'm happy to discuss this with you until you're utterly sick of reading it. Please ask any follow-ons you want. I thrive on blathering about my work. :P

If I had to take an educated guess as to what is making us faster than pretty much any other application on Windows at putting your text on the screen... I would say it is because that is literally our only job! Also probably because we are using darn near the oldest and lowest level APIs that Windows has to accomplish this work.

Pretty much everything else you've listed has some sort of layer or framework involved, or many, many layers and frameworks, when you start talking about Electron and Javascript. We don't.

We have one bare, super un-special window with no additional controls attached to it. We get our keys fed into us from just barely above the kernel given that we're processing them from window messages and not from some sort of eventing framework common to pretty much any other more complicated UI framework than ours (WPF, WinForms, UWP, Electron). And we dump our text straight onto the window surface using GDI's PolyTextOut with no frills.

Even notepad.exe has multiple controls on its window at the very least and is probably (I haven't looked) using some sort of library framework in the edit control to figure out its text layout (which probably is using another library framework for internationalization support...)

Of course this also means that we have trade-offs. We don't support fully international text like pretty much every other application will. RTL? No go zone right now. Surrogate pairs and emoji? We're getting there but not there yet. Indic scripts? Nope.

Why are we like this? For one, conhost.exe is old as dirt. It has to use the bare metal bottom layer of everything because it was created before most of those other frameworks were created. And also it maintains as low/bottom level as possible because it is pretty much the first thing that one needs to bring up when bringing up a new operating system edition or device before you have all the nice things like frameworks or what those frameworks require to operate. Also it's written in C/C++ which is about as low and bare metal as we can get.

Will this UI enhancement come to other apps on Windows? Almost certainly not. They have too much going on which is both a good and a bad thing. I'm jealous of their ability to just call one method and layout text in an uncomplicated manner in any language without manually calculating pixels or caring about what styles apply to their font. But my manual pixel calculations, dirty region math, scroll region madness, and more make it so we go faster than them. I'm also jealous that when someone says "hey can you add a status bar to the bottom of your window" that they can pretty much click and drag that into place with their UI Framework and it will just work, whereas for us, it's been a backlog item forever and gives me heartburn to think about implementing.
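To make "dirty region math" a little more concrete, here is a minimal sketch of the general idea: track the bounding box of the character cells that changed, then convert it to pixels so only that area is repainted. `CellRect`, `PixelRect`, and `DirtyRegion` are hypothetical names for illustration, and a fixed cell size is assumed; this is not conhost's actual code.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical types for illustration only; not conhost's real structures.
struct CellRect {
    int left, top, right, bottom;  // inclusive character-cell coordinates
};

struct PixelRect {
    int left, top, right, bottom;  // right/bottom are exclusive, in pixels
};

class DirtyRegion {
public:
    bool empty() const { return !dirty_; }

    // Grow the bounding box so it covers one more changed cell.
    void markCell(int col, int row) {
        if (!dirty_) {
            box_ = {col, row, col, row};
            dirty_ = true;
            return;
        }
        box_.left = std::min(box_.left, col);
        box_.top = std::min(box_.top, row);
        box_.right = std::max(box_.right, col);
        box_.bottom = std::max(box_.bottom, row);
    }

    // Convert the cell box to pixels, assuming every cell has the same size.
    PixelRect toPixels(int cellWidth, int cellHeight) const {
        assert(dirty_);
        return {box_.left * cellWidth,
                box_.top * cellHeight,
                (box_.right + 1) * cellWidth,
                (box_.bottom + 1) * cellHeight};
    }

    // Call after repainting so the next frame starts clean.
    void clear() { dirty_ = false; }

private:
    CellRect box_{0, 0, 0, 0};
    bool dirty_ = false;
};
```

A single bounding box is the simplest possible choice; it can over-paint when changes are far apart, which is one reason real renderers tend to need more careful bookkeeping.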

Will we try to keep it from regressing? Yes! Right now it's sort of a manual process. We identify that something is getting slow and then we go haul out WPR and start taking traces. We stare down the hot paths and try to reason out what is going on and then improve them. For instance, in the last cycle or two, we focused on heap allocations as a major area where we could improve our end-to-end performance, changing a ton of our code to use stack-constructed iterator-like facades over the underlying request buffer instead of translating and allocating it into a new heap space for each level of processing.
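As a rough illustration of the "stack-constructed iterator-like facade" idea, the sketch below walks a caller-owned buffer piece by piece without copying anything to the heap. `LineView` and `countNonEmptyLines` are hypothetical names, splitting on newlines is an assumption chosen only to keep the example small, and it needs C++17 for `std::string_view`; the real conhost request buffers and facades are not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical facade: lives on the stack and only points into a buffer
// that someone else owns. No allocation, no copy.
class LineView {
public:
    explicit LineView(std::string_view buffer) : rest_(buffer) {}

    // Sets `line` to the next newline-delimited piece and returns true,
    // or returns false once the buffer is exhausted.
    bool next(std::string_view& line) {
        if (rest_.empty()) {
            return false;
        }
        const std::size_t pos = rest_.find('\n');
        if (pos == std::string_view::npos) {
            line = rest_;
            rest_ = std::string_view{};
        } else {
            line = rest_.substr(0, pos);
            rest_.remove_prefix(pos + 1);
        }
        return true;
    }

private:
    std::string_view rest_;  // the part of the buffer not yet visited
};

// Example consumer: each processing level can build its own view over the
// same bytes instead of translating them into a fresh heap buffer.
std::size_t countNonEmptyLines(std::string_view buffer) {
    LineView view(buffer);
    std::string_view line;
    std::size_t count = 0;
    while (view.next(line)) {
        if (!line.empty()) {
            ++count;
        }
    }
    return count;
}
```

The trade-off is lifetime: a view like this is only valid while the underlying buffer is alive, so the caller has to keep the buffer around for as long as any facade points into it.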

As an aside, @bitcrazed wants us to automate performance tests in some conhost specific way, but I haven't quite figured out a controlled environment to do this in yet. The Windows Engineering System runs performance tests each night that give us a coarse grained way of knowing if we messed something up for the whole operating system, and they technically offer a fine grained way for us to insert our own performance tests... but I just haven't got around to that yet. If you have an idea for a way for us to do this in an automated fashion, I'm all ears.

If there's anything else you'd like to know, let me know. I could go on all day. I deleted like 15 tangents from this reply before posting it....

Biswa96 · 2018-12-14T17:41:01Z

Speaking of low level Windows APIs, I found that all the user mode components of both Console (conhost.exe, cmd.exe ...) and WSL (wsl.exe, LxssManager.dll ...) use C++ STL enormously, for example in string manipulation, memory allocation, virtual tables etc. Would there be any performance improvement if only C were used?

miniksa · 2018-12-14T17:52:10Z

I wouldn't consider those as using C++ STL "enormously". They definitely use some of the collections and perhaps a bit of string manipulation and an algorithm here or there. I feel like there's a lot more to STL than those few bits.

But anyway... most of the things you describe were written straight up in C a while back. We've been selectively using C++ and STL in more and more of them as a conscious tradeoff. As long as we're well aware of what the templates are doing under the hood and are making careful use of the correct ones, we're generally only trading a very small amount of performance while gaining a significant amount of security and programming ease.

Security is actually the big reason why we use STL templates over trying to craft our own structures whenever possible. Using the STL templates for collections and strings generally grants us bounds checking that would otherwise have to be done manually in an error prone fashion (or not done at all!)
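A small sketch of what that bounds checking can look like in practice: `std::vector::at` throws `std::out_of_range` on a bad index, where a hand-rolled array read would silently touch stray memory. `tryGetCell` is a hypothetical helper for illustration; which STL facilities the console code actually relies on is not stated here, and note that plain `operator[]` is not checked by the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: copies the character at `column` into `out` and
// returns true, or returns false if `column` is outside the row.
bool tryGetCell(const std::vector<wchar_t>& row, std::size_t column, wchar_t& out) {
    try {
        out = row.at(column);  // at() validates the index before reading
        return true;
    } catch (const std::out_of_range&) {
        return false;  // a raw array read here would be undefined behavior
    }
}
```

The check costs a comparison per access, which is the kind of small performance trade for safety described above.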

Also I'm not strictly sure that having 17 different queuing and linked list implementations inside the console code when it was straight C was overall better for performance than using std::queue and std::list today. It probably consumed more on-disk space for each individual implementation which is a different type of performance issue (storage and the page-load I/O).

nickjj · 2018-12-14T19:20:20Z

Thanks a lot for taking the time out to write that reply, and no problem on the praise.

For the past couple of months I was looking for a good terminal set up, and I think ubuntu.exe with tmux is as close to perfection as you can get with tools we have today. The ubuntu.exe terminal itself is blazing fast and tmux gives you all of the quality of life goodies (tabs, split panes, buffer searching, etc.).

Really looking forward to future releases that make color themes more compatible and properties related enhancements like hotkeys for zooming +/- on the font size.

stakx · 2018-12-16T17:48:02Z

@miniksa: Thanks for a very interesting post!

> we are using darn near the oldest and lowest level APIs that Windows has to accomplish this work.

I've been curious about this for a while. You appear to be referring to the various Win32 APIs (USER32 / GDI32). I've lately become unsure about whether they are still as low-level as one can get in recent versions of Windows (8.0 and later), or whether these APIs have been silently converted to sit on top of other stuff (such as Direct2D, DirectWrite, etc.). How do the older APIs relate to the newer ones? I'd love it if you could clarify that bit!

miniksa · 2018-12-17T17:21:35Z

@stakx, I am referring to USER32 and GDI32.

I'll give you a cursory overview of what I know off the top of my head without spending hours confirming the details. As such, some of this is subject to handwaving and could be mildly incorrect but is probably in the right direction. Consider every statement to be my personal knowledge on how the world works and subject to opinion or error.

For the graphics part of the pipeline (GDI32), the user-mode portions of GDI are pretty far down. The app calls GDI32, some work is done in that DLL on the user-mode side, then a kernel call jumps over to the kernel and drawing occurs.

The portion that you're thinking of regarding "silently converted to sit on top of other stuff" is probably that once we hit the kernel calls, a bunch of the kernel GDI stuff tends to be re-platformed on top of the same stuff as DirectX when it is actually handled by the NVIDIA/AMD/Intel/etc. graphics driver and the GPU at the bottom of the stack. I think this happened with the graphics driver re-architecture that came as a part of WDDM for Windows Vista. There's a document out there somewhere about what calls are still really fast in GDI and which are slower as a result of the re-platforming. Last time I found that document and checked, we were using the fast ones.

On top of GDI, I believe there are things like Common Controls or comctl32.dll which provided folks reusable sets of buttons and elements to make their UIs before we had nicer declarative frameworks. We don't use those in the console really (except in the property sheet off the right click menu).

As for DirectWrite and D2D and D3D and DXGI themselves, they're a separate set of commands and paths that are completely off to the side from GDI at all both in user and kernel mode. They're not really related other than that there's some interoperability provisions between the two. Most of our other UI frameworks tend to be built on top of the DirectX stack though. XAML is for sure. I think WPF is. Not sure about WinForms. And I believe the composition stack and the window manager are using DirectX as well.

As for the input/interaction part of the pipeline (USER32), I tend to find most other newer things (at least for desktop PCs) are built on top of what is already there. USER32's major concept is windows and window handles and everything is sent to a window handle. As long as you're on a desktop machine (or a laptop or whatever... I mean a classic-style Windows-powered machine), there's a window handle involved and messages floating around and that means we're talking USER32.

The window message queue is just a straight up FIFO (more or less) of whatever input has occurred relevant to that window while it's in the foreground + whatever has been sent to the window by other components in the system.

The newer technologies and the frameworks like XAML and WPF and WinForms tend to receive the messages from the window message queue one way or another and process them and turn them into event callbacks to various objects that they've provisioned within their world.
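The difference between reading the queue directly and going through a framework can be sketched in portable C++. This is a toy model only: `Message`, `MessageKind`, `drainDirect`, and `EventRouter` are hypothetical names, and the real USER32 queue and the real frameworks are far more involved.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical message shape, loosely inspired by the idea of a window message.
enum class MessageKind { KeyDown, Char, Paint };

struct Message {
    MessageKind kind;
    int param;  // e.g. a character code for Char
};

// Direct style: pop messages in FIFO order and handle them on the spot.
std::string drainDirect(std::queue<Message>& q) {
    std::string typed;
    while (!q.empty()) {
        const Message m = q.front();
        q.pop();
        if (m.kind == MessageKind::Char) {
            typed.push_back(static_cast<char>(m.param));
        }
    }
    return typed;
}

// Framework style: the same FIFO is drained, but every message is looked up
// and forwarded to whatever callbacks were registered for its kind.
class EventRouter {
public:
    using Handler = std::function<void(const Message&)>;

    void on(MessageKind kind, Handler handler) {
        handlers_[kind].push_back(std::move(handler));
    }

    void drain(std::queue<Message>& q) {
        while (!q.empty()) {
            const Message m = q.front();
            q.pop();
            const auto it = handlers_.find(m.kind);
            if (it == handlers_.end()) {
                continue;  // nobody subscribed to this kind
            }
            for (const Handler& h : it->second) {
                h(m);
            }
        }
    }

private:
    std::map<MessageKind, std::vector<Handler>> handlers_;
};
```

Both styles see the same messages in the same order; the router simply adds a lookup and an indirect call per message, which stands in for the extra layers a framework puts between the queue and your code.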

However, the newer technologies that also work on other non-desktop platforms like XAML tend to have the ability to process stuff off of a completely different non-USER32 stack as well. There's a separate parallel stack to USER32 with all of our new innovations and realizations on how input and interaction should occur that doesn't exactly deal with classic messaging queues and window handles the same way. This is the whole Core* family of things like CoreWindow and CoreMessaging. They also have a different concept of "what is a user" that isn't so centric around your butt in a rolling chair in front of a screen with a keyboard and mouse on the desk.

Now, if you're on XAML or one of the other Frameworks... all this intricacy is handled for you. XAML figures out how to draw on DirectX for you and negotiates with the compositor and window manager for cool effects on your behalf. It figures out whether to get your input events from USER32 or Core* or whatever transparently depending on your platform and the input stacks can handle pen, touch, keyboard, mouse, and so on in a unified manner. It has provisions inside it embedded to do all the sorts of globalization, accessibility, input interaction, etc. stuff that make your life easy. But you could choose to go directly to the low-level and handle it yourself or skip handling what you don't care about.

The trick is that GDI32 and USER32 were designed for a limited world with a limited set of commands. Desktop PCs were the only thing that existed, single user at the keyboard and mouse, simple graphics output to a VGA monitor. So using them directly at the "low level" like conhost does is pretty easy. The new platforms could be used at the "low level" but they're orders of magnitude more complicated because they now account for everything that has happened with personal computing in 20+ years like different form factors, multiple active users, multiple graphics adapters, and on and on and on and on. So you tend to use a framework when using the new stuff so your head doesn't explode. They handle it for you, but they handle more than they ever did before so they're slower to some degree.

So are GDI32 and USER32 "lower" than the new stuff? Sort of.
Can you get that low with the newer stuff? Mostly yes, but you probably shouldn't and don't want to.
Does new live on top of old or is old replatformed on the new? Sometimes and/or partially.
Basically... it's like the answer to anything software... "it's an unmitigated disaster and if we all stepped back a moment, we should be astounded that it works at all." :P

Anyway, that's enough ramble for one morning. Hopefully that somewhat answered your questions and gave you a bit more insight.

stakx · 2018-12-18T08:30:17Z

> Hopefully that somewhat answered your questions and gave you a bit more insight.

@miniksa, it did indeed! Makes perfect sense, too. Thank you very much for taking the time to write that all up! 👍

miniksa · 2019-01-18T16:32:26Z

I'm going to close this issue for now. Thanks for the inquiry and I'm glad you enjoyed the information.

GSPP · 2021-12-08T12:22:43Z

@miniksa Thank you for explaining all those details. That is super interesting to know.

You speak of a document that describes what GDI calls are fast and slow. Is there anything public about it? In fact, I couldn't find much about the internals of the Windows graphics stack. I'd be very interested to know what actually goes on under the hood. It seems tough sometimes to optimize drawing performance because you can only guess what drawing strategy is going to turn out the fastest.

Actually, have you considered opening a blog? This is Old New Thing type of information that is extremely valuable and interesting. And since you stated that you like to explain what you know that might be a great thing 😄

miniksa · 2021-12-09T18:49:00Z

https://docs.microsoft.com/windows/win32/direct2d/comparing-direct2d-and-gdi contains some diagrams that show the evolution of how GDI calls are made over various revisions of Windows that might provide insight on how it works under the hood. It looks like https://docs.microsoft.com/windows-hardware/drivers/display/specifying-gdi-hardware-accelerated-rendering-operations also alludes to which commands specifically end up hardware accelerated (AlphaBlend, BitBlt, ClearTypeBlend, ColorFill, StretchBlt, TransparentBlt)... but I swear to some degree that there are other user-level GDI calls that thunk quickly into those anyway so it's not comprehensive. I've spent a while searching for it, but I think it no longer exists anywhere I can see (many of the old blogs and whitepaper download sites at Microsoft appear to have gone offline or been archived or otherwise migrated)... or I was somehow confused on the evolution of which methods were accelerated or not through the transition from XP to Vista to Windows 7. I've reached out to the GDI team to see if they have any historical record of it and I'll let you know if I hear anything.

I have considered opening a blog. I'm no Raymond Chen though. I actually bought https://niksa.dev early in the pandemic and puttered around briefly to try to set something up with Hugo but I hate it. I'm thinking of wiping it and using https://forestry.io to run it and make it easier because I want to focus on the content more than on the running of the site. I just want it to be easy to use, fast, and not ugly. Do be warned though that if/when I launch it, it'll be all of my interests interleaved in a stream of consciousness. So you'll probably see ramblings about not just my programming work at Microsoft and the internals the best I understand them (at least as much as is publicly shareable), but also my obsessions with personal finance, infrastructure, home automation, networking stuff, video games, and other random things about my life (like a how-to on working with a flood insurance adjuster after I experienced that with my mother-in-law this past month). You're not the first person to ask me to start writing somewhere that they can follow. With any luck, I'll be able to launch it during my upcoming vacation the last 2 weeks of December. Thanks for the encouragement.

miniksa added Product-Conhost For issues in the Console codebase discussion labels Dec 14, 2018

zadjii-msft added Issue-Question For questions or discussion and removed discussion labels Dec 14, 2018

miniksa closed this as completed Jan 18, 2019

miniksa self-assigned this Jan 18, 2019

nickjj mentioned this issue Jan 27, 2019

Input latency is a lot higher than the default WSL Ubuntu terminal mintty/wsltty#145

Closed

ghost mentioned this issue Mar 23, 2019

Supplementary Multilingual Plane characters #321

Closed

fredrikaverpil mentioned this issue May 10, 2019

Feature Request: Plugin support #555

Closed

connor4312 mentioned this issue Oct 28, 2020

terminal.integrated.typeaheadThreshold default value is pretty aggressive microsoft/vscode#109586

Closed

desmap mentioned this issue Apr 15, 2021

Terminal gets extremely sluggish after a day of use #7710

Closed
