Heavy legacy of the past. Windows command line issues

Original author: Rich Turner
Preface from the author, Rich Turner from Microsoft. This is an article about the command line: from its appearance and evolution to the Windows Console overhaul plans and the command line in future versions of Windows. Whether you are an experienced professional or a newcomer to IT, we hope that you will find the article interesting.

Long ago, in a server room far, far away ...

From the first days of computer science development, people needed an efficient way to transfer commands and data to a computer and see the result of these commands / calculations.

One of the first truly effective human-machine interfaces was the Tele-Typewriter, or "Teletype" (TTY). This is an electromechanical machine with a keyboard for entering data and some kind of output device: at first a printer, later a screen.

Characters entered by the operator are buffered locally and sent from the teletype to a nearby computer or mainframe as a series of signals over an electrical cable (for example, RS-232) at 10 characters per second (110 baud):

Teletype Model 33 ASR

Note: David Gösswein has an excellent PDP-8 website where you can find more information about ASR33 (and the corresponding PDP-8 technology), including photos, videos, etc.

The program on the computer receives the entered characters, decides what to do with them, and may asynchronously send a response back to the teletype. The TTY then prints or displays the received characters for the operator.

Later the technology improved: transmission speeds rose to 19,200 bps, and CRT displays (a widespread display type in the 80s and 90s) replaced the noisy, expensive printers, as on the popular DEC VT100 terminal:

DEC VT100 Terminal

Although the technology has improved, this model, in which the terminal sends characters to a program on the computer and the program returns text to the user, remains to this day the fundamental model for the interaction of all command lines and consoles on all platforms!

Terminal and command line architecture

The model is elegant in its own way, in part because each component stays simple and self-contained: the keyboard generates characters, which are buffered and sent as electrical signals. The output device simply displays (on paper or a screen) the characters received from the computer.

Because only a stream of characters is transmitted at each stage of the system, it is relatively simple to introduce different communication infrastructures, for example, adding modems to carry the streams of input and output characters over long distances via telephone lines.

Text encoding

It is important to remember that terminals and computers exchange data through streams of characters. When you press a key on the terminal keyboard, a value representing the character entered is sent to the connected computer. Press the 'A' key and the value 65 (0x41) is sent. Press 'Z' and 90 (0x5A) is sent.
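As a small illustration of the values above, here is a minimal Python sketch (Python is chosen only for convenience; the helper name `key_value` is made up for this example) that maps a key to the value a terminal would send:

```python
# Map a character to the 7-bit value a terminal would transmit for it.
def key_value(ch: str) -> int:
    """Return the ASCII value sent for a single character."""
    value = ord(ch)
    if value > 127:
        raise ValueError(f"{ch!r} is outside 7-bit ASCII")
    return value

for key in "AZ":
    print(key, key_value(key), hex(key_value(key)))
# A 65 0x41
# Z 90 0x5a
```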

7-bit ASCII encoding

The list of characters and their values is defined in the American Standard Code for Information Interchange (ASCII) standard, which is also the ISO/IEC 646 / ECMA-6 standard, “7-bit coded character set”, which defines:

  • 128 values, including the printable Latin characters A–Z (65–90), a–z (97–122), and 0–9 (48–57)
  • Many common punctuation marks
  • Several non-printing control codes (0−31 and 127):

Standard 7-bit ASCII characters

When 7 bits are not enough: code pages

However, 7 bits do not provide enough space to encode the many accented letters, punctuation marks, and symbols used in other languages and regions. By adding one extra bit, the ASCII character table can be extended with additional “code page” sets for characters 128–255 (possibly also redefining several of the non-printing ASCII characters).

For example, IBM introduced code page 437, with several graphic characters such as ╫ (215) and ╣ (185) and mathematical symbols including π (227) and ± (241), and also assigned printable glyphs to the normally non-printing characters 1–31:

Codepage 437

The Latin-1 code page defines the set of characters used by Latin-based languages:

Latin-1 code page

In many command-line environments and shells, you can change the current code page so that the terminal displays different characters (depending on the fonts available), especially for characters with values 128–255. But an incorrectly specified code page results in garbled text, known as mojibake. And yes, “mojibake” is a real term! Who would have thought? ;)
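The effect of choosing the wrong code page can be sketched in a few lines of Python. The byte values are the ones quoted above for code page 437; `cp437` and `latin-1` are the names of Python's built-in codecs for these two code pages:

```python
# The same byte values decode to different characters under different code pages.
raw = bytes([227, 241, 185, 215])  # values quoted in the text for code page 437

as_cp437 = raw.decode("cp437")     # IBM code page 437
as_latin1 = raw.decode("latin-1")  # ISO 8859-1 (Latin-1)

print(as_cp437)   # π±╣╫
print(as_latin1)  # ãñ¹×

# Mojibake: text written with one code page but read with another.
garbled = "π±".encode("cp437").decode("latin-1")
print(garbled)    # ãñ
```

The bytes never change; only their interpretation does, which is why the same file can look correct on one machine and garbled on another.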

When 8 bits are not enough: Unicode

Code pages solved the problem temporarily, but they have many shortcomings; for example, they cannot display text from several code pages / languages at the same time. A new encoding was therefore needed, one that accurately represents every character and alphabet of all languages known to humanity, with plenty of room to spare! Enter Unicode.

Unicode is an international standard (ISO/IEC 10646) that currently defines 137,439 characters from 146 modern and historical scripts, as well as many symbols and glyphs, including the numerous emoji that are widely used in almost every application, platform, and device. Unicode is regularly updated with additional writing systems, new / corrected emoji, symbols, etc.

Unicode also defines “non-printing” formatting characters that make it possible, for example, to combine characters and / or affect preceding or following characters! This is especially useful in scripts such as Arabic, where the form a particular character takes is determined by its neighbors. Emoji can use a zero-width joiner (ZWJ) to combine multiple characters into one visual glyph. For example, Microsoft's Ninja Cat emoji are formed by joining the cat emoji with other emoji:

Microsoft's Ninja Cat emoji

When bytes are too many: UTF-8!

A unique and systematic representation of all characters requires a large amount of space, up to several bytes for each character.

Therefore, to save space, several Unicode encodings were developed. The most popular are UTF-32 (4 bytes per character), UTF-16 / UCS-2 (2 bytes per character; UTF-16 uses 4 bytes for characters outside the basic range), and UTF-8 (1–4 bytes per character).

Largely due to backward compatibility with ASCII and saving space, UTF-8 has become the most popular Unicode encoding on the Internet. It has been showing explosive growth since 2008, when it overtook ASCII and other popular encodings in popularity:

UTF-8 encoding has grown in popularity (source: Wikipedia)
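To make the size trade-off concrete, the following Python sketch counts how many bytes a few sample characters occupy in each encoding. The sample characters are arbitrary choices for illustration, and the little-endian codec variants are used so that no byte-order mark is counted:

```python
# Compare how many bytes one character takes in each Unicode encoding.
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")

def encoded_sizes(ch: str) -> dict:
    """Return the encoded length in bytes of `ch` for each encoding."""
    return {enc: len(ch.encode(enc)) for enc in ENCODINGS}

for ch in ("A", "π", "€", "😀"):
    print(repr(ch), encoded_sizes(ch))

# Backward compatibility: ASCII text is already valid UTF-8, byte for byte.
assert "Hello".encode("ascii") == "Hello".encode("utf-8")
```

ASCII characters take a single byte in UTF-8, while UTF-32 always spends four; characters outside the basic range (such as most emoji) need four bytes in all three encodings.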

Thus, terminals initially supported 7-bit and then 8-bit ANSI text, but most modern terminals support Unicode / UTF-8 text.

So, what is the command line and what is a shell?

“Command Line” or CLI (command line interface / interpreter) describes the most fundamental mechanism through which a person controls a computer: CLI accepts input entered by the operator and executes the required commands.

For example, echo Hello sends the text “Hello” to an output device (for example, the screen), while dir (Cmd) or ls (PowerShell / *NIX) lists the contents of the current directory, and so on.

At first the available commands were relatively simple, but operators demanded ever more sophisticated commands and the ability to write scripts to automate repetitive or complex tasks. Command-line processors therefore grew more complex and turned into what is now called the command-line "shell".

On Unix / Linux, the original Unix shell (sh) spawned multiple shells, including the Korn shell (ksh), C shell (csh), and Bourne Shell (sh). These in turn led to the Bourne Again Shell (bash) and others.

In the Microsoft world:

  • The original MS-DOS (command.com) was a relatively simple command line shell.
  • The Windows NT command line (cmd.exe) was designed to be compatible with legacy command.com scripts, with several commands added for the new, more powerful operating system.
  • In 2006, Microsoft released Windows PowerShell.
    • PowerShell is a modern object-based command-line shell that borrows features from other shells and incorporates the power of the .NET CLR and the .NET Framework
    • With PowerShell, you can write scripts and automate almost all aspects of one or more computers under Windows, networks, storage systems, databases, etc.
    • In 2017, Microsoft opened the source code for PowerShell, allowing it to run on macOS, various Linux and BSD versions.
  • In 2016, Microsoft introduced the Windows Subsystem for Linux (WSL)
    • Allows you to run regular unmodified Linux binaries directly in Windows 10
    • Users install one or more regular Linux distributions from the Windows Store
    • You can run one or more instances of the distribution in parallel with others, as well as in parallel with existing applications and Windows tools
    • WSL allows you to run side by side all Windows tools and Linux command line tools without using resource-intensive virtual machines

We will return to the Windows command line shells, but for now, remember that there are different shells, they accept commands entered by the user / operator and perform a wide range of tasks as needed.

Modern command line

Modern computers are much more powerful than the “dumb terminals” of the past and usually run under the desktop OS (for example, Windows, Linux, macOS) with a graphical user interface (GUI). This GUI environment allows several applications to work simultaneously in separate windows on the screen and / or invisible in the background.

Cmd, PowerShell, and Ubuntu Linux under WSL work on independent console instances.

Modern console / terminal applications, which run in a window on the computer screen but still perform the same functions as the hardware terminals of the past, have replaced the bulky, slow electromechanical teletypes.

Similarly, the command-line applications to which the terminals are connected work as before: they get input characters, decide what to do with these characters, optionally do the work — and can produce text to display to the user. Only instead of communicating via slow TTY channels, terminal applications and command-line applications on the same machine communicate over very high-speed Pseudo Teletype (PTY) channels in memory.

Modern command line

Modern terminals mainly interact with command-line applications running locally. But of course, they can also interact with command-line applications running on other machines on the same network or even remote machines on the other side of the world via the Internet. This “remote” command line interaction is a powerful tool that is popular on every platform, especially on *NIX.

Command line evolution

Modest Start: MS-DOS

At the dawn of the computer industry, most computers were operated by entering commands at a command line. Machines running Unix, CP/M, DR-DOS, and other systems competed for market share. In the end, MS-DOS became the de facto standard for the IBM PC and all compatible computers:

MS-DOS 6.0

Like most of the major operating systems of the time, the command-line interpreter, or “shell”, in MS-DOS provided a simple but relatively effective set of commands and a command-line syntax for writing batch files (.bat).

Large and small businesses very quickly adopted MS-DOS and collectively created many millions of scripts, some of which are still in use today! Batch scripts are used to automate PC settings, set / change security settings, update software, build code, etc.

You rarely, if ever, see such scripts at work, because many of them run in the background, for example, when you log in to a computer. Yet hundreds of billions of command-line scripts and commands are executed every day on Windows alone!

The command line is a powerful tool in the hands of someone with the patience and perseverance to learn how to squeeze the most out of the available commands and tools. But most ordinary users find it difficult to operate a computer effectively from the command line. Most did not enjoy learning and memorizing a host of cryptic abbreviations just to get the computer to do something useful.

A more user-friendly, productivity-oriented user interface was needed.

The GUI goes mainstream

Enter the graphical user interface (GUI), pioneered on the Xerox Alto.

Soon after its invention, many competing GUIs appeared: on Apple's Lisa and Macintosh computers, the Commodore Amiga (Workbench), Atari ST (DRI GEM), Acorn Archimedes (Arthur / RISC OS), Sun Workstation, X11 / X Windows, and many others, including Microsoft Windows.

Windows 1.0 was released in 1985 and was essentially an MS-DOS application that provided a simple GUI environment with tiled windows, allowing users to run several applications side by side:

Windows 1.01 on MS-DOS

Windows 2.x, 3.x, 95, and 98 all ran on top of MS-DOS. Later versions of Windows began replacing some MS-DOS functions with Windows-native alternatives (for example, file system operations), but they all still relied on MS-DOS.

Note: Windows ME (Millennium Edition) was an interesting hybrid. It finally replaced the MS-DOS and real-mode support of previous Windows versions with several new features (especially gaming and media technologies). Some features were borrowed from Windows 2000 (for example, the new TCP/IP stack) but tuned to run on home PCs that would have struggled to run full-featured NT.

But Microsoft understood that it could not infinitely stretch the architecture and capabilities of MS-DOS and Windows. A new operating system was required with an eye to the future.

Microsoft: the Unix market leader! Yes, seriously!

While developing MS-DOS, Microsoft was also shipping Xenix, its proprietary port of Unix Version 7, for various processor and machine architectures, including the Z8000, 8086/80286, and 68000.

By 1984, Microsoft's Xenix became the most popular version of Unix in the world!

Meanwhile, the breakup of the Bell System, home of Bell Labs, the birthplace of Unix, led to AT&T selling Unix System V to computer manufacturers and end users.

Microsoft understood that not owning its own OS would compromise its ability to grow. It therefore decided to abandon Xenix: in 1987, Microsoft transferred Xenix to its partner, the Santa Cruz Operation (SCO), with which it had worked on several projects porting and improving Xenix on various platforms.

Microsoft + IBM == OS / 2 ... not for long

In 1985, Microsoft began working with IBM on a new operating system, OS/2. It was originally planned as a “more capable DOS” for some modern 32-bit CPUs, taking into account other technologies rapidly emerging from IBM and other OEMs.

But the story of OS/2 turned out to be a turbulent one. In 1990, Microsoft and IBM ended their collaboration. This was due to a number of factors, including significant cultural differences between IBM and Microsoft developers, scheduling problems, and the explosive success and adoption of Windows 3.1. IBM continued to develop and support OS/2 until the end of 2006.

By 1988, Microsoft was convinced that future success required a bigger, bolder, and more ambitious approach. Such an approach would need a new, modern operating system to support the company's ambitious goals.

Microsoft big bet: Windows NT

In 1988, Microsoft hired Dave Cutler, creator of DEC's popular and respected VAX/VMS operating system. His task was to create a new, modern, platform-independent operating system that Microsoft would own, control, and largely build its future upon.

This new operating system is Windows NT: the foundation on which Windows 2000, Windows XP, Windows Vista, Windows 7, Windows 8, and Windows 10 are built, as well as all versions of Windows Server, Windows Phone 7+, Xbox, and HoloLens!

Windows NT was originally designed as a cross-platform system. First, it supported the Intel i860, then MIPS R3000, Intel 80386+, DEC Alpha, and PowerPC. Since then, the Windows NT operating system family has been ported to support IA64 Itanium, x64 and ARM / ARM64 processor architectures, among others.

Windows NT provides a command line interface through the Windows Console terminal application and the Command Prompt command line (cmd.exe). Cmd is designed for maximum compatibility with MS-DOS batch scripts to help businesses migrate to the new platform.

The power of PowerShell

Cmd persists in Windows to this day (and will probably persist for many decades). Because its main purpose is to provide maximum backward compatibility, Cmd is rarely improved. Even “fixing bugs” is often difficult if those “bugs” existed in MS-DOS or earlier versions of Windows!

By the early 2000s, the Cmd shell was already showing its age: Microsoft and its customers urgently needed a more powerful and flexible command line. Out of this need came PowerShell (which grew out of Jeffrey Snover's “Monad Manifesto”).

PowerShell is an object-oriented shell, unlike the file/stream-based shells commonly used in the *NIX world: instead of streams of text, PowerShell processes streams of objects. This lets script authors access and manipulate objects and their properties directly, rather than writing numerous scripts to parse and process text (with sed / grep / awk / lex and the like).

Built on the .NET Framework and Common Language Runtime (CLR), PowerShell's language and syntax is designed to combine the richness of the .NET ecosystem with many common and useful functions from many other shell scripting languages, with an emphasis on scripts to provide maximum consistency and exceptional ... well ... power. :)

To learn more about PowerShell, I recommend reading the book “PowerShell in Action” (Manning Press), written by Bruce Payette, the designer of the PowerShell syntax and language. In particular, the first few chapters contain a detailed explanation of the structure of the language.

PowerShell has been adopted for use by many technologies on the Microsoft platform, including Windows, Exchange Server, SQL Server, Azure, and many others. It provides highly consistent commands for administering and managing virtually all aspects of Windows and / or the environment.

PowerShell Core is open source PowerShell, available for Windows and various versions of Linux, BSD and macOS.

POSIX on NT, Interix, and Services for UNIX

When designing NT, Cutler's team deliberately built the NT kernel and operating system to support multiple subsystems: interfaces between user-mode code and the underlying kernel.

When the first Windows NT version 3.1 came out in 1993, it supported several subsystems: MS-DOS, Windows, OS / 2 and POSIX v1.2. These subsystems allowed applications targeting several operating-system platforms to run on the same machine and base OS without virtualization or emulation, an impressive feat even by today's standards!

The original POSIX implementation in Windows NT was acceptable, but it required significant improvements. Therefore, Microsoft acquired Softway Systems and its POSIX-compatible Interix subsystem for NT. Initially, Interix was delivered as a separate add-on, and then it was combined with several useful utilities and tools and released as Services For Unix (SFU) in Windows Server 2003 R2 and Windows Vista. However, SFU support had to be discontinued after Windows 8, mainly due to its lack of popularity.

And then a funny thing happened ...

Windows 10 - a new era for the command line of Windows!

At the beginning of Windows 10 development, the company opened a UserVoice page asking what features people wanted in various areas of the OS. The developer community was especially loud in demanding two things from Microsoft:

  1. Make significant improvements to the Windows console
  2. Allow users to run Linux tools on Windows

Based on this feedback, Microsoft formed two new teams:

  1. The Windows Console and command-line team, tasked with overhauling the Windows Console and command-line infrastructure
  2. The Windows Subsystem for Linux (WSL) team

The rest, as they say, is history!

Windows Subsystem for Linux (WSL)

GNU / Linux-based “distributions” (combinations of the Linux kernel and user-mode tool collections) are becoming increasingly popular, especially on servers and in the cloud. Although Windows had a POSIX-compatible runtime, SFU could not run many Linux tools and binaries due to additional system calls and differences in behavior compared to traditional Unix / POSIX.

After analyzing feedback from developers and tech-savvy Windows users, and in view of growing demand within Microsoft itself, the company explored several options and ultimately decided to enable genuine, unmodified Linux binaries to run on Windows!

In mid-2014, Microsoft formed a team to develop what would become the Windows Subsystem for Linux (WSL). WSL was first announced at Build 2016, and a preview was soon released on the Windows 10 Insider channel.

Since then, WSL has been updated in most Insider builds and in every major release of the OS since the Anniversary Update in the fall of 2016. Each new version improves the functionality, compatibility, and stability of WSL: the first version was an interesting experiment that could run only a few common Linux programs. With the active help of the community (thanks, everyone!), the developers rapidly refined WSL, so it soon gained many new features and learned to run increasingly complex Linux binaries.

Today (mid-2018), WSL runs most Linux binaries, programs, compilers, linkers, debuggers, and so on. Many developers, IT professionals, DevOps engineers, and others who need to run or build Linux tools, applications, and services have gained a dramatic boost in productivity and can run their favorite Linux tools alongside their favorite Windows tools on one computer, without dual-booting.

The WSL team continues to improve WSL in terms of running Linux tasks, improving performance and integrating with Windows.

Rebooting and overhauling the Windows Console

At the end of 2014, with the Windows Subsystem for Linux (WSL) project in full swing and user interest in the command line surging, it became obvious that the Windows Console was in clear need of an upgrade.

In particular, the console lacked many features taken for granted on modern *NIX-compatible systems, such as the ability to parse and render the ANSI / VT sequences widely used in the *NIX world to output rich, colored text and text-based UIs.

What then is the point of developing WSL if the user cannot use Linux tools correctly?

Below is an example of what the Windows 7 and Windows 10 consoles display: notice that Windows 7 (left) cannot correctly render the VT sequences generated by the Linux programs tmux, htop, Midnight Commander, and cowsay, whereas they look correct in Windows 10 (right):

Comparison of the Windows 7 and Windows 10 console.

So, in 2014, a small “Windows Console group” was formed. It was entrusted with the task of unraveling, understanding and improving the code base of the Windows Console ... which by this time was about 28 years old - more than the programmers who work on this project.

As any developer who has ever had to take over old, messy, poorly maintained code will confirm, improving such code is a difficult task. Harder still is doing so without breaking existing behavior. Updating the most frequently launched program in Windows without disrupting millions of customer scripts, tools, login scripts, build systems, production systems, analysis systems, and more requires a great deal of “care and patience”. ;)

The problem was compounded when the developers realized how strict customers' expectations of the console were. For example, if console performance changed by 1-2% from build to build, alarms went off in the Windows Build team, resulting in ... um ... “swift and direct feedback”, that is, a demand for an immediate fix.

So, when we discuss the console improvements and new features, remember that there are several unshakable principles that every change should comply with, including:

  1. DO NOT introduce new vulnerabilities
  2. DO NOT break tools, scripts, commands, etc. used by existing customers (internal or external)
  3. DO NOT reduce performance or increase memory consumption / IO (without clear and well-communicated reasons)

Over the past three years, the Windows Console team has done the following work:

  • Overhaul of internal components
    • Significant simplification and reduction of the code base
    • Replacing multiple internal collections, lists, stacks, etc. with STL containers
    • Breakdown into modules and isolation of logical and functional code units, which allows to improve functions (and sometimes replace them) without “breaking the world”
  • Combining several previously separate and incompatible console engines into one
  • A LOT of security and reliability improvements
  • The ability to parse and output ANSI / VT sequences, which allows the console to accurately display rich text output from * NIX and other modern command line tools and applications
  • Support 24-bit color instead of the previous 16 colors!
  • Improved accessibility: Narrator and other assistive technologies work in the console window
  • Added / improved mouse and touch support.
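To give a feel for the ANSI / VT sequences and 24-bit color mentioned in this list, here is a minimal Python sketch. It builds the widely used SGR "set foreground color" sequence (ESC [ 38 ; 2 ; R ; G ; B m) around a piece of text; the helper names `fg_rgb` and `colorize` are made up for this example, and whether the colors actually render depends on the terminal having VT processing enabled:

```python
ESC = "\x1b"          # the escape character that introduces ANSI / VT sequences
RESET = f"{ESC}[0m"   # SGR 0: restore default text attributes

def fg_rgb(r: int, g: int, b: int) -> str:
    """Build the SGR sequence that selects a 24-bit foreground color."""
    for component in (r, g, b):
        if not 0 <= component <= 255:
            raise ValueError("color components must be in the range 0-255")
    return f"{ESC}[38;2;{r};{g};{b}m"

def colorize(text: str, r: int, g: int, b: int) -> str:
    """Wrap text in a color sequence followed by a reset."""
    return f"{fg_rgb(r, g, b)}{text}{RESET}"

# A VT-aware console shows this line in orange; others print the raw codes.
print(colorize("Hello from a VT-aware console", 255, 128, 0))
```

The application only emits characters; it is entirely up to the console to recognize the embedded sequences and change how the following text is drawn.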

And the work continues! We are currently completing the implementation of several exciting new features.

What was this history lesson about?

I hope you understand that the command line remains a key component of the Microsoft strategy, platform and ecosystem.

Although for end users Microsoft promoted a graphical interface, the company itself and its technical clients / users / partners relied heavily on the command line to perform many technical tasks.

In fact, Microsoft literally could not create either Windows or any of its software products without a fast, efficient, stable and secure console!

Throughout the eras of MS-DOS, Unix, OS / 2, and Windows, the command line remained perhaps the most important tool in every technical user's toolkit. Even many users who never typed a command into a console window actually use the console every day! For example, building code in Visual Studio (VS) happens in a hidden console window. And when you use the Exchange Server or SQL Server administration tools, many of their commands are executed by PowerShell in a hidden console.

Windows Console Mechanics

When Windows NT development began in 1989, there was neither a graphical interface nor a desktop. There was only a full-screen command line that visually resembled MS-DOS. When the Windows GUI implementation arrived, a console application for the GUI was needed, and thus the Windows Console was born! It is one of the first Windows NT GUI applications and is certainly one of the oldest Windows applications still in general use!

The code base of the Windows console is currently (July 2018) almost 30 years old ... in fact, more than the developers who are working on it now!

What does the console do?

As we learned earlier, the terminal has a relatively simple operation algorithm:

  • Process user input
    • Accept input signal from devices, including keyboard, mouse, touchscreen, etc.
    • Translate input into appropriate ANSI / VT characters and / or sequences
    • Send characters to the connected application / tool / shell
  • Process application output:
    • Accept text output from a connected application / command line tool
    • Update the screen as needed, based on the data received from the application (for example, received text, cursor movement, changing text color, etc.)
  • Process system interactions:
    • Run on request
    • Resource management
    • Resize / maximize window / minimize window, etc.
    • Completion on request or after closing the communication channel
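The input half of this algorithm can be sketched as follows. This is only an illustration: the key names and the `translate_input` helper are hypothetical, while the escape sequences shown are the common VT codes for the cursor keys:

```python
# Hypothetical key names mapped to the characters / VT sequences a terminal sends.
KEY_TO_VT = {
    "Up": "\x1b[A",
    "Down": "\x1b[B",
    "Right": "\x1b[C",
    "Left": "\x1b[D",
    "Enter": "\r",
}

def translate_input(events):
    """Translate a list of key events into the character stream for the application."""
    out = []
    for event in events:
        if event in KEY_TO_VT:
            out.append(KEY_TO_VT[event])   # special key: emit its VT sequence
        elif len(event) == 1:
            out.append(event)              # ordinary key: emit the character itself
        # anything else is ignored in this sketch
    return "".join(out)

# Typing "ls", pressing Enter, then pressing the Up arrow:
stream = translate_input(["l", "s", "Enter", "Up"])
print(repr(stream))  # 'ls\r\x1b[A'
```

Whatever the input device, the connected application only ever sees a stream of characters.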

But the Windows console works a little differently:

Windows Console Mechanics

The Windows Console is a regular Win32 executable. It was originally written in C, but much of the code is now being migrated to C++ as the developers modernize and modularize the code base.

If you are interested in such things: many people have asked whether Windows is written in C or C++. The answer is that, despite NT's object-based design, Windows, like most operating systems, is written almost entirely in C! Why? Because C++ introduces a cost in memory consumption and code-execution overhead. Even today, the hidden costs of executing C++ code can be surprising, but back in the 1990s, when memory cost about $60/MB (yes ... $60 per megabyte!), the hidden memory cost of vtables and the like was significant. In addition, the cost of virtual-method call indirection and object dereferencing could at that time impose very significant performance and scalability penalties on C++ code. Care is still needed today, but the performance cost of C++ on modern computers is far less of a concern. It is often an acceptable trade-off given its benefits in security, readability, and maintainability ... which is why we are gradually rewriting the console code in modern C++!

So what's inside the Windows Console?

Before Windows 7, Windows Console instances were hosted in the critical Client Server Runtime Subsystem (CSRSS)! In Windows 7, for security and reliability reasons, the console was moved out of CSRSS into the following binaries:

  • conhost.exe - the user-mode Windows Console UX and command-line plumbing
  • condrv.sys is a Windows kernel driver that provides communication between conhost and one or more command line shells / tools / applications

The high-level diagram of the current internal console architecture is as follows:

The console core consists of the following components (bottom to top):

  • ConDrv.sys - kernel mode driver
    • Provides a high-performance communication channel between the console and any connected command line applications.
    • Transfer IO Control (IOCTL) messages back and forth between command line applications and the console to which they are "attached"
    • IOCTL console messages contain:
      • Data representing API call requests for a console instance
      • Text sent from console to command line application
  • ConHost.exe - Win32 GUI application:
    • ConHost Core - insides and mechanics
      • API Server: converts IOCTL messages received from command line applications into API calls and sends text records from the console to the command line application
      • API: implements the Win32 console API and logic for all operations that the console may be asked to perform
      • Input buffer: stores keyboard and mouse event records generated by user input
      • VT Parser: if enabled, parses VT sequences, extracts them from the text, and generates equivalent API calls
      • Output buffer: stores the text shown on the console display. In essence, this is a 2D array of CHAR_INFO structures that hold the character data and attributes of each cell (more about the buffer below)
      • Other: the diagram does not include the infrastructure for storing / retrieving values from the registry and / or shortcut files, etc.
    • Console UX App Services - UX and UI Layer
      • Controls the layout, size, position and other characteristics of the console window on the screen
      • Displays and processes UI parameters, etc.
      • Pumps the Windows message queue, processes the messages, and converts user input into key and mouse event records, storing them in the input buffer
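The roles of the VT parser and the output buffer described above can be illustrated with a toy model in Python. This is not how conhost is implemented: the `Cell` class is only a simplified stand-in for a CHAR_INFO cell, the "calls" are plain tuples invented for this sketch, and the attribute value is just the last SGR parameter rather than a real Win32 attribute flag:

```python
import re
from dataclasses import dataclass

# Matches CSI sequences such as ESC[31m (parameters, then a final letter).
CSI = re.compile(r"\x1b\[([0-9;]*)([A-Za-z])")

@dataclass
class Cell:
    """Simplified stand-in for a CHAR_INFO cell: one character plus attributes."""
    char: str = " "
    attributes: int = 0

def parse_vt(text):
    """Split text into ('write', str) and ('csi', params, final) calls."""
    calls, pos = [], 0
    for match in CSI.finditer(text):
        if match.start() > pos:
            calls.append(("write", text[pos:match.start()]))
        params = [int(p) for p in match.group(1).split(";") if p]
        calls.append(("csi", params, match.group(2)))
        pos = match.end()
    if pos < len(text):
        calls.append(("write", text[pos:]))
    return calls

def render(calls, width=10, height=2):
    """Apply the calls to a 2D buffer of cells, starting at the top-left corner."""
    buffer = [[Cell() for _ in range(width)] for _ in range(height)]
    row = col = attr = 0
    for call in calls:
        if call[0] == "csi":
            _, params, final = call
            if final == "m":               # SGR: remember a toy attribute value
                attr = params[-1] if params else 0
            continue
        for ch in call[1]:
            if ch == "\n":
                row, col = row + 1, 0
                continue
            if row < height and col < width:
                buffer[row][col] = Cell(ch, attr)
            col += 1
    return buffer

calls = parse_vt("\x1b[31mHi\x1b[0m!")
screen = render(calls)
print(calls)
print(screen[0][:3])
```

The key idea is that escape sequences never reach the buffer as text: they are extracted and turned into operations, and only the printable characters, tagged with the current attributes, end up in the cells.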

Windows Console API

As can be seen from the architecture diagram, unlike the NIX terminals, the console sends / receives API calls and / or data as IO Control messages (IOCTL) , not text! Even embedded ANSI / VT sequences from command line applications (mostly Linux) are extracted, parsed and converted to API calls!

This distinction reveals the key philosophical difference between *NIX and Windows: in *NIX “everything is a file”, and in Windows “everything is an object”!

Both approaches have advantages and disadvantages, which we will outline without going into detail. Just remember that this key difference in philosophy explains many of the fundamental differences between Windows and *NIX!

In *NIX, everything is a file

When Unix first appeared in the late 1960s and early 1970s, one of its basic principles was that (where possible) everything should be abstracted as a file stream. One of the key goals was to simplify the code needed to access devices and peripherals: if every device is presented to the OS as a file, then existing code can access it more easily.

This philosophy runs deep: under *NIX you can even explore and query much of the OS and machine configuration by navigating pseudo / virtual file systems that expose what look like “files” and folders but actually represent the configuration of the machine and its hardware.

For example, in Linux, you can explore the properties of the processors by examining the contents of the pseudo-file /proc/cpuinfo.

But the simplicity and consistency of this model can be costly: extracting / analyzing specific information in pseudo-files often requires special tools such as sed, awk, perl, python, etc. These tools are used to write commands and scripts for parsing text content, searching for specific patterns, fields, and values. Some of the scripts can be quite complex, often difficult to maintain and fragile - if the structure, pattern and / or text format changes, many scripts will probably need to be updated.
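
As a rough illustration of this kind of text scraping, the Python sketch below parses "key : value" blocks in the general style of /proc/cpuinfo. The sample text is invented for the example (real field names and layout vary by architecture and kernel), and the function name parse_cpuinfo is hypothetical.

```python
# Invented sample in the general "key<TAB>: value" style; not real machine output.
SAMPLE = """\
processor\t: 0
model name\t: Example CPU @ 2.00GHz
cpu cores\t: 2

processor\t: 1
model name\t: Example CPU @ 2.00GHz
cpu cores\t: 2
"""


def parse_cpuinfo(text: str):
    """Parse blank-line-separated "key : value" blocks into a list of dicts."""
    cpus = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current processor block.
            if current:
                cpus.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        cpus.append(current)
    return cpus


cpus = parse_cpuinfo(SAMPLE)
print(len(cpus), cpus[0]["model name"])
```

Note how much the parser assumes about the format: a colon separator, blank lines between blocks, and stable key names. If any of these change, the script silently returns different data, which is exactly the fragility described above.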

In Windows, everything is an object

When Windows NT was being designed, “objects” were seen as the future of software development: “object-oriented” programming languages were springing up everywhere. Simula and Smalltalk had already established themselves, and C++ was gaining popularity. They were followed by other object-oriented languages, including Python, Eiffel, Objective-C, ObjectPascal / Delphi, Java, C#, and many others.

The result was predictable. Created in those heady, object-oriented days (circa 1989), Windows NT was designed around the philosophy that “everything is an object”. In fact, one of the most important parts of the NT kernel is the Object Manager!

Windows NT provides a rich set of Win32 APIs for retrieving and / or manipulating OS objects. Developers use the Win32 API to collect and present information similar to what *NIX pseudo-files and tools provide, only through objects and structures. And since parsers, compilers and analyzers understand the structure of objects, many coding errors are caught early, which helps to verify that the programmer's intent is syntactically and logically correct. Over time, this can mean less breakage, less volatility, and less churn.

So, going back to the main discussion about the Windows console: the NT team decided to build a “console” that differs from the traditional *NIX terminal in several key areas:

  • Console API: instead of relying on programmers' ability to generate correct ANSI / VT sequences, which are difficult to verify, the Windows console is controlled through a rich Console API
  • Common services: to avoid duplicating services in every command line shell (for example, command history, command aliases), the console itself provides some of them through the Console API

Windows console issues

Although console APIs have become very popular in command-line tools and services, the API-oriented model has certain drawbacks listed below.

Windows only

Many command line tools and applications make extensive use of the Console API.

What is the problem? They work only under Windows.

Thus, combined with other differences (for example, in life cycle), Windows command line applications are not always easy to port to *NIX and vice versa.

Because of this, the Windows ecosystem has its own command line tools and applications, often similar to but usually different from their *NIX counterparts. This means users have to learn one set of command line applications, tools, shells, scripting languages, etc. when using Windows, and another set when using *NIX.

This problem has no simple solution: the Windows console and command line cannot be simply thrown away and replaced with bash and iTerm2 - there are hundreds of millions of applications and scripts that depend on the Windows console and Cmd / PowerShell shells.

Third-party tools such as Cygwin do a good job of porting many of the major GNU tools and compatibility libraries to Windows, but they cannot run unported, unmodified Linux binaries. This matters a great deal, because many Ruby, Python and Node packages and modules depend on Linux binaries and / or on *NIX behavior.

These reasons led Microsoft to extend Windows' compatibility by allowing genuine, unmodified Linux binaries and tools to run on the Windows Subsystem for Linux (WSL). With WSL, users can now download and install one or more Linux distributions side by side on the same machine, and use apt / zypper / npm / gem / etc. to install and run the vast majority of Linux command line tools alongside their favorite Windows applications and tools.

However, the native console still offers functionality that third-party terminals lack: in particular, the Windows Console provides command-history and command-alias services so that each shell does not have to re-implement the same functionality.

Difficulties with remote work

As mentioned at the beginning of the article, terminals were originally separate from the computer to which they were connected. That design survives today: most modern terminals and command line shells / applications run isolated in separate processes and / or on separate machines.

On *NIX-based platforms, the separation of terminals from command line applications, which simply exchange characters, made it easy to access and operate a machine from a remote computer / device. As long as the terminal and the command line application exchange streams of characters over some kind of ordered infrastructure (TTY, PTY, etc.), working remotely is very easy.
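
The "exchange of characters" model can be sketched with two processes connected by ordinary pipes. In this hypothetical Python example the parent plays the role of the terminal and the child plays the role of a command line application; the pipes stand in for a TTY, PTY or network connection, and the names CHILD_CODE and run_over_stream are invented for the illustration.

```python
import subprocess
import sys

# The child stands in for a command line application: it reads characters
# from stdin and writes characters to stdout, knowing nothing about what
# is on the other end of the stream.
CHILD_CODE = "import sys\nfor line in sys.stdin:\n    sys.stdout.write(line.upper())\n"


def run_over_stream(text: str) -> str:
    """Send text to the child over a pipe and return the text it sends back."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


reply = run_over_stream("hello\nworld\n")
print(reply, end="")
```

Because only characters cross the boundary, the same exchange could in principle travel over a serial cable or a remote connection without the application noticing, which is what makes remote work so natural in this model.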

But in Windows, many command line applications depend on calling the Console API and assume that they run on the same machine as the console itself. This makes remote operation difficult. How can a command line application running remotely call the API of a console on the local computer? Worse, how can the remote application call the Console API if it is being accessed through a terminal on a Mac or Linux machine?!

Sorry to tease you, but we will return to this topic in more detail in the next article.

Running a console ... or not!

When a *NIX user wants to run a command line tool, they first start a terminal. The terminal then starts the default shell, or can be configured to launch a specific application / tool. The terminal and the command line application interact by exchanging streams of characters through a pseudo-TTY (PTY).

However, in Windows everything works differently: Windows users never launch the console (conhost.exe) - they immediately launch command line shells and applications, for example, Cmd.exe, PowerShell.exe, wsl.exe and so on. Windows connects the running application to the current console (if it is started from the command line) or to the newly created console instance.


Yes, in Windows users run the command line application, not the console itself.

If a user starts a command-line application from an existing command-line shell, Windows typically attaches the newly launched .exe to the current console. Otherwise, Windows will launch a new console instance to which the application just started will be attached.

A little pedantry: many people say that "command line applications run in the console." This is not true and leads to a lot of confusion about how the console and command line applications actually work! Command line applications and their consoles run in independent Win32 processes. Please help correct this misconception: always say that "command line tools / applications run connected to a console." Thanks!

Sounds good, right? Not really. There are some problems:

  1. The console and the command line application interact via IOCTL messages through the driver, not through text streams.
  2. Windows mandates that ConHost.exe is the console application connected to command line applications.
  3. Windows controls the creation of the "pipes" through which the console and the command line application interact.

These are significant limitations. What if you want to create an alternative console application for Windows? How to send keyboard / mouse events and other user actions to the command line application if you cannot access the channels connecting the console with this application?

Alas, the situation here is not very good. There are excellent third-party consoles and server applications for Windows (for example, ConEmu / Cmder, Console2 / ConsoleZ, Hyper, Visual Studio Code, OpenSSH, etc.), but they have to resort to sophisticated tricks to work like a regular console!

For example, third-party consoles have to launch the command line application off-screen, say, at coordinates (-32000, -32000). They then send keystrokes to the off-screen console, scrape its text content from the screen, and redraw it in their own user interface!

Crazy, right?! The fact that these applications work at all is a testament to the ingenuity and dedication of their creators.

Obviously, we are trying to correct this situation. This will also be discussed in the following articles.

Windows Console and VT

As described above, the Windows console provides a rich API. Using the Console API, command line applications and tools write text, change its color, move the cursor, and so on. And thanks to these APIs, there was no need to support ANSI / VT sequences, which provide similar functionality on other platforms.
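
For readers who have not seen them, ANSI / VT sequences are just short runs of characters embedded in the output stream. The Python sketch below builds two common ones, Select Graphic Rendition (color) and cursor position; the helper names sgr, move_cursor and colored are hypothetical. The result is printed with repr() so that the sequences are visible instead of being interpreted by whatever terminal runs the example.

```python
ESC = "\x1b"


def sgr(*codes: int) -> str:
    """Build a Select Graphic Rendition sequence, e.g. codes (1, 32) give ESC[1;32m."""
    return f"{ESC}[{';'.join(str(c) for c in codes)}m"


def move_cursor(row: int, col: int) -> str:
    """Build a cursor position sequence (1-based row and column)."""
    return f"{ESC}[{row};{col}H"


def colored(text: str, color_code: int) -> str:
    """Wrap text in a color sequence followed by a reset (SGR 0)."""
    return f"{sgr(color_code)}{text}{sgr(0)}"


# 31 selects a red foreground and 32 a green one in the standard SGR set.
message = move_cursor(1, 1) + colored("error", 31) + " and " + colored("ok", 32)
print(repr(message))
```

Nothing checks these strings for correctness before they are sent: a mistyped parameter is simply passed along as text, which is the verification problem the API-based design set out to avoid.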

In fact, before Windows 10, the Windows console implemented only minimal support for ANSI / VT sequences.

Everything changed in 2014, when Microsoft formed a new Windows Console development team. One of its top priorities was to implement comprehensive support for ANSI / VT sequences in order to render the output of *NIX applications running on the Windows Subsystem for Linux (WSL) and on remote *NIX machines.

The team quickly added comprehensive support for ANSI / VT sequences to the Windows 10 console, allowing users to enjoy a vast array of Windows and Linux command line tools and applications.

The team continues to improve VT support with every OS release and will be grateful for any issues that you report in our tracker on GitHub. ;)

Unicode processing

Unfortunately, the Windows console and its API appeared before the invention of Unicode!

The Windows console stores text (which is subsequently displayed on the screen) as UCS-2 characters, two bytes per character. This encoding covers the first 65,536 code points, known as Plane 0 or the Basic Multilingual Plane (BMP).

Command line applications display text in the console using the Console API. Text-processing interfaces come in two flavors: functions with the suffix A handle single-byte-per-character strings, and functions with the suffix W handle two-byte-per-character (wchar) strings.

For example, the WriteConsoleOutputCharacter function resolves to WriteConsoleOutputCharacterA() in ASCII projects or to WriteConsoleOutputCharacterW() in Unicode projects. Code can also call the ...A or ...W variant directly if it needs to handle a particular string type.

Note: every W API supports UCS-2, because that was all that existed when the A / W split was made, and we thought that would be enough. But many W APIs have since been updated to also support UTF-16 on the same channel.

Not all W APIs understand UTF-16, but they all understand at least UCS-2.

In addition, the console does not support some newer Unicode features, including zero-width joiners, which are used to combine individual characters in Arabic and Indic scripts, as well as to combine emoji into a single visual glyph.

So how do you enter ninja cat emoji or complex multi-byte Chinese / Arabic characters into the console? Unfortunately, you can't!

Not only does the Console API not support Unicode characters that need more than two bytes per glyph (the NinjaCat emoji requires 8 bytes!), but the console's internal UCS-2 buffer also cannot store the additional bytes. Worse, the current GDI-based console renderer could not render the glyph even if it did fit in the buffer!
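
The storage limit is easy to see by counting 16-bit code units. The Python sketch below is only an illustration of the encoding arithmetic, not of the console's code: the helper names utf16_code_units and fits_in_ucs2 are hypothetical, and the three sample characters (a Latin letter, a CJK ideograph and a single cat face emoji) were chosen for the example.

```python
def utf16_code_units(text: str) -> int:
    """Number of 16-bit code units needed to encode text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def fits_in_ucs2(text: str) -> bool:
    """True if every character lies in the Basic Multilingual Plane (up to U+FFFF)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


# Latin capital A, a CJK ideograph, and the cat face emoji (outside the BMP).
for sample in ["A", "\u4e2d", "\U0001F431"]:
    print(
        f"U+{ord(sample):04X}",
        "units:", utf16_code_units(sample),
        "fits in UCS-2:", fits_in_ucs2(sample),
    )
```

A character outside the BMP needs two code units (a surrogate pair) in UTF-16, so a buffer cell designed to hold exactly one 16-bit value has no room for it.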

Eh ... These are the joys of legacy code.

Here again, I intend to interrupt the story - we will return to this topic in the next article. Stay with us!

So where do we stand?

Dear reader, if you have read everything above, thank you and congratulations: you now know more about the Windows console than most of your friends, and perhaps even more than you wanted to know! Lucky you!

We have covered a lot in this article:

  • The main building blocks of the Windows console:
    • Condrv.sys - communication driver
    • ConHost.exe - console UX and mechanics:
      • API Server - serializes API calls and text data using IOCTL messages sent to / from the driver
      • API - console functionality
      • Buffers - an input buffer that stores user input and an output buffer that stores output / display text
      • VT parser - converts ANSI / VT sequences from a text stream to API calls
      • Console UX - the state of the console UI, settings, functions
      • Other - miscellaneous system and security infrastructure, and so on
  • What does the console do
    • Sends user input to a connected command line application.
    • Receives and displays output from a connected command line application.
  • How the console differs from *NIX terminals
    • *NIX: “Everything is a file / text stream”
    • Windows: “Everything is an object accessible via API”
  • Console problems
    • Console and command line applications interact via API call requests and text serialized in IOCTL Messages
    • Console API can be called only by Windows command line applications.
      • More difficult to port applications to / from Windows
    • Applications interact with the console through the Windows API
      • Makes remote interaction with Windows command line applications and tools difficult
    • Dependence on IOCTL breaks the terminal's “exchange of characters” model
      • Complicates the operation of remote command line tools from non-Windows machines
    • Running Windows command line applications is “unusual.”
      • Only ConHost.exe can be attached to command line applications.
      • Third-party terminals are forced to create an off-screen console, send characters there and scrape its screen
    • Historically, Windows did not understand ANSI / VT sequences.
      • Mostly fixed in Windows 10
    • Console has limited Unicode support and currently has storage and rendering issues for UTF-8 and glyphs that need zero-width joiners

In the next few articles in this series, we will examine the console in more detail and discuss the solution to these problems ... and not only!

As always, stay tuned.
