A little story about the `yes` Unix command

Original author: Matthias Endler
What is the simplest Unix command you know? There is echo, which prints a string to stdout, and there is true, which does nothing but exit with a zero status code.

Hidden among the many simple Unix commands is yes. If you run it without arguments, you get an endless stream of "y" characters, each on its own line:

y
y
y
y
(...you get the idea)

At first glance the command seems pointless, but it is sometimes useful:

yes | sh boring_installation.sh

Ever installed a program that requires you to type "y" and press Enter to continue? The yes command comes to the rescue! It will dutifully handle this chore, so you can keep watching Pootie Tang without interruption.

Writing yes


Here is a basic version in... hmm... BASIC.

10 PRINT "y"
20 GOTO 10

And here is the same thing in Python:

while True:
    print("y")

Seems simple? Wait a minute!

As it turns out, such a program runs quite slowly.

python yes.py | pv -r > /dev/null
[4.17MiB/s]

Compare that with the built-in version on my Mac:

yes | pv -r > /dev/null
[34.2MiB/s]

So I tried to write a faster version in Rust. Here is my first attempt:

use std::env;
fn main() {
  let expletive = env::args().nth(1).unwrap_or("y".into());
  loop {
    println!("{}", expletive);
  }
}

Some explanations:

  • The string we print in the loop is the first command line argument, which is called expletive. I learned this word from the yes man page.
  • I use unwrap_or to get the expletive from the arguments. If no argument is given, the default is "y".
  • The default value is converted from a string slice (&str) into an owned, heap-allocated String with into().
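
The argument handling described above can be isolated into a small function. The following is a minimal sketch: the name expletive_or_default is hypothetical and not part of the original program, and it accepts any iterator of strings so that it can be tried without real command line arguments.

```rust
/// Returns the first argument after the program name, or "y" if there is none.
/// `expletive_or_default` is a hypothetical helper name used for illustration.
fn expletive_or_default<I: Iterator<Item = String>>(mut args: I) -> String {
    // nth(1) skips the program name (index 0) and yields an Option<String>.
    // "y".into() turns the &str literal into an owned String, so the default
    // has the same type as the argument taken from the iterator.
    args.nth(1).unwrap_or("y".into())
}
```

In a real program it would be called as expletive_or_default(std::env::args()).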

Let's test it.

cargo run --release | pv -r > /dev/null
   Compiling yes v0.1.0
    Finished release [optimized] target(s) in 1.0 secs
     Running `target/release/yes`
[2.35MiB/s]

Oops, nothing really improved. It is even slower than the Python version! This piqued my interest, so I looked up the source code of a C implementation.

Here is the very first version of the program, released with Version 7 Unix and written by Ken Thompson on January 10, 1979:

main(argc, argv)
char **argv;
{
  for (;;)
    printf("%s\n", argc>1? argv[1]: "y");
}

No magic.

Compare that with the 128-line version from GNU coreutils, which is mirrored on GitHub. After 25 years, the program is still under active development! The last code change happened about a year ago. It is pretty fast:

# brew install coreutils
gyes | pv -r > /dev/null 
[854MiB/s]

The important part is at the end:

/* Repeatedly output the buffer until there is a write error; then fail.  */
while (full_write (STDOUT_FILENO, buf, bufused) == bufused)
  continue;

Aha! So it simply uses a buffer to make write operations faster. The buffer size is defined by the constant BUFSIZ, which is chosen on each system to make I/O efficient (see here). On my system it was defined as 1024 bytes. In practice, I got the best performance with 8192 bytes.
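
The idea behind that loop can be sketched in Rust: fill a buffer with as many whole copies of the line as fit, then hand the entire buffer to the writer in a single call. This is only an illustration, not a port of the coreutils code. The names build_buffer and write_n_times are hypothetical, and the write loop is bounded so that the example terminates.

```rust
use std::io::{self, Write};

/// Fills a buffer with as many whole copies of `line` as fit into `capacity`
/// bytes. If a single copy does not fit, the buffer holds exactly one copy.
fn build_buffer(line: &[u8], capacity: usize) -> Vec<u8> {
    // Guard against division by zero for an empty line.
    if line.is_empty() {
        return Vec::new();
    }
    let copies = std::cmp::max(1, capacity / line.len());
    line.repeat(copies)
}

/// Writes the whole buffer `rounds` times and then flushes.
/// A real `yes` would instead loop until a write fails.
fn write_n_times<W: Write>(mut out: W, buf: &[u8], rounds: usize) -> io::Result<()> {
    for _ in 0..rounds {
        out.write_all(buf)?;
    }
    out.flush()
}
```

With a capacity of 8192 bytes and the line "y\n", the buffer holds 4096 copies, so one write_all call does the work of 4096 separate prints.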

I extended my Rust program:

use std::env;
use std::io::{self, BufWriter, Write};
const BUFSIZE: usize = 8192;
fn main() {
  let expletive = env::args().nth(1).unwrap_or("y".into());
  let mut writer = BufWriter::with_capacity(BUFSIZE, io::stdout());
  loop {
    writeln!(writer, "{}", expletive).unwrap();
  }
}

The important part is that the buffer size is a multiple of four, which ensures memory alignment.

This program reaches 51.3 MiB/s. That is faster than the version installed on my system, but still much slower than the result in a post I found on Reddit, whose author reports 10.2 GiB/s.

Addendum


As usual, the Rust community did not disappoint. As soon as this article hit the Rust subreddit, user nwydo pointed to a previous discussion on the same topic. Here is their optimized code, which breaks the 3 GB/s mark on my machine:

use std::env;
use std::io::{self, Write};
use std::process;
use std::borrow::Cow;
use std::ffi::OsString;
pub const BUFFER_CAPACITY: usize = 64 * 1024;
pub fn to_bytes(os_str: OsString) -> Vec<u8> {
  use std::os::unix::ffi::OsStringExt;
  os_str.into_vec()
}
fn fill_up_buffer<'a>(buffer: &'a mut [u8], output: &'a [u8]) -> &'a [u8] {
  if output.len() > buffer.len() / 2 {
    return output;
  }
  let mut buffer_size = output.len();
  buffer[..buffer_size].clone_from_slice(output);
  while buffer_size < buffer.len() / 2 {
    let (left, right) = buffer.split_at_mut(buffer_size);
    right[..buffer_size].clone_from_slice(left);
    buffer_size *= 2;
  }
  &buffer[..buffer_size]
}
fn write(output: &[u8]) {
  let stdout = io::stdout();
  let mut locked = stdout.lock();
  let mut buffer = [0u8; BUFFER_CAPACITY];
  let filled = fill_up_buffer(&mut buffer, output);
  while locked.write_all(filled).is_ok() {}
}
fn main() {
  write(&env::args_os().nth(1).map(to_bytes).map_or(
    Cow::Borrowed(
      &b"y\n"[..],
    ),
    |mut arg| {
      arg.push(b'\n');
      Cow::Owned(arg)
    },
  ));
  process::exit(1);
}

Now that is a completely different story!


The only thing I could contribute was removing an unnecessary mut.
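
The key trick in fill_up_buffer is that each pass copies the already filled part into the space right behind it, so the filled region doubles on every iteration. The sketch below reproduces only that doubling logic on a Vec so the resulting size can be inspected. The name doubled_fill is hypothetical and not part of the code above.

```rust
/// Repeats `pattern` by doubling until at least half of `capacity` is filled.
/// Uses the same loop condition as `fill_up_buffer`, but returns an owned Vec.
fn doubled_fill(pattern: &[u8], capacity: usize) -> Vec<u8> {
    let mut buf = pattern.to_vec();
    // An empty pattern would never grow, so return it unchanged.
    if buf.is_empty() {
        return buf;
    }
    while buf.len() < capacity / 2 {
        // Append a copy of everything written so far, doubling the length.
        buf.extend_from_within(..);
    }
    buf
}
```

For the default "y\n" and the 64 KiB capacity used above, the loop doubles 14 times and ends with 32768 bytes, so each write_all call emits 16384 lines at once.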

Lessons learned


The seemingly trivial yes program turned out to be not so simple after all. To improve performance, it uses output buffering and memory alignment.

Re-implementing standard Unix tools is fun, and it makes you appreciate the nifty tricks that make our computers fast.
