Buffers and binary I/O in Perl 6

Original author: Moritz
In Perl 5, starting with version 5.8, Unicode support is well implemented, but people still complain that it is hard to use. The main reason is that the programmer has to keep track of which strings have been decoded and which should be treated as binary data. There is no reliable way to inspect a variable and tell whether it holds a binary string or text.

In Perl 6, this problem is solved by introducing separate types. Str stores text, and string literals in Perl 6 are of type Str. Binary data is stored in Buf objects, so the two can no longer be confused. Conversion between them is done with the encode and decode methods.

    my $buf = Buf.new(0x6d, 0xc3, 0xb8, 0xc3, 0xbe, 0x0a);
    $*OUT.write($buf);               # write the raw bytes
    my $str = $buf.decode('UTF-8');  # bytes to text
    print $str;                      # write the text

Both operations have the same effect: they write “møþ” followed by a line feed to standard output. Buf.new(...) takes a list of integers from 0 to 255, the bytes from which the new buffer is built. $*OUT.write($buf) writes the buffer $buf to standard output.

$buf.decode('UTF-8') decodes the buffer and returns a Str object (or dies if the buffer does not contain valid UTF-8). The reverse operation is $str.encode($encoding). A Str can be output simply with print.
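A minimal sketch of the round trip described above, assuming a reasonably recent Rakudo compiler. The variable names are illustrative only, and the text of the error raised for invalid input is whatever the compiler reports.

    my $text  = "møþ\n";
    my $bytes = $text.encode('UTF-8');    # Str to binary data
    say $bytes.list;                      # the individual byte values
    my $back  = $bytes.decode('UTF-8');   # binary data back to Str
    say $back eq $text;

    # 0xff never occurs in valid UTF-8, so decoding this buffer fails.
    my $bad     = Buf.new(0x6d, 0xff);
    my $decoded = try $bad.decode('UTF-8');
    say $decoded.defined ?? "decoded" !! "decode failed: {$!.message}";

Wrapping decode in try is one way to handle untrusted input without letting the program die.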

Naturally, print also has to convert the string to a binary representation along the way. For this (and other similar operations), the default encoding is UTF-8. The Perl 6 specification says that the user can change the default, but for now compilers do not support this.

For reading, you can use the methods .read($no-of-bytes), which returns a Buf, or .get, which returns a Str. The read and write methods are available not only on files and streams, but also on sockets.
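The two reading styles can be sketched as follows, assuming a recent Rakudo. The scratch file name buf-demo.txt is hypothetical and exists only for this illustration. The file is opened twice so that text reads and binary reads are not mixed on one handle.

    # Hypothetical scratch file, created only for this demonstration.
    my $path = $*TMPDIR.add('buf-demo.txt');
    spurt $path, "møþ\nsecond line\n";   # written as UTF-8 by default

    my $text-fh = open $path, :r;
    my $line    = $text-fh.get;          # a Str, without the line ending
    $text-fh.close;

    my $bin-fh = open $path, :bin;
    my $head   = $bin-fh.read(3);        # a Buf holding the first three bytes
    $bin-fh.close;

    say $line.WHAT;
    say $head.WHAT;
    say $head.list;
    unlink $path;

Note that three bytes cover only “mø” here, because ø takes two bytes in UTF-8.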

In Perl 5, you can make a very unpleasant mistake by combining text and binary strings through concatenation or some other means (join, string interpolation). The result is a “broken” string, but only when it contains bytes above 127. Such bugs are extremely hard to track down.

In Perl 6, the same mistake simply gives you the error “Cannot use a Buf as a string”.
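A small sketch of that failure and its fix, assuming a recent Rakudo. The exact wording of the message may differ between compiler versions, so the example prints whatever the compiler reports instead of relying on a fixed text.

    my $buf = Buf.new(0x6d, 0xc3, 0xb8);
    my $str = "text";

    # Concatenating binary data with text is refused instead of guessed at.
    my $mixed = try $buf ~ $str;
    say $mixed.defined ?? "unexpectedly joined" !! "error: {$!.message}";

    # Decoding explicitly first states the intent and works.
    say $buf.decode('UTF-8') ~ $str;

The point of the design is that the conversion always has to be spelled out, so the encoding is never chosen silently.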
