Rainbow CSV plugin as an alternative to Excel

Hi, Habr! This article is about the Rainbow CSV plugin, which I wrote for 5 text editors:


VS Code, Vim, Sublime Text 3, Atom, Gedit


I think many readers of this article periodically encounter CSV (comma-separated), TSV (tab-separated) and similar files. If you open one in a text editor (how else would you find out what is inside?), you get the nondescript picture shown on the left side of the image: it is hard to say even how many columns the table has. The right side of the image shows the same file with Rainbow CSV enabled, where syntax highlighting makes it noticeably more readable.


[Image: the same CSV file without Rainbow CSV (left) and with Rainbow CSV highlighting (right)]


Oddly enough, the syntax for this highlighting is defined by a single (albeit long) regular expression:


((?:"(?:[^"]*"")*[^"]*"(?:,|$))|(?:[^,]*(?:,|$)))?((?:"(?:[^"]*"")*[^"]*"(?:,|$))|(?:[^,]*(?:,|$)))?((?:"(?:[^"]*"")*[^"]*"(?:,|$))|(?:[^,]*(?:,|$)))?((?:"(?:[^"]*"")*[^"]*"(?:,|$))|(?:[^,]*(?:,|$)))?((?:"(?:[^"]*"")*[^"]*"(?:,|$))|(?:[^,]*(?:,|$)))?((?:"(?:[^"]*"")*[^"]*"(?:,|$))|(?:[^,]*(?:,|$)))?((?:"(?:[^"]*"")*[^"]*"(?:,|$))|(?:[^,]*(?:,|$)))?((?:"(?:[^"]*"")*[^"]*"(?:,|$))|(?:[^,]*(?:,|$)))?((?:"(?:[^"]*"")*[^"]*"(?:,|$))|(?:[^,]*(?:,|$)))?((?:"(?:[^"]*"")*[^"]*"(?:,|$))|(?:[^,]*(?:,|$)))?

The complete highlighting rule can be found, for example, here (the VS Code version), but apart from the regular expression itself there is nothing much to look at.


For comparison, syntax files for general-purpose languages such as Python, JS, and C++ usually take several hundred lines of rather esoteric code.


To avoid overloading the article with details, working out what the main parts of this regular expression are and how it works is left to the reader.


Hint: the simpler expression ([^,]*,)?([^,]*,)? highlights a CSV file in 2 alternating colors, but it mishandles commas inside fields that are escaped with quotes.
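To make the hint concrete, here is a minimal Python sketch that applies a single field pattern repeatedly to one line. It uses Python's `re` module rather than the editor's grammar engine, and the helper name `split_cells` is hypothetical, introduced only for illustration. `SIMPLE` is one unit of the hint expression (with `(?:,|$)` in place of the bare comma so the last field is matched too), and `QUOTED` is one repeating unit of the long expression above.

```python
import re

# One unit of the hint expression; "(?:,|$)" lets the last field match too.
SIMPLE = re.compile(r'[^,]*(?:,|$)')

# One repeating unit of the long expression: a quoted field (inner quotes
# doubled) is tried first, then a plain field without commas.
QUOTED = re.compile(r'(?:"(?:[^"]*"")*[^"]*"(?:,|$))|(?:[^,]*(?:,|$))')


def split_cells(line, pattern):
    """Cut a line into consecutive matches of pattern (hypothetical helper)."""
    cells, pos = [], 0
    while pos < len(line):
        match = pattern.match(line, pos)
        cells.append(match.group())
        pos = match.end()
    return cells


line = 'id,"Smith, John",42'
print(split_cells(line, SIMPLE))  # the quoted comma splits the name in two
print(split_cells(line, QUOTED))  # the quoted field stays in one piece
```

In the real grammar each repeated unit is a separate capture group, which is presumably how every column ends up with its own color.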


By the way, the rest of this article describes the Visual Studio Code version of Rainbow CSV, since it is currently the most technically advanced and popular version of the plugin (more than 500K downloads).


So, besides the fact that Rainbow CSV highlights columns, it can also:


  • Tell which column the cursor is currently in: the column number plus its title from the header in the first line. If the file has no header line (the data starts immediately), the user can specify a “virtual” header.
  • Automatically check the file for lines with a differing number of fields or incorrect use of escape characters: “CSV Lint”.
  • Run an SQL-like query using the RBQL interpreter built into the plugin, which lets you apply a very wide class of text transformations to the input table.
    RBQL supports most of the common SQL keywords (SELECT, UPDATE, WHERE, ORDER BY, TOP / LIMIT, JOIN, GROUP BY) as well as all standard functions and operators from JavaScript and Python.
    RBQL is a separate technology, but it fits the concept of Rainbow CSV very well, so this integration has many advantages.
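As an illustration of what a check like CSV Lint might look for, here is a minimal Python sketch. It is not the plugin's actual implementation: the function names `split_quoted` and `csv_lint` and the warning texts are hypothetical. It flags lines whose quoting is malformed and lines whose field count differs from the first well-formed line.

```python
def split_quoted(line, delim=","):
    """Split one line into fields; return None if the quoting is malformed."""
    fields, i, n = [], 0, len(line)
    while True:
        if i < n and line[i] == '"':
            # Quoted field: find the closing quote, skipping doubled quotes.
            j = i + 1
            while True:
                j = line.find('"', j)
                if j == -1:
                    return None  # no closing quote
                if j + 1 < n and line[j + 1] == '"':
                    j += 2  # "" is an escaped quote, keep scanning
                    continue
                break
            fields.append(line[i + 1:j].replace('""', '"'))
            j += 1
            if j == n:
                return fields
            if line[j] != delim:
                return None  # text right after the closing quote
            i = j + 1
        else:
            # Plain field: runs up to the next delimiter or the end of line.
            j = line.find(delim, i)
            field = line[i:] if j == -1 else line[i:j]
            if '"' in field:
                return None  # stray quote inside an unquoted field
            fields.append(field)
            if j == -1:
                return fields
            i = j + 1


def csv_lint(lines, delim=","):
    """Return (line_number, message) pairs for suspicious lines."""
    warnings, expected = [], None
    for num, line in enumerate(lines, start=1):
        fields = split_quoted(line, delim)
        if fields is None:
            warnings.append((num, "malformed quoting"))
        elif expected is None:
            expected = len(fields)
        elif len(fields) != expected:
            warnings.append((num, f"expected {expected} fields, found {len(fields)}"))
    return warnings


sample = ['id,name', '1,"Smith, John"', '2,Ann,extra', '3,"broken']
print(csv_lint(sample))
```

The sketch works line by line, so a quoted field that spans several lines would be reported as malformed.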

One of the most important features of the Rainbow CSV plugins is automatic detection of CSV files by their content. This is essential because CSV (or TSV) files often have an extension other than .csv (.tsv). There are also files with the .csv extension that actually use a semicolon (;) as the separator. The content-based detection algorithm is very simple: it is enough to check that splitting each line by the candidate separator gives the same number of cells on every line, and that this number is greater than 1.
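The check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the plugin's code: the function names and the list of candidate separators (comma, tab, semicolon) are assumptions, and the split deliberately ignores quoting.

```python
def looks_like_table(lines, sep):
    """True if every line splits into the same number of cells, more than 1."""
    counts = {len(line.split(sep)) for line in lines}
    return len(counts) == 1 and counts.pop() > 1


def detect_separator(lines, candidates=(",", "\t", ";")):
    """Return the first candidate separator that fits all lines, else None."""
    for sep in candidates:
        if looks_like_table(lines, sep):
            return sep
    return None


print(detect_separator(["a;b;c", "1;2;3"]))       # semicolon-separated
print(detect_separator(["plain text", "notes"]))  # not a table
```

A real detector would likely look only at the first lines of a file, so that large files are cheap to probe.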


Comparing Rainbow CSV with graphical alignment


In general, the traditional way to view CSV data is to import it into a graphical spreadsheet editor such as Excel.
Compared to this method, Rainbow CSV has both advantages and disadvantages:


Advantages:


  • What you see is what you get: you can be sure that what you see on the screen is the actual content of the file.
  • The familiar environment of your favorite text editor.
  • Zero-cost abstraction: syntax highlighting is computationally very “cheap” compared to graphical alignment.
  • Higher information density: more data fits on one screen, because graphical alignment “eats up” a lot of space with padding.
  • The ability to visually match the same column (highlighted in the same color) across different windows.

Disadvantages:


  • The standard implementation uses 10 different colors, so when a file has more than 10 columns the colors start to repeat and the color coding becomes less effective.
  • There is no support for line breaks inside cells that are escaped with double quotes. The details of this problem are described here. However, I believe that CSV with line breaks inside cells is an extremely impractical format.

Comparing Rainbow CSV with text alignment


Another way to improve the readability of CSV files is alignment with spaces, but this method modifies the contents of the file, and therefore its applicability is very limited.


Also, in my opinion, a file with Rainbow syntax highlighting is more readable than one that has been aligned with spaces.


A little about the project


The first version of Rainbow CSV was written 5 years ago for Vim, based on the rainbow_parentheses plugin. As you can see, I borrowed from that project not only part of the code but also half of the name =)
Versions for VS Code, Atom, and Sublime Text 3 appeared a year ago.


Many critical features and improvements have been proposed by plugin users.


Comparing the plugin development process for different editors


In conclusion, here is a short comparison of the plugin APIs of popular text editors.
The plugin APIs of VS Code, Atom and Sublime Text 3 are quite similar to each other. The main difference is that extensions for VS Code and Atom are written in JavaScript, while those for Sublime Text 3 are written in Python.


All 3 editors use the same regular expression engine for syntax highlighting, so porting Rainbow CSV between them required only minimal adaptation of the regular expressions.


Overall, I can say that VS Code offers the most pleasant and convenient plugin development process. On the other hand, for some reason it lacked a piece of functionality that Rainbow CSV needed to work fully, but the VS Code team gladly accepted and improved my PR, which added the method I needed.


Writing plugins for Vim is very different from writing them for these 3 newer editors. Vim uses its own language, VimScript, as well as various commands for manipulating the contents of open files. The syntax model that Vim uses for highlighting is also quite different from what VS Code, Atom, and Sublime Text provide.

