Replacement for FIND and GREP

    It seems to me that a comparison of PowerShell with the shells of the UNIX world is long overdue. Not a comparison in the holy-war sense, but a positive, constructive one. Linux scripters (not fanatics), I think, will be interested to learn how to do in PowerShell the things they are used to doing in bash or zsh. So I'll start such a series, and I really hope that some of my fellow PowerShell users (Guderian, ApeCoder) will support it too.

    On UNIX, the find and grep utilities are a popular combination for searching for text in a file tree. For example, with their help we can find all occurrences of the keyword class in our source tree:

    $ find \( -name \*.cpp -o -name \*.hpp \) -exec grep -Hb class {} \;

    Let's see what PowerShell offers us for these purposes.

    First of all, let's try to replace grep. For this purpose PowerShell has the Select-String cmdlet. Let's try something simple:

    $ Select-String class *.hpp

    CustomTimeEdit.hpp:5:class CustomTimeEdit : public QTimeEdit {
    DaySelecter.hpp:12: class Model : public QAbstractItemModel
    DaySelecter.hpp:43: class View : public QTreeView
    [...]

    Here we look for all occurrences of the word class in all files with the hpp extension in the current directory. By the way, the documentation on Select-String gave me incorrect information about the order of the arguments: it says that the file name (the -Path parameter) should come first, followed by the search pattern (the -Pattern parameter). In fact, the opposite is true (I don't know why). Naturally, if you use named rather than positional parameters, this problem does not arise.

    Let's go further: what if we want to search for the string in files whose names may match several patterns (for example, both hpp and cpp files)?

    $ Select-String DaySelecter *.hpp,*.cpp

    DaySelecter.hpp:1:#ifndef __DAYSELECTER_HPP__
    DaySelecter.hpp:2:#define __DAYSELECTER_HPP__
    DaySelecter.hpp:10:namespace DaySelecter {
    [...]

    And what if we want to look for several search patterns in these files?

    $ Select-String DaySelecter,MainWindow *.hpp,*.cpp

    DaySelecter.hpp:1:#ifndef __DAYSELECTER_HPP__
    DaySelecter.hpp:2:#define __DAYSELECTER_HPP__
    DaySelecter.hpp:10:namespace DaySelecter {
    MainWindow.hpp:1:#ifndef __MAINWINDOW_HPP__
    MainWindow.hpp:2:#define __MAINWINDOW_HPP__
    MainWindow.hpp:4:#include
    [...]

    In fact, both parameters (-Path and -Pattern) accept arrays of strings, which in PowerShell are written as comma-separated lists. But do you remember that PowerShell operates not on text, but on objects? Let's see what kind of objects Select-String gives us (gm is an alias for Get-Member):

    $ Select-String DaySelecter *.hpp,*.cpp | gm
       TypeName: Microsoft.PowerShell.Commands.MatchInfo

    Name         MemberType Definition
    ----         ---------- ----------
    Equals       Method     bool Equals(System.Object obj)
    GetHashCode  Method     int GetHashCode()
    GetType      Method     type GetType()
    RelativePath Method     string RelativePath(string directory)
    ToString     Method     string ToString(), string ToString(string directory)
    Context      Property   Microsoft.PowerShell.Commands.MatchInfoContext Context {get; set;}
    Filename     Property   System.String Filename {get;}
    IgnoreCase   Property   System.Boolean IgnoreCase {get; set;}
    Line         Property   System.String Line {get; set;}
    LineNumber   Property   System.Int32 LineNumber {get; set;}
    Matches      Property   System.Text.RegularExpressions.Match[] Matches {get; set;}
    Path         Property   System.String Path {get; set;}
    Pattern      Property   System.String Pattern {get; set;}
    


    As you can see, we can not only look at the output of the command, but also literally take it apart: determine on which line and in which file each match occurred. We can build whatever output format we need ourselves (for example, XML). In my opinion, that is very exciting and convenient, and it is something I often miss in zsh.
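
    To make the point about objects concrete, here is a minimal sketch. It assumes a throwaway temporary directory with one made-up header file (Sample.hpp is hypothetical, not part of the project above), picks a few MatchInfo properties, and renders them both as a table and as XML.

```powershell
# Hypothetical sandbox: a temporary directory with one small header file.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'Sample.hpp') -Value @(
    '#ifndef SAMPLE_HPP',
    'class Sample {',
    '};',
    '#endif'
)

# Each match is a MatchInfo object; keep only the properties we care about.
$hits = @(Select-String -Pattern 'class' -Path (Join-Path $dir '*.hpp') |
    Select-Object Filename, LineNumber, Line)

# The same data rendered two ways: a table for people, XML for tools.
$hits | Format-Table -AutoSize
$xml = $hits | ConvertTo-Xml -As String -NoTypeInformation
$xml

# Clean up the sandbox.
Remove-Item -Recurse -Force $dir
```

    Any other formatter (ConvertTo-Csv, ConvertTo-Html and so on) could be put at the end of the pipeline instead, since the objects themselves stay the same.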

    I think the basic idea is clear. Select-String provides us with the following features:
    • regular expression search (the default behavior),
    • search by literal match (the -SimpleMatch switch),
    • finding only the first match in each file and ignoring all subsequent ones (the -List switch),
    • or, conversely, finding all matches, even if there are several of them on one line (the -AllMatches switch),
    • search for lines that do not match the pattern (the -NotMatch switch), an analogue of the -v option of grep,
    • in addition to the matching line itself, output of several preceding and following lines (the -Context parameter), much like the context in a unified diff.
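
    The switches listed above can be tried out with a small sketch like the one below. The temporary file and its four lines are invented purely for the demonstration; only the Select-String parameters come from the list.

```powershell
# Hypothetical sandbox file, so the example does not depend on a real source tree.
$file = Join-Path ([System.IO.Path]::GetTempPath()) ("demo-{0}.txt" -f [System.Guid]::NewGuid())
Set-Content -Path $file -Value @(
    'alpha a.b',
    'beta aXb',
    'gamma',
    'delta a.b a-b'
)

# Regular expression (default): '.' matches any character, so 'aXb' and 'a-b' count too.
# -AllMatches records every match on a line in the Matches property.
$regexHits   = @(Select-String -Path $file -Pattern 'a.b' -AllMatches)
# Literal search: only the exact text 'a.b' matches.
$literalHits = @(Select-String -Path $file -Pattern 'a.b' -SimpleMatch)
# Stop after the first matching line in each file.
$firstOnly   = @(Select-String -Path $file -Pattern 'a.b' -List)
# Lines that do NOT match, like grep -v.
$inverted    = @(Select-String -Path $file -Pattern 'a.b' -NotMatch)
# One line of context before and after the match.
$withContext = @(Select-String -Path $file -Pattern 'gamma' -Context 1)

$regexHits | ForEach-Object { '{0}: {1} match(es)' -f $_.LineNumber, $_.Matches.Count }
$withContext[0].Context.PreContext
$withContext[0].Context.PostContext

Remove-Item -Force $file
```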

    What is very important for Russian-speaking users, Select-String lets you specify the file encoding (the -Encoding parameter). But, alas, for some reason the list of encodings is limited to the Unicode encodings plus ANSI (which is WINDOWS-1251 on a Russian-locale system) and OEM (CP866, the DOS one). I don't understand why the choice isn't wider (given the breadth of possibilities of the platform), although this set is quite enough in most cases.

    Now let's see how to search for files in a directory tree. In PowerShell this is done with Get-ChildItem, or ls (again, an alias). I don't know all the features of the UNIX ls, but the PowerShell cmdlet is quite powerful. We can get a list of all files matching the specified patterns:

    $ ls -r -inc *.cpp,*.hpp

    Note one thing: in bash or zsh we would have to escape the asterisks under similar conditions, because the shell expands them itself. PowerShell takes a slightly different approach: the patterns are passed to the cmdlet as-is, and the cmdlet uses PowerShell's internal facilities to expand them where needed (I am not familiar with PowerShell's internals, but the principle is roughly this). I find this very convenient, because on Linux I often ended up with broken scripts in such cases and spent a long time wondering why they couldn't find the files. But that, of course, is just my own carelessness.

    Back to ls. Let's try to find files that do not match a pattern:

    $ ls -r -ex *.hpp~

    We can do some crazier searches too, such as finding the files that match pattern number one and excluding from that list the files that match pattern number two:

    $ ls -r -inc *.cpp,*.hpp -ex DaySelecter*

    If you need even more control over the search, use the Where-Object cmdlet (aliased as ?). Remember: PowerShell passes objects through pipes, not text:

    $ ls -r -inc *.cpp,*.hpp -ex DaySelecter* | ? { $_.IsReadOnly }
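
    Because the pipeline carries file objects, the filter can test any of their properties, not just IsReadOnly. A minimal sketch, assuming a temporary sandbox with made-up files (Small.hpp, Big.cpp and Notes.txt are hypothetical), that keeps only source files larger than 100 bytes and modified within the last day:

```powershell
# Hypothetical sandbox: two source files of different sizes and one unrelated file.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path (Join-Path $dir 'src') | Out-Null
Set-Content -Path (Join-Path $dir 'Small.hpp')     -Value 'class Small {};'
Set-Content -Path (Join-Path $dir 'src/Big.cpp')   -Value ('// padding ' * 50)
Set-Content -Path (Join-Path $dir 'src/Notes.txt') -Value 'not source code'

# Get-ChildItem emits file objects; Where-Object filters on their properties.
$bigSources = @(Get-ChildItem -Path $dir -Recurse -Include '*.cpp', '*.hpp' |
    Where-Object { $_.Length -gt 100 -and $_.LastWriteTime -gt (Get-Date).AddDays(-1) })

$bigSources | Select-Object Name, Length

# Clean up the sandbox.
Remove-Item -Recurse -Force $dir
```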

    Okay, we have seen what both ls and Select-String can do. How do we combine them now? The thing is that Select-String can receive the list of files both from the command line (the -Path parameter) and from the pipe. So we can simply join the two commands with a pipe and get the desired result:

    $ ls -r -inc *.cpp,*.hpp -ex *DaySelecter* | Select-String DaySelecter

    MainWindow.cpp:26: connect(viewDaySelecter->selectionModel(),
    MainWindow.cpp:38: viewDaySelecter->setDiary(diaryModel);

    In my opinion, it's quite convenient!
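
    The combined pipeline can be extended further, because its output is still a stream of objects. As a rough sketch (the sandbox files Main.cpp, Other.cpp and DaySelecter.cpp are invented for the example), this counts the matches per file:

```powershell
# Hypothetical sandbox: one file with two hits, one with none, one that is excluded by name.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $dir | Out-Null
Set-Content -Path (Join-Path $dir 'Main.cpp') -Value @(
    'viewDaySelecter->show();',
    'viewDaySelecter->hide();'
)
Set-Content -Path (Join-Path $dir 'Other.cpp')       -Value 'int main() {}'
Set-Content -Path (Join-Path $dir 'DaySelecter.cpp') -Value 'namespace DaySelecter {}'

# Find the files, search inside them, then group the matches by file name.
$perFile = @(Get-ChildItem -Path $dir -Recurse -Include '*.cpp', '*.hpp' -Exclude '*DaySelecter*' |
    Select-String -Pattern 'DaySelecter' |
    Group-Object Filename |
    Select-Object Name, Count)

$perFile | Format-Table -AutoSize

# Clean up the sandbox.
Remove-Item -Recurse -Force $dir
```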


