Unpredictable symbolic links in Windows.

    This article was prompted by the article “Windows links, symbolic and more”. That article is also worth reading as an introduction to what hard and symbolic links are in NTFS.
    I will pick up from there and share some facts about the non-obvious behavior of these links.

    One remark up front. Hard links do not seem to cause any ambiguity, but soft, or symbolic, links are a source of confusion. So from here on I will be talking about the symbolic links that are created by the Junction program (as well as Alax.Info NTFS Links, Link Shell Extension, etc.).
    The programs tested were: Total Commander, Far, Frigate3, Servant Salamander, WinDirStat and Explorer in Windows XP.

    Recursion.



    I first ran into problems with links a few years ago, when I tried to write a “super-duper” utility for finding and removing duplicates. It turned out that the OS search and directory traversal mechanisms fall into recursion when they process a directory that contains a symbolic link to itself or to its parent directory. This can of course be avoided by recognizing links and handling them separately from directories. But my graduation project interrupted work on the utility, and afterwards I decided not to waste time on yet another “duplicate cleaner”. I now think that was a mistake.
    The fact is that most programs that work with files are not aware of symbolic links. And this is what it leads to.

    Search.


    Create a directory
    X:\000
    In it, create a file text.txt and a symbolic link 111 that points to the same directory X:\000
    This is what all of the tested file managers except Far return when searching for the file:
    X:\000\text.txt
    X:\000\111\text.txt
    X:\000\111\111\text.txt
    X:\000\111\111\111\text.txt
    X:\000\111\111\111\111\text.txt
    X:\000\111\111\111\111\111\text.txt
    X:\000\111\111\111\111\111\111\text.txt
    X:\000\111\111\111\111\111\111\111\text.txt
    X:\000\111\111\111\111\111\111\111\111\text.txt
    X:\000\111\111\111\111\111\111\111\111\111\text.txt
    X:\000\111\111\111\111\111\111\111\111\111\111\text.txt
    X:\000\111\111\111\111\111\111\111\111\111\111\111\text.txt
    X:\000\111\111\111\111\111\111\111\111\111\111\111\111\text.txt
    X:\000\111\111\111\111\111\111\111\111\111\111\111\111\111\text.txt
    X:\000\111\111\111\111\111\111\111\111\111\111\111\111\111\111\text.txt
    X:\000\111\111\111\111\111\111\111\111\111\111\111\111\111\111\111\text.txt

    I am glad the recursion was cut off so early. Still, this behavior is incorrect and can lead to errors. Here the same file was found 16 times. Many duplicate finders would suggest deleting this file as a duplicate, although in fact the file is unique.
    Far, as it turns out, does not follow symbolic links when searching.
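
    A traversal that, like Far, does not follow links only needs to look at the attribute bits of each directory entry. Below is a minimal sketch of that check, not the code of any of the tested programs. The helper name should_descend is hypothetical; the two constants carry the values from winnt.h and are redefined only so that the sketch compiles without the Windows headers.

```c
#include <stdint.h>

/* Attribute bits as defined in winnt.h; redefined here only so the
   sketch builds without the Windows headers. */
#ifndef FILE_ATTRIBUTE_DIRECTORY
#define FILE_ATTRIBUTE_DIRECTORY     0x00000010u
#endif
#ifndef FILE_ATTRIBUTE_REPARSE_POINT
#define FILE_ATTRIBUTE_REPARSE_POINT 0x00000400u
#endif

/* Hypothetical helper: returns 1 if a recursive walk should enter the
   entry, 0 if it should be treated as a leaf (a plain file, or a
   directory that is really a junction or another reparse point). */
int should_descend(uint32_t attributes)
{
    if ((attributes & FILE_ATTRIBUTE_DIRECTORY) == 0)
        return 0;   /* not a directory at all */
    if (attributes & FILE_ATTRIBUTE_REPARSE_POINT)
        return 0;   /* a link: do not follow it */
    return 1;       /* an ordinary directory */
}
```

    On Windows the attributes would typically come from the dwFileAttributes member of the WIN32_FIND_DATA structure filled in by FindFirstFile and FindNextFile. With this check the search in the example above would report text.txt once.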

    Directory size.


    This behavior leads to interesting effects when the size of a directory is calculated. You can make the reported size many times larger than the size of the disk. For example, there is a 300 TB folder on my laptop. The same trick with sizes can be done using hard links.
    Of the tested programs, only WinDirStat and Explorer reported the size correctly.
    Explorer is no surprise: it and the implementation of symbolic links in Windows are the brainchild of the same company, and it would be strange if it misused its own link mechanism. WinDirStat probably handles links so well because it comes from Linux.
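
    For the hard-link variant of the size trick, one possible remedy is to count every physical file only once. The sketch below assumes that each entry found during a walk carries an identity made of a volume serial number and a file index; on Windows these could be taken from the BY_HANDLE_FILE_INFORMATION structure returned by GetFileInformationByHandle. The FileRecord type and the function name are hypothetical, and this is not how WinDirStat or Explorer are known to do it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one file found during a directory walk. */
typedef struct {
    uint32_t volume_serial;  /* identifies the volume */
    uint64_t file_index;     /* identifies the file on that volume */
    uint64_t size;           /* size in bytes */
} FileRecord;

/* Sums the sizes, counting each (volume, index) pair only once, so that
   several hard links to one file contribute its size a single time.
   The scan is quadratic, which is fine for a sketch; a real tool would
   more likely use a hash set. */
uint64_t total_unique_size(const FileRecord *files, size_t count)
{
    uint64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        int seen = 0;
        for (size_t j = 0; j < i; j++) {
            if (files[j].volume_serial == files[i].volume_serial &&
                files[j].file_index == files[i].file_index) {
                seen = 1;   /* another name of an already counted file */
                break;
            }
        }
        if (!seen)
            total += files[i].size;
    }
    return total;
}
```

    Two names of the same 10-byte file then add 10 bytes to the total, not 20.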

    Copying symbolic links.


    When copying links, the tested file managers behaved differently.
    * Far copied symbolic links as links, i.e. it made a copy of the link and not of its contents.
    * Explorer asked for each link whether to copy it as a directory or as a link. I suspect, though, that this clever behavior comes from the installed Alax.Info NTFS Links utility. I could not check how Explorer behaves on Windows XP without any extensions; on Windows 2000, Explorer on its own behaves like Far when copying.
    * All the other programs copied links as directories.
    When copying links, you need to understand what you are dealing with. If links are copied as their contents, then, for example, I will not finish copying my “INTERNET” folder any time soon. If links are copied as links, it may turn out that a friend's movie directory, copied to another HDD in Far, does not open, because it contained symbolic links and what they pointed to stayed on the other HDD. Overall, Explorer with the right extension installed behaves most correctly when copying.

    Then I decided to check whether archivers support symbolic links. It turned out that they do not. None of the archivers in my collection (including 7-Zip and WinRAR) saves symbolic links. Unfortunately, my collection contained no ported archivers such as tar. I hope that programs from Linux will help out here again.

    As for hard links to a file, all the programs copy and archive them as ordinary files, without recognizing whether they are links or not. This is to be expected.

    Removing symbolic links.


    When deleting symbolic links, Frigate3 and Servant Salamander distinguished themselves. They not only deleted the link, but also carefully wiped the contents of the directory it pointed to. The other programs removed only the link.

    For the paranoid.


    Removing hard links is not a problem. Admittedly, I do not understand why I cannot delete one hard link to a file while the file is open through another hard link.
    Keep in mind that hard links are essentially different names for the same file.
    With programs that erase a file irrecoverably (for example, SDelete), wiping one hard link leaves all the other hard links to the file pointing at a heap of garbage. In this case the behavior is logical and correct: if you are going to wipe a file, wipe it for good.

    Conclusions.


    Probably because symbolic links were never popular in Windows, many programmers forget about them. Or maybe links are not popular because programmers forget about them and their programs handle links however it happens to turn out. In any case, be careful when using symbolic and hard links, and check how your file manager handles them.

    P.S. It turns out I do not have enough karma to post in a thematic blog. Oh well. Maybe it will be useful to someone here.

    UPD: The experiments were performed on Microsoft Windows XP Home Edition 32-bit SP3 with:
    Total Commander 7.04a
    Far 1.70
    Servant Salamander 2.0
    WinDirStat 1.1.2.80 (Unicode)
    Frigate 3.21.2.71
    Explorer 6.00.2900.5512

    Took Busla's remark into account and changed the translation of “symbolic link” to the more common variant.

    UPD2:
    To clarify once more what this is about, and to clear up the differences in terminology sown by MS's light hand.
    After looking through MSDN, I gather that MS did eventually settle on a terminology. Sort of. In Windows Vista they introduced symbolic links proper, which are created by the CreateSymbolicLink function.
    The symbolic == soft links that existed (and still exist) in earlier versions of Windows (2000, XP) are a kind of reparse point. They are created roughly like this:
    /* Assumes a Unicode build, where TCHAR is WCHAR. */
    memset(reparseInfo, 0, sizeof(*reparseInfo));
    /* Mark the reparse point as a mount point (junction). */
    reparseInfo->ReparseTag = IO_REPARSE_TAG_MOUNT_POINT;
    /* Target length in bytes, without the terminating null. */
    reparseInfo->ReparseTargetLength =
        _tcslen(targetNativeFileName) * sizeof(WCHAR);
    reparseInfo->ReparseTargetMaximumLength =
        reparseInfo->ReparseTargetLength + sizeof(WCHAR);
    _tcscpy(reparseInfo->ReparseTarget, targetNativeFileName);
    reparseInfo->ReparseDataLength = reparseInfo->ReparseTargetLength + 12;

    /* Attach the reparse data to the object opened as hFile. */
    DeviceIoControl(
        hFile,
        FSCTL_SET_REPARSE_POINT,
        reparseInfo,
        reparseInfo->ReparseDataLength + REPARSE_MOUNTPOINT_HEADER_SIZE,
        NULL,
        0,
        &returnedLength,
        NULL);

    So, since I do not have Vista, this article is about reparse points. I suspect, though, that Vista's symbolic links will also bring surprises in the form of inconsistent behavior across programs, because the problem lies mainly not in the links but in the fact that some programmers forget about them.
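
    A program that wants to tell the two kinds of links apart can look at the reparse tag. The sketch below is only an illustration: the function name link_kind is hypothetical, and the two tag values are the ones from winnt.h, redefined so that the sketch compiles without the Windows headers.

```c
#include <stdint.h>

/* Reparse tags as defined in winnt.h; redefined here only so the
   sketch builds without the Windows headers. */
#ifndef IO_REPARSE_TAG_MOUNT_POINT
#define IO_REPARSE_TAG_MOUNT_POINT 0xA0000003u
#endif
#ifndef IO_REPARSE_TAG_SYMLINK
#define IO_REPARSE_TAG_SYMLINK     0xA000000Cu
#endif

/* Hypothetical helper: names the kind of link behind a reparse tag. */
const char *link_kind(uint32_t reparse_tag)
{
    switch (reparse_tag) {
    case IO_REPARSE_TAG_MOUNT_POINT:
        return "junction";             /* the Windows 2000/XP kind */
    case IO_REPARSE_TAG_SYMLINK:
        return "symbolic link";        /* the Vista kind */
    default:
        return "other reparse point";
    }
}
```

    During a directory listing the tag is available in the dwReserved0 member of WIN32_FIND_DATA whenever dwFileAttributes includes FILE_ATTRIBUTE_REPARSE_POINT.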
