The ideal document repository

    Sometimes I really want to find the file I need quickly, even though there are hundreds of thousands of files and I know neither its name, nor its content, nor its type: nothing. But I probably do know its categories. And I want to track it down quickly, then immediately edit it and save it.
    Today there are NO convenient cross-platform open-source file dumps with direct access to files.
    What follows is not about a media library and not about the semantic web, but about a simple and convenient system for managing a huge file dump with direct access to files.

    1. Requirements


    Practice has shown that even a small company with a couple of dozen users can have tens (or even hundreds) of thousands of files, of very different contents and formats. Finding a file in this mess is so difficult that it is easier to create it again from scratch.
    The search problem did not arise today (Chekhov is a witness), but it still has not been solved.
    The nuance is that "find" here is meant not in the sense of search engines or Explorer, but in human terms. A person does not know the words he is looking for; he knows the concepts. And search engines and file managers are weak on concepts (semantics). A normal user does not look for “\\server\public\Incoming\Contracts\Clients\Horns and Hooves\Horns and Hooves agreement with Romashka for delivery.doc”. He looks for “some kind of contract from about last month with our beloved client for about two million.” And he will find it (if he is lucky) in “\\server\private\secretary\Outbox\Romashka LLC\Commercial offers\Hoof\How they got me already.xls”.
    In the situation “they themselves do not know what they want” (Zoshchenko), one must give the person a choice (which the search engines are trying to do, but so far without success). That is: I don’t know the type of document, so the possible types are listed before my eyes. I don’t remember the exact name of the company, so I get a list of companies. And so on. Therefore, this is no longer a search but a filter.
    So, let's say I have 100,500,000 files (of every kind, up to AutoCAD drawings), and I want to quickly and conveniently:
    1. “track down” (filter) a file by attributes that are not physically present in the file itself,
    2. open it (not download and open a copy, just open it),
    3. change it and save it (not upload it back, but simply save with Ctrl+S).
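
    The "filter instead of search" idea can be sketched roughly as follows. This is only an illustration: the `FILES` store, the attribute names and the `apply_filter` / `facets` helpers are all hypothetical and are not taken from the demo. The point is that after each choice the user is shown which values are still available, rather than being asked to type words.

```python
from collections import Counter

# Hypothetical metadata store: path -> attributes that are NOT inside the file.
FILES = {
    "/docs/a.doc": {"type": "contract", "company": "Romashka", "direction": "outgoing"},
    "/docs/b.xls": {"type": "offer", "company": "Romashka", "direction": "outgoing"},
    "/docs/c.dwg": {"type": "drawing", "company": "Horns and Hooves", "direction": "incoming"},
    "/docs/d.doc": {"type": "contract", "company": "Horns and Hooves", "direction": "incoming"},
}


def apply_filter(files, **chosen):
    """Keep only the files whose metadata matches every chosen value."""
    return {
        path: meta
        for path, meta in files.items()
        if all(meta.get(key) == value for key, value in chosen.items())
    }


def facets(files):
    """For each attribute, count the values still available to choose from."""
    result = {}
    for meta in files.values():
        for key, value in meta.items():
            result.setdefault(key, Counter())[value] += 1
    return result


if __name__ == "__main__":
    remaining = apply_filter(FILES, type="contract")
    print(sorted(remaining))                    # files that are still candidates
    print(dict(facets(remaining)["company"]))   # companies left to pick from
```

    Each further choice (for example `company="Romashka"`) narrows the set again, so the user never has to remember an exact name.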

    In total, we need a system with the following properties:
    • Works
    • Cross-platform (Windows, Linux, Mac OS)
    • Filters
    • Direct access (open and save)
    • Integration (into the desktop environment, as a consequence of the previous item)
    • Multi-user
    • Internet (access from anywhere in the world)
    • Open source


    2. Who is to blame


    What do we have today:
    2.1. FS

    This is a file dump in the form of a sprawling tree of folders and files on a local or remote file system. This option has one major advantage and one major disadvantage:
    • - No filters at all.
    • + Built-in cross-platform direct file access.

    Of course, a hierarchy can be called a “filter”, but only at a stretch. If one person puts the file in “\\server\public\LLC Romashka\Contract\For delivery” and another looks for it in “\\server\public\Legal\Outgoing\Romashka”, then the hierarchy clearly does not match the way people think associatively. In fact, this file should be “Legal” AND “Outgoing” AND “Romashka” AND “Agreement”, not in this order but all at once. And “all at once” is exactly what current file managers do not provide.
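
    A minimal sketch of the difference, assuming a hypothetical `TAGS` index and a hypothetical `find_by_tags` helper (neither comes from the article's demo): a folder path fixes one order of categories, while a set of tags matches in any order.

```python
# Hypothetical tag index: each file carries an unordered set of tags.
TAGS = {
    r"\\server\public\LLC Romashka\Contract\For delivery\agreement.doc":
        frozenset({"Legal", "Outgoing", "Romashka", "Agreement"}),
    r"\\server\public\Legal\Incoming\claim.doc":
        frozenset({"Legal", "Incoming", "Romashka"}),
}


def find_by_tags(index, *wanted):
    """Return paths whose tag set contains all wanted tags, in any order."""
    wanted_set = set(wanted)
    return sorted(path for path, tags in index.items() if wanted_set <= tags)


if __name__ == "__main__":
    # Two people asking in a different order get the same answer.
    print(find_by_tags(TAGS, "Legal", "Outgoing", "Romashka"))
    print(find_by_tags(TAGS, "Romashka", "Outgoing", "Legal"))
```

    The physical path stays whatever it was; only the lookup stops depending on it.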
    2.2. Web

    Thousands of them. They meet all the requirements except two:
    • - Direct access to arbitrary files (Google Docs and MS Live are good, but what should people with AutoCAD and SmetaWizard do?)
    • - Integration (as a consequence)

    That is, access is only partial.
    2.3. All-in-one

    IBM Domino, MS SharePoint, MS Exchange. Each of these is a “thing in itself” that tries to work around the shortcomings of existing technologies by its own means.
    • + It works
    • + Filters
    • + Direct access
    • + Multiuser
    • + Internet
    • - Not cross platform
    • - Not integrated
    • - Not open source

    2.4. Semantic fs

    Nepomuk, WinFS, ReFS, etc. With all due respect, I have not seen them working live, so they are not considered here, along with other exotica.

    3. What to do


    In short: bolt a web interface for managing files onto the storage and hand out direct-access links to the files.
    With links, everything is very simple, because there is not much room to maneuver. If we insist on cross-platform support, there are only three options (among those that need no special contortions): http://, ftp:// and file:// (Windows does not properly understand anything else). File metadata, meanwhile, can be organized any way you like, from simple tags to semantic-web bells and whistles. The links are what needs thought.
    3.1. HTTP

    Read-only. That is, you can hand out an http:// link, and the user can download, open and change the file. But uploading it back the same way will not work, at least not with Ctrl+S from any application on any platform.
    3.2. FTP

    You could probably cross the web interface with ftp file links somehow. But it is quite hard to keep the metadata database and the ftp storage consistent with each other, because they are two completely separate services. Hooking into the work of an existing ftp server is very difficult, and writing your own ftp server ...
    Let's set it aside.
    3.3. file://

    An outright kludge. That is, you can map a remote resource to a local one using one of the Internet protocols, and it will even work. But the construction looks too outlandish.
    3.4. WebDAV

    And here everything gets very interesting. There is no “standard” WebDAV server as such. But all common OS / DE support WebDAV (as clients) out of the box, and in many ways. So you can write your own WebDAV on the server side (not a web server, just a handler for HTTP requests) and work literal miracles.
    Theoretically, of course ...
    Updated: after a month of struggling with Windows XP, I can say that its "out of the box" WebDAV support is purely nominal.
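
    To make "just a handler for HTTP requests" more concrete, here is a very rough sketch of such a handler as a plain WSGI callable. It is not the code from the demo and not a complete WebDAV implementation: the in-memory `STORE`, the function names and the `call` test helper are all hypothetical, and Depth handling, locking, collections and most of RFC 4918 are deliberately left out.

```python
import io
from xml.sax.saxutils import escape

# Hypothetical in-memory storage (path -> bytes); a real service would use disk.
STORE = {"/hello.txt": b"hello"}


def multistatus(paths):
    """Build a minimal 207 Multi-Status body that reports content lengths."""
    items = "".join(
        "<D:response><D:href>%s</D:href><D:propstat><D:prop>"
        "<D:getcontentlength>%d</D:getcontentlength></D:prop>"
        "<D:status>HTTP/1.1 200 OK</D:status></D:propstat></D:response>"
        % (escape(path), len(STORE[path]))
        for path in paths
    )
    xml = '<?xml version="1.0" encoding="utf-8"?>' \
          '<D:multistatus xmlns:D="DAV:">%s</D:multistatus>' % items
    return xml.encode("utf-8")


def reply(start_response, status, body=b"", headers=()):
    """Send a response with an explicit Content-Length."""
    start_response(status, list(headers) + [("Content-Length", str(len(body)))])
    return [body]


def dav_app(environ, start_response):
    """Tiny WSGI handler for OPTIONS, GET, PUT and PROPFIND."""
    method = environ["REQUEST_METHOD"]
    path = environ.get("PATH_INFO") or "/"
    if method == "OPTIONS":
        headers = [("DAV", "1"), ("Allow", "OPTIONS, GET, PUT, PROPFIND")]
        return reply(start_response, "200 OK", headers=headers)
    if method == "PUT":
        # This is the step plain read-only HTTP links lack: saving in place.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        status = "200 OK" if path in STORE else "201 Created"
        STORE[path] = environ["wsgi.input"].read(length)
        return reply(start_response, status)
    if method == "GET":
        if path not in STORE:
            return reply(start_response, "404 Not Found")
        return reply(start_response, "200 OK", STORE[path])
    if method == "PROPFIND":
        paths = sorted(STORE) if path == "/" else [p for p in (path,) if p in STORE]
        if not paths:
            return reply(start_response, "404 Not Found")
        xml_type = [("Content-Type", 'application/xml; charset="utf-8"')]
        return reply(start_response, "207 Multi-Status", multistatus(paths), xml_type)
    return reply(start_response, "405 Method Not Allowed")


def call(method, path, body=b""):
    """Drive the app without a network, for demonstration only."""
    seen = {}
    environ = {"REQUEST_METHOD": method, "PATH_INFO": path,
               "CONTENT_LENGTH": str(len(body)), "wsgi.input": io.BytesIO(body)}

    def start_response(status, headers):
        seen["status"], seen["headers"] = status, dict(headers)

    data = b"".join(dav_app(environ, start_response))
    return seen["status"], seen["headers"], data
```

    For example, `call("PUT", "/new.txt", b"data")` followed by `call("GET", "/new.txt")` shows the write-then-read round trip. Because the handler is ordinary application code, the same place that stores the bytes could also update the metadata database, which is the attraction of this approach over a separate ftp service.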

    4. What does Django have to do with it?


    The connection is that the demonstration of this idea (Web UI + WebDAV) happens to use Django (more precisely, a small stack of Apache, mod_dav, Django and a program thrown together specifically for this article). And so that the attentive reader reads attentively, the link to the demo is somewhere in the text :-)
    In particular, the demo shows file management via the web (the “comment” field for files) and direct access to a file straight from the web page.
    Of course, I wanted to get the most out of WebDAV, but it suddenly turned out that out of the thousands of WebDAV provider implementations in Python, both of them (wsgidav and pywebdav) are quite difficult to integrate into your own web application (if it is possible at all, because they are not WebDAV providers but servers). I had to do the reading and start building my own bicycle. There is a lot to read and things go slowly alone, so I invite everyone to joint development.
    Slides.

    5. Comments


    Linux

    Konqueror works fine with webdav://. True, this applies only to KDE applications or ones adapted to it (libreoffice-kde), at least when not running under KDE. That is, the integration here is only partial (users of LibreCAD, JuffEd and other non-KDE applications are left out in the cold).
    Epiphany with dav:// behaves similarly (s/KDE/GNOME/), although GNOME works with WebDAV worse than KDE does.
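
    Since the link scheme differs per desktop, the web page has to decide which scheme to hand out. A small sketch, assuming a hypothetical `direct_link` helper and a hypothetical host name; only the webdav:// and dav:// schemes come from the notes above, and the fallback to the original http:// URL is an assumption of this sketch, not a tested result.

```python
from urllib.parse import urlsplit, urlunsplit

# Schemes taken from the notes above; anything else keeps the original scheme,
# which is an assumption of this sketch rather than a verified behaviour.
SCHEME_BY_DESKTOP = {"kde": "webdav", "gnome": "dav"}


def direct_link(http_url, desktop):
    """Rewrite an http:// WebDAV URL into the scheme the desktop understands."""
    parts = urlsplit(http_url)
    scheme = SCHEME_BY_DESKTOP.get(desktop.lower(), parts.scheme)
    return urlunsplit((scheme, parts.netloc, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    url = "http://server.example/files/a.doc"  # hypothetical address
    for desktop in ("KDE", "GNOME", "other"):
        print(desktop, direct_link(url, desktop))
```

    How the server learns which desktop the visitor runs is left open here; it could be a user preference or a guess from the browser.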
    Mac OS

    Not tested
    Windows

    Since obscene language is not welcome on Habr, the summary will be short: everything is very bad. One could write a book about the “features” of Microsoft's view of WebDAV, HTTP, and XML.
    But when the stars align, some things do somehow work.
    Although perhaps Apache mod_dav is simply not very compatible with Windows ;-)

    6. Summary


    This whole elegant system of crutches and props works, but not very well, on any (client) OS. That is, at the moment OS / DE are not quite ready for full integration of web applications with the desktop (transparent and always available).
    But there is still life on Mars!
