Learning a language by watching TV shows: VLC + Lua + Stardict + WordNet + Anki = l'amour

    I really like watching TV shows, and I also use them to learn languages. I used to diligently pause the video at an incomprehensible spot, rewind, turn on the subtitles and enter the unfamiliar words into Anki. I still do the same, except that laziness pushed me to automate the process. The result is Say It Again, an extension for the VLC player with the following features:

    • Subtitle navigation (jump to the previous or next phrase): y and u keys;
    • Saving a word with its transcription, translation and context (see screenshot): i key;
    • "Once again" function (jump to the previous phrase, show the subtitle and pause): backspace key;
    • Support for any dictionaries in the Stardict format (dictionaries from Lingvo x3 can be found online);
    • Export to Anki or any other program that understands csv files.

    Say It Again screenshot

    But why?

    At some point I realized that watching TV shows with subtitles turned on had stopped giving tangible results (apart from faster reading). But once I turned the subtitles off, I ran into a barrage of unfamiliar words. I had to rewind the video (always either not far enough or too far), turn on the subtitles, pause, look up the unfamiliar words in a dictionary, write them into a separate file and then enter them into Anki (sometimes with context, i.e. with the phrase from the film). All in all, a real pain. This is where the idea of the extension came from. I use it today and want to share it with the public. It automates saving words with their context into a text file of this kind:
    Sample Export File
    desecrate [ˈdesɪkreɪt] оскорблять, осквернять, позорить You've desecrated my owls. Weeds S07E12
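
    To illustrate what goes into such a record, here is a minimal sketch that joins five fields into one line. The field order is taken from the sample above; the function names, the tab delimiter and the quoting rule are assumptions made for this example and are not necessarily what say_it_again.lua does.

```lua
-- Sketch: build one export record. Assumed layout (from the sample line):
-- word, transcription, translation, context, source; tab-separated.

-- Quote a field if it contains a tab, a double quote or a line break,
-- doubling any embedded quotes (the usual CSV convention).
local function escape_field(value)
    value = tostring(value or "")
    if value:find('[\t"\r\n]') then
        value = '"' .. (value:gsub('"', '""')) .. '"'
    end
    return value
end

-- Join the five fields into a single line; missing fields become empty.
local function make_record(word, transcription, translation, context, source)
    local fields = { word, transcription, translation, context, source }
    local out = {}
    for i = 1, 5 do
        out[i] = escape_field(fields[i])
    end
    return table.concat(out, "\t")
end

-- Append the record to a file (the path is supplied by the caller).
local function append_record(path, record)
    local f, err = io.open(path, "a")
    if not f then return nil, err end
    f:write(record, "\n")
    f:close()
    return true
end
```

    Anki's text import lets you pick the field separator, so a tab-separated file of this shape can be imported as it is.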

    How to use it?

    The process is admittedly not very user-friendly, but there is nothing to be done about that. The script was tested on VLC 2.0.5 under Windows.
    1. Download the latest say_it_again.lua script from Github (direct link).
    2. Copy it, depending on the platform, to %ProgramFiles%\VideoLAN\VLC\lua\extensions or /usr/share/vlc/lua/extensions.
    3. Find and download a dictionary in the Stardict format (copyright? never heard of it). Personally, I like the English-English Oxford American Dictionary.
    4. Unpack it. The result should be three files per dictionary: *.idx, *.dict, *.ifo. If you get a *.dz file instead of *.dict, unpack it too: it is an ordinary gzip-compatible archive (dictzip).
    5. Download the WordNet database files and unpack them somewhere as well.
    6. Edit say_it_again.lua, setting dict_dir, wordnet_dir and chosen_dict.

    7. Launch VLC and open a video file that has a subtitle file in srt format next to it; then enable the Say It Again item in the View menu.
    8. Voilà: use the y, u, i and backspace keys for the corresponding actions (see above).
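
    The three settings from step 6 might look roughly like this. All values are hypothetical. Whether the script keeps them as plain variables or as fields of a settings table, and whether chosen_dict expects a name or a full path, depends on the script version, so check the comments next to them in say_it_again.lua.

```lua
-- Hypothetical values; replace them with your own folders.
-- Forward slashes also work in Windows paths and avoid escaping backslashes.
dict_dir    = "C:/dict/stardict"          -- folder with the *.ifo, *.idx, *.dict files
wordnet_dir = "C:/dict/wordnet"           -- folder with the unpacked WordNet database
chosen_dict = "OxfordAmericanDictionary"  -- which dictionary from dict_dir to use
                                          -- (assumed here to be its base file name)
```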

    You can do without dictionaries (dict_dir = nil, chosen_dict = nil), but then you will have to type in the meaning of each word by hand and there will be no transcription, so the point of the automation is lost.
    You can also skip WordNet (wordnet_dir = nil), but then word normalization will not work: for the word "was", the verb "be" will not be found in the dictionary.
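
    To show what normalization means here, below is a minimal sketch of WordNet-style lookup for verbs: irregular forms come from an exception list (WordNet's verb.exc contains lines such as "was be"), and regular forms are tried by stripping common endings. The function names and the in-memory lemma table are assumptions for the example; in practice the lemmas would come from the WordNet index files, and the script itself may do this differently.

```lua
-- Sketch of WordNet-style normalization for verbs.

-- Parse exception-list text: each line is "<inflected> <base> [<base> ...]".
local function parse_exceptions(text)
    local exc = {}
    for line in text:gmatch("[^\r\n]+") do
        local inflected, base = line:match("^(%S+)%s+(%S+)")
        if inflected then exc[inflected] = base end
    end
    return exc
end

-- Verb detachment rules: { ending, replacement }, tried in order.
local verb_rules = {
    { "ies", "y" }, { "es", "e" }, { "es", "" }, { "s", "" },
    { "ed", "e" }, { "ed", "" }, { "ing", "e" }, { "ing", "" },
}

-- Return the base form of a verb, or nil if none is known.
-- exceptions: table from parse_exceptions; lemmas: set of known base forms.
local function normalize_verb(word, exceptions, lemmas)
    word = word:lower()
    if exceptions[word] then return exceptions[word] end  -- irregular form
    if lemmas[word] then return word end                  -- already a base form
    for _, rule in ipairs(verb_rules) do
        local ending, replacement = rule[1], rule[2]
        if #word > #ending and word:sub(-#ending) == ending then
            local candidate = word:sub(1, -#ending - 1) .. replacement
            if lemmas[candidate] then return candidate end
        end
    end
    return nil
end
```

    With an exception list containing "was be" and a lemma set containing "desecrate", normalize_verb returns "be" for "was" and "desecrate" for "desecrated".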

    Technical Aspect - VLC and Lua

    Starting with version 1.1, the VLC player can be extended with Lua scripts. To qualify, an extension must meet certain requirements (some are optional, but VLC will complain if they are missing):
    1. Return descriptive information from the descriptor() function. The format of the table can be seen in the ready-made scripts in the VLC add-ons repository;
    2. Have the functions activate() and deactivate();
    3. Depending on the capabilities listed in the descriptor table, the functions input_changed(), meta_changed(), menu() and trigger_menu() are also needed;
    4. An extension can have at most one dialog window. The hack of deleting and re-creating the dialog, again, leads to VLC crashes.
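
    The requirements above can be sketched as a bare skeleton. This is not the Say It Again code: the title, the strings and the menu entry are placeholders, and the handlers only write to VLC's debug log.

```lua
-- Minimal VLC extension skeleton (sketch; all names and strings are placeholders).

-- VLC calls descriptor() to learn about the extension.
function descriptor()
    return {
        title = "Hello Extension",   -- shown in the View menu
        version = "0.1",
        author = "example",
        shortdesc = "Skeleton",
        description = "Does nothing useful; shows the expected functions.",
        -- Each capability listed here requires the matching function below.
        capabilities = { "input-listener", "meta-listener", "menu" },
    }
end

-- Called when the user enables the extension.
function activate()
    vlc.msg.dbg("[skeleton] activated")
end

-- Called when the user disables the extension.
function deactivate()
    vlc.msg.dbg("[skeleton] deactivated")
end

-- Required by "input-listener": the current input (video) has changed.
function input_changed()
    vlc.msg.dbg("[skeleton] input changed")
end

-- Required by "meta-listener": metadata of the current input has changed.
function meta_changed()
end

-- Required by "menu": return the list of menu entries.
function menu()
    return { "Say hello" }
end

-- Required by "menu": id is the position of the chosen entry in that list.
function trigger_menu(id)
    if id == 1 then
        vlc.msg.dbg("[skeleton] hello")
    end
end
```

    The vlc table exists only inside the player, so the file has to be dropped into the extensions folder and enabled from the View menu to see it run.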

    On the whole, I must say that the extension mechanism is pretty crude, and VLC crashes without warning at the slightest provocation. Here are some points I noted for myself:
    • To handle periodic events, it is better to subscribe to the intf-event event than to poll.
    • The callback handler should be "short", i.e. it should not contain time-consuming operations. Since I did not want to introduce parallelism into a system that is already prone to crashing, I had to invent a small hack: unsubscribe from events at the beginning of the handler and subscribe again at the end.
    • Coroutines work, by and large.
    • The best way I found to have several different dialogs is to clear the current one and fill it with controls according to the task.
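
    The unsubscribe-and-resubscribe hack can be sketched in plain Lua as follows. The subscribe and unsubscribe functions are passed in by the caller; inside VLC they would presumably wrap vlc.var.add_callback and vlc.var.del_callback for "intf-event" on the input object. The small event source below is only a stand-in that makes the example runnable outside the player.

```lua
-- Sketch of the "unsubscribe while busy" hack.

-- Wrap a slow function so that no events reach it while it is running.
local function make_guarded_handler(subscribe, unsubscribe, work)
    local handler
    handler = function(...)
        unsubscribe(handler)              -- no new events while we work
        local ok, err = pcall(work, ...)  -- run the slow part, catching errors
        subscribe(handler)                -- sign up again in any case
        return ok, err
    end
    return handler
end

-- Tiny stand-in for an event source, only to make the example runnable.
local listeners = {}
local function subscribe(h) listeners[h] = true end
local function unsubscribe(h) listeners[h] = nil end
local function fire(...)
    -- Copy first, so handlers may (un)subscribe while we iterate.
    local current = {}
    for h in pairs(listeners) do current[#current + 1] = h end
    for _, h in ipairs(current) do h(...) end
end

local calls = 0
local handler = make_guarded_handler(subscribe, unsubscribe, function(event)
    calls = calls + 1
    fire("nested")  -- an event arriving while we are busy is simply dropped
end)

subscribe(handler)
fire("first")
print(calls)  --> 1
```

    The price of this approach is that events fired during the slow part are lost rather than queued, which is acceptable for something like a position update that will come again shortly.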

    Working with the Stardict and WordNet formats, and exporting to Anki

    The Stardict and WordNet formats were described by Bienne in her article, so I will not repeat the description here. Export to Anki goes through a csv file. The order of the fields in it is hardcoded to match the following card model (or, as it is now called, Note Type):
    [Screenshots: the Anki note type and the import process]
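
    For readers who have not seen the Stardict format, here is a minimal sketch of reading its index. It assumes the common layout with 32-bit offsets (each .idx entry is a zero-terminated UTF-8 word followed by a 4-byte big-endian offset and a 4-byte big-endian size pointing into the .dict file) and works on file contents already loaded into strings. The function names are made up for the example and this is not the script's own code.

```lua
-- Sketch: parse the contents of a Stardict .idx file (32-bit offsets assumed).

-- Read a 4-byte big-endian unsigned integer starting at position pos.
local function read_u32_be(data, pos)
    local b1, b2, b3, b4 = data:byte(pos, pos + 3)
    return ((b1 * 256 + b2) * 256 + b3) * 256 + b4
end

-- Build a table: word -> { offset = ..., size = ... }.
-- If a word occurs more than once, the last entry wins in this sketch.
local function parse_idx(data)
    local index, pos = {}, 1
    while pos <= #data do
        local zero = data:find("\0", pos, true)         -- end of the word
        if not zero or zero + 8 > #data then break end  -- truncated entry
        local word = data:sub(pos, zero - 1)
        index[word] = {
            offset = read_u32_be(data, zero + 1),
            size   = read_u32_be(data, zero + 5),
        }
        pos = zero + 9
    end
    return index
end

-- Fetch an article from the (already unpacked) .dict contents.
local function lookup(index, dict_data, word)
    local entry = index[word]
    if not entry then return nil end
    return dict_data:sub(entry.offset + 1, entry.offset + entry.size)
end
```

    How the returned article should be interpreted (plain text, phonetics, markup) is described by the sametypesequence field of the .ifo file, which this sketch ignores.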


    I wrote the extension for my own needs, and its concept changed as I wrote and used it. Hence the somewhat chaotic code and the frightening appearance. Nevertheless, I hope someone will find this work interesting.


    The extension now also works on Linux (tested on Ubuntu 12.10 64-bit, VLC 2.0.5).
