ConfigParser and Unicode

    Python has a very convenient module, ConfigParser, for reading and writing ini-style configuration files.

    While using it, I ran into a problem with saving Unicode strings to a file. In some cases (I hit it when the application was running under Windows XP), reading or writing such parameters raises a string conversion error.
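The post does not show the failing call, so the following is only a sketch of the likely mechanism, assuming the usual Python 2 setup where the default encoding is ASCII: calling str() on a unicode object implicitly encodes it as ASCII, and that fails for any non-ASCII character. The snippet performs the same encoding step explicitly; the helper name `fits_in_ascii` is hypothetical and not part of ConfigParser.

```python
# -*- coding: utf-8 -*-
# Sketch: the ASCII encoding step that str() performs implicitly on
# unicode objects in Python 2 (assuming the default encoding is ASCII).


def fits_in_ascii(text):
    """Return True if the text can be encoded as ASCII, False otherwise."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


plain_value = u"timeout"
cyrillic_value = u"\u043f\u0440\u0438\u0432\u0435\u0442"  # non-ASCII text

print(fits_in_ascii(plain_value))
print(fits_in_ascii(cyrillic_value))
```

Values that pass this check are saved without trouble, which may be why the error only shows up for some data.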

    I could not find a ready-made solution on the Internet. There are plenty of questions along the lines of "how do I make it always work", but the usual answer is "ask the module's author to fix it".

    I want to offer my solution for those who use Python 2.x. It is quite simple and solves this problem.



    First, subclass RawConfigParser and override its write() method, replacing every str() call with a unicode() call:

    import ConfigParser

    class UnicodeConfigParser(ConfigParser.RawConfigParser):

        def __init__(self, *args, **kwargs):
            ConfigParser.RawConfigParser.__init__(self, *args, **kwargs)

        def write(self, fp):
            """Same as RawConfigParser.write(), but uses unicode() instead of str()."""
            if self._defaults:
                fp.write("[%s]\n" % ConfigParser.DEFAULTSECT)
                for (key, value) in self._defaults.items():
                    fp.write("%s = %s\n" % (key, unicode(value).replace('\n', '\n\t')))
                fp.write("\n")
            for section in self._sections:
                fp.write("[%s]\n" % section)
                for (key, value) in self._sections[section].items():
                    # "__name__" is an internal entry, not a real option
                    if key != "__name__":
                        fp.write("%s = %s\n" %
                                 (key, unicode(value).replace('\n', '\n\t')))
                fp.write("\n")

        # Override the default conversion of option names to lower case,
        # so that they are saved as is.
        def optionxform(self, strOut):
            return strOut



    Second, both writing and reading the configuration file must go through codecs.open(), the wrapper for open() from the codecs module, with utf-8 given as the encoding. For loading, this means using readfp(), which accepts an already opened file object, rather than read(), which takes file names and opens the files itself:

    import codecs

    # Saving (UnicodeConfigParser is the class defined above)

    confFile = codecs.open('myConfig.ini', 'w', 'utf-8')
    config = UnicodeConfigParser()
    # ...
    config.write(confFile)
    confFile.close()

    # Loading

    config = UnicodeConfigParser()
    confFile = codecs.open('myConfig.ini', 'r', 'utf-8')
    config.readfp(confFile)
    confFile.close()
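To see what the codecs.open() wrapper contributes on its own, here is a minimal sketch that writes a Unicode string through it and reads the string back. It leaves the parser out entirely; the helper name `utf8_roundtrip` and the sample ini text are made up for illustration.

```python
# -*- coding: utf-8 -*-
# Sketch: codecs.open() encodes on write and decodes on read, so the
# caller only ever handles Unicode text.
import codecs
import os
import tempfile


def utf8_roundtrip(text):
    """Write text to a temporary file as UTF-8 and return what is read back."""
    fd, path = tempfile.mkstemp(suffix=".ini")
    os.close(fd)
    try:
        out = codecs.open(path, "w", "utf-8")
        try:
            out.write(text)  # encoded to UTF-8 bytes by the wrapper
        finally:
            out.close()
        src = codecs.open(path, "r", "utf-8")
        try:
            return src.read()  # decoded back to Unicode text
        finally:
            src.close()
    finally:
        os.remove(path)


sample = u"[user]\nname = \u0418\u0432\u0430\u043d\n"
restored = utf8_roundtrip(sample)
print(restored == sample)
```

Because the wrapper does the encoding and decoding, write() and readfp() only need to avoid forcing values through str(), which is what the subclass takes care of.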


    I hope someone finds this useful. If you have a more elegant solution, I will be glad to hear it.
