shulyndina July 12, 2017 at 14:23

Interview with Tinkoff Bank programmer Andrei Stepanov about Python and ML

Our series of interviews with PyCon Russia speakers continues with a conversation with Andrei Stepanov, a developer and analyst at Tinkoff Bank. We talked with him about Python's place in the bank's infrastructure, machine learning, and speech recognition technology.

- Andrey, how did you get into Python development?

- Basically, like many in my field, through ML. But that would not be the whole truth. I was fond of Python before that, you could even say since school. There was also a classroom course on it at the ShAD. At first the syntax scared me: until then I had only studied C-like languages, and after them, relying on spaces and tabs to structure a program did not seem entirely reasonable. But then I got used to it and came to like the simplicity, and after a while I understood what a rich and multifaceted language it really is, and what crazy things you can do with it. You don't often have to do such things at work, but it's nice to know that the tool you work with is flexible and powerful enough.

- You have an interesting position: analyst-developer. What do analyst-developers do? What are you working on now?

- There are not many of us analyst-developers in the bank. The role is a cross between a big data specialist and a production developer. You need to be able to write clean, maintainable code, and also to work with data, run ML experiments, analyze the results, and update the mathematical models accordingly. Or look for new approaches to the problem if the current ones do not work. And if they do work, and everything is fine from the business side, you then try to take the prototype to production. So the technology stack is very wide. The work is interesting and rather time-consuming: you have to read a lot of docs about new technologies and generally follow the news both in ML and in the technologies chosen for the production stack.

Right now I am working on a speech recognition project for the bank.

- What technologies does Tinkoff use? Do you have a lot of ML? What place does Python have in the infrastructure?

- Our team uses Python, TensorFlow, Docker, Protocol Buffers, gRPC, and Cython. Other teams, of course, may have a different technology stack.

There is a lot of ML. In production, simple, fast, and easily interpretable ML models are especially valued. We are also developing the now-trendy Deep Learning. Speech recognition and dialogue systems all rely on deep neural networks. Neural networks may bring a qualitative leap in client products and services in the future, so we pay special attention to them.

So far, Python's role in the infrastructure of our ML solutions is as the language for experimenting and training initial models. It has everything needed for ML in production, and we are trying to explore this area. Ideally, I would like to have a tool that lets analysts export trained models and provides a fast external API for evaluating them. Something like this already partly exists for Deep Learning: I mean TensorFlow Serving. It would be cool if such a tool existed for all commonly used ML models.
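
As a rough illustration of the "export a trained model and put an API in front of it" idea, here is a minimal sketch using only the Python standard library. It is not Tinkoff's tooling and not TensorFlow Serving: the toy linear model, the JSON file format, and the names `export_model`, `load_model`, `predict`, `make_handler`, and `serve` are all assumptions made for the example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def export_model(weights, bias, path):
    """Save a toy linear model to JSON (a hypothetical export format)."""
    with open(path, "w") as f:
        json.dump({"weights": list(weights), "bias": bias}, f)


def load_model(path):
    """Load a model previously written by export_model."""
    with open(path) as f:
        return json.load(f)


def predict(model, features):
    """Compute the linear score w . x + b for one feature vector."""
    if len(features) != len(model["weights"]):
        raise ValueError("feature vector has the wrong length")
    return sum(w * x for w, x in zip(model["weights"], features)) + model["bias"]


def make_handler(model):
    """Build a handler class that answers POST requests with a prediction."""

    class PredictHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            try:
                payload = json.loads(self.rfile.read(length))
                body = {"prediction": predict(model, payload["features"])}
                status = 200
            except (ValueError, KeyError, TypeError) as exc:
                # Malformed JSON, missing "features", or wrong input shape.
                body = {"error": str(exc)}
                status = 400
            data = json.dumps(body).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

    return PredictHandler


def serve(model_path, port=8000):
    """Serve the exported model on localhost until interrupted."""
    model = load_model(model_path)
    HTTPServer(("127.0.0.1", port), make_handler(model)).serve_forever()
```

In this sketch an analyst would call `export_model(...)` after training, and a separate process would call `serve("model.json")`; a client then sends a POST request with a JSON body such as `{"features": [2.0, 1.0]}` and receives a JSON response containing the prediction. A real serving tool would also need model versioning, batching, and a faster transport, none of which is shown here.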

At last year’s PyCon, Martin Gorner from Google spoke in detail about TensorFlow

- How scary is it to write code for a bank?

- In R&D it's not scary at all. Should it be? :) It is probably harder for the people who work closer to personal data and money, but they also have more external oversight of the process.

- As far as I know, Tinkoff was the first in Russia to introduce technology for recognizing customers by their voiceprint. Is that true? Were you involved in the development? Can you tell us more about this feature?

- If I understand correctly, you are talking about a joint project with NICE Systems. We introduced voice recognition in 2014, before I joined Tinkoff.

This technology has made it possible to reduce the number of additional questions put to the client, or even to dispense with them. Calls are on average 40-60 seconds shorter, which lets us save significantly on traffic.

The voice recognition system improves both the level of service, since customers do not need to answer the same questions on every call to the bank, and the efficiency of the support service, since operators can get straight to resolving the issue. Voice biometrics can also increase the level of security of transactions.

- Should we expect voice assistants and voice technology to become proper conversation partners?

- Over time, I think so. But no one knows how much time that will take.

Right now everything, even taking into account the latest developments in GPGPU, is limited by the computational capabilities of modern hardware and by the development of efficient neural network architectures. The main problem is not even training, although better hardware would help there too, but providing millions of users with the compute time needed to run the neural network on their data. Google's recently introduced Tensor Processing Unit (TPU) and the specialized Deep Learning compute units in the GPU cores of NVIDIA's new architecture can partially solve this problem, but it will take developers and scientists a long time to master these technologies.

Yes, humanity has taken a huge step in AI compared to where it was, for example, 10 years ago, but I think full-fledged artificial intelligence is still far off.

- In your opinion, in what direction will these technologies develop in the coming years?

- I think that in the near future AI will seriously augment human abilities in areas where people are traditionally considered superior to computers (voice assistants, chatbots, and dialogue systems are exactly such areas). The interfaces will still keep a somewhat artificial, computer-like "appearance": talking naturally with a voice assistant about an arbitrary topic will not be possible yet, but within a fixed domain a conversation, even a natural one, may well take place.

- You probably know what is happening in the world of fintech. What is the role of Python in this industry?

- Python has established itself as a great language for experimenting with ML. Things are not so simple in production, and here I mean mostly speed. But the computationally intensive part of the code can always be implemented in Cython or as a C extension, and then speed should not be a problem. Python has a very cool community, and you can almost always cover your needs with external packages.

- What tools do you use to organize your work (including planning your time, organizing your workspace, etc.)?

- I prefer an almost empty desk with a minimum of essentials. For planning, I use a sheet of paper and my head, and sometimes Google Keep for long-term tasks.

- Do you read any professional blogs? What resources could you recommend to colleagues for developing their skills?

- For ML and Deep Learning, I like the blogs of Andrej Karpathy and WildML. True, they have not been updated for a long time. There is also a cool TensorFlow blog by a Frenchman. I also really liked David Beazley's video lecture on metaprogramming in Python 3. And Python and its standard library have very cool docs; I always find something new in them.

On July 17 at PyConRu, Andrey will hold a large workshop, “Speech Recognition in Python without PhD”, where he will explain how to write and train your own simple speech recognition engine with TensorFlow and neural networks in the shortest possible time.

Thanks to our sponsors who make the conference possible: gold sponsor Adcombo, energy and good mood partner CIAN, silver sponsors Rambler & Co and DomKlik, and bronze sponsor MediaScope. Thanks to the Python Software Foundation for its support.
