Python for math


The Python language ecosystem is growing rapidly. It is no longer just a general-purpose language: with it you can successfully develop web applications, system utilities and much more. In this note, however, we will concentrate on another application, namely scientific calculations.


We will try to find in the language the functions that are usually expected from mathematical packages, and consider the strengths and weaknesses of using Python instead of MATLAB, Maple, Mathcad, or Mathematica.


Development environment


Python code can be placed in a file with the .py extension and passed to the interpreter for execution. This is the classic approach, usually complemented by a development environment such as PyCharm. However, for Python (and not only for it), there is another way of interacting with the interpreter: interactive Jupyter notebooks, which preserve the intermediate state of the program between executions of code blocks, and the blocks can be run in any order. This style of interaction was borrowed from Mathematica notebooks; an analog later appeared in MATLAB (Live Script).



Thus, all work with Python code moves to the browser. The resulting notebook can be opened with nbviewer.jupyter.org, and GitHub (including gist) can render the contents of such files on its own.


The disadvantages of Jupyter follow from its browser-based nature: there is no debugger, and printing a large amount of output can freeze the browser window. The latter problem is solved by an extension that limits the maximum number of characters displayed as the result of executing a single cell.


Data visualization


For data visualization, the matplotlib library is usually used; its commands closely resemble those of MATLAB. The seaborn library, developed at Stanford, extends matplotlib with plot types aimed at statistics.



Consider an example of building a histogram for a generated data sample.


import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
# example data
mu = 100  # mean of distribution
sigma = 15  # standard deviation of distribution
x = mu + sigma * np.random.randn(10000)
num_bins = 50
# the histogram of the data, normalized so that the bar areas sum to 1
# (density=True replaces the normed argument removed from recent matplotlib)
n, bins, patches = plt.hist(x, num_bins, density=True, facecolor='green', alpha=0.5)
# add a 'best fit' line: the normal density evaluated at the bin edges
# (scipy.stats.norm.pdf replaces mlab.normpdf, also removed from matplotlib)
y = norm.pdf(bins, mu, sigma)
plt.plot(bins, y, 'r--')
plt.xlabel('Smarts')
plt.ylabel('Probability')
plt.title(r'Histogram of IQ: $\mu=100$, $\sigma=15$')
# Tweak spacing to prevent clipping of ylabel
plt.subplots_adjust(left=0.15)
plt.show()


We see that matplotlib syntax is very similar to MATLAB syntax. It is also worth noting that LaTeX is used in the chart title.
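The histogram in the example is normalized, which is what allows a probability density curve to be drawn over it. A minimal sketch of what that normalization means, using numpy's own histogram function instead of a plot (the seed and the variable names are assumptions chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
x = 100 + 15 * rng.standard_normal(10000)

# density=True rescales the bar heights so that the total bar area equals 1
heights, edges = np.histogram(x, bins=50, density=True)
area = np.sum(heights * np.diff(edges))
print(area)
```

The bar heights are therefore densities rather than probabilities: a single bar can be taller or shorter depending on the bin width, while the total area stays equal to one.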


Computational Math


For linear algebra in Python, it is customary to use numpy, whose vectors and matrices are typed, unlike the built-in list type. For scientific calculations, the scipy library is used.
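A short sketch of the difference between a list and a numpy array, followed by a small linear system solved with numpy. The matrix and right-hand side are made-up values used only for illustration:

```python
import numpy as np

mixed = [1, 2.5, "three"]               # a built-in list may mix element types
a = np.array([[2.0, 1.0], [1.0, 3.0]])  # every element of an array shares one dtype
b = np.array([3.0, 5.0])

print(a.dtype)                          # float64

# Solve the linear system a @ x = b
x = np.linalg.solve(a, b)
residual = a @ x - b                    # should be close to zero
print(x, residual)
```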


A guide to moving from MATLAB to numpy was written especially for MATLAB users.


import scipy.integrate as integrate
import scipy.special as special
result = integrate.quad(lambda x: special.jv(2.5,x), 0, 4.5)

In this example, the definite integral of the Bessel function of order 2.5 over the interval [0, 4.5] is computed numerically using the QUADPACK library (Fortran).
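Note that quad returns a pair: the estimate of the integral and an estimate of the absolute error. A sketch that checks this on an integral with a known closed form, the Gaussian integral over the whole real line, which equals the square root of pi (this test integrand is our choice, not part of the original example):

```python
import numpy as np
from scipy import integrate

# Integrate exp(-t^2) from -inf to +inf; the exact value is sqrt(pi)
value, abs_error = integrate.quad(lambda t: np.exp(-t**2), -np.inf, np.inf)

print(value, abs_error)
print(abs(value - np.sqrt(np.pi)))  # actual deviation from the exact answer
```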


Symbolic Computing


You can use the sympy library for symbolic calculations. However, code written with sympy is less elegant than code written in Mathematica, which specializes in symbolic computing.


# python
from sympy import Symbol, solve
x = Symbol("x")
solve(x**2 - 1)


In terms of functionality, sympy is inferior to Mathematica; however, depending on your needs, their capabilities may turn out to be roughly equal for you. A more detailed comparison can be found in the sympy repository wiki.
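To give a feel for what sympy covers beyond solving equations, here is a minimal sketch with a derivative and an antiderivative. The expressions are arbitrary examples:

```python
from sympy import symbols, diff, integrate, solve, sin, cos

x = symbols("x")

roots = solve(x**2 - 1, x)             # roots of x^2 - 1
derivative = diff(sin(x) * x**2, x)    # product rule is applied symbolically
antiderivative = integrate(cos(x), x)  # indefinite integral, constant omitted

print(roots)
print(derivative)
print(antiderivative)
```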


Speeding up the code


Code can be sped up by compiling it to C++ with the theano library. The price for this acceleration is syntax: you now have to write theano-oriented functions and declare the types of all variables.


import theano
import theano.tensor as T
x = T.dmatrix('x')
s = 1 / (1 + T.exp(-x))
logistic = theano.function([x], s)
logistic([[0, 1], [-1, -2]])
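For comparison, the same logistic function can be written in plain numpy, with no compilation step and no type declarations. This is only a reference sketch for checking the values the compiled function should return, not a statement about relative speed:

```python
import numpy as np

def logistic(x):
    """Element-wise logistic function 1 / (1 + exp(-x))."""
    x = np.asarray(x, dtype=np.float64)
    return 1.0 / (1.0 + np.exp(-x))

out = logistic([[0, 1], [-1, -2]])
print(out)
```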

Some neural network libraries, such as Lasagne and Keras, use theano for their calculations. It is also worth adding that theano supports acceleration through GPU computing.


Machine learning


The most popular machine learning library for Python is scikit-learn, which contains all the basic machine learning algorithms, as well as quality metrics and tools for validating algorithms and pre-processing data.



from sklearn import svm
from sklearn import datasets
clf = svm.SVC()
iris = datasets.load_iris()
X, y = iris.data, iris.target
clf.fit(X, y)  
clf.predict(X)
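The example above predicts on the same data the model was trained on, so it says little about quality. A minimal sketch of a hold-out check using the validation tools and metrics mentioned earlier; the 70/30 split and the fixed random_state are assumptions made for reproducibility:

```python
from sklearn import datasets, svm
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()

# Hold out 30% of the samples; stratify keeps class proportions equal in both parts
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0, stratify=iris.target
)

clf = svm.SVC()
clf.fit(X_train, y_train)

# Share of correctly classified samples among those the model has not seen
accuracy = accuracy_score(y_test, clf.predict(X_test))
print(accuracy)
```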

To load data from tabular formats (Excel, CSV), pandas is usually used. The loaded data is represented in memory as DataFrames, to which various operations can be applied: both row-wise (line-by-line processing) and group operations (filters, groupings). An overview of the main functions of pandas can be found in the presentation "Pandas: an overview of the main functions" (by Alexander Dyakonov, Professor, Moscow State University).
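A small sketch of these operations. The CSV content, column names, and numbers are invented for illustration, and the file is read from an in-memory string so that the example is self-contained:

```python
import io

import pandas as pd

csv_text = """city,year,sales
Moscow,2015,10.0
Moscow,2016,12.0
Kazan,2015,4.0
Kazan,2016,6.0
"""

# read_csv accepts a path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

# Row-wise processing: apply a function to each value of a column
df["sales_doubled"] = df["sales"].apply(lambda v: v * 2)

# Group operations: a filter and a grouping with aggregation
recent = df[df["year"] == 2016]
totals = df.groupby("city")["sales"].sum()

print(recent)
print(totals)
```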


Not everything is so smooth ...


However, not everything is so smooth in Python. For example, two versions of the language, 2.x and 3.x, currently coexist. Both are being developed in parallel, but the syntax of the second version is not fully compatible with the syntax of the third.
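A few well-known differences, as a sketch. The code below runs under Python 3, and the comments describe how Python 2 treated the same constructs:

```python
# print is a function in Python 3; Python 2 also allowed the statement form: print "hello"
print("hello")

# Dividing two integers: true division in Python 3, floor division in Python 2
half = 7 / 2          # 3.5 in Python 3, 3 in Python 2
floor_half = 7 // 2   # 3 in both versions

# range is lazy in Python 3; in Python 2 it built a list
numbers = range(3)
is_list = isinstance(numbers, list)  # False in Python 3, True in Python 2

print(half, floor_half, is_list)
```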


Another problem may arise if you do not use Linux: in that case you may have difficulty installing a number of libraries, and some libraries, for example tensorflow, will be completely incompatible.


The libraries in question

P.S. All the Python libraries discussed in this article are open source and distributed free of charge. To install them, you can use the pip command or simply download the Anaconda distribution, which contains all the main libraries.

