
Overview of the most interesting materials on data analysis and machine learning No. 20 (October 27 - November 2, 2014)

Here is the next issue of the review of the most interesting materials on data analysis and machine learning.
General
Russian AI Cup 2014: winner strategy
Moscow Big Data Hackathon November 15-16
HighLoad++ 2014: Data processing in RTB: fast, cheap and 98% accurate (Pavel Kalaidin, RuTarget)
Real-time bidding requires real-time analytics. RuTarget handles a billion banner ad requests per day. How do you determine, for example, how many unique users there are among these requests? Slides from Pavel Kalaidin's talk at HighLoad++ 2014.
HighLoad++ 2014: Thorny path to the Large-Scale Graph Processing (Alexey Zinoviev, Tamtek)
Slides from Alexey Zinoviev's talk at the HighLoad++ 2014 conference on working with large graphs.
HighLoad++ 2014: How we built an analytic platform for several billion events per month (Mikhail Tabunov, Coub)
Another set of slides from the HighLoad++ 2014 conference. In this talk, Mikhail Tabunov from Coub described his experience of building an analytics platform.
New approaches in Deep Learning for pattern recognition
An interesting article from the Microsoft Research blog on advances in applying Deep Learning algorithms to pattern recognition.
Jeff Hawkins on the limitations of neural networks
Recently there has been a lot of noise, news and discussion around the use of neural networks for machine learning. Jeff Hawkins gives a short expert commentary on the limitations of neural networks.
LinkedIn Data Science team news
Some news about the LinkedIn Data Science team from VentureBeat.
Text analysis from the point of view of a business user (part 1)
The first part of a series of articles on text analysis as seen by a non-technical specialist.
Data Analysis Content Index Page
The Analytics Vidhya blog has a useful page that links to a wide variety of content related to data analysis.
25 facts about Big Data
A set of 25 facts about Big Data from the SmartData Collective portal that you may find interesting.
6,000 libraries on CRAN
The number of libraries for the R programming language in the CRAN repository has reached 6,000.
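The RuTarget item above asks how to count unique users in a stream of a billion requests per day. The slides' actual method is not described in this digest, so the following is only a minimal sketch of one common family of approaches, a probabilistic "k minimum values" estimator, which trades a small error for constant memory. The function names, the choice of MD5 as the hash and the default k = 1024 are assumptions made for illustration.

```python
import hashlib
import heapq


def hash_to_unit(value: str) -> float:
    """Map a string to a pseudo-uniform float in [0, 1] using MD5."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def estimate_unique(items, k: int = 1024) -> float:
    """Estimate the number of distinct items, storing at most k hash values."""
    heap = []     # max-heap (via negation) holding the k smallest hashes seen
    kept = set()  # the same hashes, used to skip duplicates cheaply
    for item in items:
        h = hash_to_unit(item)
        if h in kept:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            kept.add(h)
        elif h < -heap[0]:
            # New hash is smaller than the current k-th smallest: swap it in.
            evicted = -heapq.heappushpop(heap, -h)
            kept.discard(evicted)
            kept.add(h)
    if len(heap) < k:
        # Fewer than k distinct hashes were seen, so the count is exact.
        return float(len(heap))
    # If n hashes are spread uniformly over [0, 1], the k-th smallest
    # sits near k / n, which gives the estimate (k - 1) / kth_smallest.
    return (k - 1) / -heap[0]


if __name__ == "__main__":
    # Hypothetical request log: 30,000 requests from 10,000 distinct user ids.
    requests = [f"user-{i % 10000}" for i in range(30000)]
    print(round(estimate_unique(requests)))
```

The typical relative error of this estimator is roughly 1/sqrt(k), so a larger k buys accuracy at the cost of memory.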
Theory and algorithms of machine learning, code examples
Splitting text into sentences with a language-independent method, using the AIF library as an example
Visualization of the curse of dimensionality
A simple and clear visualization of the concept of the curse of dimensionality.
Deep learning as of 2014
Slides from a recent talk by renowned machine learning expert Olivier Grisel on the popular topic of Deep Learning.
Introduction to DeployR Open
A short article from the Revolutions blog about DeployR Open, an interesting product for the R programming language.
Do not start developing a machine learning algorithm by studying someone else's code
An article from the MachineLearningMastery blog with useful tips that will help you deepen your knowledge of machine learning.
Studying how machine learning algorithms work (Part 1)
Another interesting article from the author of the MachineLearningMastery blog on how you can study the way machine learning algorithms work, and why this is useful beyond academic research.
Studying how machine learning algorithms work (Part 2)
More from the author of the MachineLearningMastery blog on how to study the way machine learning algorithms work.
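As a small companion to the curse of dimensionality item above, here is a hedged sketch of one well-known symptom: in high dimensions, the nearest and farthest random points end up at almost the same distance from a query point. This is not taken from the linked visualization. The function name, the uniform unit-cube sampling and the sample size are assumptions chosen for illustration.

```python
import math
import random


def relative_contrast(dim: int, n_points: int = 200, seed: int = 0) -> float:
    """Return (d_max - d_min) / d_min for random points in the unit cube.

    A large value means the nearest neighbour is much closer than the
    farthest one; a value near zero means all points look equally far away.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

Running the loop should show the contrast shrinking as the dimension grows, which is why distance-based methods such as nearest-neighbour search tend to degrade on very wide data.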
Online courses, training materials and literature
Zipfian Academy: Become a Data Scientist in 12 Weeks
An expensive offer from Zipfian Academy, which promises to lead you to a brighter future in Data Science in 12 weeks for $16,000.
The book "Social Media Mining"
An electronic version of the book "Social Media Mining".
A preliminary version of the book "Causal Inference"
A preliminary version of the book "Causal Inference", provided by the author.
The electronic version of the book "Data Blending for Dummies" is available for free download
A free electronic version of the book "Data Blending for Dummies" has been released.
The publication of the book "Data Fluency"
"Data Fluency", a curious book on data analysis, has gone on sale.
Videos
How recommender systems work. Lecture at Yandex
Introduction to Microsoft Azure Machine Learning
A video that explains Azure ML at a fairly simple level and serves as a good starting point for working with this Microsoft solution.
Hadley Wickham on the dplyr library at useR! 2014
A short, interesting talk on the capabilities of the dplyr library by Hadley Wickham (Assistant Professor of Statistics, Rice University), presented at the useR! 2014 conference.
Data engineering
5 Indisputable Facts About Hadoop
A short article from the Big Data Analytics News portal with 5 facts about Hadoop that help you understand when using Hadoop is appropriate and when it is not.
The role of the DBA in the NoSQL world
This article describes the role of the DBA in the modern world of NoSQL data stores.
Using SQL queries in MongoDB
An article about using SQL syntax to query MongoDB with SlamData.
Reviews
DataScienceCentral Weekly Digest
Regular weekly data analysis digest from DataScienceCentral.
The best materials: Big Data Zone (October 24 - 31)
A collection of the best materials from the popular DZone portal on Big Data.
Data Mining News
A small list of interesting Data Mining resources for October 29.
The best materials of the week
The best materials of the week on Data Science from the Data Science Report portal.
The best materials: Big Data Zone (October 17 - 24)
A collection of the best materials from the popular DZone portal on Big Data.
The best materials of the week (October 19 - 25)
The best materials of the week on data analysis from the KDnuggets portal.
The most interesting materials from Freakonometrics No. 179
A collection of the most interesting materials from the popular Freakonometrics portal.
The most interesting materials from Freakonometrics No. 178
A collection of the most interesting materials from the popular Freakonometrics portal.
The most interesting materials on High Scalability
An overview of the most interesting materials from the popular High Scalability portal.
Previous issue: Overview of the most interesting materials on data analysis and machine learning No. 19 (October 20 - 26, 2014)