
Overview of the most interesting materials on data analysis and machine learning No. 37 (February 23 - March 1, 2015)

Here is the latest issue of this roundup of the most interesting materials on data analysis and machine learning.
General
How we train future big data professionals
Visualization of Data Science templates - a clear and engaging infographic.
New features of RStudio (v0.99 Preview): Code Completion
IPython: version 3.0 released
Pulsar: eBay real-time data analysis framework
Deep learning without high costs - a short article from HighScalability.com explaining that you can start experimenting with deep learning right away, without a large financial investment.
Machine Learning Libraries - a large list of machine learning libraries, presented as a periodic table and divided into several categories: Big Data, Lua/JS/Clojure, Computer Vision, NLP, C/C++, R/Julia, Java, Scala, Python.
Theory and algorithms of machine learning, code examples
Big Data Learning: Spark MLlib
Unusual Playboy models, or about detecting outliers in data using Scikit-learn
Google AI mastered 49 old Atari games on its own
Mistakes to Avoid When Using Machine Learning
Studying Users Through Twitter Data Analysis and Machine Learning
Machine Learning Errors - the author describes several typical mistakes made by people who apply machine learning algorithms to their problems.
Google R Code Design Standards (Google's R Style Guide)
Does balancing classes improve classifier results?
An algorithm for choosing K in k-means clustering - an interesting feature of the BigML library.
Deep Speech: Accurate Speech Recognition with Deep Learning and GPUs
Visualization of clusters using R
Comparison of supervised learning algorithms
A series of lessons in machine learning and natural language processing. Lesson 4: Naive Bayes Classifier
Machine Learning Competitions
A participant's diary from the Avazu Kaggle machine learning competition
Machine Learning Contest: Diabetic Retinopathy Detection
Online courses, training materials and literature
New course announcement: Introduction to Data Science - note that the course is paid.
Book Review: Mastering Scientific Computing with R
Free eBook: Hadoop for Dummies
Free eBook: Software Defined Storage for Dummies
Videos, podcasts
Interview with Andrew Ng at the Deep Learning Summit in San Francisco
Scaling Machine Learning with R and the H2O Library
Talking Machines: Episode 5: Interviews with Geoffrey Hinton, Yoshua Bengio, and Yann LeCun: The Inside Machine Learning Story - the fifth episode of the Talking Machines podcast, this time a conversation with heavyweights of the field: Geoffrey Hinton (Google, University of Toronto), Yoshua Bengio (University of Montreal) and Yann LeCun (Facebook, NYU).
Data engineering
Apache Spark: what's under the hood?
Real-time log analysis with Apache Kafka, Cloudera Search and Hue
Big Data Streaming: Storm, Spark, and Samza
Big Data Processing in Apache Spark
Using MongoDB with Hadoop and Spark: Part 1 - Basics and Setup
Beginning of a New Era: Apache HBase Version 1.0 Release
The beta version of Hive-on-Spark is now available for download
Reviews
Interesting news from the world of R (February 23 - March 1, 2015)
Best Content of the Week from KDnuggets.com (February 15-21)
DataScienceCentral Weekly Digest (March 2)
Data Science News from MyDataMine.com (February 27)
Big Data News from MyDataMine.com (February 24)
Best Resources of the Week from Data Elixir (No.24)
The weekly collection of the best materials from R1Soft (February 27)
The most interesting materials on High Scalability (February 27)
Previous issue: Overview of the most interesting materials on data analysis and machine learning No. 36 (February 16 - 22, 2015)