Overview of the most interesting materials on data analysis and machine learning No. 23 (November 17 - 23, 2014)
Here is the next issue of the review of the most interesting materials on data analysis and machine learning.
General
- Modeling pandemics in the Wolfram Language (Mathematica 10), using Ebola as an example
- Interesting things from the world of R (November 10-16, 2014)
- DataTalks 10/25/14: first meeting
- IBM Launches Joint Master's Degrees in Big Data with Leading Russian Universities
- Why Twitter is an easy target for social analytics
- Google and Stanford Build a Neural Network Capable of Describing Photos
- 9 Skills to Become a Data Scientist
- A little more material from HighLoad++ 2014 - the last batch of slides from talks at HighLoad++ 2014, the conference for developers of high-load systems. Not all of them relate to machine learning and data analysis, but many may be of interest.
- Apache Mahout vs. Weka - a short comparison of two popular products.
Theory and algorithms of machine learning, code examples
- Introduction to Unsupervised Learning with scikit-learn
- Efficiently cleaning text using Python
- Introduction to Deep Learning in Python
- An overview of libraries for data analysis in Python
- Factor analysis versus principal component analysis
- Code example: dplyr - dynamic grouping by field
- Code example: combining multiple data.frame objects in R
- One-dimensional linear regression - a good article on linear regression with a single variable.
- Using exploratory data analysis to better understand the problem and improve the result - another interesting article from the author of the MachineLearningMastery blog, this time focusing on the use of exploratory data analysis.
- Ask a Data Scientist: Unsupervised Learning - another article in the Ask a Data Scientist series from the popular insideBIGDATA portal; this installment focuses on unsupervised learning.
- Visualizing forecasts - a good article on various ways to visualize forecasts in the R programming language.
- Basics of data analysis using R - a good set of slides from a talk on the basics of data analysis in the R programming language.
Online courses, training materials and literature
- The book “Statistical Inference for Everyone” - a link to the free version of the book, plus links to additional materials that may be useful when working through it.
Videos
- Introduction to Revolution R Open and DeployR Open
- Video lectures from the Machine Learning Summer School 2014 (Reykjavik)
- An introduction to support vector machines - a good lecture on the basics of support vector machines from one of the MIT courses.
- An introduction to reinforcement learning - a good primer on the topic.
Data engineering
- Using full-text indexing and search in PostgreSQL
- How and why Yandex disables its own data centers
- Apache Hadoop: not only MapReduce - a short article from the Analytics Vidhya blog about what Apache Hadoop can do besides MapReduce.
- Apache Hive on Apache Spark - an article from the Cloudera company blog demonstrating Apache Hive running on Apache Spark, which is increasingly the obvious successor to MapReduce when working with Apache Hadoop.
- Big Data 101: Partitioning - a continuation of the discussion of the basics of distributed computing and data storage, this time covering partitioning.
Reviews
- DataScienceCentral Weekly Digest (November 24)
- Best Content of the Week from KDnuggets.com (November 9 - 15)
- Data Mining News from MyDataMine.com (November 19)
- Digest of the best resources from DataScienceCentral (November 17)
- The most interesting materials from Freakonometrics No. 186
- The most interesting materials from Freakonometrics No. 185
- The most interesting materials from Freakonometrics No. 184
- Best Resources of the Week from Data Elixir (No. 11)
- Best Content: NoSQL Zone (November 7-14)
- The weekly collection of the best materials from R1Soft (November 21)
- The most interesting materials on High Scalability (November 21)
Previous issue: Overview of the most interesting materials on data analysis and machine learning No. 22 (November 10 - 16, 2014)