Overview of the most interesting materials on data analysis and machine learning No. 14 (September 15-21, 2014)


    I present the next issue of my review of the most interesting materials on data analysis and machine learning. I would also like to note that I have released the first digest on high performance and Data Engineering: Overview of the most interesting materials on high performance (September 15-21, 2014). I think it may be of interest to some readers as well.

    General



    Machine Learning Competitions


    • EN Description of the Higgs Boson Machine Learning Challenge winning methodology
      An interesting account from the winner of the Higgs Boson Machine Learning Challenge on Kaggle, in which he describes the approach that brought him success in the competition.
    • RU Kaggle in Class competition on decoding Morse code
      This short post covers a new Kaggle in Class competition called Morse Learning Machine - v1. Participants are expected to build a system that decodes Morse code messages contained in audio files.
    • EN Microsoft Machine Learning Hackathon
      An article from the Microsoft TechNet Machine Learning blog about the Microsoft Machine Learning Hackathon.

    Online courses and training materials



    Literature



    Theory and algorithms of machine learning, code examples


    • EN R Visualization of GPS data
      A good code example for visualizing data from a GPS device using the programming language R.
    • EN R Configuring .Rprofile
      This article covers the useful topic of configuring R startup parameters using the .Rprofile configuration file.
    • EN R Visualizing data with R Caret
      The author of the MachineLearningMastery blog talks about the data visualization options in Caret, a popular machine learning library for the R programming language.
    • EN R Using R Caret for Predictive Modeling
      The author of the MachineLearningMastery blog talks about using the popular Caret library for the R programming language for predictive modeling.
    • EN R Improving a model with R Caret
      The author of the MachineLearningMastery blog talks about ways to improve a machine learning model using the Caret library for the R programming language.
    • EN For newbies R A series of slides on data analysis in R
      In this slide set, Yanchang Zhao covers seven topics in data analysis, with code examples in the R programming language.
    • RU Theory R Diagnostics of linear regression models. Part 1
      The first part of a series of articles on diagnostics of linear regression models from the blog “R: Data Analysis and Visualization”. The code examples in the article are written in R.
    • EN Theory Introduction to Probabilistic Programming
      A pretty good introduction to probabilistic programming, with code examples.
    • EN Sentiment analysis of movie reviews
      An interesting example of text analysis, namely sentiment analysis of movie reviews, using the popular graph database Neo4j and the Java programming language.
    • EN Machine learning in production
      Colin Ristig talks about an important question that is sometimes forgotten: how a machine learning algorithm operates in a production environment.
    • EN Bibliography on Deep Learning
      A large, categorized list of research materials on Deep Learning, a popular machine learning approach.

    Videos


    • EN Video lectures Andrew Ng on Deep Learning
      Andrew Ng of Stanford University gave an interesting presentation on Deep Learning at the 2014 Robotics: Science and Systems Conference.
    • RU Video lectures Moscow Data Science. September 2014 Meetup
      On September 5, I attended a meetup called Moscow Data Science - “September 2014 Meetup”, organized by Mail.ru. The link leads to the video from this meeting; for convenience, I have marked the start time and duration of each speaker's talk.

    Data engineering


    • EN Who uses Hadoop and how
      An interesting article about the current state of the Hadoop ecosystem: who uses it and how, as well as its development prospects.
    • RU Upcoming Data Science meetings in Moscow
      Several interesting meetings are scheduled in the near future, so I decided to publish a short list of upcoming events on data analysis and high performance in Moscow.
    • EN 10 ways to work with Hadoop via SQL queries
      10 tools and approaches for working with Hadoop via SQL queries, with a short description of each.
    • RU Habr Invitation to HadoopKitchen
      An announcement of a Hadoop meetup to be held at the Mail.ru office. I plan to attend this event as well.
    • EN Video lectures Introduction to HBase
      An article with a video and explanatory material on HBase, a data store from the Hadoop ecosystem, which also covers when this solution is a good fit and when it is not.
    • EN Apache Spark 1.1 announcement
      The announcement of the new Apache Spark 1.1 release and a description of its main new features.
    • EN Stream processing in Apache Spark 1.1
      An article on the new stream-processing capabilities in Apache Spark 1.1 and use cases for this functionality.
    • EN R Python Statistical computing in Apache Spark 1.1
      A description of the expanded statistical computing capabilities in Apache Spark 1.1.

    Reviews



    Previous issue:  Overview of the most interesting materials on data analysis and machine learning No. 13 (September 8-14, 2014)

    PS I think that many readers would like to see more materials in Russian, so if anyone can recommend some, I will be very grateful and will add them to the list of resources that I follow.
