Overview of the Data Scientist profession

    A Data Scientist is an analytical data expert who has the technical skills to solve complex problems and the curiosity to work out which problems are worth solving. They are part mathematician, part computer scientist, and part trend spotter.

    The job requires real, practical knowledge of statistical data analysis methods, the skills to build mathematical models (from neural networks to clustering, from factor analysis to correlation analysis), experience with large data sets, and a knack for finding patterns. But enough preamble. Let's get down to business.

    The average Data Scientist salary in the USA is $91 thousand per year. Here is a graph of earnings versus work experience.


    PayScale Data

    In Russia, salaries start at 60-70 thousand rubles per month for complete beginners and reach 220 thousand for experienced specialists.

    According to DJ Patil, former U.S. Chief Data Scientist at the White House Office of Science and Technology Policy, “A data scientist is a specialist with a unique blend of skills who makes amazing finds and tells fantastic stories, all thanks to the data.”

    What do Big Data specialists actually do? They constantly run into limitations (technical, methodological, and otherwise) and find ways around them. They make discoveries by analyzing and predicting. There is room for creativity in Data Science: specialists devise elegant solutions to complex problems and present information through high-quality visualization, making patterns understandable and compelling.

    An example from the life of a Data Scientist: “Jonathan Goldman, a physicist from Stanford, took a job at the social network LinkedIn and began doing work that could not be measured by KPIs or judged by a tangible end result such as a site, a bug fix, or a new feature. While the development team was puzzling over how to modernize the site and cope with the influx of visitors, Goldman built a predictive model that suggested to each LinkedIn account holder which other users of the site they might know. After convincing the company’s management to test his new model, Goldman brought the social network millions of new page views and significantly accelerated its growth.”
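
    To make the idea of a "people you may know" model more concrete, here is a minimal sketch in Python. It is not Goldman's actual model, which this article does not describe. It uses a simple common-connections heuristic as a stand-in, and the function name and the toy network are hypothetical.

```python
from collections import Counter


def suggest_connections(graph, user, top_n=3):
    """Rank users who are not yet connected to `user` by shared connections.

    `graph` maps each user name to the set of names they are connected to.
    This is a plain common-neighbors heuristic, used here only as an
    illustration of the general idea.
    """
    direct = graph.get(user, set())
    scores = Counter()
    for friend in direct:
        for candidate in graph.get(friend, set()):
            # Skip the user themselves and people they already know.
            if candidate != user and candidate not in direct:
                scores[candidate] += 1
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical toy network (undirected: each link is listed in both directions).
network = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan", "eve"},
    "cat": {"ann", "dan"},
    "dan": {"bob", "cat"},
    "eve": {"bob"},
}

suggestions = suggest_connections(network, "ann")
print(suggestions)
```

    In this toy network "dan" shares two connections with "ann" and "eve" shares one, so "dan" is suggested first. A production model would weigh many more signals than shared connections.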

    There is no definitive description of this profession; it depends on the range of data skills involved. However, there are things that every Data Scientist does:

    • Gathering large amounts of messy data and converting it into a more convenient format.
    • Solving business problems using data.
    • Work with various programming languages, including SAS, R and Python.
    • Work with statistics, including statistical tests and distributions.
    • Use of analytical methods such as machine learning, deep learning and text analytics.
    • Collaboration with IT and business equally.
    • Search for order and patterns in data, as well as identifying trends that can help achieve the final business result.

    And here are the terms and technologies that the future Data Scientist needs to know:

    • Data visualization: Presentation of data in graphical format so that it can be easily analyzed.
    • Machine Learning: A branch of artificial intelligence based on mathematical algorithms and automation.
    • Deep Learning: An area of study in machine learning that uses data to model complex abstractions.
    • Pattern Recognition: A technology that recognizes patterns in data (often used interchangeably with machine learning).
    • Data Preparation: The process of converting raw data to a different format to make it easier to consume.
    • Text analytics: The process of analyzing unstructured text data to extract key business insights.
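
    Two of the terms above, data preparation and text analytics, can be illustrated with a short Python sketch. The review texts, the tiny stop-word list, and the function names are all hypothetical and chosen only for illustration. Real projects usually rely on dedicated libraries rather than hand-written tokenizers.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "is", "and", "to", "was", "but", "it"}


def prepare(raw_reviews):
    """Data preparation: lowercase each review and split it into word tokens."""
    return [re.findall(r"[a-z']+", text.lower()) for text in raw_reviews]


def top_terms(token_lists, n=3):
    """Text analytics: return the n most frequent non-stop-word terms."""
    counts = Counter(
        token
        for tokens in token_lists
        for token in tokens
        if token not in STOP_WORDS
    )
    # Most frequent first; ties are broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Hypothetical raw, unstructured customer reviews.
raw_reviews = [
    "Delivery was slow, but the support team is great!",
    "Great price. Slow delivery.",
    "Support answered fast and the delivery was free.",
]

tokens = prepare(raw_reviews)
terms = top_terms(tokens)
print(terms)
```

    Even on three sentences the counts point at what customers talk about most (delivery), which is the kind of signal text analytics is meant to surface.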

    Among other things, you need to know and understand:

    • Statistics and machine learning.
    • SAS, R, or Python programming languages.
    • MySQL and Postgres databases.
    • Data visualization and reporting technologies.
    • Hadoop and MapReduce.

    Here you can read how Beeline interviews candidates for a Data Scientist position: “The process begins with a phone interview with questions on certain branches of mathematics. After that, the candidate gets a test assignment: a specific machine learning problem similar to those on kaggle.com. Having built a good algorithm and achieved a high value of the quality metric on the test sample, the candidate is admitted to the next stage: an in-person interview that tests knowledge of machine learning and data analysis methods and includes non-trivial practical questions and logic puzzles.”
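
    The phrase "quality metric on the test sample" can be shown with a small Python sketch. This is not Beeline's actual assignment: the data, the one-feature threshold classifier, and the choice of accuracy as the metric are all assumptions made for illustration.

```python
from statistics import mean


def fit_threshold(samples):
    """Learn a cutoff halfway between the mean feature value of each class."""
    negatives = [x for x, label in samples if label == 0]
    positives = [x for x, label in samples if label == 1]
    return (mean(negatives) + mean(positives)) / 2


def predict(threshold, x):
    """Predict class 1 when the feature exceeds the learned cutoff."""
    return 1 if x > threshold else 0


def accuracy(threshold, samples):
    """Quality metric: the share of samples whose label is predicted correctly."""
    hits = sum(predict(threshold, x) == label for x, label in samples)
    return hits / len(samples)


# Hypothetical (feature, label) pairs. The model sees only the training sample.
train = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
# The test sample is held out and used only to measure quality.
test = [(2.5, 0), (4.0, 0), (5.0, 0), (7.5, 1)]

threshold = fit_threshold(train)
test_accuracy = accuracy(threshold, test)
print(threshold, test_accuracy)
```

    The key point is the separation: the cutoff is learned from the training sample only, and the score is computed on data the model has never seen, which is how Kaggle-style tasks are judged.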

    And yes, you can enter Data Science not from scratch but with a solid background. Here is what a physicist who graduated from university and traded science for Big Data writes: “A company called Bidgely offered me a Data Scientist position with a salary of $130k per year gross (approximately $7,400 per month net): working in an office in Sunnyvale, in Silicon Valley, a couple of kilometers from the headquarters of Google, LinkedIn, and Apple.” In January he decided to move into Data Science, and by October he was working in the United States, having graduated from university in June.

    So, by now you have realized that a Data Scientist is someone who can not only extract and analyze large amounts of data but also process them, working real magic with the help of many tools. If you want to do Data Science for real, arm yourself with more than just Excel: pick up Python and a calculus textbook, and get ready to learn.

    Finally, we wanted to leave you with a few useful links. The first has 51 free books related to Data Science. And here is the largest Data Science community. There is also an excellent textbook by Peter Flach, “Machine Learning: The Art and Science of Algorithms that Make Sense of Data”, which has been translated into Russian.

    From the editors


    If you want to become a Data Scientist, we recommend signing up for our five-month full-time course. After training, you will receive a diploma of professional retraining in the specialty “Data Analyst / Machine Learning Specialist”. The teachers are practicing experts from Yandex Data Factory, OWOX, Rambler, Sberbank-Technology, Microsoft, MTS, and other companies. The training is built not only on theory but also on mandatory hands-on practice, so you will finish the course as a trained specialist who can go into any field that interests you: retail, banking, startups, IT, or telecom. All the details are here.
