What Data Science and Big Data Experts Discuss Today



    Today we went through Quora's ranking of experts on the Data Science topic to see what the most active members of the community are discussing.

    William Chen, a data scientist at Quora, shares his experience with his toolkit. He says his team uses Python and SQL. Many others also use the statistical package R, but because Quora's main codebase is written in Python, William's approach of sticking with Python significantly speeds up his colleagues' work.

    Chen's team regularly uses IPython Notebook and Jupyter to record the results of their calculations. For data analysis, they most often use the pandas, Seaborn, NumPy and SciPy packages. William's colleagues mostly use Sublime Text for development and Unison for file synchronization. Like the Quora developers, his team uses Phabricator for version control and code review.

    It is widely believed that every data scientist must know R, Matlab and Hadoop. Ricardo Vladimiro, a specialist in this field who works at Miniclip, disagrees. In his opinion, to truly immerse yourself in the study of data, you should be well versed in statistics and probability theory, be able to run experiments and test your hypotheses, and know at least one programming language that lets you process "big data".
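
    As an illustration of what "testing a hypothesis" can look like in code, here is a minimal sketch of a two-sided permutation test for a difference in means, using only the Python standard library. The function name, the sample data and the "session length" framing are hypothetical and do not come from Ricardo's answer; this is one simple technique among many, not a recommended workflow.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the estimated p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign values to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Hypothetical session lengths (minutes) for two variants of a game screen.
control = [12.1, 11.8, 12.4, 11.9, 12.0, 12.3, 11.7, 12.2]
variant = [13.0, 12.9, 13.4, 12.8, 13.1, 13.3, 12.7, 13.2]

p_value = permutation_test(control, variant)
print(f"Estimated p-value: {p_value:.4f}")
```

    A small p-value suggests that the observed difference would be unlikely if the two variants really behaved the same; with identical groups the function returns 1.0.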

    Ricardo adds that you also need to understand the domain the data comes from. In addition, the very name "data science" suggests that a specialist will benefit from knowing how data is stored, managed, processed and transmitted. Among personal qualities, he singles out the desire to keep learning, whether that means algorithms, programming languages or business communication skills.

    A data scientist cannot do without programming skills:

    “You will constantly limit yourself instead of achieving the result you want. You can grow only if you leave your comfort zone. Deal with it. Programming isn’t such a complicated thing.”



    If you work in Big Data and want to tackle, say, dynamic pricing problems, you need to be an expert in at least one of these areas: economics, econometrics, finance, statistics or industrial engineering. So says Laszlo Korsos, senior analyst at Uber. William Chen adds that programming skills will be a huge advantage.

    If you think your programming skills are weak, remember that you still have a chance. Paul DeVos of IBM Watson Health recommends paying attention to job openings that focus on analytics. Such positions usually require skills in SQL, Excel, SAS and SPSS. It helps if you can also work with R or the analytical tools for Python (NumPy, pandas, SciPy, scikit-learn, Seaborn, Plotly, Matplotlib). They are a little more complicated than SPSS or Excel, but they can be mastered fairly quickly without significant programming experience.

    When studying data science, writes Joe Blitzstein, practice is the most important thing. Of course, you will learn something from video courses, but that activity is too passive. Practical skills come only from lab work and homework. There is no point in watching videos all day long: most people can barely sit through an hour-long lecture.

    As supplementary reading, Pandora's director of research Michael Hochster recommends Scott McCloud's Comic Mechanics. In his view, a large part of data analysis work is communicating with words and pictures, as in comics. The book is full of deep reasoning and examples and, according to Hochster, is more useful and interesting than the standard literature on data visualization.



    Today, Big Data specialists can be divided, for example, into those who analyze data in Excel and those who write models in R or Python. Dima Korolev, who has worked at Google, Facebook and Microsoft, believes that generalists with broad expertise will soon be in demand, much like the idea of a “universal full-stack developer”.

    Shane Ryoo, an Apple developer and founder of several IT startups, explains how he hires data science specialists. First of all, a candidate should be able to program well in Python, C/C++ and/or Java; knowledge of R, Matlab and other languages is of no interest to Shane. The candidate should be able to design algorithms and, preferably, understand machine learning. In addition, the candidate should be able to explain in detail the maximum likelihood method, Bayes' theorem, the Viterbi algorithm and regularization, and ideally be able to write an article on these topics.
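
    To give a sense of one of the topics on that list, here is a minimal sketch of the Viterbi algorithm, which finds the most likely sequence of hidden states in a hidden Markov model. The toy "Healthy/Fever" model and all of its probabilities are a commonly used illustrative example and are assumptions here; they are not taken from Shane's answer.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely hidden-state sequence."""
    # best[t][s] = (probability of the best path ending in state s at step t,
    #               the state that path was in at step t - 1)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]

    for obs in observations[1:]:
        prev = best[-1]
        step = {}
        for s in states:
            # Pick the predecessor state that maximizes the path probability.
            step[s] = max(
                (prev[p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(step)

    # Start from the most probable final state and follow the back-pointers.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for t in range(len(best) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    path.reverse()
    return prob, path


# Hypothetical toy model: hidden health state, observed symptoms.
states = ("Healthy", "Fever")
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
}
emit_p = {
    "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}

prob, path = viterbi(("normal", "cold", "dizzy"), states, start_p, trans_p, emit_p)
print(path, prob)
```

    For long sequences, practical implementations usually work with log-probabilities to avoid numerical underflow; the sketch multiplies raw probabilities to keep the idea visible.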

    Many people want to know how much specialists in this field earn. Paul DeVos claims that last year in Dallas, for example, the average salary was about $130,000. He knows three specialists who earn that much. “Each one has different experience, and each one has a master’s degree,” says Paul.

    Elias Abou Haydar, a data scientist at iGraal, believes that the most successful colleagues stand out for their effective communication skills, in particular in working with the media. He notes that this does not mean at all that other, less visible specialists are any worse.

    “We have a hard time, especially when there are a lot of people around who brag more than they actually do,” writes Elias. Of course, experience and skill in solving complex analytical problems also play an important role.

    Working with data forces you to communicate with people from different departments of your company. As a result, you find yourself at the center of events, so you need to understand which areas the business operates in, what employees do and how you can work with them. This is why working in Big Data gives you a clear advantage over specialists from adjacent fields, as well as more career opportunities.
