Data scientist salary range midpoint
$125,250. Source: Robert Half Technology 2020 Salary Guide
In a nutshell: What is a data scientist?
Data scientists combine technical prowess with scientific and social knowledge to solve business challenges with data. That work includes building artificial intelligence (AI) and machine learning (ML) models to address issues large and small.
The role is relatively new, yet of the utmost importance to businesses as AI/ML becomes mainstream and "data-driven business" becomes an imperative rather than a buzzword.
What skills are needed?
Data scientists need deep knowledge of statistics and of at least one area of machine learning or artificial intelligence. They must be able to build highly specialized mathematical models and have a thorough understanding of ML algorithms. Preferably, they also have basic programming skills in R and/or Python and a good understanding of distributed data and computing tools such as MapReduce, Hadoop, Hive, and Spark, along with related tools such as the Gurobi optimization solver and the MySQL relational database.
How to stand out in an interview
Data scientists are twice as likely as the average technical professional to have a secondary degree and often come from surprising backgrounds, so prospective candidates need to find non-traditional avenues to stand out from the crowd.
As with any technical role, creating public projects that hiring managers can view is a great way to show off your skills. The projects don't need to be work-related; personal projects are a good way to convey your passion. They also help display curiosity, a key trait for data scientists: the more you can demonstrate a drive to keep asking questions, the better.
Bonus: Sample interview question
Question: Can you describe the techniques of data wrangling?
Answer: Data wrangling involves cleaning data by finding and replacing missing values, removing duplicate values, and detecting outliers and anomalies. Transforming categorical data to numerical data using data science libraries is also critical, as is merging multiple sets of data into a single dataset.
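To make the answer concrete in an interview, it can help to walk through a small example. The Python sketch below illustrates each step named above using only the standard library. Everything in it is hypothetical: the sample records, the column names, the helper function names, and the outlier threshold `k` are assumptions made for illustration. In practice, a data science library such as pandas would typically handle these steps.

```python
from statistics import median

# Hypothetical sample records; None marks a missing value.
customers = [
    {"id": 1, "plan": "basic", "monthly_spend": 20.0},
    {"id": 2, "plan": "pro", "monthly_spend": None},
    {"id": 2, "plan": "pro", "monthly_spend": None},     # exact duplicate
    {"id": 3, "plan": "basic", "monthly_spend": 22.0},
    {"id": 4, "plan": "pro", "monthly_spend": 25.0},
    {"id": 5, "plan": "basic", "monthly_spend": 500.0},  # suspiciously large
]
# Hypothetical second dataset to merge in; id 5 has no entry.
regions = {1: "north", 2: "south", 3: "north", 4: "east"}


def drop_duplicates(rows):
    """Keep the first occurrence of each identical record."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def fill_missing(rows, column):
    """Replace missing values with the median of the observed values."""
    fill = median(r[column] for r in rows if r[column] is not None)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def flag_outliers(rows, column, k=5.0):
    """Flag values more than k median absolute deviations from the median.

    The threshold k is an arbitrary choice for this sketch.
    """
    values = [r[column] for r in rows]
    center = median(values)
    mad = median(abs(v - center) for v in values)
    return [{**r, "is_outlier": abs(r[column] - center) > k * mad} for r in rows]


def encode_category(rows, column):
    """Map each category label to an integer code, in sorted label order."""
    labels = sorted({r[column] for r in rows})
    codes = {label: i for i, label in enumerate(labels)}
    return [{**r, column + "_code": codes[r[column]]} for r in rows]


def merge_lookup(rows, lookup, key, new_column):
    """Left-join a lookup table onto the rows; unmatched keys get None."""
    return [{**r, new_column: lookup.get(r[key])} for r in rows]


cleaned = drop_duplicates(customers)
cleaned = fill_missing(cleaned, "monthly_spend")
cleaned = flag_outliers(cleaned, "monthly_spend")
cleaned = encode_category(cleaned, "plan")
cleaned = merge_lookup(cleaned, regions, "id", "region")

for row in cleaned:
    print(row)
```

The median is used for both filling and outlier detection because a single extreme value distorts it far less than it would distort a mean. Whether to fill, drop, or keep flagged values is a judgment call that depends on the dataset and the business question.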