What profession did Harvard call the Sexiest Job of the 21st Century? That's right… the data scientist.

Ah yes, the ever mysterious data scientist. So what exactly is the data scientist's secret sauce, and what does this "sexy" person actually do at work every day?

This article provides a data science definition and discussion meant to help define the data scientist role and its purpose, as well as typical skills, qualifications, education, experience, and responsibilities. This definition is somewhat loose, since there really isn't a standardized definition of the data scientist role, and since the ideal experience and skill set is relatively rare to find in one individual. The definition can be further confused by the fact that other roles are sometimes thought of as the same but are often quite different. Some of these include data analyst, data engineer, and so on.

Here is a diagram showing some of the common disciplines that a data scientist may draw upon. A data scientist's level of experience and knowledge in each often varies along a scale ranging from beginner, to proficient, and to expert in the ideal case.

There are other skills and expertise that are highly desirable as well, but these are the primary four in my opinion. These will be referred to as the data scientist pillars for the rest of this article. In reality, people are often strong in one or two of these pillars, but usually not equally strong in all four. If you do happen to meet a data scientist who is truly an expert in all, then you've essentially found yourself a unicorn.

Based on these pillars, my data scientist definition is a person who should be able to leverage existing data sources, and create new ones as needed, in order to extract meaningful information and actionable insights. A data scientist does this through business domain expertise, effective communication and results interpretation, and utilization of any and all relevant statistical techniques, programming languages, software packages and libraries, and data infrastructure. The insights that data scientists uncover should be used to drive business decisions and take actions intended to achieve business goals.

One can find many different versions of the data scientist Venn diagram to help visualize these pillars (or variations) and their relationships with one another. David Taylor wrote an excellent article on these Venn diagrams entitled Battle of the Data Science Venn Diagrams. Here is one of my favorite data scientist Venn diagrams, created by Stephan Kolassa. You'll notice that the primary ellipses in the diagram are very similar to the pillars given above.

This diagram, and others like it, attempts to assign labels to and/or characterize the person or field that lies at the intersection of each of the primary competencies shown, which I'm calling pillars here.

As this diagram shows, Stephan Kolassa labels 'The Perfect Data Scientist' as the individual who is equally strong in business, programming, statistics, and communication.

In order to understand the importance of these pillars, one must first understand the typical goals and deliverables associated with data science initiatives, and also the data science process itself. Let's first discuss some common data science goals and deliverables.

Here is a short list of common data science deliverables:

Prediction (predict a value based on inputs)
Recommendations (e.g., Amazon and Netflix recommendations)
Pattern detection and grouping (e.g., classification without known classes)
Anomaly detection (e.g., fraud detection)
Recognition (image, text, audio, video, facial, …)
Actionable insights (via dashboards, reports, visualizations, …)
Automated processes and decision-making (e.g., credit card approval)
Segmentation (e.g., demographic-based marketing)

Each of these is intended to address a specific goal and/or solve a specific problem. The real question is which goal, and whose goal is it?

For example, a data scientist may think that her goal is to create a high-performing prediction engine. The business that plans to utilize the prediction engine, on the other hand, may have the goal of increasing revenue, which can be achieved by using this prediction engine. While this may not appear to be an issue at first glance, in reality the situation described is why the first pillar (business domain expertise) is so critical.