The meteoric rise of — and incredible hunger for — big data analytics across industries has generated a massive demand for trained professionals who can extract knowledge from large, complex datasets. Colleges and universities across the country have responded to this demand by offering degree programs in data science at all levels.
A data science degree can position prospective data scientists for career success by teaching them how to use predictive statistical analysis to solve everyday business problems. The following guide provides an overview of various academic paths in the field of data science, examines related career opportunities, explores the curriculum and learning outcomes of undergraduate and graduate programs, and answers frequently asked questions about this new yet highly in-demand discipline.
Degree programs in data science teach students how to bridge the gap between technology and business interests by using analytics to make informed business decisions. According to research from the executive recruiting company Burtch Works, a majority of professional data scientists (92 percent) hold a graduate degree and nearly half (48 percent) have a PhD. However, an associate or bachelor’s degree can also open the doors to career opportunities in business intelligence and analytics.
Below is a breakdown of the different degree levels available as well as potential career paths for each.
A bachelor’s degree in data science provides students with foundational training in the principles of statistical and mathematical analysis, including knowledge of computer science components, data structures, algorithms and information visualization.
Identifies and suggests techniques to improve an organization’s efficiency. This review process includes analyzing company data, such as revenue and employee performance reports, and interviewing personnel to craft recommendations for new systems and procedures.
Works with an organization’s data to design, construct and manage information delivery systems. Identifies business intelligence needs and uses a suite of tools to access data warehouses and build data visualization solutions (e.g., tables and reports).
Gathers information about sales, consumers, and industry conditions to identify market trends and opportunities for revenue. Uses statistical software to transform complex business data into business and marketing strategies.
A master’s degree in data science helps students develop strong statistical, mathematical, computational, and programming skills. This prepares graduates to move into a PhD program or into industry as a data science professional.
Collects, analyzes, and interprets data using a variety of mathematical techniques and statistical software. Designs statistical models, experiments and surveys to solve real-world or business-related problems.
Analyzes raw data by applying machine learning principles to large data environments. Gathers information through data mining, uses predictive modeling to develop business intelligence reports, and documents statistical methods and results.
Works in statistical data analysis and with data mining packages (e.g., SAS) to extract, import, store and analyze large and complex sets of data. Designs and develops data warehouses, manages data quality procedures, and builds data applications.
The PhD in data science emphasizes advanced research in computational science, focusing on a cross section of advanced topics in data mining and high-performance computing. Students learn quantitative analysis techniques and how to develop mathematical models to analyze and solve problems using data.
Uses statistical analysis and predictive modeling to devise solutions to business problems. Employs computer science skills and statistical tools to acquire, clean, produce and analyze complex, unstructured data sets.
Applies data analytics knowledge to identify and propose new business concepts, solutions, and services. Uses machine learning, probability theory, predictive modeling and statistical analysis to transform data sets into actionable business models.
Addresses business problems by using advanced statistical analysis and machine learning to develop data-driven solutions. Works with business stakeholders to identify needs and craft appropriate project plans. Manages the process of collecting, preparing, and analyzing data.
Academic programs in data science do not exist at the associate degree level. However, there are a number of degree programs, as well as individual courses, that can prepare students for future studies in data science. At the degree level, students may want to consider programs in information systems, computer science, mathematics or economics. Students can use these two-year programs to build a strong academic foundation in areas central to data science, such as statistics and computer programming, and as a starting point for a bachelor’s program in data science.
The courses described below are examples that can help students develop basic competencies related to data science and pave the way for success in a four-year program.
This class introduces students to the foundations of propositional logic, which is used to model reasoning and as a design tool in modern computer programming.
A course in computer programming helps students understand modern programming practices and how algorithms are designed and function. Students learn the fundamentals of using object-oriented computer programming to solve business information problems.
Students can use a course on SQL servers to learn about working with databases and manipulating data by, for example, creating data tables, managing database functions and building data forms.
Through this course, students gain experience with statistical analysis and elements of data science, including organizing data, working with variables and using statistical inference (drawing upon data to make conclusions).
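As an illustration of the database skills mentioned in the course descriptions above (creating data tables and manipulating data), here is a minimal sketch. It uses Python's built-in sqlite3 module rather than a SQL server product, and the table name, column names and sample rows are hypothetical, chosen only for the example.

```python
import sqlite3


def build_enrollment_db():
    """Create an in-memory database with one hypothetical table and sample rows."""
    conn = sqlite3.connect(":memory:")
    # Table and column names are assumptions made for this illustration.
    conn.execute(
        "CREATE TABLE enrollments ("
        " student TEXT NOT NULL,"
        " course TEXT NOT NULL,"
        " credits INTEGER NOT NULL)"
    )
    rows = [
        ("Ana", "Statistics", 3),
        ("Ana", "Programming", 4),
        ("Ben", "Statistics", 3),
    ]
    # Parameterized inserts keep the data separate from the SQL text.
    conn.executemany("INSERT INTO enrollments VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def credits_per_student(conn):
    """Return (student, total credits) pairs, sorted by student name."""
    cur = conn.execute(
        "SELECT student, SUM(credits) FROM enrollments"
        " GROUP BY student ORDER BY student"
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = build_enrollment_db()
    print(credits_per_student(conn))  # [('Ana', 7), ('Ben', 3)]
    conn.close()
```

The same CREATE TABLE, INSERT and SELECT statements carry over to other relational systems with only minor dialect changes.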
The BS in data science is an interdisciplinary program of study. Combining coursework from business, information technology, computer science and statistics, this degree is designed to provide students with a broad understanding of data collection, management, and analysis. Students learn how to think about data in a general business context, organizing and examining it to develop real-world solutions and business models.
A BS typically requires four years of full-time study (120 to 128 credit hours) and includes a curriculum that covers topics such as data visualization, probability, object-oriented programming, and algorithms. Upon graduation, students will have developed skills in data mining, computer programming, and data analysis and visualization. This diverse skill set provides a foundation for either continuing studies in a graduate program or transitioning into a career in the emerging field of data analytics.
The curriculum for data science majors is divided into general education, core coursework, and electives. Example concepts students will explore, along with the courses in which they will learn these concepts, are listed below:
Basic statistical modeling
Intro to Statistical Modeling:
This class teaches students about data analysis within big data sets. The course focuses on using linear regression to develop appropriate data models.
This course covers creating, implementing and analyzing statistical models (e.g., discrete linear and nonlinear).
Knowledge of software design
This course provides students with foundational instruction in the design principles of computer software.
This class explores technical knowledge in software design, including language processing, linked data structures and component interface design.
Familiarity with programming languages
Introduction to C++:
This course is an overview of C++ and the study of computing data.
Science of Java Programming:
This course introduces students to computer programming in Java and how computer programming can aid in solving problems.
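The statistical modeling coursework described above centers on linear regression. As a rough sketch of what fitting such a model involves, the following Python example computes an ordinary least-squares line for a single predictor. The function names and the sample data are hypothetical, and a real course would more likely rely on a statistics package than on hand-written formulas.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new x value."""
    slope, intercept = model
    return slope * x + intercept


if __name__ == "__main__":
    # Made-up data that lies exactly on the line y = 2x + 1.
    model = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
    print(model)
    print(predict(model, 5))
```

With data that falls exactly on a line, the fitted slope and intercept recover that line; with noisy data, they give the line that minimizes the squared vertical distances to the points.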
By the end of a bachelor’s program in data science, students should be able to:
Think critically in a mathematical framework
Solve problems using abstract contexts
Demonstrate basic competency with computer programming concepts
Apply specific mathematical techniques to solve problems
Understand the computational techniques used to manipulate large data sets
Visualize and communicate analytical findings
These programs are aimed at individuals with an interest in quantitative study — mathematics, statistics and computational analysis. These programs are also a good fit for students preparing to continue their studies in graduate school or pursue entry-level employment in big data as well as related fields, such as information technology or market research.
For the most part, no. Although there are some exceptions, most big data employers require at least a master’s degree. Today, companies typically prefer candidates with a PhD and specific proficiencies in key technical areas, such as coding.
The data science curriculum at the undergraduate level emphasizes breadth rather than depth. The curriculum starts with a topical core of statistics, mathematics and computer science. Students then build upon that foundation by studying the central concepts of data science: statistical modeling, data mining, data visualization, and business analytics.
Most programs require students to complete a capstone project. The purpose of a capstone project is for students to apply the conceptual knowledge they gain in the program to a real-life problem or situation. Students work either individually or in teams to collect and manipulate data, apply statistical models to analyze it, and devise an appropriate solution to the problem.
Some programs allow students to specialize their education through additional coursework in specific areas such as engineering or finance. Other programs allow students to focus on industry- or domain-specific areas, such as healthcare, manufacturing, energy, or technology. Students may also find that departments have industry partnerships that can be leveraged for experiential training opportunities or internships.
The MS in data science focuses on the study of the latest developments in the field. Students develop an advanced understanding of mathematics and statistical analysis, while also developing strong skills in computational statistics and programming. Students also gain specialized knowledge in quantitative analysis and are able to employ it in a variety of ways, from predicting consumer behavior to studying stock market fluctuations.
Master of Science candidates are typically required to complete 30 credit hours of study, which are divided between core courses and electives. Core classwork is anchored in mathematics (e.g., linear algebra, discrete mathematics), computer programming (e.g., software design, programming languages), statistics (e.g., data mining, statistical inference), and data management (e.g., database systems, data warehousing).
Students may be able to complete their degree in 15 to 24 months. Some programs require a capstone project, which allows students to conduct original research in data science. Upon completion of their master’s degree, students are prepared to enroll in a PhD program in data science, statistics or computer science. They may also choose to pursue career opportunities as a data science professional.
Research design and question formulation
Data Analysis Applications:
This course covers the decision-making concepts and related role of big data. Students learn about gathering data, interpreting results and presenting relevant findings.
Data Analysis Design:
This class provides students with an introduction to quantitative research methods and associated statistical techniques used in data analysis.
Storing and retrieving data
Database Design and Management:
Students explore the theoretical foundations and practical applications of database systems, including design, use, creation and management. Students also learn about database and scripting technologies (e.g., MySQL and PHP).
Database Systems Engineering:
In this course, students study the core concepts of database systems, including relational data models, query languages and distributed database systems.
Students learn fundamentals of machine learning, including algorithm development, clustering and development of machine learning programs.
In this course, students study statistical processing algorithms and programming structures. The class also focuses on using data analysis software packages to manipulate large data sets.
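Clustering, one of the machine learning topics named in the course descriptions above, can be illustrated with a small sketch. The Python example below runs a basic k-means procedure on one-dimensional data. It is a simplified illustration under stated assumptions: the function name, the sample values and the starting centroids are all hypothetical, and production work would normally use an established machine learning library.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Cluster numbers around the given starting centroids (basic k-means)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(
                range(len(centroids)), key=lambda i: abs(v - centroids[i])
            )
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable, so the procedure has converged
        centroids = new_centroids
    return centroids, clusters


if __name__ == "__main__":
    # Made-up data with two visually obvious groups.
    data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
    centers, groups = kmeans_1d(data, centroids=[0.0, 5.0])
    print(centers)  # [2.0, 11.0]
    print(groups)   # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```

The result depends on the starting centroids, which is why they are passed in explicitly here rather than chosen at random.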
Learning objectives vary by program, but graduates of master’s programs in data science should be able to:
Understand statistical and data mining techniques
Apply analytical techniques and create algorithms to extract information from a data set
Apply quantitative modeling and data analysis techniques to develop solutions to business problems
Understand central concepts of data science analytics, including visualization, predictive modeling, machine learning, and data mining
Use programming languages
Understand statistical data analysis techniques and their relationship to decision-making in business
Individuals who are interested in quantitative work and enjoy solving problems can pursue master’s degrees in data science. Students from most analytical backgrounds (e.g., economics or engineering) should succeed in a master’s program and don’t necessarily need an undergraduate degree in data science to apply, although a related degree is usually preferred.
In addition to typical admissions requirements, students without a background in computer science or mathematics may be required to complete a series of prerequisite courses in either field prior to being admitted to an MS program.
Data scientists should have a versatile coding skill set and be comfortable working in multiple languages. Students should be prepared to take classes involving any number of programming languages, including Python, R, Java, and C.
Most MS in data science programs require students to complete a 6-credit summer practicum held between their first and second years in the program. The practicum is designed to provide students with real-world experience working on a data science team and conducting data analytics research.
Concentration areas vary by program and school, and are generally optional. Students complete a series of 12 to 15 additional credit hours of study in their chosen concentration. Some common specializations include computational methods, bioinformatics, business analytics, and marketing.
Doctoral degree programs in data science are designed to train students to transform large, unstructured and complex data sets into information that can be used to make decisions. The curriculum includes advanced concepts and techniques in computer programming, statistical modeling and data mining. Students develop expert knowledge in mathematical foundations, which allows them to conduct independent research in an area of interest. PhD programs emphasize expert, comprehensive skill building in data science, including statistical testing, analytical modeling, programming (SAS, Hadoop, C++, Java), and databases and warehousing.
Although the number of credits to graduate varies, the curriculum is typically spread across core instruction, electives, and a dissertation. Programs can generally be completed within four to six years of full-time study. After finishing their core instruction, students start specialized research and write a dissertation in an area of interest alongside a faculty mentor. The PhD in data science prepares students for employment in academia, government agencies, private industry, and scientific research.
Statistical computing and simulation
This course refines students’ knowledge and skills in SAS programming through simulated data analysis of real-world data sets.
This class includes practical study and use of data mining and statistical models that are used to analyze massive data sets.
Advanced data mining techniques
Data Mining I:
Students study data extraction techniques, including how to select and clean data and how to apply machine learning and data visualization techniques.
Data Mining II:
This course covers advanced concepts in working with larger data sets, using multivariate regression, and graphing data.
This class covers the relationship between data warehousing and business intelligence applications, including major data warehousing and mining techniques, analytical processing, and cluster classifications.
Relational Database Systems:
Students in this course study how to integrate, store and manipulate large data sets, major database systems in data science, and database languages, including SQL.
Expert knowledge of statistical methods, machine learning algorithms, and data regression techniques
Advanced competency in programming languages and tools, such as Hadoop, Python and Java
Expert knowledge of machine learning methods and tools, such as R and SAS
Advanced understanding of statistical methods and predictive modeling
Broad skill set in data visualization, pattern recognition and signal processing
Expert knowledge of data mining, data warehousing and working with relational and non-relational databases
Individuals with a strong background in mathematics who have previous degrees in a quantitative field of study, such as data science, statistics, mathematics or economics, tend to find success.
PhD students are required to pass a qualifying examination after completing their core course requirements. This examination demonstrates that they have gained the skills required to complete their data science PhD. In addition, students should complete papers for peer-reviewed journals and attend conferences. Finally, students must complete and defend a doctoral dissertation.
Students must submit an application and transcripts from their undergraduate and graduate education. Depending on the program, students may also be asked to take and submit scores from the Graduate Management Admission Test and/or Graduate Record Examination. Other requirements include letters of recommendation, a personal statement, a resume and, if applicable, TOEFL scores.
Some dual degree programs do allow students to be admitted with only a bachelor’s degree. In those programs, students complete additional coursework at the graduate level before going on to PhD coursework. In doing so, they earn a master’s degree and write a thesis during the process.
The demand for professionals with data-driven decision-making skills continues to increase, and the market is becoming richer with opportunities. That places a greater emphasis on data science degree programs. Data science is a relatively new field, which means some people are uncertain about what type of program to select. Here are four things to consider when choosing a program.
Choosing a program that aligns with your career goals is important. Before selecting a program, decide on the type of career you want to have, and then find out which specializations each program offers. One program may offer expertise in economics, while another may focus on business analytics.
Soft skills are just as valuable as technical skills. Students should consider programs that integrate business knowledge and communication skills into the curriculum. Data professionals need to understand the industry and business problems they are trying to solve. They also need to be able to present technical and data reports clearly and efficiently.
Technical knowledge is the bedrock of a data professional’s skill set. Programs at each level — bachelor’s, master’s and doctorate — should provide a diverse curriculum that allows students to develop skills in multiple areas. These areas include statistical analysis tools (e.g., SAS and R), coding (e.g., Python, Java, C/C++), data applications (e.g., Hive, Pig, cloud computing), databases (e.g., SQL) and unstructured data (e.g., social media, audio feeds).
Many data science programs are interdisciplinary. For example, a program may include faculty members from the department of computer science, the department of business and the department of mathematics. Examine the mix of faculty members and their academic records and professional experience. Are they interested in social networking analysis? Crowdsourcing? Information visualization? Attending a program with the right faculty mix and research interests means students can better pursue their educational and professional goals.
Those interested in pursuing a career in big data aren’t limited to a degree in data science to learn the skills and knowledge necessary for success. There are also opportunities for individuals in a range of other disciplines such as business administration, economics, and computer science. These related fields may have a different approach, but graduates should still learn key data analysis skills.
Below are a few alternative degree options for students interested in data science:
The Bachelor of Science in business administration combines instruction in management, organizational and communication strategies with coursework in quantitative data analysis. In these programs, students learn how to analyze data to make business decisions in various professional environments. The curriculum introduces students to working with databases, business analytics, simulation modeling and basic programming (e.g., C++). Some departments allow for specialization in a business area, such as accounting, economics or marketing. With the degree, graduates can pursue entry-level employment opportunities in business or further studies in graduate school.
The Bachelor of Science in economics provides students with a background in quantitative techniques through the study of modern economics, statistics and math. In these programs, students develop an understanding of economic activity, market structures and economic models. They gain applied knowledge in forecasting economic activity and using software to analyze economic data. With the degree, students are prepared for employment opportunities in both technological and scientific professions, as well as future graduate studies in related areas, including data science, mathematics, statistics or finance.
The Master of Science in computer science provides students with instruction in the foundational concepts and principles in the design and programming of computer applications, software and technologies. Some programs offer data science as an academic concentration, allowing students to learn about the field while developing their knowledge of computer science. Students may use their degree to pursue further graduate study or employment in data science and other fields, such as computer security, software engineering and game development.
The Master of Science in business analytics is closely related to a data science program. In a business analytics program, students develop skills in transforming complex data into manageable information. The curriculum blends data-specific knowledge (data mining, predictive modeling and programming) with business-centered coursework in business intelligence, marketing and supply chain analytics. Armed with foundational knowledge in business practices and quantitative thinking, graduates can pursue employment opportunities across industries, from healthcare to technology to consulting to financial services.
The PhD in statistics is an advanced program focused on the theories and methodologies used in working with large quantities of statistical information. Statistical analysis has become a central research component in nearly every industry and domain, ranging from the biological sciences to machine learning. These programs prepare students to conduct independent research and develop real-world applications needed to extract, manipulate and analyze data from industry-specific sources. Most students enter into academic positions after graduation, but opportunities exist in government organizations and private industry, such as banking and pharmaceuticals.
The PhD in machine learning, also offered as a specialization within computer science PhD programs, is designed for students interested in conducting research in computational statistics. In these programs, students develop expert knowledge of automated learning, data analysis and statistical principles. Research areas include machine learning algorithms, data warehousing methods, data visualization, security and business applications. With a PhD, a graduate may join an academic department as a researcher or faculty member, or pursue diverse employment opportunities, such as data scientist or big data engineer.