DATS200 Functional Methods and Coding (3 semester hours)

This course provides the student with the basic knowledge and skills to handle and analyze data using a variety of methods, as well as a variety of programming languages and tools. Students are introduced to current industry standard data analysis packages and tools such as those in R/RStudio, Matlab/Octave, SAS or SPSS. Depending on current industry standards, the student will be provided with the opportunity to develop knowledge and skills in programming environments such as R, Octave, and Python®. In addition, students are introduced to data analysis packages and tools such as those in various scripting languages, SQL, Java®/NetBeans®, JavaScript® or Julia. (Prerequisites: MATH302 and MATH220) Python® is a registered trademark of the Python Software Foundation. Java® and JavaScript® are registered trademarks of Oracle America, Inc. NetBeans® is a registered trademark of The Apache Software Foundation.

DATS201 Analytical Methods I (3 semester hours)

This course provides students with the basic toolkit of statistical methods and models that practitioners use for regression, analysis of variance, and linear models. This toolkit could be based on Python® or on R. Topics include descriptive statistics/data summaries, inference in simple and multiple linear regression, residual analysis, estimation and testing of hypotheses, transformations, polynomial regression, model building with real data, nonlinear regression and linear models. The course is not mathematically advanced but covers a large volume of material. (Prerequisites: DATS200, MATH302 and MATH220) Python® is a registered trademark of the Python Software Foundation.

DATS211 Introduction to Data Science (3 semester hours)

This course provides an overview of data science including a foundation in research methodology. Data science is a data-driven process that provides descriptive, predictive, and prescriptive insight. Whether reporting on historical information or making predictions about future events, the goal of data science is to add value through analysis that informs. To meet this goal this course introduces a range of tools and methods including supervised and unsupervised techniques. These include techniques such as classification, rule-based association techniques, support vector machines, K-nearest neighbor, regression, and clustering techniques such as K-Means. (Prerequisite: DATS201)

DATS221 Exploratory Data Analysis (3 semester hours)

Exploratory data analysis (EDA) plays a crucial role in the first stages of analysis. That is, the data determine the appropriate analysis technique, rather than a preselected technique being applied to the data. During this course students develop skills in all aspects of exploratory data analysis, including a wide variety of tools and techniques for pre-processing and cleaning data, including big data. This course also introduces students to examining and plotting/graphing data to evaluate the content and integrity of a data set. (Prerequisite: DATS200)

DATS225 Data Visualization (3 semester hours)

One of the most important functions in data science is the communication of the meaning in data. Data visualization is a core competency that enables that communication. This course introduces students to best practices in the visual and graphical representation of data and meaning. Design principles are emphasized as skills in visual communication develop. The specific tools and methods used in this course will vary depending on current industry standards and preferences. (Prerequisite: DATS200)

DATS298 Associate Seminar (3 semester hours)

This course is intended to be the last course taken prior to completing the Associate of Science in Data Science program of study. Students will work as a team to acquire and analyze data to address a specified challenge or problem. This course stresses critical thinking in analyzing data with a focus on implementation of potential solutions within a given context. Challenges or problems assigned in this course involve real-world, complex situations such as large natural disasters, broad financial incidents or pandemics, planning for policy discussions at the state, regional or national level, etc. (Intended to be the last course in the program)

DATS301 Analytical Methods II (3 semester hours)

Whereas Analytical Methods I primarily deals with continuous data, this course deals with methods and tools used to analyze categorical (discrete) data. For example, researchers analyze categorical data, e.g. using logistic regression, to determine whether a patient’s tumor is cancerous or not, or whether a consumer will purchase a particular product or not. Specific attention will be paid to surveys and survey data. In addition, this course introduces generalized linear modeling. (Prerequisite: DATS211)

DATS311 Intermediate Data Science (3 semester hours)

This course continues to expand the knowledge, skills and abilities of students along two paths: first, through the design of experiments required to acquire specified data, and second, through the use of carefully designed experiments to establish causal effects. Students will take a deep dive into the differences between correlation and causality. They will learn the critical thinking skills required to assess the reliability of the data acquired. (Prerequisite: DATS301)

DATS331 Machine Learning I (3 semester hours)

This course introduces students to machine learning. It provides students with a broad overview of machine learning topics for both supervised and unsupervised methods. The topics typically include classification, decision trees, association rule-based classification, support vector machines, regression (linear, logistic and Bayesian), clustering, k-nearest neighbor, principal component analysis (PCA), feature selection, linear discriminant analysis (LDA) and factor analysis. Additional topics can include ensemble methods such as stacking, bagging and boosting. (Prerequisite: DATS311)

DATS332 Machine Learning II (3 semester hours)

This course follows the Machine Learning I course and builds on the topics originally covered. In this course, students will reconsider the various types of machine learning but now in the context of increasing their efficacy. That is, the focus will be on increasing the efficiency and effectiveness as well as the accuracy of the machine learning methods by evaluating their output and conducting error analyses on their results. Furthermore, the results from the various machine learning methods will be examined, e.g. by ROC, to determine their effectiveness in solving a given problem. Errors will be reported using a variety of methods including misclassification, the Gini Index, and entropy. Students will learn how to adjust a variety of parameters and/or hyperparameters to increase the efficiency of running machine learning methods as well as the accuracy of their output. In addition, underfitting and overfitting will be considered. Regularization and optimization will be introduced. (Prerequisite: DATS331)

DATS344 Probabilistic Graphical Models (3 semester hours)

This course focuses on the use of probabilistic graphical models to represent complex domains using probability distributions. These models are used to describe large collections of random variables with complex interactions. Students will learn the key formalisms and main techniques for building probabilistic graphical models, and how to use them to make predictions and support decision-making under uncertainty. Bayesian networks, directed and undirected graphical models, as well as their temporal extensions will be covered. Students will be introduced to causation and how it can be modeled. (Prerequisites: MATH302, MATH328, DATS301)

DATS351 Sentiment Analysis (3 semester hours)

Sentiment analysis is a specialized form of natural language processing intended to determine opinions expressed in written text. This is a lab-based course in which students implement the topics covered. The topics include the concepts and theories behind sentiment analysis, as well as the research approaches taken in the field: knowledge-based techniques, statistical methods, supervised and unsupervised learning, and hybrid approaches. Tasks in sentiment analysis, e.g. classifying polarity and determining an emotional scale, will be discussed and implemented through labs. Students will learn how to generate a sentiment lexicon. (Prerequisites: MATH302, DATS411)

DATS371 Fundamentals of Simulation (3 semester hours)

This course provides students with an introduction to modeling and simulation at the undergraduate or at the graduate level. It includes an introduction to discrete event simulation (DES) as well as simulation methodology, input data modeling, output data analysis, and a broad overview of DES tools. It also includes an introduction to continuous simulation (CS) as well as simulation methodology, differential equation models, numerical solution techniques, and a broad overview of CS tools. (Prerequisites: MATH240, MATH302, MATH328)

DATS373 Simulation Techniques (3 semester hours)

This course provides the theoretical foundations, tools and methods used to implement solutions, and examples of various problems and their solutions using discrete event simulations, continuous simulation, and agent-based simulation. (Prerequisite: DATS371)

DATS381 Behind the Data, Our Values and Beliefs (3 semester hours)

This course discusses the legal, policy and ethical implications of data, including privacy, surveillance, security, classification, discrimination, decisional autonomy, and duties to warn or act. It examines these issues over the full data science life cycle: data collection, storage, processing, analysis, and use. It includes current topics such as legal and policy constraints; data collection methods and institutions; and technical, legal, and market approaches to mitigating and managing concerns.

DATS401 Analytical Methods III (3 semester hours)

This course covers advanced topics and methods such as unstructured data/information and big data. The theoretical background required for the integration of data mining and text analytics or text mining is explored. Students are exposed to additional topics such as the implementation and use of data lakes and ontology evaluation. (Prerequisites: DATS200, DATS201, DATS301)

DATS411 Advanced Data Science (3 semester hours)

This course completes the three-course sequence in Data Science. This advanced course takes students through the application of more advanced methods in regression and time series models. It includes discussions about causal inference and a wide range of time series models. This course emphasizes tools and methods used to capture key patterns and generate insight from data. (Prerequisite: DATS311)

DATS431 Machine Learning III (3 semester hours)

This course focuses on artificial neural networks as a machine learning tool. A variety of neural networks from perceptrons to convolutional and recurrent networks will be studied. Autoencoders and reinforcement learning will also be covered. Regularization and optimization are included. (Prerequisites: DATS301, DATS344)

DATS432 Deep Learning (3 semester hours)

Deep learning is a machine learning technique used in the attempt to approach artificial intelligence. In this course, students will learn how to build deep networks. Students will begin learning how to use regularization and optimization for deep learning. Convolutional and recurrent networks will be reviewed in the context of deep learning. (Prerequisite: DATS431)

DATS433 Artificial Neural Networks using TensorFlow (3 semester hours)

The Google Brain Team originally developed TensorFlow as proprietary code. Today, TensorFlow is open source and freely available. This course teaches students how to use TensorFlow to build and run artificial neural networks. Convolutional and recurrent networks will be included, as well as regression. (Prerequisite: DATS431. DATS432 strongly recommended)

DATS435 Optimization and Machine Learning (3 semester hours)

This course provides an introduction to optimization and machine learning, i.e. how machines learn. It takes an in-depth look at objective, or loss, functions and how they are used to reduce error through feedback. It also takes a look at how that feedback enables machines to learn. Students gain an appreciation of the similarities in optimization and machine learning, as well as the differences. It also takes a look at all the challenges in training machine models including the local minima or maxima that affect machine learning. In addition, it discusses different methods that are applied to optimization and machine learning, e.g. gradient methods. (Prerequisites: MATH240, MATH328, DATS301)

DATS442 Bayesian Methods (Bayesian Inference, Naïve Bayes) (3 semester hours)

This course focuses on the Bayesian approach to probability and statistics and applies it to tools and methods used in data science. This course starts with a brief review of the tenets of probability that form the foundation for Bayes' Theorem. Next, it discusses Bayes' Theorem in depth. Then, it considers the Bayesian approach to inference, examines the Naïve Bayes model, and develops Bayesian regression. (Prerequisite: MATH328)

DATS443 Generalized Linear Equations Using R (3 semester hours)

This course introduces students to generalized linear models that extend the linear modeling framework to allow response variables that are not normally distributed. Estimation and inference are included. Continuous data is considered. However, the primary focus is on binary or count data. Poisson and quasi-Poisson models are discussed. (Prerequisites: MATH227, MATH328, DATS301)

DATS465 Risk Modeling and Assessment (3 semester hours)

This course provides students with knowledge about the complete risk assessment process. It begins with in-depth discussions about the mathematical and computational models supporting risk assessment. Then, it covers a variety of situations where risk can have a great impact, e.g. extreme events. Hierarchical holographic modeling and related models are used to analyze subsystem risks and how those risks compound to affect an entire system. (Prerequisites: MATH328, DATS301)

DATS481 Introduction to Python (3 semester hours)

This course introduces Python to students with very little programming experience. It covers the basics of programming including core elements of programs, data types, simple programs and functions, debugging and computational complexity. (Prerequisite: DATS200)

DATS482 Python and Data Science (3 semester hours)

In this course students will learn how Python, perhaps the most popular programming language today, is used in databases, image processing, machine learning and parallelism. This course is a lab-based course that requires considerable time programming. Students will learn how to handle various data structures, e.g. Python lists, NumPy arrays and Pandas DataFrames. Functions and flow control will be emphasized. This course also uses Python to build on the data visualization topics started in DATS225 Data Visualization. (Prerequisite: DATS481)

DATS499 Senior Capstone Project (3 semester hours)

This Capstone course is intended to be the last course taken by seniors prior to completing their bachelor’s degree. It is intended to give seniors the opportunity to demonstrate the knowledge and skills they have gained throughout their program of study. At the start of this course, students will be presented with a complex, real-world problem. As they consider this problem, they will restate it in a format consistent with conducting a data analysis. They will acquire the data needed and pre-process that data as required. They will determine the best method of analysis depending on the problem and the data. They will conduct the analysis obtaining preliminary results, and evaluate the analysis and the results to determine if any adjustments to parameters or hyperparameters need to be made. They will obtain the final results, i.e. the solution to the problem. Last, they will generate a written report and an oral presentation that discusses the problem, the data, the analysis including any parameters or hyperparameters used, the results, and their evaluation of the analysis and results. It is intended that the report and presentation be of sufficiently high quality to serve as part of the student’s portfolio for job interviews or applications to graduate school. (Intended to be the last course in the program)