# AIPI 520: Modeling Process & Algorithms (Spring)

**Course Description**

This course introduces the modeling process and best practices in creating, interpreting, validating, and selecting models for different uses. The primary machine learning algorithms, both supervised and unsupervised, are introduced and explained with the level of mathematical theory needed to establish students’ intuition for how each algorithm works. The primary focus is on “traditional” machine learning approaches, but the course also introduces deep learning and its applications. By the end of the course, students should have a solid understanding of the end-to-end modeling process and of the different types of model algorithms, along with the strengths, weaknesses, assumptions, and use cases of each type. Programming in this course is limited to basic implementations of the model algorithms in Python, with the primary goal of reinforcing conceptual understanding using simplified, canned datasets. The course is designed to run in parallel with AAI XXX: Analytics Programming, in which students gain practice implementing data pre-processing, the modeling process, and the algorithms through case-based learning on real-world datasets.

**Pre-Requisites**

Students are expected to understand the main concepts of calculus, linear algebra and probability & statistics, as well as possess a foundational level of proficiency in Python programming.

**Learning Objectives**

Through this course, students will be expected to:

• Be able to explain the primary applications of machine learning, including supervised and unsupervised techniques

• Understand the steps of the machine learning modeling process and gain practice in them through their programmatic implementation

• Be able to explain the strengths, weaknesses, assumptions, general mathematical theory, and use cases for the major types of machine learning algorithms, including clustering, decision trees and ensemble methods, linear and logistic regression, support vector machines, and neural networks

• Develop skills in feature selection and evaluating feature importance for modeling

• Understand the bias-variance tradeoff and how to perform model selection and hyperparameter tuning

• Be able to validate models, including through cross-validation, and communicate their performance using relevant error metrics

**Course Materials**

• “An Introduction to Statistical Learning” by Gareth James et al. The PDF can be downloaded for free at http://faculty.marshall.usc.edu/gareth-james/ISL/

**Required Free Software:**

• Python 3.7.x (we suggest installing Python via the Anaconda distribution: https://www.anaconda.com/distribution/)

• The following libraries must also be installed (can be installed using pip or conda):

o NumPy

o Pandas

o Jupyter Notebook

o Matplotlib

o Scikit-learn
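
One way to confirm that the libraries above are available is a short check like the following. This is only a sketch: the helper name `check_installed` is hypothetical, and the import names are assumptions based on how these packages are normally imported (scikit-learn as `sklearn`, Jupyter Notebook as `notebook`).

```python
import importlib.util

# Display name -> import name. The import names are assumptions:
# scikit-learn is imported as "sklearn", Jupyter Notebook as "notebook".
REQUIRED = {
    "NumPy": "numpy",
    "Pandas": "pandas",
    "Jupyter Notebook": "notebook",
    "Matplotlib": "matplotlib",
    "Scikit-learn": "sklearn",
}


def check_installed(modules=None):
    """Return {display name: True/False} by looking each module up without importing it."""
    if modules is None:
        modules = REQUIRED
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in modules.items()
    }


if __name__ == "__main__":
    for name, found in check_installed().items():
        print(f"{name}: {'installed' if found else 'MISSING'}")
```

Any library reported as missing can then be installed with pip or conda, as noted above.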

**Course Grading**

• 20% In-class quizzes

• 30% Homework assignments

• 20% Midterm exam

• 30% Final exam