DP-100 Part 2 – Modeling

Course

Brian Roehm

Azure Training Architect I in Content

Brian currently resides in Kansas City, Kansas, where he loves getting to view the changing leaves, cooling temperatures, and, of course, football season. Brian considers himself an adventure seeker; in his spare time you can find him rebuilding classic cars, scuba diving, skiing, and exploring the world around him. He is also an avid reader and moviegoer.

Length

08:00:00

Difficulty

Intermediate

Videos

47

Course Details

In this course we will focus on how to run experiments and train models in Azure Machine Learning. This course is part 2 of 3 for full preparation for the DP-100 exam.

We will examine:

How to create models by using Azure Machine Learning designer

Run training scripts in an Azure Machine Learning workspace

Generate metrics from an experiment run

Examine key algorithms, features, and machine learning models to build a foundation

Get an overview of important tools such as PyTorch, scikit-learn, Keras, and Chainer

Syllabus

Course Introduction

Course Introduction

00:03:08

Lesson Description:

In this video, we discuss this course and Microsoft's DP-100 certification exam.

About the Training Architect

00:01:05

Lesson Description:

Meet the instructor! In this video, I'll introduce myself and provide a little background on my experience.

Using the DP-100 Essentials Guide

00:01:59

Lesson Description:

In this video, we introduce the DP-100 Essentials Guide, how to navigate it, and how to leverage it to help with exam preparation.

About the Exam

00:02:50

Lesson Description:

Gain practical tips on what to expect on exam day, details about the exam itself, and how to prepare for the exam both today and on exam day.

A Note on Data Science and Mathematics

00:02:28

Lesson Description:

This course is designed to help you pass the DP-100 exam. In this video, we cover other critical skills required for being an outstanding data scientist.

Azure Machine Learning Pipelines

A Refresh on Azure Machine Learning Pipelines

00:05:35

Lesson Description:

In this video, we do a refresh on Azure Machine Learning pipelines and discuss the reasons pipelines are advantageous to machine learning experiments.

Designer Modules to Define Pipeline Data Flow

00:13:30

Lesson Description:

In this video, we do a deep dive into all of the modules available in Designer. This video provides background that is critical for the DP-100 exam and a baseline for future videos where we deep dive into individual modules.

Using Custom Code Modules in Designer

00:04:43

Lesson Description:

In this video, we take a look at custom code modules in Designer. We provide details on the inputs and outputs, a very high-level review of a Python script, and set the stage for the custom code module lab later on in this section.

Exam Essentials and References

00:06:45

Lesson Description:

In this video, we review key concepts from the section and additional resources to help prepare for the DP-100. References: Azure Machine Learning pipelines Azure Machine Learning designer

Machine Learning Algorithm

An Introduction to Terminology

00:15:09

Lesson Description:

In this video, we begin discussing the key terms you need to understand when working with machine learning models. This lesson creates a foundation for future videos where we dive deeper into important concepts.

How to Select Algorithms in Azure Machine Learning

00:08:02

Lesson Description:

In this video, we review key criteria we need to take into account when choosing an algorithm. Algorithm cheat sheet: https://docs.microsoft.com/en-us/azure/machine-learning/algorithm-cheat-sheet

Text Analytics

00:18:09

Lesson Description:

In this video, we provide an introduction to Text Analytics and take a look at the Text Analytics modules found in Azure Machine Learning designer.

Regression

00:21:42

Lesson Description:

In this video, we introduce the concept of regression. We discuss key terminology, key steps to prepare data, and look at common regression algorithms including boosted decision tree, decision forest, linear regression, and neural networks.

Multiclass Classification

00:23:54

Lesson Description:

In this video, we review what classification is, the differences between binary and multiclass classification systems, decision trees, and key terminology. In addition, we discuss the classification algorithms found in Azure Machine Learning Studio.

Image Classification

00:11:29

Lesson Description:

What is image classification? In this video, we explore key concepts and take a look at ways to implement it in Azure Machine Learning Studio.

Anomaly Detection

00:11:40

Lesson Description:

This video is a review of anomaly detection and how it works in Azure Machine Learning Designer.

Clustering

00:09:52

Lesson Description:

In this video, we take an introductory look at clustering, how it works, and how it is used in Azure Machine Learning designer.

Recommenders

00:08:31

Lesson Description:

In this video, I give an introduction to recommender systems. We discuss the basic types and then take a look at recommender systems in Azure Machine Learning designer.

Exam Essentials and References

00:05:33

Lesson Description:

In this video, we review key concepts from the section and additional resources to help prepare for the DP-100. Documentation: Machine Learning Cheat Sheet

Feature Selection

Upcoming Lesson: An Introduction to Feature Selection

Lesson Description:

In this video, we introduce the concept of feature selection and its role in the machine learning process.

Intro to Feature Extraction

00:09:31

Lesson Description:

What is feature extraction, how does it differ from feature selection, and when should it be employed in a machine learning experiment? In this lesson, we explore these questions and learn how feature extraction is used in Azure.

Pearson's Correlation

00:05:41

Lesson Description:

In this video, we examine Pearson's Correlation, the general steps to implement it, and common terminology such as correlation values.
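As a rough illustration of what a correlation value is, the sketch below computes Pearson's r for one feature column against a label column using only the standard library. The function name and the sample data are hypothetical and are not taken from the course or from Azure Machine Learning designer.

```python
import math


def pearson_correlation(xs, ys):
    """Return Pearson's r for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (the shared 1/n factor cancels out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical feature and label columns with a perfectly linear relationship.
feature = [1.0, 2.0, 3.0, 4.0, 5.0]
label = [2.0, 4.0, 6.0, 8.0, 10.0]
r = pearson_correlation(feature, label)
```

Values near +1 or -1 indicate a strong linear relationship, and values near 0 indicate little linear relationship, so a feature whose r against the label is close to 0 is a candidate for removal.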

Mutual Information Score

00:09:13

Lesson Description:

In this lesson, we continue our examination of feature selection algorithms with a look at Mutual Information score.
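To make the idea concrete, here is a minimal sketch of mutual information between two discrete columns, estimated from observed frequencies. It assumes categorical (or already binned) values; the function name and data are hypothetical, and the designer module may compute its score differently.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information, in bits, between two discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    count_x = Counter(xs)         # marginal counts of x
    count_y = Counter(ys)         # marginal counts of y
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        p_x = count_x[x] / n
        p_y = count_y[y] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


# A feature that fully determines the label versus one that is independent of it.
informative = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
uninformative = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

A higher score means that knowing the feature removes more uncertainty about the label; a score of 0 means the two columns are statistically independent in the sample.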

Kendall's Correlation Coefficient

00:07:02

Lesson Description:

In this lesson, we cover Kendall's correlation coefficient and when it should be used, and we learn basic terminology such as concordance, parametric, and monotonic relationships.
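Concordance can be shown in a few lines. The sketch below computes the simplest variant, Kendall's tau-a, by counting concordant and discordant pairs; it makes no adjustment for ties, and the function name and data are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both columns order this pair the same way
        elif product < 0:
            discordant += 1   # the columns order this pair in opposite ways
        # product == 0 means a tie, which tau-a counts as neither
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Two of the three pairs are concordant and one is discordant.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
```

Because only the ordering of pairs matters, the coefficient captures monotonic relationships without assuming they are linear.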

Spearman's Correlation Coefficient

00:07:54

Lesson Description:

In this lesson, we dive into Spearman's correlation coefficient, what it is, and when to use it in feature selection.
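One common way to compute Spearman's coefficient is to replace each value with its rank and then compute Pearson's r on the ranks. The sketch below follows that approach with average ranks for ties; the function names and data are hypothetical.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def spearman_correlation(xs, ys):
    """Spearman's rho: Pearson's r computed on the ranks of the data."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


# A monotonic but non-linear relationship (y = x squared).
rho = spearman_correlation([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])
```

Working on ranks is what lets the coefficient reach 1 for any strictly increasing relationship, even when Pearson's r on the raw values would be lower.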

Chi-Squared Statistic

00:04:48

Lesson Description:

In this video, we introduce the concept of the chi-squared statistic algorithm.
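For a categorical feature and a categorical label, the chi-squared statistic compares observed co-occurrence counts with the counts expected if the two columns were independent. The following is a minimal sketch under that assumption; the function name and data are hypothetical.

```python
from collections import Counter


def chi_squared_statistic(feature, label):
    """Chi-squared statistic of the feature-by-label contingency table."""
    if len(feature) != len(label) or not feature:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(feature)
    observed = Counter(zip(feature, label))
    feature_totals = Counter(feature)
    label_totals = Counter(label)
    statistic = 0.0
    for f, f_count in feature_totals.items():
        for l, l_count in label_totals.items():
            # Count expected in this cell if feature and label were independent.
            expected = f_count * l_count / n
            statistic += (observed[(f, l)] - expected) ** 2 / expected
    return statistic


# A feature that lines up with the label versus one that does not.
dependent = chi_squared_statistic(["a", "a", "b", "b"], [1, 1, 0, 0])
independent = chi_squared_statistic(["a", "a", "b", "b"], [1, 0, 1, 0])
```

A larger statistic means the observed counts deviate more from independence, which suggests the feature carries information about the label.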

Fisher Score

00:05:13

Lesson Description:

What is Fisher score and how is it utilized? In this lesson, we dive into these topics and explore how Fisher score is used in feature selection.
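One common textbook definition of the Fisher score for a single feature is the ratio of between-class scatter to within-class scatter. The sketch below uses that definition as an assumption; the scoring used by Azure Machine Learning may differ, and the function name and data are hypothetical.

```python
from collections import defaultdict


def fisher_score(values, labels):
    """Between-class scatter divided by within-class scatter for one feature."""
    if len(values) != len(labels) or not values:
        raise ValueError("need two non-empty sequences of equal length")
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    overall_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for group in groups.values():
        group_mean = sum(group) / len(group)
        # Class size times squared distance of the class mean from the overall mean.
        between += len(group) * (group_mean - overall_mean) ** 2
        # Sum of squared deviations inside the class.
        within += sum((v - group_mean) ** 2 for v in group)
    if within == 0:
        return 0.0 if between == 0 else float("inf")
    return between / within


# Class means of 2 and 8 with small spread give a high score.
separating = fisher_score([1, 2, 3, 7, 8, 9], [0, 0, 0, 1, 1, 1])
# Identical class means give a score of 0.
overlapping = fisher_score([1, 9, 2, 8], [0, 0, 1, 1])
```

Features with well-separated class means and tight class spreads score highest, so ranking features by this score favors those that discriminate between classes.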

Count-Based Feature Selection

00:03:40

Lesson Description:

What is count-based feature selection? In this lesson, we examine count-based feature selection and when it should be utilized.

Upcoming Lesson: Exam Essentials and References

Lesson Description:

In this video, we review key concepts from the section and additional resources to help prepare for the DP-100.

Classic Machine Learning Models

Introduction to Neural Networks

00:07:32

Lesson Description:

In this video, we reintroduce the concept of neural networks with a deeper dive into key features and terminology. This lesson also introduces the section and the additional neural network types we will be discussing.

RNN

00:04:47

Lesson Description:

In this lesson, we review recurrent neural networks. We discuss how RNNs are used, key terminology, and the basics of function.

DNN

00:04:31

Lesson Description:

In this lesson, we review deep neural networks. We discuss how DNNs are used, key terminology, and the basics of function.

CNN

00:04:46

Lesson Description:

In this lesson, we review convolutional neural networks. We discuss how CNNs are used, key terminology, and the basics of function.

Upcoming Lesson: SMOTE

Lesson Description:

In this lesson, we review SMOTE. We discuss how SMOTE is used, key terminology, and the basics of function.
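The core idea of SMOTE is to create synthetic minority-class samples by interpolating between an existing minority sample and one of its nearest minority neighbors. The sketch below illustrates only that idea; it is not the designer module's implementation, and the function name, parameters, and data are hypothetical.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=2, seed=0):
    """Generate synthetic points between minority samples and their neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)  # seeded for repeatable output
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest other minority samples, by Euclidean distance.
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(tuple(b + gap * (q - b) for b, q in zip(base, neighbor)))
    return synthetic


# Four minority samples at the corners of the unit square.
minority_samples = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_samples = smote_sketch(minority_samples, n_synthetic=5)
```

Every synthetic point lies on a line segment between two real minority samples, which is why SMOTE increases minority representation without simply duplicating rows.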

Exam Essentials and References

00:03:42

Lesson Description:

In this video, we review key concepts from the section and provide additional resources to help prepare for the DP-100.

Run Training Scripts in an Azure Machine Learning Workspace

Azure Machine Learning SDK Introduction

00:06:53

Lesson Description:

One of the requirements for the DP-100 is an understanding of the Azure Machine Learning SDK. In this lesson, we introduce the SDK, prepare our environment to create experiments, and work with datastores and data sets.

Create an Experiment with SDK

00:04:42

Lesson Description:

In this lesson, we take the setup from our last video and learn how to create an experiment using the Azure Machine Learning SDK. Workspace config files are not covered in detail because they are not a necessary part of the DP-100 exam. However, for those interested in learning more, this is a fantastic resource: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-configure-environment#workspace

Consume Data from a Datastore with SDK

00:06:55

Lesson Description:

In this lesson, we learn how to create datastores in the SDK and consume data from those newly created datastores.

Create a Datastore

    import os
    from azureml.core import Workspace, Datastore

    ws = Workspace.from_config()  # Load the workspace from a local config file

    blob_datastore_name = 'azblobsdk'  # Name of the datastore to workspace
    container_name = os.getenv("BLOB_CONTAINER", "")  # Name of Azure blob container
    account_name = os.getenv("BLOB_ACCOUNTNAME", "")  # Storage account name
    account_key = os.getenv("BLOB_ACCOUNT_KEY", "")  # Storage account access key

    blob_datastore = Datastore.register_azure_blob_container(workspace=ws,
                                                             datastore_name=blob_datastore_name,
                                                             container_name=container_name,
                                                             account_name=account_name,
                                                             account_key=account_key)

Get a Datastore from your current Workspace

    datastore = Datastore.get(ws, datastore_name='your datastore name')

Upload Data

    datastore.upload(src_dir='your source directory',
                     target_path='your target path',
                     overwrite=True,
                     show_progress=True)

Download Data

    datastore.download(target_path='your target path',
                       prefix='your prefix',
                       show_progress=True)

Consume Data from a Data Set with SDK

00:08:54

Lesson Description:

In this lesson, we learn how to create and register data sets using the Azure SDK. We review a few different methods to complete this task.

Datastore Data Set Creation

    from azureml.core import Workspace, Datastore, Dataset

    datastore_name = 'your datastore name'
    workspace = Workspace.from_config()
    datastore = Datastore.get(workspace, datastore_name)

    # Each path is a (datastore, relative path) tuple
    datastore_paths = [(datastore, 'Titanic.csv')]
    titanic_ds2 = Dataset.Tabular.from_delimited_files(path=datastore_paths)

Web Data Set

    from azureml.core import Dataset
    from azureml.data.dataset_factory import DataType

    web_path = 'https://dprepdata.blob.core.windows.net/demo/Titanic.csv'
    titanic_ds = Dataset.Tabular.from_delimited_files(path=web_path,
                                                      set_column_types={'Survived': DataType.to_bool()})

    # Preview the first three rows as a pandas DataFrame
    titanic_ds.take(3).to_pandas_dataframe()

Register Data Sets

    titanic_ds = titanic_ds.register(workspace=workspace,
                                     name='titanic_ds',
                                     description='titanic training data')

This Microsoft doc contains some of the source code and additional information on the topic for those wanting to go further: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-register-datasets

Choosing an Estimator in Azure Machine Learning

00:05:55

Lesson Description:

In this video, we review what an Estimator is as well as how we find and utilize code snippets to interact with various frameworks and compute targets. Here is a handy document for Estimator class code snippets: https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.estimator?view=azure-ml-py

Exam Essentials and References

00:02:36

Lesson Description:

In this video, we review key concepts from the section and additional resources to help prepare for the DP-100.

Generate Metrics from an Experiment Run

Upcoming Lesson: Logging Metrics from an Experiment Run

Lesson Description:

In this lesson, we teach two important ways (focusing on Jupyter Notebook and Designer) to "turn on" metrics from an Experiment run.

Upcoming Lesson: Retrieving and Viewing Experiment Outputs

Lesson Description:

In this lesson, we take a look at the metrics we logged in the last video, see what is available, and cover how we can access our metrics. We also examine the differences between no metrics, single-run, and multi-run metrics in an Experiment.

Upcoming Lesson: Using Logs to Troubleshoot Experiment Run Errors

Lesson Description:

In this lesson, we examine ways we can access log information to troubleshoot individual steps or entire run errors. We also examine some tips to make sure that we are gathering the information we need to be successful. Additional troubleshooting tips and Python instructions: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-debug-pipelines

Exam Essentials and References

00:02:48

Lesson Description:

In this video, we review key concepts from the section and additional resources to help prepare for the DP-100.

Conclusion

Review and Final Notes

00:05:02

Lesson Description:

In this video, we recap this course with some quick tips for exam preparation.

What's Next

00:01:56

Lesson Description:

In our final video, we take a forward look at DP-100 Part 3.
