How to Interpret Machine Learning Model Results (5)

How to interpret model results in Azure Machine Learning

By garyericson. Last updated: 04/21/2015

In this article:

Classification
Regression
Clustering
Recommender System

Understanding and Visualizing 'Score Model' Output

This topic explains how to visualize and interpret prediction results in Azure Machine Learning Studio.

After you have trained a model and made predictions with it ("scored the model"), you need to understand and interpret the prediction results you have obtained.

There are four major kinds of machine learning models in Azure Machine Learning:

· Classification
· Clustering
· Regression
· Recommender systems

The modules used to make predictions on top of these models (called "scoring" them), given some test data, are:

· Score Model module for classification and regression
· Assign to Clusters module for clustering
· Score Matchbox Recommender module for recommender systems

This document explains how to interpret prediction results for each of these modules.

For an overview of these kinds of models, see How to choose parameters to optimize your algorithms in Azure Machine Learning.

This topic addresses prediction interpretation but not model evaluation. For more information on how to evaluate your model, see How to evaluate model performance in Azure Machine Learning.

If you are new to Azure Machine Learning and need help creating a simple experiment to get started, see Create a simple experiment in Azure Machine Learning Studio.

Classification

There are two sub-categories of classification problems:

· Problems with only two classes (two-class or binary classification)
· Problems with more than two classes (multi-class classification)

Azure Machine Learning has different modules for each of these types of classification, but the ways to interpret their prediction results are very similar.

We will discuss two-class classification problems first, and then address multi-class classification problems.

Two-class classification

Example experiment

An example of a two-class classification problem is the classification of Iris flowers: the task is to classify Iris flowers based on their features.

Note that the Iris dataset provided in Azure Machine Learning is a subset of the popular Iris dataset, containing instances of only two flower species (classes 0 and 1).

There are four features for each flower (sepal length, sepal width, petal length, and petal width).

An experiment has been built to solve this problem, as shown in Figure 1.

Figure 1 Experiment of Iris Two-Class Classification Problem

A two-class boosted decision tree model has been trained and scored.

We can now visualize the prediction results from the Score Model module by clicking its output port and then clicking Visualize in the menu that appears.

This brings up the scoring results, as shown in Figure 2.

Figure 2 Visualizing the Score Model Result in Two-Class Classification

Result interpretation

There are six columns in the results table.

The left four columns are the four features.

The right two columns, Scored Labels and Scored Probabilities, are the prediction results.

The Scored Probabilities column shows the probability that a flower belongs to the positive class (class 1).

For example, the first number in the column, 0.028571, means there is a probability of 0.028571 that the first flower belongs to class 1.

The Scored Labels column shows the predicted class for each flower. This is based on the Scored Probabilities column.

If the scored probability of a flower is larger than 0.5, it is predicted as class 1. Otherwise, it is predicted as class 0.
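The labeling rule can be sketched in a few lines of Python. This is a minimal illustration of the rule as stated here, not the Score Model implementation: the function name `scored_label` and its `threshold` parameter are assumptions made for the example. The two sample probabilities are values quoted in this article.

```python
def scored_label(scored_probability: float, threshold: float = 0.5) -> int:
    """Return 1 if the positive-class probability is larger than the threshold, else 0."""
    # Hypothetical helper: mirrors the "larger than 0.5" rule described in the text.
    return 1 if scored_probability > threshold else 0


# 0.028571 and 0.9655 are the two probabilities quoted in this article.
for p in (0.028571, 0.9655):
    print(p, "->", scored_label(p))
```

Because the rule is "larger than 0.5", a probability of exactly 0.5 falls into class 0 in this sketch.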


Web service publication

Once the prediction results have been understood and judged sound, the experiment can be published as a web service, so that it can be deployed in various applications and called to obtain class predictions for any new Iris flower.

For the procedure to change a training experiment into a scoring experiment and publish it as a web service, see Publish the Azure Machine Learning web service.

Following this procedure gives you a scoring experiment like the one shown in Figure 3.

Figure 3 Scoring Experiment of Iris Two-Class Classification Problem

Now we need to set the input and output for the web service.

The input is the right input port of Score Model, which receives the Iris flower features.

The choice of output depends on whether we are interested in the predicted class (scored label), the scored probability, or both.

Here, we assume that we are interested in both.

To select the desired output columns, we need to use a Project Columns module.

We click the Project Columns module, click Launch column selector in the right panel, and select Scored Labels and Scored Probabilities.
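Conceptually, this step keeps two named columns from the scored dataset and drops the rest. The following Python analogy is only a sketch of that idea, not how the Project Columns module is implemented. The `project_columns` helper, the representation of a row as a dictionary, and the feature values are all assumptions made for illustration. The column names Scored Labels and Scored Probabilities come from the results table described earlier.

```python
def project_columns(rows, keep):
    """Keep only the named columns in each row (a row is a dict of column -> value)."""
    return [{name: row[name] for name in keep} for row in rows]


# One illustrative scored row; the four feature values are made up for the example.
scored = [
    {
        "sepal length": 5.1,
        "sepal width": 3.5,
        "petal length": 1.4,
        "petal width": 0.2,
        "Scored Labels": 0,
        "Scored Probabilities": 0.028571,
    },
]

output = project_columns(scored, ["Scored Labels", "Scored Probabilities"])
print(output)
```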

After setting the output port of the Project Columns module as the web service output and running the experiment again, we are ready to publish the scoring experiment as a web service by clicking the PUBLISH WEB SERVICE button at the bottom.

The final experiment looks like Figure 4.

Figure 4 Final Scoring Experiment of Iris Two-Class Classification Problem

After running the web service and entering the feature values of a test instance, the service returns two numbers, as shown in Figure 5.

The first number is the scored label, and the second is the scored probability.

This flower is predicted as class 1 with a probability of 0.9655.
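A client consuming the service might unpack the two returned numbers as follows. This is a sketch under a stated assumption: the result has already been parsed into a pair of values in the order described here (label first, probability second). The article does not show the actual response format of the web service, so the `describe_result` helper and the tuple shape are hypothetical.

```python
def describe_result(result):
    """Turn a (scored label, scored probability) pair into a readable sentence."""
    # Assumed shape: the first value is the scored label, the second the scored probability.
    label, probability = result
    return f"Predicted class {int(label)} (probability of class 1: {float(probability):.4f})"


# The values reported for the test flower in this article.
print(describe_result((1, 0.9655)))
```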

Figure 5 Web Service Result of Iris Two-Class Classification
