모델 성능 평가하기 (9)

How to evaluate model performance in Azure Machine Learning
Azure기계학습의 모델성능 평가방법

By garyericson Last updated: 04/21/2015

In this article

Evaluation vs. Cross Validation 평가 대 교차검증

Evaluating a Binary Classification Model 이진분류모델 평가

Evaluating a Multiclass Classification Model 다중클래스분류모델 평가

This topic demonstrates how to evaluate the performance of a model in Azure Machine Learning Studio and provides a brief explanation of the metrics available for this task.
이 항목에서는 Azure ML 스튜디오에서 모델의 성능평가 방법을 보여주며, 이 작업에 사용할 수 있는 메트릭(지표,척도,측정)을 간략하게 설명합니다.

Azure Machine Learning에서는 일련의 메트릭을 생성하여 모델의 성능을 평가할 수 있습니다.
따를 수 있는 단계는 다음과 같습니다:
In Azure Machine Learning, you can evaluate the performance of a model by generating a set of metrics. Here are the steps you can follow 1:

After you have generated a set of scores using the Score Model component, connect the scored dataset to the Evaluate Model component.
점수모델 구성요소를 사용하여 점수집합을 생성한 후, 점수가 매겨진 데이터집합을 모델평가 구성요소에 연결합니다.
Right-click the Evaluate Model component and select Visualize. This shows a sample of the results, including the metrics used to evaluate the model’s accuracy (performance) 1.
모델평가 구성요소를 마우스 오른쪽 버튼으로 클릭하고 시각화를 선택합니다. 이를 통해 모델의 정확성(성능)을 평가하는 데 사용되는 다양한 측정항목이 포함된 결과 샘플을 볼 수 있습니다.

Model validation is also an important part of this process. It helps ensure that the model performs well on new data and aids in selecting the best model, parameters, and accuracy metrics 2.
모델 검증도 이 프로세스의 중요한 부분입니다.
이는 모델이 새로운 데이터에서 제대로 작동하는지 확인하고 최상의 모델, 매개변수 및 정확도 지표를 선택하는 데 도움이 됩니다.

Source: Conversation with Bing, November 7, 2023.
(1) Score Model: Component Reference - Azure Machine Learning. https://learn.microsoft.com/en-us/azure/machine-learning/comComponent-reference/score-model?view=azureml-api-2.
(2) Model Validation in Azure ML Studio | Pluralsight. https://www.pluralsight.com/guides/model-validation-in-azure-ml-studio.
(3) Evaluate Model: Component Reference - Azure Machine Learning. https://learn.microsoft.com/en-us/azure/machine-learning/comComponent-reference/evaluate-model?view=azureml-api-2.
(4) How to evaluate model performance in Azure Machine Learning. https://github.com/uglide/azure-content/blob/master/articles/machine-learning/machine-learning-evaluate-model-performance.md.
(5) Untitled. https://avatars.githubusercontent.com/u/1655867?v=4.
(6) Untitled. https://github.com/uglide/azure-content/blob/master/articles/machine-learning/machine-learning-evaluate-model-performance.md?raw=true.
(7) Untitled. https://desktop.github.com.
(8) Untitled. https://docs.github.com/articles/about-issue-and-pull-request-templates.
(9) Untitled. https://github.com/uglide/azure-content/raw/master/articles/machine-learning/machine-learning-evaluate-model-performance.md.

Three common supervised learning scenarios are presented:
다음 3가지 일반적인 지도학습 시나리오가 제공됩니다.

regression 회귀
binary classification 이진분류
multiclass classification 다중클래스 분류

Evaluating the performance of a model is one of the core stages in the data science process.
모델성능평가는 데이터과학 프로세스의 핵심 단계 중 하나입니다.

It indicates how successfully a trained model has scored (made predictions on) a dataset.
모델성능평가는 훈련된 모델이 데이터집합을 얼마나 정확하게 채점(예측) 했는지를 보여줍니다.

Azure Machine Learning supports model evaluation through two of its main machine learning modules: Evaluate Model and Cross-Validate Model.
Azure ML은 주요 ML모듈 중 모델평가 및 모델 교차-검증 이라는 2가지 모듈을 통해 모델평가를 지원합니다.

These modules allow you to see how your model performs in terms of a number of metrics that are commonly used in machine learning and statistics.
이러한 모듈을 사용하여 기계학습 및 통계에서 일반적으로 사용되는 여러 메트릭으로 모델의 성능을 확인할 수 있습니다.

Evaluation vs. Cross Validation 평가 및 교차검증

Evaluation and cross validation are standard ways to measure the performance of your model.
평가 및 교차검증은 모델성능을 측정하는 표준 방법입니다.

They both generate evaluation metrics that you can inspect or compare against those of other models.
두 방법 모두 검사하거나 또는 다른 모델과 비교할 수 있는 평가 메트릭을 만들어냅니다.

Evaluate Model expects a scored dataset as input (or 2 in case you would like to compare the performance of 2 different models).
모델평가에서는 채점된 데이터집합을 입력으로 사용해야 합니다(또는 2가지 모델의 성능을 비교 하려는 경우 2개의 데이터집합이 필요).

This means that you need to train your model using the Train Model module and make predictions on some dataset using the Score Model module, before you can evaluate the results.
따라서 결과를 평가하려면 먼저 모델훈련 모듈을 사용하여 모델을 훈련시키고, 모델채점 모듈 을 사용하여 일부 데이터집합을 예측해야 합니다.

The evaluation is based on the scored labels/probabilities along with the true labels, all of which are output by the Score Model module.
모델평가는 true 레이블과 함께 채점된 레이블/확률을 기반으로 하는데, 이들 모두 모델채점 모듈의 산출물입니다.

Alternatively, you can use cross validation to perform a number of train-score-evaluate operations (10 folds) automatically on different subsets of the input data.
또는 교차검증을 사용하여, 입력데이터의 여러 하위집합을 갖고 많은 훈련-채점-평가작업(접기 수 10)을 자동으로 수행할 수 있습니다.

The input data is split into 10 parts, where one is reserved for testing, and the other 9 for training.
입력데이터를 10개 부분으로 나누어서, 1개는 테스트용(testing)으로, 나머지 9개는 훈련용 (training)으로 예약합니다.

This process is repeated 10 times and the evaluation metrics are averaged.
이 프로세스가 10번 반복되어 평가 메트릭의 평균이 계산됩니다.

This helps in determining how well a model would generalize to new datasets.
이는 모델이 새 데이터집합에 얼마나 잘 일반화되는지를 결정하는 데 도움이 됩니다.

The Cross-Validate Model module takes in an untrained model and some labeled dataset and outputs the evaluation results of each of the 10 folds, in addition to the averaged results.
모델 교차검증 모듈은 훈련안된 모델과 레이블이 지정된 일부 데이터집합을 사용하여, 10번의 접기 각각에 대한 평가 결과를 평균 결과와 함께 출력합니다.
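
The 10-fold procedure described above can be sketched in plain Python. This is a minimal illustration, not the Cross-Validate Model module's implementation: the helper names (`k_fold_indices`, `cross_validate`), the toy model that always predicts the training mean, and the unshuffled contiguous folds are all assumptions made for the example.

```python
from statistics import mean


def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(xs, ys, train_fn, metric_fn, k=10):
    """Train on k-1 folds, evaluate on the held-out fold, and repeat k times."""
    scores = []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train_idx = [i for i in range(len(xs)) if i not in held]
        model = train_fn([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        predictions = [model(xs[i]) for i in held_out]
        scores.append(metric_fn([ys[i] for i in held_out], predictions))
    # Return the per-fold metric values together with their average.
    return scores, mean(scores)


def train_mean_model(train_x, train_y):
    """Hypothetical toy learner: always predicts the mean training target."""
    average = mean(train_y)
    return lambda x: average


def mean_absolute_error(actual, predicted):
    return mean(abs(a - p) for a, p in zip(actual, predicted))


xs = list(range(20))
ys = [2.0 * x for x in xs]
fold_scores, average_score = cross_validate(xs, ys, train_mean_model, mean_absolute_error)
print(len(fold_scores), average_score)
```

Each sample lands in exactly one held-out fold, so every row is used for testing once and for training nine times.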

In the following sections, we will build simple regression and classification models and evaluate their performance, using both the Evaluate Model and the Cross-Validate Model modules.
다음 섹션에서는 간단한 회귀 및 분류 모델을 구축하고, 모델평가 및 모델 교차검증 모듈을 모두 사용하여 해당 성능을 평가합니다.

Evaluating a Regression Model 회귀모델 평가

Assume we want to predict a car’s price using some features such as dimensions, horsepower, engine specs, and so on.
크기, 마력, 엔진 사양 등 몇 가지 특성을 사용하여 자동차 가격을 예측하려고 합니다.

This is a typical regression problem, where the target variable (price) is a continuous numeric value.
이는 목적 변수(target variable) (가격)가 연속 숫자 값인 전형적인 회귀문제입니다.

We can fit a simple linear regression model that, given the feature values of a certain car, can predict the price of that car.
특정 자동차의 특성 값이 주어진 경우, 해당 자동차 가격을 예측할 수 있는 간단한 선형 회귀 모델을 만들 수 있습니다.

This regression model can be used to score the same dataset we trained on.
이 회귀모델을 사용하여, 훈련된 동일 데이터집합의 점수를 매길 수 있습니다.

Once we have the predicted prices for all of the cars, we can evaluate the performance of the model by looking at how much the predictions deviate from the actual prices on average.
모든 자동차 가격을 예측한 후, 예측이 실제 가격에서 평균적으로 어느 정도 벗어났는지 확인 하여 모델성능을 평가할 수 있습니다.

To illustrate this, we use the Automobile price data (Raw) dataset available in the Saved Datasets section in Azure Machine Learning Studio.
이 예제에서는 Azure ML 스튜디오의 저장된 데이터집합 섹션에서 제공된 Automobile price data (Raw) dataset을 사용합니다.

Creating the Experiment 실험 만들기

Add the following modules to your workspace in Azure Machine Learning Studio:
다음 모듈을 Azure ML 스튜디오의 작업 영역에 추가합니다.

Automobile price data (Raw) 자동차 가격 데이터(원시)
Linear Regression 선형 회귀
Train Model 모델훈련
Score Model 모델채점
Evaluate Model 모델평가

Connect the ports as shown below in Figure 1 and set the Label column of the Train Model module to price.
아래 그림1에 표시된 대로 포트를 연결하고 모델훈련 모듈의 레이블 열(Label column)을 price로 설정합니다.

Figure 1. Evaluating a Regression Model. 그림1. 회귀모델 평가

Inspecting the Evaluation Results 평가결과 검사

After running the experiment, you can click on the output port of the Evaluate Model module and select Visualize to see the evaluation results.
실험을 실행한 후 모델평가 모듈의 출력포트를 클릭하고 Visualize를 선택하여 평가결과를 확인 할 수 있습니다.

The evaluation metrics available for regression models are: Mean Absolute Error, Root Mean Squared Error, Relative Absolute Error, Relative Squared Error, and the Coefficient of Determination.
회귀모델에서 사용 가능한 평가 메트릭은 Mean Absolute Error, Root Mean Squared Error, Relative Absolute Error, Relative Squared Error 및 Coefficient of Determination입니다.

The term "error" here represents the difference between the predicted value and the true value.
여기서 "오차(error)"는 예측값(predicted value)과 실제값(true value) 간의 차이를 나타냅니다.

The absolute value or the square of this difference is usually computed to capture the total magnitude of error across all instances, as the difference between the predicted and true value could be negative in some cases.
예측값과 실제값간 차이는 경우에 따라 음수일 수 있으므로 모든 인스턴스에서 오차의 총 크기 를 확보하기 위해 일반적으로 이 차이의 절대값 또는 제곱이 계산됩니다.

The error metrics measure the predictive performance of a regression model in terms of the mean deviation of its predictions from the true values.
오차 메트릭은, 실제값과 예측값의 평균편차로, 회귀모델의 예측성능을 측정합니다.

Lower error values mean the model is more accurate in making predictions.
오차값이 작을수록 모델예측이 더 정확함을 의미합니다.

An overall error metric of 0 means that the model fits the data perfectly.
전체 오차 메트릭 0은 모델이 데이터에 완벽하게 적합함을 의미합니다.

The coefficient of determination, which is also known as R squared, is also a standard way of measuring how well the model fits the data.
R제곱으로도 알려진 결정계수(coefficient of determination)도 모델이 데이터에 적합한 정도를 측정하는 표준방법 입니다.

It can be interpreted as the proportion of variation explained by the model.
결정계수는 모델이 설명하는 변형의 비율(proportion of variation)로 해석될 수 있습니다.

A higher proportion is better in this case, where 1 indicates a perfect fit.
이 경우 비율이 높을수록 좋으며, 1은 완벽한 적합을 나타냅니다.
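
As a rough illustration of how these regression metrics relate to one another, the sketch below computes them with the standard textbook formulas in plain Python. The function name `regression_metrics` and the four sample prices are hypothetical, and the formulas are the usual definitions rather than something confirmed from the Evaluate Model module's source.

```python
from math import sqrt


def regression_metrics(actual, predicted):
    """Compute common regression error metrics from true and predicted values."""
    n = len(actual)
    mean_actual = sum(actual) / n
    abs_errors = [abs(a - p) for a, p in zip(actual, predicted)]
    sq_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    # Baselines: the error of a model that always predicts the mean of the true values.
    abs_baseline = sum(abs(a - mean_actual) for a in actual)
    sq_baseline = sum((a - mean_actual) ** 2 for a in actual)
    relative_squared_error = sum(sq_errors) / sq_baseline
    return {
        "mean_absolute_error": sum(abs_errors) / n,
        "root_mean_squared_error": sqrt(sum(sq_errors) / n),
        "relative_absolute_error": sum(abs_errors) / abs_baseline,
        "relative_squared_error": relative_squared_error,
        "coefficient_of_determination": 1.0 - relative_squared_error,
    }


# Hypothetical car prices, each prediction off by exactly 1000.
actual_prices = [10000.0, 15000.0, 20000.0, 25000.0]
predicted_prices = [11000.0, 14000.0, 21000.0, 24000.0]
metrics = regression_metrics(actual_prices, predicted_prices)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these sample values the absolute errors are all 1000, so the mean absolute error and root mean squared error coincide, and the coefficient of determination is one minus the relative squared error.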

Figure 2. Linear Regression Evaluation Metrics. 그림2. 선형 회귀평가 메트릭

Using Cross Validation 교차검증 사용

As mentioned earlier, you can perform repeated training, scoring and evaluations automatically using the Cross-Validate Model module.
앞서 설명한 바와 같이 모델 교차검증 모듈을 사용하여 반복적인 훈련, 채점 및 평가를 자동으로 수행할 수 있습니다.

All you need in this case is a dataset, an untrained model, and a Cross-Validate Model module (see figure below).
이 경우, 데이터집합, 훈련 안된 모델 및 모델 교차검증 모듈만 있으면 됩니다(아래 그림 참조).

Note that you need to set the label column to price in the Cross-Validate Model module’s properties.
모델 교차검증 모듈의 속성에서 레이블 열(label column)을 price로 설정해야 합니다.

Figure 3. Cross-Validating a Regression Model. 그림3. 회귀모델 교차검증

After running the experiment, you can inspect the evaluation results by clicking on the right output port of the Cross-Validate Model module.
실험을 실행한 후 모델 교차검증 모듈의 오른쪽 출력포트를 클릭하여 평가 결과를 검사할 수 있습니다.

This will provide a detailed view of the metrics for each iteration (fold), and the averaged results of each of the metrics (Figure 4).
각 반복(접기)에 대한 메트릭과 각 메트릭의 평균 결과에 대한 상세 보기가 제공됩니다(그림4).

Figure 4. Cross-Validation Results of a Regression Model.
그림4. 회귀모델의 교차검증 결과

Evaluating a Binary Classification Model 이진분류모델 평가

In a binary classification scenario, the target variable has only two possible outcomes, for example: {0, 1} or {false, true}, {negative, positive}.
이진분류 시나리오에서 대상변수의 가능한 결과는 2가지뿐으로, 예를 들면 {0, 1} 또는 {false, true}, {negative, positive}입니다.

Assume you are given a dataset of adult employees with some demographic and employment variables, and that you are asked to predict the income level, a binary variable with the values {“<=50K”, “>50K”}.
일부 인구통계 및 고용변수가 포함된 성인 직원의 데이터집합에서, 소득수준(값이 {"<=50K", ">50K"}인 이진변수)을 예측해 보겠습니다.

In other words, the negative class represents the employees who make less than or equal to 50K per year, and the positive class represents all other employees.
다른 말로, 부정 클래스는 연 소득이 50K 이하인 직원을 나타내고, 긍정 클래스는 나머지 모든 직원을 나타냅니다.

As in the regression scenario, we would train a model, score some data, and evaluate the results.
회귀 시나리오와 마찬가지로 모델을 훈련하고, 일부 데이터의 점수를 매긴 다음, 결과를 평가 합니다.

The main difference here is the choice of metrics Azure Machine Learning computes and outputs.
가장 큰 차이점은 Azure ML이 계산하고 출력하는 메트릭의 선택입니다.

To illustrate the income level prediction scenario, we will use the Adult dataset to create an Azure Machine Learning experiment and evaluate the performance of a two-class logistic regression model, a commonly used binary classifier.
소득수준 예측 시나리오를 보여주기 위해 Adult 데이터집합을 사용하여 Azure ML 실험을 만들고, 일반적으로 사용되는 이진 분류기인 2-클래스 로지스틱 회귀모델의 성능을 평가 합니다.

Creating the Experiment 실험 만들기

Add the following modules to your workspace in Azure Machine Learning Studio:
다음 모듈을 Azure기계학습 스튜디오의 작업 영역에 추가합니다.

Adult Census Income Binary Classification dataset 성인 인구조사 소득 이진분류 데이터집합
Two-Class Logistic Regression 2-클래스 로지스틱 회귀
Train Model 모델훈련
Score Model 모델채점
Evaluate Model 모델평가

Connect the ports as shown below in Figure 5 and set the Label column of the Train Model module to income.
아래 그림5에 표시된 대로 포트를 연결하고 모델훈련 모듈의 레이블 열(Label column)을 income으로 설정합니다.

Figure 5. Evaluating a Binary Classification Model. 그림 5. 이진분류모델 평가

Inspecting the Evaluation Results 평가결과 검사

After running the experiment, you can click on the output port of the Evaluate Model module and select Visualize to see the evaluation results (Figure 7).
실험 실행 후 모델평가 모듈의 출력포트를 클릭하고 Visualize를 선택하여 평가결과를 확인할 수 있습니다(그림 7).

The evaluation metrics available for binary classification models are: Accuracy, Precision, Recall, F1 Score, and AUC.
이진분류모델에 사용할 수 있는 평가 메트릭은 Accuracy, Precision, Recall, F1 Score 및 AUC입니다.

In addition, the module outputs a confusion matrix showing the number of true positives, false negatives, false positives, and true negatives, as well as ROC, Precision/Recall, and Lift curves.
또한 이 모듈은 ROC, Precision/Recall 및 Lift 곡선뿐만 아니라 참 긍정, 거짓 부정, 거짓 긍정 및 참 부정의 개수를 보여 주는 혼동행렬(confusion matrix)을 출력합니다.

Accuracy is simply the proportion of correctly classified instances.
정확도(Accuracy)는 정확히 분류된 인스턴스의 비율일 뿐입니다.

It is usually the first metric you look at when evaluating a classifier.
정확도는 일반적으로 분류기(classifier)를 평가할 때 보게되는 첫 번째 메트릭입니다.

However, when the test data is unbalanced (where most of the instances belong to one of the classes), or you are more interested in the performance on either one of the classes, accuracy doesn’t really capture the effectiveness of a classifier.
그러나 테스트 데이터가 불균형하거나(대부분의 인스턴스가 1개 클래스에 속한 경우) 클래스 중 1개 성능에 더 많은 관심이 있는 경우 정확도는 실제로 분류자의 효과를 캡처하지 못합니다.

In the income level classification scenario, assume you are testing on some data where 99% of the instances represent people who earn less than or equal to 50K per year.
소득수준 분류 시나리오에서는 인스턴스의 99%가 연 소득이 50K 이하인 사람을 나타내는 일부 데이터를 테스트하는 것으로 가정합니다.

It is possible to achieve a 0.99 accuracy by predicting the class “<=50K” for all instances.
0.99의 정확도는 모든 인스턴스에 대해 "<=50K" 클래스를 예측하여 달성할 수 있습니다.

The classifier in this case appears to be doing a good job overall, but in reality, it fails to classify any of the high-income individuals (the 1%) correctly.
이 경우 분류기는 전반적으로 양호한 것처럼 보이지만 실제로는 고소득자(1%)를 올바르게 분류하지 못합니다.

For that reason, it is helpful to compute additional metrics that capture more specific aspects of the evaluation.
따라서 평가의 보다 특정한 측면을 캡처하는 추가 메트릭을 계산하는 것이 좋습니다.
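
The 99% scenario above is easy to reproduce. The sketch below builds a hypothetical test set of 1,000 labels (990 low-income, 10 high-income) and a classifier that always predicts “<=50K”; the data is invented purely to illustrate the point.

```python
# Hypothetical, heavily unbalanced test set: 99% of the instances are "<=50K".
actual = ["<=50K"] * 990 + [">50K"] * 10

# A trivial classifier that predicts the majority class for every instance.
predicted = ["<=50K"] * len(actual)

accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
high_income_found = sum(a == ">50K" and p == ">50K" for a, p in zip(actual, predicted))

print(f"accuracy = {accuracy:.2f}")
print(f"high-income individuals correctly identified = {high_income_found}")
```

Accuracy comes out at 0.99 even though not a single high-income individual is identified.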

Before going into the details of such metrics, it is important to understand the confusion matrix of a binary classification evaluation.
이러한 메트릭에 대해 자세히 알아보기 전에 이진분류평가의 혼동행렬(confusion matrix)을 이해해야 합니다.

The class labels in the training set can take on only 2 possible values, which we usually refer to as positive or negative.
훈련집합의 클래스 레이블은 일반적으로 긍정 또는 부정이라는 2가지 가능한 값만 취할 수 있습니다.

The positive and negative instances that a classifier predicts correctly are called true positives (TP) and true negatives (TN), respectively.
분류기가 올바르게 예측한 긍정 및 부정 인스턴스를 각각 TP(참 긍정) 및 TN(참 부정)이라고 합니다.

Similarly, the incorrectly classified instances are called false positives (FP) and false negatives (FN).
마찬가지로 잘못 분류된 인스턴스는 FP(거짓 긍정) 및 FN(거짓 부정)이라고 합니다.

The confusion matrix is simply a table showing the number of instances that fall under each of these 4 categories.
혼동행렬은 이 4가지 범주 각각에 속하는 인스턴스의 개수를 보여 주는 테이블입니다.
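
Counting the four categories takes only a few lines. The sketch below is a minimal illustration; the function name `confusion_counts` and the sample labels are assumptions made for the example.

```python
def confusion_counts(actual, predicted, positive):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            # Predicted positive: correct only if the true label is positive too.
            counts["TP" if a == positive else "FP"] += 1
        else:
            # Predicted negative: correct only if the true label is negative too.
            counts["FN" if a == positive else "TN"] += 1
    return counts


actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(actual, predicted, positive=1)
print(counts)
```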

Azure Machine Learning automatically decides which of the two classes in the dataset is the positive class.
Azure ML은 데이터집합의 2가지 클래스 중 어느 것이 긍정 클래스인지 자동으로 결정합니다.

If the class labels are Boolean or integers, then the ‘true’ or ‘1’ labeled instances are assigned the positive class.
클래스 레이블 이 부울 또는 정수인 경우에는 'true' 또는 '1' 로 레이블된 인스턴스에 긍정 클래스가 할당됩니다.

If the labels are strings, as in the case of the income dataset, the labels are sorted alphabetically and the first level is chosen to be the negative class while the second level is the positive class.
레이블이 문자열인 경우에는 소득 데이터집합의 경우처럼, 레이블이 알파벳순으로 정렬되고 첫 번째 수준은 부정 클래스로, 두 번째 수준은 긍정 클래스로 선택됩니다.
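
The selection rule described above can be imitated with a simple sort. This sketch only mirrors the rule as stated in the text; the function name `choose_classes` is hypothetical, and Python's code-point ordering is assumed to match the alphabetical ordering the service uses, which may not hold for every label set.

```python
def choose_classes(labels):
    """Return (negative, positive) for a two-class label column.

    Sorting puts False before True, 0 before 1, and the alphabetically
    first string before the second, so the second value is the positive class.
    """
    distinct = sorted(set(labels))
    if len(distinct) != 2:
        raise ValueError("expected exactly two distinct labels")
    negative, positive = distinct
    return negative, positive


print(choose_classes(["<=50K", ">50K", "<=50K"]))
print(choose_classes([0, 1, 1, 0]))
print(choose_classes([True, False]))
```

For the income labels, “<=50K” sorts before “>50K”, so “>50K” becomes the positive class, which matches the convention used in this example.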

Figure 6. Binary Classification Confusion Matrix. 그림6. 이진분류 혼동행렬

Going back to the income classification problem, we would want to ask several evaluation questions that help us understand the performance of the classifier used.
소득분류 문제로 돌아가서, 사용된 분류자의 성능을 이해하는 데 도움이 되는 몇 가지 평가 질문을 해보겠습니다.

A very natural question is: ‘Out of the individuals whom the model predicted to be earning >50K (TP+FP), how many were classified correctly (TP)?’
매우 자연스러운 질문은 '모델에서 50K(TP+FP)를 초과할 것으로 예측한 사람 중 정확히 분류된 (TP) 사람은 몇 명입니까?' 입니다.

This question can be answered by looking at the Precision of the model, which is the proportion of predicted positives that are actually positive: TP/(TP+FP).
이 질문에 대한 답은 정확히 분류된 긍정의 비율인 모델의 Precision(정밀도), 즉 TP/(TP+FP)를 확인하여 얻을 수 있습니다.

Another common question is: “Out of all the high-earning employees with income >50K (TP+FN), how many did the classifier classify correctly (TP)?”
또 다른 일반적 질문은 "소득이 50k(TP+FN)를 초과하는 모든 고소득 직원 중 분류자가 올바르게 분류한(TP) 직원은 몇 명입니까?"입니다.

This is actually the Recall, or the true positive rate: TP/(TP+FN) of the classifier.
이는 실제로 재현률(Recall) 또는 참 긍정 비율, 즉 분류자의 TP/(TP+FN)입니다.

You might notice that there is an obvious trade-off between precision and recall.
정밀도(precision)와 재현율(recall) 사이에는 명확한 상충 관계(trade-off)가 있음을 알 수 있습니다.

For example, given a relatively balanced dataset, a classifier that predicts mostly positive instances would have a high recall but a rather low precision, because many of the negative instances would be misclassified, resulting in a large number of false positives.
예를 들어 비교적 균형 잡힌 데이터집합에서 대부분의 인스턴스를 긍정으로 예측하는 분류기는 재현율이 높지만, 많은 부정 인스턴스가 잘못 분류되어 거짓 긍정이 많이 발생하므로 정밀도가 낮습니다.

To see a plot of how these two metrics vary, you can click on the ‘PRECISION/RECALL’ curve in the evaluation result output page (top left part of Figure 7).
이 두 메트릭이 어떻게 달라지는지에 대한 그림을 보려면 평가결과 출력 페이지에서 ‘PRECISION/RECALL(정밀도/재현율)' 곡선(그림7의 왼쪽 위)을 클릭하면 됩니다.

Figure 7. Binary Classification Evaluation Results. 그림 7. 이진분류 평가 결과

Another related metric that is often used is the F1 Score, which takes both precision and recall into consideration.
자주 사용되는 또 다른 관련 메트릭은 정밀도와 재현율을 둘 다 고려하는 F1점수입니다.

It is the harmonic mean of these two metrics and is computed as follows: F1 = 2 × (precision × recall) / (precision + recall).
F1점수는 이 두 메트릭의 조화 평균이며, 다음과 같이 계산됩니다: F1 = 2 × (정밀도 × 재현율) / (정밀도 + 재현율).

The F1 score is a good way to summarize the evaluation in a single number, but it is good practice to look at precision and recall together to better understand how a classifier behaves.
F1점수는 평가를 단일 숫자로 요약하는데 적합한 방법이지만 분류자의 동작 방식을 보다 잘 이해하려면 항상 정밀도와 재현율을 함께 확인하는 것이 좋습니다.
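
Precision, recall, and the F1 score follow directly from the confusion matrix counts. The sketch below is illustrative only: the function name `precision_recall_f1` and the sample counts are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen for this example.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion matrix counts."""
    # Assumption: undefined ratios (zero denominator) are reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 30 true positives, 10 false positives, 20 false negatives.
precision, recall, f1 = precision_recall_f1(tp=30, fp=10, fn=20)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, the F1 score here sits closer to the recall (0.6) than to the precision (0.75).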

In addition, one can inspect the true positive rate vs. the false positive rate in the Receiver Operating Characteristic (ROC) curve and the corresponding Area Under the Curve (AUC) value.
또한 ROC(Receiver Operating Characteristic) 곡선에서 참 긍정 비율과 거짓 긍정 비율을 검사 하고 해당 AUC(Area Under the Curve) 값을 확인할 수 있습니다.

The closer this curve is to the upper left corner, the better the classifier’s performance (that is, the true positive rate is maximized while the false positive rate is minimized).
이 곡선이 왼쪽 위 모서리에 가까울수록 분류자의 성능이 좋습니다(참 긍정 비율이 최대화되고 거짓 긍정 비율이 최소화됨).

Curves that are close to the diagonal of the plot result from classifiers whose predictions are close to random guessing.
그림의 대각선에 가까운 곡선은 예측 경향이 임의 추측에 가까운 분류자의 결과입니다.
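
To make the ROC curve and AUC concrete, the sketch below sweeps a decision threshold over a handful of scored probabilities and integrates the resulting curve with the trapezoidal rule. The function names `roc_points` and `auc` and the six sample scores are hypothetical, and this is not necessarily how Azure Machine Learning computes the value.

```python
def roc_points(actual, scores):
    """Return (false positive rate, true positive rate) pairs for each threshold."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; an instance is predicted positive
    # when its score is at or above the threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 1)
        fp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under a piecewise-linear curve, using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


actual = [1, 1, 0, 1, 0, 0]          # true labels (1 = positive class)
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]  # scored probabilities
curve = roc_points(actual, scores)
area = auc(curve)
print(curve)
print(f"AUC = {area:.4f}")
```

In this sample, eight of the nine positive/negative pairs are ranked correctly, which is why the area comes out at 8/9.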

Using Cross Validation 교차검증 사용

As in the regression example, we can perform cross validation to repeatedly train, score and evaluate different subsets of the data automatically.
회귀 예제처럼 교차검증을 수행하여 데이터의 여러 하위 집합에 대해 반복적인 훈련, 채점 및 평가를 자동으로 수행할 수 있습니다.

Similarly, we can use the Cross-Validate Model module, an untrained logistic regression model, and a dataset.
마찬가지로 모델 교차검증 모듈, 학습되지 않은 로지스틱 회귀모델 및 데이터집합을 사용할 수 있습니다.

The label column must be set to income in the Cross-Validate Model module’s properties.
레이블 열은 모델 교차검증 모듈의 속성에서 *income*으로 설정되어야 합니다.

After running the experiment and clicking on the right output port of the Cross-Validate Model module, we can see the binary classification metric values for each fold, in addition to the mean and standard deviation of each.
실험을 실행한 후 모델 교차검증 모듈의 오른쪽 출력 포트를 클릭하면 각 접기에 대한 이진분류 메트릭 값과 각각의 평균 및 표준 편차를 볼 수 있습니다.

Figure 8. Cross-Validating a Binary Classification Model. 그림 8. 이진분류모델 교차검증

Figure 9. Cross-Validation Results of a Binary Classifier.
그림 9. 이진 분류자의 교차검증 결과

Evaluating a Multiclass Classification Model 다중클래스 분류모델 평가

In this experiment we will use the popular Iris dataset which contains instances of 3 different types (classes) of the iris plant.
이 실험에서는 3가지 유형(클래스)의 붓꽃 인스턴스가 포함된 일반적인 Iris 데이터집합을 사용합니다.

There are 4 feature values (sepal length/width and petal length/width) for each instance.
각 인스턴스에 대해 4개의 특성 값 (꽃받침 길이/너비 및 꽃잎 길이/너비)이 있습니다.

In the previous experiments we trained and tested the models using the same datasets.
앞선 실험에서 동일한 데이터집합을 사용하여 모델을 훈련시키고 테스트했습니다.

Here, we will use the Split module to create 2 subsets of the data, train on the first, and score and evaluate on the second.
여기에서는 분할모듈을 사용하여 데이터의 하위 집합 2개를 만들고 첫 번째 하위 집합을 학습 한 후 두 번째 하위 집합의 점수를 매기고 평가합니다.

The Iris dataset is publicly available on the UCI Machine Learning Repository, and can be downloaded using a Reader module.
Iris 데이터집합은 UCI ML 리포지토리에서 공개적으로 사용할 수 있으며, 판독기 모듈을 사용하여 다운로드할 수 있습니다.

Creating the Experiment 실험 만들기

Add the following modules to your workspace in Azure Machine Learning Studio:
다음 모듈을 Azure ML 스튜디오의 작업 영역에 추가합니다.

Reader 판독기
Multiclass Decision Forest 다중클래스 의사결정 포리스트
Split 분할
Train Model 모델훈련
Score Model 모델채점
Evaluate Model 모델평가

Connect the ports as shown below in Figure 10.
아래의 그림10과 같이 포트를 연결합니다.

Set the Label column index of the Train Model module to 5.
모델훈련 모듈의 레이블 열 인덱스를 5로 설정합니다.

The dataset has no header row but we know that the class labels are in the fifth column.
이 데이터집합에는 헤더 행이 없지만 클래스 레이블이 다섯 번째 열에 있다는 것을 알고 있습니다.

Click on the Reader module and set the Data source property to Web URL via HTTP, and the URL to http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data.
판독기 모듈을 클릭하고 Data source 속성을 *Web URL via HTTP*로, *URL*을 http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data로 설정합니다.

Set the fraction of instances to be used for training in the Split module (0.7 for example).
분할모듈에서 훈련에 사용할 인스턴스 부분을 설정합니다(이 예제의 경우 0.7).
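
The effect of a 0.7 training fraction can be sketched in plain Python. The function name `split_rows`, the fixed random seed, and the choice to shuffle before splitting are assumptions for this example and not a description of the Split module's internals; the 150 placeholder rows stand in for the Iris instances.

```python
import random


def split_rows(rows, fraction=0.7, seed=0):
    """Shuffle the rows, then return (training subset, testing subset)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(round(len(shuffled) * fraction))
    return shuffled[:cut], shuffled[cut:]


rows = list(range(150))  # placeholder row ids for a 150-instance dataset
train_rows, test_rows = split_rows(rows, fraction=0.7)
print(len(train_rows), len(test_rows))
```

Every row ends up in exactly one of the two subsets, so the model is scored only on instances it never saw during training.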

Figure 10. Evaluating a Multiclass Classifier 그림10. 다중클래스 분류기 평가

Inspecting the Evaluation Results 평가 결과 검사

Run the experiment and click on the output port of Evaluate Model.
실험을 실행하고 모델 평가의 출력 포트를 클릭합니다.

The evaluation results are presented in the form of a confusion matrix, in this case.
이 경우 평가 결과가 혼동 행렬 형식으로 제공됩니다.

The matrix shows the actual vs. predicted instances for all 3 classes.
행렬에는 세 클래스 모두에 대해 실제 인스턴스와 예측 인스턴스가 표시됩니다.
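
A multiclass confusion matrix of this kind can be built by counting (actual, predicted) pairs. In the sketch below, rows are actual classes and columns are predicted classes; the function name `confusion_matrix` and the six sample predictions are invented for illustration.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Return (classes, matrix) where matrix[i][j] counts instances of
    actual class i that were predicted as class j."""
    classes = sorted(set(actual) | set(predicted))
    pair_counts = Counter(zip(actual, predicted))
    matrix = [[pair_counts[(a, p)] for p in classes] for a in classes]
    return classes, matrix


# Hypothetical true labels and predictions for six instances.
actual = ["Iris-setosa", "Iris-setosa", "Iris-versicolor",
          "Iris-versicolor", "Iris-virginica", "Iris-virginica"]
predicted = ["Iris-setosa", "Iris-setosa", "Iris-versicolor",
             "Iris-virginica", "Iris-virginica", "Iris-versicolor"]

classes, matrix = confusion_matrix(actual, predicted)
for name, row in zip(classes, matrix):
    print(f"{name:16s} {row}")
```

Correct predictions fall on the diagonal; off-diagonal cells show which classes are confused with each other.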

Figure 11. Multiclass Classification Evaluation Results.
그림11. 다중클래스 분류 평가 결과

Using Cross Validation 교차검증 사용

As mentioned earlier, you can perform repeated training, scoring and evaluations automatically using the Cross-Validate Model module.
앞서 설명한 바와 같이 모델 교차검증 모듈을 사용하여 반복적인 학습, 채점 및 평가를 자동 으로 수행할 수 있습니다.

You would need a dataset, an untrained model, and a Cross-Validate Model module (see figure below).
데이터집합, 훈련안된 모델 및 모델 교차검증 모듈이 필요합니다(아래 그림 참조).

Again, you need to set the label column of the Cross-Validate Model module (column index 5 in this case).
모델 교차검증 모듈의 레이블 열을 설정해야 합니다(이 예제의 경우 열 인덱스 5).

After running the experiment and clicking the right output port of the Cross-Validate Model, you can inspect the metric values for each fold as well as the mean and standard deviation.
실험을 실행한 후 모델 교차검증 모듈의 오른쪽 출력포트를 클릭하면 각 접기에 대한 메트릭 값과 평균 및 표준 편차를 검사할 수 있습니다.

The metrics displayed here are similar to the ones discussed in the binary classification case.
여기에 표시된 메트릭은 이진 분류 예제에서 설명한 메트릭과 유사합니다.

However, note that in multiclass classification, computing the true positives/negatives and false positives/negatives is done by counting on a per-class basis, as there is no overall positive or negative class.
그러나 다중클래스 분류에서는 전체 긍정 또는 부정 클래스가 없기 때문에 참 긍정/부정 및 거짓 긍정/부정이 클래스 단위로 계산됩니다.

For example, when computing the precision or recall of the ‘Iris-setosa’ class, this class is treated as the positive class and all other classes as negative.
예를 들어 'Iris-setosa' 클래스의 정밀도 또는 재현율을 계산할 때 이 클래스는 긍정 클래스로, 나머지는 모두 부정 클래스로 간주합니다.
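
The per-class counting can be sketched as a one-vs-rest calculation. The function name `per_class_precision_recall` and the sample predictions are hypothetical, and reporting 0.0 for an undefined ratio is a convention chosen for this example.

```python
def per_class_precision_recall(actual, predicted, positive):
    """Treat `positive` as the positive class and every other class as negative."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == positive and p == positive)
    fp = sum(1 for a, p in zip(actual, predicted) if a != positive and p == positive)
    fn = sum(1 for a, p in zip(actual, predicted) if a == positive and p != positive)
    # Assumption: a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical true labels and predictions for six instances.
actual = ["Iris-setosa", "Iris-setosa", "Iris-versicolor",
          "Iris-versicolor", "Iris-virginica", "Iris-virginica"]
predicted = ["Iris-setosa", "Iris-setosa", "Iris-versicolor",
             "Iris-virginica", "Iris-virginica", "Iris-versicolor"]

per_class = {c: per_class_precision_recall(actual, predicted, c)
             for c in sorted(set(actual))}
for name, (precision, recall) in per_class.items():
    print(f"{name}: precision={precision:.2f} recall={recall:.2f}")
```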

Figure 12. Cross-Validating a Multiclass Classification Model.
그림12. 다중클래스 분류모델 교차검증

Figure 13. Cross-Validation Results of a Multiclass Classification Model.
그림13. 다중클래스 분류모델의 교차검증 결과
