
Date: 2019-10-12 11:42

Assignment # 2 – Total: 100 Points  

Classification Tree Algorithm


Classification trees are one of the most widely used data mining algorithms; they are simple yet effective, and are the basis for many complicated data mining algorithms. Classification tree learning is a form of supervised learning. A set of training examples with their correct classifications is used to generate a decision tree that, hopefully, classifies each example in the validation set correctly. To get started, consider the problem of learning whether or not to go jogging on a particular day. To keep things simple enough to work through this problem by hand, we use a very small number of examples from which we want to learn the concept.


Assume you are using the following attributes to describe the examples:

Attribute           Possible Values

WEATHER             Warm, Cold, Raining

JOGGED_YESTERDAY    Yes, No


(Since each attribute's value starts with a different letter, for shorthand we'll just use that initial letter, e.g., 'W' for Warm.)

Our output decision is binary-valued, so we'll use '+' and '-' as our class labels, indicating a "jog" recommendation or not, respectively. Here is our TRAINING set:

WEATHER | JOGGED_YESTERDAY | CLASSIFICATION

2.1. Constructing the Initial Decision Tree

Apply the decision-tree construction steps described in class, using information gain as the criterion for every split in the tree (i.e., to choose the best attribute). Show all your work, including the final decision tree. To draw your decision trees and show your calculations, see the example on page 30 of the slides, "Week5_ClassificationTree". Again, the process for selecting the best split can be reduced to one simple rule:

Select the best split (by attribute) using information gain
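For reference, the two quantities this rule relies on are the standard entropy and information-gain formulas (S is the example set, A an attribute, S_v the subset of S where A takes value v):

```latex
\mathrm{Entropy}(S) = -\sum_{i} p_i \log_2 p_i

\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)
```

At each node you compute Gain(S, A) for every remaining attribute and split on the attribute with the largest gain.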


You may use Excel to calculate the information gain from partitioning the training examples on a given attribute. Excel's logarithm function is =LOG(number, base); use base 2 for the entropy calculation.
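If you prefer to check your hand calculations programmatically rather than in Excel, the entropy and information-gain computations can be sketched in a few lines of Python. The example rows below are illustrative placeholders (the assignment's actual training table did not survive extraction here), using the single-letter shorthand from above and a "CLASS" key for the label:

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(examples, attr, label_key="CLASS"):
    """Entropy of the whole set minus the weighted entropy of the
    subsets obtained by partitioning on `attr`.
    `examples` is a list of dicts mapping attribute names to values."""
    n = len(examples)
    total = entropy([e[label_key] for e in examples])
    remainder = 0.0
    for value in {e[attr] for e in examples}:
        subset = [e[label_key] for e in examples if e[attr] == value]
        remainder += (len(subset) / n) * entropy(subset)
    return total - remainder

# Hypothetical rows, NOT the assignment's data -- substitute your own table.
examples = [
    {"WEATHER": "W", "JOGGED_YESTERDAY": "N", "CLASS": "+"},
    {"WEATHER": "W", "JOGGED_YESTERDAY": "Y", "CLASS": "-"},
    {"WEATHER": "R", "JOGGED_YESTERDAY": "N", "CLASS": "-"},
    {"WEATHER": "C", "JOGGED_YESTERDAY": "Y", "CLASS": "-"},
]
print(information_gain(examples, "WEATHER"))
print(information_gain(examples, "JOGGED_YESTERDAY"))
```

Whichever attribute prints the larger gain would be chosen as the root split for these placeholder rows; repeat the computation on each resulting subset to build the rest of the tree.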


2.2. Estimating Future Accuracy

Here is our Validation set:


WEATHER | JOGGED_YESTERDAY | CLASSIFICATION

Use the decision tree produced in part 2.1 to classify each example in the Validation set. Show all your work, and report the accuracy (i.e., the percentage of correctly classified examples) on these validation examples. Briefly discuss your results.
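The classification and accuracy steps can likewise be sketched in Python. Both the tree shape and the validation rows below are hypothetical placeholders (the assignment's actual tables are not reproduced here); the point is only the mechanics of walking the tree and scoring:

```python
def classify(tree, example):
    """Walk a decision tree stored as nested dicts:
    an internal node is {"attr": name, "branches": {value: subtree}},
    and a leaf is simply the class label "+" or "-"."""
    while isinstance(tree, dict):
        tree = tree["branches"][example[tree["attr"]]]
    return tree

def accuracy(tree, examples, label_key="CLASS"):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(classify(tree, e) == e[label_key] for e in examples)
    return correct / len(examples)

# Hypothetical tree and validation rows -- substitute the tree from part 2.1
# and the assignment's actual validation table.
tree = {"attr": "WEATHER", "branches": {"W": "+", "C": "-", "R": "-"}}
validation = [
    {"WEATHER": "W", "CLASS": "+"},
    {"WEATHER": "R", "CLASS": "+"},
]
print(accuracy(tree, validation))
```

A mismatch between training accuracy and validation accuracy like this is exactly what the "briefly discuss your results" prompt is asking you to interpret.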






