
@relation weather.symbolic

@attribute outlook {sunny, overcast, rainy}
@attribute temperature {hot, mild, cool}
@attribute humidity {high, normal}
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}

@data
sunny,hot,high,FALSE,no
sunny,hot,high,TRUE,no
overcast,hot,high,FALSE,yes
rainy,mild,high,FALSE,yes
rainy,cool,normal,FALSE,yes
rainy,cool,normal,TRUE,no
overcast,cool,normal,TRUE,yes
sunny,mild,high,FALSE,no
sunny,cool,normal,FALSE,yes
rainy,mild,normal,FALSE,yes
sunny,mild,normal,TRUE,yes
overcast,mild,high,TRUE,yes
overcast,hot,normal,FALSE,yes
rainy,mild,high,TRUE,no
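Because the outcome takes only two values, yes and no, deciding whether to play golf requires at most 1 bit of information (entropy, i.e. uncertainty). With 9 yes and 5 no among the 14 instances, the actual entropy is about 0.940 bits. Building a decision tree is the process of using the weather attributes to eliminate that uncertainty (to reduce the entropy). Grouped by outlook, the instances are:

sunny,hot,high,FALSE,no
sunny,hot,high,TRUE,no
sunny,mild,high,FALSE,no
sunny,cool,normal,FALSE,yes
sunny,mild,normal,TRUE,yes

overcast,hot,high,FALSE,yes
overcast,cool,normal,TRUE,yes
overcast,mild,high,TRUE,yes
overcast,hot,normal,FALSE,yes

rainy,mild,high,FALSE,yes
rainy,cool,normal,FALSE,yes
rainy,cool,normal,TRUE,no
rainy,mild,normal,FALSE,yes
rainy,mild,high,TRUE,no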
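To make the entropy figure concrete, here is a minimal Python sketch that computes the Shannon entropy of the play column. The function name entropy and the list play are my own illustrative names, not something from the original post. The labels are just the 9 yes and 5 no counted from the data.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels (hypothetical helper)."""
    total = len(labels)
    # Sum p * log2(1/p) over each distinct label, with p = count / total.
    return sum((n / total) * log2(total / n) for n in Counter(labels).values())

# The 'play' column of the 14 instances: 9 yes, 5 no.
play = ["yes"] * 9 + ["no"] * 5

print(round(entropy(play), 3))           # 0.94
print(entropy(["yes", "no"]))            # 1.0, the two-class maximum
```

The 1-bit value is reached only when yes and no are equally likely, so the real starting point for this dataset is slightly below it.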
Some subsets become purer after the split. For example, when outlook = overcast every instance is yes, so that subset has entropy 0, which lowers the overall entropy (the average of the subsets' entropies, weighted by subset size).
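The effect of splitting on outlook can be checked with a short Python sketch. It assumes the overall entropy is the size-weighted average of the subset entropies, as is usual for information gain. The names entropy, rows, groups, split_entropy and gain are illustrative only, and the (outlook, play) pairs are copied from the 14 instances.

```python
from collections import Counter, defaultdict
from math import log2

def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels (hypothetical helper)."""
    total = len(labels)
    return sum((n / total) * log2(total / n) for n in Counter(labels).values())

# (outlook, play) pairs taken from the 14 instances.
rows = (
    [("sunny", "no")] * 3 + [("sunny", "yes")] * 2
    + [("overcast", "yes")] * 4
    + [("rainy", "yes")] * 3 + [("rainy", "no")] * 2
)

# Group the play labels by outlook value.
groups = defaultdict(list)
for outlook, play in rows:
    groups[outlook].append(play)

# Entropy after the split: each subset's entropy weighted by its share of the rows.
split_entropy = sum(len(g) / len(rows) * entropy(g) for g in groups.values())

# Information gain: entropy before the split minus entropy after it.
gain = entropy([play for _, play in rows]) - split_entropy

print(round(entropy(groups["overcast"]), 3))  # 0.0
print(round(split_entropy, 3))                # 0.694
print(round(gain, 3))                         # 0.247
```

The overcast subset contributes nothing to the weighted sum, which is why the entropy drops from about 0.940 to about 0.694 bits.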