Projects:2017s1-121 Learning Procedural Knowledge using Random Forests

From Projects
Revision as of 21:59, 2 November 2017 by A1607586 (talk | contribs) (Project Method)
Jump to: navigation, search

Project Team

Students: Woong-Ji Choi and Yiming Shitao

Supervisor: Dr Braden Phillips and Xin Yuan

Abstract

This project aims to produce artificially intelligent(AI) programs, or agents, that are automatically generated using a learning process. The agents are written in a production rule language and are created by inferring decision trees. A decision tree describes a series of conditions that link observed variables to a conclusion. Decision trees can be generated automatically (or inferred) from training data and random forests of inferred trees have often proven to be very effective classifiers. Although it is less common, decision trees can also be used for AI control of agents. In effect, the agent's behaviour is learned by the system rather than programmed in to it. As outcome of the project, it was possible to produce AI agents by using random forest for simple games such as Tic-Tac-toe and Connect 4. However, the behaviour of the AI agents depend strongly on the quality of training data and gathering these data is a time-consuming process. Also, the AI agent was only able to make decisions based on current environment state and can not foresee the future outcome. Hence, it was concluded that using random forest alone is not an optimal way of developing more complex AI agents. As a solution, it was suggested that a further research needs to be carried out to overcome the limitations of random forest by combining it with reinforcement learning.

Introduction & Motivation

In current technology, a computer requires a set of specific rules or an algorithm that pre- defines the process to execute any given task [1]. Therefore, successfully completing a task relies on the presence of a precise and extensive set of instructions. Existing computer systems lack the ability to: execute tasks through examples and previously executed tasks; observe and imitate experts to acquire new abilities; or improve on errors previously made [1] [2]. As computing tasks continue to become more complex, manually defining and implementing these rules is challenging and time consuming, hence demanding extra time and expertise from programmers. Machine learning methods have the potential to overcome these limitations [1] [3]. Machine learning involves the learning process of algorithms which can automatically be generated from studying data, learning from examples (models) and a process of developing computer systems or programs which automatically improve with experience [4]. In recent years there has been great progress in the use of machine learning for classification tasks [5] [3]. Using machine learning to generate agents is also possible and is beginning to emerge as an important research topic. Significant examples from recent years include developing new reinforcement learning algorithm for robot soccer [6] and applying machine learning strategy for controlling mobile robots [7] . Similar to computer programs and algorithms, developing an artificial intelligent (AI) agent is also usually rule-based, meaning it requires programmers to define a set of rules in order to perform a task. Additionally, like defining algorithms, developing AI agents are a complex and time-consuming process as its reasoning and decision-making abilities must mimic or resemble those of a human [8]. Utilising machine learning to develop AI agents can enhance fast development, reduce the workload of the developer but most importantly, results in more versatile AI agents through their ability to learn [1].

Objectives

The purpose of this project is to produce AI agents using random forests, a well-known machine learning method. The first objective is to produce a training dataset which is used as an input for random forests. The next objective is to Convert the production rules into a functional agent in Java language and develop a program to demonstrate the games using the AI agent. Here, another goal is to investigate the effects of different training data size has on the production rules, hence, the performance of AI agent. Finally, the last objective is to achieve expert behaviour of the agents.

Background

Machine Learning

Machine learning is referred to as the autonomous learning ability of a computer through series of models [8]. There are two types of machine learning: supervised and unsupervised. Unsupervised learning uses a set of data without any class information and reveals patterns or features hidden within the data [9]. On the other hand, supervised learning uses classified training data, which contains an input and a desired output to learn a function and predict a class for any valid data [9]. Supervised learning usually involves classification tasks. Decision trees and random forests are well-known examples of supervised learning methods [9].

Decision Trees

A decision tree is supervised logic based machine learning method ultimately used to classify data, through categorizing information depending on values or features [3]. Decision tree contains two types of decision nodes: root nodes (also known as decision nodes); and leaf nodes [10]. Figure 1 represents an example of a decision tree. The node at the upmost part of the tree is referred to as the Root Node where the classification initiates. From the decision node are multiple branches, which then lead to leaf nodes. Leaf nodes signify a result or classification. The set of rules that results a in classification is called the production rules. The production rules can easily be defined by translating each path from a leaf node to the root node of a decision tree into the form of “if-then” statement (Figure 1). Decisiontree.png

Random Forest

Random forest is a machine learning method used for classification that consists of multiple decision trees. A random selection of data (or variables) is used to generate the decision trees. This results in an ensemble of classifiers [11]. When presented with data to classify, the results from the decision trees may differ. The final classification for the input data is determined through casting a majority vote. The most popular classification among the various decision trees is output as the final result. The process is illustrated in Figure 2 [11] [12]. To achieve correct classification, it is crucial that the set of data chosen to generate the random forests [CHECK THIS] contains important features [13]. However, this is not always the case as the dataset is randomly selected and may contain impurities. Random forest provides a process called “feature selection” which involves developing a model that only includes the most important feature from a random dataset [11]. Therefore, random forest has advantages on minimising the generalization error and does not overfit (too many features effecting the overall accuracy in prediction) [13]. Other advantages of random forests include the effectiveness in computing and short amount of time required to train which makes random forest a good classifier [13]. Randomforest.png

C5.0

C5.0 is a software platform used as a data mining tool which involves a process of evaluating patterns from a set of data [14]. The evaluated patterns are categorised and constructed into classifiers [14]. C5.0 uses the classification techniques discussed in 2.2 and 2.3. A random set of input data produces classifiers in the form of decision trees and by majority voting the final output classification is generated. As a result, the set of rules (production rules) can be found automatically which describe the classifier [14].

Reinforcement Learning

RL is a type of machine learning which its agents learn the desired behaviour through interacting with the environment [6]. The objective of RL is to obtain the best action based on the current state [7]. The RL agents learns the behaviour through the process of rewarding/penalising the actions it has taken [7] and the RL algorithm aims to select actions that maximizes the expected cumulative rewards [6].

Project Method

Developing Training Dataset

Tic-Tac-Toe Training Dataset

Project Results

Project Conclustion

Future Project Recommendations