Difference between revisions of "Projects:2021s2-63311 Robots playing capture the flag"
|  (→Robot’s average speed) |  (→Robot’s average speed) | ||
| Line 90: | Line 90: | ||
| We can substitute ∆t = 1. Hence: | We can substitute ∆t = 1. Hence: | ||
| − | ∆d consists of 2 components 𝑑_𝑥 and 𝑑_𝑦. Therefore: | + | ∆d consists of 2 components 𝑑_𝑥 and 𝑑_𝑦. Therefore, we have new displacement knowing velocity and angle: | 
| − | [[File:CTF displacement formula.png|thumb| | + | [[File:CTF displacement formula.png|thumb|left|Displacement formula]] | 
| ==== Robot’s acceleration ==== | ==== Robot’s acceleration ==== | ||
Revision as of 01:56, 8 June 2022
Contents
Abstract
Robots have been proven useful in many aspects of our lives. From easy tasks such as the auto-cleaning bot in homes, to more precise and complex tasks such as surgical robotic arm in hospitals, or systems of multi drone swarms used in defence. Most robotic system are known for their excellence in handling tasks that require a high level of responsiveness, repetition, and accurate execution. However, the more complicated task, the more inclination for robots to suffer functionally, especially with tasks that involves critical planning and making strategies in advance. To research such tactical tasks, the game capturing the flag is chosen. Capturing the flag is a team-based game that requires the player to capture their opponent’s flag while protecting their own. The team with better strategy and formation will be more likely to achieve victory. With that in mind, this project aims to simulate the game CTF in a software environment to study the ways in which complex and strategic behaviours can be implemented into robots using Python programming. Hopefully, it may contribute to future development of complex tasks integration for other strategical, tactical robots.
Acknowledgements
The author wishes to acknowledge the sponsorship of the Defence Science and Technology Group of Australia (DST Group). The credit for great support, helpful feedback and on-going consultation goes to David Hubczenko, Jijoong Kim, Anna Dostovalova and Bradley Fraser. Moreover, the author would also like to express his deepest gratitude towards invaluable guidance and support from project advisors Prof. Cheng-Chew Lim and Dr. Duong D. Nguyen. This project would not have reached this far without their expert recommendations.
Introduction
Motivation
The game Capturing the Flag (CTF) is a very popular game. It consists of two teams; each normally consists of five players. Each team will be required to capture their opponent’s player flag, and run back to the base without getting caught, while defending their own flag. The level of complexity of the game is mild, which suits the purpose of the project. The CTF game requires the players to have winning characteristics such as good teamwork, communication, planning strategy, tactical execution, and situational response. These characteristics are also traits that is needed to study for later implementation into robots. There has been work done in the past with similar objectives that was carried out before, that can be an example for our project to advance. The other inspiration comes from the successful implementation of AlphaGo [1][2]. It achieved many victories (4/5 games won) against the 18th time world champion Lee Sedol in the game of Go in 2016. This project’s ambition is to write a self-learning, or at least self-guiding algorithm so that the robot players can play with themselves in the scope of the game CTF. Moreover, if time constraints are reasonable, it would be stimulating to implement real world robots that can play the CTF game, using the simulation built before. However, since to the level of high ambition implementation exceeds the ability of our group members, it would be reasonable to remain the project’s scope of making simulation of robots playing the game on computers only.
Previous studies
This game challenges the need to build a smart algorithm. In this case, it’s the tactical planning and execution for robots to play capture the flag. The goal is to integration these algorithms into physical robots that play capture the flag, but also where they can be used in a variety of other/different applications. The essence of this project is the strategic algorithm, which will determine how the robots act according to real-time events.
Aquaticus
One of the closest research projects that have done similar research on the same objective is Aquaticus[3]. The project tested the game of CTF with real human and robots on the water surface. The coordinates and movement of the game is recorded and made available on Aquaticus website. However, data is limited, as there had not been many game set to analyse. Meanwhile, a set of large number of simulations is needed to study the behaviour of the robotic players in the game.
Hamilton Jacobi reachability
Another group of researchers have been trying to implement different approaches to game planning on the game of CTF. The research had been successful at that time for the implementation of Hamilton-Jacobi reachability [4]into one of the scenarios in the game CTF. The advantage of this research is to apply Hamilton Jacobi reachability to calculate the possibilities to successfully offend in the game of CTF, depending on different scenarios and achieve victory. However, the idea was abstract, on the paper research without actually simulate a full course game of CTF.
Project Aims
The deliverable of this project is a simulation program, which visualize robots playing the game together, with or without control from human observer. The robots were programmed so that they must obey fundamental physics rules, such as velocity, acceleration. Next, it is essential to define attacking, and defending algorithms so that they can be implemented into the program. The execution of algorithm will need to depend on the game situation. Additionally, the virtual environment in which the game takes place must also be considered. In the world of robotic simulation, basic physics constraints apply to the robot’s movements. For instance, the parameters with maximum velocity, acceleration and stopping acceleration will be considered. Since this report focus more on the art of how the game played, other physics elements, such as gravity, friction, robot’s inertia, mass, are neglected. Nevertheless, these elements can always be implemented in the future when it’s required. The focus of this project is algorithms building, which will be covered in Chapter 3 of this report. The algorithms will take in the input of other robot’s position, and situational information (such as user’s input instructions, or the state whether the flag is captured, and so on). After processing provided input, the robots will act accordingly. If an algorithm is implemented successfully, it is still required for implemented algorithms to be tested for their functionality, stability, and reliability. The result to each algorithm performance will be further discussed in Chapter 4. The following step would be algorithm analysis, and directions for further implementations in the future, if the project is to be continued.
Background
This section will cover main theoretical grounds in which is used to construct the project. The first part will cover some of the physical quantities that will be used in the software simulation. Next, the report will go through mathematical models used to describe coordinates, which is the standing ground for the software simulation. Next, this section will cover more about the software simulation environment. Finally, this chapter will cover how the theoretical grounds are setup into software simulation.
Simulation environemnt
Python
There are many programming languages to be considered for this project. Most popular high level programming languages available are C/C++, Java, or Python. The project team decided to build this project based on Python for the following reasons:
1. Python is one of the most popular programming languages at the time.
2. Python can be learned quickly, easy to learn.
3. Python is a multi-platform and compatible with different systems.
4. Python has great support from the programming community.
5. Python have available packages, which can be selected optionally.
6. Python have Pygame package, which contains useful functions for the purpose of the project.
Using a built package is a faster and more efficient way, than writing everything from scratch. They are also free, readily available, and have great support from the community who also use Python and the packages.
Pygame
Pygame is a package of pre-built functions that is regularly used for simple game designers. Pygame has useful functions of drawing objects into the program screen, which can be useful for the project.
PyCharm
There are a range of Python editor software available. Python can even be written in Notepad, Sublime, or most of basic text editor software. However, Pycharm by JetBrains appears to be one of the best Python editor software out there. It supports selecting installed Python versions, to keep the compiler up to date. It also support handling Python packages such as Pygame, numpy and other useful resources, which will be later added into the project. Pycharm is also infused with intelligent coding assistance, such as real time error detection, intelligent code refractor (variable renaming) and many other features. In real world coding practice, Pycharm helps coding smoothly and interactively. On top of that, Pycharm has an open source free version, which is very useful for the learning. In short, if there is a project to be built with Python, then Pycharm is one of the brightest candidates.
Technical Methods
Development Process
Developing a program, like everything else, need to start from simplicity, and then builds up from that. It could be said that writing a program is like building a house. It is required to start building a foundation strong, or the house would fall. All large projects begin with the smaller first steps. And the first milestone in the early stages of the project were to set up 2 player robots, with 1 versus 1 (1v1) – human vs robot to test the robotic behaviour, and how the simulation is displayed in Python simulation environments. The “human” robot will be controlled by user input, and the other “computer” robot will be controlled automatically by the computer itself, using programmed algorithms. The “human” robot will be used in the future to test the functionality of “computer” robots’ algorithms in the future. The first stage sounds very basic and seems to be the easiest one. Despite of how this first step is perceived, it turns out to be one of the most challenging stages of all. This stage was considered the foundation of the whole project later. The fundamental components required for successful implementation of this very first step relies on:
1. Pygame package, which is responsible for:
a. Generating an environment, a “playground”, where the game CTF takes place.
b. Receiving user input from the keyboards and execute movements.
c. Displaying robots, flags and other in-game objects and visualization
2. A well-defined class for unicycle robots, which describes all robot’s movement and behaviour, that mimic the movement of the robot influenced by the laws of physics.
3. A test “main” file, which will be compiled, tested, and debugged.
4. All other files that contain constant variables, such as screen size, RGB colour triplets, other useful functions, and so on.
After each algorithm implemented, it is critical to test and debug recent implementation, to see how it works with previously implemented algorithms, and how they behave in a wide picture. It must be guaranteed for the simulation to be bug-free before more implementations is considered. It would not make sense to develop a big chunk of code, that does not work properly. As the process of the making contains a great deal of failure and unsatisfactory attempts, in which the simulation does not work the way it is intended to, this report will only mention successful implementation and some other algorithms under consideration. The code analysis from this point will be derived from the source code of the program in June 2022. The next part will cover the program hierarchy in a wider perspective, which will be useful information for those who would like to take over and continue with this project
Unicycle Robot Model
Positional representation
As robot moves in a 2D area, (x, y) Cartesian coordinates is used.
Robot’s average speed
Robot’s average speed: 𝑣 = ∆d/∆t (d denotes displacement) In simulations, ∆t is considered 1 frame, or 1 program cycle.
We can substitute ∆t = 1. Hence:
∆d consists of 2 components 𝑑_𝑥 and 𝑑_𝑦. Therefore, we have new displacement knowing velocity and angle:
Robot’s acceleration
Similarly, each program cycle (frame) is considered ∆t. We can substitute ∆t = 1.
Using these derived formulas, it is possible to simulate unicycle robot movements with real world physics constrains.
Project Outcomes
This project produced a 5 v 5 CTF game with 3 options of play including PC v PC, P v PC and P V P. The game contained 6 attacking algorithms, 3 defending algorithms, 1 avoiding algorithm, 1 chasing algorithm and 1 algorithm for PC control.
Attacking Algorithms
Attacking Algorithm 1
Robots head directly towards flag.
Attacking Algorithm 2
Robots follow the y-coordinate of starting position, and at random position the robots converge on the flag.
Attacking Algorithm 3
All attacking robots head towards the flag in wave motion, and at random position the robots converge on the flag.
Attacking Algorithm 4 - 6
5 robots must be attacking. The robots get into 1 of 3 formations and attack the flag.
Defending Algorithms
Defending Algorithm 1
All defending robots move in a rectangular motion in front of the flag.
Defending Algorithm 2
All defending robots spread out horizontally in their own half and they move in an up and down motion.
Defending Algorithm 3
All defending robots spread out horizontally in their own half and they move in an up and down motion.
Avoiding/Chasing Algorithms
Avoiding
Robot uses algorithm to swerve around the the opponent’s defender when within a distance of 50 units.
Chasing
Robot uses algorithm to head towards the attacker when within a distance of 100 units.
PC Control
Algorithm picks random algorithm and attacker-defender ratio when all robots are defending. The probability of a certain algorithm increases with every successful capture. The program runs until a best strategy is achieved.
Conclusions
Results
Each attacking algorithm (5 attackers) was tested against each defending algorithm (5 defenders) and the following table shows how the first defending algorithm was the significantly more effective than the other algorithms as it caught the most attackers and returned the most flags.
Next, the PC v PC game was tested by running 5 games to a 100 flags each. The amount of each algorithm used was measured when 5 robots were attacking and we can see from the "Number of attackers used for 5 attackers" graph that attacking algorithm 4 was used the most. The "Frequency of Number of Attackers used" shows that using 3 attackers and 2 defenders is the most effective combination to capture flags.
Achievements
So far, the project has reached its final stage, which is implementation of a full game of CTF using unicycle robot’s model. The following tasks have been accomplished:
- Completed deliverable of programming the game of CTF.
- Setup a testbed for the simulation of the game CTF.
- Implemented unicycle robot model into object uc_robot, which follows the physics rules as if they were robots in the real world.
- Setup successfully most of the attacking and defending algorithms that the team has come up with for this project.
- The game simulation is playable
Future implementations
Attacking algorithm
Although attacking algorithms are working as they should be, the movement is identical if same algorithm is used. They should have randomized and differential attacking approach to bring the game to the next level. As mentioned in the algorithm evaluation section, the movements of the players in the offending are identical no matter how many iterations. The area for enhancement is the randomness of movement, as well as adding more attacking formations into the arsenal. There is also a solution of making decentralised learning to achieve synchronised goal, in which the attacking algorithm can make good use of in the future[5]. Also, the potential field for path finding is a very interesting and promising approach to plan out the shortest and safest route to the enemy’s flag.
Defending algorithm
Defending algorithms should have been infused with more formations and should tackle attackers more effectively. It can be said that the defending algorithm is not up to the same level as offending. Also, since the avoiding algorithm is working well with single defenders, there is a need to come up with better team-based strategy, to intercept the opponent, especially the one who is holding the flag.
Machine learning (AI)
The AI mixed algorithms are meant to be one of the most interesting implementations, which will automatically select suitable strategy, make choices of defending and offending by its own. Obviously, it will require a great deal of time and attention to develop as well. The motivation of the project, after all, is to implement such smart algorithms that can play against the best human and excel in the game of CTF. However, time is running out for the project team, and it will take a great deal of time for it to be completed. Therefore, the team decided to make it simple, using random selection. This part will be one of the most invested if the project is continued. The current program still have little space for randomness, which leads to limited variations between different iterations. The ultimate goal for this project is that the machine can learn how to play, and constantly improve itself.
References
- ↑ A. H. B. Scott R. Granter, David J. Papke, "AlphaGo, Deep Learning, and the Future of the Human Microscopist," 2017
- ↑ X. Chen, "The Evolution of Computing: AlphaGo," Computing in Science & Engineering, vol. 18, no. July-Aug. 2016, 4, pp. 4-7, 2016, doi: 10.1109/MCSE.2016.74.
- ↑ Oceanai.mit.edu. 2021. Aquaticus : Main - Home Page browse. [online] Available at: https://oceanai.mit.edu/aquaticus/pmwiki/pmwiki.php?n=Main.HomePage> [Accessed 29 October 2021].
- ↑ H. Huang, J. Ding, W. Zhang, and C. J. Tomlin, "A differential game approach to planning in adversarial scenarios: A case study on capture-the-flag," in 2011 IEEE International Conference on Robotics and Automation, 2011: IEEE, pp. 1451-1456.
- ↑ D. D. Nguyen, A. Rajagopalan, J. Kim, C.-C. Lim, and D. Hubczenko, "Dynamic Multi-Target Assignment with Decentralised Online Learning to Achieve Multiple Synchronised Goals," in 2020 International Conference on Machine Learning and Cybernetics (ICMLC), 2020: IEEE, pp. 111-117.
Project Team
Student researchers
Kevin Perera 
Vu Quang Truong 
Supervisors
Dr. Cheng-Chew Lim 
Dr. Duong Duc Nguyen 
David Hubczenko (DTSG) 


