Victoria’s favourite ping pong tournament for charity is back!Learn More
This is a follow-up to my earlier post “Slithering to Success at BattleSnake 2017”.
I wanted to write a bit about the design of our bounty snake, “Robo”, and my experience building it. This was my first time participating in any kind of programming challenge, and I honestly didn’t expect the bounty snake to perform as well as it did.
Our win conditions to claim the bounty were the following:
- Game is played on a 17 x 17 board
- 10 food are present on the board, at any given time
- API timeout of 1 second
- One-versus-one, last snake slithering wins the bounty.
Under these conditions, we won forty-two games and lost three, for a total win record of 42/45 or 93%.
Before I could start building a snake, I needed two things — an arena to run the games in, and one or more enemy snakes to test against.
My first goal was the arena. I searched GitHub in early January but wasn’t able to find the source code to the previous year’s arena (let alone the current year’s). So I decided to build my own, using only screenshots and YouTube videos of previous year’s gameplay, and a copy of the 2016 API documentation as reference. I managed to get something “good enough” cobbled together over the course of a couple weekends, which eventually became known as Mojave.
Getting some of last year’s snakes to test against was much easier. I searched GitHub for “battlesnake”, which turned up a lot of results. I downloaded most of them, ran them locally, and played them against each other using Mojave – the winners became the snakes that I would use for testing the bounty snake against:
Some of the bounty snake’s win conditions stem from the fact that the snake was initially built against the 2016 API – such as the 17×17 game board (which was the size used in the YouTube videos of the previous year’s gameplay) and the 1 second API timeout (which was mentioned in the previous year’s documentation).
Assuming that I had a full second to work with, I wanted to optimize the bounty snake for performance, and ensure that as much of that second was used as possible to compute the “best” square to move to. For this reason, I selected Lua (under OpenResty) as my language of choice. Lua was a good fit as I had already used it to build Mojave (using the excellent Love2D framework), as well as the tracking infrastructure for Deskmetrics, which I have written about previously. One of the big advantages to using OpenResty is that there is no additional overhead of an application server – application code is executed directly on the web server, providing a substantial performance boost.
I hosted the bounty snake server on a c4.xlarge instance from Amazon Web Services. The instance ran in us-west-2 (Oregon), in order to provide the lowest latency to Victoria where Mojave would be running on my laptop (for the day of the competition, this was moved to us-east-1 which is where the arena was running). Last year, Redbrick’s bounty snake team had used Heroku, but they experienced severe latency issues – we didn’t want to use them again, for that reason.
I searched for “snake algorithm” on Google, and eventually found my way to the “Google AI Challenge post-mortem” blog. I started reading through it and realized that, as a game of Snake progresses, and the length of the snakes in the game grow, eventually what you get is much closer to a Tron game than a Snake game – and that looking into Tron algorithms rather than Snake algorithms may provide me with a late-game advantage.
This led me to a family of algorithms known as Minimax. A minimax algorithm works by looking at all possible moves that the player can make, and for each of those moves, then looking at all possible moves that the opponent can make. This process, which can recurse until you run out of computation time, forms a decision tree. At each leaf node of the tree, a heuristic is run to determine the current state of the game and score it. As the algorithm returns through each branch of the tree, if it is looking at the player’s moves, the best possible scoring move will be selected (maximized). If the algorithm is looking at the opponent’s moves, then the worst possible scoring move will be selected (minimized). The idea is that the best move for you results in the worst move for your opponent, and vice versa.
By implementing minimax, the bounty snake would be able to “see” several moves ahead, before making a decision about what to do. However, this means that the snake would only be effective in one-versus-one battles, which is why that became one of our win conditions.
Rather than selecting “pure” minimax, I selected a variant known as alpha-beta pruning. It is an optimized form of the algorithm that is able to prune branches of the decision tree that would result in a better (or worse) score, depending on whose turn we are looking at.
In order to “play” several moves ahead, the bounty snake would have to be taught the rules of the game. For example, if a future move resulted in passing over a food square, it would need to know that its’ length would increase on the next turn. If moving to a non-food square, it would need to know that its’ health would decrease by one point.
No matter what square was being moved to, the bounty snake would have to maintain a representation of the current game board in memory, that would be updated each time it recursed through the minimax algorithm with the new coordinates of itself and its opponent.
We also make sure that when selecting a move, to never target another snake’s head, and to only move into a square that another snake can move to if we are the larger snake.
Most of the logic up to this point was easy (though time-consuming) to implement. Writing a heuristic function, on the other hand, was quite challenging, and I believed it was what would ultimately make or break the bounty snake.
The heuristic function is the code that runs once the minimax algorithm has reached a leaf node. This function’s job is to take the new state of the game board, having looked several moves ahead, evaluate it in some way, and spit out a number that can be returned to minimax.
Tron games typically use a voronoi diagram here, but that wouldn’t work very well in a snake game (at least not in an early game) because snakes only grow when eating food. So I came up with my own heuristic, which uses a number of factors to score the board.
First, it checks for immediate win/loss conditions, which give the board an infinite (or negative infinite) score:
- Are there any moves available to me/my opponent from this position?
- Is my/my opponent’s health at 0?
The next thing that it does is run a floodfill in order to see how many squares on the board can be reached from my position and the opponent’s position. If the number of squares that are available is less than my own length, I know that moving to this position will probably kill me, so it will result in a low board score.
However, moving to such a position is not guaranteed to kill me – I could be trapped by my opponent’s tail, but in a couple more moves that tail will get out of the way and open up the board again. So this score, while low, won’t be as low as a score that results in an immediate death, such as moving into a wall.
The same rules apply in inverse when looking at the opponent – if the opponent appears to be trapped, we will score the board highly, but not as high as a guaranteed kill.
Next is food. For each piece of food on the board, we measure the distance between the bounty snake’s head and the food. That distance is subtracted from the board’s score – so that as the snake approaches the food, the board score gets higher, and collecting the food will make the board score highest of all (Minimax does not add new food to the board when looking ahead – since we don’t know where it is going to spawn).
The food score is weighted depending on the bounty snake’s health. If the bounty snake is full, food is ignored. As the bounty snake gets hungrier, the food score affects the board score more and more until it overwhelms everything else that affects the board score (save for instant-death considerations).
Last but not least, I added some code to score the board higher if the bounty snake is close to the center of the board. There are two reasons for this, the first being that it is much easier to reach food from the center than another area of the board if we don’t know when and where it is going to spawn – and secondly, this behavior forces the enemy snake to the edges of the board, making it easier to go for a “kill shot” by trapping the enemy against the wall of the arena.
After the board score is calculated, it will be weighted by the result of the first floodfill, as a percentage of the total squares on the board. So if there are two moves that would otherwise result in the same score, we will always select the one that leads to the most open area on the board.
Halfway through building the bounty snake, the 2017 API was released. This required me to update both Mojave and the bounty snake, in order to support the changes – for example, “game_id” was now used to store the game’s ID instead of “game”.
One particular challenge came from the fact that no snakes existed yet using the 2017 API, meaning that I was unable to test my implementation (since all the other snakes on GitHub used the 2016 API). To work around this I had to make another modification to Mojave – allowing toggling the 2017 rule changes separately from the 2017 API changes. This would allow me to run games using the 2016 API, but with the 2017 rules, so that I could still use test snakes.
Eventually, however, some 2017 API snakes started to show up on GitHub that I could practice against directly. A couple of them were pretty tough:
https://github.com/adamdubicki/BattleSnake (they showed up to challenge us at the event!)
The 2017 rule changes dropped the default timeout value from 1 second to 200 milliseconds. This was done in order to prevent teams from human-controlling their snake, but it impaired the effectiveness of my minimax implementation. Where under a 1-second timeout I could look 7-8 moves ahead, with the lower timeout I was reduced to looking about 4-5 moves ahead.
The timeout also had another adverse impact – when Lua’s garbage collector kicked in (which happened frequently due to all the objects that get created during minimax recursion), response times would increase just enough that the bounty snake would timeout. I was partially (though not completely) able to mitigate this by executing the garbage collector manually when I knew it would be safe to do so (i.e. immediately after returning a move to the arena)
Ultimately, however, the timeout change turned out to be a non-issue. I was informed that bounty snakes were exempt from the timeout, and were allowed to take however long they wanted to execute their move (other snakes in a bounty snake game would be allowed the same time). So I left the timeout at 1 second.
Eventually, the official arena was made public. I got it set up and running, only to find that the bounty snake would crash when I tried to play with it. As it turns out, Mojave was making some assumptions about the arena that didn’t hold true, and the bounty snake was relying on those assumptions. So I had to fix them both, again.
The experience of building the bounty snake was a lot of fun, and I found it quite refreshing to work on a programming project outside of the realm of web development for a change.
While it functioned well as a bounty snake, I don’t believe that Robo would have performed very well had it been able to play in the tournament proper, as the general implementation of a minimax algorithm only allows for predicting the move of a single opponent. On an arena containing up to twelve snakes at a time, it would simply not be effective.
Next year, I think it would be fun to create a bounty snake that’s a bit more “creative”. Maybe something that is designed for a non-square arena, or two snakes that have to be defeated instead of one (one company did that this year). I also wouldn’t mind using a bounty snake as an opportunity to try out a new language, such as Rust or Go.
While our bounty snake server is no longer running, if you would like to try your hand at playing it, I have released the code under an open-source (MIT) license here: https://github.com/rdbrck/bountysnake2017 .
We look forward to seeing some tough competition next year!