
I'm trying to build a strong AI for a game called Puyo Puyo, in Matlab (I know... but it's the only language I know well).

Basically it's like Tetris, but you get falling pairs of 2 puyos (blobs), each of one of 4 possible colors (the two can match or differ). You can move them around and place them however you like. Fill the screen (more precisely, the topmost cell of the 3rd column) and you're dead. Put 4 puyos of the same color so that they connect vertically or horizontally (no diagonals, 4-connectivity) and they pop, disappearing from the screen. The puyos above the popped ones fall until they land on another puyo or the ground. As information, you get to work with your current falling pair as well as the next two pairs.
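
For reference, here's a rough sketch of how I detect and clear pops, assuming the board is a 12x6 matrix with 0 for empty and 1..4 for colors, and using bwconncomp from the Image Processing Toolbox (this is just my own sketch, not official game logic):

```matlab
function [board, popped] = popGroups(board)
% Pop every 4-connected group of 4 or more same-colored puyos.
% Assumes board is a 12x6 matrix, 0 = empty, 1..4 = colors.
popped = 0;
for color = 1:4
    CC = bwconncomp(board == color, 4);          % 4-connected components of this color
    for k = 1:CC.NumObjects
        if numel(CC.PixelIdxList{k}) >= 4
            board(CC.PixelIdxList{k}) = 0;       % clear the group
            popped = popped + numel(CC.PixelIdxList{k});
        end
    end
end
end
```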

The aim of the game is to produce a large chain by stacking puyos so that popping one group triggers other groups in turn. Good human players reach 19-chains in endless training, which is the maximum possible on the 13x6 board, but the max chain in human matches is typically around 12-13. I'm limiting my AI to a 12x6 board (the 13th row has special properties I don't want to deal with for now), and I'm aiming for the AI to build 13-chains or more, which would make it suitable for battle against strong human players.

So I've been trying different things like heuristics, but I feel like a depth-3 search (we know 3 pairs) should give good results, as it uses all the information available to the player. However, I'm coding in Matlab and there are 5 for loops nested inside each other (one over the number of games I make the AI play, one over each move I allow the AI, and three for the depth-3 search). So the code runs sloooow... and in the future I'd like to combine the depth-3 search with other heuristic parameters and optimize the whole thing with a genetic algorithm.
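
To give an idea of the structure, the three search loops look roughly like this when written as a recursion (enumeratePlacements, placeBoard and evaluateBoard are just placeholders for my actual game helpers):

```matlab
function best = searchDepth(board, pairs, depth)
% Depth-limited search over the pairs we can see (current + next two).
% enumeratePlacements, placeBoard and evaluateBoard are placeholders,
% not real functions.
if depth == 0 || isempty(pairs)
    best = evaluateBoard(board);
    return;
end
placements = enumeratePlacements(board, pairs{1});
best = -Inf;
for i = 1:numel(placements)
    newBoard = placeBoard(board, pairs{1}, placements(i));
    best = max(best, searchDepth(newBoard, pairs(2:end), depth - 1));
end
end
```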

Does anybody have advice on things I could speed up, possibly by vectorizing some operations? Would coding in another language speed up the for loops, and by roughly what factor?
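
As an example of the kind of vectorization I'm hoping for, the gravity step (puyos falling after a pop) can be done without looping over cells. This is just a sketch I've been playing with, assuming row 1 is the top of the board and 0 means empty:

```matlab
function board = applyGravity(board)
% Let all puyos fall to the bottom of their column in one shot.
% Relies on MATLAB's sort being stable: within each column, empties (0)
% sort to the top and puyos keep their relative order at the bottom.
[nRows, nCols] = size(board);
[~, order] = sort(board ~= 0, 1);                      % empties first, puyos last
idx = sub2ind([nRows nCols], order, repmat(1:nCols, nRows, 1));
board = board(idx);
end
```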

  • Matlab isn't a very fast language in my experience, but I'll leave it for someone with more knowledge if that's actually critical. Vectorization doesn't seem to be applicable, since you create a new state on every iteration. 2 things that do spring to mind: Pruning and Caching. Pruning is the go-to technique in decision trees. Essentially you can sometimes calculate the bounds for the state, without going through every leaf underneath it. Caching - in most cases, most of your board will remain the same after an action. You can cache your calculations of the board desirability and reuse them. – Ordous Aug 15 '16 at 19:14
  • Thanks! Pruning doesn't seem applicable to me, since a very low result at depth 1 or 2 can end up leading to the best possible move at depth 3. I'll look into caching and how to apply it to my problem, but I'm not too optimistic; it seems like a lot of work for little gained computation time, in my neophyte opinion. Thanks again. – BrouH Aug 15 '16 at 19:20
  • It's certainly true that the more depth you have, the more the potential gain from pruning. But also the more difficult the computation algorithm. It may be worth doing it on the topmost level. Whether caching is useful depends on if the function you use to rate a game state is local or not. The more local it is, the more the effect. – Ordous Aug 15 '16 at 19:31
  • From what I understand of pruning, I can avoid exploring a whole branch if the score at depth 1 is not satisfactory. But as I explained, very bad results early on can lead to very good results in the 3rd step of the depth search, so I don't get why pruning would be useful, or maybe I'm understanding it wrong. – BrouH Aug 15 '16 at 20:00
  • If a very bad result can lead to a very good result later on and vice-versa, that is an indication that your value function is sub-optimal. If you want long collapsing chains done by your AI, that means it should necessarily go through a long patch of "bad" states that are ever closer to losing, right? In fact, if you want the AI to do it in multiple turns, you can't get there with a 3-deep search, *unless you include the potential for building chains into your value function*. And if you do, then a very bad result (bad position + no chains potential) *cannot* lead to a good result. – Ordous Aug 16 '16 at 16:30
  • Ok I think I grasp what you're saying about the value function. The AI needs to "think" beyond depth-3 like a good human player would. It's a pity though, I was interested in making the code faster, not adding another layer. I'll try it out anyways, thanks. – BrouH Aug 17 '16 at 15:55
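
A minimal sketch of the caching idea suggested in the comments, memoizing the board evaluation with a containers.Map keyed on the board contents (evaluateBoard is again a placeholder for whatever value function ends up being used):

```matlab
function score = evaluateCached(board, cache)
% Memoized board evaluation: identical boards are only scored once.
% cache is a containers.Map('KeyType','char','ValueType','double');
% since containers.Map is a handle class, entries added here persist
% in the caller. evaluateBoard is a placeholder for the value function.
key = mat2str(board);                   % board contents as a string key
if isKey(cache, key)
    score = cache(key);
else
    score = evaluateBoard(board);       % expensive evaluation, done once per board
    cache(key) = score;
end
end
```

Whether this pays off depends on how often the search revisits identical boards, as Ordous notes.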

0 Answers