Bot can beat humans in multiplayer hidden-role games

Many gaming bots have been built to keep up with human players. Earlier this year, a team from Carnegie Mellon University developed the world’s first bot that can beat professionals in multiplayer poker. DeepMind’s AlphaGo made headlines in 2016 for besting a professional Go player. Several bots have also been built to beat professional chess players or join forces in cooperative games such as online capture the flag. In these games, however, the bot knows its opponents and teammates from the start.

At the Conference on Neural Information Processing Systems next month, researchers from MIT and Harvard University will present DeepRole, the first gaming bot that can win online multiplayer games in which the participants’ team allegiances are initially unclear. The bot is designed with novel “deductive reasoning” added into an AI algorithm commonly used for playing poker. This helps it reason about partially observable actions, to determine the probability that a given player is a teammate or opponent. In doing so, it quickly learns whom to ally with and which actions to take to ensure its team’s victory.

The researchers pitted DeepRole against human players in more than 4,000 rounds of the online game “The Resistance: Avalon.” In this game, players try to deduce their peers’ secret roles as the game progresses, while simultaneously hiding their own roles. As both a teammate and an opponent, DeepRole consistently outperformed human players.

“If you replace a human teammate with a bot, you can expect a higher win rate for your team. Bots are better partners,” says first author Jack Serrino ’18, who majored in electrical engineering and computer science at MIT and is an avid online “Avalon” player.

The work is part of a broader project to better model how humans make socially informed decisions. Doing so could help build robots that better understand, learn from, and work with humans.

“Humans learn from and cooperate with others, and that enables us to achieve together things that none of us can achieve alone,” says co-author Max Kleiman-Weiner, a postdoc in the Center for Brains, Minds and Machines and the Department of Brain and Cognitive Sciences at MIT, and at Harvard University. “Games like ‘Avalon’ better mimic the dynamic social settings humans experience in everyday life. You have to figure out who’s on your team and will work with you, whether it’s your first day of kindergarten or another day in your office.”

Joining Serrino and Kleiman-Weiner on the paper are David C. Parkes of Harvard and Joshua B. Tenenbaum, a professor of computational cognitive science and a member of MIT’s Computer Science and Artificial Intelligence Laboratory and the Center for Brains, Minds and Machines.

Deductive bot

In “Avalon,” three players are randomly and secretly assigned to a “resistance” team and two players to a “spy” team. Both spy players know all players’ roles.

During each round, one player proposes a subset of two or three players to execute a mission. All players simultaneously and publicly vote to approve or disapprove the subset. If a majority approve, the members of the subset each secretly choose whether the mission should succeed or fail. If every member chooses “succeed,” the mission succeeds; if even one member chooses “fail,” the mission fails. Resistance players must always choose to succeed, but spy players may choose either outcome.

The resistance team wins after three successful missions; the spy team wins after three failed missions.
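The rules above can be written down as a few small functions. This is only a sketch of the simplified rules as described in this article; the function names are illustrative and are not taken from the game or from DeepRole.

```python
from typing import List, Optional


def proposal_approved(votes: List[bool]) -> bool:
    """A proposed team goes on the mission only if a strict majority approves."""
    return sum(votes) > len(votes) / 2


def mission_succeeds(choices: List[str]) -> bool:
    """The mission succeeds only if no team member secretly chose 'fail'."""
    return all(choice == "succeed" for choice in choices)


def winner(mission_results: List[bool]) -> Optional[str]:
    """Return the winning team once either side has three missions, else None."""
    successes = sum(mission_results)
    failures = len(mission_results) - successes
    if successes >= 3:
        return "resistance"
    if failures >= 3:
        return "spy"
    return None
```

Because only the number of “fail” choices is revealed, a failed mission tells the table that at least one spy was on the team, but not which member it was.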

Winning the game basically comes down to deducing who is resistance or spy, and voting for your collaborators. But that’s actually more computationally complex than playing chess and poker. “It’s a game of imperfect information,” Kleiman-Weiner says. “You’re not even sure who you’re against when you start, so there’s an additional discovery phase of finding whom to cooperate with.”

DeepRole augments a game-planning algorithm called “counterfactual regret minimization” (CFR), which learns to play a game by repeatedly playing against itself, with deductive reasoning. At each point in a game, CFR looks ahead to create a decision “game tree” of lines and nodes describing the potential future actions of each player. Game trees represent all possible actions (lines) each player can take at each future decision point. In playing out potentially billions of game simulations, CFR notes which actions increased or decreased its chances of winning, and iteratively revises its strategy to include more good decisions. Eventually, it plans an optimal strategy that, at worst, ties against any opponent.
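The self-play idea behind CFR can be illustrated with regret matching, the update rule that CFR applies at each decision point. The sketch below is not DeepRole and not full CFR: it uses rock-paper-scissors, a one-decision game chosen here purely for illustration, and every name in it is hypothetical. Each player tracks how much better each action would have done than its current mix, then shifts probability toward actions with positive regret.

```python
from typing import List

# Payoff to a player choosing action `a` against an opponent choosing `b`
# (0 = rock, 1 = paper, 2 = scissors). The game is symmetric, so both
# players can read their payoff from the same table.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]


def strategy_from_regrets(regrets: List[float]) -> List[float]:
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)  # no regret yet: play uniformly


def self_play(iterations: int = 10000) -> List[List[float]]:
    """Run regret matching for both players and return their average strategies."""
    # Player 0 starts biased toward rock so the dynamics are not trivial.
    regrets = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    strategy_sums = [[0.0] * 3, [0.0] * 3]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's mix.
            values = [sum(PAYOFF[a][b] * opponent[b] for b in range(3))
                      for a in range(3)]
            expected = sum(strategies[player][a] * values[a] for a in range(3))
            for a in range(3):
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]
```

In a two-player zero-sum game like this one, the average strategies approach an equilibrium (here, playing each move about a third of the time), which is the sense in which such a strategy “at worst, ties.” CFR extends the same bookkeeping to every decision point in a game tree.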

CFR works well for games like poker, with public actions — such as betting money and folding a hand — but it struggles when actions are secret. The researchers’ CFR combines public actions and consequences of private actions to determine if players are resistance or spy.

The bot is trained by playing against itself as both resistance and spy. When playing an online game, it uses its game tree to estimate what each player is going to do. The game tree represents a strategy that gives each player the highest likelihood of winning in their assigned role. The tree’s nodes contain “counterfactual values,” which are basically estimates of the payoff a player receives if they play a given strategy.

At each mission, the bot looks at how each person played in comparison to the game tree. If, throughout the game, a player makes enough decisions that are inconsistent with the bot’s expectations, then the player is probably playing as the other role. Eventually, the bot assigns a high probability to one role for each player. These probabilities are used to update the bot’s strategy to increase its chances of victory.
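One way to picture this role inference is as a Bayesian update over every possible way the two spy roles could have been dealt. The sketch below is an assumption-laden illustration, not DeepRole’s method: DeepRole derives its expectations from its learned strategy, whereas here `spy_fail_prob` is a made-up constant for how often a spy on a mission chooses “fail,” and all function names are hypothetical. Assignments that are logically impossible given what was observed get probability zero, which is the deductive step.

```python
from itertools import combinations
from math import comb
from typing import Dict, List, Sequence, Tuple

Assignment = Tuple[int, ...]  # indices of the players who are spies


def uniform_prior(num_players: int = 5, num_spies: int = 2) -> Dict[Assignment, float]:
    """Start with every possible spy pair equally likely."""
    assignments = list(combinations(range(num_players), num_spies))
    return {a: 1.0 / len(assignments) for a in assignments}


def update_belief(belief: Dict[Assignment, float], team: Sequence[int],
                  fails_seen: int, spy_fail_prob: float = 0.5) -> Dict[Assignment, float]:
    """Reweight each assignment by how well it explains the mission result."""
    posterior = {}
    for spies, prior in belief.items():
        spies_on_team = len(set(spies) & set(team))
        if fails_seen > spies_on_team:
            # Deduction: resistance players never fail, so this is impossible.
            likelihood = 0.0
        else:
            # Assumed model: each spy on the team fails independently.
            likelihood = (comb(spies_on_team, fails_seen)
                          * spy_fail_prob ** fails_seen
                          * (1 - spy_fail_prob) ** (spies_on_team - fails_seen))
        posterior[spies] = prior * likelihood
    total = sum(posterior.values())
    if total == 0:
        raise ValueError("observation is inconsistent with every assignment")
    return {a: p / total for a, p in posterior.items()}


def spy_probabilities(belief: Dict[Assignment, float], num_players: int = 5) -> List[float]:
    """Marginal probability that each player is a spy."""
    return [sum(p for spies, p in belief.items() if i in spies)
            for i in range(num_players)]
```

For example, calling `update_belief(uniform_prior(), team=(0, 1), fails_seen=1)` rules out every assignment in which neither player 0 nor player 1 is a spy, so suspicion shifts toward those two players.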

Simultaneously, it uses this same technique to estimate how a third-person observer might interpret its own actions. This helps it estimate how other players may react, helping it make more intelligent decisions. “If it’s on a two-player mission that fails, the other players know one player is a spy. The bot probably won’t propose the same team on future missions, since it knows the other players think it’s bad,” Serrino says.

Language: The next frontier

Interestingly, the bot did not need to communicate with other players, which is usually a key component of the game. “Avalon” enables players to chat on a text module during the game. “But it turns out our bot was able to work well with a team of other humans while only observing player actions,” Kleiman-Weiner says. “This is interesting, because one might think games like this require complicated communication strategies.”

Next, the researchers may enable the bot to communicate during games with simple text, such as saying a player is good or bad. That would involve assigning text to the correlated probability that a player is resistance or spy, which the bot already uses to make its decisions. Beyond that, a future bot might be equipped with more complex communication capabilities, enabling it to play language-heavy social-deduction games, such as the popular game “Werewolf,” which involve several minutes of arguing and persuading other players about who’s on the good and bad teams.

“Language is definitely the next frontier,” Serrino says. “But there are many challenges to attack in those games, where communication is so key.”


Metro Vancouver SkyTrain workers vote in favour of job action

Credit to Author: Jennifer Saltman | Date: Thu, 21 Nov 2019 21:57:58 +0000

Workers who operate and maintain Metro Vancouver’s Expo and Millennium SkyTrain lines have voted 96.8 per cent in favour of striking.

No job action has been planned, however, and the union representing 900 SkyTrain attendants, control operators, administration, maintenance and technical staff has mediated discussions scheduled with the employer, B.C. Rapid Transit Company (BCRTC), next week. SkyTrain is running as usual.

“This vote demonstrates that our members are deeply concerned that the Company has not addressed our key issues at the table,” CUPE 7000 president Tony Rebelo said in a news release. “It also reflects the frustration that many SkyTrain workers feel about how long the process has taken, after more than 40 sessions at the table.”

Canada Line and West Coast Express are not affected by this decision, as their workers are represented by different unions and have their own collective agreements.

Rebelo said the main concerns are wages, staffing levels, forced overtime and sick leave, and that the union is willing to sit down and bargain any time before mediation.

The company, however, plans to wait for the mediated sessions.

B.C. Rapid Transit Company president Michel Ladrak, in a statement, described mediation as a “very important and productive” way to resolve differences.

“British Columbia Rapid Transit Company (BCRTC) and CUPE 7000 have agreed to mediation beginning next week, and we are looking forward to those discussions helping us come to a fair and reasonable collective agreement,” Ladrak said.

The vote comes as bus and SeaBus workers and maintenance staff prepare to walk off the job for three days next week.

The strike is scheduled for Wednesday, Thursday and Friday, with regular service resuming on Saturday, Nov. 30. To date, bus drivers have refused to wear uniforms every day since Nov. 1 and have refused to work overtime for three days. Another day of overtime refusal is set for Friday. Maintenance workers have been refusing overtime since Nov. 1.

Hundreds of bus and SeaBus trips have been cancelled as a result of the job action, causing delays and crowding for transit users.

West Vancouver’s Blue Bus and HandyDART are still running regular service. Their workers are represented by different unions and covered by separate contracts.

Responding to questions about the job action from reporters in Victoria on Thursday, Premier John Horgan urged Coast Mountain Bus Company and Unifor to resume bargaining and avoid a full-scale strike next week.

“They’ve got the whole weekend to hammer out a deal, and I think that’s the best course, not just for the transit community but also British Columbia,” Horgan said.

Horgan dodged questions about whether transit would or should be designated an essential service because of the number of people who rely on it each day, instead reiterating that he believes bargaining is the way forward.

Last year, the bus system in Metro Vancouver saw an average of 931,000 boardings each weekday, and two-thirds of all transit journeys are made by bus.


jensaltman@postmedia.com

twitter.com/jensaltman
