{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "96dfe427-942a-47e9-8f1f-91854989b8c8",
   "metadata": {},
   "source": [
    "# Agent and standard notions of extensive form games\n",
    "\n",
    "The purpose of this tutorial is to explain the notions of `MixedBehaviorProfile.agent_max_regret` and `MixedBehaviorProfile.agent_liap_value`, and the corresponding solvers `Gambit.nash.enumpure_agent_solve` and `Gambit.nash.liap_agent_solve`. These notions are only relevant for *extensive-form games*, and so `agent_max_regret` and \n",
    "`agent_liap_value` are only available for `MixedBehaviorProfile`s and not for `MixedStrategyProfile`s."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b87ebb4e-7080-4aa1-9920-67fba5a36114",
   "metadata": {},
   "source": [
    "# Nash equilibria are profiles with maximum regret 0\n",
    "\n",
    "For either a `MixedBehaviorProfile` and a `MixedStrategyProfile`, the profile is a Nash equilibrium if and only if its maximum regret is zero.\n",
    "The profiles maximum regret is the maximum over the players of the individual player regrets.\n",
    "A player's regret is 0 if they are playing a mixed (including pure) best response; otherwise it is positive and \n",
    "is the different between the best response payoff (achievable via a pure strategy) against the other players' and what the player actually gets as payoff in this profile.\n",
    "\n",
    "Let's see an example taken from [Myerson (1991)](#references)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5142d6ba-da13-4500-bca6-e68b608bfae9",
   "metadata": {},
   "outputs": [],
   "source": [
    "from draw_tree import draw_tree\n",
    "\n",
    "import pygambit as gbt\n",
    "\n",
    "g = gbt.catalog.load(\"books/myerson1991/fig4_2\")\n",
    "draw_tree(g)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dabe6e40-509e-4454-b3ef-f7f0737cc9d8",
   "metadata": {},
   "source": [
    "Let's use `enumpure_solve` to find all pure Nash equilibria of this game."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7882d327-ce04-43d3-bb5a-36cff6da6e96",
   "metadata": {},
   "outputs": [],
   "source": [
    "pure_Nash_equilibria = gbt.nash.enumpure_solve(g).equilibria\n",
    "print(\"Number of pure equilibria:\", len(pure_Nash_equilibria))\n",
    "pure_eq = pure_Nash_equilibria[0]\n",
    "print(\"Max regret:\", pure_eq.max_regret())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1b95e67d-a44e-4622-acb5-37bab18a30f4",
   "metadata": {},
   "source": [
    "We see that the game has only one pure Nash equilibrium and, its maximum regret is 0, which is what defines a Nash equilibrium."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6e3e9303-453a-4bac-a449-fa8fda2ba5ec",
   "metadata": {},
   "outputs": [],
   "source": [
    "eq = pure_Nash_equilibria[0]\n",
    "for infoset, probs in eq.as_behavior().mixed_actions():\n",
    "    print(infoset.player.label, \"infoset:\", infoset.number, \"behavior probabilities:\", probs)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "98eb65d8",
   "metadata": {},
   "source": [
    "The `liap_value` which stands for \"Liapunov value\" is a related notion that sums the squared regrets of each pure strategy in the game. As with the maximum regret, the `liap_value` of a profile is 0 if and only if the profile is a Nash equilibrium, which we confirm now in our example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "804345b9-d32b-4f60-b4a0-f9d69dca10a8",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Liap value:\", pure_eq.liap_value())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c88d08e2-33bf-48ad-b71f-4a0c19929fdc",
   "metadata": {},
   "source": [
    "As we have seen, both the maximum regret and Liapunov value of a profile are non-negative and zero if and only if the profile is a Nash equilibrium. When positive, one can think of both notions as describing how close one is to an equilibrium.\n",
    "\n",
    "Based on this idea, the method `Gambit.nash.liap_solve` looks for *local* minima of the function from profiles to the Liapunov value. The set of Nash equilibria are exactly the *global* minima of this function, where the value is 0, but `liap_solve` may terminate at a non-global, local minimum, which is not a Nash equilibrium."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4afdea13-0cbb-4430-9689-ecf9b6a4b18d",
   "metadata": {},
   "source": [
    "Let's use the method which requires us to specify a starting profile. The method works only with floating point profiles. We will create two profiles, one in floating point and one in rationals, using the former as the starting point for the method and the latter to check the maximum regret and Liapunov value of the profile exactly."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9d18768b-db9b-41ef-aee7-5fe5f524a59e",
   "metadata": {},
   "outputs": [],
   "source": [
    "starting_profile_double = g.mixed_strategy_profile(data=[[0,1,0],[1,0]], rational=False)\n",
    "starting_profile_rational = g.mixed_strategy_profile(data=[[0,1,0],[1,0]], rational=True)\n",
    "print(\"Max regret of starting profile:\", starting_profile_rational.max_regret())\n",
    "print(\"Liapunov value of starting profile:\", starting_profile_rational.liap_value())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e67d9926-d19d-4745-a406-a3c1198a8484",
   "metadata": {},
   "source": [
    "It could be a useful exercise to make sure that you can compute these values of the maximum regret and Liapunov value. For that, the starting point would be computing the reduced strategic form. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e799eded-c6e1-4a3e-80cb-953c52627762",
   "metadata": {},
   "source": [
    "Returning to `liap_solve`, since the maximum regret and therefore Liapunov value are both positive, the starting profile is not a Nash equilibrium and we expect `liap_solve` to return a different profile, which will hopefully, but not necessarily by a Nash equilibrium, depending on whether the solver finding a global minimum, or non-global local minimum, or nothing at all."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b885271f-7279-4d87-a0b9-bc28449b00ba",
   "metadata": {},
   "outputs": [],
   "source": [
    "candidate_eq = gbt.nash.liap_solve(start=starting_profile_double).equilibria[0]\n",
    "print(candidate_eq)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f8a90a9c-393e-4812-9418-76e705880f6f",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Liap value:\", candidate_eq.liap_value())\n",
    "print(\"Max regret:\", candidate_eq.max_regret())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "567e6a6a-fc8d-4142-806c-6510b2a4c624",
   "metadata": {},
   "outputs": [],
   "source": [
    "candidate_eq_rat = g.mixed_strategy_profile(data=[[0,\"1/2\",\"1/2\"],[\"1/3\",\"2/3\"]], rational=True)\n",
    "print(\"Liap value:\", candidate_eq_rat.liap_value())\n",
    "print(\"Max regret:\", candidate_eq_rat.max_regret())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3b61364c-3e7f-4094-8ddf-a557863632e5",
   "metadata": {},
   "source": [
    "Finally, before looking beyond Nash equilibria to \"agent Nash equilibria\", we will use Gambit's `enummixed_solve` to find all extreme mixed (including pure) Nash equilibria of this game."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "87a62c9e-b109-4f88-ac25-d0e0db3f27ea",
   "metadata": {},
   "outputs": [],
   "source": [
    "all_extreme_Nash_equilibria = gbt.nash.enummixed_solve(g).equilibria\n",
    "for eq in all_extreme_Nash_equilibria:\n",
    "    print(eq)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e2e2e129-e7c7-41fe-8bf9-26e3ab889839",
   "metadata": {},
   "source": [
    "The first of these is the pure equilibrium we found above with `enumpure_solve`. The last of these if the mixed equilibrium we just found with `liap_solve`. The middle of these is a new mixed equilibrium we haven't seen yet. Let's just confirm that it too, like the first and last, also have Liapunov value and maximum regret zero, as required for a Nash equilibrium:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2c8ed3df-958e-4ee9-aed6-a106547fbd37",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(all_extreme_Nash_equilibria[2])\n",
    "print(\"Liap value:\", all_extreme_Nash_equilibria[2].liap_value())\n",
    "print(\"Max regret:\", all_extreme_Nash_equilibria[2].max_regret())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "141a6c1f-7f3c-450b-8b2f-1d47671595de",
   "metadata": {},
   "source": [
    "# Agent maximum regret versus standard maximum regret\n",
    "\n",
    "Now we can introduce the \"agent\" versions of both of the notions, maximum regret and the Liapunov value. The \"agent\" versions relate to what [Myerson (1991)](#references) called the \"multi-agent representation\" of an extensive form game, in which each information set is treated as an individual \"agent\". The \"agent maximum regret\" is then either 0 (if every information set has regret 0, i.e. `infoset_regret` 0), or it is largest of the information set regrets, which is then necessarily positive.\n",
    "\n",
    "The maximum regret of a profile is at least at large as the agent maximum regret. \n",
    "In short, the reason it cannot be smaller is that all possible deviations of a given player -- even those that require changing behavior at multiple information sets -- are considered.\n",
    "In particular, that includes deviations at a single information set, or at more than one.\n",
    "On the other hand, the agent maximum regret only considers deviations at a single information set at a time, by considering each such information set as an \"agent\".\n",
    "\n",
    "Thus, **if the maximum regret is 0, then we have a Nash equilibrium, and the agent maximum regret will be 0 too**.\n",
    "However, **there are examples where a profile has agent maximum regret of 0 but positive maximum regret**, so the profile is \n",
    "not a Nash equilibrium.\n",
    "\n",
    "There is also an analagous distinction between `agent_liap_value` and `liap_value`, where the `liap_value` is at least as large as the `agent_liap_value` and there are examples where the former is positive (so we do not have a Nash equilibrium) but the latter is 0 (so we have an \"agent Nash equilibrium\".\n",
    "\n",
    "The game given above is such an example. It is taken from [Myerson (1991)](#references) figure 4.2. \n",
    "\n",
    "Gambit implements version of `enumpure_solve` and `liap_solve` called `enumpure_agent_solve` and `agent_liap_solve` that work only for the extensive form and use `agent_max_regret` and `agent_liap_value` respectively. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f46ce825-d2b7-492f-b0cf-6f213607e121",
   "metadata": {},
   "outputs": [],
   "source": [
    "pure_agent_equilibria = gbt.nash.enumpure_agent_solve(g).equilibria\n",
    "print(len(pure_agent_equilibria))\n",
    "for agent_eq in pure_agent_equilibria:\n",
    "    print(agent_eq)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "912b9af6-a2e4-4bae-9594-41c8861a4d9d",
   "metadata": {},
   "source": [
    "The first of the pure agent equilibria is the Nash equilibrium we found above, which we can check if we convert the agent equilibrium from a `MixedBehaviorProfile` to a `MixedStrategyProfile`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dbfa7035",
   "metadata": {},
   "outputs": [],
   "source": [
    "pure_Nash_equilibria[0] == pure_agent_equilibria[0].as_strategy()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ec2a8564-5102-4847-8110-a26ee1f4f400",
   "metadata": {},
   "source": [
    "The second agent equilibrium is not a Nash equilibrium, which we can confirm by showing that it's `max_regret` and `liap_value` are both positive, while the agent versions of these are 0 (which is why this profile was returned by `enumpure_agent_solve`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "85760cec-5760-4f9d-8ca2-99fba79c7c3c",
   "metadata": {},
   "outputs": [],
   "source": [
    "aeq = pure_agent_equilibria[1]\n",
    "print(\"Max regret:\", aeq.max_regret())\n",
    "print(\"Liapunov value:\", aeq.liap_value())\n",
    "print(\"Agent max regret\", aeq.agent_max_regret())\n",
    "print(\"Agent Liapunov value:\", aeq.agent_liap_value())"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a42f18d7-5fb4-4a45-9afd-76a63477ef1d",
   "metadata": {},
   "source": [
    "It is a useful exercise to make sure you can confirm that the pure profile `pure_agent_equilibria[1]` indeed has these values of agent and standard maximum regret and Liapunov value."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c4eeb65f",
   "metadata": {},
   "source": [
    "To conclude, we note that, for most use cases, the standard non-agent versions are probably what a user wants. The agent versions have applications in the area of \"equilibrium refinements\", in particular for \"sequential equilibria\"; for more details see Chapter 4, \"Sequential Equilibria of Extensive-Form Games\", in [Myerson (1991)](#references)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "65def67e",
   "metadata": {},
   "source": [
    "#### References\n",
    "\n",
    "Roger Myerson (1991) \"Game Theory: Analysis of Conflict.\" Harvard University Press. "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}