
'''Mean-field game theory''' is the study of strategic decision making by small interacting [[経済主体|agents]] in very large populations.

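== Overview ==
Mean-field game theory lies at the intersection of [[ゲーム理論|game theory]] with stochastic analysis and control theory. The term "mean field" is inspired by [[平均場近似|mean-field theory]] in physics, which considers the behavior of systems of many particles in which each individual particle has a negligible effect on the system. In other words, each agent acts according to its own minimization or maximization problem while taking the decisions of the other agents into account. Because the population is large, the number of agents can be assumed to tend to infinity, and a representative agent can be assumed to exist.<ref>{{Cite arXiv|arxiv=1907.01411|class=math.OC|last=Vasiliadis|first=Athanasios|title=An Introduction to Mean Field Games using probabilistic methods}}</ref>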

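In traditional game theory, the object of study is usually a game with two players in discrete time, and results are extended to more complex situations by induction. For continuous-time games with continuous states (differential games or stochastic differential games), however, this strategy cannot be used because of the complexity generated by the dynamic interactions. Mean-field games (MFGs), by contrast, can handle a large number of players through the mean representative agent while still describing complex state dynamics.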

This class of problems was considered in the economics literature by Boyan Jovanovic and Robert W. Rosenthal,<ref>{{Cite journal|last=Jovanovic|first=Boyan|last2=Rosenthal|first2=Robert W.|year=1988|title=Anonymous Sequential Games|journal=[[Journal of Mathematical Economics]]|volume=17|issue=1|pages=77–87|doi=10.1016/0304-4068(88)90029-8}}</ref> in the engineering literature by Minyi Huang, Roland Malhamé, and Peter E. Caines,<ref>{{Cite journal|last=Huang|first=M. Y.|last2=Malhame|first2=R. P.|last3=Caines|first3=P. E.|year=2006|title=Large Population Stochastic Dynamic Games: Closed-Loop McKean–Vlasov Systems and the Nash Certainty Equivalence Principle|journal=Communications in Information and Systems|volume=6|issue=3|pages=221–252|doi=10.4310/CIS.2006.v6.n3.a5|zbl=1136.91349}}</ref><ref>{{Cite journal|last=Nourian|first=M.|last2=Caines|first2=P. E.|year=2013|title=ε–Nash mean field game theory for nonlinear stochastic dynamical systems with major and minor agents|journal=SIAM Journal on Control and Optimization|volume=51|issue=4|pages=3302–3331|arxiv=1209.5684|doi=10.1137/120889496}}</ref><ref>{{Cite journal|last=Djehiche|first=Boualem|last2=Tcheukam|first2=Alain|last3=Tembine|first3=Hamidou|date=2017|title=Mean-Field-Type Games in Engineering|journal=AIMS Electronics and Electrical Engineering|volume=1|issue=1|pages=18–73|arxiv=1605.03281|doi=10.3934/ElectrEng.2017.1.18}}</ref> and, independently and at around the same time, by the mathematicians Jean-Michel Lasry and [[ピエール＝ルイ・リオン|Pierre-Louis Lions]].<ref>{{Cite journal|last=Lions|first=Pierre-Louis|last2=Lasry|first2=Jean-Michel|date=March 2007|title=Large investor trading impacts on volatility|url=http://www.numdam.org/item/AIHPC_2007__24_2_311_0/|journal=Annales de l'Institut Henri Poincaré C|volume=24|issue=2|pages=311–323|bibcode=2007AIHPC..24..311L|doi=10.1016/j.anihpc.2005.12.006}}</ref><ref>{{Cite journal|last=Lasry|first=Jean-Michel|last2=Lions|first2=Pierre-Louis|date=28 March 2007|title=Mean field games|url=https://basepub.dauphine.fr/handle/123456789/2263|journal=Japanese Journal of Mathematics|volume=2|issue=1|pages=229–260|doi=10.1007/s11537-007-0657-8}}</ref>


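In continuous time, a mean-field game is typically composed of a [[ハミルトン-ヤコビ-ベルマン方程式|Hamilton–Jacobi–Bellman equation]], which describes the [[最適制御|optimal control]] problem of an individual, and a [[フォッカー・プランク方程式|Fokker–Planck equation]], which describes the dynamics of the aggregate distribution of agents. Under fairly general assumptions it can be proved that a class of mean-field games is the limit as <math>N \to \infty</math> of an ''N''-player [[ナッシュ均衡|Nash equilibrium]].<ref>{{Cite web |url=https://www.ceremade.dauphine.fr/~cardaliaguet/MFG20130420.pdf |title=Notes on Mean Field Games |author=Cardaliaguet |first=Pierre |date=September 27, 2013}}</ref>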


A concept related to mean-field games is "mean-field-type control". In this case a social planner controls the distribution of states and chooses a control strategy. The solution of a mean-field-type control problem can typically be expressed as a dual adjoint Hamilton–Jacobi–Bellman equation coupled with a [[フォッカー・プランク方程式|Kolmogorov equation]]. Mean-field-type game theory is the multi-agent generalization of single-agent mean-field-type control.<ref>{{Cite book |url=https://www.springer.com/gp/book/9781461485070 |title=Mean Field Games and Mean Field Type Control Theory |last=Bensoussan |first=Alain |last2=Frehse |first2=Jens |last3=Yam |first3=Phillip |date=2013 |publisher=Springer-Verlag |isbn=9781461485070 |series=Springer Briefs in Mathematics |location=New York |language=en}}{{要ページ番号|date=May 2019}}</ref>

== General form of a mean-field game ==
A typical mean-field game can be modeled by the following system of equations:<ref>{{Cite book |last=Achdou |first=Yves |url=https://www.worldcat.org/oclc/1238206187 |title=Mean field games : Cetraro, Italy 2019 |date=2020 |others=Pierre Cardaliaguet, F. Delarue, Alessio Porretta, Filippo Santambrogio |isbn=978-3-030-59837-2 |location=Cham |oclc=1238206187}}</ref>

<math>\begin{cases} -\partial_t u-\nu \Delta u+H(x,m,Du)=0 &(1)\\ \partial_t m-\nu \Delta m-\operatorname{div}(D_p H(x,m,Du)\, m)=0 &(2)\\ m(0)=m_0 &(3)\\ u(T,x)=G(x,m(T)) &(4) \end{cases}</math>

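The basic dynamics of this system of equations can be explained by the optimal control problem of an average agent. In a mean-field game, an average agent controls its movement <math>\alpha</math>, and thereby its contribution to the overall position of the population, through: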


<math>dX_t=\alpha_t\,dt+\sqrt{2\nu}\,dB_t</math>

Here <math>\nu</math> is a parameter and <math>B_t</math> is a standard Brownian motion. By controlling its movement, each agent aims to minimize its overall expected cost <math>C</math> over the time period <math>[0,T]</math>:

<math>C=\mathbb{E}[\int_{0}^TL(X_s,\alpha_s,m(s))ds+G(X_T,m(T))]</math>

Here <math>L(X_s,\alpha_s,m(s))</math> is the running cost at time <math>s</math> and <math>G(X_T,m(T))</math> is the terminal cost at time <math>T</math>. With this definition, the value function <math>u(t,x)</math> at time <math>t</math> and position <math>x</math> is given by:

<math>u(t,x)=\inf_{\alpha}\mathbb{E}[\int_{t}^TL(X_s,\alpha_s,m(s))ds+G(X_T,m(T))]</math>
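The expected cost for a fixed control can be estimated numerically. The following is a minimal sketch, not a method taken from the cited sources: it discretizes the controlled dynamics with the Euler–Maruyama scheme and averages the cost over simulated paths. The function name <code>simulate_cost</code>, the quadratic costs, and all parameter values are assumptions made for illustration, and the population distribution <math>m</math> is treated as frozen and absorbed into the cost functions.

```python
import math
import random


def simulate_cost(alpha, running_cost, terminal_cost, x0, nu, T,
                  n_steps=200, n_paths=1000, seed=0):
    """Monte Carlo estimate of E[ int_0^T L(X_s, alpha_s) ds + G(X_T) ].

    Paths follow dX = alpha(t, X) dt + sqrt(2 nu) dB, discretized with the
    Euler-Maruyama scheme. Assumption of this sketch: the population
    distribution m is frozen and absorbed into the two cost functions.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    noise_scale = math.sqrt(2.0 * nu * dt)
    total = 0.0
    for _ in range(n_paths):
        x, cost = x0, 0.0
        for k in range(n_steps):
            a = alpha(k * dt, x)
            cost += running_cost(x, a) * dt  # left-point rule for the integral
            x += a * dt + noise_scale * rng.gauss(0.0, 1.0)
        total += cost + terminal_cost(x)
    return total / n_paths


if __name__ == "__main__":
    # Hypothetical quadratic costs: L = a^2/2 + x^2/2, G = x^2/2.
    running = lambda x, a: 0.5 * a * a + 0.5 * x * x
    terminal = lambda x: 0.5 * x * x
    no_control = simulate_cost(lambda t, x: 0.0, running, terminal, 1.0, 0.1, 1.0)
    feedback = simulate_cost(lambda t, x: -x, running, terminal, 1.0, 0.1, 1.0)
    print("zero control:", no_control, "feedback -x:", feedback)
```

For these particular quadratic costs the feedback control <math>\alpha=-x</math> should give a lower estimate than the zero control; minimizing over all admissible controls is what defines <math>u(t,x)</math>.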

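Given this definition, the value function <math>u(t,x)</math> can be characterized by the Hamilton–Jacobi–Bellman equation (1). The optimal action of the average player, <math>\alpha^*(x,t)</math>, can then be obtained as <math>\alpha^*(x,t)=-D_p H(x,m,Du)</math>, which is the drift that appears in equation (2). Because every agent is relatively small and cannot change the dynamics of the population single-handedly, each agent individually adopts the optimal control and the population moves accordingly. This is similar to a Nash equilibrium, in which every agent acts in response to a given set of strategies of the others. The optimal control solution then leads to the Kolmogorov–Fokker–Planck equation (2).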
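As a hedged illustration of how the Hamiltonian, the optimal control, and the two equations fit together, assume a running cost that is quadratic in the control, <math>L(x,\alpha,m)=\tfrac{1}{2}|\alpha|^2+f(x,m)</math>, where <math>f</math> is a hypothetical cost depending on position and population, and assume the convention <math>H(x,m,p)=\sup_{\alpha}\{-\alpha\cdot p-L(x,\alpha,m)\}</math>. Neither choice is stated in the text above. Under these assumptions:

<math display="block">\begin{aligned} H(x,m,p) &= \sup_{\alpha}\left\{-\alpha\cdot p-\tfrac{1}{2}|\alpha|^2\right\}-f(x,m)=\tfrac{1}{2}|p|^2-f(x,m),\\ D_pH(x,m,p) &= p,\qquad \alpha^*(x,t)=-Du(t,x),\\ -\partial_t u-\nu\Delta u+\tfrac{1}{2}|Du|^2 &= f(x,m),\\ \partial_t m-\nu\Delta m-\operatorname{div}(m\,Du) &= 0. \end{aligned}</math>

In this special case each agent moves down the gradient of the value function, and the population density is transported by that same drift.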

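== Finite state games ==
A prominent category of mean-field games is that of games with a finite number of states and a finite number of actions per player. For these games, the analog of the Hamilton–Jacobi–Bellman equation is the Bellman equation, and the discrete version of the Fokker–Planck equation is the Kolmogorov equation. Specifically, in discrete-time models the players' strategy is the probability matrix of the Kolmogorov equation. In continuous-time models the players control the transition rate matrix.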

A discrete mean-field game can be defined by a tuple <math>\mathcal{G}=(\mathcal{E}, \mathcal{A}, \{ Q_a \}, {\bf m }_0, \{ c_a \}, \beta)</math>, where <math>\mathcal{E}</math> is the state space, <math>\mathcal{A}</math> the action set, <math> Q_{a} </math> the transition rate matrices, <math>{\bf m }_0</math> the initial population distribution, <math>\{c_a\}</math> the cost functions, and <math>\beta \in \mathbb{R}</math> a discount factor. A mixed strategy is a measurable function <math>\pi: \mathcal{E} \times \mathbb{R}^+ \to \mathcal{P}(\mathcal{A})</math> that associates with each state <math>i \in \mathcal{E}</math> and each time <math>t \geq 0</math> a probability measure <math>\pi_i(t) \in \mathcal{P}(\mathcal{A})</math> on the set of possible actions. Thus <math>\pi_{i,a}(t)</math> is the probability that, at time <math>t</math>, a player in state <math>i</math> takes action <math>a</math> under strategy <math>\pi</math>. The rate matrices <math> \{ Q_a ({\bf m }^{\pi}(t)) \}_{a \in \mathcal{A}} </math> define the evolution of the population distribution over time, where <math>{\bf m }^{\pi}(t) \in \mathcal{P}(\mathcal{E})</math> is the population distribution at time <math>t</math>.<ref>{{Cite journal|last=Doncel|first=Josu|last2=Gast|first2=Nicolas|last3=Gaujal|first3=Bruno|year=2019|title=Discrete mean field games: Existence of equilibria and convergence|journal=Journal of Dynamics & Games|pages=1–19|arxiv=1909.01209|doi=10.3934/jdg.2019016}}</ref>
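The way a strategy and the rate matrices drive the population distribution can be sketched numerically. The code below is an illustrative assumption rather than the construction of the cited paper: it takes a time-independent mixed strategy <code>policy[i][a]</code>, forms the mixed rate matrix with rows <math>\textstyle\sum_a \pi_{i,a} Q_a({\bf m})_{i,\cdot}</math>, and integrates the forward Kolmogorov equation <math>\dot{\bf m} = {\bf m}\, Q^{\pi}({\bf m})</math> with an explicit Euler step. The function names and the two-state example are hypothetical.

```python
def mix_rate_matrix(rate_matrices, policy, m):
    """Return Q^pi(m), whose row i is sum_a policy[i][a] * Q_a(m)[i]."""
    n = len(m)
    q = [[0.0] * n for _ in range(n)]
    for a, q_a in enumerate(rate_matrices):
        qa = q_a(m)  # each Q_a may depend on the population distribution
        for i in range(n):
            for j in range(n):
                q[i][j] += policy[i][a] * qa[i][j]
    return q


def evolve_population(m0, rate_matrices, policy, horizon, dt=1e-3):
    """Explicit Euler integration of dm/dt = m Q^pi(m) (m is a row vector)."""
    m = list(m0)
    n = len(m)
    for _ in range(int(round(horizon / dt))):
        q = mix_rate_matrix(rate_matrices, policy, m)
        m = [m[j] + dt * sum(m[i] * q[i][j] for i in range(n))
             for j in range(n)]
    return m


if __name__ == "__main__":
    # Hypothetical two-state, two-action example; every row sums to zero.
    rates = [
        lambda m: [[-1.0, 1.0], [1.0, -1.0]],
        lambda m: [[-m[1], m[1]], [0.5, -0.5]],  # rate depends on the population
    ]
    policy = [[0.5, 0.5], [1.0, 0.0]]  # policy[i][a]: P(action a | state i)
    print(evolve_population([1.0, 0.0], rates, policy, horizon=2.0))
```

Because every row of each rate matrix sums to zero, the Euler step keeps the total mass of <math>{\bf m}</math> equal to one, and the entries stay non-negative as long as the step size is small relative to the largest rate.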

== Linear-quadratic Gaussian game problem ==
Following Caines (2009), a relatively simple model of large-scale games is the linear-quadratic Gaussian model. The dynamics of each individual agent are modeled as a [[確率微分方程式|stochastic differential equation]]<math display="block">dX_i = (a_i X_i + b_i u_i) \,dt + \sigma_i \,dW_i, \quad i = 1, \dots, N,</math>where <math>X_i</math> is the state of the <math>i</math>-th agent, <math>u_i</math> is the control of the <math>i</math>-th agent, and <math>W_i</math>, <math>i = 1, \dots, N</math>, are independent [[ウィーナー過程|Wiener processes]]. The cost of an individual agent is<math display="block">J_i(u_i, \nu) = \mathbb{E}\left\{ \int_0^\infty e^{-\rho t} \left[(X_i - \nu)^2 + ru_i^2\right] \,dt\right\}, \quad \nu = \Phi\left(\frac{1}{N} \sum_{k \neq i}^N X_k + \eta\right).</math>The coupling between agents occurs through the cost function.
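The coupling through the cost can be seen in a small simulation. The sketch below is only illustrative and makes several assumptions that are not in the model statement above: all agents share the same coefficients, <math>\Phi</math> is the identity, <math>\eta = 0</math>, the infinite horizon is truncated, and each agent uses a hypothetical linear feedback <math>u_i = -\kappa (X_i - \nu)</math> with an arbitrary gain rather than the equilibrium control. The function name <code>simulate_lqg</code> is likewise an assumption.

```python
import math
import random


def simulate_lqg(n_agents, a, b, sigma, gain, rho, r, horizon,
                 dt=0.01, seed=0):
    """Euler-Maruyama simulation of N coupled linear-quadratic agents.

    Assumptions of this sketch: identical coefficients for all agents,
    Phi = identity, eta = 0, truncated horizon, and a hypothetical
    feedback u_i = -gain * (X_i - nu_i) that is not the equilibrium control.
    Returns the final states and each agent's accumulated discounted cost.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n_agents)]
    costs = [0.0] * n_agents
    sqrt_dt = math.sqrt(dt)
    for step in range(int(round(horizon / dt))):
        discount = math.exp(-rho * step * dt)
        total = sum(x)
        new_x = []
        for i in range(n_agents):
            nu_i = (total - x[i]) / n_agents  # (1/N) * sum over k != i of X_k
            u = -gain * (x[i] - nu_i)
            costs[i] += discount * ((x[i] - nu_i) ** 2 + r * u * u) * dt
            drift = a * x[i] + b * u
            new_x.append(x[i] + drift * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0))
        x = new_x  # all agents are updated simultaneously
    return x, costs


if __name__ == "__main__":
    states, costs = simulate_lqg(n_agents=50, a=-0.5, b=1.0, sigma=0.3,
                                 gain=1.0, rho=0.1, r=1.0, horizon=10.0)
    print("average discounted cost:", sum(costs) / len(costs))
```

Each agent's dynamics depend only on its own state and control; the other agents enter solely through the tracking target <math>\nu</math> in the cost, which is the coupling described above.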

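== General and applied use ==
The mean-field game paradigm has become a major link between distributed decision-making and stochastic modeling. Starting in the stochastic control literature, it is being adopted rapidly across a range of applications, including the following.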

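'''Financial markets'''. Carmona reviews applications in financial engineering and economics that can be cast and tackled within the framework of the MFG paradigm.<ref>{{Cite arXiv|arxiv=2012.05237|class=q-fin.GN|last=Carmona|first=Rene|title=Applications of mean field games in financial engineering and economic theory|date=2020}}</ref> Carmona argues that models in macroeconomics, contract theory, finance, and related fields benefit greatly from the switch from the more traditional discrete-time models to continuous time. His review chapter considers only continuous-time models, covering systemic risk, price impact, optimal execution, models of bank runs, high-frequency trading, and cryptocurrencies.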


'''Crowd motion'''. MFG assumes that individuals are smart players who try to optimize their strategy and path with respect to certain costs (an equilibrium with rational expectations approach). MFG models are useful for describing the anticipation phenomenon: the forward part describes the evolution of the crowd, while the backward part describes how the anticipations are built. In addition, compared with multi-agent microscopic model computations, MFG requires lower computational cost for macroscopic simulations. Some researchers have turned to MFG to model the interaction between populations and to study the decision-making process of intelligent agents, including aversion and congestion behavior between two groups of pedestrians,<ref>{{Cite journal|last=Lachapelle|first=Aimé|last2=Wolfram|first2=Marie-Therese|date=2011|title=On a mean field game approach modeling congestion and aversion in pedestrian crowds|url=https://basepub.dauphine.fr/handle/123456789/5946|journal=Transportation Research Part B: Methodological|volume=45|issue=10|pages=1572–1589|doi=10.1016/j.trb.2011.07.011}}</ref> the departure-time choice of morning commuters,<ref>{{Cite arXiv|arxiv=1912.08695|class=q-fin.MF|last=Feinstein|first=Zachary|last2=Sojmark|first2=Andreas|title=A dynamic default contagion model: From Eisenberg-Noe to the mean field|date=2019}}</ref> and the decision-making process of autonomous vehicles.<ref>{{Cite journal|last=Huang|first=Kuang|last2=Chen|first2=Xu|last3=Di|first3=Xuan|last4=Du|first4=Qiang|date=2021|title=Dynamic driving and routing games for autonomous vehicles on networks: A mean field game approach|journal=Transportation Research Part C: Emerging Technologies|volume=128|page=103189|arxiv=2012.08388|doi=10.1016/j.trc.2021.103189}}</ref>


'''Control and mitigation of epidemics'''. Because epidemics have a significant impact on society and on individuals, MFG and mean-field control (MFC) provide a perspective for studying and understanding the underlying population dynamics, especially in the context of the response to the Covid-19 pandemic. MFG has been used to extend SIR-type dynamics with spatial effects, or to allow individuals to choose their behavior and control their contribution to the spread of the disease. MFC is applied to design optimal strategies for controlling the spread of a virus within a spatial domain, for controlling individuals' decisions to limit their social interactions, and for supporting governments' non-pharmaceutical interventions.<ref>{{Cite journal|last=Lee|first=Wonjun|last2=Liu|first2=Siting|last3=Tembine|first3=Hamidou|last4=Li|first4=Wuchen|last5=Osher|first5=Stanley|date=2021|title=Controlling propagation of epidemics via mean-field control|journal=SIAM Journal on Applied Mathematics|volume=81|issue=1|pages=190–207|arxiv=2006.01249|doi=10.1137/20M1342690}}</ref><ref>{{Cite journal|last=Aurell|first=Alexander|last2=Carmona|first2=Rene|last3=Dayanikli|first3=Gokce|last4=Lauriere|first4=Mathieu|date=2022|title=Optimal incentives to mitigate epidemics: a Stackelberg mean field game approach|journal=SIAM Journal on Control and Optimization|volume=60|issue=2|pages=S294–S322|arxiv=2011.03105|doi=10.1137/20M1377862}}</ref><ref>{{Cite journal|last=Elie|first=Romuald|last2=Hubert|first2=Emma|last3=Turinici|first3=Gabriel|date=2020|title=Contact rate epidemic control of COVID-19: an equilibrium view|journal=Mathematical Modelling of Natural Phenomena|volume=15|page=35|doi=10.1051/mmnp/2020022}}</ref>

== References ==
{{Reflist|30em}}

== External links ==
* [http://www.ieeecss-oll.org/lectures/2009/mean-field-stochastic-control Mean Field Stochastic Control] ([http://www.ieeecss-oll.org/sites/default/files/Caines.pdf Slides]), 2009 IEEE Control Systems Society Bode Prize Lecture by Peter E. Caines
* {{Cite book |doi=10.1007/978-1-4471-5102-9_30-1 |chapter=Mean Field Games |title=Encyclopedia of Systems and Control |pages=1–6 |year=2013 |last=Caines |first=Peter E. |isbn=978-1-4471-5102-9}}
* [https://www.ceremade.dauphine.fr/~cardaliaguet/MFG20130420.pdf Notes on Mean Field Games], from [[ピエール＝ルイ・リオン|Pierre-Louis Lions]]' lectures at [[コレージュ・ド・フランス|Collège de France]]
* {{In lang|fr}} [http://www.college-de-france.fr/site/pierre-louis-lions/index.htm#%7Cm=course%7Cq=/site/pierre-louis-lions/course-2011-2012.htm Video lectures] by Pierre-Louis Lions
* [https://www.oliviergueant.com/uploads/4/3/0/9/4309511/paris-princeton.pdf Mean field games and applications] by Olivier Guéant, Jean-Michel Lasry, and Pierre-Louis Lions

{{ゲーム理論}}
{{DEFAULTSORT:へいきんはけえむりろん}}
[[Category:数理経済学]]
[[Category:ゲーム理論]]
[[Category:金融工学]]
[[Category:数学に関する記事]]