逐次最小問題最適化法のソースを表示

{{Infobox algorithm
|image=
|class=サポートベクターマシンの訓練のための{{仮リンク|最適化アルゴリズム|en|Optimization algorithm}}
|data=
|time=O(''n''³)
|space=
}}
'''逐次最小問題最適化法'''（{{lang-en-short|Sequential Minimal Optimization}}, '''SMO'''）は[[サポートベクターマシン]] (SVM) の訓練で生じる[[2次計画問題]] (QP) を解くための[[アルゴリズム]]である。1998年に[[マイクロソフトリサーチ]]の{{仮リンク|John Platt|en|John Platt (computer scientist)}}によって発明された<ref>{{Citation
 | last = Platt | first = John
 | year = 1998
 | title = Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines
 | id = {{Citeseerx|10.1.1.43.4376}}
}}</ref>。SMOはサポートベクターマシンの訓練のために広く使われ、人気の[[LIBSVM]]ツールによって実装される<ref>{{Cite journal
|last1=Chang |first1=Chih-Chung
|last2=Lin |first2=Chih-Jen
|title=LIBSVM: A library for support vector machines
|journal=ACM Transactions on Intelligent Systems and Technology
|volume=2 |issue=3 |year=2011
}}</ref><ref>Luca Zanni (2006). ''[http://jmlr.csail.mit.edu/papers/volume7/zanni06a/zanni06a.pdf Parallel Software for Training Large Scale Support Vector Machines on Multiprocessor Systems]''.</ref>。以前から利用できたSVM訓練法はより一層複雑で、高価な[[サードパーティー]]のQPソルバーを必要としたので、1998年のSMOアルゴリズムの公表はSVMコミュニティでたくさんの興奮を引き起こした<ref>{{Citation
 | last = Rifkin | first = Ryan
 | year = 2002
 | url = https://hdl.handle.net/1721.1/17549
 | title = Everything Old is New Again: a Fresh Look at Historical Approaches in Machine Learning
 | journal = Ph.D. thesis
 | pages = 18
}}</ref>。

== 最適化問題 ==
{{Main|サポートベクターマシン}}
データセット (''x''<sub>1</sub>, ''y''<sub>1</sub>), ..., (''x''<sub>''n''</sub>, ''y''<sub>''n''</sub>) に関する[[二項分類]]問題を考える。ここで ''x''<sub>''i''</sub> は入力ベクトル、{{nobr|''y''<sub>''i''</sub> ∈ {-1, +1} }}はそれに対応する2値ラベルである。ソフトマージン[[サポートベクターマシン]]は以下の[[双対問題]]で表される2次計画問題を解くことによって訓練される:
{{Indent|<math>\max_{\alpha} \sum_{i=1}^n \alpha_i - \frac12 \sum_{i=1}^n \sum_{j=1}^n y_i y_j K(x_i, x_j) \alpha_i \alpha_j,</math>}}
{{Indent|ただし}}
{{Indent|<math>0 \leq \alpha_i \leq C, \quad \mbox{ for } i=1, 2, \ldots, n,</math>}}
{{Indent|<math>\sum_{i=1}^n y_i \alpha_i = 0</math>}}
ここで ''C'' は SVM hyperparameter、''K''(''x''<sub>''i''</sub>, ''x''<sub>''j''</sub>) は{{仮リンク|カーネル関数|en|Positive-definite kernel|redirect=1}}で、どちらもユーザが与える。変数 <math>\alpha_i</math> は[[ラグランジュの未定乗数法|ラグランジュ乗数]]である。

== アルゴリズム ==
SMOは上記の最適化問題を解くための反復型アルゴリズムである。SMOはこの問題をその時解析的に解かれる一連の最小の可能な部分問題に分割する。ラグランジュ乗数 <math>\alpha_i</math> を伴う線形等式制約のため、最小の可能な問題はそのような2つの乗数を含む。そして、任意の2つの乗数 <math>\alpha_1</math>、<math>\alpha_2</math> について、次の制約に分解される:
{{Indent|<math>0 \leq \alpha_1, \alpha_2 \leq C,</math>}}
{{Indent|<math>y_1 \alpha_1 + y_2 \alpha_2 = k,</math>}}
<math>k</math> は前述の和の等式より導かれる定数である。そしてこの問題は解析的に解くことができる。

アルゴリズムは次のように進行する:
# 最適化問題の[[KKT条件]]を破るラグランジュ乗数 <math>\alpha_1</math> を見つける。
# 第2の乗数 <math>\alpha_2</math> を選び、組 <math>(\alpha_1,\alpha_2)</math> を最適化する。
# 収束するまでステップ1、2を繰り返す。
すべてのラグランジュ乗数がKKT条件を十分に満たすとき、全体の最適化が終了する。このアルゴリズムは収束することが保証されている。しかし、データセットが大きくなると、組 <math>(\alpha_1,\alpha_2)</math>の選び方が<math>O(n^2)</math>で大きくなるので、より速く収束させるために、部分問題を構成する変数を選び出すためのヒューリスティックを使うことが重要となる。

== 関連項目 ==
* [[サポートベクターマシン]]

== 参考文献 ==
{{Reflist|30em}}

{{デフォルトソート:ちくしさいしようもんたいさいてきかほう}}
[[Category:最適化アルゴリズム]]
[[Category:アルゴリズム]]
[[Category:機械学習]]
[[Category:数学に関する記事]]