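'''Elastic net''' is a [[Regularization (mathematics)|regularized]] [[regression analysis|regression]] method that [[linear combination|linearly combines]] the ''L''<sub>1</sub> and ''L''<sub>2</sub> penalties of the [[Lasso (statistics)|lasso]] and [[ridge regression]] methods, with parameters that control the weight of each penalty. It is used in [[statistics]] to fit [[linear regression]] and [[logistic regression]] models.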

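== Specification ==
The elastic net is a regularization method that overcomes limitations of the [[Lasso (statistics)|lasso]], which arise from the properties of its penalty function <math>\|\beta\|_1 = \textstyle \sum_{j=1}^p |\beta_j|</math>.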

The lasso penalty has several limitations.<ref name=ZH>{{cite journal|last1=Zou|first1=Hui|first2=Trevor|last2=Hastie|date=2005|title=Regularization and Variable Selection via the Elastic Net|journal=Journal of the Royal Statistical Society, Series B|volume=67|issue=2|pages=301–320|doi=10.1111/j.1467-9868.2005.00503.x|citeseerx=10.1.1.124.4696}}</ref> For example, with ''p'' covariates and ''n'' samples, in the high-dimensional, small-sample case (''p'' > ''n'') the lasso can select at most ''n'' covariates. In addition, when a group of covariates is highly correlated, the lasso tends to select only one covariate from the group and ignore the others. To overcome these limitations, the elastic net adds the quadratic penalty of ridge regression (<math>\|\beta\|^2</math>) to the lasso penalty. The elastic net estimate is defined as

: <math> \hat{\beta} \equiv \underset{\beta}{\operatorname{argmin}} (\| y-X \beta \|^2 + \lambda_2 \|\beta\|^2 + \lambda_1 \|\beta\|_1) </math>.
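
The objective inside the argmin can be evaluated directly. The following is a minimal Python sketch, assuming the data are given as plain lists; the function name <code>elastic_net_objective</code> and the sample values are illustrative and do not come from the cited paper.

```python
def elastic_net_objective(X, y, beta, lam1, lam2):
    """Return ||y - X beta||^2 + lam2 * ||beta||^2 + lam1 * ||beta||_1.

    X is a list of n rows (each a list of p floats); y and beta are lists.
    """
    # Residual sum of squares.
    rss = 0.0
    for row, target in zip(X, y):
        prediction = sum(x_ij * b_j for x_ij, b_j in zip(row, beta))
        rss += (target - prediction) ** 2
    ridge_penalty = sum(b * b for b in beta)   # squared L2 norm
    lasso_penalty = sum(abs(b) for b in beta)  # L1 norm
    return rss + lam2 * ridge_penalty + lam1 * lasso_penalty


# Illustrative data (hypothetical): n = 3 samples, p = 2 covariates.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]
beta = [1.0, -2.0]
print(elastic_net_objective(X, y, beta, lam1=0.5, lam2=0.1))
```

Setting <code>lam2=0</code> gives the lasso objective and <code>lam1=0</code> gives the ridge regression objective.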

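The quadratic penalty term makes the loss function strongly convex, so it has a unique minimum. The elastic net includes the lasso and ridge regression as special cases, obtained with <math>\lambda_1 = \lambda, \lambda_2 = 0</math> and <math>\lambda_1 = 0, \lambda_2 = \lambda</math> respectively. The naive version of the elastic net, however, finds its estimator in a two-stage procedure: for each fixed <math>\lambda_2</math> it first finds the ridge regression coefficients, and then applies a lasso-type shrinkage. This procedure shrinks the estimate twice, which increases bias and degrades prediction accuracy. To improve prediction accuracy, the authors rescale the coefficients of the naive elastic net by multiplying the estimated coefficients by <math>(1 + \lambda_2)</math>.<ref name=ZH/>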
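
As a minimal sketch of how such an estimate can be computed, the following Python code minimises the objective above by cyclic coordinate descent and then applies the <math>(1 + \lambda_2)</math> correction. The solver choice, the function names and the fixed number of sweeps are assumptions made for illustration; this is not the algorithm of the cited paper, and the rescaling is applied here without the standardisation of covariates that the paper assumes.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator S(z, gamma) = sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def naive_elastic_net(X, y, lam1, lam2, n_sweeps=200):
    """Minimise ||y - X b||^2 + lam2 * ||b||^2 + lam1 * ||b||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta, since beta starts at zero
    for _ in range(n_sweeps):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            a = sum(c * c for c in col)
            if a + lam2 == 0.0:
                continue  # all-zero column and no ridge term: leave beta[j] at 0
            # Inner product of column j with the residual that excludes its own contribution.
            rho = sum(c * r for c, r in zip(col, residual)) + a * beta[j]
            new_bj = soft_threshold(rho, lam1 / 2.0) / (a + lam2)
            delta = new_bj - beta[j]
            if delta != 0.0:
                residual = [r - c * delta for r, c in zip(residual, col)]
                beta[j] = new_bj
    return beta


def rescaled_elastic_net(X, y, lam1, lam2, n_sweeps=200):
    """Multiply the naive estimate by (1 + lam2) to undo the extra shrinkage."""
    return [(1.0 + lam2) * b for b in naive_elastic_net(X, y, lam1, lam2, n_sweeps)]


# Orthonormal toy design (hypothetical data), where each coordinate has a closed form.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.2]
print(naive_elastic_net(X, y, lam1=1.0, lam2=1.0))     # naive elastic net
print(rescaled_elastic_net(X, y, lam1=1.0, lam2=1.0))  # after rescaling
print(naive_elastic_net(X, y, lam1=1.0, lam2=0.0))     # lasso special case
print(naive_elastic_net(X, y, lam1=0.0, lam2=1.0))     # ridge special case
```

For an orthonormal design such as this one, each naive coefficient equals the soft-thresholded lasso coefficient divided by <math>(1 + \lambda_2)</math>, so the rescaled estimate coincides with the lasso estimate.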

Examples of where elastic net regularization has been applied:
* Support vector machines<ref>{{cite journal|last1=Wang|first1=Li|last2=Zhu|first2=Ji|last3=Zou|first3=Hui|date=2006|title=The doubly regularized support vector machine|journal=Statistica Sinica|volume=16|pages=589–615|url=http://www.stat.lsa.umich.edu/~jizhu/pubs/Wang-Sinica06.pdf}}</ref>
* Metric learning<ref>{{cite journal|last1=Liu|first1=Meizhu|last2=Vemuri|first2=Baba|title=A robust and efficient doubly regularized metric learning approach|journal=Proceedings of the 12th European Conference on Computer Vision|series=Lecture Notes in Computer Science|year=2012|volume=Part IV|pages=646–659 |doi=10.1007/978-3-642-33765-9_46|pmid=24013160|pmc=3761969|isbn=978-3-642-33764-2|url=http://dl.acm.org/citation.cfm?id=2404791}}</ref>
* Portfolio optimization<ref>{{cite journal|last1=Shen|first1=Weiwei|last2=Wang|first2=Jun|last3=Ma|first3=Shiqian|s2cid=11017740|title=Doubly Regularized Portfolio with Risk Minimization|journal=Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence|year=2014|pages=1286–1292 }}</ref>
* Cancer prognosis<ref>{{Cite journal|last1=Milanez-Almeida|first1=Pedro|last2=Martins|first2=Andrew J.|last3=Germain|first3=Ronald N.|last4=Tsang|first4=John S.|date=2020-02-10|title=Cancer prognosis with shallow tumor RNA sequencing|url=https://www.nature.com/articles/s41591-019-0729-3|journal=Nature Medicine|volume=26|issue=2|language=en|pages=188–192|doi=10.1038/s41591-019-0729-3|pmid=32042193|s2cid=211074147|issn=1546-170X}}</ref>

== Reduction to support vector machine ==
In late 2014, it was proven that the elastic net can be reduced to the linear [[support vector machine]].<ref name=SV>
{{cite conference |last1=Zhou |first1=Quan |last2=Chen |first2=Wenlin |last3=Song |first3=Shiji |last4=Gardner |first4=Jacob |last5=Weinberger |first5=Kilian |last6=Chen |first6=Yixin |title=A Reduction of the Elastic Net to Support Vector Machines with an Application to GPU Computing |url=https://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9856 |conference=[[Association for the Advancement of Artificial Intelligence]]}}</ref>
A similar reduction was proven for the lasso in 2014.<ref name=MJ>{{Cite book |title=An Equivalence between the Lasso and Support Vector Machines |last=Jaggi |first=Martin |editor-last1=Suykens |editor-first1=Johan |editor-last2=Signoretto |editor-first2=Marco |editor-last3=Argyriou |editor-first3=Andreas |year=2014 |publisher=Chapman and Hall/CRC |arxiv=1303.1152 }}</ref>
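The authors showed that for every instance of the elastic net, an artificial binary classification problem can be constructed such that the hyperplane solution of a linear support vector machine (SVM) is identical to the solution <math>\beta</math> (after re-scaling). The reduction makes it possible to use highly optimized SVM solvers for elastic net problems. It also enables the use of [[Graphics processing unit|GPU]] acceleration, which is often already used for large-scale SVM solvers.<ref name="GT">{{cite web|url=http://ttic.uchicago.edu/~cotter/projects/gtsvm/|title=GTSVM|work=uchicago.edu|access-date=13 June 2022}}</ref> The reduction is a simple transformation of the original data and regularization constants
: <math> X\in{\mathbb R}^{n\times p},y\in {\mathbb R}^n,\lambda_1\geq 0,\lambda_2\geq 0</math>
into new artificial data instances and a regularization constant that specify a binary classification problem and the SVM regularization constant
: <math> X_2\in{\mathbb R}^{2p\times n},y_2\in\{-1,1\}^{2p}, C\geq 0 </math>
Here, <math>y_2</math> consists of binary labels in <math>\{-1,1\}</math>. When <math>2p>n</math> it is typically faster to solve the linear SVM in the primal, whereas otherwise the dual formulation is faster. The authors named this transformation Support Vector Elastic Net (SVEN) and provided the following MATLAB pseudo-code: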
<syntaxhighlight lang="matlab">
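function beta = SVEN(X, y, t, lambda2)
% SVEN  Solve an elastic net problem through its reduction to a linear SVM.
%   SVMPrimal and SVMDual stand for external SVM solvers;
%   they are placeholders, not built-in MATLAB functions.
[n, p] = size(X);
% Build the 2p artificial points (rows of X2), each of dimension n.
X2 = [bsxfun(@minus, X, y./t), bsxfun(@plus, X, y./t)]';
Y2 = [ones(p,1); -ones(p,1)];   % labels in {+1, -1}
C = 1/(2*lambda2);              % SVM regularization constant
if 2*p > n
    w = SVMPrimal(X2, Y2, C);              % solve in the primal
    alpha = C * max(1 - Y2.*(X2*w), 0);    % recover the dual variables
else
    alpha = SVMDual(X2, Y2, C);            % solve in the dual
end
% Map the dual variables back to the elastic net coefficients.
beta = t * (alpha(1:p) - alpha(p+1:2*p)) / sum(alpha);
end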
</syntaxhighlight>
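
The data transformation in the pseudo-code above can be sketched in Python as follows. This is a minimal illustration that only builds the artificial classification data and maps a given dual solution back to <math>\beta</math>; the function names are hypothetical, the SVM solver itself is left to the caller, and the construction follows the stated dimension <math>X_2\in{\mathbb R}^{2p\times n}</math>.

```python
def sven_transform(X, y, t, lam2):
    """Build the artificial SVM problem (X2, Y2, C) from elastic net data.

    X is a list of n rows with p entries each, y has n entries, t > 0 is the
    L1 budget and lam2 > 0 the ridge constant, as in the pseudo-code above.
    """
    n, p = len(X), len(X[0])
    # Row j of X2 is column j of X shifted by -y/t (first p rows) or +y/t (last p rows).
    minus_rows = [[X[i][j] - y[i] / t for i in range(n)] for j in range(p)]
    plus_rows = [[X[i][j] + y[i] / t for i in range(n)] for j in range(p)]
    X2 = minus_rows + plus_rows   # shape 2p x n
    Y2 = [1] * p + [-1] * p       # labels in {-1, +1}
    C = 1.0 / (2.0 * lam2)        # SVM regularization constant
    return X2, Y2, C


def sven_recover_beta(alpha, t):
    """Map SVM dual variables alpha (length 2p, not all zero) back to beta."""
    p = len(alpha) // 2
    total = sum(alpha)
    return [t * (alpha[j] - alpha[p + j]) / total for j in range(p)]


# Illustrative data (hypothetical): n = 3 samples, p = 2 covariates.
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
y = [1.0, 2.0, 3.0]
X2, Y2, C = sven_transform(X, y, t=2.0, lam2=0.25)
print(len(X2), len(X2[0]), Y2, C)

# alpha would normally come from an SVM solver; this vector is made up.
print(sven_recover_beta([3.0, 0.0, 1.0, 0.0], t=2.0))
```

Any linear SVM solver that returns the dual variables for <code>X2</code>, <code>Y2</code> and <code>C</code> could be placed between the two functions.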

== Software ==
* "Glmnet: Lasso and elastic-net regularized generalized linear models" is software implemented as an [[R (programming language)|R]] source package and as a [[MATLAB]] toolbox.<ref>{{cite journal|last=Friedman|first=Jerome |author2=Trevor Hastie |author3=Rob Tibshirani|date=2010|title=Regularization Paths for Generalized Linear Models via Coordinate Descent|journal=Journal of Statistical Software|volume=33 |issue=1 |pages=1–22|doi=10.18637/jss.v033.i01 |pmid=20808728 |pmc=2929880 }}</ref><ref>{{cite web|url=https://cran.r-project.org/web/packages/glmnet/index.html|title=CRAN - Package glmnet|work=r-project.org|access-date=13 June 2022}}</ref> It includes fast algorithms for estimating [[generalized linear model]]s with ℓ<sub>1</sub> (lasso), ℓ<sub>2</sub> (ridge regression) and mixed (elastic net) penalties, using cyclical [[coordinate descent]] computed along a regularization path.
* [[JMP (statistical software)|JMP]] includes elastic net regularization, using the Generalized Regression personality of Fit Model.
* "pensim: Simulation of high-dimensional data and parallelized repeated penalized regression" implements a parallelized "2D" tuning method for the ℓ parameters of the elastic net, a method claimed to improve prediction accuracy.<ref>{{Cite journal |last1=Waldron |first1=L. |last2=Pintilie |first2=M. |last3=Tsao |first3=M. -S. |last4=Shepherd |first4=F. A. |last5=Huttenhower |first5=C. |last6=Jurisica |first6=I. |doi=10.1093/bioinformatics/btr591 |title=Optimized application of penalized regression methods to diverse genomic data |journal=Bioinformatics |volume=27 |issue=24 |pages=3399–3406 |year=2011 |pmid=22156367 |pmc=3232376 }}</ref><ref>{{cite web |url=https://cran.r-project.org/web/packages/pensim/index.html |title=CRAN - Package pensim |work=r-project.org |access-date=2022-06-13 }}</ref>
* [[scikit-learn]] supports elastic net regularization for linear regression, [[logistic regression]] and linear [[support vector machine]]s.
* SVEN is a [[MATLAB]] implementation of Support Vector Elastic Net. The solver reduces the elastic net problem to an instance of SVM [[binary classification]] and uses a MATLAB SVM solver to find the solution. Because SVM is easily parallelizable, the code can be faster than Glmnet on modern hardware.<ref>{{cite web|url=https://bitbucket.org/mlcircus/sven|title=mlcircus / SVEN — Bitbucket|work=bitbucket.org|access-date=13 June 2022}}</ref>
* SpaSM is a [[MATLAB]] implementation of sparse linear regression, classification and [[principal component analysis]], including elastic net regularized regression.<ref>{{Cite journal|url = http://www.imm.dtu.dk/projects/spasm/references/spasm.pdf|title = SpaSM: A Matlab Toolbox for Sparse Statistical Modeling|last1 = Sjöstrand|first1 = Karl|date = 2 February 2016|journal = Journal of Statistical Software|last2 = Clemmensen|first2 = Line|last3 = Einarsson|first3 = Gudmundur|last4 = Larsen|first4 = Rasmus|last5 = Ersbøll|first5 = Bjarne}}</ref>
* [[Apache Spark]] supports elastic net regression in its machine learning library [http://spark.apache.org/mllib/ MLlib]. The method is available as a parameter of the more general LinearRegression class.<ref>{{Cite web|url=http://spark.apache.org/docs/1.6.1/api/python/pyspark.ml.html#pyspark.ml.regression.LinearRegression|title=pyspark.ml package — PySpark 1.6.1 documentation|website=spark.apache.org|access-date=2019-04-17}}</ref>
* [[SAS (software)|SAS]]: the SAS procedure Glmselect<ref>{{Cite web|url=http://support.sas.com/documentation/cdl/en/statug/66859/HTML/default/viewer.htm#statug_glmselect_examples06.htm|title=Proc Glmselect|access-date=2019-05-09}}</ref> supports elastic net regularization for model selection.

== References ==
{{脚注ヘルプ}}
{{Reflist}}

== Further reading ==
* {{cite book |first1=Trevor |last1=Hastie |first2=Robert |last2=Tibshirani |author-link2=Robert Tibshirani |first3=Jerome |last3=Friedman |author-link3=Jerome H. Friedman |title=The Elements of Statistical Learning: Data Mining, Inference, and Prediction |location=New York |publisher=Springer |edition=2nd |year=2017 |isbn=978-0-387-84857-0 |chapter=Shrinkage Methods |pages=61–79 |chapter-url=https://web.stanford.edu/~hastie/Papers/ESLII.pdf#page=80 }}

== External links ==
* [https://web.stanford.edu/~hastie/TALKS/enet_talk.pdf Regularization and Variable Selection via the Elastic Net] (presentation)

{{統計学}}

{{DEFAULTSORT:えらすていつくねつと}}
[[Category:回帰分析]]
[[Category:統計モデル]]
[[Category:数学に関する記事]]