疎行列のソースを表示

{| class="wikitable" width="240px" align="right" style="margin: 3px 0 5px 14px;"
|<center>'''疎行列の例'''</center><center><math>\left(\begin{smallmatrix}
11 & 22 & 0 & 0 & 0 & 0 & 0 \\
0 & 33 & 44 & 0 & 0 & 0 & 0 \\
0 & 0 & 55 & 66 & 77 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 88 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 99 \\
\end{smallmatrix}\right)</math>

上の疎行列には非ゼロ要素が9個しかなく、ゼロ要素は26個ある。スパース性は74%であり、密度は26%である。
</center>
|}
[[File:Finite_element_sparse_matrix.png|リンク=https://en.wikipedia.org/wiki/File:Finite_element_sparse_matrix.png|サムネイル|2次元の[[有限要素法|有限要素問題]]を説いた時に得られる疎行列。非ゼロ要素を黒で表している。]]
[[数値解析]]と[[計算科学]]の分野において、'''疎行列'''（そぎょうれつ、{{Lang-en|sparse matrix}}）または'''疎配列'''（{{Lang-en|sparse array|links=no}}）とは、成分のほとんどが零である[[行列]]のことをいう<ref name="Yan Wu Liu Gao 2017 p.">{{cite conference|last=Yan|first=Di|last2=Wu|first2=Tao|last3=Liu|first3=Ying|last4=Gao|first4=Yang|title=An efficient sparse-dense matrix multiplication on a multicore system|publisher=IEEE|year=2017|isbn=978-1-5090-3944-9|doi=10.1109/icct.2017.8359956|page=|quote=The computation kernel of DNN is large sparse-dense matrix multiplication. In the field of numerical analysis, a sparse matrix is a matrix populated primarily with zeros as elements of the table. By contrast, if the number of non-zero elements in a matrix is relatively large, then it is commonly considered a dense matrix. The fraction of zero elements (non-zero elements) in a matrix is called the sparsity (density). Operations using standard dense-matrix structures and algorithms are relatively slow and consume large amounts of memory when applied to large sparse matrices.}}</ref>。スパース行列とも言う。行列が'''疎である'''と判定するためのゼロの値を持つ要素の割合について厳密な定義はないが、一般的な条件としては、非ゼロ要素の数が行数または列数におおよそ近いものである。逆に、ほとんどの要素が非ゼロ要素である行列は、'''密な'''（dense）行列であると見なされる<ref name="Yan Wu Liu Gao 2017 p." />。行列のゼロ要素の数を要素数の合計で割った値を、行列の'''スパース性'''（sparsity）と呼ぶことがある。

概念的には、スパース性はペアワイズ相互作用をほとんど持たないシステムに対応する。たとえば、隣同士がバネで接続されたボールの線について考えると、各ボールは隣接するボールのみと組になっているため、これはスパースなシステムである。対称的に、同じボールの線でも、1つのボールが他のすべてのボールとバネでつながっている場合、このシステムは密行列と対応する。スパース性の概念は、[[組み合わせ論|組み合せ論]]や、通常、重要なデータや接続の密度が低くなる[[ネットワーク理論]]・[[数値解析]]などの応用領域で役に立つ。巨大な疎行列は、[[偏微分方程式]]を解くときに[[科学]]や[[工学]]のアプリケーションによく現れる。

[[コンピュータ|コンピューター]]上で疎行列の保存や操作を行うときには、行列のスパースな構造を利用した特別な[[アルゴリズム]]と[[データ構造]]を使用することが有益であり、多くの場合には必要になる。機械学習の分野では疎行列がよく用いられるため<ref>{{Cite press|title=Argonne National Laboratory Deploys Cerebras CS-1, the World’s Fastest Artificial Intelligence Computer {{!}} Argonne National Laboratory|url=https://www.anl.gov/article/argonne-national-laboratory-deploys-cerebras-cs1-the-worlds-fastest-artificial-intelligence-computer|language=en|quote=The WSE is the largest chip ever made at 46,225 square millimeters in area, it is 56.7 times larger than the largest graphics processing unit. It contains 78 times more AI optimized compute cores, 3,000 times more high speed, on-chip memory, 10,000 times more memory bandwidth, and 33,000 times more communication bandwidth.|website=www.anl.gov|access-date=2019-12-02}}</ref>、疎行列に特化したコンピューターも作られている<ref>{{Cite web|url=https://www.businesswire.com/news/home/20190819005148/en/Cerebras-Systems-Unveils-Industry%E2%80%99s-Trillion-Transistor-Chip|title=Cerebras Systems Unveils the Industry's First Trillion Transistor Chip|quote=The WSE contains 400,000 AI-optimized compute cores. Called SLAC™ for Sparse Linear Algebra Cores, the compute cores are flexible, programmable, and optimized for the sparse linear algebra that underpins all neural network computation|date=2019-08-19|website=www.businesswire.com|language=en|access-date=2019-12-02}}</ref>。標準的な密行列の構造とアルゴリズムを対象とする操作は、巨大な疎行列に適用する場合には処理と[[メモリ]]がゼロ値で無駄になり、遅くて非効率である。スパースなデータは本質的により簡単に[[データ圧縮|圧縮]]されるため、必要な[[ストレージ]]が非常に小さくなる。非常に巨大な疎行列に対しては、標準的な密行列で使用する操作を適用することができる場合もある。

[[有限差分法]]、ある[[有限体積法]]、[[有限要素法]]などで離散化された[[偏微分方程式]]は、一般に疎行列を係数行列とした[[連立一次方程式]]となる。

[[数値解析]]の分野では、疎行列を前提とした解法が多い。疎行列の非零要素だけを工夫してうまく格納することにより、大次元の問題を扱うことが容易になる。また、たとえば比較的少ない手間でベクトルと行列の積を計算できるなどの利点がある。[[ランダムアクセス|ランダムメモリアクセス]]を多用する疎行列を用いた計算処理は[[ベクトル計算機|ベクトルプロセッサ]]が得意としており、一般的な[[スカラー計算機|スカラ型CPU]]や[[GPGPU]]では未だに苦手とする処理である<ref>{{Cite web|和書|title=プロセッサ開発のセンス ～第4回 ベクトル・プロセッサ～ {{!}} 株式会社エヌエスアイテクス （NSITEXE,Inc.） |url=https://www.nsitexe.com/blog/sense_of_processor_development_004/ |date=2023-02-22 |access-date=2023-06-18 |language=ja}}</ref><ref group="注釈">疎行列にアクセスする際に行われる、巨大な配列データを大域的にインデックス参照で引いてくるランダムメモリアクセスを多用する操作は、一般的なスカラ型の[[CPU]]やGPGPUにとっては苦手な処理である。</ref>。

==格納形式==
行列は、典型的には2次元の配列に格納される。配列の各要素は、行列の要素{{math|''a''<sub>''i'',''j''</sub>}}を表し、2つの[[配列|インデックス]]{{math|''i''}}と{{math|''j''}}を用いてアクセスされる。慣習として、{{math|''i''}}は上から下に数えた行のインデックスを指し、{{math|''j''}}は左から右に数えた列のインデックスを指す。{{math|''m'' × ''n''}}行列の場合、このフォーマットで行列を格納するのに必要なメモリ量は、{{math|''m'' × ''n''}}に比例する（忘れられることが多いが、行列の次元も格納する必要がある）。

疎行列の場合、非ゼロ要素のみを保存することで、必要メモリ容量の大幅な削減が実現できる。非ゼロ要素の数と分散によって、異なるデータ構造を利用することで、基本的なアプローチに比べてメモリ量の大幅な節約が可能になる。トレードオフは、各要素へのアクセスがより複雑になり、オリジナルの行列を曖昧さなく復元できるようにするために追加の構造が必要になることである。

このため疎行列を格納するための様々な形式（[[ファイルフォーマット|フォーマット]]）が存在する。

フォーマットは2つのグループに分けられる。

* 効率的な編集をサポートするフォーマット
** DOK（Dictionary of keys）
** LIL（List of lists）
** [[疎行列#%E5%BA%A7%E6%A8%99|COO]]
* 効率的なアクセスと行列操作をサポートするフォーマット
** [[疎行列#%E5%9C%A7%E7%B8%AE%E8%A1%8C%E6%A0%BC%E7%B4%8D|CSR]]
** [[疎行列#%E5%9C%A7%E7%B8%AE%E5%88%97%E6%A0%BC%E7%B4%8D|CSC]]
** [[疎行列#%E3%83%96%E3%83%AD%E3%83%83%E3%82%AF%E5%9C%A7%E7%B8%AE%E8%A1%8C%E6%A0%BC%E7%B4%8D|BSR]]: ブロック疎行列（Block Sparse matrix）向け

以下の名称は、[[Netlib]]で使われているもの<ref>[http://netlib.org/linalg/html_templates/node90.html Survey of Sparse Matrix Storage Formats]</ref>や[[Intel Math Kernel Library]]<ref>[https://software.intel.com/en-us/articles/intel-mkl-sparse-blas-overview Intel® MKL Sparse BLAS Overview | Intel® Developer Zone]</ref>、[[SciPy]]で使われているものに基づく。例として次の疎行列Aを考える。

<math>
\begin{bmatrix}
 1 & 2 & 3 & 0\\
 0 & 0 & 0 & 1\\
 2 & 0 & 0 & 2\\
 0 & 0 & 0 & 1\\
\end{bmatrix}
</math>

=== Dictionary of Key ===
Dictionary of Key（'''DOK'''）は、(行, 列) をキーにして[[連想配列]]に入れる方式である。

=== リストのリスト ===
リストのリスト（{{lang-en-short|List of lists}}, '''LIL'''）は、行ごとにリストを作り、そのリストの中に (列, 値) のタプルを入れる方式である。

=== 座標 ===
座標（{{lang-en-short|Coordinate}}, '''COO'''）形式は [値, 行インデックス, 列インデックス] タプルの集合で行列を表現する方式である<ref>"scipy.sparse.coo_matrix ... A sparse matrix in COOrdinate format." ''[https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.coo_matrix.html scipy.sparse.coo_matrix]''. scipy. 2022-03-05閲覧.</ref>。

行列Aの要素を座標（インデックス）とともに並べると次のようになる。

 A  = [1 2 3 0 0 0 0 1 2 0 0 2 0 0 0 1] # 値
 IA = [1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4] # 行インデックス
 JA = [1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4] # 列インデックス

ここで「存在しない値をゼロ要素とする」と定めるとゼロ要素をすべて削除できる。これにより得られる、

 A  = [1 2 3 1 2 2 1] # 値
 IA = [1 1 1 2 3 3 4] # 行インデックス
 JA = [1 2 3 4 1 4 4] # 列インデックス

が疎行列AのCOO形式による表現である。

COO行列のゼロ要素を非ゼロに編集したい場合、後ろに非ゼロタプルを追加するだけでよいため編集効率が良い。

=== 圧縮行格納 ===
圧縮行格納（{{lang-en-short|Compressed Row Storage}}, CRS）形式は行インデックス配列を圧縮する方式である。別名は Compressed Sparse Row（'''CSR'''）形式<ref>"scipy.sparse.csr_matrix ... Compressed Sparse Row matrix" ''[https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html scipy.sparse.csr_matrix]''. scipy. 2022-03-05閲覧.</ref>。

CSR方式ではまず2次元の行列を行方向に並べる。次に「存在しない値をゼロ要素とする」と定め、ゼロ要素をすべて削除する。この段階で行・列インデックスとともに並べると次のようになる。

 data = [1 2 3 1 2 2 1] # 値
 IA   = [1 1 1 2 3 3 4] # 行インデックス
 JA   = [1 2 3 4 1 4 4] # 列インデックス

ここで行インデックス配列（<code>IA</code>）に着目する。現在は各要素が明示的に行インデックスを持っているが、行の切れ目さえわかっていれば、これは自動的に導ける。例えば <code>IA[1] = IA[2] = IA[3] = 1</code>であるが、「1行目は1要素目から、2行目は4要素目から」とわかっていれば、<code>IA[1:4]=[1 1 1]</code>を即座に導ける。これはCSR方式が行ごとに並べたうえでゼロ要素を削除する規則に由来している。

この行インデックス表現は行headポインタの配列と見なせる。すなわち<code>indptr = [ptr_row_1 ptr_row2 ...]</code>である。インデックスを直接示す配列は列インデックス配列<code>JA</code>のみになったので、これを<code>indices</code>と改名する<ref name=":0">"csr_matrix((data, indices, indptr) ... is the standard CSR representation where the column indices for row i are stored in <code>indices[indptr[i]:indptr[i+1]]</code> and their corresponding values are stored in <code>data[indptr[i]:indptr[i+1]]</code>." ''[https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html scipy.sparse.csr_matrix]''. scipy. 2022-03-05閲覧.</ref>。これにより得られる、

 data    = [1 2 3 1 2 2 1] # 値
 indices = [1 2 3 4 1 4 4] # 列インデックス
 indptr  = [1 4 5 7]       # 行Headポインタ

が疎行列AのCSR形式による表現である。

CSR形式は行へのアクセスに優れている<ref>"Advantages of the CSR format ... efficient row slicing" ''[https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html scipy.sparse.csr_matrix]''. scipy. 2022-03-05閲覧.</ref>。1行目にアクセスする場合、データを <code>data[indptr[1]:indptr[2]]</code> で取得し列インデックスを <code>indices[indptr[1]:indptr[2]]</code> で取得できる（行インデックスは当然<code>1</code>）<ref name=":0" />。対照的にCOO形式であればまず行インデックス配列<code>IA</code>を全長走査し<code>IA[k] == 1</code>に該当する要素番号<code>k</code>をリストアップし、そのうえで<code>data[k], indices[k]</code>をによるアクセスを全<code>k</code>に関しておこなう必要がある。

対照的に、CSR形式は列へのアクセスに劣る<ref>"Disadvantages of the CSR format    slow column slicing operations" ''[https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html scipy.sparse.csr_matrix]''. scipy. 2022-03-05閲覧.</ref>。1列目にアクセスする場合、<code>indices</code>を全長走査し<code>indices[k] == 1</code>に該当する要素番号<code>k</code>をリストアップしたのち、行インデックスを得るために<code>indptr</code>を走査して各<code>k</code>に大して<code>indptr[n] <= k <indptr[n+1]</code>を満たす<code>n</code>を見つける必要がある。

=== 圧縮列格納 ===
圧縮列格納（{{lang-en-short|Compressed Column Storage}}, CCS）形式はCRSを列単位にしたものである。別名は Compressed Sparse Column ('''CSC''') 形式<ref>"scipy.sparse.csc_matrix ... Compressed Sparse Column matrix" ''[https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csc_matrix.html scipy.sparse.csc_matrix]''. scipy. 2022-03-05閲覧. </ref>。

=== 圧縮対角格納 ===
圧縮対角格納（Compressed Diagonal Storage、CDS）形式やDiagonal（DIA）は、CRS・CSR を対角行列単位にしたものである。

=== スカイライン格納方式（SKS、SKY） ===
三角行列のために用いられる。

=== ブロック圧縮行格納 ===
ブロック圧縮行格納（{{lang-en-short|Block Compressed Row Storage}}, BCRS）形式は[[疎行列#%E5%9C%A7%E7%B8%AE%E8%A1%8C%E6%A0%BC%E7%B4%8D|CRS]]をブロック単位にしたものである。別名は Block Sparse Row ('''BSR''') 形式<ref>"scipy.sparse.bsr_matrix ... Block Sparse Row matrix" ''[https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.bsr_matrix.html scipy.sparse.bsr_matrix]''. 2022-03-05閲覧.</ref>。

== 関連項目 ==
* [[四分木]]

== 外部リンク ==
* [https://www.jsces.org/activity/journal/web_vol25no2_tutorial.pdf 緒方隆盛:「疎行列直接法ソルバ入門」]
* [https://link.springer.com/book/10.1007/978-3-031-25820-6  Jennifer Scott and Miroslav Tuma:　"Algorithms for Sparse Linear Systems", Birkhauser, (2023)], {{doi|10.1007/978-3-031-25820-6}}　(Open Access Book)

== 脚注 ==
{{脚注ヘルプ}}
=== 注釈 ===
{{Notelist}}
=== 出典 ===
{{reflist}}

{{Linear algebra}}

{{DEFAULTSORT:そきようれつ}}
[[Category:行列]]
[[Category:数値線形代数]]
[[Category:数学に関する記事]]