cnn-without-any-downsampling

A project demonstrating that downsampling (and upsampling) in CNNs is not necessary.

Created at: 2019-11-30 22:19:52
Language: Python
License: MIT

How to use this code

This repository requires tensorflow-gpu-1.5.0 or another compatible version of TensorFlow.

You can run it simply by typing:

python cifar10.py

Does a CNN really need downsampling (upsampling)?

In typical convolutional neural networks, sampling is nearly ubiquitous: it used to be max pooling, and nowadays it is strided convolution. Take the VGG network as an example; it uses a great deal of max pooling.

(Figure: the VGG architecture)

As the figure shows, many 2x2 pooling layers are used throughout the network, starting from the input side. And when doing semantic segmentation or object detection, quite a lot of upsampling or transposed convolution is used as well.

(Figure: a typical FCN architecture)

A typical FCN structure; note the deconvolutions marked in red. Earlier, classification networks used fully connected (fc) layers in their last few layers. It was later found that fc layers have too many parameters and generalize poorly, so they were replaced by global average pooling, which first appeared in Network in Network.

(Figure: global average pooling)

GAP aggregates each spatial feature map directly into a scalar. Since then, the classification network paradigm has worked as follows (ReLU is folded into the conv and deconv layers, and shortcuts are not shown):
Input-->Conv-->DownSample_x_2-->Conv-->DownSample_x_2-->Conv-->DownSample_x_2-->GAP-->Conv1x1-->Softmax-->Output

And the semantic segmentation network paradigm works as follows:

Input-->Conv-->DownSample_x_2-->Conv-->DownSample_x_2-->Conv-->DownSample_x_2-->Deconv_x_2-->Deconv_x_2-->Deconv_x_2-->Softmax-->Output
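
For concreteness, here is a minimal Keras sketch of the conventional classification paradigm above; the stride-2 convolutions play the role of DownSample_x_2, and the layer widths are illustrative assumptions, not taken from this repository.

import tensorflow as tf
from tensorflow.keras import layers, models

def classic_paradigm(input_shape=(32, 32, 3), num_classes=10):
    inputs = layers.Input(shape=input_shape)
    x = inputs
    for filters in [16, 32, 64]:  # illustrative widths
        x = layers.Conv2D(filters, 3, padding='same', activation='relu')(x)
        # DownSample_x_2: a stride-2 conv halves the spatial resolution
        x = layers.Conv2D(filters, 3, strides=2, padding='same',
                          activation='relu')(x)
    x = layers.GlobalAveragePooling2D()(x)  # GAP
    x = layers.Dense(num_classes)(x)        # Conv1x1 on a 1x1 map == Dense
    outputs = layers.Activation('softmax')(x)
    return models.Model(inputs, outputs)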

But we should stop and think: are downsampling and upsampling really necessary? Can't they be removed?

On the CIFAR-10 classification task, I tried removing all downsampling and replacing the convolutions with dilated convolutions, with the dilation rate increasing layer by layer. The model structure is shown below.

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d (Conv2D)              (None, 32, 32, 16)        448
_________________________________________________________________
batch_normalization (BatchNo (None, 32, 32, 16)        64
_________________________________________________________________
activation (Activation)      (None, 32, 32, 16)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 32, 32, 24)        3480
_________________________________________________________________
batch_normalization_1 (Batch (None, 32, 32, 24)        96
_________________________________________________________________
activation_1 (Activation)    (None, 32, 32, 24)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 32, 32, 32)        6944
_________________________________________________________________
batch_normalization_2 (Batch (None, 32, 32, 32)        128
_________________________________________________________________
activation_2 (Activation)    (None, 32, 32, 32)        0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 32, 32, 48)        13872
_________________________________________________________________
batch_normalization_3 (Batch (None, 32, 32, 48)        192
_________________________________________________________________
activation_3 (Activation)    (None, 32, 32, 48)        0
_________________________________________________________________
global_average_pooling2d (Gl (None, 48)                0
_________________________________________________________________
dense (Dense)                (None, 10)                490
_________________________________________________________________
activation_4 (Activation)    (None, 10)                0
=================================================================
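
A minimal Keras sketch that reproduces this structure: the filter counts and 3x3 kernels are read off the summary above, while the dilation rates 1, 2, 4, 8 are an assumption consistent with "the dilation rate increasing layer by layer".

import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(input_shape=(32, 32, 3), num_classes=10):
    inputs = layers.Input(shape=input_shape)
    x = inputs
    # Four 3x3 conv blocks; the feature map stays 32x32 throughout because
    # padding='same' is used and there is no stride or pooling anywhere.
    for filters, rate in zip([16, 24, 32, 48], [1, 2, 4, 8]):
        x = layers.Conv2D(filters, 3, padding='same', dilation_rate=rate)(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation('relu')(x)
    x = layers.GlobalAveragePooling2D()(x)  # collapse 32x32 to one vector
    x = layers.Dense(num_classes)(x)
    outputs = layers.Activation('softmax')(x)
    return models.Model(inputs, outputs)

model = build_model()
model.summary()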

After training for 80 epochs, I obtained the following classification results:

Epoch   loss     val_accuracy
10      0.9200   0.6346
20      0.7925   0.6769
30      0.7293   0.7193
40      0.6737   0.7479
50      0.6516   0.7470
60      0.6311   0.7678
70      0.6085   0.7478
80      0.5865   0.7665

The accuracy curve on the validation set is shown below.

(Figure: validation accuracy curve)

The final accuracy reached 76%; a plain convolutional network with the same number of parameters achieves essentially similar accuracy. Below is the structure of a residual variant of the same idea; note that it, too, contains no downsampling (every layer keeps the 32x32 resolution).

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            [(None, 32, 32, 3)]  0
__________________________________________________________________________________________________
conv2d (Conv2D)                 (None, 32, 32, 16)   448         input_1[0][0]
__________________________________________________________________________________________________
batch_normalization (BatchNorma (None, 32, 32, 16)   64          conv2d[0][0]
__________________________________________________________________________________________________
activation (Activation)         (None, 32, 32, 16)   0           batch_normalization[0][0]
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 32, 32, 16)   2320        activation[0][0]
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 32, 32, 16)   64          conv2d_1[0][0]
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 32, 32, 16)   0           batch_normalization_1[0][0]
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 32, 32, 16)   2320        activation_1[0][0]
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 32, 32, 16)   64          conv2d_2[0][0]
__________________________________________________________________________________________________
add (Add)                       (None, 32, 32, 16)   0           activation[0][0]
                                                                 batch_normalization_2[0][0]
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 32, 32, 16)   0           add[0][0]
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 32, 32, 16)   2320        activation_2[0][0]
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 32, 32, 16)   64          conv2d_3[0][0]
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 32, 32, 16)   0           batch_normalization_3[0][0]
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 32, 32, 16)   2320        activation_3[0][0]
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 32, 32, 16)   64          conv2d_4[0][0]
__________________________________________________________________________________________________
add_1 (Add)                     (None, 32, 32, 16)   0           activation_2[0][0]
                                                                 batch_normalization_4[0][0]
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 32, 32, 16)   0           add_1[0][0]
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 32, 32, 16)   2320        activation_4[0][0]
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 32, 32, 16)   64          conv2d_5[0][0]
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 32, 32, 16)   0           batch_normalization_5[0][0]
__________________________________________________________________________________________________
conv2d_6 (Conv2D)               (None, 32, 32, 16)   2320        activation_5[0][0]
__________________________________________________________________________________________________
batch_normalization_6 (BatchNor (None, 32, 32, 16)   64          conv2d_6[0][0]
__________________________________________________________________________________________________
add_2 (Add)                     (None, 32, 32, 16)   0           activation_4[0][0]
                                                                 batch_normalization_6[0][0]
__________________________________________________________________________________________________
activation_6 (Activation)       (None, 32, 32, 16)   0           add_2[0][0]
__________________________________________________________________________________________________
conv2d_7 (Conv2D)               (None, 32, 32, 16)   2320        activation_6[0][0]
__________________________________________________________________________________________________
batch_normalization_7 (BatchNor (None, 32, 32, 16)   64          conv2d_7[0][0]
__________________________________________________________________________________________________
activation_7 (Activation)       (None, 32, 32, 16)   0           batch_normalization_7[0][0]
__________________________________________________________________________________________________
conv2d_8 (Conv2D)               (None, 32, 32, 16)   2320        activation_7[0][0]
__________________________________________________________________________________________________
batch_normalization_8 (BatchNor (None, 32, 32, 16)   64          conv2d_8[0][0]
__________________________________________________________________________________________________
add_3 (Add)                     (None, 32, 32, 16)   0           activation_6[0][0]
                                                                 batch_normalization_8[0][0]
__________________________________________________________________________________________________
activation_8 (Activation)       (None, 32, 32, 16)   0           add_3[0][0]
__________________________________________________________________________________________________
conv2d_9 (Conv2D)               (None, 32, 32, 16)   2320        activation_8[0][0]
__________________________________________________________________________________________________
batch_normalization_9 (BatchNor (None, 32, 32, 16)   64          conv2d_9[0][0]
__________________________________________________________________________________________________
activation_9 (Activation)       (None, 32, 32, 16)   0           batch_normalization_9[0][0]
__________________________________________________________________________________________________
conv2d_10 (Conv2D)              (None, 32, 32, 16)   2320        activation_9[0][0]
__________________________________________________________________________________________________
batch_normalization_10 (BatchNo (None, 32, 32, 16)   64          conv2d_10[0][0]
__________________________________________________________________________________________________
add_4 (Add)                     (None, 32, 32, 16)   0           activation_8[0][0]
                                                                 batch_normalization_10[0][0]
__________________________________________________________________________________________________
activation_10 (Activation)      (None, 32, 32, 16)   0           add_4[0][0]
__________________________________________________________________________________________________
conv2d_11 (Conv2D)              (None, 32, 32, 16)   2320        activation_10[0][0]
__________________________________________________________________________________________________
batch_normalization_11 (BatchNo (None, 32, 32, 16)   64          conv2d_11[0][0]
__________________________________________________________________________________________________
activation_11 (Activation)      (None, 32, 32, 16)   0           batch_normalization_11[0][0]
__________________________________________________________________________________________________
conv2d_12 (Conv2D)              (None, 32, 32, 16)   2320        activation_11[0][0]
__________________________________________________________________________________________________
batch_normalization_12 (BatchNo (None, 32, 32, 16)   64          conv2d_12[0][0]
__________________________________________________________________________________________________
add_5 (Add)                     (None, 32, 32, 16)   0           activation_10[0][0]
                                                                 batch_normalization_12[0][0]
__________________________________________________________________________________________________
activation_12 (Activation)      (None, 32, 32, 16)   0           add_5[0][0]
__________________________________________________________________________________________________
conv2d_13 (Conv2D)              (None, 32, 32, 32)   4640        activation_12[0][0]
__________________________________________________________________________________________________
batch_normalization_13 (BatchNo (None, 32, 32, 32)   128         conv2d_13[0][0]
__________________________________________________________________________________________________
activation_13 (Activation)      (None, 32, 32, 32)   0           batch_normalization_13[0][0]
__________________________________________________________________________________________________
conv2d_14 (Conv2D)              (None, 32, 32, 32)   9248        activation_13[0][0]
__________________________________________________________________________________________________
conv2d_15 (Conv2D)              (None, 32, 32, 32)   4640        activation_12[0][0]
__________________________________________________________________________________________________
batch_normalization_14 (BatchNo (None, 32, 32, 32)   128         conv2d_14[0][0]
__________________________________________________________________________________________________
add_6 (Add)                     (None, 32, 32, 32)   0           conv2d_15[0][0]
                                                                 batch_normalization_14[0][0]
__________________________________________________________________________________________________
activation_14 (Activation)      (None, 32, 32, 32)   0           add_6[0][0]
__________________________________________________________________________________________________
conv2d_16 (Conv2D)              (None, 32, 32, 32)   9248        activation_14[0][0]
__________________________________________________________________________________________________
batch_normalization_15 (BatchNo (None, 32, 32, 32)   128         conv2d_16[0][0]
__________________________________________________________________________________________________
activation_15 (Activation)      (None, 32, 32, 32)   0           batch_normalization_15[0][0]
__________________________________________________________________________________________________
conv2d_17 (Conv2D)              (None, 32, 32, 32)   9248        activation_15[0][0]
__________________________________________________________________________________________________
batch_normalization_16 (BatchNo (None, 32, 32, 32)   128         conv2d_17[0][0]
__________________________________________________________________________________________________
add_7 (Add)                     (None, 32, 32, 32)   0           activation_14[0][0]
                                                                 batch_normalization_16[0][0]
__________________________________________________________________________________________________
activation_16 (Activation)      (None, 32, 32, 32)   0           add_7[0][0]
__________________________________________________________________________________________________
conv2d_18 (Conv2D)              (None, 32, 32, 32)   9248        activation_16[0][0]
__________________________________________________________________________________________________
batch_normalization_17 (BatchNo (None, 32, 32, 32)   128         conv2d_18[0][0]
__________________________________________________________________________________________________
activation_17 (Activation)      (None, 32, 32, 32)   0           batch_normalization_17[0][0]
__________________________________________________________________________________________________
conv2d_19 (Conv2D)              (None, 32, 32, 32)   9248        activation_17[0][0]
__________________________________________________________________________________________________
batch_normalization_18 (BatchNo (None, 32, 32, 32)   128         conv2d_19[0][0]
__________________________________________________________________________________________________
add_8 (Add)                     (None, 32, 32, 32)   0           activation_16[0][0]
                                                                 batch_normalization_18[0][0]
__________________________________________________________________________________________________
activation_18 (Activation)      (None, 32, 32, 32)   0           add_8[0][0]
__________________________________________________________________________________________________
conv2d_20 (Conv2D)              (None, 32, 32, 32)   9248        activation_18[0][0]
__________________________________________________________________________________________________
batch_normalization_19 (BatchNo (None, 32, 32, 32)   128         conv2d_20[0][0]
__________________________________________________________________________________________________
activation_19 (Activation)      (None, 32, 32, 32)   0           batch_normalization_19[0][0]
__________________________________________________________________________________________________
conv2d_21 (Conv2D)              (None, 32, 32, 32)   9248        activation_19[0][0]
__________________________________________________________________________________________________
batch_normalization_20 (BatchNo (None, 32, 32, 32)   128         conv2d_21[0][0]
__________________________________________________________________________________________________
add_9 (Add)                     (None, 32, 32, 32)   0           activation_18[0][0]
                                                                 batch_normalization_20[0][0]
__________________________________________________________________________________________________
activation_20 (Activation)      (None, 32, 32, 32)   0           add_9[0][0]
__________________________________________________________________________________________________
conv2d_22 (Conv2D)              (None, 32, 32, 32)   9248        activation_20[0][0]
__________________________________________________________________________________________________
batch_normalization_21 (BatchNo (None, 32, 32, 32)   128         conv2d_22[0][0]
__________________________________________________________________________________________________
activation_21 (Activation)      (None, 32, 32, 32)   0           batch_normalization_21[0][0]
__________________________________________________________________________________________________
conv2d_23 (Conv2D)              (None, 32, 32, 32)   9248        activation_21[0][0]
__________________________________________________________________________________________________
batch_normalization_22 (BatchNo (None, 32, 32, 32)   128         conv2d_23[0][0]
__________________________________________________________________________________________________
add_10 (Add)                    (None, 32, 32, 32)   0           activation_20[0][0]
                                                                 batch_normalization_22[0][0]
__________________________________________________________________________________________________
activation_22 (Activation)      (None, 32, 32, 32)   0           add_10[0][0]
__________________________________________________________________________________________________
conv2d_24 (Conv2D)              (None, 32, 32, 32)   9248        activation_22[0][0]
__________________________________________________________________________________________________
batch_normalization_23 (BatchNo (None, 32, 32, 32)   128         conv2d_24[0][0]
__________________________________________________________________________________________________
activation_23 (Activation)      (None, 32, 32, 32)   0           batch_normalization_23[0][0]
__________________________________________________________________________________________________
conv2d_25 (Conv2D)              (None, 32, 32, 32)   9248        activation_23[0][0]
__________________________________________________________________________________________________
batch_normalization_24 (BatchNo (None, 32, 32, 32)   128         conv2d_25[0][0]
__________________________________________________________________________________________________
add_11 (Add)                    (None, 32, 32, 32)   0           activation_22[0][0]
                                                                 batch_normalization_24[0][0]
__________________________________________________________________________________________________
activation_24 (Activation)      (None, 32, 32, 32)   0           add_11[0][0]
__________________________________________________________________________________________________
conv2d_26 (Conv2D)              (None, 32, 32, 64)   18496       activation_24[0][0]
__________________________________________________________________________________________________
batch_normalization_25 (BatchNo (None, 32, 32, 64)   256         conv2d_26[0][0]
__________________________________________________________________________________________________
activation_25 (Activation)      (None, 32, 32, 64)   0           batch_normalization_25[0][0]
__________________________________________________________________________________________________
conv2d_27 (Conv2D)              (None, 32, 32, 64)   36928       activation_25[0][0]
__________________________________________________________________________________________________
conv2d_28 (Conv2D)              (None, 32, 32, 64)   18496       activation_24[0][0]
__________________________________________________________________________________________________
batch_normalization_26 (BatchNo (None, 32, 32, 64)   256         conv2d_27[0][0]
__________________________________________________________________________________________________
add_12 (Add)                    (None, 32, 32, 64)   0           conv2d_28[0][0]
                                                                 batch_normalization_26[0][0]
__________________________________________________________________________________________________
activation_26 (Activation)      (None, 32, 32, 64)   0           add_12[0][0]
__________________________________________________________________________________________________
conv2d_29 (Conv2D)              (None, 32, 32, 64)   36928       activation_26[0][0]
__________________________________________________________________________________________________
batch_normalization_27 (BatchNo (None, 32, 32, 64)   256         conv2d_29[0][0]
__________________________________________________________________________________________________
activation_27 (Activation)      (None, 32, 32, 64)   0           batch_normalization_27[0][0]
__________________________________________________________________________________________________
conv2d_30 (Conv2D)              (None, 32, 32, 64)   36928       activation_27[0][0]
__________________________________________________________________________________________________
batch_normalization_28 (BatchNo (None, 32, 32, 64)   256         conv2d_30[0][0]
__________________________________________________________________________________________________
add_13 (Add)                    (None, 32, 32, 64)   0           activation_26[0][0]
                                                                 batch_normalization_28[0][0]
__________________________________________________________________________________________________
activation_28 (Activation)      (None, 32, 32, 64)   0           add_13[0][0]
__________________________________________________________________________________________________
conv2d_31 (Conv2D)              (None, 32, 32, 64)   36928       activation_28[0][0]
__________________________________________________________________________________________________
batch_normalization_29 (BatchNo (None, 32, 32, 64)   256         conv2d_31[0][0]
__________________________________________________________________________________________________
activation_29 (Activation)      (None, 32, 32, 64)   0           batch_normalization_29[0][0]
__________________________________________________________________________________________________
conv2d_32 (Conv2D)              (None, 32, 32, 64)   36928       activation_29[0][0]
__________________________________________________________________________________________________
batch_normalization_30 (BatchNo (None, 32, 32, 64)   256         conv2d_32[0][0]
__________________________________________________________________________________________________
add_14 (Add)                    (None, 32, 32, 64)   0           activation_28[0][0]
                                                                 batch_normalization_30[0][0]
__________________________________________________________________________________________________
activation_30 (Activation)      (None, 32, 32, 64)   0           add_14[0][0]
__________________________________________________________________________________________________
conv2d_33 (Conv2D)              (None, 32, 32, 64)   36928       activation_30[0][0]
__________________________________________________________________________________________________
batch_normalization_31 (BatchNo (None, 32, 32, 64)   256         conv2d_33[0][0]
__________________________________________________________________________________________________
activation_31 (Activation)      (None, 32, 32, 64)   0           batch_normalization_31[0][0]
__________________________________________________________________________________________________
conv2d_34 (Conv2D)              (None, 32, 32, 64)   36928       activation_31[0][0]
__________________________________________________________________________________________________
batch_normalization_32 (BatchNo (None, 32, 32, 64)   256         conv2d_34[0][0]
__________________________________________________________________________________________________
add_15 (Add)                    (None, 32, 32, 64)   0           activation_30[0][0]
                                                                 batch_normalization_32[0][0]
__________________________________________________________________________________________________
activation_32 (Activation)      (None, 32, 32, 64)   0           add_15[0][0]
__________________________________________________________________________________________________
conv2d_35 (Conv2D)              (None, 32, 32, 64)   36928       activation_32[0][0]
__________________________________________________________________________________________________
batch_normalization_33 (BatchNo (None, 32, 32, 64)   256         conv2d_35[0][0]
__________________________________________________________________________________________________
activation_33 (Activation)      (None, 32, 32, 64)   0           batch_normalization_33[0][0]
__________________________________________________________________________________________________
conv2d_36 (Conv2D)              (None, 32, 32, 64)   36928       activation_33[0][0]
__________________________________________________________________________________________________
batch_normalization_34 (BatchNo (None, 32, 32, 64)   256         conv2d_36[0][0]
__________________________________________________________________________________________________
add_16 (Add)                    (None, 32, 32, 64)   0           activation_32[0][0]
                                                                 batch_normalization_34[0][0]
__________________________________________________________________________________________________
activation_34 (Activation)      (None, 32, 32, 64)   0           add_16[0][0]
__________________________________________________________________________________________________
conv2d_37 (Conv2D)              (None, 32, 32, 64)   36928       activation_34[0][0]
__________________________________________________________________________________________________
batch_normalization_35 (BatchNo (None, 32, 32, 64)   256         conv2d_37[0][0]
__________________________________________________________________________________________________
activation_35 (Activation)      (None, 32, 32, 64)   0           batch_normalization_35[0][0]
__________________________________________________________________________________________________
conv2d_38 (Conv2D)              (None, 32, 32, 64)   36928       activation_35[0][0]
__________________________________________________________________________________________________
batch_normalization_36 (BatchNo (None, 32, 32, 64)   256         conv2d_38[0][0]
__________________________________________________________________________________________________
add_17 (Add)                    (None, 32, 32, 64)   0           activation_34[0][0]
                                                                 batch_normalization_36[0][0]
__________________________________________________________________________________________________
activation_36 (Activation)      (None, 32, 32, 64)   0           add_17[0][0]
__________________________________________________________________________________________________
global_average_pooling2d (Globa (None, 64)           0           activation_36[0][0]
__________________________________________________________________________________________________
flatten (Flatten)               (None, 64)           0           global_average_pooling2d[0][0]
__________________________________________________________________________________________________
dense (Dense)                   (None, 10)           650         flatten[0][0]
==================================================================================================

This makes us ask: is sampling really necessary? From an engineering standpoint, of course, sampling drastically shrinks the feature maps and therefore the amount of computation. But as far as this experiment goes, sampling does not help the performance of a convolutional neural network. Max pooling is useful because it suppresses noise, yet max pooling, too, can be implemented without downsampling, just like a traditional median filter.

(Figure: median filtering)

A typical median filtering result, here with a kernel size of 20; note that the output image keeps the same size. Max pooling is similar to median filtering in that both can be used to suppress noise.
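
To make this concrete, here is a small sketch (not from this repository's code) showing that max pooling with stride 1 keeps the spatial size unchanged, so its noise-suppressing effect does not require downsampling:

import tensorflow as tf

x = tf.random.normal([1, 32, 32, 16])  # an NHWC feature map
# A 2x2 max pool with stride 1 and SAME padding: pooling without downsampling.
pooled = tf.nn.max_pool2d(x, ksize=2, strides=1, padding='SAME')
print(pooled.shape)  # (1, 32, 32, 16) -- same spatial size as the input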

This also suggests that each convolutional layer encodes spatial correlations: shallow layers encode short-range correlations, while deeper layers encode longer-range ones. Beyond a certain level there is no spatial correlation left in the statistical sense (where that level lies depends on the size of the meaningful objects in the image), and at that layer GAP can be used to aggregate the spatial features.
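
One way to see when that level is reached is to track the receptive field; the short sketch below computes it for stacked 3x3 dilated convolutions (the rates 1, 2, 4, 8 are the same assumption used earlier):

# Each 3x3 conv with dilation rate r widens the receptive field by (3 - 1) * r.
def receptive_field(kernel=3, rates=(1, 2, 4, 8)):
    rf = 1
    for r in rates:
        rf += (kernel - 1) * r
    return rf

print(receptive_field())  # 31 -- almost the full 32x32 CIFAR-10 image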

Without any sampling layers, the classification network paradigm becomes:

Input-->Conv(dilate_rate=1)-->Conv(dilate_rate=2)-->Conv(dilate_rate=4)-->Conv(dilate_rate=8)-->GAP-->Conv1x1-->Softmax-->Output

and the semantic segmentation network paradigm becomes:

Input-->Conv(dilate_rate=1)-->Conv(dilate_rate=2)-->Conv(dilate_rate=4)-->Conv(dilate_rate=8)-->Conv(dilate_rate=4)-->Conv(dilate_rate=2)-->Conv(dilate_rate=1)-->Softmax-->Output
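
A minimal Keras sketch of this segmentation paradigm (the filter width is an illustrative assumption); because the spatial size never changes, no upsampling stage is needed before the per-pixel softmax:

import tensorflow as tf
from tensorflow.keras import layers, models

def build_seg_model(input_shape=(32, 32, 3), num_classes=10, filters=32):
    inputs = layers.Input(shape=input_shape)
    x = inputs
    # Dilation rate rises and then falls, mirroring the paradigm above.
    for rate in [1, 2, 4, 8, 4, 2, 1]:
        x = layers.Conv2D(filters, 3, padding='same',
                          dilation_rate=rate, activation='relu')(x)
    x = layers.Conv2D(num_classes, 1)(x)  # per-pixel class scores
    outputs = layers.Softmax(axis=-1)(x)  # one softmax per pixel
    return models.Model(inputs, outputs)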

As far as I know, I was the first to combine dilated convolutions with global average pooling for image classification and segmentation, even if it brings no performance gain (but basically no harm either). Note that dilated convolution is not strictly required: convolutions with larger kernels can replace it, but that inevitably introduces more parameters and may lead to overfitting. A similar idea first appeared in the DeepLab paper [Rethinking Atrous Convolution for Semantic Image Segmentation]: https://arxiv.org/abs/1706.05587
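
The parameter cost of that trade-off is easy to check: a 3x3 convolution with dilation rate 4 covers the same 9x9 field of view as a dense 9x9 kernel, at roughly a ninth of the weights (the channel counts below are illustrative):

import tensorflow as tf
from tensorflow.keras import layers

x = tf.zeros([1, 32, 32, 64])
dilated = layers.Conv2D(64, 3, dilation_rate=4, padding='same')
dense9 = layers.Conv2D(64, 9, padding='same')
dilated(x)  # build the layers so the weights exist
dense9(x)
print(dilated.count_params())  # 36928
print(dense9.count_params())   # 331840 -- same field of view, ~9x the weights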

In the DeepLab paper, atrous convolution is mainly used to remove the downsampling operations from the last few layers of the network and upsample the corresponding filter kernels, which extracts denser feature maps without adding any new learnable parameters.

Going deeper

There, atrous convolution modules are designed in a cascaded fashion: the last block of ResNet, block4, is duplicated, and the duplicated blocks are chained in series.
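
A rough sketch of that cascaded design, assuming minimal residual blocks and the growing rates 2, 4, 8, 16 (both assumptions follow the DeepLab paper, not this repository's code):

import tensorflow as tf
from tensorflow.keras import layers, models

def res_block(x, filters=64, rate=1):
    """A minimal residual block whose 3x3 convs are dilated instead of strided."""
    shortcut = x
    y = layers.Conv2D(filters, 3, padding='same', dilation_rate=rate,
                      activation='relu')(x)
    y = layers.Conv2D(filters, 3, padding='same', dilation_rate=rate)(y)
    return layers.Activation('relu')(layers.Add()([shortcut, y]))

inputs = layers.Input((32, 32, 64))
x = inputs
for rate in [2, 4, 8, 16]:  # one duplicated block per rate
    x = res_block(x, rate=rate)
cascade = models.Model(inputs, x)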