[GoogleML] Batch Normalization

2023. 9. 21. 16:12 · ArtificialIntelligence/2023GoogleMLBootcamp

 

 

 

Normalizing Activations in a Network

Normalizing speeds up convergence.

Here the thing being normalized is usually z, not a —

(i.e., we normalize the pre-activation value, before it passes through the activation function)

 

 

 

์„ ํ˜• ๋ณ€ํ™˜์„ ์œ„ํ•œ ๊ฐ๋งˆ์™€ ๋ฒ ํƒ€๋Š” Learnable params์ด๋‹ค!

 

 

 

๊ฐ๋งˆ์™€ ๋ฒ ํƒ€

 

 

 

Fitting Batch Norm into a Neural Network

It goes in between computing z and computing a.

 

 

 

๋ชจ๋ฉ˜ํ…€์ด๋‚˜ ADAM์—์„œ ์‚ฌ์šฉ๋˜๋Š” ๋ฒ ํƒ€์™€๋Š” ๋‹ค๋ฅธ ๋ฒ ํƒ€์ด๋‹ค! at BN

tf.nn.batch_normalization ํ•œ ์ค„์˜ ์ฝ”๋“œ๋กœ ๊ตฌํ˜„ํ•  ์ˆ˜ ์žˆ๋‹ค

 

 

 

BN์„ ์ ์šฉํ•œ z ํ‹ธ๋‹ค์— activation์„ ํ†ต๊ณผ์‹œ์ผœ a๋ฅผ ์–ป๊ณ  -> ๋‹ค์‹œ z(2)

 

 

 

bias term์€ normalize ํ•˜์ง€ ์•Š๋Š”๋‹ค!

 

 

 

 

 

 

Why does Batch Norm work?

The covariate shift problem (feels like a slightly different concept from the one I learned in school . . ?)

 

 

 

When W and b change, the downstream values (the activations a) shift along with them.

 

 

 

batch norm์€ input์˜ distribution์ด ๋ณ€ํ•˜๋Š” ๊ฒƒ์„ ๋ง‰์•„์ค€๋‹ค

speed up learning 

์ดˆ๊ธฐ ์ธต๋“ค์˜ params update 

 

 

 

The mean and variance computed on a mini-batch are noisy estimates of the full-training-set statistics.

This noise is injected into every hidden unit's activations,

so as a side effect BN has a slight regularization effect.
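To see where that noise comes from (a small NumPy experiment, not from the lecture): mini-batch means scatter around the full-data mean, and the scatter shrinks as the batch size grows — which is also why the regularization effect weakens with larger mini-batches.

```python
import numpy as np

rng = np.random.default_rng(42)
# One feature's pre-activations over the whole training set.
data = rng.normal(loc=2.0, scale=3.0, size=100_000)

def batch_mean_scatter(batch_size, n_batches=500):
    """Std of mini-batch means around the full-data mean."""
    means = [rng.choice(data, size=batch_size).mean() for _ in range(n_batches)]
    return np.std(np.asarray(means) - data.mean())

small = batch_mean_scatter(32)     # noisy estimate of the true mean
large = batch_mean_scatter(1024)   # much closer to the true mean
```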

 

 

 

Batch Norm at Test Time

During training, mean and variance are computed per mini-batch — but at test time we may predict on a single example, so we use an exponentially weighted average of the training-time means and variances instead.

 

 

 

Rescaling -> then the linear transform with gamma and beta: z̃ = γ·z_norm + β
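A sketch of the test-time procedure (NumPy; the momentum value 0.9 is my own assumption): keep exponentially weighted averages of the mini-batch statistics during training, then normalize a single test example with those fixed values and apply the same gamma and beta.

```python
import numpy as np

rng = np.random.default_rng(7)
momentum, eps = 0.9, 1e-8
running_mu, running_var = np.zeros((3, 1)), np.ones((3, 1))

# --- Training: update running statistics on every mini-batch ---
for _ in range(200):
    Z = rng.normal(loc=2.0, scale=3.0, size=(3, 64))
    mu, var = Z.mean(axis=1, keepdims=True), Z.var(axis=1, keepdims=True)
    running_mu  = momentum * running_mu  + (1 - momentum) * mu
    running_var = momentum * running_var + (1 - momentum) * var

# --- Test time: a single example, normalized with the stored averages ---
gamma, beta = np.ones((3, 1)), np.zeros((3, 1))
z_test = rng.normal(loc=2.0, scale=3.0, size=(3, 1))
z_tilde = gamma * (z_test - running_mu) / np.sqrt(running_var + eps) + beta
```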

 

 

 

 

 
