[GoogleML] ML workflow by implementing strategy, avoidable bias

2023. 9. 29. 21:57 · ArtificialIntelligence/2023GoogleMLBootcamp

Orthogonalization

Correlation between features (make the vectors perpendicular to each other!)

How does this relate to ML?

It is the viewpoint of adjusting several variables, each independently, within a given problem setting.

A separate remedy for each individual problem.

Single Number Evaluation Metric

precision
: of the examples classified as cat, the fraction that are actually cats

recall
: of the actual cats, the fraction that are classified correctly

Which classifier is better, A or B?

Consider one more number, the F1 score, which combines the two into a single value.
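A minimal sketch of how these three numbers come out of binary cat (1) / not-cat (0) labels; the arrays below are made-up examples, not from the course:

    # Precision, recall, and F1 from toy binary labels/predictions.
    y_true = [1, 1, 1, 0, 0, 1, 0, 1]   # 1 = cat, 0 = not cat
    y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    precision = tp / (tp + fp)   # of the predicted cats, how many are real cats
    recall = tp / (tp + fn)      # of the real cats, how many were found
    f1 = 2 * precision * recall / (precision + recall)   # harmonic mean

    print(precision, recall, f1)

Because F1 folds precision and recall into one value, A and B can be ranked by a single number instead of two.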

 

 

 

 

 

 

Comparing several algorithms at once is hard -> use a new metric, such as the average of the errors, to make the decision easier.
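For example (a sketch with made-up error rates): if each algorithm has an error on several different test sets, averaging gives one number per algorithm to compare.

    # Made-up per-test-set error rates; the average is the single comparison number.
    errors = {
        "A": [0.03, 0.05, 0.04, 0.06],
        "B": [0.04, 0.04, 0.05, 0.06],
    }
    avg_error = {name: sum(e) / len(e) for name, e in errors.items()}
    best = min(avg_error, key=avg_error.get)
    print(avg_error, "->", best)   # "A" wins on the average here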

 

 

 

Satisficing and Optimizing Metric

Different intents can be packed into the cost function / evaluation criteria:

accuracy - the higher the better (optimizing metric)

running time - only has to stay at or below 100 ms (satisficing metric)

+ the 100 ms here means 100 ms of the audio signal!
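A small sketch of that selection rule (the accuracy and runtime numbers are invented): first filter by the satisficing constraint, then optimize accuracy among what is left.

    # Satisficing: runtime <= 100 ms. Optimizing: accuracy. (Toy numbers.)
    models = [
        {"name": "A", "accuracy": 0.90, "runtime_ms": 80},
        {"name": "B", "accuracy": 0.92, "runtime_ms": 95},
        {"name": "C", "accuracy": 0.95, "runtime_ms": 1500},
    ]
    feasible = [m for m in models if m["runtime_ms"] <= 100]   # satisficing filter
    best = max(feasible, key=lambda m: m["accuracy"])          # optimize accuracy
    print(best["name"])   # "B": C is the most accurate but violates the constraint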

 

 

 

 

 

 

Train/Dev/Test Distributions

It is not right for the dev set and the test set to have different distributions.

So that the two sets follow the same distribution:

random sampling (shuffle all the data, then split)
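One way to do this (a sketch, assuming all collected examples are pooled into one list first): shuffle before splitting, so dev and test are random samples of the same mixture.

    import random

    random.seed(0)
    data = list(range(10_000))   # stand-in for the pooled examples
    random.shuffle(data)         # mix every source/region before splitting

    n_dev = n_test = len(data) // 100            # e.g. 1% each
    dev_set = data[:n_dev]
    test_set = data[n_dev:n_dev + n_test]
    train_set = data[n_dev + n_test:]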

 

 

 

The dev set and the test set must aim at the same target.

Size of the Dev and Test Sets

In the past, split ratios like these (e.g., 70/30 or 60/20/20) were reasonable.

But not in the modern big-data era.

With a million examples -> 1%, i.e. 10,000 examples, is already enough for the dev (or test) set.
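So with a modern dataset a split such as 98/1/1 (an illustrative ratio, not a fixed rule) still leaves plenty of data everywhere:

    n = 1_000_000
    n_dev = n_test = int(n * 0.01)   # 1% each -> 10,000 examples
    n_train = n - n_dev - n_test     # 980,000 examples left for training
    print(n_train, n_dev, n_test)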

 

 

 

Getting rid of the test set altogether is not recommended.

When to Change Dev/Test Sets and Metrics?

 

 

 

A new evaluation metric has to be found.

If you are no longer satisfied with the original error metric, think about a different metric.

1. Set the target (decide what the goal is)

2. Then tune the system to hit that target well (fit to the metric)

-> the error metric is something that can change depending on the situation!

When trained and evaluated on high-quality photos, A is the better algorithm.

But the users' photos are blurry -> B may turn out to be the better algorithm.

In a situation like this, change the metric (and/or the dev/test set).
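One way to change the metric in a case like this (a sketch; the weight value is invented) is a weighted error that penalizes mistakes on the blurry, user-like images more heavily:

    # Weighted error: mistakes on blurry (user-like) images count w_blurry times more.
    def weighted_error(y_true, y_pred, is_blurry, w_blurry=10.0):
        weights = [w_blurry if b else 1.0 for b in is_blurry]
        wrong = [w for w, t, p in zip(weights, y_true, y_pred) if t != p]
        return sum(wrong) / sum(weights)

    # Toy check: a single mistake on a blurry image dominates the score.
    print(weighted_error([1, 0, 1], [1, 0, 0], [False, False, True]))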

 

 

 

Why Human-level Performance?

Bayes error

: the optimal error for the x -> y mapping

(no better value is achievable)

Why compare against human-level recognition ability?

Avoidable Bias

- When the gap to human performance is large -> work on pushing the model's training performance up (reduce bias)

- When the gap to human performance is small -> work on closing the gap between training performance and the validation data (dev set) (an overfitting issue: reduce variance); see the sketch below
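A small sketch of that decision rule (the error values are made up; human-level error stands in for Bayes error):

    # avoidable bias = training error - human-level (≈ Bayes) error
    # variance       = dev error - training error
    human_error, train_error, dev_error = 0.01, 0.08, 0.10   # made-up values

    avoidable_bias = train_error - human_error
    variance = dev_error - train_error

    if avoidable_bias > variance:
        print("focus on bias: bigger model, train longer, better architecture")
    else:
        print("focus on variance: more data, regularization")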

 

 

 

Human error can be viewed as roughly equal to Bayes error.

These are the concepts of avoidable bias and variance!

Understanding Human-level Performance

Thinking in terms of Bayes error gives the answer (use the best available human-level performance as the estimate):

Bayes error ≈ human-level error (0.5%)

Instead of the bias that was previously compared against 0,

avoidable bias is compared against Bayes error (a more realistic comparison).

That is the significance of human-level error (an approximation of Bayes error).
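A worked example with invented numbers (0.7% training error, 0.8% dev error): which human-level figure you plug in changes whether bias or variance looks like the bigger problem.

    train_error, dev_error = 0.007, 0.008          # invented figures

    for human_error in (0.01, 0.005):              # typical human vs. expert team
        avoidable_bias = train_error - human_error
        variance = dev_error - train_error
        print(human_error, avoidable_bias, variance)

    # With 1% as the proxy the avoidable bias is negative (already below it), so
    # variance looks more important; with 0.5% there is still 0.2% avoidable bias
    # versus only 0.1% variance.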

 

 

 

Surpassing Human-level Performance

What is the avoidable bias here? (you first have to pin down the Bayes error)

Once the machine-learning algorithm has surpassed humans -> avoidable bias becomes hard to define.

Cases where ML performance is higher than humans:

- structured data

- not natural perception (not the kind of perception that comes naturally to people)

- lots of data (more data than a person can take in at once)

Improving your Model Performance

 

 

 

It is similar to what came before:

- push the model's (training) performance up further (reduce avoidable bias)

vs. prevent overfitting (reduce variance)