[GoogleML] Word Embeddings _ Word2vec & GloVe

2023. 10. 29. 18:52 ㆍ ArtificialIntelligence/2023GoogleMLBootcamp

Learning Word Embeddings

Learn embeddings by filling in a blank, i.e., predicting the missing word in a sentence.

What if we predict from only the 4 words immediately before the blank? (the leading "I want" is dropped, so the input layer shrinks from 1800 to 1200 units)
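The 1800 -> 1200 figure is just input-layer arithmetic, assuming the lecture's 300-dimensional embeddings; a quick check:

```python
# The arithmetic behind the 1800 -> 1200 note: with 300-dimensional
# embeddings (the lecture's running example), the input layer size is
# (number of context words) x 300.
embedding_dim = 300

full_context = ["i", "want", "a", "glass", "of", "orange"]  # all 6 preceding words
last_four = full_context[-4:]                               # fixed 4-word window

print(len(full_context) * embedding_dim)  # 1800
print(len(last_four) * embedding_dim)     # 1200
```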

Depending on how the context is defined (the chosen criterion), the words used to predict the target change.

Word2Vec

๋žœ๋คํ•˜๊ฒŒ target ๋ฒกํ„ฐ๋ฅผ ๊ณ ๋ฅธ๋‹ค. (ํŠน์ • ๋ฒ”์œ„์˜ ์œˆ๋„์šฐ ์‚ฌ์ด์ฆˆ ๋‚ด๋ถ€์—์„œ)

Problem: the softmax over the entire vocabulary makes each update computationally expensive (slow).

๊ฐ€์†ํ™”๋ฅผ ์œ„ํ•ด ํŠธ๋ฆฌ ๊ตฌ์กฐ๋ฅผ ํ™œ์šฉ (์ €๋นˆ๋„ -> ๋” deep) 

Hierarchical softmax

Various heuristics can be applied on top of this.

Negative Sampling

Input: two words -> predict whether they form a real (context, target) pair (label 1 or 0).

For a relatively large dataset, choose a small K (roughly 2-5; for smaller datasets, 5-20).

Only the first pair (orange, juice) is labeled 1 -> positive.

The K pairs after it are all labeled 0 -> negative.

That is why the method is called negative sampling.

Computational complexity improves greatly over the heavy softmax sum over the whole vocabulary!

softmax => recast as K+1 binary classification problems
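A rough sketch of the resulting objective, with illustrative sizes (10,000-word vocabulary, 300-dimensional embeddings, K = 5): one positive pair plus K negative pairs, each scored with a single sigmoid instead of a vocabulary-wide softmax.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim, K = 10_000, 300, 5

E = rng.normal(scale=0.01, size=(vocab_size, dim))      # input (context) embeddings
Theta = rng.normal(scale=0.01, size=(vocab_size, dim))  # output (target) parameters

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

context, positive = 42, 1337                      # illustrative word indices
negatives = rng.integers(0, vocab_size, size=K)   # (a real implementation would
                                                  #  avoid re-sampling the positive word)

# loss = -log σ(θ_pos · e_c) - Σ_k log σ(-θ_neg_k · e_c):
# K+1 sigmoids instead of one 10,000-way softmax
e_c = E[context]
loss = -np.log(sigmoid(Theta[positive] @ e_c))
loss -= np.sum(np.log(sigmoid(-(Theta[negatives] @ e_c))))
print(loss)
```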

How are the 0 (negative) samples chosen? By a heuristic based on word frequency.
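The usual heuristic (from the original word2vec paper) samples word w with probability proportional to f(w)^(3/4), a compromise between uniform and raw-frequency sampling. A tiny illustration with made-up counts:

```python
import numpy as np

# P(w) proportional to f(w)^(3/4): between uniform sampling and
# sampling by raw frequency.
counts = np.array([1000, 100, 10, 1], dtype=float)  # illustrative word counts

raw = counts / counts.sum()
smoothed = counts ** 0.75
smoothed /= smoothed.sum()

print(raw)       # very frequent words dominate
print(smoothed)  # rare words get a somewhat larger share
```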

GloVe Word Vectors

GloVe learns word vectors directly from global co-occurrence counts: with X_ij = (number of times word i appears in the context of word j), it minimizes a weighted squared difference between theta_i · e_j (plus bias terms) and log X_ij.

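A toy sketch of the GloVe objective with made-up counts and random vectors (x_max and alpha follow the GloVe paper's defaults; everything else is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 3, 4                                    # tiny toy vocabulary and dimension
X = np.array([[0., 5., 2.],                    # X[i, j] = co-occurrence count
              [5., 0., 1.],
              [2., 1., 0.]])

theta = rng.normal(size=(V, d))                # "target" vectors
e = rng.normal(size=(V, d))                    # "context" vectors
b, b2 = np.zeros(V), np.zeros(V)               # bias terms

def f(x, x_max=100.0, alpha=0.75):
    # weighting: down-weights rare pairs, caps very frequent ones
    return np.where(x < x_max, (x / x_max) ** alpha, 1.0)

loss = 0.0
for i in range(V):
    for j in range(V):
        if X[i, j] > 0:  # f(0) = 0, so zero counts drop out of the sum
            diff = theta[i] @ e[j] + b[i] + b2[j] - np.log(X[i, j])
            loss += f(X[i, j]) * diff ** 2
print(loss)
```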
Sentiment Classification

A practical problem: labeled sentiment data is scarce.

Useful when you want to gauge the general public's reaction (e.g., reviews, comments).

This corresponds to the many-to-one RNN architecture:

several words (a sentence) as input -> a single output (a sentiment label, not a time series of values)
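A bare-bones sketch of the simplest version of this (averaging embeddings instead of an RNN; the vocabulary, embedding table, and weights are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {"the": 0, "movie": 1, "was": 2, "great": 3}
E = rng.normal(size=(len(vocab), 8))   # tiny 8-dim embedding table
w, b = rng.normal(size=8), 0.0         # one logistic output unit

sentence = "the movie was great".split()
avg = E[[vocab[t] for t in sentence]].mean(axis=0)  # many words -> one vector
score = 1.0 / (1.0 + np.exp(-(w @ avg + b)))        # single sentiment output
print(score)
```

Averaging ignores word order entirely (a review full of "good ... good ... lacking" can be misread), which is exactly why the many-to-one RNN is the better fit here.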

Debiasing Word Embeddings

ML has become a tool for important decisions, so removing such undesirable biases matters.

Removing bias from an AI is easier than removing it from human minds.

Find the direction of the gender vector, then neutralize (project non-definitional words onto the axis orthogonal to it).

Afterwards, equalize so that distances match (e.g., adjust the grandmother and grandfather points so each is the same distance from the babysitter point).
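A small numerical sketch of neutralize and a simplified equalize step (the 2-D vectors and the gender direction are made up for illustration; the full algorithm also renormalizes lengths):

```python
import numpy as np

g = np.array([1.0, 0.0])  # assumed unit-norm "gender" direction (illustrative)

def neutralize(e, g):
    # remove the component of e along g (project onto the orthogonal axis)
    return e - (e @ g) * g

babysitter = neutralize(np.array([0.3, 0.9]), g)
grandmother = np.array([0.8, 0.5])
grandfather = np.array([-0.7, 0.5])

# equalize: give the pair the same off-axis component, and opposite
# equal-magnitude components along g
mu_orth = neutralize((grandmother + grandfather) / 2, g)
beta = ((grandmother - grandfather) @ g) / 2
grandmother_eq = mu_orth + beta * g
grandfather_eq = mu_orth - beta * g

d1 = np.linalg.norm(babysitter - grandmother_eq)
d2 = np.linalg.norm(babysitter - grandfather_eq)
print(d1, d2)  # the two distances are now equal
```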

It feels like PCA; fascinating.