Posts in the 'ArtificialIntelligence' category (Page 5)

Normalizing Activations in a Network: normalizing speeds up convergence. The quantity being normalized is more often z than a (that is, the value before it passes through the activation function). The gamma and beta used for the linear transform are learnable params! Fitting Batch Norm into a Neural Network: batch norm sits between the computation of z and a, and it can be implemented in a single line with tf.nn.batch_normalization. Why does Batch Norm work? Batch norm keeps the input distribution from shifting, which speeds up learning. Before the params of the early layers are updated..
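To make the excerpt above concrete, here is a minimal sketch of batch norm applied to the pre-activation values z of a single unit over one mini-batch. It is plain Python rather than tf.nn.batch_normalization; the function name `batch_norm` and the `eps` value are assumptions made for illustration.

```python
import math

def batch_norm(z_batch, gamma, beta, eps=1e-5):
    """Normalize one unit's z values over a mini-batch, then scale and shift.

    gamma and beta are the learnable params; eps is an assumed small constant
    that avoids division by zero.
    """
    m = len(z_batch)
    mean = sum(z_batch) / m
    var = sum((z - mean) ** 2 for z in z_batch) / m
    z_norm = [(z - mean) / math.sqrt(var + eps) for z in z_batch]
    # The linear transform lets the network choose the mean and scale it wants.
    return [gamma * zn + beta for zn in z_norm]

# With gamma=1 and beta=0 the output has mean ~0 and variance ~1.
z_tilde = batch_norm([1.0, 2.0, 3.0, 4.0], gamma=1.0, beta=0.0)
```

The activation function would then be applied to `z_tilde` instead of the raw z.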

2023.09.21

[GoogleML] Hyperparameter Tuning

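Tuning Process: sample hyperparameters at random rather than on a grid. The params (the axes) differ in importance, so they need different levels of granularity, but a grid treats every axis identically. Using an Appropriate Scale to pick Hyperparameters: when beta appears in a denominator, a change in beta matters more than the raw delta suggests (sensitivity). Hyperparameters Tuning in Practice: Pandas vs. Caviar: evaluate one model with a small setup and limited compute, or run many models with many settings in parallel. The panda approach vs. the caviar approach.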
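The random-sampling and scale ideas above can be sketched as follows. The ranges (learning rate between 1e-4 and 1, beta between 0.9 and 0.999) and the function names are assumptions chosen for illustration, not values taken from the post.

```python
import random

def sample_learning_rate(rng, low_exp=-4.0, high_exp=0.0):
    """Sample a learning rate uniformly on a log scale (assumed range 1e-4 to 1)."""
    r = rng.uniform(low_exp, high_exp)
    return 10 ** r

def sample_beta(rng, low_exp=-3.0, high_exp=-1.0):
    """Sample 1 - beta on a log scale, so values of beta near 1 get finer resolution.

    Assumed range: beta between 0.9 and 0.999.
    """
    r = rng.uniform(low_exp, high_exp)
    return 1 - 10 ** r

rng = random.Random(0)
# Each trial draws every hyperparameter independently instead of walking a grid.
trials = [(sample_learning_rate(rng), sample_beta(rng)) for _ in range(5)]
```

Sampling 1 - beta rather than beta itself reflects the sensitivity point: moving beta from 0.999 to 0.9995 changes the result far more than moving it from 0.9 to 0.9005.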

2023.09.20

[GoogleML] Adam Optimizer

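Gradient Descent with Momentum, RMSprop: how can this work? Updates should be small in the vertical direction and large in the horizontal direction (it has to travel far horizontally)! Adam Optimization Algorithm. Learning Rate Decay: take big steps early on, then shrink the learning rate later. The Problem of Local Optima: a plateau, where the gradient stays close to 0 for a long time, also hinders learning.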
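As a rough illustration of how momentum and RMSprop combine in Adam, here is a scalar sketch. The function name `adam_step`, the toy objective, and the hyperparameter defaults (the commonly used 0.9, 0.999, 1e-8) are assumptions, not code from the post.

```python
import math

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter w at step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad        # momentum: average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # RMSprop: average of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

# Toy objective (assumed for the demo): f(w) = (w - 3)^2, so grad = 2 * (w - 3).
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2 * (w - 3)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.01)
```

Dividing by the root of the squared-gradient average is what damps the direction with large gradients, while the momentum term keeps progress steady in the direction with consistent gradients.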

2023.09.20

[GoogleML] Optimization Algorithms

Mini-batch Gradient Descent, Understanding Mini-batch Gradient Descent: full-batch training also takes a long time. Mini-batch is a hybrid of the two: a mini-batch size that is neither too large nor too small. 1. vectorization 2. no need to wait for the full dataset. 1. 2000 examples or fewer -> full batch 2. large dataset -> pick one of 64 / 128 / 512 3. make sure it fits in GPU / CPU memory. Exponentially Weighted Averages, Understanding Exponentially Weighted Averages, Bias Correction in Exponentially Weighted Averages: as t grows..
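The exponentially weighted average and its bias correction, mentioned at the end of the excerpt above, can be sketched like this. The function name `ewa` and the default beta of 0.9 are assumptions for illustration.

```python
def ewa(values, beta=0.9, bias_correction=True):
    """Exponentially weighted average of a sequence, optionally bias-corrected."""
    v = 0.0
    out = []
    for t, x in enumerate(values, start=1):
        v = beta * v + (1 - beta) * x
        # Early on v is biased toward 0 because it starts at 0;
        # dividing by (1 - beta^t) removes that bias.
        out.append(v / (1 - beta ** t) if bias_correction else v)
    return out

readings = [10.0] * 5
corrected = ewa(readings)                       # stays at about 10 from the first step
raw = ewa(readings, bias_correction=False)      # starts near 1 and climbs slowly
```

As t gets larger, beta^t goes to 0, so the correction factor approaches 1 and the two versions agree.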

2023.09.20

[Paper reading] Denoising Diffusion Probabilistic Models

Abstract We present high quality image synthesis results using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics. Our best results are obtained by training on a weighted variational bound designed according to a novel connection between diffusion probabilistic models and denoising score matching with Langevin..

2023.09.19

[OpenAI] ChatGPT Prompt Development

https://platform.openai.com/examples 👩‍💻 The UI is clean and really pretty. The structure laid out on screen maps directly onto the API's JSON request, which made development very pleasant. :) 🥺 The page also shows the exact API code it calls. It's the best. 👍 I tweaked only the message structure a little and developed in colab. 👩‍💻 There seem to be many other fun features too. Used in a project, it is really convenient and lets you implement quickly..

2023.09.18

[Paper reading] Implicit Neural Representations

Implicit Neural Representations with Periodic Activation Functions Abstract Implicitly defined, continuous, differentiable signal representations parameterized by neural networks have emerged as a powerful paradigm, offering many possible benefits over conventional representations. However, current network architectures for such implicit neural representations are incapable of modeling signals w..

2023.09.18

[GoogleML] Optimization Problem

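Normalizing Inputs. Vanishing / Exploding Gradients: with many stacked W (weights), values of 1.5 grow exponentially (exploding gradients) and values of 0.5 shrink exponentially (vanishing gradients). The deeper the stack of layers, the harder training becomes; weight initialization addresses this. Weight Initialization for Deep Networks: weight init matters, because it keeps gradients from exploding or vanishing. Numerical Approximation of Gradients: one-sided / two-sided gradient computation. Gradient Checking: what value is this formula supposed to check . . ? I'm not sure. It doesn't look like cosine similarity either,..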
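On the gradient checking question above: the formula itself is not shown in this excerpt, so the sketch below assumes it is the usual relative-difference check, which divides the distance between the numerical and analytic gradients by the sum of their norms. Under that assumption, it measures how far apart the two gradients are relative to their size, not the angle between them. The toy function `f` and all names here are illustrative.

```python
import math

def numerical_grad(f, theta, eps=1e-7):
    """Two-sided (central) difference approximation of the gradient of f at theta."""
    grad = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += eps
        minus[i] -= eps
        grad.append((f(plus) - f(minus)) / (2 * eps))
    return grad

def relative_difference(g1, g2):
    """||g1 - g2|| / (||g1|| + ||g2||): near 0 when the gradients agree."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(g1, g2)))
    den = math.sqrt(sum(a * a for a in g1)) + math.sqrt(sum(b * b for b in g2))
    return num / den

def f(theta):
    # Toy cost (assumed for the demo).
    return theta[0] ** 2 + 3 * theta[0] * theta[1]

def analytic_grad(theta):
    # Hand-derived gradient of f, standing in for backprop.
    return [2 * theta[0] + 3 * theta[1], 3 * theta[0]]

theta = [1.5, -2.0]
diff = relative_difference(numerical_grad(f, theta), analytic_grad(theta))
```

A very small `diff` suggests the analytic gradient is implemented correctly, while a value that is not small points to a bug in the derivative code.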

2023.09.13

[GoogleML] Regularizing Neural Network

Regularization: compared with the high-dimensional W, b is very low-dimensional (a real number), so the bias is not regularized. @Logistic regression @Neural Network. Why Regularization Reduces Overfitting? Increasing lambda kills W (it drives W toward 0), which gives a simpler model structure. A simpler network -> a smaller effective params size, which can prevent overfitting. It makes each layer behave linearly -> fixes overfitting. Adding the second term changes the original cost func. Dropout Regularization: it acts to make the network smaller. Divide by a value smaller than 1 ..
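The "divide by a value smaller than 1" step above reads like inverted dropout, so here is a minimal sketch under that assumption. The function name `inverted_dropout` and the keep probability of 0.8 are illustrative choices.

```python
import random

def inverted_dropout(activations, keep_prob, rng):
    """Zero each activation with probability 1 - keep_prob, then rescale the rest.

    Dividing the kept activations by keep_prob (a value below 1) keeps the
    expected value of the layer's output unchanged.
    """
    out = []
    for a in activations:
        keep = rng.random() < keep_prob
        out.append(a / keep_prob if keep else 0.0)
    return out

rng = random.Random(0)
layer_output = [1.0] * 10000
dropped = inverted_dropout(layer_output, keep_prob=0.8, rng=rng)
mean_after = sum(dropped) / len(dropped)  # stays close to the original mean of 1.0
```

Because of the rescaling, dropout can simply be switched off at test time without any extra scaling of the activations.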

2023.09.12

KimAnt 🥦
