Notes

Deconvolution Layer
Batch Normalization
Q-learning v SARSA
Policy Iteration v Value Iteration
Q Learning
Policy Gradients
Actor Critic Methods
Trust Region Methods
Monte Carlo Tree Search
Inverse Reinforcement Learning
One Shot Imitation Learning
Meta Learning
A3C
Distributed DL
MAC vs Digital Signatures
MLE and KL Divergence
Lipschitz Continuity
Exposure bias problem
Gini Coefficient
Pareto distribution


Deconvolution Layer

References


Batch Normalization

References


Q-learning v SARSA

References


Policy Iteration v Value Iteration

References


Q Learning

References


Policy Gradients

References


Actor Critic Methods

References


Trust Region Methods

References


Monte Carlo Tree Search

References


Inverse Reinforcement Learning

References


One Shot Imitation Learning

References


Meta Learning

References


Asynchronous Actor-Critic Agents (A3C)

References


Distributed DL

References


MAC vs Digital Signatures

References


MLE and KL Divergence

\(\hat{\theta} = \arg\max_{\theta} \; \mathcal{L}(\theta; \mathcal{D})\) or \(\hat{\theta} = \arg\max_{\theta} \; \log P(\mathcal{D} \mid \theta)\)

\(\mathrm{NLL}(\theta) = - \sum_{i=1}^{N} \log p(y_i \mid x_i, \theta)\)

\(\mathcal{KL}(p \,\|\, q) = \sum_{k=1}^{K} p_k \log \frac{p_k}{q_k}\)
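A minimal NumPy sketch of the two quantities above (the function names and the example distributions are mine, not from any library):

```python
import numpy as np

def nll(probs):
    """NLL(theta) = -sum_i log p(y_i | x_i, theta), where `probs` holds the
    model's probabilities for the observed labels."""
    return -np.sum(np.log(np.asarray(probs, dtype=float)))

def kl_divergence(p, q):
    """KL(p || q) = sum_k p_k log(p_k / q_k) for discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return np.sum(p * np.log(p / q))

# KL(p || p) = 0, and KL is positive for distinct distributions.
p = np.array([0.2, 0.5, 0.3])
q = np.array([0.4, 0.4, 0.2])
print(kl_divergence(p, p))  # 0.0
print(kl_divergence(p, q) > 0)  # True
```

Note that minimizing the NLL over θ is equivalent to minimizing the KL divergence from the empirical distribution to the model distribution, which is why the two formulas sit together.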

References


Lipschitz Continuity

References


Exposure bias problem

References


Gini Coefficient

\[G = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n}|x_i-x_j|}{2n^2\bar{x}}\]
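A direct NumPy sketch of this mean-absolute-difference formula (the function name is mine), where \(\bar{x}\) is the sample mean:

```python
import numpy as np

def gini(x):
    """Gini coefficient: G = sum_i sum_j |x_i - x_j| / (2 n^2 xbar)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Pairwise absolute differences via broadcasting (O(n^2) memory).
    total_diff = np.abs(x[:, None] - x[None, :]).sum()
    return total_diff / (2 * n**2 * x.mean())

print(gini([1, 1, 1, 1]))  # 0.0 -- perfect equality
print(gini([0, 1]))        # 0.5
```

G ranges from 0 (everyone has the same value) toward 1 (one individual holds everything), which is why it pairs naturally with heavy-tailed distributions such as the Pareto below.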

Pareto distribution