VAE
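Assume that all data points \(\mathbf{x}^{(1)},\ldots,\mathbf{x}^{(N)}\) are i.i.d., so the log-likelihood factorizes: \[ \log p_{\theta}(\mathbf{x}^{(1)},\ldots,\mathbf{x}^{(N)}) = \sum^N_{i=1}\log p_{\theta}(\mathbf{x}^{(i)}) \] The KL divergence between the model's approximate posterior and the true posterior is (typing boldface is tedious, so it is dropped from here on) \[ \begin{align} \mathrm{KL}(q_{\phi}(z|x^{(i)})||p_{\theta}(z|x^{(i)})) = & \mathbb{E}_{q_{\phi}(z|x^{(i)})}\log\frac{q_{\phi}(z|x^{(i)})}{p_{\theta}(z|x^{(i)})} \\ = & \mathbb{E}_{q_{\phi}(z|x^{(i)})}\log\frac{q_{\phi}(z|x^{(i)})}{p_{\theta}(z,x^{(i)})}+\log p_{\theta}(x^{(i)}) \end{align} \] The second line uses \(p_{\theta}(z|x^{(i)}) = p_{\theta}(z,x^{(i)})/p_{\theta}(x^{(i)})\) and the fact that \(\log p_{\theta}(x^{(i)})\) does not depend on \(z\).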
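Therefore \[ \log p_{\theta}(x^{(i)}) = \mathrm{KL}(q_{\phi}(z|x^{(i)})||p_{\theta}(z|x^{(i)})) + \mathcal{L}(\theta,\phi;x^{(i)}) \] where \[ \mathcal{L}(\theta,\phi;x^{(i)}) = \mathbb{E}_{q_{\phi}(z|x^{(i)})}\log\frac{p_{\theta}(z,x^{(i)})}{q_{\phi}(z|x^{(i)})} \] Because the KL divergence is non-negative, \(\mathcal{L}(\theta,\phi;x^{(i)})\) is a lower bound on \(\log p_{\theta}(x^{(i)})\). This variational lower bound \(\mathcal{L}(\theta,\phi;x^{(i)})\) is the quantity we want to optimize (maximize).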
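The decomposition can be checked numerically. The sketch below is only an illustration, not part of a VAE implementation: it assumes a discrete latent \(z\) with `K` states so that every expectation is an exact sum, and it takes \(\mathcal{L} = \mathbb{E}_{q}\log\frac{p_{\theta}(z,x)}{q_{\phi}(z|x)}\). The function name `decompose`, the value of `K`, and the randomly drawn prior, likelihood, and `q` are all hypothetical.

```python
import numpy as np


def decompose(prior, likelihood, q):
    """Return (log p(x), KL(q || p(z|x)), lower bound L) for one fixed x.

    prior[k]      = p(z = k)
    likelihood[k] = p(x | z = k) evaluated at the fixed observation x
    q[k]          = q(z = k | x), an arbitrary approximate posterior
    """
    joint = prior * likelihood            # p(z, x)
    evidence = joint.sum()                # p(x), marginalizing over z
    posterior = joint / evidence          # true posterior p(z | x)

    kl = np.sum(q * (np.log(q) - np.log(posterior)))
    lower_bound = np.sum(q * (np.log(joint) - np.log(q)))
    return np.log(evidence), kl, lower_bound


# Toy numbers, chosen at random purely for illustration.
rng = np.random.default_rng(0)
K = 5
prior = rng.dirichlet(np.ones(K))
likelihood = rng.uniform(0.05, 1.0, size=K)
q = rng.dirichlet(np.ones(K))

log_px, kl, lower_bound = decompose(prior, likelihood, q)
gap = abs(log_px - (kl + lower_bound))
print(f"log p(x) = {log_px:.6f}, KL = {kl:.6f}, L = {lower_bound:.6f}")
```

Whatever `q` is chosen, `kl + lower_bound` should match `log_px` up to floating-point error, and the bound becomes tight only when `q` equals the true posterior.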
To be continued…