4. In variational inference, we typically maximize the evidence lower bound (ELBO) rather than minimize the Kullback-Leibler divergence (KLD) between the approximate and exact posteriors, because the KLD is intractable. Assuming that we can compute gradients of the KLD, my question is the following: are the gradients of the ELBO and the KLD evaluated at the same ...
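For context on the question, the standard decomposition of the log evidence can be written out as below. This is a sketch under assumptions the snippet does not state: q_\phi(z) denotes the approximate posterior with variational parameters \phi, p(z | x) the exact posterior, and gradients are taken with respect to \phi only.

\[
\log p(x) = \mathrm{ELBO}(\phi) + \mathrm{KL}\big(q_\phi(z) \,\|\, p(z \mid x)\big),
\qquad
\mathrm{ELBO}(\phi) = \mathbb{E}_{q_\phi(z)}\big[\log p(x, z) - \log q_\phi(z)\big]
\]

Because \log p(x) does not depend on \phi, differentiating both sides gives

\[
\nabla_\phi \, \mathrm{ELBO}(\phi) = -\nabla_\phi \, \mathrm{KL}\big(q_\phi(z) \,\|\, p(z \mid x)\big)
\]

Under these assumptions the two gradients differ only in sign at any given \phi. The identity need not hold for gradients with respect to model parameters, since \log p(x) may depend on those.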