Cross-modal Embeddings for Video and Audio Retrieval

Book Chapter 2019