Enhancing Video Understanding Through Deep Generative Models and Task Comprehension