May 16, 2024, 3:13 p.m. | Subarna Tripathi

Towards Data Science - Medium

We explore novel video representation methods that are equipped with long-form reasoning capability. This is Part I, focusing on video representation as graphs and how to learn lightweight graph neural networks for several downstream applications. Part II focuses on sparse video-text transformers, and Part III provides a sneak peek into our latest and greatest explorations.

Existing video architectures tend to hit computation or memory bottlenecks after processing only a few seconds of the video content. So, how do we enable …

