ReFormer: The Relational Transformer for Image Captioning. (arXiv:2107.14178v2 [cs.CV] UPDATED)
July 18, 2022, 1:12 a.m. | Xuewen Yang, Yingru Liu, Xin Wang
cs.CV updates on arXiv.org arxiv.org
Image captioning has been shown to perform better when scene graphs are used
to represent the relations between objects in the image. Current captioning
encoders generally use a Graph Convolutional Net (GCN) to represent the
relation information and merge it with the object region features via
concatenation or convolution to produce the final input for sentence decoding.
However, the GCN-based encoders in existing methods are less effective for
captioning, for two reasons. First, using …
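The GCN-plus-concatenation encoder that the abstract describes as the existing baseline can be sketched roughly as follows. This is a minimal illustration, not the authors' code and not ReFormer itself: the function names, the row-normalized adjacency with self-loops, the single layer, and the toy dimensions are all assumptions made for the example.

```python
# Hypothetical sketch of a baseline scene-graph encoder: one GCN layer
# produces relation-aware features, which are concatenated with the
# original object region features. All names here are illustrative.

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Add self-loops, then row-normalize so each row sums to 1 (an assumed, simplified normalization)."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [[v / sum(row) for v in row] for row in with_loops]

def gcn_layer(adj, feats, weight):
    """One graph convolution: ReLU(A_hat @ X @ W)."""
    a_hat = normalize_adjacency(adj)
    hidden = matmul(matmul(a_hat, feats), weight)
    return [[max(0.0, v) for v in row] for row in hidden]

def encode(adj, region_feats, weight):
    """Concatenate each object's region features with its GCN relation features."""
    relation_feats = gcn_layer(adj, region_feats, weight)
    return [region + relation for region, relation in zip(region_feats, relation_feats)]

# Toy scene graph: 3 objects, object 1 is related to objects 0 and 2.
adjacency = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
decoder_input = encode(adjacency, regions, identity)
```

Each row of `decoder_input` holds one object's region features followed by its neighborhood-averaged relation features, which is the merged representation a sentence decoder would consume in this kind of baseline.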