David Chan

davidchan

https://dchan.cc

DavidMChan

AI & ML interests

Vision + Language

authored a paper 6 months ago

TULIP: Towards Unified Language-Image Pretraining

Paper • 2503.15485 • Published Mar 19, 2025 • 49

authored 4 papers about 1 year ago

Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition

Paper • 2403.19822 • Published Mar 28, 2024

ALOHa: A New Measure for Hallucination in Captioning Models

Paper • 2404.02904 • Published Apr 3, 2024

Virtual Personas for Language Models via an Anthology of Backstories

Paper • 2407.06576 • Published Jul 9, 2024 • 1

Visual Haystacks: Answering Harder Questions About Sets of Images

Paper • 2407.13766 • Published Jul 18, 2024 • 2

authored 6 papers over 1 year ago

Task Oriented Dialogue as a Catalyst for Self-Supervised Automatic Speech Recognition

Paper • 2401.02417 • Published Jan 4, 2024 • 1

Multimodal Attention Merging for Improved Speech Recognition and Audio Event Classification

Paper • 2312.14378 • Published Dec 22, 2023

See, Say, and Segment: Teaching LMMs to Overcome False Premises

Paper • 2312.08366 • Published Dec 13, 2023

CLAIR: Evaluating Image Captions with Large Language Models

Paper • 2310.12971 • Published Oct 19, 2023

Using External Off-Policy Speech-To-Text Mappings in Contextual End-To-End Automated Speech Recognition

Paper • 2301.02736 • Published Jan 6, 2023

ANIM-400K: A Large-Scale Dataset for Automated End-To-End Dubbing of Video

Paper • 2401.05314 • Published Jan 10, 2024 • 12