MASc Seminar Announcement – Audio-Visual Feature Fusion through Transformers for Automated Depression Screening in Social Media Content
MASc Seminar at the University of Waterloo ECE Department
🎓 MASc Seminar Announcement
✅ Thursday, April 16, 2026
I am pleased to share an important milestone in my graduate journey: my MASc seminar at the Department of Electrical and Computer Engineering, University of Waterloo.
Title: Audio-Visual Feature Fusion through Transformers for Automated Depression Screening in Social Media Content
Candidate: Md Rezwanul Haque
Date: April 16, 2026
Time: 11:00 AM to 12:00 PM EDT
Location: Online
Supervisor: Prof. Fakhri Karray
Co-Supervisor: Prof. Pin-Han Ho
All are welcome to attend.
This seminar presents my MASc thesis research on multimodal depression screening using social media videos, with a focus on transformer-based audio-visual feature fusion. The presentation covers two main contributions: MDD-Net, which uses a mutual transformer to fuse acoustic and visual representations, and MMFformer, which explores multiple transformer-based fusion strategies for depression detection from audiovisual content. The work is evaluated on the D-Vlog and LMVD datasets and demonstrates strong performance improvements over prior approaches, along with encouraging cross-corpus generalization results.
🔗 Official Event Page: University of Waterloo ECE Seminar Listing