cv
Basics
| Name | Md Rezwanul Haque |
| Label | PhD Candidate · AI Researcher |
| Url | https://rezwanh001.github.io/ |
| Summary | Incoming PhD candidate in Electrical and Computer Engineering at the University of Waterloo, researching multimodal machine learning, large language models, and agentic AI systems at the CPAMI Lab. |
Work
-
2024.04 - Present Ontario, Canada
Graduate Researcher
Centre for Pattern Analysis and Machine Intelligence (CPAMI) Lab
Conducting advanced research in multimodal machine learning and large language models at the CPAMI Lab, University of Waterloo. Designing transformer-based architectures and multi-agent systems for human-centered AI applications.
- Multimodal Machine Learning
- Large Language Models (LLMs)
- Vision-Language Models
- Multimodal AI Agents
- Embodied Agentic AI Systems
- Human-Centered Machine Intelligence
- Deep Learning & Computer Vision
Education
-
2026.05 - Present Waterloo, Ontario, Canada
PhD
University of Waterloo
Electrical and Computer Engineering (ECE)
- Research: Agentic Systems for Embodied AI · Multi-Agent Embodied Systems with Shared Visual Understanding · World Models & Visual Cognition for Embodied Agents
- Supervisor(s): Prof. Fakhri Karray & Prof. Zhou Wang
-
2024.05 - 2026.04 Waterloo, Ontario, Canada
MASc
University of Waterloo
Electrical and Computer Engineering (ECE)
- Thesis: "Title is coming ..."
- Focus: Multimodal ML · LLMs · Human-Centered AI
- Supervisor(s): Prof. Fakhri Karray & Prof. Pin-Han Ho
-
2014.12 - 2019.02 Khulna, Bangladesh
BSc
Khulna University of Engineering & Technology (KUET)
Computer Science and Engineering (CSE)
- Thesis: "A Study on Non-Invasive Hemoglobin Measurement Techniques and Predictions."
- Supervisor: Prof. Dr. M.M.A Hashem
Teaching
-
2026.01 - 2026.04 Ontario, Canada
ECE 459: Programming for Performance
University of Waterloo
Teaching Assistant (TA)
- Winter 2026 · January 2026 - April 2026
- Enrolled Students: 440/447
- TA Rating: --.--/5
-
2025.09 - 2025.12 Ontario, Canada
ECE 222: Digital Computers
University of Waterloo
Teaching Assistant (TA)
- Fall 2025 · September 2025 - December 2025
- Enrolled Students: 279/280
- TA Rating: --.--/5
-
2025.05 - 2025.08 Ontario, Canada
ECE 252: Systems Programming and Concurrency
University of Waterloo
Teaching Assistant (TA)
- Spring 2025 · May 2025 - August 2025
- Enrolled Students: 120/150
- TA Rating: --.--/5
-
2025.01 - 2025.04 Ontario, Canada
ECE 459: Programming for Performance
University of Waterloo
Teaching Assistant (TA)
- Winter 2025 · January 2025 - April 2025
- Enrolled Students: 453/425
- TA Rating: --.--/5
-
2024.09 - 2024.12 Ontario, Canada
ECE 252: Systems Programming and Concurrency
University of Waterloo
Teaching Assistant (TA)
- Fall 2024 · September 2024 - December 2024
- Enrolled Students: 206/190
- TA Rating: 4.5/5
Publications
-
2025.11.14 GNN-ViTCap: GNN-Enhanced Multiple Instance Learning with Vision Transformers for Whole Slide Image Classification and Captioning
2025 International Joint Conference on Neural Networks (IJCNN)
-
2025.10.19 FusionEnsemble-Net: An Attention-Based Ensemble of Spatiotemporal Networks for Multimodal Sign Language Recognition
In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025
-
2025.10.19 A Signer-Invariant Conformer and Multi-Scale Fusion Transformer for Continuous Sign Language Recognition
In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2025
-
2025.10.08 MDD-Net: Multimodal Depression Detection through Mutual Transformer
2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
-
2025.10.08 MMFformer: Multimodal Fusion Transformer Network for Depression Detection
2025 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
-
2023.08.19 BaDLAD: A large multi-domain Bengali document layout analysis dataset
ICDAR 2023 | Springer
-
2022.09.01 Breast cancer prediction: a comparative study using machine learning techniques
Springer Nature Computer Science
-
2021.03.04 Hemoglobin and glucose level estimation from PPG characteristics features of fingertip video using MGGP-based model
Elsevier | Biomedical Signal Processing and Control
-
2018.10.20 Performance evaluation of random forests and artificial neural networks for the classification of liver disorder
IC4ME2 2018 | IEEE Conference
-
2017.02.12 Prediction of breast cancer using support vector machine and K-Nearest neighbors
R10-HTC 2017 | IEEE Conference
Awards
- 2025
Faculty of Engineering (FOE) Award (Spring)
UWaterloo, CA
Awarded for exceptional merit in research and academics. Valued at CAD $1,500.
- 2025
Best Poster Award
ICCV, USA
Awarded at the 1st Multimodal Sign Language Recognition Workshop at ICCV 2025.
- 2025
Winner (2nd) of MSLR-2025 Challenge
ICCV, USA
Secured 2nd place in The First Multimodal Sign Language Recognition Challenge. Valued at USD $750.
- 2025
Faculty of Engineering (FOE) Award (Winter)
UWaterloo, CA
Awarded for exceptional merit in research and academics. Valued at CAD $1,500.
- 2025
Graduate Student Contingency Bursary
UWaterloo, CA
Competitive bursary for graduate students with strong academic standing. Valued at CAD $5,000.
- 2024 - Present
MASc - Graduate Research Studentship (GRS)
UWaterloo, CA
Merit-based studentship supporting full-time thesis research in ECE. Valued at CAD $33,000/yr.
Volunteer
-
2023.01 - Present Conference & Journal Reviewer
Academic Peer Review
Serving as an invited peer reviewer for leading AI/ML conferences and journals, evaluating research in computer vision, multimodal learning, and deep learning.
- ICCV 2025 - IEEE/CVF International Conference on Computer Vision
- NeurIPS 2025 - Workshop on Music and Machine Learning (MusIML)
- IEEE SMC 2025 - International Conference on Systems, Man, and Cybernetics
- IEEE IJCNN 2025 - International Joint Conference on Neural Networks
- IEEE EICT 2025 - Int'l Conference on Electrical Information and Communication Technology
- PLOS One (2023-Present) - Peer-reviewed open access scientific journal
Certificates
| Machine Learning | Coursera | Stanford University |
| Deep Learning Specialization | Coursera | DeepLearning.AI |
| Deep Learning with Python and PyTorch | edX | IBM |
| Quantum Machine Learning | edX | University of Toronto |
| Pretraining LLMs | DeepLearning.AI |
| Multimodal RAG: Chat with Videos | DeepLearning.AI |
Skills
| Machine Learning & AI | Deep Learning, Multimodal Learning, Large Language Models (LLMs), Computer Vision, Natural Language Processing, Transformer Architectures |
| Programming Languages | Python, C / C++, Rust, Java, MATLAB, Bash / Shell Scripting |
| Frameworks & Libraries | PyTorch, TensorFlow / Keras, Hugging Face Transformers, OpenCV, scikit-learn, NumPy / Pandas |
| Tools & Platforms | Linux / Ubuntu, Git / GitHub, Docker, SLURM (HPC Clusters), LaTeX, Weights & Biases |
Languages
| Bangla | Native speaker |
| English | Fluent |
Interests
| Large Language Models | Transformer Architectures, Natural Language Understanding, GPT / LLaMA / Gemma, Instruction Tuning & RLHF, Retrieval-Augmented Generation (RAG), Agentic Systems & Tool Use |
| Multimodal Machine Learning | Vision-Language Models, Multimodal Fusion & Representation, Audio-Visual Learning, Sign Language Recognition, Depression Detection (Multimodal), Medical Image Analysis |
| AI Agents & Embodied Intelligence | Multi-Agent Systems, Embodied AI, Visual Cognition & World Models, Human-Centered Machine Intelligence, Multimodal AI Agents, Shared Visual Understanding |
| Deep Learning & Computer Vision | Convolutional Neural Networks, Vision Transformers (ViT), Object Detection & Segmentation, Whole Slide Image Analysis, Graph Neural Networks, Self-Supervised & Contrastive Learning |