ML Researcher & Engineer

Tanveer Hossain Munim

Applied ML Engineer · AI Researcher

I build production ML systems at national scale and publish the research behind the design decisions.

Who I Am

About Me

I'm a senior ML engineer and researcher. I helped design and develop Timor-Leste's national meteorological platform (UN GCF Project), where I built data pipelines handling 1 TB+/day across five NWP models and two satellites, and cut core NetCDF read times from 30 s to 80 ms. My research focuses on detecting rare, high-impact weather extremes.

At RIMES I led the architecture and delivery of the GCF Timor-Leste CDIS, a national meteorological platform under UNEP's Green Climate Fund project, and designed CAP v1.2 emergency alert infrastructure now deployed across Timor-Leste and Bangladesh.

My research formalizes Sparse-Critical Dense Regression (SCDR), the gap between sparse and dense prediction, with impossibility theorems and a dual-decoder solution validated across atmospheric ML, crowd density, and radar nowcasting. I also build government-scale geospatial systems for Bangladesh, including AI models for satellite-based parcel delineation and national land-record digitization.

Technical Focus

Sparse-Critical Dense Regression · Vision-Language Models · Hybrid RAG & Retrieval · Geospatial ML · Atmospheric & Climate ML · Document AI & OCR · LLM Serving & Inference

Selected Work

Projects & Case Studies

End-to-end ML systems, open-source tools, and applied engineering.

GCF CDIS — Timor-Leste National Meteorological Platform

Designed and implemented Timor-Leste's national meteorological platform serving DNMG, part of UNEP's Green Climate Fund project. In live production for 1.4+ years with a 97.64% task and 96.20% DAG success rate, processing 1 TB+/day across 5 NWP models (GFS, ECMWF IFS, ECMWF AIFS, ICON, and UK Met) and 2 satellites (Himawari-9 and GK2A). Key optimizations: NetCDF reads 30 s → 80 ms (375×); observation queries 4 s → 800 ms (5×).

CAP v1.2 Multi-Country Emergency Alert Infrastructure

Built the CAP v1.2 stack: pycap-validator (Python package for OASIS schema enforcement + digital signature verification), Django RSS backend, MQTT pub/sub, and n8n multi-channel dissemination (email, Facebook, Twitter/X). Deployed in production at DNMG Timor-Leste and Bangladesh Meteorological Department via Grameenphone for population-scale automated alerting.
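
To give a feel for the kind of check a CAP validator performs, here is a minimal sketch using only the Python standard library. It is not pycap-validator's actual API: the function name `missing_required_fields` and the sample alert are hypothetical, and it only confirms that the mandatory top-level CAP v1.2 elements are present and non-empty. Full XSD enforcement and digital signature verification are out of scope.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OASIS CAP v1.2 standard.
CAP_NS = "urn:oasis:names:tc:emergency:cap:1.2"

# Top-level <alert> children that CAP v1.2 marks as required.
REQUIRED = ("identifier", "sender", "sent", "status", "msgType", "scope")


def missing_required_fields(xml_text: str) -> list[str]:
    """Return the required top-level elements that are absent or empty.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{CAP_NS}}}alert":
        raise ValueError("root element is not a CAP v1.2 <alert>")
    missing = []
    for name in REQUIRED:
        element = root.find(f"{{{CAP_NS}}}{name}")
        if element is None or not (element.text or "").strip():
            missing.append(name)
    return missing


# Made-up alert that deliberately omits <scope>.
SAMPLE = """<alert xmlns="urn:oasis:names:tc:emergency:cap:1.2">
  <identifier>example-0001</identifier>
  <sender>alerts@example.org</sender>
  <sent>2025-01-01T00:00:00+00:00</sent>
  <status>Exercise</status>
  <msgType>Alert</msgType>
</alert>"""

print(missing_required_fields(SAMPLE))  # ['scope']
```

Rejecting an alert before it reaches MQTT or the dissemination channels keeps malformed messages from fanning out to every subscriber.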

Khatian VLM Cascade — Bengali Land-Record Extraction

Production VLM system that extracts JL (Jurisdiction List) and plot numbers from Bangladesh government land-record scans at scale. Fine-tuned Qwen3-VL-2B/8B, combined with sibling lookup, a symbolic verifier, and conformal prediction. Achieved 98.9% JL precision at a 94% auto-accept rate and 96.7% plot precision at a 92% auto-accept rate, at 1.83 s/image on a 396,802-image corpus across 6 regions.
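
The link between conformal prediction and an auto-accept rate can be sketched with textbook split conformal prediction. This is an illustration under stated assumptions, not the production pipeline: the nonconformity score (1 minus the model's probability for a label), the calibration scores, the candidate labels, and the function names are all hypothetical.

```python
import math


def conformal_quantile(cal_scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-indexed order statistic
    if rank > n:
        return float("inf")  # too few calibration points for this alpha
    return sorted(cal_scores)[rank - 1]


def prediction_set(probs, q):
    """Labels whose nonconformity (1 - probability) is within the threshold."""
    return [label for label, p in probs.items() if 1.0 - p <= q]


def auto_accept(probs, q):
    """Accept only when exactly one label survives; otherwise send to review."""
    labels = prediction_set(probs, q)
    return labels[0] if len(labels) == 1 else None


# Made-up nonconformity scores of the true label on a held-out calibration set.
cal = [0.02, 0.05, 0.01, 0.10, 0.03, 0.40, 0.04, 0.02, 0.06]
q = conformal_quantile(cal, alpha=0.25)

print(q)                                          # 0.1
print(auto_accept({"123": 0.95, "128": 0.04}, q))  # 123
print(auto_accept({"123": 0.55, "128": 0.45}, q))  # None
```

Lowering `alpha` raises the threshold, which tends to admit more labels per prediction set and so trades auto-accept rate for coverage.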

SCDR / DART — Sparse-Critical Dense Regression

First-author ML theory and multi-domain empirical research formalizing SCDR as a new problem class. Theorem 1 proves that no single-decoder loss can simultaneously achieve a high Critical Success Index (CSI) and a frequency bias ≈ 1. DART (dual-decoder + gradient isolation) delivers a 32–74% bias reduction at statistically equivalent CSI on Himawari atmospheric data (n=5 seeds, |t| up to 23.3), gains +0.049 CSI@10 on ShanghaiTech crowd density, and beats pretrained RainNet on its own DWD radar data with ~7.6× less compute. Manuscript in preparation.
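
For readers unfamiliar with the two metrics, the sketch below computes CSI and frequency bias from their standard forecast-verification definitions: CSI = hits / (hits + misses + false alarms) and Bias = (hits + false alarms) / (hits + misses). The sample values and the threshold of 10 are made up for illustration and are not taken from the experiments.

```python
def contingency(pred, obs, threshold):
    """Count hits, misses, and false alarms for events at or above threshold."""
    hits = misses = false_alarms = 0
    for p, o in zip(pred, obs):
        p_event, o_event = p >= threshold, o >= threshold
        if p_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif p_event:
            false_alarms += 1
    return hits, misses, false_alarms


def csi(hits, misses, false_alarms):
    """Critical Success Index: rewards hits, penalises both error types."""
    total = hits + misses + false_alarms
    return hits / total if total else float("nan")


def frequency_bias(hits, misses, false_alarms):
    """Predicted event count over observed event count; 1.0 means unbiased."""
    observed = hits + misses
    return (hits + false_alarms) / observed if observed else float("nan")


# Made-up dense targets and predictions with a few rare extremes.
obs = [0, 0, 12, 15, 0, 11, 0, 0]
pred = [0, 11, 13, 9, 0, 12, 10, 0]

counts = contingency(pred, obs, threshold=10)
print(counts)                   # (2, 1, 2)
print(csi(*counts))             # 0.4
print(frequency_bias(*counts))  # 1.333... (over-forecasting)
```

A model can raise CSI by over-predicting rare events, which pushes the bias above 1. That tension between the two metrics is what the result above is about.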

ParcelBoundary — Smallholder Paddy Parcel Delineation

Boundary IoU across the 9-model series — every yolo11-seg variant plateaus at the 0.167 ceiling; v6 Mask2Former (Swin-L) breaks it to 0.210.

An instance-segmentation study delineating smallholder paddy parcels from 10 cm aerial imagery over Dhamrai, Bangladesh. A controlled 9-model series shows that every yolo11-seg variant — plus pseudo-labelling, RGB+nDSM multi-channel fusion (LiDAR-derived canopy height), and a small-patch specialist — plateaus at a Boundary IoU ceiling of 0.167. Mask2Former (Swin-L) is the first architecture to break it, reaching BIoU = 0.210 (+26% relative) at confidence 0.80. Built on 4,662 hand-digitised ground-truth parcels across 41 labelled tiles with matching LiDAR point clouds, trained on an NVIDIA GH200. Ships as a fully reproducible study (all 8 checkpoints, code, and a paper scaffold) with a live Streamlit demo.
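
Boundary IoU compares only the pixels near each mask's edge, which makes it much stricter than plain mask IoU for thin or misaligned parcel boundaries. The sketch below is a simplified pure-Python variant, not the study's evaluation code: it assumes a boundary width `d` given in pixels, 8-connected erosion, and pixels outside the image treated as background. The toy masks are made up.

```python
def boundary(mask, d=1):
    """Mask pixels removed by d steps of 8-connected erosion."""
    h, w = len(mask), len(mask[0])
    inside = {(r, c) for r in range(h) for c in range(w) if mask[r][c]}
    eroded = set(inside)
    for _ in range(d):
        # Keep a pixel only if it and all 8 neighbours are still in the mask.
        eroded = {
            (r, c)
            for (r, c) in eroded
            if all(
                (r + dr, c + dc) in eroded
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            )
        }
    return inside - eroded


def boundary_iou(gt, pred, d=1):
    """IoU restricted to the boundary bands of the two masks."""
    g, p = boundary(gt, d), boundary(pred, d)
    union = g | p
    return len(g & p) / len(union) if union else float("nan")


def mask_iou(gt, pred):
    """Ordinary IoU over whole masks, for comparison."""
    g = {(r, c) for r, row in enumerate(gt) for c, v in enumerate(row) if v}
    p = {(r, c) for r, row in enumerate(pred) for c, v in enumerate(row) if v}
    return len(g & p) / len(g | p) if g | p else float("nan")


def square(top, left, size, h=6, w=6):
    """Toy h x w mask containing one filled square."""
    return [
        [1 if top <= r < top + size and left <= c < left + size else 0
         for c in range(w)]
        for r in range(h)
    ]


gt = square(1, 1, 4)    # 4x4 parcel
pred = square(1, 2, 4)  # same parcel shifted right by one pixel

print(mask_iou(gt, pred))      # 0.6
print(boundary_iou(gt, pred))  # 0.333...
```

A one-pixel shift costs 40% of mask IoU here but two thirds of Boundary IoU, so a model can look reasonable on mask IoU while still scoring low on boundary quality.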

Bengali Weather Chatbot — LangGraph + DeepSeek

System architecture — browser geolocation + SSE, FastAPI, and a LangGraph state machine over DeepSeek and BMD WRF forecasts.

A production-shaped weather assistant for Bangladesh that always replies in Bengali and answers weather questions only. Built as a single stateful LangGraph: a guardrail/intent node gates off-topic queries and routes service questions to a context-injected FAQ; a resolve_location node uses browser geolocation or pauses the graph with a human-in-the-loop interrupt() to ask the user; and a ReAct agent loop (DeepSeek) calls a weather tool backed by Bangladesh Meteorological Department WRF forecasts via the BDServers API at upazila resolution. FastAPI streams tokens over SSE; SqliteSaver checkpointing enables multi-turn memory and resume, with node-level retry policies for reliability. Thoroughly documented with architecture, graph-flow, state-schema, and request-sequence diagrams.

MIRA AI — Conversational Sales Agent

Multi-channel conversational sales AI built on LangGraph, FastAPI, and RAG over 10K+ products. Designed for multi-tenant scale, with 4 paying enterprise customers. Member of the NVIDIA Inception Program (2025). On course for Bangladesh Innovation Grant.

NLAS — Multilingual Livestock Advisory RAG System

Production hybrid RAG chatbot for the Bangladesh Department of Livestock Services. Runs Qwen2.5-7B (Q4_K_M) on a 4 GB GPU. Retrieval combines BGE-M3 dense and BM25 sparse search with RRF fusion and a bge-reranker-v2-m3 cross-encoder; Redis backs the sliding-window session memory; Bangla, English, and Banglish queries are transliterated before retrieval. Achieved a 71% satisfaction rate during beta, measured through the production rating API.
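
Reciprocal Rank Fusion (RRF) merges the dense and sparse result lists using ranks alone, so the two retrievers' raw scores never need to be put on a common scale. The sketch below is a generic implementation rather than the NLAS code: the document IDs are made up, and `k = 60` is the commonly used default, assumed here.

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked lists: each document scores sum(1 / (k + rank))."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up top-3 results from each retriever.
dense_hits = ["d3", "d1", "d2"]   # e.g. from a dense embedding search
sparse_hits = ["d1", "d4", "d3"]  # e.g. from BM25

print(rrf_fuse([dense_hits, sparse_hits]))  # ['d1', 'd3', 'd4', 'd2']
```

Documents that both retrievers rank highly rise to the top, and the fused list can then go to the cross-encoder for the final rerank.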

wis2downloader — WMO Open-Source Contribution

Refactored the World Meteorological Organization's wis2downloader from a monolithic architecture to a distributed Celery + async + spatial-filtering pipeline. Reduced data ingestion latency from ~4 hours to 130 ms. Merged upstream; adopted by Timor-Leste's national meteorological agency and other National Meteorological and Hydrological Services (NMHSs).

Technical Expertise

Skills & Tools

LLMs & Generation

  • Qwen3-VL Fine-tuning (Full FT + QLoRA)
  • Frontier Model Distillation
  • vLLM / Ollama / llama.cpp
  • Q4_K_M Quantization
  • KV Cache Management
  • Streaming SSE Inference

RAG & Retrieval

  • BGE-M3 Dense Retrieval
  • BM25 Sparse Retrieval
  • RRF Fusion
  • Cross-Encoder Reranking
  • ChromaDB / Vector Stores
  • Session Memory Design

Calibration & Evaluation

  • Conformal Prediction
  • Multi-seed Protocols
  • Paired t-tests
  • Ablation Design
  • Capacity / Floor Probes

ML Frameworks

  • PyTorch
  • HuggingFace Transformers
  • LangGraph / LangChain
  • PyMC (Bayesian ML)
  • DeepSpeed

ML Infrastructure

  • Apache Airflow (multi-server)
  • Celery
  • Redis
  • Docker
  • Kubernetes (RKE2 bare-metal)
  • HPC / NVIDIA GH200

Backend & Serving

  • Django / DRF
  • FastAPI
  • uvicorn + async SSE
  • PostgreSQL / PostGIS
  • MQTT

Languages

  • Python
  • SQL
  • JavaScript / TS
  • Java
  • C++

Geospatial

  • xarray
  • GeoPandas / GDAL
  • Rasterio / NetCDF
  • Cloud-Optimised GeoTIFF
  • Martin / TiTiler
  • Deck.gl

Professional Background

Experience

Academic Background

Education

Founded

Ventures

Founder & Lead Architect · Founded July 2025

MIRA AI

Enterprise AI platform built for operational scale.

Founded and building an end-to-end enterprise AI platform. Responsible for full-stack architecture, model development, and customer delivery.

Highlights

NVIDIA Inception member
4 paying enterprise customers

Research Background

Publications

Peer-reviewed papers, conference proceedings, and preprints.

Invited Talks & Presentations

  • 9 October 2025

    "Applications of Sparse-Critical Dense Regression to Agrometeorology"

    Capacity Building Training, National Center of Meteorology · UAE

  • 29 April 2026

    "Climate–Agriculture Risk Modeling: Sri Lanka"

    Climate Services User Forum, South Asian Hydromet Forum (SAHF) · Malé, Maldives

Credentials & Recognition

Awards & Affiliations

BUET · Bachelor of Science in Computer Science and Engineering
NST Fellowship · Bangladesh National Science & Technology Fellowship · 2026
Accelerating Asia · Top 9 of 500+ global startups · 2023
NVIDIA Inception · Member · MIRA AI

Let's Connect

Get in Touch

Whether you're a hiring manager, research collaborator, or prospective PhD supervisor — I'd love to hear from you.

The quickest way to reach me is directly by email. I usually respond within 24 hours.

Email Me

Also find me on

Open To

PhD / Research Positions · ML Engineering Roles · Applied AI & MLOps · Consulting & Advisory · Research Collaborations