Howdy! I’m a Ph.D. candidate in Statistics at Boston University, co-advised by Prof. Debarghya Mukherjee and Prof. Luis Carvalho, and I also collaborate with Prof. Nabarun Deb. Before BU, I earned my M.A. in Statistics from Columbia University and my B.S. in Mathematics from Shandong University, including a year of joint training at the Academy of Mathematics and Systems Science (AMSS), Chinese Academy of Sciences. My research sits at the intersection of statistics and machine learning, where I develop theoretically grounded transfer-learning and representation-learning methods, spanning optimal transport, graph mining, and multimodal learning, for structured, heterogeneous data in low-sample, high-dimensional, and non-IID settings.

The question that keeps me up (in a good way):

How can we reuse past knowledge when the world—and the data—won’t sit still?

In statistical learning, this is about transferring geometry or smoothness from a well-understood source distribution to a smaller, noisier target under shift. In reinforcement learning, the source might be prior trajectories, simulators, or related tasks, while the target is the evolving environment, so we need principled rules for what to keep, what to adapt, and what to forget. And yes! LLMs/VLMs make this even more exciting (and tricky): they already contain a lot of cross-domain knowledge, but the real challenge is extracting and specializing it safely for downstream tasks without overfitting, hallucination, or misalignment.

What I build

THEORY Theory that supports practice
Minimax rates · oracle inequalities · regret bounds · safe-transfer criteria under covariate or structural shift.
GRAPHS Graph-structured transfer
Aligning and transporting information across graphs and manifolds — robust transfer when correspondence is messy or unknown.
RL & BANDITS RL & bandits under drift
Warm-started policies with uncertainty-aware adaptation for reliable sequential decision-making in changing environments.
LLMs & VLMs Transfer for LLMs / VLMs
Controlled adaptation · domain grounding · structure-preserving fine-tuning — so models adapt without getting sloppy.

Curious about my research? View my slide deck on transfer learning.

Along my academic journey, I have been deeply fortunate to study and conduct research under the guidance of inspiring scholars, including Prof. Zhanxing Zhu, whose influential work includes Spatio-Temporal Graph Convolutional Networks (STGCN) for traffic forecasting, and Prof. Yongshun Gong. Their perspectives on deep learning, representation learning, and structured spatio-temporal systems have profoundly shaped how I think about evolving, heterogeneous data, and have guided my pursuit of principled transfer learning methods.

Beyond theory and modeling, I am drawn to building AI applications that reflect how I see people and the world. I have always felt that human beings are more than their outward forms, that something of the spirit, memory, and inner life exceeds the body that temporarily carries it. That is why I am especially fascinated by cinema, atmosphere, and emotionally resonant digital experiences ✨

🔥 News

  • 2025.09: 🎉 My first-author paper “Transfer Learning on Edge Connecting Probability Estimation Under Graphon Model” is accepted at NeurIPS 2025!
  • 2025.08: 🎉 My co-authored paper “Cross-Domain Hyperspectral Image Classification via Mamba-CNN and Knowledge Distillation” is accepted by IEEE TGRS 2025!

📝 Publications

Lead Author

GTrans NeurIPS 2025 Transfer Learning on Edge Connecting Probability Estimation Under Graphon Model  Paper Poster Slides Code
  • First graphon-level transfer without node correspondence — aligns graphs via Gromov–Wasserstein and transfers edge structure nonparametrically.
  • Residual smoothing unlocks small/sparse targets with convergence & stability guarantees; SOTA on link prediction and graph classification.
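To give a flavor of what "estimating edge connecting probabilities under a graphon model" means, here is a minimal numpy sketch of a classical single-graph neighborhood-smoothing baseline: it estimates each edge probability by averaging the adjacency rows of structurally similar nodes. This is only a generic baseline for illustration, not the transfer estimator from the paper; the function name and the toy two-block example are mine.

```python
import numpy as np

def neighborhood_smoothing(A, frac=0.3):
    """Estimate the edge-probability matrix P from one adjacency matrix A
    by averaging each node's row over its most similar nodes (a classical
    graphon baseline, not the paper's transfer method)."""
    n = A.shape[0]
    # distance between connection profiles (rows of A)
    dist = np.linalg.norm(A[:, None, :] - A[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)       # never use a node as its own neighbor
    k = max(1, int(frac * n))
    P = np.empty((n, n))
    for i in range(n):
        nbrs = np.argsort(dist[i])[:k]   # k nodes with the most similar rows
        P[i] = A[nbrs].mean(axis=0)      # smooth row i over those neighbors
    return np.clip((P + P.T) / 2, 0.0, 1.0)  # symmetrize, keep in [0, 1]

# toy check on a 2-block stochastic block model
rng = np.random.default_rng(0)
n, p_in, p_out = 60, 0.8, 0.1
blocks = np.repeat([0, 1], n // 2)
probs = np.where(blocks[:, None] == blocks[None, :], p_in, p_out)
A = (rng.random((n, n)) < probs).astype(float)
A = np.triu(A, 1); A = A + A.T           # symmetric, no self-loops
P_hat = neighborhood_smoothing(A)
```

On this toy block model, the smoothed within-block estimates sit near 0.8 and the between-block estimates near 0.1, whereas the raw adjacency only contains 0/1 entries.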
Phase Transition Under Review Phase Transition in Nonparametric Minimax Rates for Covariate Shifts on Approximate Manifolds  arXiv Poster Slides Code
  • New minimax theory for "near-manifold" shift: exposes a sharp phase transition controlled by the support gap between target and source neighborhoods — unifying multiple geometric-transfer regimes.
  • Ratio-free, adaptive estimator: achieves near-optimal, dimension-adaptive rates without density ratios and without assuming known geometry (works under approximate manifold mismatch).
TESS Under Review From Text to Forecasts: Bridging Modality Gap with Temporal Evolution Semantic Space  arXiv
  • Bridges the text–time-series modality gap: introduces a Temporal Evolution Semantic Space (TESS) that distills free-form text into interpretable temporal primitives (mean shift, volatility, shape, lag), instead of directly fusing noisy token embeddings.
  • LLM-guided yet numerically grounded forecasting: uses structured prompting + confidence-aware gating to inject reliable semantic signals as prefix tokens into a Transformer forecaster, yielding robust gains under event-driven non-stationarity (up to 29% error reduction).
SCOT Under Review SCOT: Multi-Source Cross-City Transfer with Optimal-Transport Soft-Correspondence Objectives  arXiv
  • Sinkhorn entropic-OT coupling enables many-to-many region alignment across cities — no node matching required.
  • OT-weighted contrastive loss + target-aware prototype hub prevents collapse and scales cleanly to multi-source heterogeneity.
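The "soft correspondence" idea above rests on entropic optimal transport: Sinkhorn iterations produce a dense coupling matrix rather than a hard one-to-one matching. A minimal sketch, with hypothetical region features standing in for two cities (the function and variable names are illustrative, not from the paper's code):

```python
import numpy as np

def sinkhorn_coupling(C, a, b, eps=0.5, n_iter=500):
    """Entropic OT via Sinkhorn iterations: returns a soft, many-to-many
    coupling T >= 0 whose row/column sums match the marginals a and b."""
    K = np.exp(-C / eps)         # Gibbs kernel of the cost matrix
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)        # scale columns toward marginal b
        u = a / (K @ v)          # scale rows toward marginal a
    return u[:, None] * K * v[None, :]

# hypothetical example: region feature vectors from two cities
rng = np.random.default_rng(1)
X, Y = rng.normal(size=(5, 3)), rng.normal(size=(7, 3))
C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)   # squared distances
C = C / C.max()                                          # normalize the cost
a = np.full(5, 1 / 5); b = np.full(7, 1 / 7)
T = sinkhorn_coupling(C, a, b)
```

Each row of `T` spreads a source region's mass over several target regions, which is exactly the many-to-many alignment that makes node matching unnecessary.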
INCM Under Review INCM: INConsistency-aware Multi-modal Recommendation with Cross-Modal Hard Negatives
  • Inconsistency-aware multimodal ranking: studies how cross-modal discrepancies may provide complementary ranking evidence or degrade fusion quality — explicitly modeled in training.
  • Cross-modal hard negatives + synergy-aware ranking loss: proposes CHNS to mine modality-specific hard negatives across branches, and a Synergy-aware BPR loss to ensure the fused branch achieves stronger preference margins than unimodal branches.
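For readers unfamiliar with the BPR objective that the synergy-aware loss builds on, here is the standard pairwise form, written stably with `logaddexp`. This is plain BPR for context only, not the paper's synergy-aware variant:

```python
import numpy as np

def bpr_loss(pos_scores, neg_scores):
    """Standard Bayesian Personalized Ranking loss: for each user, push the
    positive item's score above the sampled negative's.  Computes
    mean(-log sigmoid(s_pos - s_neg)) via the stable logaddexp identity."""
    diff = np.asarray(pos_scores) - np.asarray(neg_scores)
    return np.logaddexp(0.0, -diff).mean()
```

A large positive margin drives the loss toward zero, while equal scores give log 2; a synergy-aware variant additionally constrains how the fused branch's margins compare to the unimodal branches'.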

Co-author

MKDNet IEEE TGRS 2025 Cross-Domain Hyperspectral Image Classification via Mamba-CNN and Knowledge Distillation  IEEE Slides
  • Hybrid spectral–spatial modeling for domain shift: integrates a Mamba-based global spectral encoder with CNN local feature extraction, capturing long-range dependencies while preserving fine-grained spatial structure.
  • Dual-level transfer via distillation + graph alignment: performs teacher–student knowledge distillation for distribution alignment and OT-guided graph consistency across domains, yielding robust cross-domain generalization under severe spectral mismatch.
SSGP Under Review Semantic Scientific Graph Pruning for Reliable Agentic Paper Reproduction  arXiv
  • SSGP prunes dense scientific graphs into task-adaptive subgraphs via rank-based ensemble scoring — drastically shrinks agent search space.
  • Reuse–patch execution + confidence-weighted aggregation boosts reproducibility, stability, and success rate of LLM scientific agents.

🤖 LLM Engineering Projects

AGENT Traffic Bot Detection Code
Isolation Forest · GBM · LLM fingerprint scorer · CSIC 2010
AGENT CausalLens: LLM-Augmented Causal Pipeline Code
DoWhy · Double ML · Causal Forest · Claude API · Streamlit
RAG GraphRAG: Multimodal RAG Code
dense + entity graph + CLIP · FastAPI · ChromaDB
RAG Adaptive RAG Code
query routing · iterative retrieval · self-check · FastAPI
RAG RAGAudit: Hallucination Detection Code PDF
BM25+FAISS · NLI · SelfCheckGPT · sem. entropy · Mistral-7B
ALIGN TuneShift: Fine-Tuning Code
instruction · dialogue · LoRA / QLoRA · domain transfer
ALIGN AlignDPO Code PDF
DPO · IPO · KTO · QLoRA · Mistral-7B · HH-RLHF
ALIGN RLHF with PPO Code PDF
GPT-2 · GAE · clipped PPO · adaptive KL · W&B
CORE Mini LLM Pre-Training Code PDF
10.7M GPT · PyTorch · PPL 65 → 4.7 @ 5k iters
CORE HQQ: 1-bit Quantization Code
1–8 bit · proximal opt · W1G64: 12.7× · >4× speedup
CORE DraftVerify: Speculative Decoding Code
draft + verifier · latency · throughput · acceptance
CAUSAL Causal Promotion Optimization Code PDF
AIPW · LightGBM · DRLearner CATE · OR-Tools · FastAPI
CAUSAL Congestion Pricing Analyzer Code PDF
TWFE · CS-DiD · Synth DiD · Double ML · 12M+ NYC TLC
ML Demand Forecasting Code PDF
Seasonal Naive · LightGBM · TFT · M5 · 28-day · store-SKU
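As one example of what these projects involve under the hood, the DraftVerify project above uses the accept/reject rule at the heart of speculative decoding. A toy three-token sketch of the general technique (not that repo's implementation; `p`, `q`, and `speculative_step` are illustrative names): the draft model proposes from `q`, the verifier accepts with probability `min(1, p/q)`, and rejections are resampled from the residual, so accepted tokens are distributed exactly according to the target `p`.

```python
import numpy as np

def speculative_step(p, q, rng):
    """One accept/reject step of speculative decoding over a toy vocabulary:
    a cheap draft model samples from q; the target model defines p."""
    x = rng.choice(len(q), p=q)                 # draft proposes token x
    if rng.random() < min(1.0, p[x] / q[x]):
        return x                                # verifier accepts the draft
    residual = np.maximum(p - q, 0.0)           # rejected: resample from the
    return rng.choice(len(p), p=residual / residual.sum())  # residual of p

rng = np.random.default_rng(0)
p = np.array([0.5, 0.3, 0.2])   # toy target next-token distribution
q = np.array([0.3, 0.5, 0.2])   # toy draft next-token distribution
samples = [speculative_step(p, q, rng) for _ in range(20000)]
freqs = np.bincount(samples, minlength=3) / 20000
```

Over many steps the empirical token frequencies match `p`, which is why speculative decoding trades latency for throughput without changing the target model's output distribution.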

📖 Education

  • 2021.09 – Now: Ph.D. in Statistics, Boston University

  • 2019.09 – 2020.05: M.A. in Statistics (Data Science Track), Columbia University

  • 2018.05 – 2019.06: Joint training in Mathematics, Academy of Mathematics and Systems Science (AMSS), Chinese Academy of Sciences (Jointly Supervised Talent Program)

  • 2015.09 – 2019.06: B.S. in Mathematics, Shandong University

💻 Internships

Plymouth Rock

Data Scientist Intern · Plymouth Rock Insurance
📍 Boston, MA  ·  🗓️ May 2025 – Aug 2025

  • Architected an end-to-end AWS SageMaker pipeline for property-level loss prediction using an XGBoost Tweedie model on multi-million-policy data, lifting Gini by +4.3% over the production baseline and directly improving underwriting risk segmentation.

  • Pioneered an LLM-powered visual risk scoring system combining GPT-4o multimodal reasoning with Google Street View imagery to capture previously unobservable property features (roof condition, surroundings, hazards); integrated outputs into downstream actuarial pricing models as a novel signal layer.


✨ My Apps

MBTI Vibe PERSONA MBTI Vibe
What kind of personality atmosphere does this content radiate?

A multimodal AI app that reads text and images — captions, poems, screenshots, moodboards, all those tiny digital traces of self-expression — and whispers back the MBTI vibe it gives off. It doesn't claim to tell you who someone really is. It asks a softer, kinder question instead.

Try it Code


What If Cinema CINEMA What If Cinema
Rewrite a film's final heartbeat.

For anyone who has ever left a movie wondering — what if it ended differently? Insert a new scene, change one choice, shift a single moment. It stays close to the emotional soul of the original — preserving its tone, longing, and ache — while imagining endings that feel tender, devastating, hopeful, or quietly healing. Not every ending needs to be undone. But some deserve to be imagined differently.

Try it Code


Letters from the Screen LETTER Letters from the Screen
A healing letter from the character you need most.

Share what is on your heart, and receive a letter from the movie or TV character who would understand. More love letter than chatbot — intimate, tender, just a little magical. The comfort lies in its emotional closeness: it offers not advice, but presence. The rare feeling that a voice from another story has stepped out of the screen to sit beside you for a while.

Try it Code


If You Disappeared on a Trip ESCAPE If You Disappeared on a Trip
A small borrowed life in another city.

For anyone who has ever wanted to slip away for a day or two. Instead of a typical itinerary, it builds a small borrowed life in another city based on your mood — cinematic, soothing, touched with a bit of humor. Not because it solves anything, but because it gives you a brief imaginative place to rest, wander, and feel held by another version of life. Scenes, snacks, photographs, and the kind of inner weather that only changes when you leave home.

Try it Code


Souvenirs of a Life Not Yet Lived KEEPSAKE Souvenirs of a Life Not Yet Lived
A private museum of the lives you almost lived.

Step into a curated archive of parallel selves. Instead of planning a trip, it generates a small, emotionally resonant keepsake — a ticket, a postcard, a receipt, a note — from a life you've been quietly standing outside of. Cinematic, intimate, collectible. An invitation to imagine not just another city, but another self you might still be growing toward.

Try it Code


The Map of Me ATLAS The Map of Me
Part atlas, part city shelf, part personal collection.

A map-based cultural discovery app for collecting meaningful places and exploring what lives around them beyond geography alone. After adding a city, you move from the global atlas into a curated layer of screen references, books, and local landmarks — so a place becomes not just somewhere on the map, but a small world of its own. Editorial, archive-inspired, cultural browsing as cartography.

Try it Code


A Room in Macondo LITERARY A Room in Macondo
Step inside the weather of One Hundred Years of Solitude.

An atmospheric AI literary experience inspired by García Márquez's rain-soaked world of memory, fire, butterflies, and magical realism. Rather than retelling the novel, it invites you into a room, a ritual, and a fate of your own. A small sequence of questions transforms mood into a story fragment that feels less like generated text and more like something recovered from an old archive in Macondo itself. Dark-gold palette, burning-paper visuals, haunted editorial layout.

Try it Code

🎖 Honors

  • 2025: Student Travel Grant, Boston University
  • 2025: Ralph B. D’Agostino Endowed Fellowship, Boston University
  • 2025: Outstanding Teaching Fellow Award, Boston University

  • 2019: Outstanding Graduate, Shandong University

  • 2018: Hua Loo-Keng Scholarship, Chinese Academy of Sciences
  • 2018: National Gold Award, Internet+ Innovation & Entrepreneurship Competition
  • 2018: First-Class Scholarship, Shandong University
  • 2018: Outstanding Student Leader, Shandong University

📂 DS Projects

CV Dog Classification Code Demo
VGG16 · ResNet50 · Flask · 75.48%
ML Credit Risk Code
XGBoost · SMOTE · AUC 0.976
CV Pedestrian Detection Code
Fast R-CNN · Siamese · few-shot
NLP Financial Sentiment Code Demo
DistilBERT · 85% · 30%↑ speed
CV Mask Detection Code
ResNet50 · Grad-CAM · 94%
NLP Spam Detection Code
TF-IDF · NB · P 96 / R 94
APP Airbnb Dashboard Code Demo
R Shiny · maps · filtering
STATS Bayesian Logistic Code Demo
RStan · Spike-and-Slab · MCMC
STATS A/B Testing Code
Bootstrap · power · +15% conv.
TS Time Series Forecast Code Demo
SARIMA · ETS · Prophet
ML Movie Recommendation Code
ALS · SVD · +15% / −20%
ML Customer Segmentation Code
K-Means · elbow · silhouette

📝 Service & Teaching

Presentations  ·  CIKM 2024, NeurIPS 2025
Reviewer  ·  CIKM 2025, ICME 2026, ICML 2026, KDD 2026
Instructor @ Boston University  ·  MA 582 Mathematical Statistics, MA 113 Elementary Statistics
TA @ Boston University  ·  MA 575 Generalized Linear Models, MA 582, MA 415 Data Science in R, MA 214 Applied Stats

🎨 Interests

🎵 Mandarin R&B loyalist — Leehom Wang, David Tao, Khalil Fong 🦋, Dean Ting

🎹 Trained in piano, calligraphy, and ink painting

🏞️ National park lover · 🫧 lake admirer · 🌅 opacarophile — welcome to my Gallery