Micah Zhang

ML Research Engineer • LLM-Based Apprenticeship • Human-Centered AI • Adaptive Inference

I am an ML research engineer studying how large language models can provide cognitive scaffolding for difficult human work while preserving agency and supporting long-term growth. My current work focuses on LLM-based apprenticeship, adaptive inference, and budgeted refinement methods for research ideation and other high-friction forms of knowledge work.

LLM-Based Apprenticeship • Cognitive Scaffolding • Human-Centered AI • Research Ideation • Adaptive Inference • Evaluation

About

My research interests center on language model systems that help people do difficult cognitive work without replacing human agency. I am especially interested in when AI systems should provide more guidance, refine a candidate more deeply, structure feedback, or support a learner through a difficult step.

I am currently preparing PhD applications to study human-centered LLM systems, support allocation, adaptive inference, and evaluations that distinguish AI as substitution from AI as scaffolding.

Research Vision

I want to build and study LLM-based apprenticeship systems: AI systems that help people bridge the gap between potential and present capability, persist through difficult work, and become more capable over time. I believe support changes outcomes, and I want my research to make that claim technically precise and practically real.

Research Interests

LLM-based apprenticeship and cognitive scaffolding
Human-centered AI systems that preserve and grow agency
Research ideation, support allocation, and budgeted refinement
Adaptive inference, test-time compute, and quality-cost tradeoffs
Evaluation of whether AI support improves human capability over time

Selected Work

Budgeted Subset Refinement for LLM Research Ideation

Current independent manuscript and research prototype

This project studies a conditional support-allocation problem: given a noisy pool of LLM-generated research ideas, how should limited refinement effort be allocated to produce a stronger, more diverse, more execution-ready portfolio? The work compares raw generation, reranking, uniform refinement, random subset refinement, diversity-aware refinement, and two-stage micro-triage plus heavy refinement.
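As a rough illustration of the diversity-aware refinement baseline named above, the sketch below greedily picks which ideas receive the limited refinement budget. It is not the project's code: the `Idea` fields, the `select_for_refinement` function, the cosine-similarity redundancy penalty, and the `diversity_weight` parameter are all assumptions made for this example, and the actual manuscript may define selection quite differently.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Idea:
    text: str
    score: float             # hypothetical quality estimate, higher is better
    embedding: list[float]   # hypothetical feature vector for the idea


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_for_refinement(pool, budget, diversity_weight=0.5):
    """Greedily choose up to `budget` ideas, trading quality against redundancy.

    Each step picks the idea with the best weighted score after subtracting
    its highest similarity to anything already selected (an MMR-style rule,
    assumed here purely for illustration).
    """
    selected = []
    remaining = list(pool)
    while remaining and len(selected) < budget:
        def gain(idea):
            redundancy = max(
                (cosine(idea.embedding, s.embedding) for s in selected),
                default=0.0,
            )
            return (1 - diversity_weight) * idea.score - diversity_weight * redundancy

        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


pool = [
    Idea("idea A", 0.90, [1.0, 0.0]),
    Idea("idea B (near-duplicate of A)", 0.85, [1.0, 0.01]),
    Idea("idea C", 0.60, [0.0, 1.0]),
]
chosen = select_for_refinement(pool, budget=2)
print([idea.text for idea in chosen])
```

With a budget of two, this toy pool sends idea A and idea C to refinement and skips the near-duplicate B; setting `diversity_weight=0` reduces the rule to plain score-based reranking.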

Code sample

HALO: Hybrid Adaptive Latent Reasoning for Language Models

Manuscript approved for public release; preprint pending

HALO studies adaptive refinement for improving language-model quality-compute tradeoffs. The project investigates selective latent refinement and controller-based allocation of additional computation for frozen language models.

FtG-CoT at SemEval-2024 Task 9: Solving Sentence Puzzles Using Fine-Tuned Language Models and Zero-Shot CoT Prompting

SemEval 2024 publication

This paper introduces Fine-tuned Generated Chain-of-Thought, a method combining a fine-tuned BERT encoder, zero-shot chain-of-thought generation, and a fine-tuned LLM for solving BRAINTEASER sentence puzzles.

ACL Anthology PDF