HenryShan (Haotian Shan)

Article

Decoding Strategies in Large Language Models

Oct 29, 2024

•

102

Article

Efficient LLM Pretraining: Packed Sequences and Masked Attention

Oct 7, 2024

•

64

Article

Document Similarity Search with ColPali

Sep 21, 2024

•

52

Article

The Environmental Impacts of AI -- Primer

Sep 3, 2024

•

45

Article

RAG vs Fine-Tuning for LLMs: A Comprehensive Guide with Examples

Aug 16, 2024

•

10

Article

RegMix: Data Mixture as Regression for Language Model Pre-training

Jul 11, 2024

•

15

Article

makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch

May 7, 2024

•

112

Article

Merge Large Language Models with mergekit

Jan 9, 2024

•

147

Article

Deploying Your FastAPI Applications on Huggingface Via Docker

Dec 11, 2023

•

40

Article

4D masks support in Transformers

Jan 8, 2024

•

31

Article

Better RAG 3: The text is your friend

Mar 14, 2024

•

13

Article

Multilabel Classification using Mistral-7B on a single GPU with quantization and LoRA

Jan 22, 2024

•

26

Article

🕳️ Attention Sinks in LLMs for endless fluency

Oct 9, 2023

•

34

Article

Introduction to Quantization cooked in 🤗 with 💗🧑‍🍳

Aug 25, 2023

•

38

Article

Sensitivity Aware Mixed Precision Quantization V1

Jun 13, 2025

•

25

Article

Holo1: New family of GUI automation VLMs powering GUI agent Surfer-H

Jun 3, 2025

•

71

Article

Good answers are not necessarily factual answers: an analysis of hallucination in leading LLMs

May 7, 2025

•

42

Article

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

Feb 11, 2025

•

98

Article

G2P Shrinks Speech Models

Feb 5, 2025

•

83

Article

Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face

Feb 11, 2025

•

94

Haotian Shan

AI & ML interests

Organizations

HenryShan's activity
