This section focuses on fundamental approaches and applications that train models to align with human preferences. A minimal sketch of the preference objective underlying several of these works follows the table.
Title | Venue | Year | Code | Keywords |
---|---|---|---|---|
RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback | CVPR | 2024 | Official | RLHF-V |
Diffusion Model Alignment Using Direct Preference Optimization | CVPR | 2024 | Official | DiffusionDPO |
Training Diffusion Models with Reinforcement Learning | ICLR | 2024 | Official | DDPO |
RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback | ICML | 2024 | Official | RL-VLM-F |
Aligning Diffusion Models by Optimizing Human Utility | NeurIPS | 2024 | Official | Diffusion-KTO |
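Several of the works above (DiffusionDPO, Diffusion-KTO) build on direct preference optimization, which removes the explicit reward model and trains directly on preferred/dispreferred response pairs. Below is a minimal PyTorch sketch of the standard DPO objective, assuming the response log-probabilities have already been computed; the function name and argument layout are illustrative, and the diffusion variants adapt this contrastive form to denoising losses rather than sequence likelihoods.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Standard DPO objective over a batch of preference pairs.

    Each input is the summed log-probability of a whole response
    (chosen = human-preferred, rejected = dispreferred) under the
    trainable policy or the frozen reference model.
    """
    # Implicit reward of each response: how far the policy has moved
    # from the reference model on that response.
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # Push the chosen margin above the rejected one; beta controls
    # how strongly the policy may drift from the reference.
    margin = beta * (chosen_logratio - rejected_logratio)
    return -F.logsigmoid(margin).mean()
```

The remaining tables turn from preference alignment to dataset distillation. The following surveys give a broad overview of that field: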
Title | Venue | Year |
---|---|---|
Data Distillation: A Survey | TMLR | 2023 |
A Comprehensive Survey of Dataset Distillation | T-PAMI | 2024 |
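The following works develop the core (mostly image-level) distillation and condensation methods; a minimal sketch of the gradient-matching objective listed here appears after the table.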
Title | Venue | Year | Code | Keywords |
---|---|---|---|---|
Dataset Distillation | arXiv | 2018 | Non-Official | |
Dataset Condensation with Gradient Matching | ICLR | 2021 | Official | gradient matching |
CAFE: Learning to Condense Dataset by Aligning Features | CVPR | 2022 | Official | CAFE |
Dataset Distillation by Matching Training Trajectories | CVPR | 2022 | Official | MTT, trajectory matching |
Towards Lossless Dataset Distillation via Difficulty-Aligned Trajectory Matching | ICLR | 2024 | Official | lossless |
Multisize Dataset Condensation | ICLR | 2024 | Official | multisize |
Embarrassingly Simple Dataset Distillation | ICLR | 2024 | Official | RaT-BPTT |
D4M: Dataset Distillation via Disentangled Diffusion Model | CVPR | 2024 | Official | D4M |
Dataset Distillation by Automatic Training Trajectories | ECCV | 2024 | Official | ATT |
Elucidating the Design Space of Dataset Condensation | NeurIPS | 2024 | Official | EDC |
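As a concrete example, gradient matching (the "Dataset Condensation with Gradient Matching" entry above) optimizes a small synthetic set so that the gradients a network computes on it mimic the gradients computed on real batches. The sketch below is a simplified single-step version: the network, the batch tensors, and the learnable synthetic images (`syn_x` with `requires_grad=True`) are illustrative, and the flattened per-tensor cosine distance simplifies the paper's per-output-group distance.

```python
import torch
import torch.nn.functional as F

def gradient_matching_loss(net, real_x, real_y, syn_x, syn_y):
    """Single matching step in the style of dataset condensation with
    gradient matching (DC): the synthetic set is optimized so that the
    gradients it induces mimic those of a real batch.
    """
    params = [p for p in net.parameters() if p.requires_grad]

    # Gradients on the real batch serve as fixed targets.
    g_real = torch.autograd.grad(
        F.cross_entropy(net(real_x), real_y), params)
    g_real = [g.detach() for g in g_real]

    # Gradients on the synthetic batch stay differentiable w.r.t.
    # syn_x, so the matching loss can update the synthetic images.
    g_syn = torch.autograd.grad(
        F.cross_entropy(net(syn_x), syn_y), params, create_graph=True)

    # Simplified distance: one cosine distance per parameter tensor.
    loss = syn_x.new_zeros(())
    for gr, gs in zip(g_real, g_syn):
        loss = loss + (1 - F.cosine_similarity(
            gr.flatten(), gs.flatten(), dim=0))
    return loss
```

In the full method this matching step alternates with training the network on the synthetic set and is repeated across many randomly initialized networks. The next table gathers distillation work beyond image classification, covering text, vision-language, and video data: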
Title | Venue | Year | Code | Keywords |
---|---|---|---|---|
Dataset Distillation with Attention Labels for Fine-tuning BERT | ACL | 2023 | Official | |
Vision-Language Dataset Distillation | TMLR | 2024 | Official | |
Low-Rank Similarity Mining for Multimodal Dataset Distillation | ICML | 2024 | Official | LoRS |
Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement | CVPR | 2024 | Official | |
DiLM: Distilling Dataset into Language Model for Text-level Dataset Distillation | NAACL | 2024 | Official | DiLM |
Textual Dataset Distillation via Language Model Embedding | EMNLP | 2024 | N/A | |
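The final table collects recent work that revisits the basic units language models operate on, ranging from tokenized model parameters to sentence-level concepts and byte patches: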
Title | Venue | Year | Code | Keywords |
---|---|---|---|---|
TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters | arXiv | 2024 | Official | TokenFormer |
Large Concept Models: Language modeling in a sentence representation space | arXiv | 2024 | Official | LCM |
Byte Latent Transformer: Patches Scale Better Than Tokens | arXiv | 2024 | Official | BLT |
Thanks to the following repositories: