
Dataset

Fine Tuning

Training configuration (the key arguments; a code sketch follows this list):
  • dataset_text_field="text": Specifies the field in the dataset containing the text.
  • max_seq_length=512: Maximum sequence length for input texts.
  • per_device_train_batch_size=2: Batch size per GPU.
  • gradient_accumulation_steps=4: Number of steps to accumulate gradients before updating.
  • optim="paged_adamw_32bit": Specifies the optimizer.
  • save_steps=50: Save a checkpoint every 50 steps.
  • logging_steps=5: Log training metrics every 5 steps.
  • learning_rate=2e-4: The learning rate for training.
  • fp16=True: Use mixed precision training.
  • max_grad_norm=0.3: Maximum gradient norm for gradient clipping.
  • max_steps=200: Total number of training steps.
  • warmup_ratio=0.03: Portion of training steps used for learning rate warmup.
  • lr_scheduler_type="linear": Type of learning rate scheduler.
  • gradient_checkpointing=True: Use gradient checkpointing to save memory.
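These look like the arguments of trl's SFTTrainer combined with transformers' TrainingArguments. Below is a minimal sketch of how they might be wired together, assuming a model, tokenizer and dataset are already prepared; the base model name, dataset file and output directory are placeholders, and in recent trl versions dataset_text_field / max_seq_length move into SFTConfig instead.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

# Placeholders -- substitute the actual base model and dataset used here.
base_model = "meta-llama/Llama-2-7b-hf"
dataset = load_dataset("json", data_files="train.json", split="train")

tokenizer = AutoTokenizer.from_pretrained(base_model)
# Quantization / LoRA config omitted here; see the NF4 sketch in the next section.
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")

training_args = TrainingArguments(
    output_dir="./results",             # placeholder output directory
    per_device_train_batch_size=2,      # batch size per GPU
    gradient_accumulation_steps=4,      # effective batch size = 2 * 4
    optim="paged_adamw_32bit",
    save_steps=50,                      # checkpoint every 50 steps
    logging_steps=5,
    learning_rate=2e-4,
    fp16=True,                          # mixed-precision training
    max_grad_norm=0.3,                  # gradient clipping
    max_steps=200,
    warmup_ratio=0.03,
    lr_scheduler_type="linear",
    gradient_checkpointing=True,        # trade compute for memory
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    dataset_text_field="text",          # column holding the training text
    max_seq_length=512,
    tokenizer=tokenizer,
    args=training_args,
)
trainer.train()
```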
 
 
Merging with llama.cpp
 
The LoRA adapter was trained in fp16, while during fine-tuning the base model was quantized to NF4 (the weights are normalized and mapped to the nearest of 16 levels, shrinking each parameter from 16 bits to 4 bits), so the adapter cannot simply be folded back into the 4-bit weights.
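For reference, a minimal sketch of this QLoRA-style loading (an NF4-quantized base with fp16 compute) using bitsandbytes; the model name is a placeholder:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4: weights are normalized and mapped to 16 quantization levels (4 bits each),
# while the LoRA adapter and the compute dtype stay in fp16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",   # placeholder base model
    quantization_config=bnb_config,
    device_map="auto",
)
```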
 
Here we choose to merge the two fp16 parts first (base model and LoRA weights) and only then quantize the result for inference.
This way, a model of a size we originally could not even run can now be both fine-tuned and used for inference.
 
1. Convert the original fp16 model to GGUF (convert_hf_to_gguf)
2. Convert the LoRA weights to GGUF (convert_lora_to_gguf)
3. Merge the two (this step requires adding llama.cpp/bin/ to PATH)
4. Quantize the merged model for inference
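A sketch of these four steps driven from Python with subprocess. The directory paths, output filenames and the Q4_K_M type are placeholders, and the tool and flag names (convert_hf_to_gguf.py, convert_lora_to_gguf.py, llama-export-lora, llama-quantize) are those of a recent llama.cpp build, so check them against your checkout:

```python
import subprocess

# Placeholder paths -- point these at the actual HF model and LoRA adapter dirs.
BASE_HF_DIR = "path/to/base-model-hf"
LORA_DIR = "path/to/lora-adapter"

def run(cmd):
    print(">>", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. base model (fp16) -> GGUF
run(["python", "llama.cpp/convert_hf_to_gguf.py", BASE_HF_DIR,
     "--outtype", "f16", "--outfile", "base-f16.gguf"])

# 2. LoRA weights -> GGUF
run(["python", "llama.cpp/convert_lora_to_gguf.py", LORA_DIR,
     "--base", BASE_HF_DIR, "--outtype", "f16", "--outfile", "lora-f16.gguf"])

# 3. merge the two fp16 GGUF files (llama-export-lora comes from the llama.cpp
#    build directory that was added to PATH)
run(["llama-export-lora", "-m", "base-f16.gguf",
     "--lora", "lora-f16.gguf", "-o", "merged-f16.gguf"])

# 4. quantize the merged model for inference (Q4_K_M is just one common choice)
run(["llama-quantize", "merged-f16.gguf", "merged-q4_k_m.gguf", "Q4_K_M"])
```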
 
 
 
 
Why left padding?
 
The article below explains this very well.
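In short (my own sketch, not taken from the linked article): decoder-only models generate by continuing from the last token of each prompt, so batched generation pads on the left; otherwise the model would be asked to continue from padding tokens. A minimal example with a placeholder model:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")   # placeholder model
tokenizer.pad_token = tokenizer.eos_token           # gpt2 has no pad token
tokenizer.padding_side = "left"                     # pad on the left for generation

batch = tokenizer(["Hello", "A much longer prompt"], padding=True, return_tensors="pt")
# With left padding every row ends on its real last prompt token, so generate()
# continues from the prompt; with right padding it would continue from pad/eos fillers.
```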