 
Walking through the modules step by step
 

VIT-VAE

 

1. Attention

Two attention variants are used here: TemporalAxialAttention (over the time dimension of the video sequence) and SpatialAxialAttention.
 
D → h (heads) × d: the embedding dimension is split across attention heads
 
For the temporal variant, attention is applied along the temporal axis (T): the model learns how the information at different time steps relates.
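A minimal sketch of the idea (assumed shapes and names, not the exact implementation from the repo): the spatial positions H and W are folded into the batch dimension, D is split into heads × d, and attention runs only along T.

```python
import torch
import torch.nn.functional as F

def temporal_axial_attention(x, heads=4, is_causal=True):
    # x: (B, T, H, W, D) video features; a sketch, assuming q = k = v = x
    B, T, H, W, D = x.shape
    d = D // heads  # D -> heads * d
    # (B, T, H, W, D) -> (B*H*W, heads, T, d): spatial dims join the batch,
    # so each spatial position attends only over its own time axis
    t = x.permute(0, 2, 3, 1, 4).reshape(B * H * W, T, heads, d).transpose(1, 2)
    out = F.scaled_dot_product_attention(t, t, t, is_causal=is_causal)
    # undo the reshapes back to (B, T, H, W, D)
    return out.transpose(1, 2).reshape(B, H, W, T, D).permute(0, 3, 1, 2, 4)

x = torch.randn(2, 8, 4, 4, 32)
print(temporal_axial_attention(x).shape)  # torch.Size([2, 8, 4, 4, 32])
```

With `is_causal=True`, time step 0 can only attend to itself, so its output equals its input value vector.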
 
The two variants use rotary_emb differently.
 
  • The rotary_emb module embeds the queries (q) and keys (k) with positional frequency information, which helps capture periodic or sequential dependencies.
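A minimal 1-D rotary embedding sketch (a hypothetical helper, not the API of the rotary embedding library used in the original code): pairs of channels in q and k are rotated by position-dependent angles, so their dot products depend on relative position.

```python
import torch

def rope(x, base=10000.0):
    # x: (..., T, d) with d even; rotate channel pairs by position-dependent angles
    *_, T, d = x.shape
    pos = torch.arange(T, dtype=torch.float32)
    freqs = base ** (-torch.arange(0, d, 2, dtype=torch.float32) / d)
    angles = pos[:, None] * freqs[None, :]  # (T, d/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin  # 2-D rotation of each (x1, x2) pair
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```

Position 0 is rotated by angle 0 (left unchanged), and each rotation preserves the vector norm; the temporal variant would apply this over the T index, while the spatial variant needs a 2-D (H, W) position scheme instead.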
 
is_causal == True: Only attend to past time steps
 
Spatial → attention is computed over H and W, capturing dependencies between different spatial locations.
 
 
is_causal == False: Attend to the whole image
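The spatial variant can be sketched the same way (assumed shapes, not the repo's exact code): here T is folded into the batch, all H×W positions of a frame attend to each other, and no causal mask is applied.

```python
import torch
import torch.nn.functional as F

def spatial_axial_attention(x, heads=4):
    # x: (B, T, H, W, D) video features; a sketch, assuming q = k = v = x
    B, T, H, W, D = x.shape
    d = D // heads  # D -> heads * d
    # (B, T, H, W, D) -> (B*T, heads, H*W, d): each frame is one sequence
    # of H*W spatial tokens attending to the whole image
    t = x.reshape(B * T, H * W, heads, d).transpose(1, 2)
    out = F.scaled_dot_product_attention(t, t, t, is_causal=False)
    return out.transpose(1, 2).reshape(B, T, H, W, D)

x = torch.randn(2, 8, 4, 4, 32)
print(spatial_axial_attention(x).shape)  # torch.Size([2, 8, 4, 4, 32])
```

Alternating the two axial passes gives each token an effective receptive field over both space and time at far lower cost than full (T·H·W)² attention.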
 
 
 
 
 
 
 
 
Recommended: the authors of diffusion-forcing — the source of the rotary_emb attention code here.
Keywords: generative 3D
 