 

Diagonal Gaussian Dist

 
Samples a point from a normal distribution with the given mean and std.
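A minimal sketch of that sampling step, written with NumPy instead of PyTorch to keep it dependency-light. The function name `sample_diagonal_gaussian` and the `deterministic` flag are my own placeholders, not names taken from the open-oasis code.

```python
import numpy as np

def sample_diagonal_gaussian(mean, std, rng, deterministic=False):
    """Draw x = mean + std * eps, with eps ~ N(0, I) drawn independently per dimension."""
    if deterministic:
        # No noise at all: the "sample" is just the mean.
        return mean.copy()
    eps = rng.standard_normal(mean.shape)
    return mean + std * eps

rng = np.random.default_rng(0)
mean = np.array([0.0, 1.0, -2.0])
std = np.array([1.0, 0.5, 0.1])
x = sample_diagonal_gaussian(mean, std, rng)
```

"Diagonal" here just means each dimension has its own std and there is no covariance between dimensions, which is why an elementwise multiply is enough.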
 
 

Attention

 
Still uses rotary_emb, here for pixels.
 
Note that dim = head_dim // 4 here (?)
Each pixel gets a frequency embedding based on its position.
 

Attention Block

Note that mod (modulation) and activation are no longer used here; only norm remains.
 
 

Autoencoder KL

 
input_image → patch_emb
 
encoder = enc_depth * AttentionBlock(enc_dim, enc_heads)
 
bottleneck:
 
decoder = dec_depth * AttentionBlock(dec_dim, dec_heads)
 
self.predictor = nn.Linear(dec_dim, self.patch_dim)
*self.patch_dim = 3 * patch_size**2
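A quick arithmetic check of the `patch_dim` formula. The helper `patch_dim` and the `channels=3` default (RGB) are placeholders of mine; the value 20 is the patch size mentioned further down in these notes.

```python
def patch_dim(patch_size, channels=3):
    """Number of values in one flattened patch: channels * patch_size**2."""
    return channels * patch_size ** 2

# With 20*20 RGB patches, the predictor emits 1200 values per token.
out_features = patch_dim(20)
```

So the predictor maps each decoder token of width dec_dim back to the raw pixel values of one patch.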
 
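After the encoder:
 
Sample from moments; if not variational → deterministic.
var is set to zeros_like(moments), and all of moments is used as the mean.
If variational, the existing moments are split into two halves? But then wouldn't the sampled x have a different shape?
 
→ Because without var, no sampling is done.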
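To make the shape question concrete, here is a rough NumPy sketch of the two branches. The function name `posterior_params` is hypothetical, and treating the second half of moments as a log-variance is my assumption (a common convention), not something confirmed by these notes.

```python
import numpy as np

def posterior_params(moments, variational):
    """Return (mean, std) for the latent, given the encoder output `moments`."""
    if variational:
        # Assumed layout: first half is the mean, second half is log-variance.
        mean, logvar = np.split(moments, 2, axis=-1)
        std = np.exp(0.5 * logvar)
    else:
        # Deterministic case: everything is the mean, and the spread is zero.
        mean = moments
        std = np.zeros_like(moments)
    return mean, std

moments = np.zeros((4, 8))  # 4 tokens, 8 values each
mean_v, std_v = posterior_params(moments, variational=True)
mean_d, std_d = posterior_params(moments, variational=False)
```

In this sketch the variational latent is half as wide as moments, while the deterministic one keeps the full width. The shapes would only line up if the layer producing moments emitted twice the latent width in the variational case, which I am assuming rather than reading off the code.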
 
After decoding, apply the predictor,
then unpatchify.
 
Finally, the return type:
 
enc and dec share the same dim and heads; dec depth is twice the enc depth.
360*640 → patchified into patches of size 20*20
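A small NumPy sketch of what patchify and unpatchify do to the shapes for a 360*640 frame with 20*20 patches. The function names and the (C, H, W) layout are my own choices; the actual code may arrange the axes differently.

```python
import numpy as np

def patchify(img, p):
    """(C, H, W) -> (num_patches, C * p * p), row-major over the patch grid."""
    c, h, w = img.shape
    assert h % p == 0 and w % p == 0, "frame must divide evenly into patches"
    gh, gw = h // p, w // p
    x = img.reshape(c, gh, p, gw, p)
    x = x.transpose(1, 3, 0, 2, 4)  # -> (gh, gw, c, p, p)
    return x.reshape(gh * gw, c * p * p)

def unpatchify(patches, p, h, w):
    """Inverse of patchify: (num_patches, C * p * p) -> (C, H, W)."""
    gh, gw = h // p, w // p
    c = patches.shape[1] // (p * p)
    x = patches.reshape(gh, gw, c, p, p)
    x = x.transpose(2, 0, 3, 1, 4)  # -> (c, gh, p, gw, p)
    return x.reshape(c, h, w)

img = np.arange(3 * 360 * 640).reshape(3, 360, 640)
tokens = patchify(img, 20)           # 18 * 32 = 576 patches, 1200 values each
restored = unpatchify(tokens, 20, 360, 640)
```

A 360*640 frame gives an 18*32 grid, so 576 tokens of 1200 values each, and unpatchify recovers the original frame exactly.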
 
 
 
 
 
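Recent thoughts:
Staring at the 5 pieces of meat and 4 peppers left in the fridge, lost in thought.
All I can say this round: if it's a stall, it's a hard stall, but I'm going to hold on!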
 