1. IndexGrid mechanism
Linear offset from signed coordinate to values in external array
[m] Linear index of (i, j, k) inside a leaf node, which holds 512 values
[n] Offset of the 64-bit word in mBitMask that holds the active bit for m; the full mask covers 512 values
[w] The 64-bit word of mBitMask selected by n
- Inputs:
- i, j, k are the signed coordinates representing a position within a grid or node.
- mOffset is the starting point in an external array where values are stored.
- m ∈ {0, ..., 511} represents the linear index inside a 512-element array (since 512 = 8 × 8 × 8).
- n ∈ {0, ..., 7} is the offset into mBitMask, an array of eight 64-bit words that stores the active status of the 512 values.
- mBitMask is an array where each bit represents whether a corresponding value in the dense 512-element array is "on" (active) or "off" (inactive).
- Active Value Mapping:
- The 512 values are stored in a compressed format, where only the active values (those "on" in mBitMask) are stored in the external array.
- w is a specific 64-bit word from mBitMask, which corresponds to the block containing the value at position (i, j, k).
- The value of w determines which positions are active (i.e., bits set to 1) in the 64-bit block.
- Masking Process:
- The variable mask is used to filter out higher-order bits in w, ensuring that only the bits corresponding to positions before (i, j, k) are considered. This is important to determine the number of "on" values before the current position.
- Checking for Activity (Line 6):
- If (i,j,k) corresponds to an inactive value (i.e., the bit for (i,j,k) in w is 0), the function returns a zero offset. This likely maps to a "background index" or a default value, indicating no data at that position.
- Preceding Active Values (Line 7):
- If w is not the first word in mBitMask, the code uses mPrefixSum to extract the count of active values from previous 64-bit words in mBitMask. mPrefixSum encodes the prefix sums, providing a cumulative count of the active values in the preceding words.
- Each prefix sum covers a block of 64 values, so the code extracts the count of "on" values before the current word, speeding up the calculation of the linear index.
- Counting Active Values (Line 8):
- The code counts how many bits in w (representing the current word) are "on" before the bit corresponding to the position (i,j,k). This gives the count of active values in the current word up to (i,j,k).
- The sum of the active values from the preceding words (from mPrefixSum) and the current word gives the final linear index of the position (i,j,k).
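The steps above can be sketched in plain Python. This is a minimal illustration, not the library's code: the function names are hypothetical, the bit layout of m (i in the highest three bits, k in the lowest) is an assumption modeled on the NanoVDB leaf layout, and so is the packing of the seven prefix sums as 9-bit fields in one integer.

```python
def leaf_linear_index(i, j, k):
    """Map signed (i, j, k) to m in 0..511 (assumed layout: i high, k low)."""
    return ((i & 7) << 6) | ((j & 7) << 3) | (k & 7)


def pack_prefix_sum(words):
    """Pack the running counts of "on" bits after words 0..6 as 9-bit fields."""
    packed, running = 0, 0
    for n in range(7):
        running += bin(words[n]).count("1")
        packed |= running << (9 * n)
    return packed


def value_index(words, prefix_sum, offset, m):
    """Return the index into the external array for linear index m.

    Inactive positions return 0, which is assumed to be the background slot.
    """
    n = m >> 6                      # which 64-bit word holds bit m
    w = words[n]
    mask = 1 << (m & 63)            # the single bit for m inside w
    if not (w & mask):
        return 0                    # inactive value
    # "on" bits strictly before m inside the current word
    total = offset + bin(w & (mask - 1)).count("1")
    if n > 0:
        # "on" bits in all preceding words, read from the packed prefix sums
        total += (prefix_sum >> (9 * (n - 1))) & 511
    return total


# Usage: three active values at m = 3, 70 and 200, external values start at 1.
words = [0] * 8
for m in (3, 70, 200):
    words[m >> 6] |= 1 << (m & 63)
prefix = pack_prefix_sum(words)
indices = [value_index(words, prefix, 1, m) for m in (3, 70, 200, 4)]
```

Because the prefix sums are precomputed, a lookup costs one population count on a single word instead of a scan over the whole 512-bit mask.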
2. GridBatch & JaggedTensor
Each grid has a randomly chosen origin in 3D world space and a randomly chosen voxel size.
The fVDB documentation has more useful examples for these cases using functions like sparse_grid_from_points, sparse_grid_from_dense and sparse_grid_from_mesh.
The features stored in a JaggedTensor can be of any type that PyTorch supports, including float, float64, float16, bfloat16, int, etc., and we can have an arbitrary number of feature channels per voxel. For instance, there could be a JaggedTensor with 1 float feature that represents a signed distance field in each grid, or 3 float features that represent an RGB color in each voxel of the grids, or 192 float features that represent a learned feature vector for each voxel in each grid.
VDBTensor exists to wrap around a GridBatch and a JaggedTensor. It has operators that work with both at the same time.
VDBTensor concatenation has two different definitions.
- Concatenating along dimension 0 is a concatenation along the batch dimension. If J1 has 10 member grids in the batch and J2 has 20 members in the batch, then VDBTensor.cat([J1,J2], dim=0) will have 30 members. All input VDBTensors must have the same number of features.
- Concatenating along dimension 1 concatenates the features of the VDBTensors together. If J1 has 3 features and J2 has 4 features, then VDBTensor.cat([J1,J2], dim=1) will have 7 features. All input VDBTensors must have the same number of grids in the batch and number of voxels in each grid.
fvdb.nn.Linear operates on a VDBTensor, but it only changes the JaggedTensor (the computation happens only along the feature dimension).
The GridBatch remains the same object (the layer hasn't actually changed the topology of the grids):
VDBTensor.same_grid(out_vdbtensor.grid, vdbtensor.grid) == True
Summary: the main goal is to separate the ijk data from the feature data.
The GridBatch stores ijk, the JaggedTensor stores the features, and the features are indexed by j_index.
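The two concatenation rules can be illustrated with a toy structure. This is only a sketch in plain Python, not fVDB's implementation or API: ToyJagged, cat_batch and cat_features are hypothetical names, and the flat row list plus offsets layout is an assumption about how a jagged batch can be stored.

```python
from dataclasses import dataclass


@dataclass
class ToyJagged:
    """Hypothetical stand-in for a jagged batch: one feature row per voxel."""
    rows: list     # per-voxel feature rows of all grids, stored back to back
    offsets: list  # rows[offsets[b]:offsets[b + 1]] belong to grid b

    @property
    def num_grids(self):
        return len(self.offsets) - 1

    @property
    def num_features(self):
        return len(self.rows[0]) if self.rows else 0


def cat_batch(a, b):
    """dim=0: append the grids of b after the grids of a."""
    if a.num_features != b.num_features:
        raise ValueError("inputs must have the same number of features")
    shift = a.offsets[-1]
    return ToyJagged(a.rows + b.rows,
                     a.offsets + [o + shift for o in b.offsets[1:]])


def cat_features(a, b):
    """dim=1: widen each voxel's feature row; grid layout must match."""
    if a.offsets != b.offsets:
        raise ValueError("inputs must have the same grids and voxel counts")
    return ToyJagged([ra + rb for ra, rb in zip(a.rows, b.rows)],
                     list(a.offsets))


# Usage: a has 2 grids (2 and 3 voxels) with 3 features per voxel.
a = ToyJagged([[1.0] * 3 for _ in range(5)], [0, 2, 5])
b = ToyJagged([[2.0] * 3 for _ in range(4)], [0, 1, 4])   # 2 more grids
c = ToyJagged([[0.0] * 4 for _ in range(5)], [0, 2, 5])   # 4 extra features
ab = cat_batch(a, b)      # 4 grids, still 3 features
ac = cat_features(a, c)   # 2 grids, 7 features
```

In this toy version the offsets play the role of the grid topology: cat_features leaves them untouched, which mirrors the observation that a feature-only operation does not change the grids.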
3. Actually training
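This part is a demonstration.
The model itself has little worth noting.
Are the ijk values assigned to each channel (c dimension) of the feature, i.e. ijk = feature??
Presumably they are still transformed first.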
data.grid.grid_to_world(): This function likely transforms the grid coordinates from grid space to world space (real-world coordinates). It maps (i, j, k) grid positions to actual world coordinates using some transformation (e.g., scaling, rotation).
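As a rough picture of such a transformation, the sketch below assumes the simplest case: a per-axis scale by the voxel size followed by a translation to the grid origin, with no rotation. The function name toy_grid_to_world and its parameters are hypothetical, and the real transform may differ.

```python
def toy_grid_to_world(ijk, voxel_size, origin):
    """Assumed mapping: world = origin + voxel_size * ijk, applied per axis."""
    return tuple(o + s * c for c, s, o in zip(ijk, voxel_size, origin))


# Usage: voxel (1, 2, 3) in a grid with 0.5-sized voxels and a shifted origin.
world = toy_grid_to_world((1, 2, 3), (0.5, 0.5, 0.5), (10.0, 0.0, -1.0))
```

Under this assumption, two grids with identical ijk coordinates still produce different world positions whenever their origin or voxel size differs.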
How should the number of features be chosen in real training? This approach can only be applied to learning simple 3D shapes.
features = grid.grid_to_world(grid.ijk.float())
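Training target:
In the end, jdata holds the real-world coordinates.
data.grid.ijk.float() → collects the ijk coordinates in the grid (this step is needed because the grid is sparse)
data.grid.grid_to_world(data.grid.ijk.float()) → gives the features, i.e. the JaggedTensor of 3D positions that we learn from
data.grid.grid_to_world(data.grid.ijk.float()).jdata → gives the actual 3D positions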
target = (dist < 1).float()
BCE Loss(Binary cross entropy)
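The target and loss can be written out by hand to see what is being optimized. The notes do not define dist, so this sketch assumes it is the Euclidean distance of each voxel's world position from the world origin, which makes the target an "inside the unit sphere" label. It also assumes the model outputs probabilities; the function names are hypothetical.

```python
import math


def unit_sphere_targets(points):
    """Assumed meaning of (dist < 1).float(): 1.0 inside the unit sphere."""
    return [1.0 if math.sqrt(x * x + y * y + z * z) < 1 else 0.0
            for x, y, z in points]


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross entropy over predicted probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(probs)


# Usage: one point inside the sphere, one outside, an undecided prediction.
targets = unit_sphere_targets([(0.0, 0.0, 0.5), (2.0, 0.0, 0.0)])
loss = binary_cross_entropy([0.5, 0.5], targets)
```

A prediction of 0.5 for every voxel gives a loss of ln 2, which is a useful baseline: training is making progress only once the loss drops below that value.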
4. Perform Conv
fvdb.nn.SparseConv3d
conv_layer = SparseConv3d
- Author:ran2323
- URL:https://www.blueif.me//article/58b5dab3-c61e-440e-93d8-472131fd509e
- Copyright:All articles in this blog, except for special statements, adopt BY-NC-SA agreement. Please indicate the source!