Qwen2-VL
Qwen2-VL
Layers
Qwen2VisionMLP:
- fc1 = ColumnParallelLinear
- act = QuickGELU
- fc2 = RowParallelLinear
Qwen2VisionAttention:
- qkv = ColumnParallelLinear
- proj = RowParallelLinear
- rotary_pos_emb
- attention 3 types
Qwen2VisionBlock:
- norm1 = norm_layer
- norm2 = norm_layer
- attn = Qwen2VisionAttention
- mlp = Qwen2VisionMLP
Qwen2VisionPatchEmbed:
- proj = nn.Conv3d
Qwen2VisionPatchMerger:
- ln_q = norm_layer
- mlp = ColumnParallelLinear, nn.GELU(), RowParallelLinear
Qwen2VisionRotaryEmbedding:
Qwen2VisionTransformer:
- patch_embed = Qwen2VisionPatchEmbed
- rotary_pos_emb = Qwen2VisionRotaryEmbedding
- blocks = Qwen2VisionBlock * n
- merger = Qwen2VisionPatchMerger
Different from Qwen2.5-VL:
norm_layer = partial(nn.LayerNorm, eps=norm_eps)
本博客所有文章除特别声明外,均采用 CC BY-NC-SA 4.0 许可协议。转载请注明来源 xhj的博客!