INFO 2025-01-05 00:44:53,544 train_utils.py: 108: MACHINE SEED: 4920 INFO 2025-01-05 00:44:53,546 train_utils.py: 154: Logging ENV_VARIABLES INFO 2025-01-05 00:44:53,546 train_utils.py: 155: BROWSER=/home/hossein/.vscode-server/bin/1a5daa3a0231a0fbba4f14db7ec463cf99d7768e/bin/helpers/browser.sh COLORTERM=truecolor CONDA_DEFAULT_ENV=sam2 CONDA_EXE=/home/hossein/miniconda3/bin/conda CONDA_PREFIX=/ephemeral/hossein/envs/sam2 CONDA_PREFIX_1=/home/hossein/miniconda3 CONDA_PROMPT_MODIFIER=(sam2) CONDA_PYTHON_EXE=/home/hossein/miniconda3/bin/python CONDA_SHLVL=2 CUDA_MODULE_LOADING=LAZY DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/2095/bus GIT_ASKPASS=/home/hossein/.vscode-server/bin/1a5daa3a0231a0fbba4f14db7ec463cf99d7768e/extensions/git/dist/askpass.sh HF_HOME=/ephemeral/ HISTSIZE=2000 HISTTIMEFORMAT=%F %T HOME=/home/hossein HYDRA_FULL_ERROR=1 LANG=C.UTF-8 LESSCLOSE=/usr/bin/lesspipe %s %s LESSOPEN=| /usr/bin/lesspipe %s LOCAL_RANK=0 LOGNAME=hossein LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.webp=01;35:*.o
gm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36: MASTER_ADDR=localhost MASTER_PORT=57008 MOTD_SHOWN=pam NCCL_TOPO_FILE=/etc/nccl-topo-h100-v1.xml OLDPWD=/home/hossein/hossein/projects/sam2 PATH=/home/hossein/.cursor-server/cli/servers/Stable-fe574d0820377383143b2ea26aa6ae28b3425220/server/bin/remote-cli:/ephemeral/hossein/envs/sam2/bin:/home/hossein/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin PWD=/home/hossein/hossein/projects/sam2/training PYTHON_PATH=/home/hossein/hossein/projects/hybrid_model_training:/home/hossein/hossein/projects/hybrid_model_training:/home/hossein/hossein/projects/hybrid_model_training: RANK=0 SHELL=/bin/bash SHLVL=2 SSH_CLIENT=142.186.28.106 64524 22 SSH_CONNECTION=110.238.90.22 3000 10.0.1.99 22 TERM=screen TERM_PROGRAM=tmux TERM_PROGRAM_VERSION=3.2a TMUX=/tmp/tmux-2095/default,727396,5 TMUX_PANE=%5 TORCH_NCCL_ASYNC_ERROR_HANDLING=1 USER=hossein VSCODE_GIT_ASKPASS_EXTRA_ARGS= VSCODE_GIT_ASKPASS_MAIN=/home/hossein/.vscode-server/bin/1a5daa3a0231a0fbba4f14db7ec463cf99d7768e/extensions/git/dist/askpass-main.js VSCODE_GIT_ASKPASS_NODE=/home/hossein/.vscode-server/bin/1a5daa3a0231a0fbba4f14db7ec463cf99d7768e/node VSCODE_GIT_IPC_HANDLE=/run/user/2095/vscode-git-cd38edda58.sock VSCODE_IPC_HOOK_CLI=/run/user/2095/vscode-ipc-e3cd88d8-a6c9-4e22-89a9-8e26349b2914.sock WORLD_SIZE=4 XDG_DATA_DIRS=/usr/local/share:/usr/share:/var/lib/snapd/desktop XDG_RUNTIME_DIR=/run/user/2095 XDG_SESSION_CLASS=user XDG_SESSION_ID=524 XDG_SESSION_TYPE=tty 
_=/ephemeral/hossein/envs/sam2/bin/python _CE_CONDA= _CE_M= INFO 2025-01-05 00:44:53,546 trainer.py: 989: Setting up components: Model, loss, optim, meters etc. INFO 2025-01-05 00:44:53,547 logger.py: 66: TensorBoard SummaryWriter instantiated. Files will be stored in: /ephemeral/hossein/output/sam2/tensorboard INFO 2025-01-05 00:44:56,853 sam2.py: 81: Training with points (sampled from masks) as inputs with p=0.5 INFO 2025-01-05 00:44:56,861 trainer.py:1059: ==================== INFO 2025-01-05 00:44:56,861 trainer.py:1060: Summary for model INFO 2025-01-05 00:44:56,867 trainer.py:1061: Model is SAM2Train( (image_encoder): ImageEncoder( (trunk): Hiera( (patch_embed): PatchEmbed( (proj): Conv2d(3, 144, kernel_size=(7, 7), stride=(4, 4), padding=(3, 3)) ) (blocks): ModuleList( (0-1): 2 x MultiScaleBlock( (norm1): LayerNorm((144,), eps=1e-06, elementwise_affine=True) (attn): MultiScaleAttention( (qkv): Linear(in_features=144, out_features=432, bias=True) (proj): Linear(in_features=144, out_features=144, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((144,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=144, out_features=576, bias=True) (1): Linear(in_features=576, out_features=144, bias=True) ) (act): GELU(approximate='none') ) ) (2): MultiScaleBlock( (norm1): LayerNorm((144,), eps=1e-06, elementwise_affine=True) (pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False) (attn): MultiScaleAttention( (q_pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False) (qkv): Linear(in_features=144, out_features=864, bias=True) (proj): Linear(in_features=288, out_features=288, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((288,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=288, out_features=1152, bias=True) (1): Linear(in_features=1152, out_features=288, bias=True) ) (act): 
GELU(approximate='none') ) (proj): Linear(in_features=144, out_features=288, bias=True) ) (3-7): 5 x MultiScaleBlock( (norm1): LayerNorm((288,), eps=1e-06, elementwise_affine=True) (attn): MultiScaleAttention( (qkv): Linear(in_features=288, out_features=864, bias=True) (proj): Linear(in_features=288, out_features=288, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((288,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=288, out_features=1152, bias=True) (1): Linear(in_features=1152, out_features=288, bias=True) ) (act): GELU(approximate='none') ) ) (8): MultiScaleBlock( (norm1): LayerNorm((288,), eps=1e-06, elementwise_affine=True) (pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False) (attn): MultiScaleAttention( (q_pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False) (qkv): Linear(in_features=288, out_features=1728, bias=True) (proj): Linear(in_features=576, out_features=576, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((576,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=576, out_features=2304, bias=True) (1): Linear(in_features=2304, out_features=576, bias=True) ) (act): GELU(approximate='none') ) (proj): Linear(in_features=288, out_features=576, bias=True) ) (9-43): 35 x MultiScaleBlock( (norm1): LayerNorm((576,), eps=1e-06, elementwise_affine=True) (attn): MultiScaleAttention( (qkv): Linear(in_features=576, out_features=1728, bias=True) (proj): Linear(in_features=576, out_features=576, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((576,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=576, out_features=2304, bias=True) (1): Linear(in_features=2304, out_features=576, bias=True) ) (act): GELU(approximate='none') ) ) (44): MultiScaleBlock( (norm1): LayerNorm((576,), eps=1e-06, elementwise_affine=True) (pool): 
MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False) (attn): MultiScaleAttention( (q_pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False) (qkv): Linear(in_features=576, out_features=3456, bias=True) (proj): Linear(in_features=1152, out_features=1152, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((1152,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=1152, out_features=4608, bias=True) (1): Linear(in_features=4608, out_features=1152, bias=True) ) (act): GELU(approximate='none') ) (proj): Linear(in_features=576, out_features=1152, bias=True) ) (45-47): 3 x MultiScaleBlock( (norm1): LayerNorm((1152,), eps=1e-06, elementwise_affine=True) (attn): MultiScaleAttention( (qkv): Linear(in_features=1152, out_features=3456, bias=True) (proj): Linear(in_features=1152, out_features=1152, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((1152,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=1152, out_features=4608, bias=True) (1): Linear(in_features=4608, out_features=1152, bias=True) ) (act): GELU(approximate='none') ) ) ) ) (neck): FpnNeck( (position_encoding): PositionEmbeddingSine() (convs): ModuleList( (0): Sequential( (conv): Conv2d(1152, 256, kernel_size=(1, 1), stride=(1, 1)) ) (1): Sequential( (conv): Conv2d(576, 256, kernel_size=(1, 1), stride=(1, 1)) ) (2): Sequential( (conv): Conv2d(288, 256, kernel_size=(1, 1), stride=(1, 1)) ) (3): Sequential( (conv): Conv2d(144, 256, kernel_size=(1, 1), stride=(1, 1)) ) ) ) ) (mask_downsample): Conv2d(1, 1, kernel_size=(4, 4), stride=(4, 4)) (memory_attention): MemoryAttention( (layers): ModuleList( (0-3): 4 x MemoryAttentionLayer( (self_attn): RoPEAttention( (q_proj): Linear(in_features=256, out_features=256, bias=True) (k_proj): Linear(in_features=256, out_features=256, bias=True) (v_proj): Linear(in_features=256, out_features=256, bias=True) 
(out_proj): Linear(in_features=256, out_features=256, bias=True) ) (cross_attn_image): RoPEAttention( (q_proj): Linear(in_features=256, out_features=256, bias=True) (k_proj): Linear(in_features=64, out_features=256, bias=True) (v_proj): Linear(in_features=64, out_features=256, bias=True) (out_proj): Linear(in_features=256, out_features=256, bias=True) ) (linear1): Linear(in_features=256, out_features=2048, bias=True) (dropout): Dropout(p=0.1, inplace=False) (linear2): Linear(in_features=2048, out_features=256, bias=True) (norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (norm3): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) ) ) (norm): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) (memory_encoder): MemoryEncoder( (mask_downsampler): MaskDownSampler( (encoder): Sequential( (0): Conv2d(1, 4, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (1): LayerNorm2d() (2): GELU(approximate='none') (3): Conv2d(4, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (4): LayerNorm2d() (5): GELU(approximate='none') (6): Conv2d(16, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (7): LayerNorm2d() (8): GELU(approximate='none') (9): Conv2d(64, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (10): LayerNorm2d() (11): GELU(approximate='none') (12): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1)) ) ) (pix_feat_proj): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1)) (fuser): Fuser( (proj): Identity() (layers): ModuleList( (0-1): 2 x CXBlock( (dwconv): Conv2d(256, 256, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=256) (norm): LayerNorm2d() (pwconv1): Linear(in_features=256, out_features=1024, bias=True) (act): GELU(approximate='none') (pwconv2): Linear(in_features=1024, out_features=256, bias=True) (drop_path): Identity() ) ) ) 
(position_encoding): PositionEmbeddingSine() (out_proj): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1)) ) (sam_prompt_encoder): PromptEncoder( (pe_layer): PositionEmbeddingRandom() (point_embeddings): ModuleList( (0-3): 4 x Embedding(1, 256) ) (not_a_point_embed): Embedding(1, 256) (mask_downscaling): Sequential( (0): Conv2d(1, 4, kernel_size=(2, 2), stride=(2, 2)) (1): LayerNorm2d() (2): GELU(approximate='none') (3): Conv2d(4, 16, kernel_size=(2, 2), stride=(2, 2)) (4): LayerNorm2d() (5): GELU(approximate='none') (6): Conv2d(16, 256, kernel_size=(1, 1), stride=(1, 1)) ) (no_mask_embed): Embedding(1, 256) ) (sam_mask_decoder): MaskDecoder( (transformer): TwoWayTransformer( (layers): ModuleList( (0-1): 2 x TwoWayAttentionBlock( (self_attn): Attention( (q_proj): Linear(in_features=256, out_features=256, bias=True) (k_proj): Linear(in_features=256, out_features=256, bias=True) (v_proj): Linear(in_features=256, out_features=256, bias=True) (out_proj): Linear(in_features=256, out_features=256, bias=True) ) (norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (cross_attn_token_to_image): Attention( (q_proj): Linear(in_features=256, out_features=128, bias=True) (k_proj): Linear(in_features=256, out_features=128, bias=True) (v_proj): Linear(in_features=256, out_features=128, bias=True) (out_proj): Linear(in_features=128, out_features=256, bias=True) ) (norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=2048, bias=True) (1): Linear(in_features=2048, out_features=256, bias=True) ) (act): ReLU() ) (norm3): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (norm4): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (cross_attn_image_to_token): Attention( (q_proj): Linear(in_features=256, out_features=128, bias=True) (k_proj): Linear(in_features=256, out_features=128, bias=True) (v_proj): Linear(in_features=256, out_features=128, bias=True) (out_proj): 
Linear(in_features=128, out_features=256, bias=True) ) ) ) (final_attn_token_to_image): Attention( (q_proj): Linear(in_features=256, out_features=128, bias=True) (k_proj): Linear(in_features=256, out_features=128, bias=True) (v_proj): Linear(in_features=256, out_features=128, bias=True) (out_proj): Linear(in_features=128, out_features=256, bias=True) ) (norm_final_attn): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) (iou_token): Embedding(1, 256) (mask_tokens): Embedding(4, 256) (obj_score_token): Embedding(1, 256) (output_upscaling): Sequential( (0): ConvTranspose2d(256, 64, kernel_size=(2, 2), stride=(2, 2)) (1): LayerNorm2d() (2): GELU(approximate='none') (3): ConvTranspose2d(64, 32, kernel_size=(2, 2), stride=(2, 2)) (4): GELU(approximate='none') ) (conv_s0): Conv2d(256, 32, kernel_size=(1, 1), stride=(1, 1)) (conv_s1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1)) (output_hypernetworks_mlps): ModuleList( (0-3): 4 x MLP( (layers): ModuleList( (0-1): 2 x Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=32, bias=True) ) (act): ReLU() ) ) (iou_prediction_head): MLP( (layers): ModuleList( (0-1): 2 x Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) (act): ReLU() ) (pred_obj_score_head): MLP( (layers): ModuleList( (0-1): 2 x Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=1, bias=True) ) (act): ReLU() ) ) (obj_ptr_proj): MLP( (layers): ModuleList( (0-2): 3 x Linear(in_features=256, out_features=256, bias=True) ) (act): ReLU() ) (obj_ptr_tpos_proj): Linear(in_features=256, out_features=64, bias=True) )
INFO 2025-01-05 00:44:56,869 trainer.py:1062: Total parameters 224 M
INFO 2025-01-05 00:44:56,869 trainer.py:1063: Trainable parameters 224 M
INFO 2025-01-05 00:44:56,869 trainer.py:1066: Non-Trainable parameters 0
INFO 2025-01-05 00:44:56,869 trainer.py:1069: ====================
INFO 2025-01-05 00:44:56,877 trainer.py:1023: Finished setting up components: Model, loss, optim, meters etc.
INFO 2025-01-05 00:44:56,877 trainer.py: 314: Moving components to device cuda:0 and local rank 0.
INFO 2025-01-05 00:44:57,296 trainer.py: 320: Done moving components to device cuda:0 and local rank 0.
INFO 2025-01-05 00:44:57,313 optimizer.py: 248: Matches for param_name [image_encoder.*]: {'image_encoder.trunk.blocks.17.attn.proj.weight', 'image_encoder.trunk.blocks.15.mlp.layers.0.bias', 'image_encoder.trunk.blocks.11.norm1.weight', 'image_encoder.trunk.blocks.44.mlp.layers.0.weight', 'image_encoder.trunk.blocks.15.mlp.layers.1.weight', 'image_encoder.trunk.blocks.10.mlp.layers.0.weight', 'image_encoder.trunk.blocks.20.mlp.layers.1.weight', 'image_encoder.trunk.blocks.31.mlp.layers.1.weight', 'image_encoder.trunk.blocks.26.mlp.layers.1.weight', 'image_encoder.trunk.blocks.39.norm1.bias', 'image_encoder.trunk.blocks.11.norm2.weight', 'image_encoder.trunk.blocks.41.mlp.layers.0.bias', 'image_encoder.trunk.blocks.6.attn.proj.bias', 'image_encoder.trunk.blocks.37.norm2.bias', 'image_encoder.trunk.blocks.38.attn.proj.bias', 'image_encoder.trunk.blocks.0.mlp.layers.0.bias', 'image_encoder.trunk.blocks.23.mlp.layers.1.bias', 'image_encoder.trunk.blocks.32.norm2.bias', 'image_encoder.trunk.blocks.36.mlp.layers.0.weight', 'image_encoder.trunk.blocks.26.attn.proj.weight', 'image_encoder.trunk.blocks.47.mlp.layers.0.bias', 'image_encoder.trunk.blocks.36.attn.proj.bias', 'image_encoder.trunk.blocks.37.mlp.layers.1.weight', 'image_encoder.trunk.blocks.19.attn.proj.bias', 'image_encoder.trunk.blocks.19.norm2.bias', 'image_encoder.trunk.blocks.1.attn.qkv.bias', 'image_encoder.trunk.blocks.11.attn.qkv.bias', 'image_encoder.trunk.blocks.3.attn.proj.weight', 'image_encoder.trunk.blocks.38.norm2.weight', 'image_encoder.trunk.blocks.1.attn.proj.weight', 'image_encoder.trunk.blocks.27.norm1.weight', 'image_encoder.trunk.blocks.39.mlp.layers.0.bias',
'image_encoder.trunk.blocks.29.attn.proj.bias', 'image_encoder.trunk.blocks.38.norm1.weight', 'image_encoder.trunk.blocks.9.norm2.weight', 'image_encoder.trunk.blocks.10.attn.qkv.bias', 'image_encoder.trunk.blocks.22.norm2.bias', 'image_encoder.trunk.blocks.20.attn.proj.bias', 'image_encoder.trunk.blocks.23.mlp.layers.1.weight', 'image_encoder.trunk.blocks.43.norm2.bias', 'image_encoder.trunk.blocks.12.norm1.weight', 'image_encoder.trunk.blocks.16.attn.proj.weight', 'image_encoder.trunk.blocks.14.norm1.weight', 'image_encoder.trunk.blocks.40.attn.qkv.bias', 'image_encoder.trunk.blocks.31.attn.proj.weight', 'image_encoder.trunk.blocks.5.attn.proj.bias', 'image_encoder.trunk.blocks.47.norm1.bias', 'image_encoder.trunk.blocks.44.norm2.weight', 'image_encoder.trunk.blocks.5.mlp.layers.0.weight', 'image_encoder.trunk.blocks.35.attn.qkv.weight', 'image_encoder.trunk.pos_embed_window', 'image_encoder.trunk.blocks.0.attn.qkv.weight', 'image_encoder.trunk.blocks.4.mlp.layers.0.bias', 'image_encoder.trunk.blocks.30.mlp.layers.0.bias', 'image_encoder.trunk.blocks.43.mlp.layers.0.bias', 'image_encoder.trunk.blocks.26.mlp.layers.0.weight', 'image_encoder.trunk.blocks.39.attn.qkv.weight', 'image_encoder.trunk.blocks.40.mlp.layers.0.bias', 'image_encoder.trunk.blocks.15.norm1.weight', 'image_encoder.trunk.blocks.2.mlp.layers.0.weight', 'image_encoder.trunk.blocks.27.norm2.weight', 'image_encoder.trunk.blocks.23.norm1.bias', 'image_encoder.trunk.blocks.26.attn.qkv.bias', 'image_encoder.trunk.blocks.0.mlp.layers.0.weight', 'image_encoder.trunk.blocks.23.norm2.bias', 'image_encoder.trunk.blocks.30.norm2.bias', 'image_encoder.trunk.blocks.13.norm2.bias', 'image_encoder.trunk.blocks.10.mlp.layers.1.weight', 'image_encoder.trunk.blocks.40.norm2.weight', 'image_encoder.trunk.blocks.4.norm2.weight', 'image_encoder.trunk.blocks.45.norm1.weight', 'image_encoder.trunk.blocks.28.mlp.layers.0.bias', 'image_encoder.trunk.blocks.22.norm1.weight', 
'image_encoder.trunk.blocks.37.mlp.layers.0.weight', 'image_encoder.trunk.patch_embed.proj.bias', 'image_encoder.trunk.blocks.1.norm2.weight', 'image_encoder.trunk.blocks.24.norm1.bias', 'image_encoder.trunk.blocks.21.attn.proj.weight', 'image_encoder.trunk.blocks.8.attn.qkv.bias', 'image_encoder.trunk.blocks.5.norm2.weight', 'image_encoder.neck.convs.3.conv.bias', 'image_encoder.trunk.blocks.42.attn.qkv.bias', 'image_encoder.trunk.blocks.8.mlp.layers.1.weight', 'image_encoder.trunk.blocks.15.attn.proj.bias', 'image_encoder.trunk.blocks.28.norm2.bias', 'image_encoder.trunk.blocks.33.mlp.layers.0.bias', 'image_encoder.trunk.blocks.0.norm2.bias', 'image_encoder.trunk.blocks.8.mlp.layers.1.bias', 'image_encoder.trunk.blocks.44.norm1.weight', 'image_encoder.trunk.blocks.5.attn.proj.weight', 'image_encoder.trunk.blocks.9.norm1.bias', 'image_encoder.trunk.blocks.14.norm2.weight', 'image_encoder.trunk.blocks.21.norm2.bias', 'image_encoder.trunk.blocks.16.mlp.layers.1.bias', 'image_encoder.trunk.blocks.13.attn.proj.bias', 'image_encoder.trunk.blocks.46.norm2.weight', 'image_encoder.trunk.blocks.29.mlp.layers.0.weight', 'image_encoder.trunk.blocks.28.attn.proj.bias', 'image_encoder.trunk.blocks.7.mlp.layers.0.bias', 'image_encoder.neck.convs.3.conv.weight', 'image_encoder.trunk.blocks.23.norm2.weight', 'image_encoder.trunk.blocks.36.attn.qkv.weight', 'image_encoder.trunk.blocks.44.proj.weight', 'image_encoder.trunk.blocks.18.attn.qkv.weight', 'image_encoder.trunk.blocks.14.attn.proj.bias', 'image_encoder.trunk.blocks.34.norm1.bias', 'image_encoder.trunk.blocks.27.mlp.layers.0.bias', 'image_encoder.trunk.blocks.3.norm2.bias', 'image_encoder.trunk.blocks.7.attn.qkv.bias', 'image_encoder.trunk.blocks.37.attn.qkv.bias', 'image_encoder.trunk.blocks.13.mlp.layers.0.weight', 'image_encoder.trunk.blocks.2.proj.weight', 'image_encoder.trunk.blocks.25.norm1.bias', 'image_encoder.trunk.blocks.46.attn.qkv.bias', 'image_encoder.trunk.blocks.39.attn.proj.weight', 
'image_encoder.trunk.blocks.4.attn.qkv.weight', 'image_encoder.trunk.blocks.0.mlp.layers.1.weight', 'image_encoder.trunk.blocks.12.mlp.layers.1.weight', 'image_encoder.trunk.blocks.29.norm2.bias', 'image_encoder.trunk.blocks.18.attn.proj.bias', 'image_encoder.trunk.blocks.13.mlp.layers.1.bias', 'image_encoder.trunk.blocks.22.mlp.layers.0.bias', 'image_encoder.trunk.blocks.1.norm1.weight', 'image_encoder.trunk.blocks.31.norm1.bias', 'image_encoder.trunk.blocks.44.proj.bias', 'image_encoder.trunk.blocks.26.mlp.layers.0.bias', 'image_encoder.trunk.blocks.32.mlp.layers.1.weight', 'image_encoder.trunk.blocks.5.norm2.bias', 'image_encoder.trunk.blocks.18.norm2.weight', 'image_encoder.trunk.blocks.46.attn.proj.bias', 'image_encoder.trunk.blocks.38.mlp.layers.0.bias', 'image_encoder.trunk.blocks.20.mlp.layers.0.weight', 'image_encoder.trunk.blocks.28.norm2.weight', 'image_encoder.trunk.blocks.1.mlp.layers.0.weight', 'image_encoder.trunk.blocks.2.mlp.layers.1.bias', 'image_encoder.trunk.blocks.3.mlp.layers.0.weight', 'image_encoder.trunk.blocks.21.mlp.layers.1.weight', 'image_encoder.trunk.blocks.1.attn.proj.bias', 'image_encoder.trunk.blocks.31.attn.proj.bias', 'image_encoder.trunk.blocks.8.attn.proj.weight', 'image_encoder.trunk.blocks.47.mlp.layers.0.weight', 'image_encoder.trunk.blocks.33.norm2.weight', 'image_encoder.trunk.blocks.17.mlp.layers.1.bias', 'image_encoder.trunk.blocks.37.mlp.layers.1.bias', 'image_encoder.trunk.blocks.35.mlp.layers.1.bias', 'image_encoder.trunk.blocks.46.mlp.layers.1.bias', 'image_encoder.trunk.blocks.6.norm1.weight', 'image_encoder.trunk.blocks.21.attn.qkv.bias', 'image_encoder.trunk.blocks.35.norm1.weight', 'image_encoder.trunk.blocks.6.mlp.layers.1.weight', 'image_encoder.neck.convs.2.conv.weight', 'image_encoder.trunk.blocks.0.attn.proj.weight', 'image_encoder.trunk.blocks.3.mlp.layers.1.weight', 'image_encoder.trunk.blocks.25.attn.proj.bias', 'image_encoder.trunk.blocks.41.norm1.bias', 'image_encoder.trunk.blocks.43.norm1.weight', 
'image_encoder.trunk.blocks.20.attn.proj.weight', 'image_encoder.trunk.blocks.31.norm2.bias', 'image_encoder.trunk.blocks.11.attn.proj.weight', 'image_encoder.trunk.blocks.19.norm2.weight', 'image_encoder.trunk.blocks.21.mlp.layers.0.bias', 'image_encoder.trunk.blocks.0.norm2.weight', 'image_encoder.trunk.blocks.33.mlp.layers.0.weight', 'image_encoder.trunk.blocks.40.attn.proj.weight', 'image_encoder.trunk.blocks.33.norm1.weight', 'image_encoder.trunk.blocks.2.mlp.layers.1.weight', 'image_encoder.trunk.blocks.1.mlp.layers.0.bias', 'image_encoder.neck.convs.1.conv.bias', 'image_encoder.trunk.blocks.24.attn.qkv.bias', 'image_encoder.trunk.blocks.35.attn.proj.bias', 'image_encoder.trunk.blocks.22.mlp.layers.1.weight', 'image_encoder.trunk.blocks.9.attn.proj.weight', 'image_encoder.trunk.blocks.34.attn.qkv.weight', 'image_encoder.trunk.blocks.34.attn.proj.weight', 'image_encoder.trunk.blocks.5.attn.qkv.weight', 'image_encoder.trunk.blocks.3.norm2.weight', 'image_encoder.trunk.blocks.13.attn.proj.weight', 'image_encoder.trunk.blocks.36.norm2.bias', 'image_encoder.trunk.blocks.1.mlp.layers.1.weight', 'image_encoder.trunk.blocks.3.norm1.weight', 'image_encoder.trunk.blocks.42.norm2.weight', 'image_encoder.trunk.blocks.47.attn.qkv.bias', 'image_encoder.trunk.blocks.11.attn.proj.bias', 'image_encoder.trunk.blocks.13.mlp.layers.1.weight', 'image_encoder.trunk.blocks.44.norm2.bias', 'image_encoder.trunk.blocks.44.attn.proj.bias', 'image_encoder.trunk.blocks.8.mlp.layers.0.bias', 'image_encoder.trunk.blocks.35.attn.proj.weight', 'image_encoder.trunk.blocks.29.attn.qkv.bias', 'image_encoder.trunk.blocks.24.mlp.layers.0.bias', 'image_encoder.trunk.blocks.47.mlp.layers.1.bias', 'image_encoder.trunk.blocks.38.attn.qkv.weight', 'image_encoder.trunk.blocks.14.mlp.layers.0.bias', 'image_encoder.trunk.blocks.17.norm2.weight', 'image_encoder.trunk.blocks.15.attn.qkv.weight', 'image_encoder.trunk.blocks.41.mlp.layers.1.bias', 'image_encoder.trunk.blocks.10.norm2.bias', 
'image_encoder.trunk.blocks.36.norm1.bias', 'image_encoder.trunk.blocks.24.attn.proj.weight', 'image_encoder.trunk.blocks.28.mlp.layers.0.weight', 'image_encoder.trunk.blocks.7.attn.proj.weight', 'image_encoder.trunk.blocks.23.mlp.layers.0.bias', 'image_encoder.trunk.blocks.32.norm1.weight', 'image_encoder.trunk.blocks.1.norm2.bias', 'image_encoder.trunk.blocks.41.attn.proj.bias', 'image_encoder.trunk.blocks.4.mlp.layers.1.bias', 'image_encoder.trunk.blocks.9.mlp.layers.1.bias', 'image_encoder.trunk.blocks.5.mlp.layers.1.bias', 'image_encoder.neck.convs.2.conv.bias', 'image_encoder.trunk.blocks.14.mlp.layers.0.weight', 'image_encoder.trunk.blocks.18.mlp.layers.0.weight', 'image_encoder.trunk.blocks.27.attn.proj.bias', 'image_encoder.trunk.blocks.11.mlp.layers.1.bias', 'image_encoder.trunk.blocks.47.attn.proj.weight', 'image_encoder.trunk.blocks.1.attn.qkv.weight', 'image_encoder.trunk.blocks.31.mlp.layers.1.bias', 'image_encoder.trunk.blocks.46.mlp.layers.1.weight', 'image_encoder.trunk.blocks.47.norm1.weight', 'image_encoder.trunk.blocks.8.mlp.layers.0.weight', 'image_encoder.trunk.blocks.19.attn.proj.weight', 'image_encoder.trunk.blocks.30.mlp.layers.1.weight', 'image_encoder.trunk.blocks.22.norm2.weight', 'image_encoder.trunk.blocks.3.norm1.bias', 'image_encoder.trunk.blocks.9.mlp.layers.1.weight', 'image_encoder.trunk.blocks.10.mlp.layers.0.bias', 'image_encoder.trunk.blocks.19.norm1.weight', 'image_encoder.trunk.blocks.31.mlp.layers.0.bias', 'image_encoder.trunk.blocks.45.norm2.bias', 'image_encoder.trunk.blocks.12.mlp.layers.0.weight', 'image_encoder.trunk.blocks.6.mlp.layers.1.bias', 'image_encoder.trunk.blocks.10.attn.qkv.weight', 'image_encoder.trunk.blocks.34.mlp.layers.1.weight', 'image_encoder.trunk.blocks.0.attn.qkv.bias', 'image_encoder.trunk.blocks.42.norm2.bias', 'image_encoder.trunk.blocks.45.attn.qkv.bias', 'image_encoder.trunk.blocks.25.mlp.layers.1.bias', 'image_encoder.trunk.blocks.27.attn.qkv.weight', 
'image_encoder.trunk.blocks.43.attn.qkv.bias', 'image_encoder.trunk.blocks.21.mlp.layers.0.weight', 'image_encoder.trunk.blocks.42.mlp.layers.1.weight', 'image_encoder.trunk.blocks.32.attn.qkv.weight', 'image_encoder.trunk.blocks.0.norm1.bias', 'image_encoder.trunk.blocks.44.attn.qkv.weight', 'image_encoder.trunk.blocks.27.norm1.bias', 'image_encoder.trunk.blocks.18.mlp.layers.1.weight', 'image_encoder.trunk.blocks.43.mlp.layers.1.weight', 'image_encoder.trunk.blocks.16.mlp.layers.1.weight', 'image_encoder.trunk.blocks.44.attn.qkv.bias', 'image_encoder.trunk.blocks.24.mlp.layers.1.bias', 'image_encoder.trunk.blocks.12.attn.qkv.bias', 'image_encoder.trunk.blocks.16.attn.qkv.bias', 'image_encoder.trunk.blocks.46.attn.qkv.weight', 'image_encoder.trunk.blocks.13.norm1.weight', 'image_encoder.trunk.blocks.22.mlp.layers.1.bias', 'image_encoder.trunk.blocks.13.norm1.bias', 'image_encoder.trunk.blocks.1.mlp.layers.1.bias', 'image_encoder.trunk.blocks.15.mlp.layers.0.weight', 'image_encoder.trunk.blocks.25.attn.proj.weight', 'image_encoder.trunk.blocks.9.mlp.layers.0.bias', 'image_encoder.trunk.blocks.33.norm1.bias', 'image_encoder.trunk.blocks.28.attn.proj.weight', 'image_encoder.trunk.blocks.34.norm1.weight', 'image_encoder.trunk.blocks.27.attn.qkv.bias', 'image_encoder.trunk.blocks.30.attn.proj.weight', 'image_encoder.trunk.blocks.45.attn.proj.weight', 'image_encoder.trunk.blocks.7.norm1.weight', 'image_encoder.trunk.blocks.4.attn.qkv.bias', 'image_encoder.trunk.blocks.30.mlp.layers.0.weight', 'image_encoder.trunk.blocks.45.norm1.bias', 'image_encoder.trunk.blocks.6.norm2.bias', 'image_encoder.trunk.blocks.6.attn.qkv.bias', 'image_encoder.trunk.blocks.16.attn.qkv.weight', 'image_encoder.trunk.blocks.20.norm1.weight', 'image_encoder.trunk.blocks.22.attn.qkv.bias', 'image_encoder.trunk.blocks.4.mlp.layers.1.weight', 'image_encoder.trunk.blocks.26.mlp.layers.1.bias', 'image_encoder.trunk.blocks.14.attn.qkv.weight', 'image_encoder.trunk.blocks.38.norm1.bias', 
'image_encoder.trunk.blocks.11.mlp.layers.0.weight', 'image_encoder.trunk.blocks.15.norm2.bias', 'image_encoder.trunk.blocks.2.norm1.bias', 'image_encoder.trunk.blocks.43.attn.proj.weight', 'image_encoder.trunk.blocks.8.proj.bias', 'image_encoder.trunk.blocks.38.mlp.layers.0.weight', 'image_encoder.trunk.blocks.18.mlp.layers.1.bias', 'image_encoder.trunk.blocks.25.mlp.layers.1.weight', 'image_encoder.trunk.blocks.21.norm1.bias', 'image_encoder.trunk.blocks.22.norm1.bias', 'image_encoder.trunk.blocks.38.attn.qkv.bias', 'image_encoder.trunk.blocks.47.norm2.weight', 'image_encoder.trunk.blocks.5.mlp.layers.0.bias', 'image_encoder.trunk.blocks.22.mlp.layers.0.weight', 'image_encoder.trunk.blocks.9.attn.qkv.weight', 'image_encoder.trunk.blocks.2.norm2.bias', 'image_encoder.trunk.blocks.27.norm2.bias', 'image_encoder.trunk.blocks.39.attn.proj.bias', 'image_encoder.trunk.blocks.43.norm2.weight', 'image_encoder.trunk.blocks.44.mlp.layers.1.weight', 'image_encoder.trunk.blocks.20.mlp.layers.1.bias', 'image_encoder.trunk.blocks.33.attn.qkv.weight', 'image_encoder.trunk.blocks.12.mlp.layers.1.bias', 'image_encoder.trunk.blocks.15.norm2.weight', 'image_encoder.trunk.blocks.33.mlp.layers.1.weight', 'image_encoder.trunk.blocks.2.proj.bias', 'image_encoder.trunk.blocks.29.attn.proj.weight', 'image_encoder.trunk.blocks.0.attn.proj.bias', 'image_encoder.trunk.blocks.12.attn.proj.weight', 'image_encoder.trunk.blocks.8.attn.qkv.weight', 'image_encoder.trunk.blocks.7.norm2.weight', 'image_encoder.trunk.blocks.41.attn.proj.weight', 'image_encoder.trunk.blocks.6.mlp.layers.0.bias', 'image_encoder.trunk.blocks.21.mlp.layers.1.bias', 'image_encoder.trunk.blocks.32.attn.qkv.bias', 'image_encoder.trunk.blocks.19.mlp.layers.1.bias', 'image_encoder.trunk.blocks.45.attn.proj.bias', 'image_encoder.trunk.blocks.41.attn.qkv.weight', 'image_encoder.trunk.blocks.31.norm2.weight', 'image_encoder.trunk.blocks.35.attn.qkv.bias', 'image_encoder.trunk.blocks.31.attn.qkv.weight', 
'image_encoder.trunk.blocks.9.mlp.layers.0.weight', 'image_encoder.trunk.blocks.39.mlp.layers.1.bias', 'image_encoder.trunk.blocks.20.norm2.weight', 'image_encoder.trunk.blocks.32.attn.proj.bias', 'image_encoder.trunk.blocks.4.norm1.bias', 'image_encoder.trunk.blocks.40.mlp.layers.1.bias', 'image_encoder.trunk.blocks.30.attn.qkv.bias', 'image_encoder.trunk.blocks.24.norm2.bias', 'image_encoder.trunk.blocks.21.norm2.weight', 'image_encoder.trunk.blocks.20.norm1.bias', 'image_encoder.trunk.blocks.33.attn.proj.weight', 'image_encoder.trunk.blocks.40.attn.qkv.weight', 'image_encoder.trunk.blocks.35.norm1.bias', 'image_encoder.trunk.blocks.29.norm1.bias', 'image_encoder.trunk.blocks.29.attn.qkv.weight', 'image_encoder.trunk.blocks.0.norm1.weight', 'image_encoder.trunk.blocks.2.attn.proj.bias', 'image_encoder.trunk.blocks.16.mlp.layers.0.bias', 'image_encoder.trunk.blocks.39.mlp.layers.0.weight', 'image_encoder.trunk.blocks.18.norm1.bias', 'image_encoder.trunk.blocks.13.attn.qkv.weight', 'image_encoder.trunk.blocks.11.mlp.layers.0.bias', 'image_encoder.trunk.blocks.0.mlp.layers.1.bias', 'image_encoder.trunk.blocks.3.mlp.layers.1.bias', 'image_encoder.trunk.blocks.9.norm2.bias', 'image_encoder.trunk.blocks.22.attn.proj.weight', 'image_encoder.trunk.blocks.12.attn.qkv.weight', 'image_encoder.trunk.blocks.19.attn.qkv.bias', 'image_encoder.trunk.blocks.9.attn.qkv.bias', 'image_encoder.trunk.blocks.36.mlp.layers.0.bias', 'image_encoder.trunk.blocks.27.mlp.layers.1.bias', 'image_encoder.trunk.blocks.40.attn.proj.bias', 'image_encoder.trunk.blocks.31.norm1.weight', 'image_encoder.trunk.blocks.40.norm1.bias', 'image_encoder.trunk.blocks.32.attn.proj.weight', 'image_encoder.trunk.blocks.28.mlp.layers.1.bias', 'image_encoder.trunk.blocks.45.attn.qkv.weight', 'image_encoder.trunk.blocks.24.mlp.layers.1.weight', 'image_encoder.trunk.blocks.18.attn.qkv.bias', 'image_encoder.trunk.blocks.38.norm2.bias', 'image_encoder.trunk.blocks.3.attn.proj.bias', 
'image_encoder.trunk.blocks.41.attn.qkv.bias', 'image_encoder.trunk.blocks.46.attn.proj.weight', 'image_encoder.trunk.blocks.19.mlp.layers.1.weight', 'image_encoder.trunk.blocks.35.mlp.layers.0.weight', 'image_encoder.trunk.blocks.22.attn.qkv.weight', 'image_encoder.trunk.blocks.20.attn.qkv.bias', 'image_encoder.trunk.blocks.23.mlp.layers.0.weight', 'image_encoder.trunk.blocks.24.norm2.weight', 'image_encoder.trunk.blocks.10.norm1.weight', 'image_encoder.trunk.blocks.18.norm2.bias', 'image_encoder.trunk.blocks.42.mlp.layers.0.weight', 'image_encoder.trunk.blocks.12.norm2.bias', 'image_encoder.trunk.blocks.9.norm1.weight', 'image_encoder.trunk.blocks.44.mlp.layers.1.bias', 'image_encoder.trunk.blocks.41.norm2.weight', 'image_encoder.trunk.blocks.28.attn.qkv.bias', 'image_encoder.trunk.blocks.4.attn.proj.bias', 'image_encoder.trunk.blocks.41.norm2.bias', 'image_encoder.trunk.blocks.7.mlp.layers.0.weight', 'image_encoder.trunk.blocks.13.attn.qkv.bias', 'image_encoder.trunk.blocks.16.norm1.bias', 'image_encoder.trunk.blocks.39.mlp.layers.1.weight', 'image_encoder.trunk.blocks.34.norm2.weight', 'image_encoder.trunk.blocks.16.norm2.bias', 'image_encoder.trunk.blocks.33.attn.proj.bias', 'image_encoder.trunk.blocks.46.norm1.bias', 'image_encoder.trunk.blocks.10.mlp.layers.1.bias', 'image_encoder.trunk.blocks.15.attn.qkv.bias', 'image_encoder.trunk.blocks.12.norm2.weight', 'image_encoder.trunk.blocks.47.attn.proj.bias', 'image_encoder.trunk.blocks.7.norm1.bias', 'image_encoder.trunk.blocks.2.attn.qkv.bias', 'image_encoder.trunk.blocks.40.norm2.bias', 'image_encoder.trunk.blocks.2.attn.qkv.weight', 'image_encoder.trunk.blocks.4.mlp.layers.0.weight', 'image_encoder.trunk.blocks.37.norm1.weight', 'image_encoder.trunk.blocks.44.norm1.bias', 'image_encoder.trunk.blocks.2.norm2.weight', 'image_encoder.trunk.blocks.23.attn.proj.weight', 'image_encoder.trunk.blocks.15.mlp.layers.1.bias', 'image_encoder.trunk.blocks.16.mlp.layers.0.weight', 
'image_encoder.trunk.blocks.21.norm1.weight', 'image_encoder.trunk.blocks.8.norm1.weight', 'image_encoder.trunk.blocks.36.attn.qkv.bias', 'image_encoder.trunk.blocks.7.norm2.bias', 'image_encoder.trunk.blocks.45.mlp.layers.1.bias', 'image_encoder.trunk.blocks.33.attn.qkv.bias', 'image_encoder.trunk.blocks.5.norm1.bias', 'image_encoder.trunk.blocks.42.attn.proj.bias', 'image_encoder.trunk.blocks.38.mlp.layers.1.bias', 'image_encoder.trunk.blocks.45.norm2.weight', 'image_encoder.trunk.blocks.6.attn.qkv.weight', 'image_encoder.trunk.blocks.37.attn.qkv.weight', 'image_encoder.trunk.blocks.40.norm1.weight', 'image_encoder.trunk.blocks.42.mlp.layers.0.bias', 'image_encoder.trunk.blocks.43.attn.qkv.weight', 'image_encoder.neck.convs.0.conv.bias', 'image_encoder.trunk.blocks.13.norm2.weight', 'image_encoder.trunk.blocks.14.norm1.bias', 'image_encoder.trunk.blocks.7.mlp.layers.1.bias', 'image_encoder.trunk.blocks.46.norm2.bias', 'image_encoder.trunk.blocks.6.norm1.bias', 'image_encoder.trunk.blocks.7.mlp.layers.1.weight', 'image_encoder.neck.convs.0.conv.weight', 'image_encoder.trunk.blocks.17.attn.qkv.bias', 'image_encoder.trunk.blocks.34.mlp.layers.0.weight', 'image_encoder.trunk.blocks.17.mlp.layers.0.bias', 'image_encoder.trunk.blocks.30.mlp.layers.1.bias', 'image_encoder.trunk.blocks.39.norm2.bias', 'image_encoder.trunk.blocks.18.attn.proj.weight', 'image_encoder.trunk.blocks.16.norm2.weight', 'image_encoder.trunk.blocks.12.mlp.layers.0.bias', 'image_encoder.trunk.blocks.30.norm1.bias', 'image_encoder.trunk.blocks.6.attn.proj.weight', 'image_encoder.trunk.blocks.26.norm2.bias', 'image_encoder.trunk.blocks.28.mlp.layers.1.weight', 'image_encoder.trunk.blocks.24.norm1.weight', 'image_encoder.trunk.blocks.29.norm2.weight', 'image_encoder.trunk.blocks.8.proj.weight', 'image_encoder.trunk.blocks.17.attn.proj.bias', 'image_encoder.trunk.blocks.31.mlp.layers.0.weight', 'image_encoder.trunk.blocks.20.mlp.layers.0.bias', 'image_encoder.trunk.blocks.42.attn.qkv.weight', 
'image_encoder.trunk.blocks.8.norm2.bias', 'image_encoder.trunk.blocks.45.mlp.layers.0.bias', 'image_encoder.trunk.blocks.4.norm1.weight', 'image_encoder.trunk.blocks.9.attn.proj.bias', 'image_encoder.trunk.blocks.46.mlp.layers.0.bias', 'image_encoder.trunk.blocks.18.mlp.layers.0.bias', 'image_encoder.trunk.blocks.27.mlp.layers.0.weight', 'image_encoder.trunk.blocks.24.attn.qkv.weight', 'image_encoder.trunk.blocks.23.attn.qkv.bias', 'image_encoder.trunk.blocks.44.attn.proj.weight', 'image_encoder.trunk.blocks.2.attn.proj.weight', 'image_encoder.trunk.blocks.16.attn.proj.bias', 'image_encoder.trunk.blocks.6.norm2.weight', 'image_encoder.trunk.blocks.19.mlp.layers.0.bias', 'image_encoder.trunk.blocks.37.attn.proj.bias', 'image_encoder.trunk.blocks.46.norm1.weight', 'image_encoder.trunk.blocks.36.attn.proj.weight', 'image_encoder.trunk.blocks.17.attn.qkv.weight', 'image_encoder.trunk.blocks.15.norm1.bias', 'image_encoder.trunk.blocks.30.attn.qkv.weight', 'image_encoder.trunk.blocks.35.mlp.layers.0.bias', 'image_encoder.trunk.blocks.15.attn.proj.weight', 'image_encoder.trunk.blocks.28.attn.qkv.weight', 'image_encoder.trunk.blocks.35.norm2.weight', 'image_encoder.trunk.blocks.12.norm1.bias', 'image_encoder.trunk.blocks.22.attn.proj.bias', 'image_encoder.trunk.blocks.27.mlp.layers.1.weight', 'image_encoder.trunk.blocks.25.norm2.bias', 'image_encoder.trunk.patch_embed.proj.weight', 'image_encoder.trunk.blocks.29.mlp.layers.0.bias', 'image_encoder.trunk.blocks.4.attn.proj.weight', 'image_encoder.trunk.blocks.25.attn.qkv.bias', 'image_encoder.trunk.blocks.35.mlp.layers.1.weight', 'image_encoder.trunk.blocks.1.norm1.bias', 'image_encoder.trunk.blocks.8.attn.proj.bias', 'image_encoder.trunk.blocks.14.mlp.layers.1.bias', 'image_encoder.trunk.blocks.13.mlp.layers.0.bias', 'image_encoder.trunk.blocks.25.norm1.weight', 'image_encoder.trunk.blocks.11.attn.qkv.weight', 'image_encoder.trunk.blocks.28.norm1.weight', 'image_encoder.trunk.blocks.23.attn.qkv.weight', 
'image_encoder.trunk.blocks.10.attn.proj.bias', 'image_encoder.trunk.pos_embed', 'image_encoder.trunk.blocks.14.attn.qkv.bias', 'image_encoder.trunk.blocks.45.mlp.layers.1.weight', 'image_encoder.trunk.blocks.43.attn.proj.bias', 'image_encoder.trunk.blocks.32.mlp.layers.1.bias', 'image_encoder.trunk.blocks.26.norm2.weight', 'image_encoder.trunk.blocks.43.norm1.bias', 'image_encoder.trunk.blocks.26.norm1.weight', 'image_encoder.trunk.blocks.17.norm2.bias', 'image_encoder.trunk.blocks.26.norm1.bias', 'image_encoder.trunk.blocks.29.mlp.layers.1.bias', 'image_encoder.trunk.blocks.41.mlp.layers.1.weight', 'image_encoder.trunk.blocks.24.mlp.layers.0.weight', 'image_encoder.trunk.blocks.25.norm2.weight', 'image_encoder.trunk.blocks.34.mlp.layers.0.bias', 'image_encoder.trunk.blocks.32.mlp.layers.0.bias', 'image_encoder.trunk.blocks.43.mlp.layers.0.weight', 'image_encoder.trunk.blocks.7.attn.proj.bias', 'image_encoder.trunk.blocks.41.mlp.layers.0.weight', 'image_encoder.trunk.blocks.42.norm1.bias', 'image_encoder.trunk.blocks.19.attn.qkv.weight', 'image_encoder.trunk.blocks.12.attn.proj.bias', 'image_encoder.trunk.blocks.8.norm1.bias', 'image_encoder.trunk.blocks.20.attn.qkv.weight', 'image_encoder.trunk.blocks.33.mlp.layers.1.bias', 'image_encoder.trunk.blocks.34.norm2.bias', 'image_encoder.trunk.blocks.39.norm1.weight', 'image_encoder.trunk.blocks.40.mlp.layers.0.weight', 'image_encoder.trunk.blocks.3.attn.qkv.bias', 'image_encoder.trunk.blocks.17.norm1.bias', 'image_encoder.trunk.blocks.25.mlp.layers.0.weight', 'image_encoder.trunk.blocks.29.norm1.weight', 'image_encoder.trunk.blocks.3.attn.qkv.weight', 'image_encoder.trunk.blocks.17.mlp.layers.1.weight', 'image_encoder.trunk.blocks.5.mlp.layers.1.weight', 'image_encoder.trunk.blocks.25.attn.qkv.weight', 'image_encoder.trunk.blocks.14.mlp.layers.1.weight', 'image_encoder.trunk.blocks.11.norm2.bias', 'image_encoder.trunk.blocks.43.mlp.layers.1.bias', 'image_encoder.trunk.blocks.6.mlp.layers.0.weight', 
'image_encoder.trunk.blocks.34.attn.proj.bias', 'image_encoder.trunk.blocks.3.mlp.layers.0.bias', 'image_encoder.trunk.blocks.34.mlp.layers.1.bias', 'image_encoder.trunk.blocks.23.norm1.weight', 'image_encoder.trunk.blocks.14.attn.proj.weight', 'image_encoder.trunk.blocks.36.norm2.weight', 'image_encoder.trunk.blocks.30.attn.proj.bias', 'image_encoder.trunk.blocks.44.mlp.layers.0.bias', 'image_encoder.trunk.blocks.4.norm2.bias', 'image_encoder.trunk.blocks.38.mlp.layers.1.weight', 'image_encoder.trunk.blocks.31.attn.qkv.bias', 'image_encoder.trunk.blocks.35.norm2.bias', 'image_encoder.trunk.blocks.17.norm1.weight', 'image_encoder.trunk.blocks.21.attn.proj.bias', 'image_encoder.trunk.blocks.24.attn.proj.bias', 'image_encoder.trunk.blocks.45.mlp.layers.0.weight', 'image_encoder.trunk.blocks.39.attn.qkv.bias', 'image_encoder.trunk.blocks.28.norm1.bias', 'image_encoder.trunk.blocks.33.norm2.bias', 'image_encoder.trunk.blocks.34.attn.qkv.bias', 'image_encoder.trunk.blocks.32.norm2.weight', 'image_encoder.trunk.blocks.10.norm1.bias', 'image_encoder.trunk.blocks.20.norm2.bias', 'image_encoder.neck.convs.1.conv.weight', 'image_encoder.trunk.blocks.26.attn.qkv.weight', 'image_encoder.trunk.blocks.40.mlp.layers.1.weight', 'image_encoder.trunk.blocks.29.mlp.layers.1.weight', 'image_encoder.trunk.blocks.41.norm1.weight', 'image_encoder.trunk.blocks.2.norm1.weight', 'image_encoder.trunk.blocks.36.mlp.layers.1.bias', 'image_encoder.trunk.blocks.10.attn.proj.weight', 'image_encoder.trunk.blocks.39.norm2.weight', 'image_encoder.trunk.blocks.47.mlp.layers.1.weight', 'image_encoder.trunk.blocks.11.mlp.layers.1.weight', 'image_encoder.trunk.blocks.19.norm1.bias', 'image_encoder.trunk.blocks.30.norm1.weight', 'image_encoder.trunk.blocks.5.norm1.weight', 'image_encoder.trunk.blocks.36.norm1.weight', 'image_encoder.trunk.blocks.5.attn.qkv.bias', 'image_encoder.trunk.blocks.37.attn.proj.weight', 'image_encoder.trunk.blocks.37.mlp.layers.0.bias', 
'image_encoder.trunk.blocks.10.norm2.weight', 'image_encoder.trunk.blocks.14.norm2.bias', 'image_encoder.trunk.blocks.19.mlp.layers.0.weight', 'image_encoder.trunk.blocks.25.mlp.layers.0.bias', 'image_encoder.trunk.blocks.37.norm2.weight', 'image_encoder.trunk.blocks.8.norm2.weight', 'image_encoder.trunk.blocks.23.attn.proj.bias', 'image_encoder.trunk.blocks.26.attn.proj.bias', 'image_encoder.trunk.blocks.27.attn.proj.weight', 'image_encoder.trunk.blocks.37.norm1.bias', 'image_encoder.trunk.blocks.17.mlp.layers.0.weight', 'image_encoder.trunk.blocks.38.attn.proj.weight', 'image_encoder.trunk.blocks.21.attn.qkv.weight', 'image_encoder.trunk.blocks.32.norm1.bias', 'image_encoder.trunk.blocks.18.norm1.weight', 'image_encoder.trunk.blocks.42.norm1.weight', 'image_encoder.trunk.blocks.30.norm2.weight', 'image_encoder.trunk.blocks.42.mlp.layers.1.bias', 'image_encoder.trunk.blocks.47.norm2.bias', 'image_encoder.trunk.blocks.32.mlp.layers.0.weight', 'image_encoder.trunk.blocks.46.mlp.layers.0.weight', 'image_encoder.trunk.blocks.2.mlp.layers.0.bias', 'image_encoder.trunk.blocks.11.norm1.bias', 'image_encoder.trunk.blocks.36.mlp.layers.1.weight', 'image_encoder.trunk.blocks.42.attn.proj.weight', 'image_encoder.trunk.blocks.7.attn.qkv.weight', 'image_encoder.trunk.blocks.16.norm1.weight', 'image_encoder.trunk.blocks.47.attn.qkv.weight'}
INFO 2025-01-05 00:44:57,319 optimizer.py: 248: Matches for param_name [*bias*]: {'image_encoder.trunk.blocks.15.mlp.layers.0.bias', 'memory_attention.layers.0.linear1.bias', 'sam_mask_decoder.transformer.layers.0.self_attn.q_proj.bias', 'sam_mask_decoder.transformer.layers.1.mlp.layers.0.bias', 'memory_encoder.out_proj.bias', 'image_encoder.trunk.blocks.39.norm1.bias', 'image_encoder.trunk.blocks.41.mlp.layers.0.bias', 'image_encoder.trunk.blocks.6.attn.proj.bias', 'image_encoder.trunk.blocks.37.norm2.bias', 'image_encoder.trunk.blocks.38.attn.proj.bias', 'image_encoder.trunk.blocks.0.mlp.layers.0.bias', 
'image_encoder.trunk.blocks.23.mlp.layers.1.bias', 'image_encoder.trunk.blocks.32.norm2.bias', 'image_encoder.trunk.blocks.47.mlp.layers.0.bias', 'image_encoder.trunk.blocks.36.attn.proj.bias', 'image_encoder.trunk.blocks.19.attn.proj.bias', 'image_encoder.trunk.blocks.19.norm2.bias', 'image_encoder.trunk.blocks.1.attn.qkv.bias', 'image_encoder.trunk.blocks.11.attn.qkv.bias', 'image_encoder.trunk.blocks.39.mlp.layers.0.bias', 'memory_attention.layers.3.self_attn.q_proj.bias', 'sam_mask_decoder.output_hypernetworks_mlps.1.layers.2.bias', 'image_encoder.trunk.blocks.29.attn.proj.bias', 'image_encoder.trunk.blocks.10.attn.qkv.bias', 'image_encoder.trunk.blocks.22.norm2.bias', 'image_encoder.trunk.blocks.20.attn.proj.bias', 'image_encoder.trunk.blocks.43.norm2.bias', 'memory_attention.layers.3.norm2.bias', 'image_encoder.trunk.blocks.40.attn.qkv.bias', 'image_encoder.trunk.blocks.5.attn.proj.bias', 'memory_attention.layers.0.cross_attn_image.k_proj.bias', 'image_encoder.trunk.blocks.47.norm1.bias', 'image_encoder.trunk.blocks.4.mlp.layers.0.bias', 'image_encoder.trunk.blocks.30.mlp.layers.0.bias', 'image_encoder.trunk.blocks.43.mlp.layers.0.bias', 'image_encoder.trunk.blocks.40.mlp.layers.0.bias', 'image_encoder.trunk.blocks.23.norm1.bias', 'image_encoder.trunk.blocks.26.attn.qkv.bias', 'memory_attention.layers.3.norm3.bias', 'obj_ptr_proj.layers.2.bias', 'image_encoder.trunk.blocks.23.norm2.bias', 'image_encoder.trunk.blocks.30.norm2.bias', 'image_encoder.trunk.blocks.13.norm2.bias', 'sam_mask_decoder.transformer.norm_final_attn.bias', 'image_encoder.trunk.blocks.28.mlp.layers.0.bias', 'image_encoder.trunk.patch_embed.proj.bias', 'sam_mask_decoder.output_hypernetworks_mlps.3.layers.2.bias', 'image_encoder.trunk.blocks.24.norm1.bias', 'sam_prompt_encoder.mask_downscaling.6.bias', 'image_encoder.trunk.blocks.8.attn.qkv.bias', 'memory_attention.layers.0.self_attn.out_proj.bias', 'sam_mask_decoder.output_hypernetworks_mlps.3.layers.0.bias', 'obj_ptr_tpos_proj.bias', 
'image_encoder.neck.convs.3.conv.bias', 'image_encoder.trunk.blocks.42.attn.qkv.bias', 'image_encoder.trunk.blocks.15.attn.proj.bias', 'image_encoder.trunk.blocks.28.norm2.bias', 'image_encoder.trunk.blocks.33.mlp.layers.0.bias', 'image_encoder.trunk.blocks.0.norm2.bias', 'image_encoder.trunk.blocks.8.mlp.layers.1.bias', 'image_encoder.trunk.blocks.9.norm1.bias', 'image_encoder.trunk.blocks.21.norm2.bias', 'image_encoder.trunk.blocks.16.mlp.layers.1.bias', 'image_encoder.trunk.blocks.13.attn.proj.bias', 'image_encoder.trunk.blocks.28.attn.proj.bias', 'image_encoder.trunk.blocks.7.mlp.layers.0.bias', 'image_encoder.trunk.blocks.14.attn.proj.bias', 'image_encoder.trunk.blocks.34.norm1.bias', 'image_encoder.trunk.blocks.27.mlp.layers.0.bias', 'image_encoder.trunk.blocks.3.norm2.bias', 'memory_attention.layers.3.self_attn.v_proj.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_image_to_token.k_proj.bias', 'image_encoder.trunk.blocks.7.attn.qkv.bias', 'image_encoder.trunk.blocks.37.attn.qkv.bias', 'memory_attention.layers.2.norm2.bias', 'sam_mask_decoder.pred_obj_score_head.layers.1.bias', 'image_encoder.trunk.blocks.25.norm1.bias', 'image_encoder.trunk.blocks.46.attn.qkv.bias', 'image_encoder.trunk.blocks.29.norm2.bias', 'image_encoder.trunk.blocks.18.attn.proj.bias', 'image_encoder.trunk.blocks.13.mlp.layers.1.bias', 'image_encoder.trunk.blocks.22.mlp.layers.0.bias', 'sam_mask_decoder.transformer.layers.1.self_attn.v_proj.bias', 'image_encoder.trunk.blocks.31.norm1.bias', 'image_encoder.trunk.blocks.44.proj.bias', 'image_encoder.trunk.blocks.26.mlp.layers.0.bias', 'sam_mask_decoder.output_hypernetworks_mlps.2.layers.2.bias', 'image_encoder.trunk.blocks.5.norm2.bias', 'image_encoder.trunk.blocks.46.attn.proj.bias', 'image_encoder.trunk.blocks.38.mlp.layers.0.bias', 'memory_attention.layers.1.norm3.bias', 'memory_attention.layers.0.cross_attn_image.out_proj.bias', 'sam_mask_decoder.transformer.layers.0.self_attn.k_proj.bias', 
'image_encoder.trunk.blocks.2.mlp.layers.1.bias', 'image_encoder.trunk.blocks.1.attn.proj.bias', 'image_encoder.trunk.blocks.31.attn.proj.bias', 'image_encoder.trunk.blocks.17.mlp.layers.1.bias', 'image_encoder.trunk.blocks.37.mlp.layers.1.bias', 'image_encoder.trunk.blocks.35.mlp.layers.1.bias', 'memory_attention.layers.2.norm1.bias', 'image_encoder.trunk.blocks.46.mlp.layers.1.bias', 'memory_attention.layers.3.cross_attn_image.k_proj.bias', 'image_encoder.trunk.blocks.21.attn.qkv.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_token_to_image.q_proj.bias', 'memory_encoder.fuser.layers.1.dwconv.bias', 'memory_attention.layers.3.self_attn.k_proj.bias', 'sam_mask_decoder.output_hypernetworks_mlps.0.layers.1.bias', 'image_encoder.trunk.blocks.25.attn.proj.bias', 'image_encoder.trunk.blocks.41.norm1.bias', 'sam_mask_decoder.transformer.layers.1.norm1.bias', 'image_encoder.trunk.blocks.31.norm2.bias', 'image_encoder.trunk.blocks.21.mlp.layers.0.bias', 'image_encoder.trunk.blocks.1.mlp.layers.0.bias', 'image_encoder.neck.convs.1.conv.bias', 'image_encoder.trunk.blocks.24.attn.qkv.bias', 'image_encoder.trunk.blocks.35.attn.proj.bias', 'sam_mask_decoder.transformer.layers.0.norm2.bias', 'image_encoder.trunk.blocks.36.norm2.bias', 'image_encoder.trunk.blocks.47.attn.qkv.bias', 'image_encoder.trunk.blocks.11.attn.proj.bias', 'image_encoder.trunk.blocks.44.norm2.bias', 'image_encoder.trunk.blocks.44.attn.proj.bias', 'image_encoder.trunk.blocks.8.mlp.layers.0.bias', 'sam_mask_decoder.output_hypernetworks_mlps.3.layers.1.bias', 'image_encoder.trunk.blocks.29.attn.qkv.bias', 'image_encoder.trunk.blocks.24.mlp.layers.0.bias', 'image_encoder.trunk.blocks.47.mlp.layers.1.bias', 'memory_attention.layers.1.self_attn.out_proj.bias', 'image_encoder.trunk.blocks.14.mlp.layers.0.bias', 'image_encoder.trunk.blocks.41.mlp.layers.1.bias', 'memory_attention.layers.0.cross_attn_image.v_proj.bias', 'image_encoder.trunk.blocks.10.norm2.bias', 'memory_attention.layers.2.linear2.bias', 
'image_encoder.trunk.blocks.36.norm1.bias', 'image_encoder.trunk.blocks.23.mlp.layers.0.bias', 'image_encoder.trunk.blocks.41.attn.proj.bias', 'image_encoder.trunk.blocks.1.norm2.bias', 'memory_encoder.mask_downsampler.encoder.10.bias', 'image_encoder.trunk.blocks.4.mlp.layers.1.bias', 'image_encoder.trunk.blocks.9.mlp.layers.1.bias', 'image_encoder.trunk.blocks.5.mlp.layers.1.bias', 'image_encoder.neck.convs.2.conv.bias', 'memory_attention.layers.1.self_attn.q_proj.bias', 'image_encoder.trunk.blocks.27.attn.proj.bias', 'image_encoder.trunk.blocks.11.mlp.layers.1.bias', 'memory_attention.layers.1.cross_attn_image.v_proj.bias', 'memory_attention.norm.bias', 'image_encoder.trunk.blocks.31.mlp.layers.1.bias', 'sam_mask_decoder.iou_prediction_head.layers.2.bias', 'memory_encoder.fuser.layers.0.norm.bias', 'image_encoder.trunk.blocks.3.norm1.bias', 'image_encoder.trunk.blocks.10.mlp.layers.0.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_token_to_image.k_proj.bias', 'memory_attention.layers.2.self_attn.q_proj.bias', 'image_encoder.trunk.blocks.31.mlp.layers.0.bias', 'image_encoder.trunk.blocks.45.norm2.bias', 'image_encoder.trunk.blocks.6.mlp.layers.1.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_image_to_token.v_proj.bias', 'image_encoder.trunk.blocks.0.attn.qkv.bias', 'memory_attention.layers.2.self_attn.v_proj.bias', 'image_encoder.trunk.blocks.42.norm2.bias', 'image_encoder.trunk.blocks.45.attn.qkv.bias', 'image_encoder.trunk.blocks.25.mlp.layers.1.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_token_to_image.v_proj.bias', 'sam_mask_decoder.output_hypernetworks_mlps.1.layers.1.bias', 'image_encoder.trunk.blocks.43.attn.qkv.bias', 'sam_mask_decoder.conv_s1.bias', 'image_encoder.trunk.blocks.0.norm1.bias', 'image_encoder.trunk.blocks.27.norm1.bias', 'image_encoder.trunk.blocks.44.attn.qkv.bias', 'image_encoder.trunk.blocks.24.mlp.layers.1.bias', 'image_encoder.trunk.blocks.12.attn.qkv.bias', 'image_encoder.trunk.blocks.16.attn.qkv.bias', 
'sam_prompt_encoder.mask_downscaling.4.bias', 'image_encoder.trunk.blocks.22.mlp.layers.1.bias', 'image_encoder.trunk.blocks.13.norm1.bias', 'image_encoder.trunk.blocks.1.mlp.layers.1.bias', 'image_encoder.trunk.blocks.9.mlp.layers.0.bias', 'image_encoder.trunk.blocks.33.norm1.bias', 'memory_attention.layers.1.linear1.bias', 'image_encoder.trunk.blocks.27.attn.qkv.bias', 'sam_mask_decoder.transformer.layers.0.norm1.bias', 'sam_mask_decoder.iou_prediction_head.layers.1.bias', 'memory_attention.layers.3.linear1.bias', 'image_encoder.trunk.blocks.4.attn.qkv.bias', 'memory_attention.layers.0.cross_attn_image.q_proj.bias', 'image_encoder.trunk.blocks.45.norm1.bias', 'image_encoder.trunk.blocks.6.norm2.bias', 'image_encoder.trunk.blocks.6.attn.qkv.bias', 'image_encoder.trunk.blocks.22.attn.qkv.bias', 'sam_mask_decoder.output_upscaling.3.bias', 'memory_attention.layers.0.norm3.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_token_to_image.k_proj.bias', 'sam_mask_decoder.transformer.layers.0.self_attn.v_proj.bias', 'image_encoder.trunk.blocks.26.mlp.layers.1.bias', 'image_encoder.trunk.blocks.38.norm1.bias', 'image_encoder.trunk.blocks.15.norm2.bias', 'image_encoder.trunk.blocks.2.norm1.bias', 'memory_attention.layers.2.cross_attn_image.k_proj.bias', 'image_encoder.trunk.blocks.8.proj.bias', 'image_encoder.trunk.blocks.18.mlp.layers.1.bias', 'image_encoder.trunk.blocks.21.norm1.bias', 'image_encoder.trunk.blocks.22.norm1.bias', 'image_encoder.trunk.blocks.38.attn.qkv.bias', 'memory_attention.layers.2.linear1.bias', 'sam_mask_decoder.output_hypernetworks_mlps.2.layers.0.bias', 'obj_ptr_proj.layers.0.bias', 'memory_encoder.fuser.layers.1.norm.bias', 'image_encoder.trunk.blocks.5.mlp.layers.0.bias', 'sam_mask_decoder.output_hypernetworks_mlps.1.layers.0.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_image_to_token.out_proj.bias', 'memory_encoder.mask_downsampler.encoder.4.bias', 'image_encoder.trunk.blocks.2.norm2.bias', 
'image_encoder.trunk.blocks.27.norm2.bias', 'image_encoder.trunk.blocks.39.attn.proj.bias', 'image_encoder.trunk.blocks.20.mlp.layers.1.bias', 'sam_mask_decoder.transformer.final_attn_token_to_image.k_proj.bias', 'image_encoder.trunk.blocks.12.mlp.layers.1.bias', 'memory_encoder.pix_feat_proj.bias', 'memory_attention.layers.1.cross_attn_image.out_proj.bias', 'image_encoder.trunk.blocks.2.proj.bias', 'image_encoder.trunk.blocks.0.attn.proj.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_token_to_image.out_proj.bias', 'memory_encoder.mask_downsampler.encoder.0.bias', 'memory_encoder.mask_downsampler.encoder.7.bias', 'sam_prompt_encoder.mask_downscaling.1.bias', 'image_encoder.trunk.blocks.6.mlp.layers.0.bias', 'image_encoder.trunk.blocks.21.mlp.layers.1.bias', 'image_encoder.trunk.blocks.32.attn.qkv.bias', 'image_encoder.trunk.blocks.19.mlp.layers.1.bias', 'image_encoder.trunk.blocks.45.attn.proj.bias', 'image_encoder.trunk.blocks.35.attn.qkv.bias', 'sam_mask_decoder.transformer.layers.1.norm4.bias', 'sam_mask_decoder.transformer.layers.1.self_attn.q_proj.bias', 'image_encoder.trunk.blocks.39.mlp.layers.1.bias', 'memory_attention.layers.1.self_attn.k_proj.bias', 'image_encoder.trunk.blocks.32.attn.proj.bias', 'image_encoder.trunk.blocks.4.norm1.bias', 'image_encoder.trunk.blocks.40.mlp.layers.1.bias', 'image_encoder.trunk.blocks.30.attn.qkv.bias', 'image_encoder.trunk.blocks.24.norm2.bias', 'image_encoder.trunk.blocks.20.norm1.bias', 'image_encoder.trunk.blocks.35.norm1.bias', 'image_encoder.trunk.blocks.29.norm1.bias', 'sam_mask_decoder.pred_obj_score_head.layers.0.bias', 'image_encoder.trunk.blocks.2.attn.proj.bias', 'image_encoder.trunk.blocks.16.mlp.layers.0.bias', 'memory_attention.layers.1.norm2.bias', 'image_encoder.trunk.blocks.18.norm1.bias', 'memory_encoder.mask_downsampler.encoder.12.bias', 'sam_mask_decoder.transformer.layers.1.self_attn.k_proj.bias', 'image_encoder.trunk.blocks.11.mlp.layers.0.bias', 
'image_encoder.trunk.blocks.0.mlp.layers.1.bias', 'image_encoder.trunk.blocks.3.mlp.layers.1.bias', 'image_encoder.trunk.blocks.9.norm2.bias', 'sam_mask_decoder.transformer.final_attn_token_to_image.q_proj.bias', 'image_encoder.trunk.blocks.19.attn.qkv.bias', 'image_encoder.trunk.blocks.9.attn.qkv.bias', 'obj_ptr_proj.layers.1.bias', 'image_encoder.trunk.blocks.36.mlp.layers.0.bias', 'image_encoder.trunk.blocks.27.mlp.layers.1.bias', 'image_encoder.trunk.blocks.40.attn.proj.bias', 'image_encoder.trunk.blocks.40.norm1.bias', 'memory_attention.layers.1.cross_attn_image.q_proj.bias', 'image_encoder.trunk.blocks.28.mlp.layers.1.bias', 'image_encoder.trunk.blocks.18.attn.qkv.bias', 'image_encoder.trunk.blocks.38.norm2.bias', 'sam_mask_decoder.output_hypernetworks_mlps.2.layers.1.bias', 'image_encoder.trunk.blocks.3.attn.proj.bias', 'image_encoder.trunk.blocks.41.attn.qkv.bias', 'memory_attention.layers.0.self_attn.k_proj.bias', 'sam_prompt_encoder.mask_downscaling.0.bias', 'image_encoder.trunk.blocks.20.attn.qkv.bias', 'sam_mask_decoder.iou_prediction_head.layers.0.bias', 'image_encoder.trunk.blocks.18.norm2.bias', 'image_encoder.trunk.blocks.12.norm2.bias', 'image_encoder.trunk.blocks.44.mlp.layers.1.bias', 'image_encoder.trunk.blocks.28.attn.qkv.bias', 'image_encoder.trunk.blocks.4.attn.proj.bias', 'image_encoder.trunk.blocks.41.norm2.bias', 'image_encoder.trunk.blocks.13.attn.qkv.bias', 'image_encoder.trunk.blocks.16.norm1.bias', 'image_encoder.trunk.blocks.16.norm2.bias', 'image_encoder.trunk.blocks.33.attn.proj.bias', 'memory_attention.layers.1.linear2.bias', 'image_encoder.trunk.blocks.46.norm1.bias', 'image_encoder.trunk.blocks.10.mlp.layers.1.bias', 'image_encoder.trunk.blocks.15.attn.qkv.bias', 'image_encoder.trunk.blocks.47.attn.proj.bias', 'image_encoder.trunk.blocks.7.norm1.bias', 'image_encoder.trunk.blocks.2.attn.qkv.bias', 'image_encoder.trunk.blocks.40.norm2.bias', 'sam_mask_decoder.transformer.layers.0.norm3.bias', 
'memory_attention.layers.2.cross_attn_image.v_proj.bias', 'image_encoder.trunk.blocks.44.norm1.bias', 'image_encoder.trunk.blocks.15.mlp.layers.1.bias', 'memory_encoder.fuser.layers.1.pwconv1.bias', 'image_encoder.trunk.blocks.36.attn.qkv.bias', 'image_encoder.trunk.blocks.7.norm2.bias', 'image_encoder.trunk.blocks.45.mlp.layers.1.bias', 'image_encoder.trunk.blocks.33.attn.qkv.bias', 'image_encoder.trunk.blocks.5.norm1.bias', 'image_encoder.trunk.blocks.42.attn.proj.bias', 'image_encoder.trunk.blocks.38.mlp.layers.1.bias', 'memory_attention.layers.2.self_attn.k_proj.bias', 'sam_mask_decoder.transformer.final_attn_token_to_image.v_proj.bias', 'image_encoder.neck.convs.0.conv.bias', 'image_encoder.trunk.blocks.42.mlp.layers.0.bias', 'memory_attention.layers.0.norm2.bias', 'image_encoder.trunk.blocks.14.norm1.bias', 'image_encoder.trunk.blocks.7.mlp.layers.1.bias', 'image_encoder.trunk.blocks.46.norm2.bias', 'image_encoder.trunk.blocks.6.norm1.bias', 'image_encoder.trunk.blocks.17.attn.qkv.bias', 'image_encoder.trunk.blocks.17.mlp.layers.0.bias', 'image_encoder.trunk.blocks.30.mlp.layers.1.bias', 'memory_encoder.fuser.layers.0.dwconv.bias', 'image_encoder.trunk.blocks.39.norm2.bias', 'memory_attention.layers.0.self_attn.q_proj.bias', 'image_encoder.trunk.blocks.12.mlp.layers.0.bias', 'image_encoder.trunk.blocks.30.norm1.bias', 'memory_attention.layers.2.cross_attn_image.out_proj.bias', 'mask_downsample.bias', 'image_encoder.trunk.blocks.26.norm2.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_image_to_token.v_proj.bias', 'image_encoder.trunk.blocks.17.attn.proj.bias', 'memory_encoder.mask_downsampler.encoder.6.bias', 'image_encoder.trunk.blocks.20.mlp.layers.0.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_token_to_image.out_proj.bias', 'image_encoder.trunk.blocks.8.norm2.bias', 'image_encoder.trunk.blocks.45.mlp.layers.0.bias', 'image_encoder.trunk.blocks.9.attn.proj.bias', 'image_encoder.trunk.blocks.46.mlp.layers.0.bias', 
'image_encoder.trunk.blocks.18.mlp.layers.0.bias', 'memory_attention.layers.3.cross_attn_image.q_proj.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_token_to_image.v_proj.bias', 'image_encoder.trunk.blocks.23.attn.qkv.bias', 'memory_encoder.fuser.layers.0.pwconv1.bias', 'sam_mask_decoder.transformer.layers.0.self_attn.out_proj.bias', 'image_encoder.trunk.blocks.16.attn.proj.bias', 'sam_mask_decoder.conv_s0.bias', 'sam_mask_decoder.transformer.layers.1.norm3.bias', 'image_encoder.trunk.blocks.19.mlp.layers.0.bias', 'memory_attention.layers.0.self_attn.v_proj.bias', 'image_encoder.trunk.blocks.37.attn.proj.bias', 'sam_mask_decoder.transformer.layers.0.mlp.layers.1.bias', 'memory_attention.layers.1.cross_attn_image.k_proj.bias', 'image_encoder.trunk.blocks.15.norm1.bias', 'memory_attention.layers.3.norm1.bias', 'image_encoder.trunk.blocks.35.mlp.layers.0.bias', 'memory_attention.layers.1.self_attn.v_proj.bias', 'memory_encoder.fuser.layers.0.pwconv2.bias', 'memory_attention.layers.0.linear2.bias', 'image_encoder.trunk.blocks.12.norm1.bias', 'sam_mask_decoder.output_upscaling.0.bias', 'image_encoder.trunk.blocks.22.attn.proj.bias', 'image_encoder.trunk.blocks.25.norm2.bias', 'sam_mask_decoder.transformer.layers.0.norm4.bias', 'sam_mask_decoder.transformer.layers.0.mlp.layers.0.bias', 'image_encoder.trunk.blocks.29.mlp.layers.0.bias', 'image_encoder.trunk.blocks.25.attn.qkv.bias', 'image_encoder.trunk.blocks.1.norm1.bias', 'image_encoder.trunk.blocks.8.attn.proj.bias', 'memory_encoder.mask_downsampler.encoder.9.bias', 'image_encoder.trunk.blocks.14.mlp.layers.1.bias', 'image_encoder.trunk.blocks.13.mlp.layers.0.bias', 'image_encoder.trunk.blocks.10.attn.proj.bias', 'image_encoder.trunk.blocks.14.attn.qkv.bias', 'image_encoder.trunk.blocks.43.attn.proj.bias', 'image_encoder.trunk.blocks.32.mlp.layers.1.bias', 'memory_encoder.mask_downsampler.encoder.1.bias', 'image_encoder.trunk.blocks.43.norm1.bias', 
'sam_mask_decoder.output_hypernetworks_mlps.0.layers.2.bias', 'image_encoder.trunk.blocks.17.norm2.bias', 'sam_mask_decoder.transformer.final_attn_token_to_image.out_proj.bias', 'sam_prompt_encoder.mask_downscaling.3.bias', 'image_encoder.trunk.blocks.26.norm1.bias', 'image_encoder.trunk.blocks.29.mlp.layers.1.bias', 'memory_attention.layers.0.norm1.bias', 'image_encoder.trunk.blocks.34.mlp.layers.0.bias', 'image_encoder.trunk.blocks.32.mlp.layers.0.bias', 'memory_encoder.fuser.layers.1.pwconv2.bias', 'image_encoder.trunk.blocks.7.attn.proj.bias', 'image_encoder.trunk.blocks.42.norm1.bias', 'image_encoder.trunk.blocks.12.attn.proj.bias', 'image_encoder.trunk.blocks.8.norm1.bias', 'image_encoder.trunk.blocks.33.mlp.layers.1.bias', 'image_encoder.trunk.blocks.34.norm2.bias', 'memory_attention.layers.1.norm1.bias', 'image_encoder.trunk.blocks.3.attn.qkv.bias', 'memory_attention.layers.2.cross_attn_image.q_proj.bias', 'image_encoder.trunk.blocks.17.norm1.bias', 'image_encoder.trunk.blocks.11.norm2.bias', 'image_encoder.trunk.blocks.43.mlp.layers.1.bias', 'image_encoder.trunk.blocks.34.attn.proj.bias', 'sam_mask_decoder.transformer.layers.1.norm2.bias', 'image_encoder.trunk.blocks.3.mlp.layers.0.bias', 'image_encoder.trunk.blocks.34.mlp.layers.1.bias', 'image_encoder.trunk.blocks.44.mlp.layers.0.bias', 'image_encoder.trunk.blocks.30.attn.proj.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_image_to_token.out_proj.bias', 'memory_encoder.mask_downsampler.encoder.3.bias', 'sam_mask_decoder.transformer.layers.1.self_attn.out_proj.bias', 'image_encoder.trunk.blocks.4.norm2.bias', 'image_encoder.trunk.blocks.31.attn.qkv.bias', 'image_encoder.trunk.blocks.35.norm2.bias', 'image_encoder.trunk.blocks.21.attn.proj.bias', 'image_encoder.trunk.blocks.24.attn.proj.bias', 'image_encoder.trunk.blocks.39.attn.qkv.bias', 'image_encoder.trunk.blocks.28.norm1.bias', 'image_encoder.trunk.blocks.33.norm2.bias', 'image_encoder.trunk.blocks.34.attn.qkv.bias', 
'sam_mask_decoder.output_upscaling.1.bias', 'image_encoder.trunk.blocks.10.norm1.bias', 'image_encoder.trunk.blocks.20.norm2.bias', 'image_encoder.trunk.blocks.36.mlp.layers.1.bias', 'memory_attention.layers.3.cross_attn_image.v_proj.bias', 'image_encoder.trunk.blocks.19.norm1.bias', 'image_encoder.trunk.blocks.5.attn.qkv.bias', 'image_encoder.trunk.blocks.37.mlp.layers.0.bias', 'image_encoder.trunk.blocks.14.norm2.bias', 'image_encoder.trunk.blocks.25.mlp.layers.0.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_image_to_token.k_proj.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_image_to_token.q_proj.bias', 'image_encoder.trunk.blocks.23.attn.proj.bias', 'image_encoder.trunk.blocks.26.attn.proj.bias', 'memory_attention.layers.3.cross_attn_image.out_proj.bias', 'image_encoder.trunk.blocks.37.norm1.bias', 'sam_mask_decoder.output_hypernetworks_mlps.0.layers.0.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_image_to_token.q_proj.bias', 'image_encoder.trunk.blocks.32.norm1.bias', 'sam_mask_decoder.pred_obj_score_head.layers.2.bias', 'sam_mask_decoder.transformer.layers.1.mlp.layers.1.bias', 'image_encoder.trunk.blocks.42.mlp.layers.1.bias', 'image_encoder.trunk.blocks.47.norm2.bias', 'memory_attention.layers.3.linear2.bias', 'memory_attention.layers.2.self_attn.out_proj.bias', 'image_encoder.trunk.blocks.2.mlp.layers.0.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_token_to_image.q_proj.bias', 'image_encoder.trunk.blocks.11.norm1.bias', 'memory_attention.layers.3.self_attn.out_proj.bias', 'memory_attention.layers.2.norm3.bias'}
INFO 2025-01-05 00:44:57,320 optimizer.py: 220: Matches for module_cls_name [torch.nn.LayerNorm]: {'sam_mask_decoder.transformer.layers.1.norm4.bias', 'sam_mask_decoder.transformer.layers.0.norm2.bias', 'image_encoder.trunk.blocks.11.norm1.weight', 'image_encoder.trunk.blocks.3.norm2.weight', 'image_encoder.trunk.blocks.35.norm2.weight', 'image_encoder.trunk.blocks.36.norm2.bias', 
'image_encoder.trunk.blocks.12.norm1.bias', 'image_encoder.trunk.blocks.3.norm1.weight', 'image_encoder.trunk.blocks.42.norm2.weight', 'image_encoder.trunk.blocks.11.norm2.weight', 'image_encoder.trunk.blocks.20.norm2.weight', 'image_encoder.trunk.blocks.25.norm2.bias', 'image_encoder.trunk.blocks.39.norm1.bias', 'sam_mask_decoder.transformer.layers.1.norm3.weight', 'image_encoder.trunk.blocks.44.norm2.bias', 'sam_mask_decoder.transformer.layers.0.norm4.bias', 'image_encoder.trunk.blocks.4.norm1.bias', 'image_encoder.trunk.blocks.37.norm2.bias', 'memory_attention.layers.3.norm2.weight', 'image_encoder.trunk.blocks.24.norm2.bias', 'sam_mask_decoder.transformer.norm_final_attn.weight', 'image_encoder.trunk.blocks.21.norm2.weight', 'image_encoder.trunk.blocks.20.norm1.bias', 'image_encoder.trunk.blocks.32.norm2.bias', 'image_encoder.trunk.blocks.35.norm1.bias', 'sam_mask_decoder.transformer.layers.1.norm4.weight', 'memory_attention.layers.2.norm2.weight', 'image_encoder.trunk.blocks.1.norm1.bias', 'image_encoder.trunk.blocks.17.norm2.weight', 'image_encoder.trunk.blocks.29.norm1.bias', 'image_encoder.trunk.blocks.0.norm1.weight', 'image_encoder.trunk.blocks.19.norm2.bias', 'image_encoder.trunk.blocks.10.norm2.bias', 'memory_attention.layers.1.norm2.bias', 'image_encoder.trunk.blocks.36.norm1.bias', 'image_encoder.trunk.blocks.18.norm1.bias', 'image_encoder.trunk.blocks.25.norm1.weight', 'image_encoder.trunk.blocks.38.norm2.weight', 'memory_attention.layers.1.norm2.weight', 'image_encoder.trunk.blocks.32.norm1.weight', 'image_encoder.trunk.blocks.1.norm2.bias', 'image_encoder.trunk.blocks.9.norm2.bias', 'image_encoder.trunk.blocks.28.norm1.weight', 'image_encoder.trunk.blocks.27.norm1.weight', 'image_encoder.trunk.blocks.38.norm1.weight', 'image_encoder.trunk.blocks.9.norm2.weight', 'image_encoder.trunk.blocks.22.norm2.bias', 'image_encoder.trunk.blocks.43.norm2.bias', 'image_encoder.trunk.blocks.26.norm2.weight', 'image_encoder.trunk.blocks.31.norm1.weight', 
'image_encoder.trunk.blocks.43.norm1.bias', 'image_encoder.trunk.blocks.12.norm1.weight', 'image_encoder.trunk.blocks.14.norm1.weight', 'image_encoder.trunk.blocks.40.norm1.bias', 'memory_attention.layers.3.norm2.bias', 'image_encoder.trunk.blocks.26.norm1.weight', 'memory_attention.norm.bias', 'image_encoder.trunk.blocks.17.norm2.bias', 'image_encoder.trunk.blocks.47.norm1.weight', 'image_encoder.trunk.blocks.47.norm1.bias', 'image_encoder.trunk.blocks.44.norm2.weight', 'image_encoder.trunk.blocks.26.norm1.bias', 'memory_attention.layers.3.norm3.weight', 'image_encoder.trunk.blocks.22.norm2.weight', 'image_encoder.trunk.blocks.3.norm1.bias', 'image_encoder.trunk.blocks.25.norm2.weight', 'memory_attention.layers.0.norm1.bias', 'image_encoder.trunk.blocks.38.norm2.bias', 'image_encoder.trunk.blocks.19.norm1.weight', 'image_encoder.trunk.blocks.15.norm1.weight', 'image_encoder.trunk.blocks.42.norm1.bias', 'image_encoder.trunk.blocks.27.norm2.weight', 'image_encoder.trunk.blocks.45.norm2.bias', 'memory_attention.layers.1.norm3.weight', 'image_encoder.trunk.blocks.8.norm1.bias', 'image_encoder.trunk.blocks.23.norm1.bias', 'memory_attention.layers.3.norm3.bias', 'image_encoder.trunk.blocks.39.norm1.weight', 'image_encoder.trunk.blocks.23.norm2.bias', 'image_encoder.trunk.blocks.30.norm2.bias', 'image_encoder.trunk.blocks.13.norm2.bias', 'image_encoder.trunk.blocks.34.norm2.bias', 'memory_attention.layers.1.norm1.bias', 'sam_mask_decoder.transformer.norm_final_attn.bias', 'image_encoder.trunk.blocks.40.norm2.weight', 'image_encoder.trunk.blocks.42.norm2.bias', 'image_encoder.trunk.blocks.4.norm2.weight', 'image_encoder.trunk.blocks.24.norm2.weight', 'image_encoder.trunk.blocks.45.norm1.weight', 'image_encoder.trunk.blocks.17.norm1.bias', 'image_encoder.trunk.blocks.10.norm1.weight', 'image_encoder.trunk.blocks.22.norm1.weight', 'image_encoder.trunk.blocks.18.norm2.bias', 'memory_attention.layers.0.norm3.weight', 'image_encoder.trunk.blocks.12.norm2.bias', 
'image_encoder.trunk.blocks.9.norm1.weight', 'image_encoder.trunk.blocks.1.norm2.weight', 'image_encoder.trunk.blocks.41.norm2.weight', 'image_encoder.trunk.blocks.24.norm1.bias', 'image_encoder.trunk.blocks.29.norm1.weight', 'memory_attention.layers.0.norm2.weight', 'image_encoder.trunk.blocks.5.norm2.weight', 'image_encoder.trunk.blocks.41.norm2.bias', 'image_encoder.trunk.blocks.16.norm1.bias', 'image_encoder.trunk.blocks.34.norm2.weight', 'image_encoder.trunk.blocks.0.norm1.bias', 'image_encoder.trunk.blocks.16.norm2.bias', 'image_encoder.trunk.blocks.11.norm2.bias', 'image_encoder.trunk.blocks.27.norm1.bias', 'image_encoder.trunk.blocks.46.norm1.bias', 'image_encoder.trunk.blocks.28.norm2.bias', 'sam_mask_decoder.transformer.layers.1.norm2.bias', 'image_encoder.trunk.blocks.0.norm2.bias', 'image_encoder.trunk.blocks.23.norm1.weight', 'image_encoder.trunk.blocks.36.norm2.weight', 'image_encoder.trunk.blocks.14.norm2.weight', 'image_encoder.trunk.blocks.9.norm1.bias', 'image_encoder.trunk.blocks.44.norm1.weight', 'image_encoder.trunk.blocks.21.norm2.bias', 'image_encoder.trunk.blocks.4.norm2.bias', 'image_encoder.trunk.blocks.12.norm2.weight', 'image_encoder.trunk.blocks.46.norm2.weight', 'image_encoder.trunk.blocks.7.norm1.bias', 'image_encoder.trunk.blocks.35.norm2.bias', 'image_encoder.trunk.blocks.17.norm1.weight', 'image_encoder.trunk.blocks.40.norm2.bias', 'memory_attention.norm.weight', 'sam_mask_decoder.transformer.layers.0.norm3.bias', 'image_encoder.trunk.blocks.13.norm1.weight', 'memory_attention.layers.2.norm1.weight', 'image_encoder.trunk.blocks.23.norm2.weight', 'image_encoder.trunk.blocks.13.norm1.bias', 'memory_attention.layers.2.norm3.bias', 'image_encoder.trunk.blocks.28.norm1.bias', 'image_encoder.trunk.blocks.34.norm1.bias', 'image_encoder.trunk.blocks.33.norm2.bias', 'image_encoder.trunk.blocks.3.norm2.bias', 'image_encoder.trunk.blocks.2.norm2.weight', 'image_encoder.trunk.blocks.37.norm1.weight', 'image_encoder.trunk.blocks.44.norm1.bias', 
'image_encoder.trunk.blocks.33.norm1.bias', 'image_encoder.trunk.blocks.32.norm2.weight', 'memory_attention.layers.2.norm2.bias', 'image_encoder.trunk.blocks.34.norm1.weight', 'image_encoder.trunk.blocks.10.norm1.bias', 'image_encoder.trunk.blocks.21.norm1.weight', 'sam_mask_decoder.transformer.layers.0.norm1.bias', 'image_encoder.trunk.blocks.20.norm2.bias', 'image_encoder.trunk.blocks.8.norm1.weight', 'sam_mask_decoder.transformer.layers.0.norm4.weight', 'image_encoder.trunk.blocks.7.norm2.bias', 'image_encoder.trunk.blocks.7.norm1.weight', 'image_encoder.trunk.blocks.41.norm1.weight', 'image_encoder.trunk.blocks.5.norm1.bias', 'image_encoder.trunk.blocks.25.norm1.bias', 'image_encoder.trunk.blocks.45.norm2.weight', 'image_encoder.trunk.blocks.2.norm1.weight', 'sam_mask_decoder.transformer.layers.0.norm3.weight', 'image_encoder.trunk.blocks.45.norm1.bias', 'image_encoder.trunk.blocks.6.norm2.bias', 'image_encoder.trunk.blocks.29.norm2.bias', 'image_encoder.trunk.blocks.20.norm1.weight', 'image_encoder.trunk.blocks.39.norm2.weight', 'image_encoder.trunk.blocks.40.norm1.weight', 'memory_attention.layers.0.norm3.bias', 'image_encoder.trunk.blocks.19.norm1.bias', 'image_encoder.trunk.blocks.30.norm1.weight', 'memory_attention.layers.0.norm2.bias', 'image_encoder.trunk.blocks.5.norm1.weight', 'image_encoder.trunk.blocks.13.norm2.weight', 'image_encoder.trunk.blocks.1.norm1.weight', 'image_encoder.trunk.blocks.31.norm1.bias', 'image_encoder.trunk.blocks.14.norm1.bias', 'image_encoder.trunk.blocks.36.norm1.weight', 'image_encoder.trunk.blocks.46.norm2.bias', 'image_encoder.trunk.blocks.6.norm1.bias', 'image_encoder.trunk.blocks.38.norm1.bias', 'image_encoder.trunk.blocks.15.norm2.bias', 'image_encoder.trunk.blocks.2.norm1.bias', 'image_encoder.trunk.blocks.5.norm2.bias', 'image_encoder.trunk.blocks.10.norm2.weight', 'image_encoder.trunk.blocks.18.norm2.weight', 'image_encoder.trunk.blocks.14.norm2.bias', 'memory_attention.layers.1.norm3.bias', 
'image_encoder.trunk.blocks.28.norm2.weight', 'image_encoder.trunk.blocks.21.norm1.bias', 'memory_attention.layers.3.norm1.weight', 'image_encoder.trunk.blocks.37.norm2.weight', 'image_encoder.trunk.blocks.39.norm2.bias', 'image_encoder.trunk.blocks.22.norm1.bias', 'image_encoder.trunk.blocks.8.norm2.weight', 'image_encoder.trunk.blocks.16.norm2.weight', 'image_encoder.trunk.blocks.47.norm2.weight', 'image_encoder.trunk.blocks.30.norm1.bias', 'image_encoder.trunk.blocks.26.norm2.bias', 'sam_mask_decoder.transformer.layers.1.norm2.weight', 'image_encoder.trunk.blocks.33.norm2.weight', 'image_encoder.trunk.blocks.24.norm1.weight', 'image_encoder.trunk.blocks.29.norm2.weight', 'sam_mask_decoder.transformer.layers.0.norm2.weight', 'image_encoder.trunk.blocks.37.norm1.bias', 'memory_attention.layers.2.norm1.bias', 'image_encoder.trunk.blocks.6.norm1.weight', 'image_encoder.trunk.blocks.2.norm2.bias', 'image_encoder.trunk.blocks.27.norm2.bias', 'image_encoder.trunk.blocks.43.norm2.weight', 'image_encoder.trunk.blocks.32.norm1.bias', 'image_encoder.trunk.blocks.18.norm1.weight', 'image_encoder.trunk.blocks.15.norm2.weight', 'image_encoder.trunk.blocks.35.norm1.weight', 'image_encoder.trunk.blocks.30.norm2.weight', 'image_encoder.trunk.blocks.42.norm1.weight', 'image_encoder.trunk.blocks.8.norm2.bias', 'sam_mask_decoder.transformer.layers.0.norm1.weight', 'image_encoder.trunk.blocks.4.norm1.weight', 'memory_attention.layers.0.norm1.weight', 'image_encoder.trunk.blocks.47.norm2.bias', 'image_encoder.trunk.blocks.7.norm2.weight', 'memory_attention.layers.2.norm3.weight', 'image_encoder.trunk.blocks.41.norm1.bias', 'image_encoder.trunk.blocks.11.norm1.bias', 'image_encoder.trunk.blocks.43.norm1.weight', 'sam_mask_decoder.transformer.layers.1.norm1.weight', 'memory_attention.layers.1.norm1.weight', 'image_encoder.trunk.blocks.31.norm2.bias', 'image_encoder.trunk.blocks.6.norm2.weight', 'sam_mask_decoder.transformer.layers.1.norm1.bias', 
'sam_mask_decoder.transformer.layers.1.norm3.bias', 'image_encoder.trunk.blocks.19.norm2.weight', 'image_encoder.trunk.blocks.0.norm2.weight', 'image_encoder.trunk.blocks.46.norm1.weight', 'image_encoder.trunk.blocks.33.norm1.weight', 'image_encoder.trunk.blocks.31.norm2.weight', 'image_encoder.trunk.blocks.16.norm1.weight', 'image_encoder.trunk.blocks.15.norm1.bias', 'memory_attention.layers.3.norm1.bias'} INFO 2025-01-05 00:44:57,653 sam2_datasets.py: 125: Dataset mixing probabilities: [1.0] INFO 2025-01-05 01:19:44,695 train_utils.py: 108: MACHINE SEED: 4920 INFO 2025-01-05 01:19:44,697 train_utils.py: 154: Logging ENV_VARIABLES INFO 2025-01-05 01:19:44,697 train_utils.py: 155: BROWSER=/home/hossein/.vscode-server/bin/1a5daa3a0231a0fbba4f14db7ec463cf99d7768e/bin/helpers/browser.sh COLORTERM=truecolor CONDA_DEFAULT_ENV=sam2 CONDA_EXE=/home/hossein/miniconda3/bin/conda CONDA_PREFIX=/ephemeral/hossein/envs/sam2 CONDA_PREFIX_1=/home/hossein/miniconda3 CONDA_PROMPT_MODIFIER=(sam2) CONDA_PYTHON_EXE=/home/hossein/miniconda3/bin/python CONDA_SHLVL=2 CUDA_MODULE_LOADING=LAZY DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/2095/bus GIT_ASKPASS=/home/hossein/.vscode-server/bin/1a5daa3a0231a0fbba4f14db7ec463cf99d7768e/extensions/git/dist/askpass.sh HF_HOME=/ephemeral/ HISTSIZE=2000 HISTTIMEFORMAT=%F %T HOME=/home/hossein HYDRA_FULL_ERROR=1 LANG=C.UTF-8 LESSCLOSE=/usr/bin/lesspipe %s %s LESSOPEN=| /usr/bin/lesspipe %s LOCAL_RANK=0 LOGNAME=hossein 
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.webp=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36: MASTER_ADDR=localhost MASTER_PORT=20492 MOTD_SHOWN=pam NCCL_TOPO_FILE=/etc/nccl-topo-h100-v1.xml OLDPWD=/home/hossein/hossein/projects/sam2 PATH=/home/hossein/.cursor-server/cli/servers/Stable-fe574d0820377383143b2ea26aa6ae28b3425220/server/bin/remote-cli:/ephemeral/hossein/envs/sam2/bin:/home/hossein/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin PWD=/home/hossein/hossein/projects/sam2/training 
PYTHON_PATH=/home/hossein/hossein/projects/hybrid_model_training:/home/hossein/hossein/projects/hybrid_model_training:/home/hossein/hossein/projects/hybrid_model_training: RANK=0 SHELL=/bin/bash SHLVL=2 SSH_CLIENT=142.186.28.106 64524 22 SSH_CONNECTION=110.238.90.22 3000 10.0.1.99 22 TERM=screen TERM_PROGRAM=tmux TERM_PROGRAM_VERSION=3.2a TMUX=/tmp/tmux-2095/default,727396,5 TMUX_PANE=%5 TORCH_NCCL_ASYNC_ERROR_HANDLING=1 USER=hossein VSCODE_GIT_ASKPASS_EXTRA_ARGS= VSCODE_GIT_ASKPASS_MAIN=/home/hossein/.vscode-server/bin/1a5daa3a0231a0fbba4f14db7ec463cf99d7768e/extensions/git/dist/askpass-main.js VSCODE_GIT_ASKPASS_NODE=/home/hossein/.vscode-server/bin/1a5daa3a0231a0fbba4f14db7ec463cf99d7768e/node VSCODE_GIT_IPC_HANDLE=/run/user/2095/vscode-git-cd38edda58.sock VSCODE_IPC_HOOK_CLI=/run/user/2095/vscode-ipc-e3cd88d8-a6c9-4e22-89a9-8e26349b2914.sock WORLD_SIZE=4 XDG_DATA_DIRS=/usr/local/share:/usr/share:/var/lib/snapd/desktop XDG_RUNTIME_DIR=/run/user/2095 XDG_SESSION_CLASS=user XDG_SESSION_ID=524 XDG_SESSION_TYPE=tty _=/ephemeral/hossein/envs/sam2/bin/python _CE_CONDA= _CE_M= INFO 2025-01-05 01:19:44,697 trainer.py: 989: Setting up components: Model, loss, optim, meters etc. INFO 2025-01-05 01:19:44,698 logger.py: 66: TensorBoard SummaryWriter instantiated. 
Files will be stored in: /ephemeral/hossein/output/sam2/tensorboard
INFO 2025-01-05 01:19:47,552 sam2.py: 81: Training with points (sampled from masks) as inputs with p=0.5
INFO 2025-01-05 01:19:47,557 trainer.py:1059: ====================
INFO 2025-01-05 01:19:47,557 trainer.py:1060: Summary for model
INFO 2025-01-05 01:19:47,560 trainer.py:1061: Model is SAM2Train(
  (image_encoder): ImageEncoder(
    (trunk): Hiera(
      (patch_embed): PatchEmbed(
        (proj): Conv2d(3, 144, kernel_size=(7, 7), stride=(4, 4), padding=(3, 3))
      )
      (blocks): ModuleList(
        (0-1): 2 x MultiScaleBlock(
          (norm1): LayerNorm((144,), eps=1e-06, elementwise_affine=True)
          (attn): MultiScaleAttention(
            (qkv): Linear(in_features=144, out_features=432, bias=True)
            (proj): Linear(in_features=144, out_features=144, bias=True)
          )
          (drop_path): Identity()
          (norm2): LayerNorm((144,), eps=1e-06, elementwise_affine=True)
          (mlp): MLP(
            (layers): ModuleList(
              (0): Linear(in_features=144, out_features=576, bias=True)
              (1): Linear(in_features=576, out_features=144, bias=True)
            )
            (act): GELU(approximate='none')
          )
        )
        (2): MultiScaleBlock(
          (norm1): LayerNorm((144,), eps=1e-06, elementwise_affine=True)
          (pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
          (attn): MultiScaleAttention(
            (q_pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
            (qkv): Linear(in_features=144, out_features=864, bias=True)
            (proj): Linear(in_features=288, out_features=288, bias=True)
          )
          (drop_path): Identity()
          (norm2): LayerNorm((288,), eps=1e-06, elementwise_affine=True)
          (mlp): MLP(
            (layers): ModuleList(
              (0): Linear(in_features=288, out_features=1152, bias=True)
              (1): Linear(in_features=1152, out_features=288, bias=True)
            )
            (act): GELU(approximate='none')
          )
          (proj): Linear(in_features=144, out_features=288, bias=True)
        )
        (3-7): 5 x MultiScaleBlock(
          (norm1): LayerNorm((288,), eps=1e-06, elementwise_affine=True)
          (attn): MultiScaleAttention(
            (qkv): Linear(in_features=288, out_features=864, bias=True)
            (proj): Linear(in_features=288, out_features=288, bias=True)
          )
          (drop_path): Identity()
          (norm2): LayerNorm((288,), eps=1e-06, elementwise_affine=True)
          (mlp): MLP(
            (layers): ModuleList(
              (0): Linear(in_features=288, out_features=1152, bias=True)
              (1): Linear(in_features=1152, out_features=288, bias=True)
            )
            (act): GELU(approximate='none')
          )
        )
        (8): MultiScaleBlock(
          (norm1): LayerNorm((288,), eps=1e-06, elementwise_affine=True)
          (pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
          (attn): MultiScaleAttention(
            (q_pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
            (qkv): Linear(in_features=288, out_features=1728, bias=True)
            (proj): Linear(in_features=576, out_features=576, bias=True)
          )
          (drop_path): Identity()
          (norm2): LayerNorm((576,), eps=1e-06, elementwise_affine=True)
          (mlp): MLP(
            (layers): ModuleList(
              (0): Linear(in_features=576, out_features=2304, bias=True)
              (1): Linear(in_features=2304, out_features=576, bias=True)
            )
            (act): GELU(approximate='none')
          )
          (proj): Linear(in_features=288, out_features=576, bias=True)
        )
        (9-43): 35 x MultiScaleBlock(
          (norm1): LayerNorm((576,), eps=1e-06, elementwise_affine=True)
          (attn): MultiScaleAttention(
            (qkv): Linear(in_features=576, out_features=1728, bias=True)
            (proj): Linear(in_features=576, out_features=576, bias=True)
          )
          (drop_path): Identity()
          (norm2): LayerNorm((576,), eps=1e-06, elementwise_affine=True)
          (mlp): MLP(
            (layers): ModuleList(
              (0): Linear(in_features=576, out_features=2304, bias=True)
              (1): Linear(in_features=2304, out_features=576, bias=True)
            )
            (act): GELU(approximate='none')
          )
        )
        (44): MultiScaleBlock(
          (norm1): LayerNorm((576,), eps=1e-06, elementwise_affine=True)
          (pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
          (attn): MultiScaleAttention(
            (q_pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
            (qkv): Linear(in_features=576, out_features=3456, bias=True)
            (proj): Linear(in_features=1152, out_features=1152, bias=True)
          )
          (drop_path): Identity()
          (norm2): LayerNorm((1152,), eps=1e-06, elementwise_affine=True)
          (mlp): MLP(
            (layers): ModuleList(
              (0): Linear(in_features=1152, out_features=4608, bias=True)
              (1): Linear(in_features=4608, out_features=1152, bias=True)
            )
            (act): GELU(approximate='none')
          )
          (proj): Linear(in_features=576, out_features=1152, bias=True)
        )
        (45-47): 3 x MultiScaleBlock(
          (norm1): LayerNorm((1152,), eps=1e-06, elementwise_affine=True)
          (attn): MultiScaleAttention(
            (qkv): Linear(in_features=1152, out_features=3456, bias=True)
            (proj): Linear(in_features=1152, out_features=1152, bias=True)
          )
          (drop_path): Identity()
          (norm2): LayerNorm((1152,), eps=1e-06, elementwise_affine=True)
          (mlp): MLP(
            (layers): ModuleList(
              (0): Linear(in_features=1152, out_features=4608, bias=True)
              (1): Linear(in_features=4608, out_features=1152, bias=True)
            )
            (act): GELU(approximate='none')
          )
        )
      )
    )
    (neck): FpnNeck(
      (position_encoding): PositionEmbeddingSine()
      (convs): ModuleList(
        (0): Sequential(
          (conv): Conv2d(1152, 256, kernel_size=(1, 1), stride=(1, 1))
        )
        (1): Sequential(
          (conv): Conv2d(576, 256, kernel_size=(1, 1), stride=(1, 1))
        )
        (2): Sequential(
          (conv): Conv2d(288, 256, kernel_size=(1, 1), stride=(1, 1))
        )
        (3): Sequential(
          (conv): Conv2d(144, 256, kernel_size=(1, 1), stride=(1, 1))
        )
      )
    )
  )
  (mask_downsample): Conv2d(1, 1, kernel_size=(4, 4), stride=(4, 4))
  (memory_attention): MemoryAttention(
    (layers): ModuleList(
      (0-3): 4 x MemoryAttentionLayer(
        (self_attn): RoPEAttention(
          (q_proj): Linear(in_features=256, out_features=256, bias=True)
          (k_proj): Linear(in_features=256, out_features=256, bias=True)
          (v_proj): Linear(in_features=256, out_features=256, bias=True)
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (cross_attn_image): RoPEAttention(
          (q_proj): Linear(in_features=256, out_features=256, bias=True)
          (k_proj): Linear(in_features=64, out_features=256, bias=True)
          (v_proj): Linear(in_features=64, out_features=256, bias=True)
          (out_proj): Linear(in_features=256, out_features=256, bias=True)
        )
        (linear1): Linear(in_features=256, out_features=2048, bias=True)
        (dropout): Dropout(p=0.1, inplace=False)
        (linear2): Linear(in_features=2048, out_features=256, bias=True)
        (norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
        (norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
        (norm3): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
        (dropout1): Dropout(p=0.1, inplace=False)
        (dropout2): Dropout(p=0.1, inplace=False)
        (dropout3): Dropout(p=0.1, inplace=False)
      )
    )
    (norm): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
  )
  (memory_encoder): MemoryEncoder(
    (mask_downsampler): MaskDownSampler(
      (encoder): Sequential(
        (0): Conv2d(1, 4, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
        (1): LayerNorm2d()
        (2): GELU(approximate='none')
        (3): Conv2d(4, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
        (4): LayerNorm2d()
        (5): GELU(approximate='none')
        (6): Conv2d(16, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
        (7): LayerNorm2d()
        (8): GELU(approximate='none')
        (9): Conv2d(64, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
        (10): LayerNorm2d()
        (11): GELU(approximate='none')
        (12): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1))
      )
    )
    (pix_feat_proj): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1))
    (fuser): Fuser(
      (proj): Identity()
      (layers): ModuleList(
        (0-1): 2 x CXBlock(
          (dwconv): Conv2d(256, 256, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=256)
          (norm): LayerNorm2d()
          (pwconv1): Linear(in_features=256, out_features=1024, bias=True)
          (act): GELU(approximate='none')
          (pwconv2): Linear(in_features=1024, out_features=256, bias=True)
          (drop_path): Identity()
        )
      )
    )
    (position_encoding): PositionEmbeddingSine()
    (out_proj): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1))
  )
  (sam_prompt_encoder): PromptEncoder(
    (pe_layer): PositionEmbeddingRandom()
    (point_embeddings): ModuleList(
      (0-3): 4 x Embedding(1, 256)
    )
    (not_a_point_embed): Embedding(1, 256)
    (mask_downscaling): Sequential(
      (0): Conv2d(1, 4, kernel_size=(2, 2), stride=(2, 2))
      (1): LayerNorm2d()
      (2): GELU(approximate='none')
      (3): Conv2d(4, 16, kernel_size=(2, 2), stride=(2, 2))
      (4): LayerNorm2d()
      (5): GELU(approximate='none')
      (6): Conv2d(16, 256, kernel_size=(1, 1), stride=(1, 1))
    )
    (no_mask_embed): Embedding(1, 256)
  )
  (sam_mask_decoder): MaskDecoder(
    (transformer): TwoWayTransformer(
      (layers): ModuleList(
        (0-1): 2 x TwoWayAttentionBlock(
          (self_attn): Attention(
            (q_proj): Linear(in_features=256, out_features=256, bias=True)
            (k_proj): Linear(in_features=256, out_features=256, bias=True)
            (v_proj): Linear(in_features=256, out_features=256, bias=True)
            (out_proj): Linear(in_features=256, out_features=256, bias=True)
          )
          (norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
          (cross_attn_token_to_image): Attention(
            (q_proj): Linear(in_features=256, out_features=128, bias=True)
            (k_proj): Linear(in_features=256, out_features=128, bias=True)
            (v_proj): Linear(in_features=256, out_features=128, bias=True)
            (out_proj): Linear(in_features=128, out_features=256, bias=True)
          )
          (norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
          (mlp): MLP(
            (layers): ModuleList(
              (0): Linear(in_features=256, out_features=2048, bias=True)
              (1): Linear(in_features=2048, out_features=256, bias=True)
            )
            (act): ReLU()
          )
          (norm3): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
          (norm4): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
          (cross_attn_image_to_token): Attention(
            (q_proj): Linear(in_features=256, out_features=128, bias=True)
            (k_proj): Linear(in_features=256, out_features=128, bias=True)
            (v_proj): Linear(in_features=256, out_features=128, bias=True)
            (out_proj): Linear(in_features=128, out_features=256, bias=True)
          )
        )
      )
      (final_attn_token_to_image): Attention(
        (q_proj): Linear(in_features=256, out_features=128, bias=True)
        (k_proj): Linear(in_features=256, out_features=128, bias=True)
        (v_proj): Linear(in_features=256, out_features=128, bias=True)
        (out_proj): Linear(in_features=128, out_features=256, bias=True)
      )
      (norm_final_attn): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
    )
    (iou_token): Embedding(1, 256)
    (mask_tokens): Embedding(4, 256)
    (obj_score_token): Embedding(1, 256)
    (output_upscaling): Sequential(
      (0): ConvTranspose2d(256, 64, kernel_size=(2, 2), stride=(2, 2))
      (1): LayerNorm2d()
      (2): GELU(approximate='none')
      (3): ConvTranspose2d(64, 32, kernel_size=(2, 2), stride=(2, 2))
      (4): GELU(approximate='none')
    )
    (conv_s0): Conv2d(256, 32, kernel_size=(1, 1), stride=(1, 1))
    (conv_s1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1))
    (output_hypernetworks_mlps): ModuleList(
      (0-3): 4 x MLP(
        (layers): ModuleList(
          (0-1): 2 x Linear(in_features=256, out_features=256, bias=True)
          (2): Linear(in_features=256, out_features=32, bias=True)
        )
        (act): ReLU()
      )
    )
    (iou_prediction_head): MLP(
      (layers): ModuleList(
        (0-1): 2 x Linear(in_features=256, out_features=256, bias=True)
        (2): Linear(in_features=256, out_features=4, bias=True)
      )
      (act): ReLU()
    )
    (pred_obj_score_head): MLP(
      (layers): ModuleList(
        (0-1): 2 x Linear(in_features=256, out_features=256, bias=True)
        (2): Linear(in_features=256, out_features=1, bias=True)
      )
      (act): ReLU()
    )
  )
  (obj_ptr_proj): MLP(
    (layers): ModuleList(
      (0-2): 3 x Linear(in_features=256, out_features=256, bias=True)
    )
    (act): ReLU()
  )
  (obj_ptr_tpos_proj): Linear(in_features=256, out_features=64, bias=True)
)
INFO 2025-01-05 01:19:47,563 trainer.py:1062: Total parameters 224 M
INFO 2025-01-05 01:19:47,563 trainer.py:1063: Trainable parameters 224 M
INFO 2025-01-05 01:19:47,564 trainer.py:1066: Non-Trainable parameters 0
INFO 2025-01-05 01:19:47,564 trainer.py:1069: ====================
INFO 2025-01-05 01:19:47,567 trainer.py:1023: Finished setting up components: Model, loss, optim, meters etc.
INFO 2025-01-05 01:19:47,567 trainer.py: 314: Moving components to device cuda:0 and local rank 0.
INFO 2025-01-05 01:19:47,814 trainer.py: 320: Done moving components to device cuda:0 and local rank 0. INFO 2025-01-05 01:19:47,839 optimizer.py: 248: Matches for param_name [image_encoder.*]: {'image_encoder.trunk.blocks.44.norm2.bias', 'image_encoder.neck.convs.3.conv.bias', 'image_encoder.trunk.blocks.17.attn.qkv.bias', 'image_encoder.trunk.blocks.24.norm2.bias', 'image_encoder.trunk.blocks.20.mlp.layers.1.bias', 'image_encoder.trunk.blocks.37.mlp.layers.0.weight', 'image_encoder.trunk.blocks.4.attn.qkv.bias', 'image_encoder.trunk.blocks.12.attn.proj.bias', 'image_encoder.trunk.blocks.16.attn.qkv.weight', 'image_encoder.trunk.blocks.20.mlp.layers.0.bias', 'image_encoder.trunk.blocks.10.mlp.layers.0.weight', 'image_encoder.trunk.blocks.38.mlp.layers.0.bias', 'image_encoder.trunk.blocks.2.norm1.bias', 'image_encoder.trunk.blocks.21.attn.proj.weight', 'image_encoder.trunk.blocks.36.mlp.layers.0.weight', 'image_encoder.trunk.blocks.27.mlp.layers.0.bias', 'image_encoder.trunk.blocks.22.mlp.layers.1.weight', 'image_encoder.trunk.blocks.1.mlp.layers.0.weight', 'image_encoder.trunk.blocks.1.norm1.bias', 'image_encoder.trunk.blocks.37.mlp.layers.1.bias', 'image_encoder.trunk.patch_embed.proj.weight', 'image_encoder.trunk.blocks.22.attn.proj.bias', 'image_encoder.trunk.blocks.33.mlp.layers.0.bias', 'image_encoder.trunk.blocks.5.attn.qkv.weight', 'image_encoder.trunk.blocks.18.attn.proj.weight', 'image_encoder.trunk.blocks.22.attn.qkv.weight', 'image_encoder.trunk.blocks.47.attn.proj.bias', 'image_encoder.trunk.blocks.3.norm2.bias', 'image_encoder.trunk.blocks.1.attn.proj.weight', 'image_encoder.trunk.blocks.9.attn.qkv.weight', 'image_encoder.trunk.blocks.29.norm2.bias', 'image_encoder.trunk.blocks.44.norm2.weight', 'image_encoder.trunk.blocks.2.norm1.weight', 'image_encoder.trunk.blocks.14.attn.proj.bias', 'image_encoder.trunk.blocks.8.norm2.bias', 'image_encoder.trunk.blocks.21.mlp.layers.1.weight', 'image_encoder.trunk.blocks.31.mlp.layers.0.weight', 
'image_encoder.trunk.blocks.39.mlp.layers.0.weight', 'image_encoder.trunk.blocks.21.norm1.bias', 'image_encoder.trunk.blocks.9.mlp.layers.1.weight', 'image_encoder.trunk.blocks.47.mlp.layers.1.weight', 'image_encoder.trunk.blocks.1.norm1.weight', 'image_encoder.trunk.blocks.19.attn.qkv.weight', 'image_encoder.trunk.blocks.23.attn.proj.weight', 'image_encoder.trunk.blocks.41.attn.proj.weight', 'image_encoder.trunk.blocks.7.attn.qkv.bias', 'image_encoder.trunk.blocks.33.attn.proj.weight', 'image_encoder.trunk.blocks.19.attn.qkv.bias', 'image_encoder.trunk.blocks.14.attn.proj.weight', 'image_encoder.trunk.blocks.10.attn.proj.bias', 'image_encoder.trunk.blocks.5.attn.proj.bias', 'image_encoder.trunk.blocks.12.norm1.weight', 'image_encoder.trunk.blocks.19.norm2.bias', 'image_encoder.trunk.blocks.29.mlp.layers.1.bias', 'image_encoder.trunk.blocks.37.norm1.bias', 'image_encoder.trunk.blocks.27.mlp.layers.1.weight', 'image_encoder.trunk.blocks.44.mlp.layers.1.bias', 'image_encoder.trunk.blocks.16.attn.proj.weight', 'image_encoder.trunk.blocks.7.mlp.layers.0.bias', 'image_encoder.trunk.blocks.32.attn.qkv.weight', 'image_encoder.trunk.blocks.0.norm1.bias', 'image_encoder.trunk.blocks.6.attn.qkv.weight', 'image_encoder.trunk.blocks.16.mlp.layers.0.weight', 'image_encoder.trunk.blocks.39.attn.proj.bias', 'image_encoder.trunk.blocks.30.mlp.layers.1.bias', 'image_encoder.trunk.blocks.20.norm2.weight', 'image_encoder.trunk.blocks.22.mlp.layers.0.weight', 'image_encoder.trunk.blocks.33.mlp.layers.1.bias', 'image_encoder.trunk.blocks.2.attn.proj.weight', 'image_encoder.neck.convs.0.conv.weight', 'image_encoder.trunk.blocks.5.mlp.layers.0.bias', 'image_encoder.trunk.blocks.13.norm1.bias', 'image_encoder.trunk.blocks.20.norm1.weight', 'image_encoder.trunk.blocks.3.mlp.layers.1.weight', 'image_encoder.trunk.blocks.24.attn.qkv.weight', 'image_encoder.trunk.blocks.21.attn.proj.bias', 'image_encoder.trunk.blocks.42.attn.qkv.bias', 'image_encoder.trunk.blocks.41.norm2.bias', 
'image_encoder.trunk.blocks.27.attn.proj.bias', 'image_encoder.trunk.blocks.26.attn.proj.weight', 'image_encoder.trunk.blocks.42.norm1.bias', 'image_encoder.neck.convs.3.conv.weight', 'image_encoder.trunk.blocks.13.mlp.layers.1.weight', 'image_encoder.trunk.blocks.15.norm1.weight', 'image_encoder.trunk.blocks.25.attn.qkv.bias', 'image_encoder.trunk.blocks.31.attn.proj.weight', 'image_encoder.trunk.blocks.2.attn.proj.bias', 'image_encoder.trunk.blocks.18.attn.proj.bias', 'image_encoder.trunk.blocks.24.mlp.layers.0.bias', 'image_encoder.trunk.blocks.40.mlp.layers.1.bias', 'image_encoder.trunk.blocks.3.norm1.bias', 'image_encoder.trunk.blocks.0.attn.qkv.weight', 'image_encoder.trunk.blocks.19.norm1.bias', 'image_encoder.trunk.blocks.14.mlp.layers.0.weight', 'image_encoder.trunk.blocks.13.mlp.layers.1.bias', 'image_encoder.trunk.blocks.23.norm2.bias', 'image_encoder.trunk.blocks.8.norm1.bias', 'image_encoder.trunk.blocks.5.mlp.layers.0.weight', 'image_encoder.trunk.blocks.29.attn.proj.bias', 'image_encoder.trunk.blocks.42.attn.proj.bias', 'image_encoder.trunk.patch_embed.proj.bias', 'image_encoder.trunk.blocks.24.mlp.layers.1.weight', 'image_encoder.trunk.blocks.33.attn.qkv.bias', 'image_encoder.trunk.blocks.42.mlp.layers.1.weight', 'image_encoder.trunk.blocks.46.attn.proj.weight', 'image_encoder.trunk.blocks.15.attn.qkv.bias', 'image_encoder.trunk.blocks.27.norm1.weight', 'image_encoder.trunk.blocks.41.mlp.layers.1.weight', 'image_encoder.trunk.blocks.12.norm2.bias', 'image_encoder.trunk.blocks.0.norm2.bias', 'image_encoder.trunk.blocks.17.attn.qkv.weight', 'image_encoder.trunk.blocks.39.mlp.layers.1.weight', 'image_encoder.trunk.blocks.16.norm2.weight', 'image_encoder.trunk.blocks.10.attn.qkv.weight', 'image_encoder.trunk.blocks.44.norm1.weight', 'image_encoder.trunk.blocks.2.proj.bias', 'image_encoder.trunk.blocks.15.mlp.layers.1.bias', 'image_encoder.trunk.blocks.2.mlp.layers.0.weight', 'image_encoder.trunk.blocks.26.mlp.layers.1.bias', 
'image_encoder.trunk.blocks.5.mlp.layers.1.bias', 'image_encoder.trunk.blocks.32.norm2.weight', 'image_encoder.trunk.blocks.21.mlp.layers.0.weight', 'image_encoder.trunk.blocks.19.mlp.layers.1.bias', 'image_encoder.trunk.blocks.44.norm1.bias', 'image_encoder.trunk.blocks.30.norm2.bias', 'image_encoder.trunk.blocks.16.norm1.bias', 'image_encoder.trunk.blocks.35.mlp.layers.1.weight', 'image_encoder.trunk.blocks.23.attn.proj.bias', 'image_encoder.trunk.blocks.37.mlp.layers.1.weight', 'image_encoder.trunk.blocks.46.mlp.layers.0.bias', 'image_encoder.trunk.blocks.44.proj.weight', 'image_encoder.trunk.blocks.17.norm1.weight', 'image_encoder.trunk.blocks.45.attn.qkv.weight', 'image_encoder.trunk.blocks.40.mlp.layers.1.weight', 'image_encoder.trunk.blocks.12.norm1.bias', 'image_encoder.trunk.blocks.11.mlp.layers.0.weight', 'image_encoder.trunk.blocks.36.attn.proj.weight', 'image_encoder.trunk.blocks.8.mlp.layers.1.bias', 'image_encoder.trunk.blocks.30.attn.proj.bias', 'image_encoder.trunk.blocks.30.mlp.layers.0.bias', 'image_encoder.trunk.blocks.35.attn.proj.bias', 'image_encoder.trunk.blocks.16.attn.qkv.bias', 'image_encoder.trunk.blocks.25.attn.proj.weight', 'image_encoder.trunk.blocks.3.attn.qkv.bias', 'image_encoder.trunk.blocks.28.mlp.layers.1.weight', 'image_encoder.trunk.blocks.30.mlp.layers.0.weight', 'image_encoder.trunk.blocks.25.attn.qkv.weight', 'image_encoder.trunk.blocks.30.attn.proj.weight', 'image_encoder.trunk.blocks.14.attn.qkv.bias', 'image_encoder.trunk.blocks.21.norm2.weight', 'image_encoder.trunk.blocks.24.attn.proj.weight', 'image_encoder.trunk.blocks.1.mlp.layers.1.weight', 'image_encoder.trunk.blocks.6.mlp.layers.1.bias', 'image_encoder.trunk.blocks.23.norm2.weight', 'image_encoder.trunk.blocks.17.attn.proj.bias', 'image_encoder.trunk.blocks.7.attn.proj.bias', 'image_encoder.trunk.blocks.37.norm2.weight', 'image_encoder.trunk.blocks.18.norm2.weight', 'image_encoder.trunk.blocks.39.norm2.bias', 'image_encoder.trunk.blocks.40.norm2.weight', 
'image_encoder.trunk.blocks.25.mlp.layers.1.bias', 'image_encoder.trunk.blocks.38.attn.qkv.weight', 'image_encoder.trunk.blocks.13.attn.qkv.bias', 'image_encoder.trunk.blocks.13.attn.proj.weight', 'image_encoder.trunk.blocks.37.attn.qkv.bias', 'image_encoder.trunk.blocks.20.attn.proj.weight', 'image_encoder.trunk.blocks.37.attn.proj.bias', 'image_encoder.trunk.blocks.4.norm1.bias', 'image_encoder.trunk.blocks.7.attn.qkv.weight', 'image_encoder.trunk.blocks.7.mlp.layers.0.weight', 'image_encoder.trunk.blocks.7.norm1.weight', 'image_encoder.trunk.blocks.9.attn.qkv.bias', 'image_encoder.trunk.blocks.14.mlp.layers.0.bias', 'image_encoder.trunk.blocks.16.attn.proj.bias', 'image_encoder.trunk.blocks.25.mlp.layers.0.bias', 'image_encoder.trunk.blocks.33.mlp.layers.1.weight', 'image_encoder.trunk.blocks.6.mlp.layers.0.bias', 'image_encoder.trunk.blocks.34.norm1.weight', 'image_encoder.trunk.blocks.41.mlp.layers.0.weight', 'image_encoder.trunk.blocks.11.mlp.layers.0.bias', 'image_encoder.trunk.blocks.34.attn.qkv.weight', 'image_encoder.trunk.blocks.26.attn.proj.bias', 'image_encoder.trunk.blocks.26.mlp.layers.0.weight', 'image_encoder.trunk.blocks.43.mlp.layers.1.weight', 'image_encoder.trunk.blocks.5.attn.proj.weight', 'image_encoder.trunk.blocks.37.norm1.weight', 'image_encoder.trunk.blocks.10.mlp.layers.0.bias', 'image_encoder.trunk.blocks.23.mlp.layers.1.weight', 'image_encoder.trunk.blocks.34.norm2.bias', 'image_encoder.trunk.blocks.15.attn.proj.weight', 'image_encoder.trunk.blocks.13.norm1.weight', 'image_encoder.trunk.blocks.37.attn.qkv.weight', 'image_encoder.trunk.blocks.6.mlp.layers.0.weight', 'image_encoder.neck.convs.1.conv.bias', 'image_encoder.trunk.blocks.26.norm2.weight', 'image_encoder.trunk.blocks.5.norm1.bias', 'image_encoder.trunk.blocks.12.attn.qkv.weight', 'image_encoder.trunk.blocks.46.attn.qkv.weight', 'image_encoder.trunk.blocks.44.proj.bias', 'image_encoder.trunk.blocks.14.norm1.weight', 'image_encoder.trunk.blocks.0.attn.proj.weight', 
'image_encoder.trunk.blocks.39.norm1.weight', 'image_encoder.trunk.blocks.35.norm1.bias', 'image_encoder.trunk.blocks.2.norm2.bias', 'image_encoder.trunk.blocks.18.mlp.layers.0.bias', 'image_encoder.trunk.blocks.8.attn.qkv.bias', 'image_encoder.trunk.blocks.27.norm2.bias', 'image_encoder.trunk.blocks.8.attn.proj.weight', 'image_encoder.trunk.blocks.11.mlp.layers.1.weight', 'image_encoder.trunk.blocks.12.mlp.layers.0.bias', 'image_encoder.trunk.blocks.29.norm1.bias', 'image_encoder.trunk.blocks.20.attn.qkv.weight', 'image_encoder.trunk.blocks.20.norm1.bias', 'image_encoder.trunk.blocks.27.attn.proj.weight', 'image_encoder.trunk.blocks.33.norm1.bias', 'image_encoder.trunk.blocks.33.mlp.layers.0.weight', 'image_encoder.trunk.blocks.0.mlp.layers.1.weight', 'image_encoder.trunk.blocks.46.mlp.layers.1.bias', 'image_encoder.trunk.blocks.9.mlp.layers.1.bias', 'image_encoder.trunk.blocks.7.mlp.layers.1.weight', 'image_encoder.trunk.blocks.13.attn.qkv.weight', 'image_encoder.trunk.blocks.38.mlp.layers.1.weight', 'image_encoder.trunk.blocks.46.attn.proj.bias', 'image_encoder.trunk.blocks.41.norm1.bias', 'image_encoder.trunk.blocks.43.attn.proj.weight', 'image_encoder.trunk.blocks.26.mlp.layers.0.bias', 'image_encoder.trunk.blocks.40.norm2.bias', 'image_encoder.trunk.blocks.39.attn.qkv.bias', 'image_encoder.trunk.blocks.26.norm2.bias', 'image_encoder.trunk.blocks.32.mlp.layers.0.weight', 'image_encoder.trunk.blocks.34.mlp.layers.1.weight', 'image_encoder.trunk.blocks.23.norm1.bias', 'image_encoder.trunk.blocks.46.mlp.layers.0.weight', 'image_encoder.trunk.blocks.31.attn.qkv.bias', 'image_encoder.trunk.blocks.35.attn.qkv.weight', 'image_encoder.trunk.blocks.20.mlp.layers.0.weight', 'image_encoder.trunk.blocks.33.norm1.weight', 'image_encoder.trunk.blocks.38.attn.qkv.bias', 'image_encoder.trunk.blocks.1.attn.proj.bias', 'image_encoder.trunk.blocks.39.mlp.layers.1.bias', 'image_encoder.trunk.blocks.37.mlp.layers.0.bias', 'image_encoder.trunk.blocks.0.attn.proj.bias', 
'image_encoder.trunk.blocks.28.mlp.layers.1.bias', 'image_encoder.trunk.blocks.4.mlp.layers.1.weight', 'image_encoder.trunk.blocks.5.attn.qkv.bias', 'image_encoder.trunk.blocks.35.attn.qkv.bias', 'image_encoder.trunk.blocks.23.mlp.layers.0.bias', 'image_encoder.trunk.blocks.46.mlp.layers.1.weight', 'image_encoder.trunk.blocks.47.attn.qkv.bias', 'image_encoder.trunk.blocks.29.attn.qkv.bias', 'image_encoder.trunk.blocks.6.norm1.bias', 'image_encoder.trunk.blocks.3.norm2.weight', 'image_encoder.trunk.blocks.27.mlp.layers.1.bias', 'image_encoder.trunk.blocks.32.norm1.bias', 'image_encoder.trunk.blocks.38.attn.proj.bias', 'image_encoder.trunk.blocks.15.attn.proj.bias', 'image_encoder.trunk.blocks.25.mlp.layers.1.weight', 'image_encoder.trunk.blocks.26.attn.qkv.bias', 'image_encoder.trunk.blocks.8.norm2.weight', 'image_encoder.trunk.blocks.28.norm2.bias', 'image_encoder.trunk.blocks.28.attn.qkv.weight', 'image_encoder.trunk.blocks.47.norm2.bias', 'image_encoder.trunk.blocks.36.attn.qkv.bias', 'image_encoder.trunk.blocks.9.norm2.weight', 'image_encoder.trunk.blocks.11.mlp.layers.1.bias', 'image_encoder.trunk.blocks.45.norm2.weight', 'image_encoder.trunk.blocks.17.mlp.layers.1.bias', 'image_encoder.trunk.blocks.10.attn.proj.weight', 'image_encoder.trunk.blocks.44.attn.qkv.bias', 'image_encoder.trunk.blocks.29.mlp.layers.0.bias', 'image_encoder.trunk.blocks.34.mlp.layers.1.bias', 'image_encoder.trunk.blocks.43.attn.qkv.bias', 'image_encoder.trunk.blocks.35.norm2.weight', 'image_encoder.trunk.blocks.38.mlp.layers.0.weight', 'image_encoder.trunk.blocks.40.mlp.layers.0.bias', 'image_encoder.trunk.blocks.4.attn.proj.bias', 'image_encoder.trunk.blocks.2.attn.qkv.weight', 'image_encoder.trunk.blocks.26.norm1.weight', 'image_encoder.trunk.blocks.43.mlp.layers.0.weight', 'image_encoder.trunk.pos_embed_window', 'image_encoder.trunk.blocks.22.norm2.weight', 'image_encoder.trunk.blocks.29.attn.qkv.weight', 'image_encoder.trunk.blocks.42.attn.proj.weight', 
'image_encoder.trunk.blocks.28.mlp.layers.0.bias', 'image_encoder.trunk.blocks.40.attn.proj.bias', 'image_encoder.trunk.blocks.27.attn.qkv.bias', 'image_encoder.trunk.blocks.31.attn.proj.bias', 'image_encoder.trunk.blocks.42.attn.qkv.weight', 'image_encoder.trunk.blocks.10.norm1.weight', 'image_encoder.trunk.blocks.32.mlp.layers.1.bias', 'image_encoder.trunk.blocks.17.mlp.layers.0.weight', 'image_encoder.trunk.blocks.21.norm1.weight', 'image_encoder.trunk.blocks.27.norm1.bias', 'image_encoder.trunk.blocks.47.mlp.layers.0.weight', 'image_encoder.trunk.blocks.24.attn.proj.bias', 'image_encoder.trunk.blocks.14.norm2.bias', 'image_encoder.trunk.blocks.32.mlp.layers.0.bias', 'image_encoder.trunk.blocks.47.attn.qkv.weight', 'image_encoder.trunk.blocks.31.norm1.weight', 'image_encoder.trunk.blocks.41.norm2.weight', 'image_encoder.trunk.pos_embed', 'image_encoder.trunk.blocks.44.attn.qkv.weight', 'image_encoder.trunk.blocks.20.attn.proj.bias', 'image_encoder.trunk.blocks.44.mlp.layers.1.weight', 'image_encoder.trunk.blocks.8.attn.proj.bias', 'image_encoder.trunk.blocks.5.mlp.layers.1.weight', 'image_encoder.trunk.blocks.21.mlp.layers.1.bias', 'image_encoder.trunk.blocks.4.norm2.weight', 'image_encoder.trunk.blocks.4.mlp.layers.0.weight', 'image_encoder.trunk.blocks.2.mlp.layers.1.weight', 'image_encoder.trunk.blocks.45.norm1.bias', 'image_encoder.trunk.blocks.47.mlp.layers.0.bias', 'image_encoder.neck.convs.1.conv.weight', 'image_encoder.trunk.blocks.41.mlp.layers.1.bias', 'image_encoder.trunk.blocks.3.mlp.layers.0.bias', 'image_encoder.trunk.blocks.12.attn.proj.weight', 'image_encoder.trunk.blocks.15.mlp.layers.1.weight', 'image_encoder.trunk.blocks.6.norm2.bias', 'image_encoder.trunk.blocks.34.norm1.bias', 'image_encoder.trunk.blocks.25.norm1.bias', 'image_encoder.trunk.blocks.43.attn.proj.bias', 'image_encoder.neck.convs.2.conv.bias', 'image_encoder.trunk.blocks.29.norm2.weight', 'image_encoder.trunk.blocks.36.norm1.weight', 
'image_encoder.trunk.blocks.37.attn.proj.weight', 'image_encoder.trunk.blocks.31.mlp.layers.1.bias', 'image_encoder.trunk.blocks.36.norm2.bias', 'image_encoder.trunk.blocks.20.attn.qkv.bias', 'image_encoder.trunk.blocks.14.mlp.layers.1.bias', 'image_encoder.trunk.blocks.31.norm2.weight', 'image_encoder.trunk.blocks.32.norm1.weight', 'image_encoder.trunk.blocks.0.norm1.weight', 'image_encoder.trunk.blocks.45.attn.proj.bias', 'image_encoder.trunk.blocks.21.attn.qkv.bias', 'image_encoder.trunk.blocks.13.mlp.layers.0.bias', 'image_encoder.trunk.blocks.40.attn.proj.weight', 'image_encoder.trunk.blocks.9.mlp.layers.0.weight', 'image_encoder.trunk.blocks.28.norm1.weight', 'image_encoder.trunk.blocks.0.mlp.layers.0.bias', 'image_encoder.trunk.blocks.4.mlp.layers.1.bias', 'image_encoder.trunk.blocks.25.mlp.layers.0.weight', 'image_encoder.trunk.blocks.36.mlp.layers.1.bias', 'image_encoder.trunk.blocks.31.norm2.bias', 'image_encoder.trunk.blocks.34.attn.qkv.bias', 'image_encoder.trunk.blocks.34.attn.proj.bias', 'image_encoder.trunk.blocks.9.attn.proj.weight', 'image_encoder.trunk.blocks.8.proj.weight', 'image_encoder.trunk.blocks.36.norm1.bias', 'image_encoder.trunk.blocks.9.norm1.weight', 'image_encoder.trunk.blocks.17.norm2.bias', 'image_encoder.trunk.blocks.34.norm2.weight', 'image_encoder.trunk.blocks.43.norm2.weight', 'image_encoder.trunk.blocks.44.attn.proj.weight', 'image_encoder.trunk.blocks.11.attn.proj.weight', 'image_encoder.trunk.blocks.6.attn.qkv.bias', 'image_encoder.trunk.blocks.18.mlp.layers.0.weight', 'image_encoder.trunk.blocks.43.norm1.weight', 'image_encoder.trunk.blocks.0.mlp.layers.1.bias', 'image_encoder.trunk.blocks.17.attn.proj.weight', 'image_encoder.trunk.blocks.6.mlp.layers.1.weight', 'image_encoder.trunk.blocks.22.attn.qkv.bias', 'image_encoder.trunk.blocks.41.attn.proj.bias', 'image_encoder.trunk.blocks.15.attn.qkv.weight', 'image_encoder.trunk.blocks.35.norm1.weight', 'image_encoder.trunk.blocks.24.mlp.layers.1.bias', 
'image_encoder.trunk.blocks.2.mlp.layers.0.bias', 'image_encoder.trunk.blocks.39.norm1.bias', 'image_encoder.trunk.blocks.10.mlp.layers.1.weight', 'image_encoder.trunk.blocks.26.norm1.bias', 'image_encoder.trunk.blocks.10.attn.qkv.bias', 'image_encoder.trunk.blocks.13.norm2.weight', 'image_encoder.trunk.blocks.24.attn.qkv.bias', 'image_encoder.trunk.blocks.35.attn.proj.weight', 'image_encoder.trunk.blocks.29.mlp.layers.0.weight', 'image_encoder.trunk.blocks.44.attn.proj.bias', 'image_encoder.trunk.blocks.24.norm1.bias', 'image_encoder.trunk.blocks.40.attn.qkv.bias', 'image_encoder.trunk.blocks.41.norm1.weight', 'image_encoder.trunk.blocks.16.mlp.layers.0.bias', 'image_encoder.trunk.blocks.28.attn.qkv.bias', 'image_encoder.trunk.blocks.30.norm1.bias', 'image_encoder.trunk.blocks.12.mlp.layers.1.weight', 'image_encoder.trunk.blocks.18.norm2.bias', 'image_encoder.trunk.blocks.1.attn.qkv.bias', 'image_encoder.trunk.blocks.45.attn.qkv.bias', 'image_encoder.neck.convs.0.conv.bias', 'image_encoder.trunk.blocks.8.mlp.layers.0.bias', 'image_encoder.trunk.blocks.23.attn.qkv.bias', 'image_encoder.trunk.blocks.46.norm1.weight', 'image_encoder.trunk.blocks.19.attn.proj.bias', 'image_encoder.trunk.blocks.28.mlp.layers.0.weight', 'image_encoder.trunk.blocks.6.attn.proj.bias', 'image_encoder.trunk.blocks.23.attn.qkv.weight', 'image_encoder.trunk.blocks.17.mlp.layers.0.bias', 'image_encoder.trunk.blocks.12.mlp.layers.0.weight', 'image_encoder.trunk.blocks.22.norm2.bias', 'image_encoder.trunk.blocks.36.attn.proj.bias', 'image_encoder.trunk.blocks.9.norm1.bias', 'image_encoder.trunk.blocks.12.norm2.weight', 'image_encoder.trunk.blocks.5.norm2.bias', 'image_encoder.trunk.blocks.16.mlp.layers.1.weight', 'image_encoder.trunk.blocks.43.mlp.layers.1.bias', 'image_encoder.trunk.blocks.11.norm2.weight', 'image_encoder.trunk.blocks.11.norm1.weight', 'image_encoder.trunk.blocks.1.mlp.layers.0.bias', 'image_encoder.trunk.blocks.23.norm1.weight', 'image_encoder.trunk.blocks.4.attn.qkv.weight', 
'image_encoder.trunk.blocks.3.mlp.layers.1.bias', 'image_encoder.trunk.blocks.35.norm2.bias', 'image_encoder.trunk.blocks.11.norm2.bias', 'image_encoder.trunk.blocks.38.attn.proj.weight', 'image_encoder.trunk.blocks.44.mlp.layers.0.weight', 'image_encoder.trunk.blocks.10.norm2.weight', 'image_encoder.trunk.blocks.21.norm2.bias', 'image_encoder.trunk.blocks.39.mlp.layers.0.bias', 'image_encoder.trunk.blocks.25.norm2.bias', 'image_encoder.trunk.blocks.21.attn.qkv.weight', 'image_encoder.trunk.blocks.38.norm1.bias', 'image_encoder.trunk.blocks.42.norm2.weight', 'image_encoder.trunk.blocks.2.mlp.layers.1.bias', 'image_encoder.trunk.blocks.45.norm1.weight', 'image_encoder.trunk.blocks.40.norm1.bias', 'image_encoder.trunk.blocks.43.attn.qkv.weight', 'image_encoder.trunk.blocks.45.norm2.bias', 'image_encoder.trunk.blocks.31.mlp.layers.1.weight', 'image_encoder.trunk.blocks.0.norm2.weight', 'image_encoder.trunk.blocks.6.norm2.weight', 'image_encoder.trunk.blocks.36.attn.qkv.weight', 'image_encoder.trunk.blocks.41.attn.qkv.bias', 'image_encoder.trunk.blocks.3.attn.proj.bias', 'image_encoder.trunk.blocks.22.norm1.weight', 'image_encoder.trunk.blocks.29.attn.proj.weight', 'image_encoder.trunk.blocks.32.attn.proj.weight', 'image_encoder.trunk.blocks.19.mlp.layers.0.bias', 'image_encoder.trunk.blocks.32.mlp.layers.1.weight', 'image_encoder.trunk.blocks.13.norm2.bias', 'image_encoder.trunk.blocks.26.attn.qkv.weight', 'image_encoder.trunk.blocks.14.norm1.bias', 'image_encoder.trunk.blocks.20.mlp.layers.1.weight', 'image_encoder.trunk.blocks.14.attn.qkv.weight', 'image_encoder.trunk.blocks.17.norm1.bias', 'image_encoder.trunk.blocks.38.norm2.bias', 'image_encoder.trunk.blocks.41.attn.qkv.weight', 'image_encoder.trunk.blocks.47.mlp.layers.1.bias', 'image_encoder.trunk.blocks.12.mlp.layers.1.bias', 'image_encoder.trunk.blocks.13.attn.proj.bias', 'image_encoder.trunk.blocks.15.norm2.bias', 'image_encoder.trunk.blocks.28.attn.proj.bias', 
'image_encoder.trunk.blocks.7.mlp.layers.1.bias', 'image_encoder.trunk.blocks.34.attn.proj.weight', 'image_encoder.trunk.blocks.31.mlp.layers.0.bias', 'image_encoder.trunk.blocks.10.norm2.bias', 'image_encoder.trunk.blocks.42.mlp.layers.0.weight', 'image_encoder.trunk.blocks.4.norm2.bias', 'image_encoder.trunk.blocks.25.attn.proj.bias', 'image_encoder.trunk.blocks.43.norm1.bias', 'image_encoder.trunk.blocks.16.norm2.bias', 'image_encoder.trunk.blocks.37.norm2.bias', 'image_encoder.trunk.blocks.7.norm1.bias', 'image_encoder.trunk.blocks.3.norm1.weight', 'image_encoder.trunk.blocks.13.mlp.layers.0.weight', 'image_encoder.trunk.blocks.28.norm2.weight', 'image_encoder.trunk.blocks.30.norm2.weight', 'image_encoder.trunk.blocks.31.norm1.bias', 'image_encoder.trunk.blocks.1.norm2.bias', 'image_encoder.trunk.blocks.26.mlp.layers.1.weight', 'image_encoder.trunk.blocks.21.mlp.layers.0.bias', 'image_encoder.trunk.blocks.8.mlp.layers.1.weight', 'image_encoder.trunk.blocks.45.mlp.layers.1.weight', 'image_encoder.trunk.blocks.38.norm1.weight', 'image_encoder.trunk.blocks.28.norm1.bias', 'image_encoder.trunk.blocks.35.mlp.layers.1.bias', 'image_encoder.trunk.blocks.42.mlp.layers.0.bias', 'image_encoder.trunk.blocks.14.mlp.layers.1.weight', 'image_encoder.trunk.blocks.22.attn.proj.weight', 'image_encoder.trunk.blocks.36.mlp.layers.1.weight', 'image_encoder.trunk.blocks.33.attn.qkv.weight', 'image_encoder.trunk.blocks.39.attn.proj.weight', 'image_encoder.trunk.blocks.5.norm2.weight', 'image_encoder.trunk.blocks.45.attn.proj.weight', 'image_encoder.trunk.blocks.24.norm2.weight', 'image_encoder.trunk.blocks.46.attn.qkv.bias', 'image_encoder.trunk.blocks.46.norm2.bias', 'image_encoder.trunk.blocks.42.mlp.layers.1.bias', 'image_encoder.trunk.blocks.45.mlp.layers.0.bias', 'image_encoder.trunk.blocks.22.mlp.layers.1.bias', 'image_encoder.trunk.blocks.8.mlp.layers.0.weight', 'image_encoder.trunk.blocks.30.norm1.weight', 'image_encoder.trunk.blocks.7.attn.proj.weight', 
'image_encoder.trunk.blocks.0.attn.qkv.bias', 'image_encoder.trunk.blocks.33.norm2.bias', 'image_encoder.trunk.blocks.14.norm2.weight', 'image_encoder.trunk.blocks.10.mlp.layers.1.bias', 'image_encoder.trunk.blocks.27.mlp.layers.0.weight', 'image_encoder.trunk.blocks.11.attn.proj.bias', 'image_encoder.trunk.blocks.9.norm2.bias', 'image_encoder.trunk.blocks.16.mlp.layers.1.bias', 'image_encoder.trunk.blocks.47.norm1.weight', 'image_encoder.trunk.blocks.25.norm2.weight', 'image_encoder.trunk.blocks.47.norm1.bias', 'image_encoder.trunk.blocks.15.norm2.weight', 'image_encoder.trunk.blocks.18.attn.qkv.weight', 'image_encoder.trunk.blocks.36.mlp.layers.0.bias', 'image_encoder.trunk.blocks.1.mlp.layers.1.bias', 'image_encoder.trunk.blocks.35.mlp.layers.0.weight', 'image_encoder.trunk.blocks.28.attn.proj.weight', 'image_encoder.trunk.blocks.18.norm1.bias', 'image_encoder.trunk.blocks.24.mlp.layers.0.weight', 'image_encoder.trunk.blocks.39.attn.qkv.weight', 'image_encoder.trunk.blocks.19.mlp.layers.1.weight', 'image_encoder.trunk.blocks.38.mlp.layers.1.bias', 'image_encoder.trunk.blocks.4.norm1.weight', 'image_encoder.trunk.blocks.19.attn.proj.weight', 'image_encoder.trunk.blocks.42.norm1.weight', 'image_encoder.trunk.blocks.43.norm2.bias', 'image_encoder.trunk.blocks.0.mlp.layers.0.weight', 'image_encoder.trunk.blocks.6.norm1.weight', 'image_encoder.trunk.blocks.18.norm1.weight', 'image_encoder.trunk.blocks.11.attn.qkv.weight', 'image_encoder.trunk.blocks.27.attn.qkv.weight', 'image_encoder.trunk.blocks.47.norm2.weight', 'image_encoder.trunk.blocks.7.norm2.weight', 'image_encoder.trunk.blocks.18.mlp.layers.1.weight', 'image_encoder.trunk.blocks.19.norm2.weight', 'image_encoder.trunk.blocks.30.attn.qkv.weight', 'image_encoder.trunk.blocks.10.norm1.bias', 'image_encoder.trunk.blocks.12.attn.qkv.bias', 'image_encoder.trunk.blocks.22.mlp.layers.0.bias', 'image_encoder.trunk.blocks.35.mlp.layers.0.bias', 'image_encoder.trunk.blocks.36.norm2.weight', 
'image_encoder.trunk.blocks.44.mlp.layers.0.bias', 'image_encoder.trunk.blocks.25.norm1.weight', 'image_encoder.trunk.blocks.34.mlp.layers.0.weight', 'image_encoder.trunk.blocks.40.norm1.weight', 'image_encoder.trunk.blocks.20.norm2.bias', 'image_encoder.trunk.blocks.9.attn.proj.bias', 'image_encoder.trunk.blocks.29.mlp.layers.1.weight', 'image_encoder.trunk.blocks.3.attn.qkv.weight', 'image_encoder.trunk.blocks.33.attn.proj.bias', 'image_encoder.neck.convs.2.conv.weight', 'image_encoder.trunk.blocks.33.norm2.weight', 'image_encoder.trunk.blocks.43.mlp.layers.0.bias', 'image_encoder.trunk.blocks.47.attn.proj.weight', 'image_encoder.trunk.blocks.34.mlp.layers.0.bias', 'image_encoder.trunk.blocks.46.norm2.weight', 'image_encoder.trunk.blocks.39.norm2.weight', 'image_encoder.trunk.blocks.30.mlp.layers.1.weight', 'image_encoder.trunk.blocks.6.attn.proj.weight', 'image_encoder.trunk.blocks.15.mlp.layers.0.bias', 'image_encoder.trunk.blocks.32.norm2.bias', 'image_encoder.trunk.blocks.1.attn.qkv.weight', 'image_encoder.trunk.blocks.17.mlp.layers.1.weight', 'image_encoder.trunk.blocks.24.norm1.weight', 'image_encoder.trunk.blocks.8.norm1.weight', 'image_encoder.trunk.blocks.4.attn.proj.weight', 'image_encoder.trunk.blocks.32.attn.proj.bias', 'image_encoder.trunk.blocks.9.mlp.layers.0.bias', 'image_encoder.trunk.blocks.2.proj.weight', 'image_encoder.trunk.blocks.8.attn.qkv.weight', 'image_encoder.trunk.blocks.1.norm2.weight', 'image_encoder.trunk.blocks.17.norm2.weight', 'image_encoder.trunk.blocks.18.mlp.layers.1.bias', 'image_encoder.trunk.blocks.45.mlp.layers.0.weight', 'image_encoder.trunk.blocks.32.attn.qkv.bias', 'image_encoder.trunk.blocks.5.norm1.weight', 'image_encoder.trunk.blocks.40.attn.qkv.weight', 'image_encoder.trunk.blocks.41.mlp.layers.0.bias', 'image_encoder.trunk.blocks.16.norm1.weight', 'image_encoder.trunk.blocks.31.attn.qkv.weight', 'image_encoder.trunk.blocks.2.attn.qkv.bias', 'image_encoder.trunk.blocks.11.attn.qkv.bias', 
'image_encoder.trunk.blocks.42.norm2.bias', 'image_encoder.trunk.blocks.15.norm1.bias', 'image_encoder.trunk.blocks.18.attn.qkv.bias', 'image_encoder.trunk.blocks.3.mlp.layers.0.weight', 'image_encoder.trunk.blocks.46.norm1.bias', 'image_encoder.trunk.blocks.19.norm1.weight', 'image_encoder.trunk.blocks.23.mlp.layers.1.bias', 'image_encoder.trunk.blocks.15.mlp.layers.0.weight', 'image_encoder.trunk.blocks.38.norm2.weight', 'image_encoder.trunk.blocks.8.proj.bias', 'image_encoder.trunk.blocks.19.mlp.layers.0.weight', 'image_encoder.trunk.blocks.3.attn.proj.weight', 'image_encoder.trunk.blocks.27.norm2.weight', 'image_encoder.trunk.blocks.30.attn.qkv.bias', 'image_encoder.trunk.blocks.11.norm1.bias', 'image_encoder.trunk.blocks.2.norm2.weight', 'image_encoder.trunk.blocks.45.mlp.layers.1.bias', 'image_encoder.trunk.blocks.22.norm1.bias', 'image_encoder.trunk.blocks.29.norm1.weight', 'image_encoder.trunk.blocks.40.mlp.layers.0.weight', 'image_encoder.trunk.blocks.4.mlp.layers.0.bias', 'image_encoder.trunk.blocks.23.mlp.layers.0.weight', 'image_encoder.trunk.blocks.7.norm2.bias'}
INFO 2025-01-05 01:19:47,845 optimizer.py: 248: Matches for param_name [*bias*]: {'image_encoder.trunk.blocks.44.norm2.bias', 'memory_attention.layers.2.norm3.bias', 'image_encoder.neck.convs.3.conv.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_token_to_image.v_proj.bias', 'image_encoder.trunk.blocks.17.attn.qkv.bias', 'image_encoder.trunk.blocks.24.norm2.bias', 'image_encoder.trunk.blocks.20.mlp.layers.1.bias', 'image_encoder.trunk.blocks.4.attn.qkv.bias', 'image_encoder.trunk.blocks.12.attn.proj.bias', 'image_encoder.trunk.blocks.38.mlp.layers.0.bias', 'image_encoder.trunk.blocks.20.mlp.layers.0.bias', 'image_encoder.trunk.blocks.2.norm1.bias', 'memory_attention.layers.3.cross_attn_image.out_proj.bias', 'image_encoder.trunk.blocks.27.mlp.layers.0.bias', 'image_encoder.trunk.blocks.37.mlp.layers.1.bias', 'image_encoder.trunk.blocks.1.norm1.bias', 
'sam_mask_decoder.transformer.layers.0.cross_attn_image_to_token.out_proj.bias', 'image_encoder.trunk.blocks.22.attn.proj.bias', 'image_encoder.trunk.blocks.33.mlp.layers.0.bias', 'image_encoder.trunk.blocks.47.attn.proj.bias', 'image_encoder.trunk.blocks.3.norm2.bias', 'sam_mask_decoder.output_hypernetworks_mlps.3.layers.2.bias', 'image_encoder.trunk.blocks.29.norm2.bias', 'image_encoder.trunk.blocks.14.attn.proj.bias', 'image_encoder.trunk.blocks.8.norm2.bias', 'image_encoder.trunk.blocks.21.norm1.bias', 'sam_mask_decoder.transformer.final_attn_token_to_image.v_proj.bias', 'sam_mask_decoder.transformer.layers.0.self_attn.out_proj.bias', 'memory_attention.layers.3.cross_attn_image.q_proj.bias', 'memory_attention.layers.0.norm1.bias', 'image_encoder.trunk.blocks.7.attn.qkv.bias', 'image_encoder.trunk.blocks.19.attn.qkv.bias', 'obj_ptr_proj.layers.0.bias', 'sam_mask_decoder.output_hypernetworks_mlps.1.layers.0.bias', 'memory_attention.layers.0.cross_attn_image.out_proj.bias', 'image_encoder.trunk.blocks.10.attn.proj.bias', 'image_encoder.trunk.blocks.5.attn.proj.bias', 'image_encoder.trunk.blocks.19.norm2.bias', 'image_encoder.trunk.blocks.29.mlp.layers.1.bias', 'sam_mask_decoder.iou_prediction_head.layers.1.bias', 'image_encoder.trunk.blocks.37.norm1.bias', 'image_encoder.trunk.blocks.44.mlp.layers.1.bias', 'memory_attention.layers.1.norm1.bias', 'image_encoder.trunk.blocks.7.mlp.layers.0.bias', 'sam_mask_decoder.transformer.layers.1.self_attn.q_proj.bias', 'image_encoder.trunk.blocks.0.norm1.bias', 'image_encoder.trunk.blocks.39.attn.proj.bias', 'image_encoder.trunk.blocks.30.mlp.layers.1.bias', 'memory_encoder.fuser.layers.0.norm.bias', 'memory_attention.layers.1.linear2.bias', 'sam_mask_decoder.transformer.layers.0.norm1.bias', 'image_encoder.trunk.blocks.33.mlp.layers.1.bias', 'image_encoder.trunk.blocks.5.mlp.layers.0.bias', 'memory_attention.layers.1.cross_attn_image.q_proj.bias', 'image_encoder.trunk.blocks.13.norm1.bias', 
'memory_attention.layers.0.linear2.bias', 'image_encoder.trunk.blocks.21.attn.proj.bias', 'image_encoder.trunk.blocks.42.attn.qkv.bias', 'image_encoder.trunk.blocks.41.norm2.bias', 'image_encoder.trunk.blocks.27.attn.proj.bias', 'image_encoder.trunk.blocks.42.norm1.bias', 'image_encoder.trunk.blocks.25.attn.qkv.bias', 'sam_mask_decoder.pred_obj_score_head.layers.2.bias', 'image_encoder.trunk.blocks.2.attn.proj.bias', 'memory_attention.layers.0.cross_attn_image.q_proj.bias', 'image_encoder.trunk.blocks.18.attn.proj.bias', 'image_encoder.trunk.blocks.24.mlp.layers.0.bias', 'obj_ptr_proj.layers.2.bias', 'image_encoder.trunk.blocks.40.mlp.layers.1.bias', 'obj_ptr_tpos_proj.bias', 'image_encoder.trunk.blocks.3.norm1.bias', 'image_encoder.trunk.blocks.19.norm1.bias', 'memory_encoder.mask_downsampler.encoder.7.bias', 'memory_encoder.fuser.layers.1.dwconv.bias', 'image_encoder.trunk.blocks.13.mlp.layers.1.bias', 'image_encoder.trunk.blocks.23.norm2.bias', 'image_encoder.trunk.blocks.8.norm1.bias', 'image_encoder.trunk.blocks.29.attn.proj.bias', 'memory_encoder.mask_downsampler.encoder.0.bias', 'image_encoder.trunk.blocks.42.attn.proj.bias', 'image_encoder.trunk.patch_embed.proj.bias', 'memory_encoder.fuser.layers.1.pwconv2.bias', 'image_encoder.trunk.blocks.33.attn.qkv.bias', 'image_encoder.trunk.blocks.15.attn.qkv.bias', 'image_encoder.trunk.blocks.12.norm2.bias', 'image_encoder.trunk.blocks.0.norm2.bias', 'memory_attention.layers.1.norm3.bias', 'sam_mask_decoder.output_upscaling.0.bias', 'memory_attention.layers.2.cross_attn_image.out_proj.bias', 'image_encoder.trunk.blocks.2.proj.bias', 'image_encoder.trunk.blocks.15.mlp.layers.1.bias', 'image_encoder.trunk.blocks.26.mlp.layers.1.bias', 'sam_mask_decoder.output_hypernetworks_mlps.2.layers.2.bias', 'image_encoder.trunk.blocks.5.mlp.layers.1.bias', 'image_encoder.trunk.blocks.19.mlp.layers.1.bias', 'image_encoder.trunk.blocks.44.norm1.bias', 'image_encoder.trunk.blocks.30.norm2.bias', 
'image_encoder.trunk.blocks.16.norm1.bias', 'image_encoder.trunk.blocks.23.attn.proj.bias', 'image_encoder.trunk.blocks.46.mlp.layers.0.bias', 'memory_attention.layers.2.self_attn.q_proj.bias', 'image_encoder.trunk.blocks.12.norm1.bias', 'memory_attention.layers.3.norm3.bias', 'mask_downsample.bias', 'memory_attention.layers.1.self_attn.out_proj.bias', 'image_encoder.trunk.blocks.8.mlp.layers.1.bias', 'image_encoder.trunk.blocks.30.attn.proj.bias', 'image_encoder.trunk.blocks.30.mlp.layers.0.bias', 'image_encoder.trunk.blocks.35.attn.proj.bias', 'image_encoder.trunk.blocks.16.attn.qkv.bias', 'image_encoder.trunk.blocks.3.attn.qkv.bias', 'memory_encoder.mask_downsampler.encoder.1.bias', 'memory_encoder.mask_downsampler.encoder.9.bias', 'obj_ptr_proj.layers.1.bias', 'memory_attention.layers.0.self_attn.q_proj.bias', 'image_encoder.trunk.blocks.14.attn.qkv.bias', 'image_encoder.trunk.blocks.6.mlp.layers.1.bias', 'sam_mask_decoder.iou_prediction_head.layers.2.bias', 'sam_mask_decoder.output_hypernetworks_mlps.3.layers.1.bias', 'sam_mask_decoder.transformer.layers.0.self_attn.k_proj.bias', 'image_encoder.trunk.blocks.17.attn.proj.bias', 'image_encoder.trunk.blocks.7.attn.proj.bias', 'image_encoder.trunk.blocks.39.norm2.bias', 'image_encoder.trunk.blocks.25.mlp.layers.1.bias', 'image_encoder.trunk.blocks.13.attn.qkv.bias', 'image_encoder.trunk.blocks.37.attn.qkv.bias', 'image_encoder.trunk.blocks.37.attn.proj.bias', 'image_encoder.trunk.blocks.4.norm1.bias', 'image_encoder.trunk.blocks.9.attn.qkv.bias', 'image_encoder.trunk.blocks.16.attn.proj.bias', 'image_encoder.trunk.blocks.14.mlp.layers.0.bias', 'image_encoder.trunk.blocks.25.mlp.layers.0.bias', 'image_encoder.trunk.blocks.6.mlp.layers.0.bias', 'image_encoder.trunk.blocks.11.mlp.layers.0.bias', 'image_encoder.trunk.blocks.26.attn.proj.bias', 'image_encoder.trunk.blocks.10.mlp.layers.0.bias', 'sam_mask_decoder.transformer.layers.1.norm4.bias', 'image_encoder.trunk.blocks.34.norm2.bias', 
'image_encoder.neck.convs.1.conv.bias', 'image_encoder.trunk.blocks.5.norm1.bias', 'image_encoder.trunk.blocks.44.proj.bias', 'sam_mask_decoder.output_hypernetworks_mlps.0.layers.1.bias', 'image_encoder.trunk.blocks.35.norm1.bias', 'image_encoder.trunk.blocks.2.norm2.bias', 'image_encoder.trunk.blocks.18.mlp.layers.0.bias', 'memory_attention.layers.1.cross_attn_image.out_proj.bias', 'memory_attention.layers.2.cross_attn_image.v_proj.bias', 'image_encoder.trunk.blocks.8.attn.qkv.bias', 'memory_encoder.fuser.layers.1.pwconv1.bias', 'sam_mask_decoder.transformer.final_attn_token_to_image.out_proj.bias', 'image_encoder.trunk.blocks.27.norm2.bias', 'sam_mask_decoder.output_hypernetworks_mlps.2.layers.1.bias', 'image_encoder.trunk.blocks.12.mlp.layers.0.bias', 'memory_attention.layers.3.self_attn.k_proj.bias', 'image_encoder.trunk.blocks.29.norm1.bias', 'image_encoder.trunk.blocks.20.norm1.bias', 'image_encoder.trunk.blocks.33.norm1.bias', 'memory_attention.layers.2.norm1.bias', 'image_encoder.trunk.blocks.46.mlp.layers.1.bias', 'image_encoder.trunk.blocks.9.mlp.layers.1.bias', 'image_encoder.trunk.blocks.46.attn.proj.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_token_to_image.q_proj.bias', 'sam_mask_decoder.output_upscaling.3.bias', 'image_encoder.trunk.blocks.41.norm1.bias', 'image_encoder.trunk.blocks.26.mlp.layers.0.bias', 'image_encoder.trunk.blocks.40.norm2.bias', 'image_encoder.trunk.blocks.39.attn.qkv.bias', 'memory_encoder.mask_downsampler.encoder.6.bias', 'image_encoder.trunk.blocks.26.norm2.bias', 'sam_mask_decoder.output_hypernetworks_mlps.1.layers.1.bias', 'image_encoder.trunk.blocks.23.norm1.bias', 'image_encoder.trunk.blocks.31.attn.qkv.bias', 'image_encoder.trunk.blocks.38.attn.qkv.bias', 'image_encoder.trunk.blocks.1.attn.proj.bias', 'image_encoder.trunk.blocks.39.mlp.layers.1.bias', 'image_encoder.trunk.blocks.37.mlp.layers.0.bias', 'image_encoder.trunk.blocks.0.attn.proj.bias', 'image_encoder.trunk.blocks.28.mlp.layers.1.bias', 
'image_encoder.trunk.blocks.5.attn.qkv.bias', 'image_encoder.trunk.blocks.35.attn.qkv.bias', 'memory_attention.layers.2.self_attn.out_proj.bias', 'image_encoder.trunk.blocks.23.mlp.layers.0.bias', 'memory_attention.layers.0.norm3.bias', 'image_encoder.trunk.blocks.47.attn.qkv.bias', 'image_encoder.trunk.blocks.29.attn.qkv.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_token_to_image.q_proj.bias', 'image_encoder.trunk.blocks.6.norm1.bias', 'image_encoder.trunk.blocks.27.mlp.layers.1.bias', 'image_encoder.trunk.blocks.32.norm1.bias', 'image_encoder.trunk.blocks.38.attn.proj.bias', 'sam_mask_decoder.transformer.layers.0.self_attn.v_proj.bias', 'image_encoder.trunk.blocks.15.attn.proj.bias', 'image_encoder.trunk.blocks.26.attn.qkv.bias', 'memory_encoder.mask_downsampler.encoder.10.bias', 'sam_mask_decoder.conv_s1.bias', 'sam_mask_decoder.output_upscaling.1.bias', 'image_encoder.trunk.blocks.28.norm2.bias', 'image_encoder.trunk.blocks.47.norm2.bias', 'image_encoder.trunk.blocks.36.attn.qkv.bias', 'image_encoder.trunk.blocks.11.mlp.layers.1.bias', 'image_encoder.trunk.blocks.17.mlp.layers.1.bias', 'image_encoder.trunk.blocks.44.attn.qkv.bias', 'image_encoder.trunk.blocks.29.mlp.layers.0.bias', 'image_encoder.trunk.blocks.34.mlp.layers.1.bias', 'image_encoder.trunk.blocks.43.attn.qkv.bias', 'image_encoder.trunk.blocks.40.mlp.layers.0.bias', 'sam_mask_decoder.transformer.layers.1.mlp.layers.0.bias', 'image_encoder.trunk.blocks.4.attn.proj.bias', 'image_encoder.trunk.blocks.28.mlp.layers.0.bias', 'image_encoder.trunk.blocks.40.attn.proj.bias', 'memory_attention.layers.3.norm1.bias', 'image_encoder.trunk.blocks.27.attn.qkv.bias', 'image_encoder.trunk.blocks.31.attn.proj.bias', 'memory_attention.layers.2.self_attn.k_proj.bias', 'memory_attention.layers.2.self_attn.v_proj.bias', 'memory_attention.layers.3.self_attn.out_proj.bias', 'memory_attention.layers.2.norm2.bias', 'memory_attention.layers.3.cross_attn_image.k_proj.bias', 
'image_encoder.trunk.blocks.32.mlp.layers.1.bias', 'image_encoder.trunk.blocks.27.norm1.bias', 'sam_prompt_encoder.mask_downscaling.3.bias', 'image_encoder.trunk.blocks.24.attn.proj.bias', 'image_encoder.trunk.blocks.14.norm2.bias', 'image_encoder.trunk.blocks.32.mlp.layers.0.bias', 'memory_encoder.mask_downsampler.encoder.4.bias', 'sam_mask_decoder.transformer.layers.0.mlp.layers.0.bias', 'image_encoder.trunk.blocks.20.attn.proj.bias', 'image_encoder.trunk.blocks.8.attn.proj.bias', 'image_encoder.trunk.blocks.21.mlp.layers.1.bias', 'sam_mask_decoder.output_hypernetworks_mlps.1.layers.2.bias', 'image_encoder.trunk.blocks.47.mlp.layers.0.bias', 'image_encoder.trunk.blocks.45.norm1.bias', 'image_encoder.trunk.blocks.41.mlp.layers.1.bias', 'memory_encoder.fuser.layers.0.dwconv.bias', 'memory_encoder.pix_feat_proj.bias', 'memory_attention.layers.3.norm2.bias', 'image_encoder.trunk.blocks.3.mlp.layers.0.bias', 'image_encoder.trunk.blocks.6.norm2.bias', 'image_encoder.trunk.blocks.34.norm1.bias', 'image_encoder.trunk.blocks.25.norm1.bias', 'image_encoder.trunk.blocks.43.attn.proj.bias', 'image_encoder.neck.convs.2.conv.bias', 'image_encoder.trunk.blocks.31.mlp.layers.1.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_token_to_image.out_proj.bias', 'image_encoder.trunk.blocks.36.norm2.bias', 'image_encoder.trunk.blocks.20.attn.qkv.bias', 'image_encoder.trunk.blocks.14.mlp.layers.1.bias', 'sam_mask_decoder.transformer.layers.0.norm2.bias', 'memory_encoder.fuser.layers.0.pwconv2.bias', 'sam_mask_decoder.transformer.layers.0.mlp.layers.1.bias', 'image_encoder.trunk.blocks.45.attn.proj.bias', 'image_encoder.trunk.blocks.21.attn.qkv.bias', 'image_encoder.trunk.blocks.13.mlp.layers.0.bias', 'memory_attention.layers.0.linear1.bias', 'sam_mask_decoder.output_hypernetworks_mlps.0.layers.0.bias', 'sam_mask_decoder.iou_prediction_head.layers.0.bias', 'image_encoder.trunk.blocks.0.mlp.layers.0.bias', 'image_encoder.trunk.blocks.4.mlp.layers.1.bias', 
'image_encoder.trunk.blocks.36.mlp.layers.1.bias', 'image_encoder.trunk.blocks.31.norm2.bias', 'memory_attention.layers.2.cross_attn_image.k_proj.bias', 'image_encoder.trunk.blocks.34.attn.qkv.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_token_to_image.k_proj.bias', 'image_encoder.trunk.blocks.34.attn.proj.bias', 'memory_encoder.fuser.layers.0.pwconv1.bias', 'memory_attention.layers.3.self_attn.v_proj.bias', 'sam_mask_decoder.output_hypernetworks_mlps.0.layers.2.bias', 'memory_attention.layers.3.cross_attn_image.v_proj.bias', 'image_encoder.trunk.blocks.36.norm1.bias', 'image_encoder.trunk.blocks.17.norm2.bias', 'image_encoder.trunk.blocks.6.attn.qkv.bias', 'image_encoder.trunk.blocks.0.mlp.layers.1.bias', 'image_encoder.trunk.blocks.22.attn.qkv.bias', 'image_encoder.trunk.blocks.41.attn.proj.bias', 'sam_mask_decoder.transformer.layers.1.mlp.layers.1.bias', 'image_encoder.trunk.blocks.24.mlp.layers.1.bias', 'image_encoder.trunk.blocks.2.mlp.layers.0.bias', 'image_encoder.trunk.blocks.39.norm1.bias', 'image_encoder.trunk.blocks.26.norm1.bias', 'image_encoder.trunk.blocks.10.attn.qkv.bias', 'sam_mask_decoder.transformer.layers.1.self_attn.v_proj.bias', 'memory_attention.layers.1.cross_attn_image.v_proj.bias', 'image_encoder.trunk.blocks.24.attn.qkv.bias', 'sam_mask_decoder.transformer.norm_final_attn.bias', 'memory_attention.layers.1.cross_attn_image.k_proj.bias', 'image_encoder.trunk.blocks.44.attn.proj.bias', 'image_encoder.trunk.blocks.24.norm1.bias', 'image_encoder.trunk.blocks.40.attn.qkv.bias', 'memory_attention.layers.0.norm2.bias', 'image_encoder.trunk.blocks.16.mlp.layers.0.bias', 'image_encoder.trunk.blocks.28.attn.qkv.bias', 'image_encoder.trunk.blocks.30.norm1.bias', 'image_encoder.trunk.blocks.45.attn.qkv.bias', 'image_encoder.trunk.blocks.18.norm2.bias', 'image_encoder.trunk.blocks.1.attn.qkv.bias', 'memory_attention.layers.1.self_attn.q_proj.bias', 'image_encoder.neck.convs.0.conv.bias', 'image_encoder.trunk.blocks.8.mlp.layers.0.bias', 
'image_encoder.trunk.blocks.23.attn.qkv.bias', 'image_encoder.trunk.blocks.19.attn.proj.bias', 'image_encoder.trunk.blocks.6.attn.proj.bias', 'image_encoder.trunk.blocks.17.mlp.layers.0.bias', 'image_encoder.trunk.blocks.22.norm2.bias', 'sam_prompt_encoder.mask_downscaling.4.bias', 'image_encoder.trunk.blocks.36.attn.proj.bias', 'image_encoder.trunk.blocks.9.norm1.bias', 'image_encoder.trunk.blocks.5.norm2.bias', 'memory_attention.layers.2.cross_attn_image.q_proj.bias', 'image_encoder.trunk.blocks.43.mlp.layers.1.bias', 'sam_mask_decoder.pred_obj_score_head.layers.1.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_image_to_token.k_proj.bias', 'memory_attention.layers.1.self_attn.v_proj.bias', 'memory_attention.layers.0.self_attn.k_proj.bias', 'image_encoder.trunk.blocks.1.mlp.layers.0.bias', 'memory_attention.layers.3.linear1.bias', 'image_encoder.trunk.blocks.3.mlp.layers.1.bias', 'image_encoder.trunk.blocks.35.norm2.bias', 'memory_attention.layers.0.self_attn.v_proj.bias', 'image_encoder.trunk.blocks.11.norm2.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_token_to_image.v_proj.bias', 'memory_attention.layers.1.linear1.bias', 'memory_attention.layers.2.linear1.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_image_to_token.out_proj.bias', 'image_encoder.trunk.blocks.21.norm2.bias', 'image_encoder.trunk.blocks.39.mlp.layers.0.bias', 'image_encoder.trunk.blocks.25.norm2.bias', 'image_encoder.trunk.blocks.38.norm1.bias', 'image_encoder.trunk.blocks.2.mlp.layers.1.bias', 'image_encoder.trunk.blocks.40.norm1.bias', 'image_encoder.trunk.blocks.45.norm2.bias', 'image_encoder.trunk.blocks.41.attn.qkv.bias', 'memory_attention.layers.0.cross_attn_image.k_proj.bias', 'image_encoder.trunk.blocks.3.attn.proj.bias', 'image_encoder.trunk.blocks.19.mlp.layers.0.bias', 'image_encoder.trunk.blocks.13.norm2.bias', 'image_encoder.trunk.blocks.14.norm1.bias', 'memory_attention.layers.0.self_attn.out_proj.bias', 
'sam_mask_decoder.transformer.layers.1.cross_attn_image_to_token.k_proj.bias', 'image_encoder.trunk.blocks.17.norm1.bias', 'sam_mask_decoder.transformer.layers.0.norm4.bias', 'image_encoder.trunk.blocks.38.norm2.bias', 'sam_mask_decoder.transformer.final_attn_token_to_image.k_proj.bias', 'image_encoder.trunk.blocks.47.mlp.layers.1.bias', 'image_encoder.trunk.blocks.12.mlp.layers.1.bias', 'image_encoder.trunk.blocks.13.attn.proj.bias', 'image_encoder.trunk.blocks.15.norm2.bias', 'image_encoder.trunk.blocks.28.attn.proj.bias', 'image_encoder.trunk.blocks.7.mlp.layers.1.bias', 'image_encoder.trunk.blocks.31.mlp.layers.0.bias', 'image_encoder.trunk.blocks.10.norm2.bias', 'image_encoder.trunk.blocks.4.norm2.bias', 'image_encoder.trunk.blocks.25.attn.proj.bias', 'image_encoder.trunk.blocks.43.norm1.bias', 'image_encoder.trunk.blocks.16.norm2.bias', 'image_encoder.trunk.blocks.37.norm2.bias', 'image_encoder.trunk.blocks.7.norm1.bias', 'image_encoder.trunk.blocks.31.norm1.bias', 'image_encoder.trunk.blocks.1.norm2.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_image_to_token.v_proj.bias', 'image_encoder.trunk.blocks.21.mlp.layers.0.bias', 'image_encoder.trunk.blocks.28.norm1.bias', 'image_encoder.trunk.blocks.35.mlp.layers.1.bias', 'image_encoder.trunk.blocks.42.mlp.layers.0.bias', 'sam_mask_decoder.transformer.layers.1.self_attn.out_proj.bias', 'image_encoder.trunk.blocks.46.attn.qkv.bias', 'image_encoder.trunk.blocks.46.norm2.bias', 'image_encoder.trunk.blocks.42.mlp.layers.1.bias', 'image_encoder.trunk.blocks.45.mlp.layers.0.bias', 'image_encoder.trunk.blocks.22.mlp.layers.1.bias', 'memory_encoder.out_proj.bias', 'image_encoder.trunk.blocks.0.attn.qkv.bias', 'image_encoder.trunk.blocks.33.norm2.bias', 'sam_mask_decoder.transformer.layers.1.self_attn.k_proj.bias', 'sam_mask_decoder.transformer.layers.1.norm2.bias', 'image_encoder.trunk.blocks.10.mlp.layers.1.bias', 'sam_mask_decoder.transformer.layers.1.norm3.bias', 
'image_encoder.trunk.blocks.11.attn.proj.bias', 'image_encoder.trunk.blocks.9.norm2.bias', 'image_encoder.trunk.blocks.16.mlp.layers.1.bias', 'sam_mask_decoder.output_hypernetworks_mlps.3.layers.0.bias', 'image_encoder.trunk.blocks.47.norm1.bias', 'memory_encoder.mask_downsampler.encoder.3.bias', 'memory_attention.layers.3.self_attn.q_proj.bias', 'image_encoder.trunk.blocks.36.mlp.layers.0.bias', 'image_encoder.trunk.blocks.1.mlp.layers.1.bias', 'sam_mask_decoder.conv_s0.bias', 'memory_attention.layers.1.self_attn.k_proj.bias', 'memory_attention.norm.bias', 'memory_attention.layers.1.norm2.bias', 'image_encoder.trunk.blocks.18.norm1.bias', 'sam_mask_decoder.transformer.layers.0.norm3.bias', 'sam_mask_decoder.transformer.layers.0.self_attn.q_proj.bias', 'image_encoder.trunk.blocks.38.mlp.layers.1.bias', 'memory_attention.layers.3.linear2.bias', 'image_encoder.trunk.blocks.43.norm2.bias', 'sam_prompt_encoder.mask_downscaling.6.bias', 'image_encoder.trunk.blocks.10.norm1.bias', 'image_encoder.trunk.blocks.12.attn.qkv.bias', 'image_encoder.trunk.blocks.22.mlp.layers.0.bias', 'image_encoder.trunk.blocks.35.mlp.layers.0.bias', 'image_encoder.trunk.blocks.44.mlp.layers.0.bias', 'sam_prompt_encoder.mask_downscaling.1.bias', 'image_encoder.trunk.blocks.20.norm2.bias', 'image_encoder.trunk.blocks.9.attn.proj.bias', 'memory_encoder.fuser.layers.1.norm.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_token_to_image.k_proj.bias', 'image_encoder.trunk.blocks.33.attn.proj.bias', 'image_encoder.trunk.blocks.43.mlp.layers.0.bias', 'image_encoder.trunk.blocks.34.mlp.layers.0.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_image_to_token.q_proj.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_image_to_token.q_proj.bias', 'image_encoder.trunk.blocks.15.mlp.layers.0.bias', 'image_encoder.trunk.blocks.32.norm2.bias', 'image_encoder.trunk.blocks.32.attn.proj.bias', 'image_encoder.trunk.blocks.9.mlp.layers.0.bias', 'memory_attention.layers.2.linear2.bias', 
'image_encoder.trunk.blocks.18.mlp.layers.1.bias', 'image_encoder.trunk.blocks.32.attn.qkv.bias', 'image_encoder.trunk.blocks.41.mlp.layers.0.bias', 'memory_attention.layers.0.cross_attn_image.v_proj.bias', 'image_encoder.trunk.blocks.2.attn.qkv.bias', 'image_encoder.trunk.blocks.11.attn.qkv.bias', 'image_encoder.trunk.blocks.42.norm2.bias', 'image_encoder.trunk.blocks.15.norm1.bias', 'image_encoder.trunk.blocks.18.attn.qkv.bias', 'image_encoder.trunk.blocks.46.norm1.bias', 'image_encoder.trunk.blocks.23.mlp.layers.1.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_token_to_image.out_proj.bias', 'sam_mask_decoder.pred_obj_score_head.layers.0.bias', 'memory_encoder.mask_downsampler.encoder.12.bias', 'sam_mask_decoder.transformer.final_attn_token_to_image.q_proj.bias', 'image_encoder.trunk.blocks.8.proj.bias', 'sam_mask_decoder.output_hypernetworks_mlps.2.layers.0.bias', 'image_encoder.trunk.blocks.30.attn.qkv.bias', 'image_encoder.trunk.blocks.11.norm1.bias', 'image_encoder.trunk.blocks.45.mlp.layers.1.bias', 'image_encoder.trunk.blocks.22.norm1.bias', 'sam_prompt_encoder.mask_downscaling.0.bias', 'image_encoder.trunk.blocks.4.mlp.layers.0.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_image_to_token.v_proj.bias', 'sam_mask_decoder.transformer.layers.1.norm1.bias', 'image_encoder.trunk.blocks.7.norm2.bias'}
INFO 2025-01-05 01:19:47,847 optimizer.py: 220: Matches for module_cls_name [torch.nn.LayerNorm]: {'sam_mask_decoder.transformer.layers.1.norm2.weight', 'image_encoder.trunk.blocks.44.norm2.bias', 'image_encoder.trunk.blocks.15.norm2.bias', 'memory_attention.layers.2.norm3.bias', 'memory_attention.layers.3.norm2.bias', 'image_encoder.trunk.blocks.10.norm2.bias', 'image_encoder.trunk.blocks.4.norm2.bias', 'image_encoder.trunk.blocks.6.norm2.bias', 'image_encoder.trunk.blocks.21.norm2.weight', 'image_encoder.trunk.blocks.24.norm2.bias', 'image_encoder.trunk.blocks.43.norm1.bias', 'image_encoder.trunk.blocks.16.norm2.bias', 
'image_encoder.trunk.blocks.37.norm2.bias', 'image_encoder.trunk.blocks.7.norm1.bias', 'image_encoder.trunk.blocks.2.norm1.bias', 'image_encoder.trunk.blocks.34.norm1.bias', 'image_encoder.trunk.blocks.3.norm1.weight', 'image_encoder.trunk.blocks.25.norm1.bias', 'image_encoder.trunk.blocks.28.norm2.weight', 'image_encoder.trunk.blocks.30.norm2.weight', 'image_encoder.trunk.blocks.23.norm2.weight', 'image_encoder.trunk.blocks.29.norm2.weight', 'image_encoder.trunk.blocks.36.norm1.weight', 'image_encoder.trunk.blocks.37.norm2.weight', 'image_encoder.trunk.blocks.31.norm1.bias', 'image_encoder.trunk.blocks.1.norm2.bias', 'image_encoder.trunk.blocks.18.norm2.weight', 'image_encoder.trunk.blocks.36.norm2.bias', 'image_encoder.trunk.blocks.39.norm2.bias', 'image_encoder.trunk.blocks.1.norm1.bias', 'image_encoder.trunk.blocks.40.norm2.weight', 'image_encoder.trunk.blocks.38.norm1.weight', 'image_encoder.trunk.blocks.28.norm1.bias', 'sam_mask_decoder.transformer.layers.0.norm2.bias', 'image_encoder.trunk.blocks.31.norm2.weight', 'sam_mask_decoder.transformer.layers.0.norm3.weight', 'image_encoder.trunk.blocks.32.norm1.weight', 'image_encoder.trunk.blocks.0.norm1.weight', 'image_encoder.trunk.blocks.4.norm1.bias', 'image_encoder.trunk.blocks.7.norm1.weight', 'memory_attention.norm.weight', 'image_encoder.trunk.blocks.34.norm1.weight', 'image_encoder.trunk.blocks.3.norm2.bias', 'image_encoder.trunk.blocks.28.norm1.weight', 'image_encoder.trunk.blocks.5.norm2.weight', 'image_encoder.trunk.blocks.29.norm2.bias', 'image_encoder.trunk.blocks.44.norm2.weight', 'image_encoder.trunk.blocks.2.norm1.weight', 'image_encoder.trunk.blocks.24.norm2.weight', 'memory_attention.layers.1.norm2.weight', 'image_encoder.trunk.blocks.8.norm2.bias', 'image_encoder.trunk.blocks.46.norm2.bias', 'image_encoder.trunk.blocks.37.norm1.weight', 'memory_attention.layers.3.norm2.weight', 'sam_mask_decoder.transformer.norm_final_attn.weight', 'image_encoder.trunk.blocks.21.norm1.bias', 
'image_encoder.trunk.blocks.30.norm1.weight', 'image_encoder.trunk.blocks.31.norm2.bias', 'sam_mask_decoder.transformer.layers.1.norm4.bias', 'image_encoder.trunk.blocks.34.norm2.bias', 'memory_attention.layers.0.norm3.weight', 'image_encoder.trunk.blocks.1.norm1.weight', 'image_encoder.trunk.blocks.33.norm2.bias', 'sam_mask_decoder.transformer.layers.1.norm2.bias', 'image_encoder.trunk.blocks.14.norm2.weight', 'sam_mask_decoder.transformer.layers.1.norm3.bias', 'image_encoder.trunk.blocks.9.norm2.bias', 'image_encoder.trunk.blocks.13.norm1.weight', 'image_encoder.trunk.blocks.47.norm1.weight', 'memory_attention.layers.1.norm1.weight', 'image_encoder.trunk.blocks.25.norm2.weight', 'image_encoder.trunk.blocks.5.norm1.bias', 'image_encoder.trunk.blocks.26.norm2.weight', 'image_encoder.trunk.blocks.36.norm1.bias', 'image_encoder.trunk.blocks.9.norm1.weight', 'image_encoder.trunk.blocks.17.norm2.bias', 'image_encoder.trunk.blocks.34.norm2.weight', 'image_encoder.trunk.blocks.14.norm1.weight', 'image_encoder.trunk.blocks.43.norm2.weight', 'image_encoder.trunk.blocks.47.norm1.bias', 'memory_attention.layers.0.norm1.bias', 'sam_mask_decoder.transformer.layers.1.norm4.weight', 'image_encoder.trunk.blocks.39.norm1.weight', 'image_encoder.trunk.blocks.35.norm1.bias', 'image_encoder.trunk.blocks.15.norm2.weight', 'image_encoder.trunk.blocks.2.norm2.bias', 'image_encoder.trunk.blocks.43.norm1.weight', 'memory_attention.layers.1.norm3.weight', 'image_encoder.trunk.blocks.27.norm2.bias', 'image_encoder.trunk.blocks.35.norm1.weight', 'image_encoder.trunk.blocks.29.norm1.bias', 'memory_attention.norm.bias', 'memory_attention.layers.1.norm2.bias', 'image_encoder.trunk.blocks.12.norm1.weight', 'image_encoder.trunk.blocks.19.norm2.bias', 'image_encoder.trunk.blocks.18.norm1.bias', 'image_encoder.trunk.blocks.20.norm1.bias', 'image_encoder.trunk.blocks.33.norm1.bias', 'image_encoder.trunk.blocks.37.norm1.bias', 'image_encoder.trunk.blocks.39.norm1.bias', 
'memory_attention.layers.1.norm1.bias', 'memory_attention.layers.2.norm1.bias', 'sam_mask_decoder.transformer.layers.0.norm3.bias', 'image_encoder.trunk.blocks.26.norm1.bias', 'image_encoder.trunk.blocks.13.norm2.weight', 'image_encoder.trunk.blocks.0.norm1.bias', 'sam_mask_decoder.transformer.layers.1.norm3.weight', 'image_encoder.trunk.blocks.4.norm1.weight', 'memory_attention.layers.2.norm1.weight', 'sam_mask_decoder.transformer.norm_final_attn.bias', 'image_encoder.trunk.blocks.20.norm2.weight', 'image_encoder.trunk.blocks.42.norm1.weight', 'image_encoder.trunk.blocks.43.norm2.bias', 'image_encoder.trunk.blocks.24.norm1.bias', 'memory_attention.layers.0.norm2.bias', 'image_encoder.trunk.blocks.18.norm1.weight', 'image_encoder.trunk.blocks.6.norm1.weight', 'image_encoder.trunk.blocks.41.norm1.weight', 'image_encoder.trunk.blocks.47.norm2.weight', 'memory_attention.layers.2.norm2.weight', 'sam_mask_decoder.transformer.layers.0.norm1.bias', 'image_encoder.trunk.blocks.7.norm2.weight', 'image_encoder.trunk.blocks.19.norm2.weight', 'image_encoder.trunk.blocks.13.norm1.bias', 'image_encoder.trunk.blocks.20.norm1.weight', 'image_encoder.trunk.blocks.41.norm1.bias', 'image_encoder.trunk.blocks.10.norm1.bias', 'image_encoder.trunk.blocks.30.norm1.bias', 'image_encoder.trunk.blocks.18.norm2.bias', 'image_encoder.trunk.blocks.40.norm2.bias', 'image_encoder.trunk.blocks.26.norm2.bias', 'image_encoder.trunk.blocks.41.norm2.bias', 'image_encoder.trunk.blocks.36.norm2.weight', 'image_encoder.trunk.blocks.46.norm1.weight', 'image_encoder.trunk.blocks.23.norm1.bias', 'image_encoder.trunk.blocks.42.norm1.bias', 'image_encoder.trunk.blocks.15.norm1.weight', 'image_encoder.trunk.blocks.25.norm1.weight', 'image_encoder.trunk.blocks.40.norm1.weight', 'memory_attention.layers.3.norm3.weight', 'image_encoder.trunk.blocks.33.norm1.weight', 'image_encoder.trunk.blocks.20.norm2.bias', 'sam_mask_decoder.transformer.layers.1.norm1.weight', 'image_encoder.trunk.blocks.33.norm2.weight', 
'image_encoder.trunk.blocks.22.norm2.bias', 'image_encoder.trunk.blocks.3.norm1.bias', 'image_encoder.trunk.blocks.46.norm2.weight', 'image_encoder.trunk.blocks.39.norm2.weight', 'image_encoder.trunk.blocks.19.norm1.bias', 'image_encoder.trunk.blocks.9.norm1.bias', 'image_encoder.trunk.blocks.12.norm2.weight', 'image_encoder.trunk.blocks.5.norm2.bias', 'memory_attention.layers.0.norm3.bias', 'image_encoder.trunk.blocks.23.norm2.bias', 'sam_mask_decoder.transformer.layers.0.norm2.weight', 'image_encoder.trunk.blocks.8.norm1.bias', 'image_encoder.trunk.blocks.11.norm2.weight', 'image_encoder.trunk.blocks.6.norm1.bias', 'image_encoder.trunk.blocks.3.norm2.weight', 'image_encoder.trunk.blocks.32.norm2.bias', 'sam_mask_decoder.transformer.layers.0.norm1.weight', 'image_encoder.trunk.blocks.11.norm1.weight', 'image_encoder.trunk.blocks.32.norm1.bias', 'sam_mask_decoder.transformer.layers.0.norm4.weight', 'image_encoder.trunk.blocks.24.norm1.weight', 'image_encoder.trunk.blocks.23.norm1.weight', 'image_encoder.trunk.blocks.8.norm1.weight', 'image_encoder.trunk.blocks.35.norm2.bias', 'image_encoder.trunk.blocks.27.norm1.weight', 'memory_attention.layers.0.norm1.weight', 'image_encoder.trunk.blocks.11.norm2.bias', 'image_encoder.trunk.blocks.8.norm2.weight', 'image_encoder.trunk.blocks.12.norm2.bias', 'image_encoder.trunk.blocks.28.norm2.bias', 'image_encoder.trunk.blocks.47.norm2.bias', 'image_encoder.trunk.blocks.0.norm2.bias', 'memory_attention.layers.1.norm3.bias', 'image_encoder.trunk.blocks.10.norm2.weight', 'image_encoder.trunk.blocks.9.norm2.weight', 'image_encoder.trunk.blocks.21.norm2.bias', 'image_encoder.trunk.blocks.1.norm2.weight', 'image_encoder.trunk.blocks.16.norm2.weight', 'image_encoder.trunk.blocks.17.norm2.weight', 'image_encoder.trunk.blocks.25.norm2.bias', 'image_encoder.trunk.blocks.45.norm2.weight', 'image_encoder.trunk.blocks.38.norm1.bias', 'image_encoder.trunk.blocks.42.norm2.weight', 'image_encoder.trunk.blocks.5.norm1.weight', 
'image_encoder.trunk.blocks.35.norm2.weight', 'image_encoder.trunk.blocks.44.norm1.weight', 'image_encoder.trunk.blocks.45.norm1.weight', 'image_encoder.trunk.blocks.40.norm1.bias', 'image_encoder.trunk.blocks.26.norm1.weight', 'image_encoder.trunk.blocks.22.norm2.weight', 'image_encoder.trunk.blocks.16.norm1.weight', 'memory_attention.layers.3.norm1.bias', 'image_encoder.trunk.blocks.42.norm2.bias', 'image_encoder.trunk.blocks.15.norm1.bias', 'image_encoder.trunk.blocks.45.norm2.bias', 'image_encoder.trunk.blocks.0.norm2.weight', 'memory_attention.layers.2.norm2.bias', 'image_encoder.trunk.blocks.6.norm2.weight', 'image_encoder.trunk.blocks.46.norm1.bias', 'image_encoder.trunk.blocks.10.norm1.weight', 'image_encoder.trunk.blocks.19.norm1.weight', 'memory_attention.layers.0.norm2.weight', 'image_encoder.trunk.blocks.21.norm1.weight', 'image_encoder.trunk.blocks.27.norm1.bias', 'image_encoder.trunk.blocks.22.norm1.weight', 'image_encoder.trunk.blocks.32.norm2.weight', 'image_encoder.trunk.blocks.38.norm2.weight', 'image_encoder.trunk.blocks.27.norm2.weight', 'image_encoder.trunk.blocks.14.norm2.bias', 'image_encoder.trunk.blocks.44.norm1.bias', 'image_encoder.trunk.blocks.13.norm2.bias', 'image_encoder.trunk.blocks.30.norm2.bias', 'image_encoder.trunk.blocks.16.norm1.bias', 'image_encoder.trunk.blocks.31.norm1.weight', 'image_encoder.trunk.blocks.41.norm2.weight', 'memory_attention.layers.3.norm1.weight', 'image_encoder.trunk.blocks.11.norm1.bias', 'image_encoder.trunk.blocks.2.norm2.weight', 'image_encoder.trunk.blocks.14.norm1.bias', 'image_encoder.trunk.blocks.22.norm1.bias', 'image_encoder.trunk.blocks.29.norm1.weight', 'image_encoder.trunk.blocks.17.norm1.weight', 'image_encoder.trunk.blocks.17.norm1.bias', 'sam_mask_decoder.transformer.layers.0.norm4.bias', 'sam_mask_decoder.transformer.layers.1.norm1.bias', 'image_encoder.trunk.blocks.7.norm2.bias', 'image_encoder.trunk.blocks.38.norm2.bias', 'image_encoder.trunk.blocks.4.norm2.weight', 
'image_encoder.trunk.blocks.12.norm1.bias', 'image_encoder.trunk.blocks.45.norm1.bias', 'memory_attention.layers.3.norm3.bias', 'memory_attention.layers.2.norm3.weight'}
INFO 2025-01-05 01:19:48,207 sam2_datasets.py: 125: Dataset mixing probabilities: [1.0]
INFO 2025-01-05 01:22:35,346 train_utils.py: 108: MACHINE SEED: 4920
INFO 2025-01-05 01:22:35,348 train_utils.py: 154: Logging ENV_VARIABLES
INFO 2025-01-05 01:22:35,348 train_utils.py: 155: BROWSER=/home/hossein/.vscode-server/bin/1a5daa3a0231a0fbba4f14db7ec463cf99d7768e/bin/helpers/browser.sh COLORTERM=truecolor CONDA_DEFAULT_ENV=sam2 CONDA_EXE=/home/hossein/miniconda3/bin/conda CONDA_PREFIX=/ephemeral/hossein/envs/sam2 CONDA_PREFIX_1=/home/hossein/miniconda3 CONDA_PROMPT_MODIFIER=(sam2) CONDA_PYTHON_EXE=/home/hossein/miniconda3/bin/python CONDA_SHLVL=2 CUDA_MODULE_LOADING=LAZY DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/2095/bus GIT_ASKPASS=/home/hossein/.vscode-server/bin/1a5daa3a0231a0fbba4f14db7ec463cf99d7768e/extensions/git/dist/askpass.sh HF_HOME=/ephemeral/ HISTSIZE=2000 HISTTIMEFORMAT=%F %T HOME=/home/hossein HYDRA_FULL_ERROR=1 LANG=C.UTF-8 LESSCLOSE=/usr/bin/lesspipe %s %s LESSOPEN=| /usr/bin/lesspipe %s LOCAL_RANK=0 LOGNAME=hossein 
LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.webp=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36: MASTER_ADDR=localhost MASTER_PORT=16808 MOTD_SHOWN=pam NCCL_TOPO_FILE=/etc/nccl-topo-h100-v1.xml OLDPWD=/home/hossein/hossein/projects/sam2 PATH=/home/hossein/.cursor-server/cli/servers/Stable-fe574d0820377383143b2ea26aa6ae28b3425220/server/bin/remote-cli:/ephemeral/hossein/envs/sam2/bin:/home/hossein/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin PWD=/home/hossein/hossein/projects/sam2/training 
PYTHON_PATH=/home/hossein/hossein/projects/hybrid_model_training:/home/hossein/hossein/projects/hybrid_model_training:/home/hossein/hossein/projects/hybrid_model_training: RANK=0 SHELL=/bin/bash SHLVL=2 SSH_CLIENT=142.186.28.106 64524 22 SSH_CONNECTION=110.238.90.22 3000 10.0.1.99 22 TERM=screen TERM_PROGRAM=tmux TERM_PROGRAM_VERSION=3.2a TMUX=/tmp/tmux-2095/default,727396,5 TMUX_PANE=%5 TORCH_NCCL_ASYNC_ERROR_HANDLING=1 USER=hossein VSCODE_GIT_ASKPASS_EXTRA_ARGS= VSCODE_GIT_ASKPASS_MAIN=/home/hossein/.vscode-server/bin/1a5daa3a0231a0fbba4f14db7ec463cf99d7768e/extensions/git/dist/askpass-main.js VSCODE_GIT_ASKPASS_NODE=/home/hossein/.vscode-server/bin/1a5daa3a0231a0fbba4f14db7ec463cf99d7768e/node VSCODE_GIT_IPC_HANDLE=/run/user/2095/vscode-git-cd38edda58.sock VSCODE_IPC_HOOK_CLI=/run/user/2095/vscode-ipc-e3cd88d8-a6c9-4e22-89a9-8e26349b2914.sock WORLD_SIZE=4 XDG_DATA_DIRS=/usr/local/share:/usr/share:/var/lib/snapd/desktop XDG_RUNTIME_DIR=/run/user/2095 XDG_SESSION_CLASS=user XDG_SESSION_ID=524 XDG_SESSION_TYPE=tty _=/ephemeral/hossein/envs/sam2/bin/python _CE_CONDA= _CE_M=
INFO 2025-01-05 01:22:35,348 trainer.py: 989: Setting up components: Model, loss, optim, meters etc.
INFO 2025-01-05 01:22:35,349 logger.py: 66: TensorBoard SummaryWriter instantiated. 
Files will be stored in: /ephemeral/hossein/output/sam2/tensorboard INFO 2025-01-05 01:22:38,998 sam2.py: 81: Training with points (sampled from masks) as inputs with p=0.5 INFO 2025-01-05 01:22:39,001 trainer.py:1059: ==================== INFO 2025-01-05 01:22:39,001 trainer.py:1060: Summary for model INFO 2025-01-05 01:22:39,004 trainer.py:1061: Model is SAM2Train( (image_encoder): ImageEncoder( (trunk): Hiera( (patch_embed): PatchEmbed( (proj): Conv2d(3, 144, kernel_size=(7, 7), stride=(4, 4), padding=(3, 3)) ) (blocks): ModuleList( (0-1): 2 x MultiScaleBlock( (norm1): LayerNorm((144,), eps=1e-06, elementwise_affine=True) (attn): MultiScaleAttention( (qkv): Linear(in_features=144, out_features=432, bias=True) (proj): Linear(in_features=144, out_features=144, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((144,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=144, out_features=576, bias=True) (1): Linear(in_features=576, out_features=144, bias=True) ) (act): GELU(approximate='none') ) ) (2): MultiScaleBlock( (norm1): LayerNorm((144,), eps=1e-06, elementwise_affine=True) (pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False) (attn): MultiScaleAttention( (q_pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False) (qkv): Linear(in_features=144, out_features=864, bias=True) (proj): Linear(in_features=288, out_features=288, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((288,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=288, out_features=1152, bias=True) (1): Linear(in_features=1152, out_features=288, bias=True) ) (act): GELU(approximate='none') ) (proj): Linear(in_features=144, out_features=288, bias=True) ) (3-7): 5 x MultiScaleBlock( (norm1): LayerNorm((288,), eps=1e-06, elementwise_affine=True) (attn): MultiScaleAttention( (qkv): Linear(in_features=288, out_features=864, 
bias=True) (proj): Linear(in_features=288, out_features=288, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((288,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=288, out_features=1152, bias=True) (1): Linear(in_features=1152, out_features=288, bias=True) ) (act): GELU(approximate='none') ) ) (8): MultiScaleBlock( (norm1): LayerNorm((288,), eps=1e-06, elementwise_affine=True) (pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False) (attn): MultiScaleAttention( (q_pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False) (qkv): Linear(in_features=288, out_features=1728, bias=True) (proj): Linear(in_features=576, out_features=576, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((576,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=576, out_features=2304, bias=True) (1): Linear(in_features=2304, out_features=576, bias=True) ) (act): GELU(approximate='none') ) (proj): Linear(in_features=288, out_features=576, bias=True) ) (9-43): 35 x MultiScaleBlock( (norm1): LayerNorm((576,), eps=1e-06, elementwise_affine=True) (attn): MultiScaleAttention( (qkv): Linear(in_features=576, out_features=1728, bias=True) (proj): Linear(in_features=576, out_features=576, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((576,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=576, out_features=2304, bias=True) (1): Linear(in_features=2304, out_features=576, bias=True) ) (act): GELU(approximate='none') ) ) (44): MultiScaleBlock( (norm1): LayerNorm((576,), eps=1e-06, elementwise_affine=True) (pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False) (attn): MultiScaleAttention( (q_pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False) (qkv): Linear(in_features=576, out_features=3456, 
bias=True) (proj): Linear(in_features=1152, out_features=1152, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((1152,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=1152, out_features=4608, bias=True) (1): Linear(in_features=4608, out_features=1152, bias=True) ) (act): GELU(approximate='none') ) (proj): Linear(in_features=576, out_features=1152, bias=True) ) (45-47): 3 x MultiScaleBlock( (norm1): LayerNorm((1152,), eps=1e-06, elementwise_affine=True) (attn): MultiScaleAttention( (qkv): Linear(in_features=1152, out_features=3456, bias=True) (proj): Linear(in_features=1152, out_features=1152, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((1152,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=1152, out_features=4608, bias=True) (1): Linear(in_features=4608, out_features=1152, bias=True) ) (act): GELU(approximate='none') ) ) ) ) (neck): FpnNeck( (position_encoding): PositionEmbeddingSine() (convs): ModuleList( (0): Sequential( (conv): Conv2d(1152, 256, kernel_size=(1, 1), stride=(1, 1)) ) (1): Sequential( (conv): Conv2d(576, 256, kernel_size=(1, 1), stride=(1, 1)) ) (2): Sequential( (conv): Conv2d(288, 256, kernel_size=(1, 1), stride=(1, 1)) ) (3): Sequential( (conv): Conv2d(144, 256, kernel_size=(1, 1), stride=(1, 1)) ) ) ) ) (mask_downsample): Conv2d(1, 1, kernel_size=(4, 4), stride=(4, 4)) (memory_attention): MemoryAttention( (layers): ModuleList( (0-3): 4 x MemoryAttentionLayer( (self_attn): RoPEAttention( (q_proj): Linear(in_features=256, out_features=256, bias=True) (k_proj): Linear(in_features=256, out_features=256, bias=True) (v_proj): Linear(in_features=256, out_features=256, bias=True) (out_proj): Linear(in_features=256, out_features=256, bias=True) ) (cross_attn_image): RoPEAttention( (q_proj): Linear(in_features=256, out_features=256, bias=True) (k_proj): Linear(in_features=64, out_features=256, bias=True) (v_proj): Linear(in_features=64, 
out_features=256, bias=True) (out_proj): Linear(in_features=256, out_features=256, bias=True) ) (linear1): Linear(in_features=256, out_features=2048, bias=True) (dropout): Dropout(p=0.1, inplace=False) (linear2): Linear(in_features=2048, out_features=256, bias=True) (norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (norm3): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) ) ) (norm): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) (memory_encoder): MemoryEncoder( (mask_downsampler): MaskDownSampler( (encoder): Sequential( (0): Conv2d(1, 4, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (1): LayerNorm2d() (2): GELU(approximate='none') (3): Conv2d(4, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (4): LayerNorm2d() (5): GELU(approximate='none') (6): Conv2d(16, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (7): LayerNorm2d() (8): GELU(approximate='none') (9): Conv2d(64, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (10): LayerNorm2d() (11): GELU(approximate='none') (12): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1)) ) ) (pix_feat_proj): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1)) (fuser): Fuser( (proj): Identity() (layers): ModuleList( (0-1): 2 x CXBlock( (dwconv): Conv2d(256, 256, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=256) (norm): LayerNorm2d() (pwconv1): Linear(in_features=256, out_features=1024, bias=True) (act): GELU(approximate='none') (pwconv2): Linear(in_features=1024, out_features=256, bias=True) (drop_path): Identity() ) ) ) (position_encoding): PositionEmbeddingSine() (out_proj): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1)) ) (sam_prompt_encoder): PromptEncoder( (pe_layer): PositionEmbeddingRandom() (point_embeddings): ModuleList( (0-3): 4 x Embedding(1, 256) ) 
(not_a_point_embed): Embedding(1, 256) (mask_downscaling): Sequential( (0): Conv2d(1, 4, kernel_size=(2, 2), stride=(2, 2)) (1): LayerNorm2d() (2): GELU(approximate='none') (3): Conv2d(4, 16, kernel_size=(2, 2), stride=(2, 2)) (4): LayerNorm2d() (5): GELU(approximate='none') (6): Conv2d(16, 256, kernel_size=(1, 1), stride=(1, 1)) ) (no_mask_embed): Embedding(1, 256) ) (sam_mask_decoder): MaskDecoder( (transformer): TwoWayTransformer( (layers): ModuleList( (0-1): 2 x TwoWayAttentionBlock( (self_attn): Attention( (q_proj): Linear(in_features=256, out_features=256, bias=True) (k_proj): Linear(in_features=256, out_features=256, bias=True) (v_proj): Linear(in_features=256, out_features=256, bias=True) (out_proj): Linear(in_features=256, out_features=256, bias=True) ) (norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (cross_attn_token_to_image): Attention( (q_proj): Linear(in_features=256, out_features=128, bias=True) (k_proj): Linear(in_features=256, out_features=128, bias=True) (v_proj): Linear(in_features=256, out_features=128, bias=True) (out_proj): Linear(in_features=128, out_features=256, bias=True) ) (norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=2048, bias=True) (1): Linear(in_features=2048, out_features=256, bias=True) ) (act): ReLU() ) (norm3): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (norm4): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (cross_attn_image_to_token): Attention( (q_proj): Linear(in_features=256, out_features=128, bias=True) (k_proj): Linear(in_features=256, out_features=128, bias=True) (v_proj): Linear(in_features=256, out_features=128, bias=True) (out_proj): Linear(in_features=128, out_features=256, bias=True) ) ) ) (final_attn_token_to_image): Attention( (q_proj): Linear(in_features=256, out_features=128, bias=True) (k_proj): Linear(in_features=256, out_features=128, bias=True) (v_proj): Linear(in_features=256, 
out_features=128, bias=True) (out_proj): Linear(in_features=128, out_features=256, bias=True) ) (norm_final_attn): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) (iou_token): Embedding(1, 256) (mask_tokens): Embedding(4, 256) (obj_score_token): Embedding(1, 256) (output_upscaling): Sequential( (0): ConvTranspose2d(256, 64, kernel_size=(2, 2), stride=(2, 2)) (1): LayerNorm2d() (2): GELU(approximate='none') (3): ConvTranspose2d(64, 32, kernel_size=(2, 2), stride=(2, 2)) (4): GELU(approximate='none') ) (conv_s0): Conv2d(256, 32, kernel_size=(1, 1), stride=(1, 1)) (conv_s1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1)) (output_hypernetworks_mlps): ModuleList( (0-3): 4 x MLP( (layers): ModuleList( (0-1): 2 x Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=32, bias=True) ) (act): ReLU() ) ) (iou_prediction_head): MLP( (layers): ModuleList( (0-1): 2 x Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) (act): ReLU() ) (pred_obj_score_head): MLP( (layers): ModuleList( (0-1): 2 x Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=1, bias=True) ) (act): ReLU() ) ) (obj_ptr_proj): MLP( (layers): ModuleList( (0-2): 3 x Linear(in_features=256, out_features=256, bias=True) ) (act): ReLU() ) (obj_ptr_tpos_proj): Linear(in_features=256, out_features=64, bias=True) ) INFO 2025-01-05 01:22:39,004 trainer.py:1062: Total parameters 224 M INFO 2025-01-05 01:22:39,004 trainer.py:1063: Trainable parameters 224 M INFO 2025-01-05 01:22:39,004 trainer.py:1066: Non-Trainable parameters 0 INFO 2025-01-05 01:22:39,004 trainer.py:1069: ==================== INFO 2025-01-05 01:22:39,007 trainer.py:1023: Finished setting up components: Model, loss, optim, meters etc. INFO 2025-01-05 01:22:39,007 trainer.py: 314: Moving components to device cuda:0 and local rank 0. 
INFO 2025-01-05 01:22:39,212 trainer.py: 320: Done moving components to device cuda:0 and local rank 0. INFO 2025-01-05 01:22:39,224 optimizer.py: 248: Matches for param_name [image_encoder.*]: {'image_encoder.trunk.blocks.22.mlp.layers.0.weight', 'image_encoder.trunk.blocks.19.mlp.layers.1.bias', 'image_encoder.trunk.blocks.30.attn.qkv.weight', 'image_encoder.trunk.blocks.6.mlp.layers.1.weight', 'image_encoder.trunk.blocks.42.mlp.layers.0.weight', 'image_encoder.trunk.blocks.30.attn.proj.bias', 'image_encoder.trunk.blocks.4.attn.qkv.weight', 'image_encoder.trunk.blocks.0.norm2.bias', 'image_encoder.trunk.blocks.27.mlp.layers.0.bias', 'image_encoder.trunk.blocks.2.attn.qkv.weight', 'image_encoder.trunk.blocks.31.norm2.bias', 'image_encoder.trunk.blocks.10.attn.proj.weight', 'image_encoder.trunk.blocks.27.attn.qkv.weight', 'image_encoder.trunk.blocks.7.attn.qkv.weight', 'image_encoder.trunk.blocks.11.mlp.layers.1.bias', 'image_encoder.trunk.blocks.44.proj.weight', 'image_encoder.trunk.blocks.30.mlp.layers.1.bias', 'image_encoder.trunk.blocks.37.attn.qkv.weight', 'image_encoder.trunk.blocks.17.norm1.weight', 'image_encoder.trunk.blocks.40.mlp.layers.0.bias', 'image_encoder.trunk.blocks.14.attn.qkv.weight', 'image_encoder.trunk.blocks.43.mlp.layers.0.bias', 'image_encoder.trunk.blocks.5.mlp.layers.1.weight', 'image_encoder.trunk.blocks.11.mlp.layers.0.bias', 'image_encoder.trunk.blocks.19.norm1.weight', 'image_encoder.trunk.blocks.26.mlp.layers.1.weight', 'image_encoder.trunk.blocks.44.mlp.layers.0.bias', 'image_encoder.trunk.blocks.5.mlp.layers.0.bias', 'image_encoder.trunk.blocks.28.attn.proj.weight', 'image_encoder.trunk.blocks.1.mlp.layers.0.weight', 'image_encoder.trunk.blocks.17.attn.qkv.weight', 'image_encoder.trunk.blocks.44.mlp.layers.0.weight', 'image_encoder.trunk.blocks.11.attn.proj.weight', 'image_encoder.trunk.blocks.16.attn.proj.bias', 'image_encoder.trunk.blocks.29.mlp.layers.0.weight', 'image_encoder.neck.convs.1.conv.weight', 
'image_encoder.trunk.blocks.0.attn.proj.weight', 'image_encoder.trunk.blocks.19.attn.qkv.weight', 'image_encoder.trunk.blocks.2.norm1.weight', 'image_encoder.trunk.blocks.1.norm2.weight', 'image_encoder.trunk.blocks.13.norm1.bias', 'image_encoder.trunk.blocks.17.mlp.layers.1.weight', 'image_encoder.trunk.blocks.3.mlp.layers.1.bias', 'image_encoder.trunk.blocks.14.mlp.layers.1.weight', 'image_encoder.trunk.blocks.7.mlp.layers.1.bias', 'image_encoder.trunk.blocks.42.mlp.layers.1.bias', 'image_encoder.trunk.blocks.17.norm2.weight', 'image_encoder.trunk.blocks.47.mlp.layers.0.bias', 'image_encoder.trunk.blocks.44.attn.proj.bias', 'image_encoder.trunk.blocks.3.norm2.bias', 'image_encoder.trunk.blocks.44.mlp.layers.1.bias', 'image_encoder.trunk.blocks.17.norm1.bias', 'image_encoder.trunk.blocks.8.mlp.layers.1.weight', 'image_encoder.trunk.blocks.24.attn.proj.weight', 'image_encoder.trunk.blocks.4.attn.qkv.bias', 'image_encoder.trunk.blocks.31.attn.proj.weight', 'image_encoder.trunk.blocks.8.attn.proj.bias', 'image_encoder.trunk.blocks.18.norm2.weight', 'image_encoder.trunk.blocks.10.norm1.bias', 'image_encoder.trunk.blocks.2.norm2.weight', 'image_encoder.trunk.blocks.32.norm2.bias', 'image_encoder.trunk.blocks.28.mlp.layers.1.bias', 'image_encoder.trunk.blocks.26.mlp.layers.0.weight', 'image_encoder.trunk.blocks.7.attn.proj.bias', 'image_encoder.trunk.blocks.36.attn.qkv.weight', 'image_encoder.trunk.blocks.21.attn.proj.bias', 'image_encoder.trunk.blocks.8.proj.weight', 'image_encoder.trunk.blocks.4.mlp.layers.0.weight', 'image_encoder.trunk.blocks.41.mlp.layers.1.weight', 'image_encoder.trunk.blocks.22.attn.qkv.bias', 'image_encoder.trunk.blocks.7.mlp.layers.1.weight', 'image_encoder.trunk.blocks.20.norm2.bias', 'image_encoder.trunk.blocks.28.mlp.layers.0.weight', 'image_encoder.trunk.blocks.4.mlp.layers.1.bias', 'image_encoder.trunk.blocks.8.attn.qkv.bias', 'image_encoder.trunk.blocks.39.norm1.weight', 'image_encoder.trunk.blocks.32.attn.qkv.bias', 
'image_encoder.trunk.blocks.25.norm2.bias', 'image_encoder.trunk.blocks.37.norm1.bias', 'image_encoder.trunk.blocks.42.norm1.weight', 'image_encoder.trunk.blocks.2.attn.proj.bias', 'image_encoder.trunk.blocks.8.mlp.layers.0.bias', 'image_encoder.trunk.blocks.34.mlp.layers.0.bias', 'image_encoder.trunk.blocks.2.mlp.layers.0.weight', 'image_encoder.trunk.blocks.27.norm1.bias', 'image_encoder.trunk.blocks.31.norm1.weight', 'image_encoder.trunk.blocks.9.norm1.weight', 'image_encoder.trunk.blocks.47.norm2.weight', 'image_encoder.trunk.blocks.1.norm1.bias', 'image_encoder.trunk.blocks.32.mlp.layers.0.bias', 'image_encoder.trunk.blocks.10.attn.proj.bias', 'image_encoder.trunk.blocks.32.mlp.layers.1.weight', 'image_encoder.trunk.blocks.20.attn.proj.weight', 'image_encoder.trunk.blocks.40.norm2.bias', 'image_encoder.trunk.blocks.21.attn.proj.weight', 'image_encoder.trunk.blocks.2.mlp.layers.0.bias', 'image_encoder.trunk.blocks.28.attn.qkv.bias', 'image_encoder.trunk.blocks.27.norm2.bias', 'image_encoder.trunk.blocks.5.attn.proj.bias', 'image_encoder.trunk.blocks.30.norm1.bias', 'image_encoder.trunk.blocks.43.norm2.bias', 'image_encoder.trunk.blocks.24.norm1.weight', 'image_encoder.trunk.blocks.37.norm2.bias', 'image_encoder.trunk.blocks.8.norm1.bias', 'image_encoder.trunk.blocks.13.norm2.weight', 'image_encoder.trunk.blocks.5.mlp.layers.1.bias', 'image_encoder.trunk.blocks.46.mlp.layers.0.bias', 'image_encoder.trunk.blocks.7.norm2.bias', 'image_encoder.trunk.blocks.29.mlp.layers.1.weight', 'image_encoder.trunk.blocks.13.norm1.weight', 'image_encoder.trunk.blocks.41.norm1.bias', 'image_encoder.trunk.blocks.18.mlp.layers.1.bias', 'image_encoder.trunk.blocks.12.mlp.layers.0.weight', 'image_encoder.trunk.blocks.14.norm2.bias', 'image_encoder.trunk.blocks.25.attn.qkv.weight', 'image_encoder.trunk.blocks.33.attn.qkv.bias', 'image_encoder.trunk.blocks.37.attn.proj.weight', 'image_encoder.trunk.blocks.42.norm1.bias', 'image_encoder.trunk.blocks.1.norm2.bias', 
'image_encoder.trunk.blocks.12.norm2.weight', 'image_encoder.trunk.blocks.36.mlp.layers.1.weight', 'image_encoder.trunk.blocks.23.norm2.bias', 'image_encoder.trunk.blocks.40.norm1.bias', 'image_encoder.trunk.blocks.25.norm2.weight', 'image_encoder.trunk.blocks.33.norm1.bias', 'image_encoder.trunk.blocks.25.mlp.layers.0.weight', 'image_encoder.trunk.blocks.16.mlp.layers.1.bias', 'image_encoder.trunk.blocks.33.attn.proj.bias', 'image_encoder.trunk.blocks.29.norm1.bias', 'image_encoder.trunk.blocks.37.mlp.layers.1.bias', 'image_encoder.trunk.blocks.37.norm2.weight', 'image_encoder.trunk.blocks.13.attn.proj.bias', 'image_encoder.trunk.blocks.22.mlp.layers.0.bias', 'image_encoder.trunk.blocks.5.norm1.weight', 'image_encoder.trunk.blocks.25.mlp.layers.1.weight', 'image_encoder.trunk.blocks.34.attn.proj.weight', 'image_encoder.trunk.blocks.16.mlp.layers.1.weight', 'image_encoder.trunk.blocks.31.attn.qkv.weight', 'image_encoder.trunk.blocks.2.norm2.bias', 'image_encoder.trunk.blocks.26.mlp.layers.1.bias', 'image_encoder.trunk.blocks.44.norm1.bias', 'image_encoder.trunk.blocks.45.norm1.weight', 'image_encoder.trunk.blocks.26.attn.qkv.bias', 'image_encoder.trunk.patch_embed.proj.weight', 'image_encoder.trunk.blocks.0.attn.qkv.bias', 'image_encoder.trunk.blocks.9.mlp.layers.1.bias', 'image_encoder.trunk.blocks.9.mlp.layers.1.weight', 'image_encoder.trunk.blocks.35.norm2.weight', 'image_encoder.trunk.blocks.47.attn.qkv.weight', 'image_encoder.trunk.blocks.19.mlp.layers.0.weight', 'image_encoder.trunk.blocks.26.norm1.weight', 'image_encoder.trunk.blocks.39.mlp.layers.0.weight', 'image_encoder.trunk.blocks.37.norm1.weight', 'image_encoder.trunk.blocks.23.attn.proj.weight', 'image_encoder.trunk.blocks.11.attn.qkv.weight', 'image_encoder.trunk.blocks.21.attn.qkv.weight', 'image_encoder.trunk.blocks.41.attn.proj.bias', 'image_encoder.trunk.blocks.1.mlp.layers.1.bias', 'image_encoder.trunk.blocks.47.mlp.layers.1.bias', 'image_encoder.trunk.blocks.36.attn.qkv.bias', 
'image_encoder.trunk.blocks.37.mlp.layers.0.weight', 'image_encoder.trunk.blocks.45.attn.proj.weight', 'image_encoder.trunk.blocks.13.mlp.layers.0.weight', 'image_encoder.trunk.blocks.13.attn.qkv.bias', 'image_encoder.trunk.blocks.42.attn.proj.bias', 'image_encoder.trunk.blocks.35.mlp.layers.1.weight', 'image_encoder.trunk.blocks.38.attn.qkv.weight', 'image_encoder.trunk.blocks.7.norm1.bias', 'image_encoder.trunk.blocks.26.mlp.layers.0.bias', 'image_encoder.trunk.blocks.2.proj.weight', 'image_encoder.trunk.blocks.32.norm2.weight', 'image_encoder.trunk.blocks.43.norm2.weight', 'image_encoder.trunk.blocks.33.mlp.layers.0.weight', 'image_encoder.trunk.blocks.10.norm1.weight', 'image_encoder.trunk.blocks.0.norm2.weight', 'image_encoder.trunk.blocks.23.mlp.layers.0.weight', 'image_encoder.trunk.blocks.21.attn.qkv.bias', 'image_encoder.trunk.blocks.3.attn.qkv.weight', 'image_encoder.trunk.blocks.17.attn.qkv.bias', 'image_encoder.trunk.blocks.3.attn.qkv.bias', 'image_encoder.trunk.blocks.17.mlp.layers.1.bias', 'image_encoder.trunk.blocks.39.norm2.weight', 'image_encoder.trunk.blocks.16.norm2.weight', 'image_encoder.trunk.blocks.13.mlp.layers.0.bias', 'image_encoder.trunk.blocks.40.norm1.weight', 'image_encoder.trunk.blocks.39.attn.qkv.bias', 'image_encoder.trunk.blocks.1.mlp.layers.0.bias', 'image_encoder.trunk.blocks.22.attn.proj.weight', 'image_encoder.trunk.blocks.35.norm1.weight', 'image_encoder.trunk.blocks.34.norm2.weight', 'image_encoder.trunk.blocks.38.norm1.weight', 'image_encoder.trunk.blocks.27.attn.proj.weight', 'image_encoder.trunk.blocks.46.mlp.layers.0.weight', 'image_encoder.trunk.blocks.36.norm1.weight', 'image_encoder.trunk.blocks.27.norm1.weight', 'image_encoder.trunk.blocks.42.norm2.weight', 'image_encoder.trunk.blocks.43.attn.qkv.weight', 'image_encoder.trunk.blocks.5.attn.proj.weight', 'image_encoder.trunk.blocks.4.norm2.bias', 'image_encoder.trunk.blocks.25.attn.qkv.bias', 'image_encoder.trunk.blocks.26.norm2.weight', 
'image_encoder.trunk.blocks.23.attn.qkv.weight', 'image_encoder.trunk.blocks.3.mlp.layers.0.bias', 'image_encoder.trunk.blocks.8.mlp.layers.0.weight', 'image_encoder.trunk.blocks.13.attn.proj.weight', 'image_encoder.trunk.blocks.27.attn.qkv.bias', 'image_encoder.trunk.blocks.44.norm2.bias', 'image_encoder.trunk.blocks.0.attn.qkv.weight', 'image_encoder.trunk.blocks.44.attn.qkv.bias', 'image_encoder.trunk.blocks.12.norm1.bias', 'image_encoder.trunk.blocks.33.attn.proj.weight', 'image_encoder.trunk.blocks.10.mlp.layers.1.weight', 'image_encoder.trunk.blocks.30.norm2.weight', 'image_encoder.trunk.blocks.36.mlp.layers.1.bias', 'image_encoder.trunk.blocks.3.norm1.bias', 'image_encoder.trunk.blocks.7.norm2.weight', 'image_encoder.trunk.blocks.24.mlp.layers.1.weight', 'image_encoder.trunk.blocks.23.attn.proj.bias', 'image_encoder.trunk.blocks.35.norm1.bias', 'image_encoder.trunk.blocks.35.mlp.layers.1.bias', 'image_encoder.trunk.blocks.15.norm2.bias', 'image_encoder.trunk.blocks.31.norm1.bias', 'image_encoder.trunk.blocks.46.norm2.weight', 'image_encoder.trunk.blocks.2.mlp.layers.1.bias', 'image_encoder.trunk.blocks.8.mlp.layers.1.bias', 'image_encoder.trunk.blocks.20.norm1.weight', 'image_encoder.trunk.blocks.38.attn.proj.weight', 'image_encoder.trunk.blocks.47.attn.proj.bias', 'image_encoder.trunk.blocks.23.attn.qkv.bias', 'image_encoder.trunk.blocks.4.attn.proj.bias', 'image_encoder.trunk.blocks.11.norm2.bias', 'image_encoder.trunk.blocks.15.mlp.layers.1.bias', 'image_encoder.trunk.blocks.30.norm2.bias', 'image_encoder.trunk.blocks.23.mlp.layers.1.bias', 'image_encoder.trunk.blocks.3.mlp.layers.0.weight', 'image_encoder.trunk.blocks.6.attn.proj.bias', 'image_encoder.trunk.blocks.29.mlp.layers.0.bias', 'image_encoder.trunk.blocks.26.attn.proj.bias', 'image_encoder.trunk.blocks.13.mlp.layers.1.bias', 'image_encoder.trunk.blocks.18.norm2.bias', 'image_encoder.trunk.blocks.40.attn.qkv.weight', 'image_encoder.trunk.blocks.17.attn.proj.bias', 
'image_encoder.trunk.blocks.45.attn.proj.bias', 'image_encoder.trunk.blocks.1.attn.qkv.weight', 'image_encoder.trunk.blocks.0.norm1.bias', 'image_encoder.trunk.blocks.8.attn.qkv.weight', 'image_encoder.trunk.blocks.40.attn.proj.weight', 'image_encoder.trunk.blocks.2.proj.bias', 'image_encoder.trunk.blocks.44.proj.bias', 'image_encoder.neck.convs.1.conv.bias', 'image_encoder.trunk.blocks.18.attn.proj.weight', 'image_encoder.trunk.blocks.22.attn.qkv.weight', 'image_encoder.neck.convs.3.conv.bias', 'image_encoder.trunk.blocks.8.norm2.weight', 'image_encoder.trunk.blocks.9.attn.qkv.weight', 'image_encoder.trunk.blocks.41.attn.proj.weight', 'image_encoder.trunk.blocks.41.mlp.layers.0.bias', 'image_encoder.trunk.blocks.20.norm1.bias', 'image_encoder.trunk.blocks.45.mlp.layers.0.weight', 'image_encoder.trunk.blocks.18.attn.qkv.weight', 'image_encoder.trunk.blocks.32.attn.proj.bias', 'image_encoder.trunk.blocks.1.attn.proj.weight', 'image_encoder.trunk.blocks.29.mlp.layers.1.bias', 'image_encoder.trunk.blocks.33.norm2.bias', 'image_encoder.trunk.blocks.16.attn.qkv.weight', 'image_encoder.trunk.blocks.41.attn.qkv.bias', 'image_encoder.trunk.blocks.23.mlp.layers.1.weight', 'image_encoder.trunk.blocks.10.attn.qkv.bias', 'image_encoder.trunk.blocks.6.mlp.layers.0.weight', 'image_encoder.trunk.blocks.33.attn.qkv.weight', 'image_encoder.trunk.blocks.39.norm1.bias', 'image_encoder.trunk.blocks.32.mlp.layers.1.bias', 'image_encoder.trunk.blocks.36.mlp.layers.0.weight', 'image_encoder.trunk.blocks.15.attn.qkv.weight', 'image_encoder.trunk.blocks.23.mlp.layers.0.bias', 'image_encoder.trunk.blocks.46.attn.proj.bias', 'image_encoder.trunk.blocks.16.mlp.layers.0.weight', 'image_encoder.trunk.blocks.10.mlp.layers.0.bias', 'image_encoder.trunk.blocks.9.mlp.layers.0.bias', 'image_encoder.trunk.blocks.21.mlp.layers.1.bias', 'image_encoder.trunk.blocks.38.norm2.weight', 'image_encoder.trunk.blocks.22.norm2.bias', 'image_encoder.trunk.blocks.20.attn.qkv.bias', 
'image_encoder.trunk.blocks.34.norm2.bias', 'image_encoder.trunk.blocks.15.mlp.layers.1.weight', 'image_encoder.trunk.blocks.31.mlp.layers.1.bias', 'image_encoder.trunk.blocks.1.attn.qkv.bias', 'image_encoder.trunk.blocks.39.mlp.layers.0.bias', 'image_encoder.neck.convs.2.conv.bias', 'image_encoder.trunk.blocks.4.norm2.weight', 'image_encoder.neck.convs.3.conv.weight', 'image_encoder.trunk.blocks.26.attn.qkv.weight', 'image_encoder.trunk.blocks.0.mlp.layers.1.weight', 'image_encoder.trunk.blocks.18.mlp.layers.0.bias', 'image_encoder.trunk.blocks.8.attn.proj.weight', 'image_encoder.trunk.blocks.14.mlp.layers.1.bias', 'image_encoder.trunk.blocks.18.mlp.layers.1.weight', 'image_encoder.trunk.blocks.35.attn.proj.bias', 'image_encoder.trunk.blocks.31.mlp.layers.0.weight', 'image_encoder.trunk.blocks.44.mlp.layers.1.weight', 'image_encoder.trunk.blocks.39.mlp.layers.1.bias', 'image_encoder.trunk.blocks.37.mlp.layers.0.bias', 'image_encoder.trunk.blocks.3.norm2.weight', 'image_encoder.trunk.blocks.13.attn.qkv.weight', 'image_encoder.trunk.blocks.4.mlp.layers.1.weight', 'image_encoder.trunk.blocks.16.norm2.bias', 'image_encoder.trunk.blocks.15.mlp.layers.0.weight', 'image_encoder.trunk.blocks.32.mlp.layers.0.weight', 'image_encoder.trunk.blocks.23.norm1.weight', 'image_encoder.trunk.blocks.46.attn.qkv.weight', 'image_encoder.trunk.blocks.28.norm2.weight', 'image_encoder.trunk.blocks.12.mlp.layers.1.weight', 'image_encoder.trunk.blocks.36.attn.proj.weight', 'image_encoder.trunk.blocks.27.mlp.layers.1.bias', 'image_encoder.trunk.blocks.45.attn.qkv.bias', 'image_encoder.trunk.blocks.12.attn.qkv.bias', 'image_encoder.trunk.blocks.25.attn.proj.bias', 'image_encoder.trunk.blocks.28.mlp.layers.0.bias', 'image_encoder.trunk.blocks.15.mlp.layers.0.bias', 'image_encoder.trunk.blocks.2.norm1.bias', 'image_encoder.trunk.blocks.29.norm1.weight', 'image_encoder.trunk.blocks.44.attn.qkv.weight', 'image_encoder.trunk.blocks.15.attn.proj.weight', 
'image_encoder.trunk.blocks.40.attn.qkv.bias', 'image_encoder.trunk.blocks.17.mlp.layers.0.bias', 'image_encoder.trunk.blocks.12.attn.proj.bias', 'image_encoder.trunk.blocks.15.norm2.weight', 'image_encoder.trunk.blocks.23.norm1.bias', 'image_encoder.trunk.blocks.30.norm1.weight', 'image_encoder.trunk.blocks.24.norm2.weight', 'image_encoder.trunk.blocks.42.mlp.layers.0.bias', 'image_encoder.trunk.blocks.35.attn.qkv.bias', 'image_encoder.trunk.blocks.10.mlp.layers.0.weight', 'image_encoder.trunk.blocks.18.norm1.weight', 'image_encoder.trunk.blocks.26.norm2.bias', 'image_encoder.trunk.blocks.37.mlp.layers.1.weight', 'image_encoder.trunk.blocks.14.attn.proj.bias', 'image_encoder.trunk.blocks.16.mlp.layers.0.bias', 'image_encoder.trunk.blocks.36.mlp.layers.0.bias', 'image_encoder.trunk.blocks.34.mlp.layers.1.bias', 'image_encoder.trunk.blocks.36.norm2.weight', 'image_encoder.trunk.blocks.12.mlp.layers.0.bias', 'image_encoder.trunk.blocks.24.attn.qkv.bias', 'image_encoder.trunk.blocks.21.mlp.layers.1.weight', 'image_encoder.trunk.blocks.44.norm1.weight', 'image_encoder.trunk.blocks.30.attn.proj.weight', 'image_encoder.trunk.blocks.45.norm1.bias', 'image_encoder.trunk.blocks.24.mlp.layers.1.bias', 'image_encoder.trunk.blocks.43.mlp.layers.1.weight', 'image_encoder.trunk.blocks.34.attn.qkv.weight', 'image_encoder.trunk.blocks.47.mlp.layers.0.weight', 'image_encoder.trunk.blocks.10.attn.qkv.weight', 'image_encoder.trunk.blocks.6.mlp.layers.1.bias', 'image_encoder.trunk.blocks.43.attn.proj.bias', 'image_encoder.trunk.blocks.11.norm2.weight', 'image_encoder.trunk.blocks.45.attn.qkv.weight', 'image_encoder.trunk.blocks.7.mlp.layers.0.bias', 'image_encoder.trunk.blocks.15.attn.proj.bias', 'image_encoder.trunk.blocks.16.norm1.bias', 'image_encoder.trunk.blocks.47.norm1.bias', 'image_encoder.trunk.blocks.4.mlp.layers.0.bias', 'image_encoder.trunk.blocks.38.attn.proj.bias', 'image_encoder.trunk.blocks.34.mlp.layers.1.weight', 'image_encoder.trunk.blocks.9.mlp.layers.0.weight', 
'image_encoder.trunk.blocks.18.mlp.layers.0.weight', 'image_encoder.trunk.blocks.30.mlp.layers.0.bias', 'image_encoder.trunk.blocks.20.mlp.layers.1.weight', 'image_encoder.trunk.blocks.34.norm1.bias', 'image_encoder.neck.convs.0.conv.bias', 'image_encoder.trunk.blocks.2.attn.proj.weight', 'image_encoder.trunk.blocks.10.mlp.layers.1.bias', 'image_encoder.trunk.blocks.0.mlp.layers.0.bias', 'image_encoder.trunk.blocks.35.attn.qkv.weight', 'image_encoder.trunk.blocks.45.mlp.layers.1.bias', 'image_encoder.trunk.blocks.38.mlp.layers.1.bias', 'image_encoder.trunk.blocks.21.norm2.weight', 'image_encoder.trunk.blocks.18.attn.qkv.bias', 'image_encoder.trunk.blocks.30.mlp.layers.1.weight', 'image_encoder.trunk.blocks.22.mlp.layers.1.bias', 'image_encoder.trunk.blocks.20.attn.proj.bias', 'image_encoder.trunk.blocks.33.mlp.layers.1.weight', 'image_encoder.trunk.blocks.6.mlp.layers.0.bias', 'image_encoder.trunk.blocks.38.norm1.bias', 'image_encoder.trunk.blocks.24.mlp.layers.0.weight', 'image_encoder.trunk.blocks.5.norm2.weight', 'image_encoder.trunk.blocks.23.norm2.weight', 'image_encoder.trunk.blocks.30.attn.qkv.bias', 'image_encoder.trunk.blocks.47.attn.proj.weight', 'image_encoder.trunk.blocks.6.norm1.bias', 'image_encoder.trunk.blocks.40.attn.proj.bias', 'image_encoder.trunk.blocks.31.attn.proj.bias', 'image_encoder.trunk.blocks.11.norm1.bias', 'image_encoder.trunk.blocks.36.norm2.bias', 'image_encoder.trunk.blocks.43.norm1.bias', 'image_encoder.trunk.blocks.6.attn.qkv.bias', 'image_encoder.trunk.blocks.15.norm1.weight', 'image_encoder.trunk.blocks.14.attn.proj.weight', 'image_encoder.trunk.blocks.39.norm2.bias', 'image_encoder.trunk.blocks.38.attn.qkv.bias', 'image_encoder.trunk.blocks.32.attn.qkv.weight', 'image_encoder.trunk.blocks.45.norm2.weight', 'image_encoder.trunk.blocks.12.mlp.layers.1.bias', 'image_encoder.trunk.blocks.42.mlp.layers.1.weight', 'image_encoder.trunk.blocks.46.norm2.bias', 'image_encoder.trunk.blocks.45.norm2.bias', 
'image_encoder.trunk.blocks.40.mlp.layers.0.weight', 'image_encoder.trunk.blocks.28.norm2.bias', 'image_encoder.trunk.blocks.38.mlp.layers.0.bias', 'image_encoder.trunk.blocks.41.attn.qkv.weight', 'image_encoder.trunk.blocks.4.norm1.weight', 'image_encoder.trunk.blocks.6.attn.qkv.weight', 'image_encoder.trunk.blocks.12.norm1.weight', 'image_encoder.trunk.blocks.26.attn.proj.weight', 'image_encoder.trunk.blocks.40.mlp.layers.1.weight', 'image_encoder.trunk.blocks.0.mlp.layers.1.bias', 'image_encoder.trunk.blocks.17.attn.proj.weight', 'image_encoder.trunk.blocks.34.attn.proj.bias', 'image_encoder.trunk.blocks.22.mlp.layers.1.weight', 'image_encoder.trunk.blocks.33.mlp.layers.0.bias', 'image_encoder.trunk.blocks.24.mlp.layers.0.bias', 'image_encoder.trunk.blocks.35.mlp.layers.0.bias', 'image_encoder.trunk.blocks.16.attn.qkv.bias', 'image_encoder.trunk.blocks.44.attn.proj.weight', 'image_encoder.trunk.blocks.9.attn.proj.bias', 'image_encoder.trunk.patch_embed.proj.bias', 'image_encoder.trunk.blocks.32.norm1.bias', 'image_encoder.trunk.blocks.5.norm1.bias', 'image_encoder.trunk.blocks.34.attn.qkv.bias', 'image_encoder.trunk.blocks.1.mlp.layers.1.weight', 'image_encoder.trunk.blocks.46.mlp.layers.1.weight', 'image_encoder.trunk.blocks.33.mlp.layers.1.bias', 'image_encoder.trunk.blocks.14.mlp.layers.0.weight', 'image_encoder.trunk.blocks.15.norm1.bias', 'image_encoder.trunk.blocks.6.attn.proj.weight', 'image_encoder.trunk.blocks.24.norm1.bias', 'image_encoder.trunk.blocks.11.mlp.layers.0.weight', 'image_encoder.trunk.blocks.21.norm1.weight', 'image_encoder.trunk.blocks.30.mlp.layers.0.weight', 'image_encoder.trunk.blocks.45.mlp.layers.0.bias', 'image_encoder.trunk.blocks.9.norm2.weight', 'image_encoder.trunk.blocks.15.attn.qkv.bias', 'image_encoder.trunk.blocks.43.mlp.layers.1.bias', 'image_encoder.trunk.blocks.10.norm2.weight', 'image_encoder.trunk.blocks.3.attn.proj.weight', 'image_encoder.trunk.blocks.35.norm2.bias', 'image_encoder.trunk.blocks.31.mlp.layers.1.weight', 
'image_encoder.trunk.blocks.20.attn.qkv.weight', 'image_encoder.trunk.blocks.19.norm2.bias', 'image_encoder.trunk.blocks.0.attn.proj.bias', 'image_encoder.trunk.blocks.14.norm2.weight', 'image_encoder.trunk.blocks.5.mlp.layers.0.weight', 'image_encoder.trunk.blocks.6.norm2.weight', 'image_encoder.trunk.blocks.42.attn.proj.weight', 'image_encoder.trunk.blocks.14.attn.qkv.bias', 'image_encoder.trunk.blocks.20.norm2.weight', 'image_encoder.trunk.blocks.43.attn.proj.weight', 'image_encoder.trunk.blocks.38.mlp.layers.0.weight', 'image_encoder.trunk.blocks.46.attn.proj.weight', 'image_encoder.trunk.pos_embed', 'image_encoder.neck.convs.2.conv.weight', 'image_encoder.trunk.blocks.24.attn.proj.bias', 'image_encoder.trunk.blocks.35.attn.proj.weight', 'image_encoder.trunk.blocks.46.mlp.layers.1.bias', 'image_encoder.trunk.blocks.20.mlp.layers.0.bias', 'image_encoder.trunk.blocks.40.mlp.layers.1.bias', 'image_encoder.trunk.blocks.41.mlp.layers.1.bias', 'image_encoder.trunk.blocks.41.norm1.weight', 'image_encoder.trunk.blocks.9.attn.proj.weight', 'image_encoder.trunk.blocks.28.attn.qkv.weight', 'image_encoder.trunk.blocks.6.norm1.weight', 'image_encoder.trunk.blocks.19.attn.proj.weight', 'image_encoder.trunk.blocks.41.mlp.layers.0.weight', 'image_encoder.trunk.blocks.46.norm1.bias', 'image_encoder.trunk.pos_embed_window', 'image_encoder.trunk.blocks.10.norm2.bias', 'image_encoder.trunk.blocks.35.mlp.layers.0.weight', 'image_encoder.trunk.blocks.32.norm1.weight', 'image_encoder.trunk.blocks.17.norm2.bias', 'image_encoder.trunk.blocks.33.norm1.weight', 'image_encoder.trunk.blocks.9.norm1.bias', 'image_encoder.trunk.blocks.27.attn.proj.bias', 'image_encoder.trunk.blocks.29.norm2.weight', 'image_encoder.trunk.blocks.31.attn.qkv.bias', 'image_encoder.trunk.blocks.12.attn.proj.weight', 'image_encoder.trunk.blocks.20.mlp.layers.0.weight', 'image_encoder.trunk.blocks.0.mlp.layers.0.weight', 'image_encoder.trunk.blocks.42.attn.qkv.bias', 
'image_encoder.trunk.blocks.29.attn.proj.weight', 'image_encoder.trunk.blocks.28.attn.proj.bias', 'image_encoder.trunk.blocks.2.attn.qkv.bias', 'image_encoder.trunk.blocks.5.attn.qkv.weight', 'image_encoder.trunk.blocks.22.norm1.bias', 'image_encoder.trunk.blocks.22.attn.proj.bias', 'image_encoder.trunk.blocks.27.mlp.layers.0.weight', 'image_encoder.trunk.blocks.20.mlp.layers.1.bias', 'image_encoder.trunk.blocks.21.mlp.layers.0.weight', 'image_encoder.trunk.blocks.39.attn.qkv.weight', 'image_encoder.trunk.blocks.3.mlp.layers.1.weight', 'image_encoder.trunk.blocks.37.attn.qkv.bias', 'image_encoder.trunk.blocks.4.attn.proj.weight', 'image_encoder.trunk.blocks.14.mlp.layers.0.bias', 'image_encoder.trunk.blocks.21.mlp.layers.0.bias', 'image_encoder.trunk.blocks.47.norm2.bias', 'image_encoder.trunk.blocks.47.norm1.weight', 'image_encoder.trunk.blocks.5.attn.qkv.bias', 'image_encoder.trunk.blocks.26.norm1.bias', 'image_encoder.trunk.blocks.25.norm1.bias', 'image_encoder.trunk.blocks.6.norm2.bias', 'image_encoder.trunk.blocks.33.norm2.weight', 'image_encoder.trunk.blocks.43.attn.qkv.bias', 'image_encoder.trunk.blocks.1.attn.proj.bias', 'image_encoder.trunk.blocks.18.norm1.bias', 'image_encoder.trunk.blocks.25.mlp.layers.0.bias', 'image_encoder.trunk.blocks.7.attn.qkv.bias', 'image_encoder.trunk.blocks.12.norm2.bias', 'image_encoder.trunk.blocks.25.attn.proj.weight', 'image_encoder.trunk.blocks.9.attn.qkv.bias', 'image_encoder.trunk.blocks.37.attn.proj.bias', 'image_encoder.trunk.blocks.0.norm1.weight', 'image_encoder.trunk.blocks.25.mlp.layers.1.bias', 'image_encoder.trunk.blocks.39.mlp.layers.1.weight', 'image_encoder.trunk.blocks.19.mlp.layers.1.weight', 'image_encoder.trunk.blocks.13.norm2.bias', 'image_encoder.trunk.blocks.41.norm2.weight', 'image_encoder.trunk.blocks.8.norm1.weight', 'image_encoder.trunk.blocks.29.attn.qkv.weight', 'image_encoder.trunk.blocks.40.norm2.weight', 'image_encoder.trunk.blocks.11.mlp.layers.1.weight', 
'image_encoder.trunk.blocks.19.norm1.bias', 'image_encoder.trunk.blocks.8.norm2.bias', 'image_encoder.trunk.blocks.11.norm1.weight', 'image_encoder.trunk.blocks.41.norm2.bias', 'image_encoder.trunk.blocks.16.attn.proj.weight', 'image_encoder.trunk.blocks.8.proj.bias', 'image_encoder.trunk.blocks.27.mlp.layers.1.weight', 'image_encoder.trunk.blocks.28.norm1.bias', 'image_encoder.trunk.blocks.13.mlp.layers.1.weight', 'image_encoder.trunk.blocks.22.norm2.weight', 'image_encoder.trunk.blocks.19.norm2.weight', 'image_encoder.trunk.blocks.4.norm1.bias', 'image_encoder.trunk.blocks.9.norm2.bias', 'image_encoder.trunk.blocks.46.attn.qkv.bias', 'image_encoder.trunk.blocks.2.mlp.layers.1.weight', 'image_encoder.trunk.blocks.29.norm2.bias', 'image_encoder.neck.convs.0.conv.weight', 'image_encoder.trunk.blocks.1.norm1.weight', 'image_encoder.trunk.blocks.19.attn.proj.bias', 'image_encoder.trunk.blocks.21.norm2.bias', 'image_encoder.trunk.blocks.7.mlp.layers.0.weight', 'image_encoder.trunk.blocks.36.attn.proj.bias', 'image_encoder.trunk.blocks.46.norm1.weight', 'image_encoder.trunk.blocks.7.norm1.weight', 'image_encoder.trunk.blocks.29.attn.qkv.bias', 'image_encoder.trunk.blocks.18.attn.proj.bias', 'image_encoder.trunk.blocks.39.attn.proj.bias', 'image_encoder.trunk.blocks.32.attn.proj.weight', 'image_encoder.trunk.blocks.43.norm1.weight', 'image_encoder.trunk.blocks.34.mlp.layers.0.weight', 'image_encoder.trunk.blocks.44.norm2.weight', 'image_encoder.trunk.blocks.21.norm1.bias', 'image_encoder.trunk.blocks.38.mlp.layers.1.weight', 'image_encoder.trunk.blocks.31.mlp.layers.0.bias', 'image_encoder.trunk.blocks.12.attn.qkv.weight', 'image_encoder.trunk.blocks.31.norm2.weight', 'image_encoder.trunk.blocks.39.attn.proj.weight', 'image_encoder.trunk.blocks.5.norm2.bias', 'image_encoder.trunk.blocks.47.mlp.layers.1.weight', 'image_encoder.trunk.blocks.11.attn.qkv.bias', 'image_encoder.trunk.blocks.14.norm1.bias', 'image_encoder.trunk.blocks.28.mlp.layers.1.weight', 
'image_encoder.trunk.blocks.14.norm1.weight', 'image_encoder.trunk.blocks.24.norm2.bias', 'image_encoder.trunk.blocks.42.norm2.bias', 'image_encoder.trunk.blocks.45.mlp.layers.1.weight', 'image_encoder.trunk.blocks.7.attn.proj.weight', 'image_encoder.trunk.blocks.24.attn.qkv.weight', 'image_encoder.trunk.blocks.42.attn.qkv.weight', 'image_encoder.trunk.blocks.47.attn.qkv.bias', 'image_encoder.trunk.blocks.3.norm1.weight', 'image_encoder.trunk.blocks.29.attn.proj.bias', 'image_encoder.trunk.blocks.19.mlp.layers.0.bias', 'image_encoder.trunk.blocks.36.norm1.bias', 'image_encoder.trunk.blocks.16.norm1.weight', 'image_encoder.trunk.blocks.19.attn.qkv.bias', 'image_encoder.trunk.blocks.34.norm1.weight', 'image_encoder.trunk.blocks.43.mlp.layers.0.weight', 'image_encoder.trunk.blocks.17.mlp.layers.0.weight', 'image_encoder.trunk.blocks.25.norm1.weight', 'image_encoder.trunk.blocks.27.norm2.weight', 'image_encoder.trunk.blocks.28.norm1.weight', 'image_encoder.trunk.blocks.11.attn.proj.bias', 'image_encoder.trunk.blocks.38.norm2.bias', 'image_encoder.trunk.blocks.22.norm1.weight', 'image_encoder.trunk.blocks.3.attn.proj.bias'}
INFO 2025-01-05 01:22:39,226 optimizer.py: 248: Matches for param_name [*bias*]: {'sam_mask_decoder.transformer.layers.0.cross_attn_token_to_image.out_proj.bias', 'memory_attention.layers.2.cross_attn_image.out_proj.bias', 'image_encoder.trunk.blocks.19.mlp.layers.1.bias', 'obj_ptr_proj.layers.0.bias', 'image_encoder.trunk.blocks.30.attn.proj.bias', 'image_encoder.trunk.blocks.0.norm2.bias', 'image_encoder.trunk.blocks.27.mlp.layers.0.bias', 'image_encoder.trunk.blocks.31.norm2.bias', 'image_encoder.trunk.blocks.11.mlp.layers.1.bias', 'image_encoder.trunk.blocks.30.mlp.layers.1.bias', 'image_encoder.trunk.blocks.40.mlp.layers.0.bias', 'memory_attention.layers.1.norm1.bias', 'image_encoder.trunk.blocks.44.mlp.layers.0.bias', 'image_encoder.trunk.blocks.43.mlp.layers.0.bias', 'image_encoder.trunk.blocks.11.mlp.layers.0.bias', 
'memory_encoder.mask_downsampler.encoder.12.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_image_to_token.v_proj.bias', 'memory_attention.layers.3.self_attn.k_proj.bias', 'image_encoder.trunk.blocks.5.mlp.layers.0.bias', 'image_encoder.trunk.blocks.16.attn.proj.bias', 'sam_mask_decoder.conv_s1.bias', 'sam_mask_decoder.conv_s0.bias', 'sam_mask_decoder.transformer.layers.1.self_attn.out_proj.bias', 'memory_attention.layers.1.cross_attn_image.out_proj.bias', 'sam_mask_decoder.transformer.norm_final_attn.bias', 'image_encoder.trunk.blocks.13.norm1.bias', 'image_encoder.trunk.blocks.3.mlp.layers.1.bias', 'image_encoder.trunk.blocks.7.mlp.layers.1.bias', 'image_encoder.trunk.blocks.42.mlp.layers.1.bias', 'obj_ptr_proj.layers.2.bias', 'image_encoder.trunk.blocks.47.mlp.layers.0.bias', 'image_encoder.trunk.blocks.44.attn.proj.bias', 'image_encoder.trunk.blocks.3.norm2.bias', 'image_encoder.trunk.blocks.44.mlp.layers.1.bias', 'image_encoder.trunk.blocks.17.norm1.bias', 'image_encoder.trunk.blocks.4.attn.qkv.bias', 'image_encoder.trunk.blocks.8.attn.proj.bias', 'image_encoder.trunk.blocks.10.norm1.bias', 'sam_mask_decoder.output_hypernetworks_mlps.0.layers.0.bias', 'image_encoder.trunk.blocks.32.norm2.bias', 'memory_attention.layers.0.cross_attn_image.q_proj.bias', 'image_encoder.trunk.blocks.28.mlp.layers.1.bias', 'memory_attention.layers.3.self_attn.q_proj.bias', 'sam_mask_decoder.transformer.final_attn_token_to_image.k_proj.bias', 'image_encoder.trunk.blocks.7.attn.proj.bias', 'image_encoder.trunk.blocks.21.attn.proj.bias', 'image_encoder.trunk.blocks.22.attn.qkv.bias', 'image_encoder.trunk.blocks.20.norm2.bias', 'image_encoder.trunk.blocks.4.mlp.layers.1.bias', 'image_encoder.trunk.blocks.8.attn.qkv.bias', 'image_encoder.trunk.blocks.32.attn.qkv.bias', 'image_encoder.trunk.blocks.25.norm2.bias', 'image_encoder.trunk.blocks.37.norm1.bias', 'sam_mask_decoder.output_hypernetworks_mlps.1.layers.0.bias', 'image_encoder.trunk.blocks.2.attn.proj.bias', 
'image_encoder.trunk.blocks.8.mlp.layers.0.bias', 'image_encoder.trunk.blocks.34.mlp.layers.0.bias', 'image_encoder.trunk.blocks.27.norm1.bias', 'sam_mask_decoder.output_hypernetworks_mlps.0.layers.1.bias', 'memory_encoder.mask_downsampler.encoder.9.bias', 'sam_mask_decoder.transformer.layers.0.norm1.bias', 'image_encoder.trunk.blocks.1.norm1.bias', 'image_encoder.trunk.blocks.32.mlp.layers.0.bias', 'image_encoder.trunk.blocks.10.attn.proj.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_image_to_token.q_proj.bias', 'image_encoder.trunk.blocks.40.norm2.bias', 'image_encoder.trunk.blocks.2.mlp.layers.0.bias', 'image_encoder.trunk.blocks.28.attn.qkv.bias', 'image_encoder.trunk.blocks.27.norm2.bias', 'image_encoder.trunk.blocks.5.attn.proj.bias', 'image_encoder.trunk.blocks.30.norm1.bias', 'image_encoder.trunk.blocks.43.norm2.bias', 'image_encoder.trunk.blocks.37.norm2.bias', 'image_encoder.trunk.blocks.8.norm1.bias', 'image_encoder.trunk.blocks.5.mlp.layers.1.bias', 'image_encoder.trunk.blocks.46.mlp.layers.0.bias', 'image_encoder.trunk.blocks.7.norm2.bias', 'image_encoder.trunk.blocks.41.norm1.bias', 'image_encoder.trunk.blocks.18.mlp.layers.1.bias', 'image_encoder.trunk.blocks.14.norm2.bias', 'image_encoder.trunk.blocks.33.attn.qkv.bias', 'image_encoder.trunk.blocks.42.norm1.bias', 'memory_encoder.fuser.layers.0.pwconv1.bias', 'image_encoder.trunk.blocks.1.norm2.bias', 'image_encoder.trunk.blocks.23.norm2.bias', 'image_encoder.trunk.blocks.40.norm1.bias', 'image_encoder.trunk.blocks.33.norm1.bias', 'image_encoder.trunk.blocks.16.mlp.layers.1.bias', 'sam_mask_decoder.transformer.layers.0.norm4.bias', 'sam_mask_decoder.output_upscaling.1.bias', 'image_encoder.trunk.blocks.33.attn.proj.bias', 'image_encoder.trunk.blocks.29.norm1.bias', 'memory_attention.layers.0.self_attn.v_proj.bias', 'image_encoder.trunk.blocks.37.mlp.layers.1.bias', 'image_encoder.trunk.blocks.13.attn.proj.bias', 'image_encoder.trunk.blocks.22.mlp.layers.0.bias', 
'image_encoder.trunk.blocks.2.norm2.bias', 'image_encoder.trunk.blocks.26.mlp.layers.1.bias', 'image_encoder.trunk.blocks.44.norm1.bias', 'image_encoder.trunk.blocks.26.attn.qkv.bias', 'image_encoder.trunk.blocks.0.attn.qkv.bias', 'image_encoder.trunk.blocks.9.mlp.layers.1.bias', 'sam_mask_decoder.transformer.layers.1.norm3.bias', 'sam_mask_decoder.transformer.layers.1.mlp.layers.1.bias', 'memory_encoder.mask_downsampler.encoder.10.bias', 'sam_mask_decoder.output_hypernetworks_mlps.2.layers.1.bias', 'memory_attention.layers.2.cross_attn_image.v_proj.bias', 'image_encoder.trunk.blocks.41.attn.proj.bias', 'image_encoder.trunk.blocks.1.mlp.layers.1.bias', 'image_encoder.trunk.blocks.47.mlp.layers.1.bias', 'image_encoder.trunk.blocks.36.attn.qkv.bias', 'image_encoder.trunk.blocks.13.attn.qkv.bias', 'sam_mask_decoder.transformer.final_attn_token_to_image.q_proj.bias', 'image_encoder.trunk.blocks.42.attn.proj.bias', 'image_encoder.trunk.blocks.7.norm1.bias', 'image_encoder.trunk.blocks.26.mlp.layers.0.bias', 'memory_attention.layers.1.self_attn.v_proj.bias', 'memory_attention.layers.3.cross_attn_image.v_proj.bias', 'sam_mask_decoder.iou_prediction_head.layers.0.bias', 'image_encoder.trunk.blocks.21.attn.qkv.bias', 'image_encoder.trunk.blocks.17.attn.qkv.bias', 'image_encoder.trunk.blocks.3.attn.qkv.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_token_to_image.q_proj.bias', 'image_encoder.trunk.blocks.17.mlp.layers.1.bias', 'memory_attention.layers.3.norm2.bias', 'memory_encoder.mask_downsampler.encoder.6.bias', 'image_encoder.trunk.blocks.13.mlp.layers.0.bias', 'memory_encoder.fuser.layers.1.norm.bias', 'image_encoder.trunk.blocks.39.attn.qkv.bias', 'image_encoder.trunk.blocks.1.mlp.layers.0.bias', 'memory_attention.layers.0.cross_attn_image.out_proj.bias', 'memory_attention.layers.0.norm3.bias', 'memory_attention.layers.2.self_attn.q_proj.bias', 'memory_attention.layers.1.cross_attn_image.k_proj.bias', 'memory_attention.layers.2.cross_attn_image.q_proj.bias', 
'memory_attention.layers.1.norm2.bias', 'sam_prompt_encoder.mask_downscaling.0.bias', 'sam_prompt_encoder.mask_downscaling.6.bias', 'image_encoder.trunk.blocks.4.norm2.bias', 'image_encoder.trunk.blocks.25.attn.qkv.bias', 'memory_attention.layers.3.norm1.bias', 'image_encoder.trunk.blocks.3.mlp.layers.0.bias', 'image_encoder.trunk.blocks.27.attn.qkv.bias', 'image_encoder.trunk.blocks.44.norm2.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_token_to_image.q_proj.bias', 'memory_attention.layers.0.norm1.bias', 'sam_mask_decoder.transformer.layers.1.self_attn.k_proj.bias', 'sam_mask_decoder.transformer.final_attn_token_to_image.v_proj.bias', 'image_encoder.trunk.blocks.44.attn.qkv.bias', 'image_encoder.trunk.blocks.12.norm1.bias', 'sam_mask_decoder.transformer.layers.0.mlp.layers.1.bias', 'image_encoder.trunk.blocks.36.mlp.layers.1.bias', 'image_encoder.trunk.blocks.3.norm1.bias', 'image_encoder.trunk.blocks.23.attn.proj.bias', 'image_encoder.trunk.blocks.35.norm1.bias', 'image_encoder.trunk.blocks.35.mlp.layers.1.bias', 'image_encoder.trunk.blocks.15.norm2.bias', 'image_encoder.trunk.blocks.31.norm1.bias', 'image_encoder.trunk.blocks.2.mlp.layers.1.bias', 'image_encoder.trunk.blocks.8.mlp.layers.1.bias', 'image_encoder.trunk.blocks.47.attn.proj.bias', 'memory_encoder.fuser.layers.0.pwconv2.bias', 'sam_mask_decoder.transformer.layers.1.self_attn.q_proj.bias', 'image_encoder.trunk.blocks.23.attn.qkv.bias', 'image_encoder.trunk.blocks.4.attn.proj.bias', 'image_encoder.trunk.blocks.11.norm2.bias', 'image_encoder.trunk.blocks.15.mlp.layers.1.bias', 'image_encoder.trunk.blocks.30.norm2.bias', 'memory_encoder.fuser.layers.0.dwconv.bias', 'image_encoder.trunk.blocks.23.mlp.layers.1.bias', 'image_encoder.trunk.blocks.6.attn.proj.bias', 'image_encoder.trunk.blocks.29.mlp.layers.0.bias', 'image_encoder.trunk.blocks.26.attn.proj.bias', 'image_encoder.trunk.blocks.13.mlp.layers.1.bias', 'image_encoder.trunk.blocks.18.norm2.bias', 
'sam_mask_decoder.transformer.layers.1.norm2.bias', 'image_encoder.trunk.blocks.17.attn.proj.bias', 'image_encoder.trunk.blocks.45.attn.proj.bias', 'image_encoder.trunk.blocks.0.norm1.bias', 'image_encoder.trunk.blocks.2.proj.bias', 'image_encoder.trunk.blocks.44.proj.bias', 'image_encoder.neck.convs.1.conv.bias', 'image_encoder.neck.convs.3.conv.bias', 'memory_attention.layers.1.self_attn.q_proj.bias', 'obj_ptr_proj.layers.1.bias', 'image_encoder.trunk.blocks.41.mlp.layers.0.bias', 'image_encoder.trunk.blocks.20.norm1.bias', 'image_encoder.trunk.blocks.32.attn.proj.bias', 'memory_attention.layers.2.self_attn.out_proj.bias', 'image_encoder.trunk.blocks.29.mlp.layers.1.bias', 'sam_mask_decoder.transformer.layers.0.self_attn.v_proj.bias', 'image_encoder.trunk.blocks.33.norm2.bias', 'image_encoder.trunk.blocks.41.attn.qkv.bias', 'image_encoder.trunk.blocks.10.attn.qkv.bias', 'image_encoder.trunk.blocks.39.norm1.bias', 'image_encoder.trunk.blocks.32.mlp.layers.1.bias', 'sam_mask_decoder.transformer.layers.1.norm4.bias', 'sam_mask_decoder.iou_prediction_head.layers.1.bias', 'memory_attention.layers.3.linear1.bias', 'image_encoder.trunk.blocks.23.mlp.layers.0.bias', 'image_encoder.trunk.blocks.46.attn.proj.bias', 'image_encoder.trunk.blocks.10.mlp.layers.0.bias', 'image_encoder.trunk.blocks.9.mlp.layers.0.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_token_to_image.k_proj.bias', 'image_encoder.trunk.blocks.21.mlp.layers.1.bias', 'memory_encoder.mask_downsampler.encoder.3.bias', 'memory_attention.layers.3.self_attn.v_proj.bias', 'image_encoder.trunk.blocks.22.norm2.bias', 'image_encoder.trunk.blocks.20.attn.qkv.bias', 'image_encoder.trunk.blocks.34.norm2.bias', 'image_encoder.trunk.blocks.31.mlp.layers.1.bias', 'image_encoder.trunk.blocks.1.attn.qkv.bias', 'sam_mask_decoder.pred_obj_score_head.layers.0.bias', 'image_encoder.trunk.blocks.39.mlp.layers.0.bias', 'image_encoder.neck.convs.2.conv.bias', 'memory_attention.layers.3.cross_attn_image.out_proj.bias', 
'sam_mask_decoder.transformer.layers.0.self_attn.out_proj.bias', 'memory_attention.layers.0.self_attn.out_proj.bias', 'image_encoder.trunk.blocks.18.mlp.layers.0.bias', 'image_encoder.trunk.blocks.14.mlp.layers.1.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_image_to_token.out_proj.bias', 'image_encoder.trunk.blocks.35.attn.proj.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_image_to_token.v_proj.bias', 'memory_attention.layers.0.self_attn.k_proj.bias', 'image_encoder.trunk.blocks.39.mlp.layers.1.bias', 'image_encoder.trunk.blocks.37.mlp.layers.0.bias', 'memory_encoder.mask_downsampler.encoder.4.bias', 'image_encoder.trunk.blocks.16.norm2.bias', 'image_encoder.trunk.blocks.27.mlp.layers.1.bias', 'image_encoder.trunk.blocks.45.attn.qkv.bias', 'image_encoder.trunk.blocks.12.attn.qkv.bias', 'image_encoder.trunk.blocks.25.attn.proj.bias', 'image_encoder.trunk.blocks.28.mlp.layers.0.bias', 'image_encoder.trunk.blocks.15.mlp.layers.0.bias', 'image_encoder.trunk.blocks.2.norm1.bias', 'sam_mask_decoder.transformer.layers.0.self_attn.q_proj.bias', 'image_encoder.trunk.blocks.40.attn.qkv.bias', 'image_encoder.trunk.blocks.17.mlp.layers.0.bias', 'image_encoder.trunk.blocks.12.attn.proj.bias', 'image_encoder.trunk.blocks.23.norm1.bias', 'sam_mask_decoder.output_hypernetworks_mlps.1.layers.2.bias', 'image_encoder.trunk.blocks.42.mlp.layers.0.bias', 'image_encoder.trunk.blocks.35.attn.qkv.bias', 'image_encoder.trunk.blocks.26.norm2.bias', 'image_encoder.trunk.blocks.14.attn.proj.bias', 'image_encoder.trunk.blocks.16.mlp.layers.0.bias', 'image_encoder.trunk.blocks.36.mlp.layers.0.bias', 'image_encoder.trunk.blocks.34.mlp.layers.1.bias', 'image_encoder.trunk.blocks.12.mlp.layers.0.bias', 'image_encoder.trunk.blocks.24.attn.qkv.bias', 'memory_attention.layers.3.cross_attn_image.q_proj.bias', 'sam_mask_decoder.transformer.final_attn_token_to_image.out_proj.bias', 'memory_attention.layers.2.linear2.bias', 'memory_attention.layers.3.norm3.bias', 
'image_encoder.trunk.blocks.45.norm1.bias', 'image_encoder.trunk.blocks.24.mlp.layers.1.bias', 'sam_mask_decoder.output_hypernetworks_mlps.0.layers.2.bias', 'image_encoder.trunk.blocks.6.mlp.layers.1.bias', 'image_encoder.trunk.blocks.43.attn.proj.bias', 'image_encoder.trunk.blocks.7.mlp.layers.0.bias', 'image_encoder.trunk.blocks.15.attn.proj.bias', 'image_encoder.trunk.blocks.16.norm1.bias', 'image_encoder.trunk.blocks.47.norm1.bias', 'image_encoder.trunk.blocks.4.mlp.layers.0.bias', 'memory_attention.layers.2.norm2.bias', 'memory_encoder.fuser.layers.0.norm.bias', 'image_encoder.trunk.blocks.38.attn.proj.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_token_to_image.v_proj.bias', 'image_encoder.trunk.blocks.30.mlp.layers.0.bias', 'image_encoder.neck.convs.0.conv.bias', 'image_encoder.trunk.blocks.34.norm1.bias', 'image_encoder.trunk.blocks.10.mlp.layers.1.bias', 'image_encoder.trunk.blocks.0.mlp.layers.0.bias', 'memory_encoder.fuser.layers.1.dwconv.bias', 'image_encoder.trunk.blocks.45.mlp.layers.1.bias', 'image_encoder.trunk.blocks.38.mlp.layers.1.bias', 'image_encoder.trunk.blocks.18.attn.qkv.bias', 'image_encoder.trunk.blocks.22.mlp.layers.1.bias', 'image_encoder.trunk.blocks.20.attn.proj.bias', 'image_encoder.trunk.blocks.6.mlp.layers.0.bias', 'image_encoder.trunk.blocks.38.norm1.bias', 'image_encoder.trunk.blocks.30.attn.qkv.bias', 'image_encoder.trunk.blocks.6.norm1.bias', 'image_encoder.trunk.blocks.40.attn.proj.bias', 'image_encoder.trunk.blocks.31.attn.proj.bias', 'image_encoder.trunk.blocks.11.norm1.bias', 'image_encoder.trunk.blocks.36.norm2.bias', 'image_encoder.trunk.blocks.43.norm1.bias', 'image_encoder.trunk.blocks.6.attn.qkv.bias', 'image_encoder.trunk.blocks.39.norm2.bias', 'image_encoder.trunk.blocks.38.attn.qkv.bias', 'memory_attention.layers.0.cross_attn_image.v_proj.bias', 'sam_mask_decoder.output_hypernetworks_mlps.1.layers.1.bias', 'image_encoder.trunk.blocks.12.mlp.layers.1.bias', 'memory_attention.layers.2.norm1.bias', 
'image_encoder.trunk.blocks.46.norm2.bias', 'image_encoder.trunk.blocks.45.norm2.bias', 'image_encoder.trunk.blocks.28.norm2.bias', 'image_encoder.trunk.blocks.38.mlp.layers.0.bias', 'image_encoder.trunk.blocks.0.mlp.layers.1.bias', 'memory_encoder.fuser.layers.1.pwconv1.bias', 'sam_mask_decoder.transformer.layers.0.mlp.layers.0.bias', 'image_encoder.trunk.blocks.34.attn.proj.bias', 'image_encoder.trunk.blocks.33.mlp.layers.0.bias', 'image_encoder.trunk.blocks.24.mlp.layers.0.bias', 'image_encoder.trunk.blocks.35.mlp.layers.0.bias', 'image_encoder.trunk.blocks.16.attn.qkv.bias', 'memory_encoder.mask_downsampler.encoder.0.bias', 'image_encoder.trunk.blocks.9.attn.proj.bias', 'obj_ptr_tpos_proj.bias', 'image_encoder.trunk.patch_embed.proj.bias', 'image_encoder.trunk.blocks.32.norm1.bias', 'image_encoder.trunk.blocks.5.norm1.bias', 'image_encoder.trunk.blocks.34.attn.qkv.bias', 'image_encoder.trunk.blocks.33.mlp.layers.1.bias', 'image_encoder.trunk.blocks.15.norm1.bias', 'image_encoder.trunk.blocks.24.norm1.bias', 'memory_attention.layers.0.linear2.bias', 'memory_attention.layers.1.self_attn.out_proj.bias', 'image_encoder.trunk.blocks.45.mlp.layers.0.bias', 'sam_prompt_encoder.mask_downscaling.1.bias', 'image_encoder.trunk.blocks.15.attn.qkv.bias', 'image_encoder.trunk.blocks.43.mlp.layers.1.bias', 'image_encoder.trunk.blocks.35.norm2.bias', 'memory_attention.layers.3.self_attn.out_proj.bias', 'memory_encoder.fuser.layers.1.pwconv2.bias', 'sam_mask_decoder.output_hypernetworks_mlps.3.layers.1.bias', 'memory_attention.layers.3.cross_attn_image.k_proj.bias', 'sam_mask_decoder.output_hypernetworks_mlps.2.layers.2.bias', 'image_encoder.trunk.blocks.19.norm2.bias', 'memory_attention.layers.2.self_attn.v_proj.bias', 'memory_attention.layers.0.cross_attn_image.k_proj.bias', 'image_encoder.trunk.blocks.0.attn.proj.bias', 'image_encoder.trunk.blocks.14.attn.qkv.bias', 'memory_attention.layers.1.norm3.bias', 'sam_mask_decoder.transformer.layers.0.norm2.bias', 
'memory_attention.layers.1.linear2.bias', 'image_encoder.trunk.blocks.24.attn.proj.bias', 'image_encoder.trunk.blocks.46.mlp.layers.1.bias', 'image_encoder.trunk.blocks.20.mlp.layers.0.bias', 'sam_prompt_encoder.mask_downscaling.4.bias', 'image_encoder.trunk.blocks.40.mlp.layers.1.bias', 'memory_attention.layers.0.norm2.bias', 'image_encoder.trunk.blocks.41.mlp.layers.1.bias', 'image_encoder.trunk.blocks.46.norm1.bias', 'memory_attention.layers.2.norm3.bias', 'image_encoder.trunk.blocks.10.norm2.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_token_to_image.out_proj.bias', 'image_encoder.trunk.blocks.17.norm2.bias', 'image_encoder.trunk.blocks.9.norm1.bias', 'memory_encoder.pix_feat_proj.bias', 'image_encoder.trunk.blocks.27.attn.proj.bias', 'sam_mask_decoder.transformer.layers.0.self_attn.k_proj.bias', 'image_encoder.trunk.blocks.31.attn.qkv.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_token_to_image.k_proj.bias', 'image_encoder.trunk.blocks.42.attn.qkv.bias', 'sam_mask_decoder.output_hypernetworks_mlps.2.layers.0.bias', 'image_encoder.trunk.blocks.28.attn.proj.bias', 'image_encoder.trunk.blocks.2.attn.qkv.bias', 'image_encoder.trunk.blocks.22.norm1.bias', 'image_encoder.trunk.blocks.22.attn.proj.bias', 'image_encoder.trunk.blocks.20.mlp.layers.1.bias', 'image_encoder.trunk.blocks.37.attn.qkv.bias', 'memory_attention.layers.1.self_attn.k_proj.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_token_to_image.v_proj.bias', 'memory_attention.layers.1.linear1.bias', 'memory_encoder.mask_downsampler.encoder.7.bias', 'image_encoder.trunk.blocks.14.mlp.layers.0.bias', 'image_encoder.trunk.blocks.47.norm2.bias', 'image_encoder.trunk.blocks.21.mlp.layers.0.bias', 'image_encoder.trunk.blocks.5.attn.qkv.bias', 'image_encoder.trunk.blocks.26.norm1.bias', 'image_encoder.trunk.blocks.25.norm1.bias', 'image_encoder.trunk.blocks.6.norm2.bias', 'image_encoder.trunk.blocks.43.attn.qkv.bias', 'sam_mask_decoder.iou_prediction_head.layers.2.bias', 
'image_encoder.trunk.blocks.1.attn.proj.bias', 'sam_prompt_encoder.mask_downscaling.3.bias', 'image_encoder.trunk.blocks.18.norm1.bias', 'memory_attention.norm.bias', 'image_encoder.trunk.blocks.25.mlp.layers.0.bias', 'memory_attention.layers.1.cross_attn_image.q_proj.bias', 'image_encoder.trunk.blocks.7.attn.qkv.bias', 'image_encoder.trunk.blocks.12.norm2.bias', 'memory_attention.layers.0.linear1.bias', 'image_encoder.trunk.blocks.9.attn.qkv.bias', 'image_encoder.trunk.blocks.37.attn.proj.bias', 'image_encoder.trunk.blocks.25.mlp.layers.1.bias', 'image_encoder.trunk.blocks.13.norm2.bias', 'memory_attention.layers.2.linear1.bias', 'image_encoder.trunk.blocks.19.norm1.bias', 'image_encoder.trunk.blocks.8.norm2.bias', 'sam_mask_decoder.transformer.layers.1.norm1.bias', 'image_encoder.trunk.blocks.41.norm2.bias', 'memory_attention.layers.1.cross_attn_image.v_proj.bias', 'image_encoder.trunk.blocks.8.proj.bias', 'image_encoder.trunk.blocks.28.norm1.bias', 'image_encoder.trunk.blocks.4.norm1.bias', 'image_encoder.trunk.blocks.9.norm2.bias', 'image_encoder.trunk.blocks.46.attn.qkv.bias', 'memory_attention.layers.0.self_attn.q_proj.bias', 'image_encoder.trunk.blocks.29.norm2.bias', 'image_encoder.trunk.blocks.19.attn.proj.bias', 'sam_mask_decoder.output_hypernetworks_mlps.3.layers.2.bias', 'image_encoder.trunk.blocks.21.norm2.bias', 'image_encoder.trunk.blocks.36.attn.proj.bias', 'memory_attention.layers.2.self_attn.k_proj.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_image_to_token.out_proj.bias', 'image_encoder.trunk.blocks.29.attn.qkv.bias', 'image_encoder.trunk.blocks.18.attn.proj.bias', 'image_encoder.trunk.blocks.39.attn.proj.bias', 'image_encoder.trunk.blocks.21.norm1.bias', 'image_encoder.trunk.blocks.31.mlp.layers.0.bias', 'memory_attention.layers.3.linear2.bias', 'sam_mask_decoder.output_upscaling.3.bias', 'image_encoder.trunk.blocks.5.norm2.bias', 'mask_downsample.bias', 'image_encoder.trunk.blocks.11.attn.qkv.bias', 
'image_encoder.trunk.blocks.14.norm1.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_image_to_token.q_proj.bias', 'sam_mask_decoder.transformer.layers.1.self_attn.v_proj.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_image_to_token.k_proj.bias', 'image_encoder.trunk.blocks.24.norm2.bias', 'memory_encoder.out_proj.bias', 'image_encoder.trunk.blocks.42.norm2.bias', 'sam_mask_decoder.pred_obj_score_head.layers.2.bias', 'image_encoder.trunk.blocks.47.attn.qkv.bias', 'sam_mask_decoder.transformer.layers.1.mlp.layers.0.bias', 'memory_encoder.mask_downsampler.encoder.1.bias', 'sam_mask_decoder.output_hypernetworks_mlps.3.layers.0.bias', 'image_encoder.trunk.blocks.29.attn.proj.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_image_to_token.k_proj.bias', 'image_encoder.trunk.blocks.19.mlp.layers.0.bias', 'image_encoder.trunk.blocks.36.norm1.bias', 'image_encoder.trunk.blocks.19.attn.qkv.bias', 'sam_mask_decoder.transformer.layers.0.norm3.bias', 'sam_mask_decoder.pred_obj_score_head.layers.1.bias', 'image_encoder.trunk.blocks.38.norm2.bias', 'image_encoder.trunk.blocks.11.attn.proj.bias', 'memory_attention.layers.2.cross_attn_image.k_proj.bias', 'sam_mask_decoder.output_upscaling.0.bias', 'image_encoder.trunk.blocks.3.attn.proj.bias'}
INFO 2025-01-05 01:22:39,226 optimizer.py: 220: Matches for module_cls_name [torch.nn.LayerNorm]: {'sam_mask_decoder.transformer.layers.1.norm3.bias', 'image_encoder.trunk.blocks.26.norm1.weight', 'image_encoder.trunk.blocks.37.norm1.weight', 'memory_attention.layers.3.norm2.weight', 'memory_attention.layers.1.norm2.weight', 'image_encoder.trunk.blocks.19.norm2.bias', 'image_encoder.trunk.blocks.14.norm2.weight', 'image_encoder.trunk.blocks.0.norm2.bias', 'image_encoder.trunk.blocks.3.norm2.weight', 'image_encoder.trunk.blocks.6.norm2.weight', 'image_encoder.trunk.blocks.31.norm2.bias', 'image_encoder.trunk.blocks.16.norm2.bias', 'image_encoder.trunk.blocks.7.norm1.bias', 'image_encoder.trunk.blocks.23.norm1.weight', 
'image_encoder.trunk.blocks.32.norm2.weight', 'image_encoder.trunk.blocks.20.norm2.weight', 'image_encoder.trunk.blocks.43.norm2.weight', 'image_encoder.trunk.blocks.28.norm2.weight', 'image_encoder.trunk.blocks.10.norm1.weight', 'image_encoder.trunk.blocks.17.norm1.weight', 'memory_attention.layers.1.norm3.bias', 'memory_attention.layers.3.norm1.weight', 'image_encoder.trunk.blocks.0.norm2.weight', 'sam_mask_decoder.transformer.layers.0.norm2.bias', 'memory_attention.layers.1.norm1.bias', 'image_encoder.trunk.blocks.19.norm1.weight', 'sam_mask_decoder.transformer.layers.0.norm1.weight', 'image_encoder.trunk.blocks.2.norm1.bias', 'image_encoder.trunk.blocks.29.norm1.weight', 'memory_attention.layers.3.norm2.bias', 'image_encoder.trunk.blocks.39.norm2.weight', 'image_encoder.trunk.blocks.16.norm2.weight', 'image_encoder.trunk.blocks.40.norm1.weight', 'memory_attention.layers.0.norm2.bias', 'image_encoder.trunk.blocks.41.norm1.weight', 'memory_attention.layers.2.norm1.weight', 'image_encoder.trunk.blocks.6.norm1.weight', 'image_encoder.trunk.blocks.35.norm1.weight', 'memory_attention.layers.0.norm3.bias', 'image_encoder.trunk.blocks.34.norm2.weight', 'image_encoder.trunk.blocks.46.norm1.bias', 'image_encoder.trunk.blocks.38.norm1.weight', 'memory_attention.layers.2.norm3.bias', 'image_encoder.trunk.blocks.2.norm1.weight', 'image_encoder.trunk.blocks.10.norm2.bias', 'image_encoder.trunk.blocks.15.norm2.weight', 'image_encoder.trunk.blocks.32.norm1.weight', 'image_encoder.trunk.blocks.23.norm1.bias', 'image_encoder.trunk.blocks.1.norm2.weight', 'image_encoder.trunk.blocks.36.norm1.weight', 'memory_attention.layers.1.norm2.bias', 'sam_mask_decoder.transformer.norm_final_attn.bias', 'image_encoder.trunk.blocks.13.norm1.bias', 'image_encoder.trunk.blocks.17.norm2.bias', 'image_encoder.trunk.blocks.24.norm2.weight', 'image_encoder.trunk.blocks.9.norm1.bias', 'image_encoder.trunk.blocks.27.norm1.weight', 'image_encoder.trunk.blocks.30.norm1.weight', 
'image_encoder.trunk.blocks.33.norm1.weight', 'image_encoder.trunk.blocks.42.norm2.weight', 'image_encoder.trunk.blocks.29.norm2.weight', 'image_encoder.trunk.blocks.4.norm2.bias', 'image_encoder.trunk.blocks.18.norm1.weight', 'image_encoder.trunk.blocks.26.norm2.bias', 'image_encoder.trunk.blocks.17.norm2.weight', 'image_encoder.trunk.blocks.26.norm2.weight', 'memory_attention.layers.3.norm1.bias', 'sam_mask_decoder.transformer.norm_final_attn.weight', 'image_encoder.trunk.blocks.3.norm2.bias', 'image_encoder.trunk.blocks.36.norm2.weight', 'memory_attention.layers.3.norm3.weight', 'image_encoder.trunk.blocks.17.norm1.bias', 'image_encoder.trunk.blocks.44.norm2.bias', 'memory_attention.layers.0.norm1.bias', 'image_encoder.trunk.blocks.44.norm1.weight', 'image_encoder.trunk.blocks.22.norm1.bias', 'memory_attention.layers.3.norm3.bias', 'image_encoder.trunk.blocks.12.norm1.bias', 'image_encoder.trunk.blocks.18.norm2.weight', 'image_encoder.trunk.blocks.45.norm1.bias', 'image_encoder.trunk.blocks.10.norm1.bias', 'image_encoder.trunk.blocks.30.norm2.weight', 'image_encoder.trunk.blocks.2.norm2.weight', 'image_encoder.trunk.blocks.3.norm1.bias', 'image_encoder.trunk.blocks.7.norm2.weight', 'image_encoder.trunk.blocks.32.norm2.bias', 'image_encoder.trunk.blocks.11.norm2.weight', 'sam_mask_decoder.transformer.layers.0.norm2.weight', 'sam_mask_decoder.transformer.layers.1.norm4.weight', 'image_encoder.trunk.blocks.16.norm1.bias', 'image_encoder.trunk.blocks.35.norm1.bias', 'image_encoder.trunk.blocks.47.norm1.bias', 'memory_attention.layers.0.norm3.weight', 'memory_attention.layers.2.norm2.bias', 'image_encoder.trunk.blocks.47.norm2.bias', 'image_encoder.trunk.blocks.15.norm2.bias', 'image_encoder.trunk.blocks.47.norm1.weight', 'image_encoder.trunk.blocks.26.norm1.bias', 'image_encoder.trunk.blocks.25.norm1.bias', 'image_encoder.trunk.blocks.6.norm2.bias', 'image_encoder.trunk.blocks.20.norm2.bias', 'image_encoder.trunk.blocks.31.norm1.bias', 
'image_encoder.trunk.blocks.33.norm2.weight', 'image_encoder.trunk.blocks.46.norm2.weight', 'image_encoder.trunk.blocks.34.norm1.bias', 'image_encoder.trunk.blocks.39.norm1.weight', 'image_encoder.trunk.blocks.25.norm2.bias', 'image_encoder.trunk.blocks.20.norm1.weight', 'image_encoder.trunk.blocks.37.norm1.bias', 'image_encoder.trunk.blocks.18.norm1.bias', 'image_encoder.trunk.blocks.42.norm1.weight', 'memory_attention.norm.bias', 'image_encoder.trunk.blocks.21.norm2.weight', 'sam_mask_decoder.transformer.layers.1.norm1.weight', 'image_encoder.trunk.blocks.12.norm2.bias', 'image_encoder.trunk.blocks.11.norm2.bias', 'image_encoder.trunk.blocks.27.norm1.bias', 'image_encoder.trunk.blocks.30.norm2.bias', 'image_encoder.trunk.blocks.31.norm1.weight', 'image_encoder.trunk.blocks.38.norm1.bias', 'memory_attention.layers.1.norm3.weight', 'image_encoder.trunk.blocks.5.norm2.weight', 'image_encoder.trunk.blocks.9.norm1.weight', 'image_encoder.trunk.blocks.18.norm2.bias', 'sam_mask_decoder.transformer.layers.1.norm2.bias', 'image_encoder.trunk.blocks.23.norm2.weight', 'image_encoder.trunk.blocks.47.norm2.weight', 'memory_attention.norm.weight', 'sam_mask_decoder.transformer.layers.0.norm1.bias', 'image_encoder.trunk.blocks.1.norm1.bias', 'image_encoder.trunk.blocks.0.norm1.weight', 'image_encoder.trunk.blocks.0.norm1.bias', 'sam_mask_decoder.transformer.layers.0.norm3.weight', 'image_encoder.trunk.blocks.13.norm2.bias', 'image_encoder.trunk.blocks.6.norm1.bias', 'memory_attention.layers.0.norm2.weight', 'image_encoder.trunk.blocks.41.norm2.weight', 'image_encoder.trunk.blocks.8.norm1.weight', 'image_encoder.trunk.blocks.8.norm2.weight', 'image_encoder.trunk.blocks.40.norm2.bias', 'image_encoder.trunk.blocks.40.norm2.weight', 'image_encoder.trunk.blocks.11.norm1.bias', 'image_encoder.trunk.blocks.36.norm2.bias', 'image_encoder.trunk.blocks.20.norm1.bias', 'image_encoder.trunk.blocks.43.norm1.bias', 'image_encoder.trunk.blocks.15.norm1.weight', 
'image_encoder.trunk.blocks.8.norm2.bias', 'image_encoder.trunk.blocks.19.norm1.bias', 'image_encoder.trunk.blocks.27.norm2.bias', 'image_encoder.trunk.blocks.39.norm2.bias', 'sam_mask_decoder.transformer.layers.1.norm1.bias', 'image_encoder.trunk.blocks.11.norm1.weight', 'image_encoder.trunk.blocks.30.norm1.bias', 'image_encoder.trunk.blocks.41.norm2.bias', 'image_encoder.trunk.blocks.43.norm2.bias', 'image_encoder.trunk.blocks.33.norm2.bias', 'image_encoder.trunk.blocks.45.norm2.weight', 'image_encoder.trunk.blocks.24.norm1.weight', 'image_encoder.trunk.blocks.28.norm1.bias', 'image_encoder.trunk.blocks.22.norm2.weight', 'image_encoder.trunk.blocks.37.norm2.bias', 'memory_attention.layers.2.norm1.bias', 'image_encoder.trunk.blocks.19.norm2.weight', 'sam_mask_decoder.transformer.layers.1.norm3.weight', 'image_encoder.trunk.blocks.4.norm1.bias', 'image_encoder.trunk.blocks.9.norm2.bias', 'image_encoder.trunk.blocks.46.norm2.bias', 'image_encoder.trunk.blocks.29.norm2.bias', 'image_encoder.trunk.blocks.39.norm1.bias', 'memory_attention.layers.2.norm2.weight', 'memory_attention.layers.2.norm3.weight', 'image_encoder.trunk.blocks.1.norm1.weight', 'image_encoder.trunk.blocks.8.norm1.bias', 'image_encoder.trunk.blocks.45.norm2.bias', 'image_encoder.trunk.blocks.13.norm2.weight', 'image_encoder.trunk.blocks.21.norm2.bias', 'image_encoder.trunk.blocks.28.norm2.bias', 'sam_mask_decoder.transformer.layers.1.norm2.weight', 'image_encoder.trunk.blocks.4.norm1.weight', 'image_encoder.trunk.blocks.12.norm1.weight', 'sam_mask_decoder.transformer.layers.1.norm4.bias', 'image_encoder.trunk.blocks.46.norm1.weight', 'image_encoder.trunk.blocks.7.norm1.weight', 'image_encoder.trunk.blocks.7.norm2.bias', 'image_encoder.trunk.blocks.13.norm1.weight', 'image_encoder.trunk.blocks.41.norm1.bias', 'image_encoder.trunk.blocks.43.norm1.weight', 'image_encoder.trunk.blocks.44.norm2.weight', 'image_encoder.trunk.blocks.21.norm1.bias', 'memory_attention.layers.1.norm1.weight', 
'image_encoder.trunk.blocks.31.norm2.weight', 'image_encoder.trunk.blocks.14.norm2.bias', 'image_encoder.trunk.blocks.42.norm1.bias', 'image_encoder.trunk.blocks.5.norm2.bias', 'sam_mask_decoder.transformer.layers.0.norm4.weight', 'image_encoder.trunk.blocks.1.norm2.bias', 'image_encoder.trunk.blocks.12.norm2.weight', 'image_encoder.trunk.blocks.14.norm1.bias', 'image_encoder.trunk.blocks.32.norm1.bias', 'image_encoder.trunk.blocks.23.norm2.bias', 'image_encoder.trunk.blocks.38.norm2.weight', 'image_encoder.trunk.blocks.22.norm2.bias', 'image_encoder.trunk.blocks.40.norm1.bias', 'image_encoder.trunk.blocks.5.norm1.bias', 'image_encoder.trunk.blocks.14.norm1.weight', 'image_encoder.trunk.blocks.25.norm2.weight', 'image_encoder.trunk.blocks.33.norm1.bias', 'image_encoder.trunk.blocks.24.norm2.bias', 'image_encoder.trunk.blocks.34.norm2.bias', 'image_encoder.trunk.blocks.42.norm2.bias', 'sam_mask_decoder.transformer.layers.0.norm4.bias', 'image_encoder.trunk.blocks.29.norm1.bias', 'image_encoder.trunk.blocks.15.norm1.bias', 'image_encoder.trunk.blocks.37.norm2.weight', 'image_encoder.trunk.blocks.3.norm1.weight', 'image_encoder.trunk.blocks.24.norm1.bias', 'image_encoder.trunk.blocks.5.norm1.weight', 'image_encoder.trunk.blocks.21.norm1.weight', 'image_encoder.trunk.blocks.4.norm2.weight', 'image_encoder.trunk.blocks.36.norm1.bias', 'image_encoder.trunk.blocks.9.norm2.weight', 'image_encoder.trunk.blocks.2.norm2.bias', 'image_encoder.trunk.blocks.16.norm1.weight', 'image_encoder.trunk.blocks.34.norm1.weight', 'image_encoder.trunk.blocks.44.norm1.bias', 'image_encoder.trunk.blocks.45.norm1.weight', 'sam_mask_decoder.transformer.layers.0.norm3.bias', 'image_encoder.trunk.blocks.25.norm1.weight', 'memory_attention.layers.0.norm1.weight', 'image_encoder.trunk.blocks.10.norm2.weight', 'image_encoder.trunk.blocks.27.norm2.weight', 'image_encoder.trunk.blocks.28.norm1.weight', 'image_encoder.trunk.blocks.38.norm2.bias', 'image_encoder.trunk.blocks.35.norm2.weight', 
'image_encoder.trunk.blocks.35.norm2.bias', 'image_encoder.trunk.blocks.22.norm1.weight'} INFO 2025-01-05 01:22:39,473 sam2_datasets.py: 125: Dataset mixing probabilities: [1.0] INFO 2025-01-05 01:22:40,065 trainer.py: 417: Loading pretrained checkpoint from {'_partial_': True, '_target_': 'training.utils.checkpoint_utils.load_state_dict_into_model', 'strict': True, 'ignore_unexpected_keys': None, 'ignore_missing_keys': None, 'state_dict': {'_target_': 'training.utils.checkpoint_utils.load_checkpoint_and_apply_kernels', 'checkpoint_path': '/home/hossein/hossein/projects/sam2/checkpoints/sam2.1_hiera_large.pt', 'ckpt_state_dict_keys': ['model']}} INFO 2025-01-05 01:22:52,653 train_utils.py: 271: Train Epoch: [0][ 0/35] | Batch Time: 11.17 (11.17) | Data Time: 4.56 (4.56) | Mem (GB): 66.00 (66.00/66.00) | Time Elapsed: 00d 00h 00m | Losses/train_all_loss: 1.31e+01 (1.31e+01) INFO 2025-01-05 01:23:07,801 train_utils.py: 271: Train Epoch: [0][10/35] | Batch Time: 1.49 (2.39) | Data Time: 0.00 (0.41) | Mem (GB): 73.00 (70.64/76.00) | Time Elapsed: 00d 00h 00m | Losses/train_all_loss: 2.71e+01 (1.65e+01) INFO 2025-01-05 01:23:22,036 train_utils.py: 271: Train Epoch: [0][20/35] | Batch Time: 1.41 (1.93) | Data Time: 0.00 (0.22) | Mem (GB): 70.00 (70.71/76.00) | Time Elapsed: 00d 00h 00m | Losses/train_all_loss: 8.33e+00 (1.51e+01) INFO 2025-01-05 01:23:36,881 train_utils.py: 271: Train Epoch: [0][30/35] | Batch Time: 1.40 (1.79) | Data Time: 0.00 (0.15) | Mem (GB): 70.00 (71.32/76.00) | Time Elapsed: 00d 00h 01m | Losses/train_all_loss: 1.04e+01 (1.61e+01) INFO 2025-01-05 01:23:43,360 trainer.py: 950: Estimated time remaining: 00d 00h 39m INFO 2025-01-05 01:23:43,538 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:23:43,539 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 15.52312662942069, 'Losses/train_all_loss_mask': 0.03629592404467985, 'Losses/train_all_loss_dice': 10.42937673841204, 'Losses/train_all_loss_iou': 1.4414258205597954, 
'Losses/train_all_loss_class': 2.9264054538948194, 'Losses/train_all_core_loss': 15.52312662942069, 'Trainer/where': 0.024285714285714285, 'Trainer/epoch': 0, 'Trainer/steps_train': 35} INFO 2025-01-05 01:23:56,650 train_utils.py: 271: Train Epoch: [1][ 0/35] | Batch Time: 7.71 (7.71) | Data Time: 4.38 (4.38) | Mem (GB): 75.00 (75.00/75.00) | Time Elapsed: 00d 00h 01m | Losses/train_all_loss: 2.39e+01 (2.39e+01) INFO 2025-01-05 01:24:10,843 train_utils.py: 271: Train Epoch: [1][10/35] | Batch Time: 1.54 (1.99) | Data Time: 0.00 (0.40) | Mem (GB): 74.00 (70.73/76.00) | Time Elapsed: 00d 00h 01m | Losses/train_all_loss: 2.55e+01 (1.39e+01) INFO 2025-01-05 01:24:25,315 train_utils.py: 271: Train Epoch: [1][20/35] | Batch Time: 1.40 (1.73) | Data Time: 0.00 (0.21) | Mem (GB): 70.00 (70.90/76.00) | Time Elapsed: 00d 00h 01m | Losses/train_all_loss: 6.09e+00 (1.44e+01) INFO 2025-01-05 01:24:39,832 train_utils.py: 271: Train Epoch: [1][30/35] | Batch Time: 1.54 (1.64) | Data Time: 0.00 (0.14) | Mem (GB): 74.00 (70.97/76.00) | Time Elapsed: 00d 00h 02m | Losses/train_all_loss: 1.85e+01 (1.42e+01) INFO 2025-01-05 01:24:46,129 trainer.py: 950: Estimated time remaining: 00d 00h 35m INFO 2025-01-05 01:24:46,315 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:24:46,316 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 13.35472868510655, 'Losses/train_all_loss_mask': 0.021235573248954358, 'Losses/train_all_loss_dice': 10.641365950448172, 'Losses/train_all_loss_iou': 0.7040971059019544, 'Losses/train_all_loss_class': 1.5845541834831238, 'Losses/train_all_core_loss': 13.35472868510655, 'Trainer/where': 0.04928571428571429, 'Trainer/epoch': 1, 'Trainer/steps_train': 70} INFO 2025-01-05 01:25:00,279 train_utils.py: 271: Train Epoch: [2][ 0/35] | Batch Time: 8.04 (8.04) | Data Time: 5.30 (5.30) | Mem (GB): 70.00 (70.00/70.00) | Time Elapsed: 00d 00h 02m | Losses/train_all_loss: 6.94e+00 (6.94e+00) INFO 2025-01-05 01:25:14,870 train_utils.py: 271: Train Epoch: 
[2][10/35] | Batch Time: 1.40 (2.06) | Data Time: 0.00 (0.48) | Mem (GB): 70.00 (71.64/76.00) | Time Elapsed: 00d 00h 02m | Losses/train_all_loss: 8.49e+00 (1.33e+01) INFO 2025-01-05 01:25:29,394 train_utils.py: 271: Train Epoch: [2][20/35] | Batch Time: 1.37 (1.77) | Data Time: 0.00 (0.25) | Mem (GB): 68.00 (71.57/76.00) | Time Elapsed: 00d 00h 02m | Losses/train_all_loss: 7.55e+00 (1.37e+01) INFO 2025-01-05 01:25:43,721 train_utils.py: 271: Train Epoch: [2][30/35] | Batch Time: 1.57 (1.66) | Data Time: 0.00 (0.17) | Mem (GB): 76.00 (71.35/76.00) | Time Elapsed: 00d 00h 03m | Losses/train_all_loss: 2.21e+01 (1.31e+01) INFO 2025-01-05 01:25:50,212 trainer.py: 950: Estimated time remaining: 00d 00h 35m INFO 2025-01-05 01:25:50,312 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:25:50,312 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 13.056742286682129, 'Losses/train_all_loss_mask': 0.011416948284854048, 'Losses/train_all_loss_dice': 10.823509979248048, 'Losses/train_all_loss_iou': 0.5041234560734925, 'Losses/train_all_loss_class': 1.500769633267607, 'Losses/train_all_core_loss': 13.056742286682129, 'Trainer/where': 0.07428571428571429, 'Trainer/epoch': 2, 'Trainer/steps_train': 105} INFO 2025-01-05 01:26:04,421 train_utils.py: 271: Train Epoch: [3][ 0/35] | Batch Time: 8.28 (8.28) | Data Time: 6.68 (6.68) | Mem (GB): 76.00 (76.00/76.00) | Time Elapsed: 00d 00h 03m | Losses/train_all_loss: 1.64e+01 (1.64e+01) INFO 2025-01-05 01:26:19,204 train_utils.py: 271: Train Epoch: [3][10/35] | Batch Time: 1.40 (2.10) | Data Time: 0.00 (0.61) | Mem (GB): 70.00 (72.82/76.00) | Time Elapsed: 00d 00h 03m | Losses/train_all_loss: 7.94e+00 (1.38e+01) INFO 2025-01-05 01:26:33,738 train_utils.py: 271: Train Epoch: [3][20/35] | Batch Time: 1.60 (1.79) | Data Time: 0.00 (0.32) | Mem (GB): 76.00 (72.19/76.00) | Time Elapsed: 00d 00h 03m | Losses/train_all_loss: 1.80e+01 (1.25e+01) INFO 2025-01-05 01:26:48,369 train_utils.py: 271: Train Epoch: [3][30/35] | Batch 
Time: 1.36 (1.68) | Data Time: 0.00 (0.22) | Mem (GB): 68.00 (72.03/76.00) | Time Elapsed: 00d 00h 04m | Losses/train_all_loss: 6.92e+00 (1.28e+01) INFO 2025-01-05 01:26:55,159 trainer.py: 950: Estimated time remaining: 00d 00h 34m INFO 2025-01-05 01:26:55,326 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:26:55,326 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 13.252275030953543, 'Losses/train_all_loss_mask': 0.0038689142230266173, 'Losses/train_all_loss_dice': 10.98888555254255, 'Losses/train_all_loss_iou': 0.22835017135366797, 'Losses/train_all_loss_class': 1.9576609894633292, 'Losses/train_all_core_loss': 13.252275030953543, 'Trainer/where': 0.09928571428571428, 'Trainer/epoch': 3, 'Trainer/steps_train': 140} INFO 2025-01-05 01:27:09,367 train_utils.py: 271: Train Epoch: [4][ 0/35] | Batch Time: 8.00 (8.00) | Data Time: 5.87 (5.87) | Mem (GB): 73.00 (73.00/73.00) | Time Elapsed: 00d 00h 04m | Losses/train_all_loss: 1.62e+01 (1.62e+01) INFO 2025-01-05 01:27:23,675 train_utils.py: 271: Train Epoch: [4][10/35] | Batch Time: 1.36 (2.03) | Data Time: 0.00 (0.53) | Mem (GB): 68.00 (71.00/76.00) | Time Elapsed: 00d 00h 04m | Losses/train_all_loss: 7.66e+00 (1.21e+01) INFO 2025-01-05 01:27:38,600 train_utils.py: 271: Train Epoch: [4][20/35] | Batch Time: 1.40 (1.77) | Data Time: 0.00 (0.28) | Mem (GB): 70.00 (71.76/76.00) | Time Elapsed: 00d 00h 05m | Losses/train_all_loss: 7.23e+00 (1.32e+01) INFO 2025-01-05 01:27:52,969 train_utils.py: 271: Train Epoch: [4][30/35] | Batch Time: 1.36 (1.66) | Data Time: 0.00 (0.19) | Mem (GB): 68.00 (71.55/76.00) | Time Elapsed: 00d 00h 05m | Losses/train_all_loss: 7.15e+00 (1.24e+01) INFO 2025-01-05 01:27:59,848 trainer.py: 950: Estimated time remaining: 00d 00h 33m INFO 2025-01-05 01:27:59,862 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:27:59,862 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 13.165853296007429, 'Losses/train_all_loss_mask': 0.003397888245152509, 
'Losses/train_all_loss_dice': 10.91664868763515, 'Losses/train_all_loss_iou': 0.17255155175392117, 'Losses/train_all_loss_class': 2.00869502723217, 'Losses/train_all_core_loss': 13.165853296007429, 'Trainer/where': 0.12428571428571429, 'Trainer/epoch': 4, 'Trainer/steps_train': 175} INFO 2025-01-05 01:28:12,814 train_utils.py: 271: Train Epoch: [5][ 0/35] | Batch Time: 6.90 (6.90) | Data Time: 5.45 (5.45) | Mem (GB): 68.00 (68.00/68.00) | Time Elapsed: 00d 00h 05m | Losses/train_all_loss: 7.59e+00 (7.59e+00) INFO 2025-01-05 01:28:26,898 train_utils.py: 271: Train Epoch: [5][10/35] | Batch Time: 1.36 (1.91) | Data Time: 0.00 (0.50) | Mem (GB): 68.00 (69.91/76.00) | Time Elapsed: 00d 00h 05m | Losses/train_all_loss: 7.05e+00 (9.02e+00) INFO 2025-01-05 01:28:41,630 train_utils.py: 271: Train Epoch: [5][20/35] | Batch Time: 1.49 (1.70) | Data Time: 0.00 (0.26) | Mem (GB): 73.00 (70.95/76.00) | Time Elapsed: 00d 00h 06m | Losses/train_all_loss: 1.47e+01 (1.09e+01) INFO 2025-01-05 01:28:56,687 train_utils.py: 271: Train Epoch: [5][30/35] | Batch Time: 1.58 (1.64) | Data Time: 0.00 (0.18) | Mem (GB): 76.00 (71.74/76.00) | Time Elapsed: 00d 00h 06m | Losses/train_all_loss: 2.02e+01 (1.24e+01) INFO 2025-01-05 01:29:03,238 trainer.py: 950: Estimated time remaining: 00d 00h 31m INFO 2025-01-05 01:29:03,343 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:29:03,343 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 12.23187631879534, 'Losses/train_all_loss_mask': 0.003290402367877375, 'Losses/train_all_loss_dice': 10.583557428632464, 'Losses/train_all_loss_iou': 0.1472343360522895, 'Losses/train_all_loss_class': 1.4352764577205692, 'Losses/train_all_core_loss': 12.23187631879534, 'Trainer/where': 0.1492857142857143, 'Trainer/epoch': 5, 'Trainer/steps_train': 210} INFO 2025-01-05 01:29:16,905 train_utils.py: 271: Train Epoch: [6][ 0/35] | Batch Time: 7.48 (7.48) | Data Time: 4.87 (4.87) | Mem (GB): 70.00 (70.00/70.00) | Time Elapsed: 00d 00h 06m | 
Losses/train_all_loss: 6.44e+00 (6.44e+00) INFO 2025-01-05 01:29:31,439 train_utils.py: 271: Train Epoch: [6][10/35] | Batch Time: 1.41 (2.00) | Data Time: 0.00 (0.44) | Mem (GB): 70.00 (71.27/76.00) | Time Elapsed: 00d 00h 06m | Losses/train_all_loss: 6.89e+00 (1.19e+01) INFO 2025-01-05 01:29:46,043 train_utils.py: 271: Train Epoch: [6][20/35] | Batch Time: 1.40 (1.74) | Data Time: 0.00 (0.23) | Mem (GB): 70.00 (71.29/76.00) | Time Elapsed: 00d 00h 07m | Losses/train_all_loss: 7.80e+00 (1.21e+01) INFO 2025-01-05 01:30:00,876 train_utils.py: 271: Train Epoch: [6][30/35] | Batch Time: 1.58 (1.66) | Data Time: 0.00 (0.16) | Mem (GB): 76.00 (71.55/76.00) | Time Elapsed: 00d 00h 07m | Losses/train_all_loss: 2.16e+01 (1.25e+01) INFO 2025-01-05 01:30:07,453 trainer.py: 950: Estimated time remaining: 00d 00h 31m INFO 2025-01-05 01:30:07,626 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:30:07,626 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 12.532559653690884, 'Losses/train_all_loss_mask': 0.00453987946135125, 'Losses/train_all_loss_dice': 10.932144478389196, 'Losses/train_all_loss_iou': 0.11682707172585652, 'Losses/train_all_loss_class': 1.3927903246666704, 'Losses/train_all_core_loss': 12.532559653690884, 'Trainer/where': 0.1742857142857143, 'Trainer/epoch': 6, 'Trainer/steps_train': 245} INFO 2025-01-05 01:30:21,890 train_utils.py: 271: Train Epoch: [7][ 0/35] | Batch Time: 8.45 (8.45) | Data Time: 5.57 (5.57) | Mem (GB): 73.00 (73.00/73.00) | Time Elapsed: 00d 00h 07m | Losses/train_all_loss: 1.46e+01 (1.46e+01) INFO 2025-01-05 01:30:37,146 train_utils.py: 271: Train Epoch: [7][10/35] | Batch Time: 1.58 (2.16) | Data Time: 0.00 (0.51) | Mem (GB): 76.00 (73.73/76.00) | Time Elapsed: 00d 00h 08m | Losses/train_all_loss: 1.69e+01 (1.61e+01) INFO 2025-01-05 01:30:51,058 train_utils.py: 271: Train Epoch: [7][20/35] | Batch Time: 1.42 (1.79) | Data Time: 0.00 (0.27) | Mem (GB): 70.00 (71.67/76.00) | Time Elapsed: 00d 00h 08m | 
Losses/train_all_loss: 7.36e+00 (1.19e+01) INFO 2025-01-05 01:31:05,720 train_utils.py: 271: Train Epoch: [7][30/35] | Batch Time: 1.58 (1.69) | Data Time: 0.00 (0.18) | Mem (GB): 76.00 (71.74/76.00) | Time Elapsed: 00d 00h 08m | Losses/train_all_loss: 1.68e+01 (1.20e+01) INFO 2025-01-05 01:31:12,221 trainer.py: 950: Estimated time remaining: 00d 00h 30m INFO 2025-01-05 01:31:12,559 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:31:12,559 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 12.028561905452184, 'Losses/train_all_loss_mask': 0.003908876593257966, 'Losses/train_all_loss_dice': 9.698760529926846, 'Losses/train_all_loss_iou': 0.10878030185870427, 'Losses/train_all_loss_class': 2.1428432097658514, 'Losses/train_all_core_loss': 12.028561905452184, 'Trainer/where': 0.1992857142857143, 'Trainer/epoch': 7, 'Trainer/steps_train': 280} INFO 2025-01-05 01:31:26,363 train_utils.py: 271: Train Epoch: [8][ 0/35] | Batch Time: 7.98 (7.98) | Data Time: 6.62 (6.62) | Mem (GB): 68.00 (68.00/68.00) | Time Elapsed: 00d 00h 08m | Losses/train_all_loss: 6.89e+00 (6.89e+00) INFO 2025-01-05 01:31:41,435 train_utils.py: 271: Train Epoch: [8][10/35] | Batch Time: 1.57 (2.10) | Data Time: 0.00 (0.60) | Mem (GB): 76.00 (73.00/76.00) | Time Elapsed: 00d 00h 09m | Losses/train_all_loss: 1.47e+01 (1.41e+01) INFO 2025-01-05 01:31:56,099 train_utils.py: 271: Train Epoch: [8][20/35] | Batch Time: 1.49 (1.80) | Data Time: 0.00 (0.32) | Mem (GB): 73.00 (72.38/76.00) | Time Elapsed: 00d 00h 09m | Losses/train_all_loss: 1.33e+01 (1.33e+01) INFO 2025-01-05 01:32:10,587 train_utils.py: 271: Train Epoch: [8][30/35] | Batch Time: 1.40 (1.68) | Data Time: 0.00 (0.21) | Mem (GB): 70.00 (72.00/76.00) | Time Elapsed: 00d 00h 09m | Losses/train_all_loss: 8.46e+00 (1.30e+01) INFO 2025-01-05 01:32:17,073 trainer.py: 950: Estimated time remaining: 00d 00h 29m INFO 2025-01-05 01:32:17,348 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:32:17,348 trainer.py: 830: Losses and 
meters: {'Losses/train_all_loss': 12.847667748587472, 'Losses/train_all_loss_mask': 0.002982073316317318, 'Losses/train_all_loss_dice': 11.120562178747994, 'Losses/train_all_loss_iou': 0.07551417656109802, 'Losses/train_all_loss_class': 1.5919497845993777, 'Losses/train_all_core_loss': 12.847667748587472, 'Trainer/where': 0.22428571428571428, 'Trainer/epoch': 8, 'Trainer/steps_train': 315} INFO 2025-01-05 01:32:31,049 train_utils.py: 271: Train Epoch: [9][ 0/35] | Batch Time: 7.90 (7.90) | Data Time: 5.14 (5.14) | Mem (GB): 73.00 (73.00/73.00) | Time Elapsed: 00d 00h 09m | Losses/train_all_loss: 1.28e+01 (1.28e+01) INFO 2025-01-05 01:32:45,100 train_utils.py: 271: Train Epoch: [9][10/35] | Batch Time: 1.40 (2.00) | Data Time: 0.00 (0.47) | Mem (GB): 70.00 (70.27/74.00) | Time Elapsed: 00d 00h 10m | Losses/train_all_loss: 7.68e+00 (9.11e+00) INFO 2025-01-05 01:32:59,618 train_utils.py: 271: Train Epoch: [9][20/35] | Batch Time: 1.55 (1.74) | Data Time: 0.00 (0.25) | Mem (GB): 74.00 (70.81/76.00) | Time Elapsed: 00d 00h 10m | Losses/train_all_loss: 2.17e+01 (9.99e+00) INFO 2025-01-05 01:33:14,296 train_utils.py: 271: Train Epoch: [9][30/35] | Batch Time: 1.40 (1.65) | Data Time: 0.00 (0.17) | Mem (GB): 70.00 (71.10/76.00) | Time Elapsed: 00d 00h 10m | Losses/train_all_loss: 7.30e+00 (1.08e+01) INFO 2025-01-05 01:33:20,775 trainer.py: 950: Estimated time remaining: 00d 00h 28m INFO 2025-01-05 01:33:20,909 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:33:20,909 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 11.098733384268625, 'Losses/train_all_loss_mask': 0.007828620975903635, 'Losses/train_all_loss_dice': 9.035280391148158, 'Losses/train_all_loss_iou': 0.10779441692700077, 'Losses/train_all_loss_class': 1.7990862643612282, 'Losses/train_all_core_loss': 11.098733384268625, 'Trainer/where': 0.24928571428571428, 'Trainer/epoch': 9, 'Trainer/steps_train': 350} INFO 2025-01-05 01:33:33,917 train_utils.py: 271: Train Epoch: [10][ 0/35] | Batch 
Time: 7.21 (7.21) | Data Time: 4.25 (4.25) | Mem (GB): 70.00 (70.00/70.00) | Time Elapsed: 00d 00h 10m | Losses/train_all_loss: 7.59e+00 (7.59e+00) INFO 2025-01-05 01:33:48,619 train_utils.py: 271: Train Epoch: [10][10/35] | Batch Time: 1.38 (1.99) | Data Time: 0.00 (0.39) | Mem (GB): 68.00 (70.91/76.00) | Time Elapsed: 00d 00h 11m | Losses/train_all_loss: 6.61e+00 (1.09e+01) INFO 2025-01-05 01:34:03,482 train_utils.py: 271: Train Epoch: [10][20/35] | Batch Time: 1.61 (1.75) | Data Time: 0.00 (0.20) | Mem (GB): 76.00 (71.05/76.00) | Time Elapsed: 00d 00h 11m | Losses/train_all_loss: 2.03e+01 (1.11e+01) INFO 2025-01-05 01:34:18,132 train_utils.py: 271: Train Epoch: [10][30/35] | Batch Time: 1.62 (1.66) | Data Time: 0.00 (0.14) | Mem (GB): 76.00 (71.06/76.00) | Time Elapsed: 00d 00h 11m | Losses/train_all_loss: 1.17e+01 (1.05e+01) INFO 2025-01-05 01:34:24,932 trainer.py: 950: Estimated time remaining: 00d 00h 27m INFO 2025-01-05 01:34:24,946 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:34:24,946 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 10.939446571895054, 'Losses/train_all_loss_mask': 0.003415139138399224, 'Losses/train_all_loss_dice': 9.397412940434046, 'Losses/train_all_loss_iou': 0.050908478014337434, 'Losses/train_all_loss_class': 1.4228219677827187, 'Losses/train_all_core_loss': 10.939446571895054, 'Trainer/where': 0.2742857142857143, 'Trainer/epoch': 10, 'Trainer/steps_train': 385} INFO 2025-01-05 01:34:39,063 train_utils.py: 271: Train Epoch: [11][ 0/35] | Batch Time: 8.37 (8.37) | Data Time: 4.46 (4.46) | Mem (GB): 70.00 (70.00/70.00) | Time Elapsed: 00d 00h 12m | Losses/train_all_loss: 7.17e+00 (7.17e+00) INFO 2025-01-05 01:34:53,586 train_utils.py: 271: Train Epoch: [11][10/35] | Batch Time: 1.40 (2.08) | Data Time: 0.00 (0.41) | Mem (GB): 70.00 (71.45/76.00) | Time Elapsed: 00d 00h 12m | Losses/train_all_loss: 7.02e+00 (1.06e+01) INFO 2025-01-05 01:35:08,207 train_utils.py: 271: Train Epoch: [11][20/35] | Batch Time: 1.57 
(1.79) | Data Time: 0.00 (0.21) | Mem (GB): 76.00 (71.57/76.00) | Time Elapsed: 00d 00h 12m | Losses/train_all_loss: 1.89e+01 (1.16e+01) INFO 2025-01-05 01:35:22,757 train_utils.py: 271: Train Epoch: [11][30/35] | Batch Time: 1.58 (1.68) | Data Time: 0.00 (0.14) | Mem (GB): 76.00 (71.61/76.00) | Time Elapsed: 00d 00h 12m | Losses/train_all_loss: 1.62e+01 (1.16e+01) INFO 2025-01-05 01:35:29,205 trainer.py: 950: Estimated time remaining: 00d 00h 26m INFO 2025-01-05 01:35:29,464 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:35:29,465 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 11.260014452253069, 'Losses/train_all_loss_mask': 0.0037810460206986005, 'Losses/train_all_loss_dice': 9.496538509641375, 'Losses/train_all_loss_iou': 0.06279493517223662, 'Losses/train_all_loss_class': 1.6250600540411793, 'Losses/train_all_core_loss': 11.260014452253069, 'Trainer/where': 0.29928571428571427, 'Trainer/epoch': 11, 'Trainer/steps_train': 420} INFO 2025-01-05 01:35:42,519 train_utils.py: 271: Train Epoch: [12][ 0/35] | Batch Time: 7.29 (7.29) | Data Time: 5.11 (5.11) | Mem (GB): 68.00 (68.00/68.00) | Time Elapsed: 00d 00h 13m | Losses/train_all_loss: 7.20e+00 (7.20e+00) INFO 2025-01-05 01:35:56,994 train_utils.py: 271: Train Epoch: [12][10/35] | Batch Time: 1.60 (1.98) | Data Time: 0.00 (0.47) | Mem (GB): 74.00 (70.09/75.00) | Time Elapsed: 00d 00h 13m | Losses/train_all_loss: 1.77e+01 (9.75e+00) INFO 2025-01-05 01:36:12,516 train_utils.py: 271: Train Epoch: [12][20/35] | Batch Time: 1.62 (1.78) | Data Time: 0.00 (0.24) | Mem (GB): 74.00 (71.48/76.00) | Time Elapsed: 00d 00h 13m | Losses/train_all_loss: 1.57e+01 (1.26e+01) INFO 2025-01-05 01:36:27,852 train_utils.py: 271: Train Epoch: [12][30/35] | Batch Time: 1.60 (1.70) | Data Time: 0.00 (0.17) | Mem (GB): 74.00 (71.81/76.00) | Time Elapsed: 00d 00h 13m | Losses/train_all_loss: 1.69e+01 (1.33e+01) INFO 2025-01-05 01:36:34,253 trainer.py: 950: Estimated time remaining: 00d 00h 26m INFO 2025-01-05 
01:36:34,303 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:36:34,303 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 12.587461267198835, 'Losses/train_all_loss_mask': 0.005486930625712765, 'Losses/train_all_loss_dice': 10.807248279026576, 'Losses/train_all_loss_iou': 0.12015448743186426, 'Losses/train_all_loss_class': 1.550319859199226, 'Losses/train_all_core_loss': 12.587461267198835, 'Trainer/where': 0.3242857142857143, 'Trainer/epoch': 12, 'Trainer/steps_train': 455} INFO 2025-01-05 01:36:47,389 train_utils.py: 271: Train Epoch: [13][ 0/35] | Batch Time: 7.32 (7.32) | Data Time: 4.98 (4.98) | Mem (GB): 68.00 (68.00/68.00) | Time Elapsed: 00d 00h 14m | Losses/train_all_loss: 5.17e+00 (5.17e+00) INFO 2025-01-05 01:37:02,110 train_utils.py: 271: Train Epoch: [13][10/35] | Batch Time: 1.37 (2.00) | Data Time: 0.00 (0.45) | Mem (GB): 68.00 (71.55/76.00) | Time Elapsed: 00d 00h 14m | Losses/train_all_loss: 7.16e+00 (1.27e+01) INFO 2025-01-05 01:37:16,848 train_utils.py: 271: Train Epoch: [13][20/35] | Batch Time: 1.57 (1.75) | Data Time: 0.00 (0.24) | Mem (GB): 74.00 (71.67/76.00) | Time Elapsed: 00d 00h 14m | Losses/train_all_loss: 1.84e+01 (1.27e+01) INFO 2025-01-05 01:37:31,431 train_utils.py: 271: Train Epoch: [13][30/35] | Batch Time: 1.40 (1.66) | Data Time: 0.00 (0.16) | Mem (GB): 70.00 (71.52/76.00) | Time Elapsed: 00d 00h 14m | Losses/train_all_loss: 7.24e+00 (1.24e+01) INFO 2025-01-05 01:37:38,130 trainer.py: 950: Estimated time remaining: 00d 00h 24m INFO 2025-01-05 01:37:38,488 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:37:38,488 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 12.622871085575648, 'Losses/train_all_loss_mask': 0.0038172564951569907, 'Losses/train_all_loss_dice': 11.162222140175956, 'Losses/train_all_loss_iou': 0.05127683051479315, 'Losses/train_all_loss_class': 1.3330271471424826, 'Losses/train_all_core_loss': 12.622871085575648, 'Trainer/where': 0.3492857142857143, 'Trainer/epoch': 13, 
'Trainer/steps_train': 490} INFO 2025-01-05 01:37:52,530 train_utils.py: 271: Train Epoch: [14][ 0/35] | Batch Time: 8.27 (8.27) | Data Time: 6.69 (6.69) | Mem (GB): 76.00 (76.00/76.00) | Time Elapsed: 00d 00h 15m | Losses/train_all_loss: 2.19e+01 (2.19e+01) INFO 2025-01-05 01:38:06,931 train_utils.py: 271: Train Epoch: [14][10/35] | Batch Time: 1.59 (2.06) | Data Time: 0.00 (0.61) | Mem (GB): 76.00 (71.55/76.00) | Time Elapsed: 00d 00h 15m | Losses/train_all_loss: 1.75e+01 (1.11e+01) INFO 2025-01-05 01:38:21,726 train_utils.py: 271: Train Epoch: [14][20/35] | Batch Time: 1.58 (1.78) | Data Time: 0.00 (0.32) | Mem (GB): 74.00 (71.86/76.00) | Time Elapsed: 00d 00h 15m | Losses/train_all_loss: 1.96e+01 (1.21e+01) INFO 2025-01-05 01:38:35,853 train_utils.py: 271: Train Epoch: [14][30/35] | Batch Time: 1.40 (1.66) | Data Time: 0.00 (0.22) | Mem (GB): 70.00 (71.32/76.00) | Time Elapsed: 00d 00h 16m | Losses/train_all_loss: 7.54e+00 (1.10e+01) INFO 2025-01-05 01:38:42,495 trainer.py: 950: Estimated time remaining: 00d 00h 23m INFO 2025-01-05 01:38:42,496 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:38:42,496 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 10.91449010031564, 'Losses/train_all_loss_mask': 0.0021623846517676223, 'Losses/train_all_loss_dice': 9.109422874450683, 'Losses/train_all_loss_iou': 0.03527758292870463, 'Losses/train_all_loss_class': 1.7265419553599453, 'Losses/train_all_core_loss': 10.91449010031564, 'Trainer/where': 0.3742857142857143, 'Trainer/epoch': 14, 'Trainer/steps_train': 525} INFO 2025-01-05 01:38:55,758 train_utils.py: 271: Train Epoch: [15][ 0/35] | Batch Time: 7.51 (7.51) | Data Time: 5.04 (5.04) | Mem (GB): 74.00 (74.00/74.00) | Time Elapsed: 00d 00h 16m | Losses/train_all_loss: 1.55e+01 (1.55e+01) INFO 2025-01-05 01:39:10,638 train_utils.py: 271: Train Epoch: [15][10/35] | Batch Time: 1.58 (2.04) | Data Time: 0.00 (0.46) | Mem (GB): 76.00 (72.36/76.00) | Time Elapsed: 00d 00h 16m | Losses/train_all_loss: 
1.99e+01 (1.31e+01) INFO 2025-01-05 01:39:25,632 train_utils.py: 271: Train Epoch: [15][20/35] | Batch Time: 1.40 (1.78) | Data Time: 0.00 (0.24) | Mem (GB): 70.00 (72.67/76.00) | Time Elapsed: 00d 00h 16m | Losses/train_all_loss: 6.91e+00 (1.37e+01) INFO 2025-01-05 01:39:39,756 train_utils.py: 271: Train Epoch: [15][30/35] | Batch Time: 1.40 (1.66) | Data Time: 0.00 (0.16) | Mem (GB): 70.00 (71.74/76.00) | Time Elapsed: 00d 00h 17m | Losses/train_all_loss: 7.43e+00 (1.22e+01) INFO 2025-01-05 01:39:46,525 trainer.py: 950: Estimated time remaining: 00d 00h 22m INFO 2025-01-05 01:39:46,668 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:39:46,668 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 12.423973451341901, 'Losses/train_all_loss_mask': 0.002862129371247387, 'Losses/train_all_loss_dice': 10.686428315298897, 'Losses/train_all_loss_iou': 0.07063553876796505, 'Losses/train_all_loss_class': 1.6096669961332477, 'Losses/train_all_core_loss': 12.423973451341901, 'Trainer/where': 0.3992857142857143, 'Trainer/epoch': 15, 'Trainer/steps_train': 560} INFO 2025-01-05 01:40:00,865 train_utils.py: 271: Train Epoch: [16][ 0/35] | Batch Time: 8.46 (8.46) | Data Time: 5.41 (5.41) | Mem (GB): 74.00 (74.00/74.00) | Time Elapsed: 00d 00h 17m | Losses/train_all_loss: 1.64e+01 (1.64e+01) INFO 2025-01-05 01:40:15,802 train_utils.py: 271: Train Epoch: [16][10/35] | Batch Time: 1.58 (2.13) | Data Time: 0.00 (0.49) | Mem (GB): 76.00 (72.91/76.00) | Time Elapsed: 00d 00h 17m | Losses/train_all_loss: 2.13e+01 (1.46e+01) INFO 2025-01-05 01:40:30,833 train_utils.py: 271: Train Epoch: [16][20/35] | Batch Time: 1.53 (1.83) | Data Time: 0.00 (0.26) | Mem (GB): 75.00 (73.05/76.00) | Time Elapsed: 00d 00h 17m | Losses/train_all_loss: 1.99e+01 (1.46e+01) INFO 2025-01-05 01:40:45,863 train_utils.py: 271: Train Epoch: [16][30/35] | Batch Time: 1.58 (1.72) | Data Time: 0.00 (0.18) | Mem (GB): 76.00 (73.10/76.00) | Time Elapsed: 00d 00h 18m | Losses/train_all_loss: 1.78e+01 
(1.42e+01) INFO 2025-01-05 01:40:52,172 trainer.py: 950: Estimated time remaining: 00d 00h 22m INFO 2025-01-05 01:40:52,628 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:40:52,628 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 13.769169889177595, 'Losses/train_all_loss_mask': 0.004980977749385472, 'Losses/train_all_loss_dice': 12.173356628417968, 'Losses/train_all_loss_iou': 0.042235918748857716, 'Losses/train_all_loss_class': 1.4539576956204006, 'Losses/train_all_core_loss': 13.769169889177595, 'Trainer/where': 0.42428571428571427, 'Trainer/epoch': 16, 'Trainer/steps_train': 595} INFO 2025-01-05 01:41:06,828 train_utils.py: 271: Train Epoch: [17][ 0/35] | Batch Time: 8.47 (8.47) | Data Time: 6.70 (6.70) | Mem (GB): 74.00 (74.00/74.00) | Time Elapsed: 00d 00h 18m | Losses/train_all_loss: 1.54e+01 (1.54e+01) INFO 2025-01-05 01:41:21,477 train_utils.py: 271: Train Epoch: [17][10/35] | Batch Time: 1.56 (2.10) | Data Time: 0.00 (0.61) | Mem (GB): 74.00 (71.82/74.00) | Time Elapsed: 00d 00h 18m | Losses/train_all_loss: 1.65e+01 (1.34e+01) INFO 2025-01-05 01:41:35,620 train_utils.py: 271: Train Epoch: [17][20/35] | Batch Time: 1.36 (1.77) | Data Time: 0.00 (0.32) | Mem (GB): 68.00 (71.10/76.00) | Time Elapsed: 00d 00h 19m | Losses/train_all_loss: 6.62e+00 (1.16e+01) INFO 2025-01-05 01:41:50,281 train_utils.py: 271: Train Epoch: [17][30/35] | Batch Time: 1.36 (1.68) | Data Time: 0.00 (0.22) | Mem (GB): 68.00 (71.35/76.00) | Time Elapsed: 00d 00h 19m | Losses/train_all_loss: 6.77e+00 (1.15e+01) INFO 2025-01-05 01:41:56,728 trainer.py: 950: Estimated time remaining: 00d 00h 21m INFO 2025-01-05 01:41:56,759 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:41:56,760 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 11.263181223188127, 'Losses/train_all_loss_mask': 0.0019078197024230447, 'Losses/train_all_loss_dice': 9.721550485066006, 'Losses/train_all_loss_iou': 0.0294029886584862, 'Losses/train_all_loss_class': 
1.4740716303233057, 'Losses/train_all_core_loss': 11.263181223188127, 'Trainer/where': 0.4492857142857143, 'Trainer/epoch': 17, 'Trainer/steps_train': 630} INFO 2025-01-05 01:42:10,672 train_utils.py: 271: Train Epoch: [18][ 0/35] | Batch Time: 8.17 (8.17) | Data Time: 5.70 (5.70) | Mem (GB): 68.00 (68.00/68.00) | Time Elapsed: 00d 00h 19m | Losses/train_all_loss: 6.68e+00 (6.68e+00) INFO 2025-01-05 01:42:25,749 train_utils.py: 271: Train Epoch: [18][10/35] | Batch Time: 1.60 (2.11) | Data Time: 0.00 (0.52) | Mem (GB): 76.00 (73.00/76.00) | Time Elapsed: 00d 00h 19m | Losses/train_all_loss: 2.10e+01 (1.35e+01) INFO 2025-01-05 01:42:40,785 train_utils.py: 271: Train Epoch: [18][20/35] | Batch Time: 1.40 (1.82) | Data Time: 0.00 (0.27) | Mem (GB): 70.00 (72.86/76.00) | Time Elapsed: 00d 00h 20m | Losses/train_all_loss: 6.28e+00 (1.36e+01) INFO 2025-01-05 01:42:55,466 train_utils.py: 271: Train Epoch: [18][30/35] | Batch Time: 1.40 (1.71) | Data Time: 0.00 (0.18) | Mem (GB): 70.00 (72.32/76.00) | Time Elapsed: 00d 00h 20m | Losses/train_all_loss: 6.07e+00 (1.30e+01) INFO 2025-01-05 01:43:02,292 trainer.py: 950: Estimated time remaining: 00d 00h 20m INFO 2025-01-05 01:43:02,463 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:43:02,464 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 12.981634317125593, 'Losses/train_all_loss_mask': 0.010148391163342499, 'Losses/train_all_loss_dice': 11.691525718144009, 'Losses/train_all_loss_iou': 0.06949597106098995, 'Losses/train_all_loss_class': 1.0176445866269725, 'Losses/train_all_core_loss': 12.981634317125593, 'Trainer/where': 0.4742857142857143, 'Trainer/epoch': 18, 'Trainer/steps_train': 665} INFO 2025-01-05 01:43:16,725 train_utils.py: 271: Train Epoch: [19][ 0/35] | Batch Time: 8.54 (8.54) | Data Time: 5.28 (5.28) | Mem (GB): 74.00 (74.00/74.00) | Time Elapsed: 00d 00h 20m | Losses/train_all_loss: 2.04e+01 (2.04e+01) INFO 2025-01-05 01:43:31,260 train_utils.py: 271: Train Epoch: [19][10/35] | Batch 
Time: 1.59 (2.10) | Data Time: 0.00 (0.48) | Mem (GB): 74.00 (71.73/76.00) | Time Elapsed: 00d 00h 20m | Losses/train_all_loss: 1.59e+01 (1.21e+01) INFO 2025-01-05 01:43:45,888 train_utils.py: 271: Train Epoch: [19][20/35] | Batch Time: 1.56 (1.80) | Data Time: 0.00 (0.25) | Mem (GB): 75.00 (71.67/76.00) | Time Elapsed: 00d 00h 21m | Losses/train_all_loss: 1.62e+01 (1.16e+01) INFO 2025-01-05 01:44:00,183 train_utils.py: 271: Train Epoch: [19][30/35] | Batch Time: 1.40 (1.68) | Data Time: 0.00 (0.17) | Mem (GB): 70.00 (71.29/76.00) | Time Elapsed: 00d 00h 21m | Losses/train_all_loss: 6.81e+00 (1.09e+01) INFO 2025-01-05 01:44:06,856 trainer.py: 950: Estimated time remaining: 00d 00h 19m INFO 2025-01-05 01:44:06,944 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:44:06,945 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 11.267893355233328, 'Losses/train_all_loss_mask': 0.0026320721621492077, 'Losses/train_all_loss_dice': 9.648453426361083, 'Losses/train_all_loss_iou': 0.06427928070271653, 'Losses/train_all_loss_class': 1.502519281820527, 'Losses/train_all_core_loss': 11.267893355233328, 'Trainer/where': 0.4992857142857143, 'Trainer/epoch': 19, 'Trainer/steps_train': 700} INFO 2025-01-05 01:44:21,121 train_utils.py: 271: Train Epoch: [20][ 0/35] | Batch Time: 8.44 (8.44) | Data Time: 5.89 (5.89) | Mem (GB): 74.00 (74.00/74.00) | Time Elapsed: 00d 00h 21m | Losses/train_all_loss: 1.89e+01 (1.89e+01) INFO 2025-01-05 01:44:35,504 train_utils.py: 271: Train Epoch: [20][10/35] | Batch Time: 1.36 (2.07) | Data Time: 0.00 (0.54) | Mem (GB): 68.00 (71.45/76.00) | Time Elapsed: 00d 00h 22m | Losses/train_all_loss: 7.07e+00 (1.15e+01) INFO 2025-01-05 01:44:50,078 train_utils.py: 271: Train Epoch: [20][20/35] | Batch Time: 1.40 (1.78) | Data Time: 0.00 (0.28) | Mem (GB): 70.00 (71.48/76.00) | Time Elapsed: 00d 00h 22m | Losses/train_all_loss: 7.82e+00 (1.14e+01) INFO 2025-01-05 01:45:04,748 train_utils.py: 271: Train Epoch: [20][30/35] | Batch Time: 1.40 
(1.68) | Data Time: 0.00 (0.19) | Mem (GB): 70.00 (71.58/76.00) | Time Elapsed: 00d 00h 22m | Losses/train_all_loss: 6.98e+00 (1.18e+01) INFO 2025-01-05 01:45:11,618 trainer.py: 950: Estimated time remaining: 00d 00h 18m INFO 2025-01-05 01:45:11,712 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:45:11,712 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 12.060971723284041, 'Losses/train_all_loss_mask': 0.004294625139196536, 'Losses/train_all_loss_dice': 10.43238205228533, 'Losses/train_all_loss_iou': 0.0722124845030651, 'Losses/train_all_loss_class': 1.470484510040842, 'Losses/train_all_core_loss': 12.060971723284041, 'Trainer/where': 0.5242857142857142, 'Trainer/epoch': 20, 'Trainer/steps_train': 735} INFO 2025-01-05 01:45:25,092 train_utils.py: 271: Train Epoch: [21][ 0/35] | Batch Time: 7.62 (7.62) | Data Time: 5.76 (5.76) | Mem (GB): 76.00 (76.00/76.00) | Time Elapsed: 00d 00h 22m | Losses/train_all_loss: 2.02e+01 (2.02e+01) INFO 2025-01-05 01:45:39,758 train_utils.py: 271: Train Epoch: [21][10/35] | Batch Time: 1.56 (2.03) | Data Time: 0.00 (0.52) | Mem (GB): 74.00 (72.27/76.00) | Time Elapsed: 00d 00h 23m | Losses/train_all_loss: 1.54e+01 (1.31e+01) INFO 2025-01-05 01:45:53,956 train_utils.py: 271: Train Epoch: [21][20/35] | Batch Time: 1.36 (1.74) | Data Time: 0.00 (0.27) | Mem (GB): 68.00 (71.24/76.00) | Time Elapsed: 00d 00h 23m | Losses/train_all_loss: 7.29e+00 (1.14e+01) INFO 2025-01-05 01:46:08,210 train_utils.py: 271: Train Epoch: [21][30/35] | Batch Time: 1.49 (1.64) | Data Time: 0.00 (0.19) | Mem (GB): 73.00 (71.03/76.00) | Time Elapsed: 00d 00h 23m | Losses/train_all_loss: 1.16e+01 (1.09e+01) INFO 2025-01-05 01:46:15,023 trainer.py: 950: Estimated time remaining: 00d 00h 16m INFO 2025-01-05 01:46:15,101 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:46:15,101 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 10.979690497262137, 'Losses/train_all_loss_mask': 0.002396076506036999, 
'Losses/train_all_loss_dice': 9.051968785694667, 'Losses/train_all_loss_iou': 0.03635571563832595, 'Losses/train_all_loss_class': 1.8434445864254876, 'Losses/train_all_core_loss': 10.979690497262137, 'Trainer/where': 0.5492857142857143, 'Trainer/epoch': 21, 'Trainer/steps_train': 770} INFO 2025-01-05 01:46:29,260 train_utils.py: 271: Train Epoch: [22][ 0/35] | Batch Time: 8.43 (8.43) | Data Time: 5.36 (5.36) | Mem (GB): 76.00 (76.00/76.00) | Time Elapsed: 00d 00h 23m | Losses/train_all_loss: 1.24e+01 (1.24e+01) INFO 2025-01-05 01:46:43,578 train_utils.py: 271: Train Epoch: [22][10/35] | Batch Time: 1.54 (2.07) | Data Time: 0.00 (0.49) | Mem (GB): 74.00 (71.36/76.00) | Time Elapsed: 00d 00h 24m | Losses/train_all_loss: 2.04e+01 (1.07e+01) INFO 2025-01-05 01:46:58,280 train_utils.py: 271: Train Epoch: [22][20/35] | Batch Time: 1.55 (1.78) | Data Time: 0.00 (0.26) | Mem (GB): 74.00 (71.57/76.00) | Time Elapsed: 00d 00h 24m | Losses/train_all_loss: 1.89e+01 (1.15e+01) INFO 2025-01-05 01:47:12,802 train_utils.py: 271: Train Epoch: [22][30/35] | Batch Time: 1.49 (1.68) | Data Time: 0.00 (0.17) | Mem (GB): 73.00 (71.58/76.00) | Time Elapsed: 00d 00h 24m | Losses/train_all_loss: 1.22e+01 (1.15e+01) INFO 2025-01-05 01:47:19,151 trainer.py: 950: Estimated time remaining: 00d 00h 16m INFO 2025-01-05 01:47:19,397 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:47:19,397 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 10.912646198272705, 'Losses/train_all_loss_mask': 0.002020072060570653, 'Losses/train_all_loss_dice': 9.306321804864066, 'Losses/train_all_loss_iou': 0.03728623108545435, 'Losses/train_all_loss_class': 1.5286366003632013, 'Losses/train_all_core_loss': 10.912646198272705, 'Trainer/where': 0.5742857142857143, 'Trainer/epoch': 22, 'Trainer/steps_train': 805} INFO 2025-01-05 01:47:33,519 train_utils.py: 271: Train Epoch: [23][ 0/35] | Batch Time: 8.39 (8.39) | Data Time: 5.00 (5.00) | Mem (GB): 74.00 (74.00/74.00) | Time Elapsed: 00d 00h 24m | 
Losses/train_all_loss: 2.06e+01 (2.06e+01) INFO 2025-01-05 01:47:48,283 train_utils.py: 271: Train Epoch: [23][10/35] | Batch Time: 1.36 (2.10) | Data Time: 0.00 (0.45) | Mem (GB): 68.00 (72.45/75.00) | Time Elapsed: 00d 00h 25m | Losses/train_all_loss: 6.01e+00 (1.45e+01) INFO 2025-01-05 01:48:02,816 train_utils.py: 271: Train Epoch: [23][20/35] | Batch Time: 1.36 (1.79) | Data Time: 0.00 (0.24) | Mem (GB): 68.00 (71.95/76.00) | Time Elapsed: 00d 00h 25m | Losses/train_all_loss: 6.28e+00 (1.29e+01) INFO 2025-01-05 01:48:17,464 train_utils.py: 271: Train Epoch: [23][30/35] | Batch Time: 1.36 (1.69) | Data Time: 0.00 (0.16) | Mem (GB): 68.00 (71.90/76.00) | Time Elapsed: 00d 00h 25m | Losses/train_all_loss: 7.21e+00 (1.27e+01) INFO 2025-01-05 01:48:24,315 trainer.py: 950: Estimated time remaining: 00d 00h 15m INFO 2025-01-05 01:48:24,505 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:48:24,505 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 12.972393226623534, 'Losses/train_all_loss_mask': 0.0029303550138138233, 'Losses/train_all_loss_dice': 11.663762542179652, 'Losses/train_all_loss_iou': 0.03950240197342022, 'Losses/train_all_loss_class': 1.210520985183705, 'Losses/train_all_core_loss': 12.972393226623534, 'Trainer/where': 0.5992857142857143, 'Trainer/epoch': 23, 'Trainer/steps_train': 840} INFO 2025-01-05 01:48:38,387 train_utils.py: 271: Train Epoch: [24][ 0/35] | Batch Time: 8.12 (8.12) | Data Time: 4.39 (4.39) | Mem (GB): 73.00 (73.00/73.00) | Time Elapsed: 00d 00h 26m | Losses/train_all_loss: 1.36e+01 (1.36e+01) INFO 2025-01-05 01:48:53,473 train_utils.py: 271: Train Epoch: [24][10/35] | Batch Time: 1.43 (2.11) | Data Time: 0.00 (0.40) | Mem (GB): 70.00 (72.64/76.00) | Time Elapsed: 00d 00h 26m | Losses/train_all_loss: 7.37e+00 (1.37e+01) INFO 2025-01-05 01:49:07,535 train_utils.py: 271: Train Epoch: [24][20/35] | Batch Time: 1.36 (1.77) | Data Time: 0.00 (0.21) | Mem (GB): 68.00 (71.24/76.00) | Time Elapsed: 00d 00h 26m | 
Losses/train_all_loss: 6.42e+00 (1.12e+01) INFO 2025-01-05 01:49:22,407 train_utils.py: 271: Train Epoch: [24][30/35] | Batch Time: 1.36 (1.68) | Data Time: 0.00 (0.14) | Mem (GB): 68.00 (71.65/76.00) | Time Elapsed: 00d 00h 26m | Losses/train_all_loss: 7.50e+00 (1.20e+01) INFO 2025-01-05 01:49:29,037 trainer.py: 950: Estimated time remaining: 00d 00h 14m INFO 2025-01-05 01:49:29,165 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:49:29,165 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 11.836965492793492, 'Losses/train_all_loss_mask': 0.002555759671875941, 'Losses/train_all_loss_dice': 9.82940263066973, 'Losses/train_all_loss_iou': 0.06007927144362059, 'Losses/train_all_loss_class': 1.8963684562028253, 'Losses/train_all_core_loss': 11.836965492793492, 'Trainer/where': 0.6242857142857143, 'Trainer/epoch': 24, 'Trainer/steps_train': 875} INFO 2025-01-05 01:49:43,180 train_utils.py: 271: Train Epoch: [25][ 0/35] | Batch Time: 8.30 (8.30) | Data Time: 4.82 (4.82) | Mem (GB): 74.00 (74.00/74.00) | Time Elapsed: 00d 00h 27m | Losses/train_all_loss: 1.84e+01 (1.84e+01) INFO 2025-01-05 01:49:58,194 train_utils.py: 271: Train Epoch: [25][10/35] | Batch Time: 1.50 (2.12) | Data Time: 0.00 (0.44) | Mem (GB): 73.00 (73.27/76.00) | Time Elapsed: 00d 00h 27m | Losses/train_all_loss: 1.27e+01 (1.52e+01) INFO 2025-01-05 01:50:12,416 train_utils.py: 271: Train Epoch: [25][20/35] | Batch Time: 1.36 (1.79) | Data Time: 0.00 (0.23) | Mem (GB): 68.00 (71.76/76.00) | Time Elapsed: 00d 00h 27m | Losses/train_all_loss: 7.66e+00 (1.24e+01) INFO 2025-01-05 01:50:27,020 train_utils.py: 271: Train Epoch: [25][30/35] | Batch Time: 1.36 (1.68) | Data Time: 0.00 (0.16) | Mem (GB): 68.00 (71.81/76.00) | Time Elapsed: 00d 00h 27m | Losses/train_all_loss: 7.69e+00 (1.22e+01) INFO 2025-01-05 01:50:33,590 trainer.py: 950: Estimated time remaining: 00d 00h 13m INFO 2025-01-05 01:50:33,719 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:50:33,719 trainer.py: 830: 
Losses and meters: {'Losses/train_all_loss': 12.333917222704207, 'Losses/train_all_loss_mask': 0.005893107972640012, 'Losses/train_all_loss_dice': 10.638660866873606, 'Losses/train_all_loss_iou': 0.1051822858673404, 'Losses/train_all_loss_class': 1.4722119661713284, 'Losses/train_all_core_loss': 12.333917222704207, 'Trainer/where': 0.6492857142857142, 'Trainer/epoch': 25, 'Trainer/steps_train': 910} INFO 2025-01-05 01:50:46,416 train_utils.py: 271: Train Epoch: [26][ 0/35] | Batch Time: 6.97 (6.97) | Data Time: 4.86 (4.86) | Mem (GB): 76.00 (76.00/76.00) | Time Elapsed: 00d 00h 28m | Losses/train_all_loss: 1.88e+01 (1.88e+01) INFO 2025-01-05 01:51:01,162 train_utils.py: 271: Train Epoch: [26][10/35] | Batch Time: 1.37 (1.97) | Data Time: 0.00 (0.44) | Mem (GB): 68.00 (72.18/76.00) | Time Elapsed: 00d 00h 28m | Losses/train_all_loss: 7.77e+00 (1.30e+01) INFO 2025-01-05 01:51:16,283 train_utils.py: 271: Train Epoch: [26][20/35] | Batch Time: 1.49 (1.75) | Data Time: 0.00 (0.23) | Mem (GB): 73.00 (72.76/76.00) | Time Elapsed: 00d 00h 28m | Losses/train_all_loss: 1.38e+01 (1.39e+01) INFO 2025-01-05 01:51:30,832 train_utils.py: 271: Train Epoch: [26][30/35] | Batch Time: 1.58 (1.66) | Data Time: 0.00 (0.16) | Mem (GB): 76.00 (72.29/76.00) | Time Elapsed: 00d 00h 28m | Losses/train_all_loss: 2.07e+01 (1.34e+01) INFO 2025-01-05 01:51:37,503 trainer.py: 950: Estimated time remaining: 00d 00h 12m INFO 2025-01-05 01:51:37,560 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:51:37,560 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 12.976327841622489, 'Losses/train_all_loss_mask': 0.0058499143504637426, 'Losses/train_all_loss_dice': 11.136853994641985, 'Losses/train_all_loss_iou': 0.07329373683772117, 'Losses/train_all_loss_class': 1.6491819693573884, 'Losses/train_all_core_loss': 12.976327841622489, 'Trainer/where': 0.6742857142857143, 'Trainer/epoch': 26, 'Trainer/steps_train': 945} INFO 2025-01-05 01:51:51,326 train_utils.py: 271: Train Epoch: [27][ 
0/35] | Batch Time: 8.03 (8.03) | Data Time: 6.40 (6.40) | Mem (GB): 73.00 (73.00/73.00) | Time Elapsed: 00d 00h 29m | Losses/train_all_loss: 1.38e+01 (1.38e+01) INFO 2025-01-05 01:52:06,254 train_utils.py: 271: Train Epoch: [27][10/35] | Batch Time: 1.57 (2.09) | Data Time: 0.00 (0.58) | Mem (GB): 75.00 (72.45/76.00) | Time Elapsed: 00d 00h 29m | Losses/train_all_loss: 1.80e+01 (1.33e+01) INFO 2025-01-05 01:52:21,541 train_utils.py: 271: Train Epoch: [27][20/35] | Batch Time: 1.59 (1.82) | Data Time: 0.00 (0.31) | Mem (GB): 74.00 (72.81/76.00) | Time Elapsed: 00d 00h 29m | Losses/train_all_loss: 1.35e+01 (1.37e+01) INFO 2025-01-05 01:52:36,167 train_utils.py: 271: Train Epoch: [27][30/35] | Batch Time: 1.37 (1.71) | Data Time: 0.00 (0.21) | Mem (GB): 68.00 (72.19/76.00) | Time Elapsed: 00d 00h 30m | Losses/train_all_loss: 6.56e+00 (1.32e+01) INFO 2025-01-05 01:52:42,702 trainer.py: 950: Estimated time remaining: 00d 00h 11m INFO 2025-01-05 01:52:42,748 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:52:42,748 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 12.715955025809151, 'Losses/train_all_loss_mask': 0.0051494090393784326, 'Losses/train_all_loss_dice': 11.038715130942208, 'Losses/train_all_loss_iou': 0.07952312049164903, 'Losses/train_all_loss_class': 1.494728443427344, 'Losses/train_all_core_loss': 12.715955025809151, 'Trainer/where': 0.6992857142857143, 'Trainer/epoch': 27, 'Trainer/steps_train': 980} INFO 2025-01-05 01:52:56,095 train_utils.py: 271: Train Epoch: [28][ 0/35] | Batch Time: 7.61 (7.61) | Data Time: 5.62 (5.62) | Mem (GB): 74.00 (74.00/74.00) | Time Elapsed: 00d 00h 30m | Losses/train_all_loss: 1.86e+01 (1.86e+01) INFO 2025-01-05 01:53:10,448 train_utils.py: 271: Train Epoch: [28][10/35] | Batch Time: 1.42 (2.00) | Data Time: 0.00 (0.51) | Mem (GB): 70.00 (70.73/76.00) | Time Elapsed: 00d 00h 30m | Losses/train_all_loss: 7.67e+00 (1.09e+01) INFO 2025-01-05 01:53:25,122 train_utils.py: 271: Train Epoch: [28][20/35] | 
Batch Time: 1.59 (1.74) | Data Time: 0.00 (0.27) | Mem (GB): 74.00 (71.05/76.00) | Time Elapsed: 00d 00h 30m | Losses/train_all_loss: 2.18e+01 (1.16e+01) INFO 2025-01-05 01:53:40,133 train_utils.py: 271: Train Epoch: [28][30/35] | Batch Time: 1.50 (1.67) | Data Time: 0.00 (0.18) | Mem (GB): 73.00 (71.52/76.00) | Time Elapsed: 00d 00h 31m | Losses/train_all_loss: 1.07e+01 (1.19e+01) INFO 2025-01-05 01:53:46,951 trainer.py: 950: Estimated time remaining: 00d 00h 10m INFO 2025-01-05 01:53:46,978 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:53:46,979 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 12.254196275983539, 'Losses/train_all_loss_mask': 0.002468374198568719, 'Losses/train_all_loss_dice': 10.371674939564295, 'Losses/train_all_loss_iou': 0.035466401658517756, 'Losses/train_all_loss_class': 1.7976875406590158, 'Losses/train_all_core_loss': 12.254196275983539, 'Trainer/where': 0.7242857142857143, 'Trainer/epoch': 28, 'Trainer/steps_train': 1015} INFO 2025-01-05 01:53:59,655 train_utils.py: 271: Train Epoch: [29][ 0/35] | Batch Time: 6.97 (6.97) | Data Time: 4.42 (4.42) | Mem (GB): 73.00 (73.00/73.00) | Time Elapsed: 00d 00h 31m | Losses/train_all_loss: 1.23e+01 (1.23e+01) INFO 2025-01-05 01:54:14,408 train_utils.py: 271: Train Epoch: [29][10/35] | Batch Time: 1.41 (1.97) | Data Time: 0.00 (0.40) | Mem (GB): 70.00 (72.00/76.00) | Time Elapsed: 00d 00h 31m | Losses/train_all_loss: 7.33e+00 (1.21e+01) INFO 2025-01-05 01:54:29,501 train_utils.py: 271: Train Epoch: [29][20/35] | Batch Time: 1.60 (1.75) | Data Time: 0.00 (0.21) | Mem (GB): 76.00 (72.29/76.00) | Time Elapsed: 00d 00h 31m | Losses/train_all_loss: 1.76e+01 (1.28e+01) INFO 2025-01-05 01:54:43,980 train_utils.py: 271: Train Epoch: [29][30/35] | Batch Time: 1.49 (1.65) | Data Time: 0.00 (0.14) | Mem (GB): 73.00 (71.90/76.00) | Time Elapsed: 00d 00h 32m | Losses/train_all_loss: 1.35e+01 (1.23e+01) INFO 2025-01-05 01:54:50,329 trainer.py: 950: Estimated time remaining: 00d 00h 09m 
INFO 2025-01-05 01:54:50,549 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:54:50,549 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 11.89058301108224, 'Losses/train_all_loss_mask': 0.0033038316294550897, 'Losses/train_all_loss_dice': 10.133634703499931, 'Losses/train_all_loss_iou': 0.06524576840498152, 'Losses/train_all_loss_class': 1.6256259542756848, 'Losses/train_all_core_loss': 11.89058301108224, 'Trainer/where': 0.7492857142857143, 'Trainer/epoch': 29, 'Trainer/steps_train': 1050} INFO 2025-01-05 01:55:04,731 train_utils.py: 271: Train Epoch: [30][ 0/35] | Batch Time: 8.46 (8.46) | Data Time: 6.86 (6.86) | Mem (GB): 76.00 (76.00/76.00) | Time Elapsed: 00d 00h 32m | Losses/train_all_loss: 1.95e+01 (1.95e+01) INFO 2025-01-05 01:55:19,158 train_utils.py: 271: Train Epoch: [30][10/35] | Batch Time: 1.50 (2.08) | Data Time: 0.00 (0.62) | Mem (GB): 73.00 (71.64/76.00) | Time Elapsed: 00d 00h 32m | Losses/train_all_loss: 1.22e+01 (1.22e+01) INFO 2025-01-05 01:55:34,149 train_utils.py: 271: Train Epoch: [30][20/35] | Batch Time: 1.40 (1.80) | Data Time: 0.00 (0.33) | Mem (GB): 70.00 (72.43/76.00) | Time Elapsed: 00d 00h 32m | Losses/train_all_loss: 7.16e+00 (1.32e+01) INFO 2025-01-05 01:55:48,868 train_utils.py: 271: Train Epoch: [30][30/35] | Batch Time: 1.40 (1.70) | Data Time: 0.00 (0.22) | Mem (GB): 70.00 (72.26/76.00) | Time Elapsed: 00d 00h 33m | Losses/train_all_loss: 7.24e+00 (1.33e+01) INFO 2025-01-05 01:55:55,203 trainer.py: 950: Estimated time remaining: 00d 00h 08m INFO 2025-01-05 01:55:55,478 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:55:55,478 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 12.994428975241524, 'Losses/train_all_loss_mask': 0.0032894188288732297, 'Losses/train_all_loss_dice': 11.455551188332693, 'Losses/train_all_loss_iou': 0.05269411236804444, 'Losses/train_all_loss_class': 1.4203952280871037, 'Losses/train_all_core_loss': 12.994428975241524, 'Trainer/where': 0.7742857142857142, 
'Trainer/epoch': 30, 'Trainer/steps_train': 1085} INFO 2025-01-05 01:56:09,715 train_utils.py: 271: Train Epoch: [31][ 0/35] | Batch Time: 8.42 (8.42) | Data Time: 6.84 (6.84) | Mem (GB): 74.00 (74.00/74.00) | Time Elapsed: 00d 00h 33m | Losses/train_all_loss: 1.50e+01 (1.50e+01) INFO 2025-01-05 01:56:23,796 train_utils.py: 271: Train Epoch: [31][10/35] | Batch Time: 1.36 (2.05) | Data Time: 0.00 (0.62) | Mem (GB): 68.00 (70.45/74.00) | Time Elapsed: 00d 00h 33m | Losses/train_all_loss: 6.79e+00 (9.89e+00) INFO 2025-01-05 01:56:38,210 train_utils.py: 271: Train Epoch: [31][20/35] | Batch Time: 1.40 (1.76) | Data Time: 0.00 (0.33) | Mem (GB): 70.00 (70.71/75.00) | Time Elapsed: 00d 00h 34m | Losses/train_all_loss: 6.30e+00 (1.09e+01) INFO 2025-01-05 01:56:52,506 train_utils.py: 271: Train Epoch: [31][30/35] | Batch Time: 1.36 (1.65) | Data Time: 0.00 (0.22) | Mem (GB): 68.00 (70.65/75.00) | Time Elapsed: 00d 00h 34m | Losses/train_all_loss: 6.14e+00 (1.06e+01) INFO 2025-01-05 01:56:59,311 trainer.py: 950: Estimated time remaining: 00d 00h 07m INFO 2025-01-05 01:56:59,312 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:56:59,312 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 11.028971740177699, 'Losses/train_all_loss_mask': 0.0017663538056824888, 'Losses/train_all_loss_dice': 9.26269657952445, 'Losses/train_all_loss_iou': 0.020907437572480245, 'Losses/train_all_loss_class': 1.7100406129179255, 'Losses/train_all_core_loss': 11.028971740177699, 'Trainer/where': 0.7992857142857143, 'Trainer/epoch': 31, 'Trainer/steps_train': 1120} INFO 2025-01-05 01:57:12,522 train_utils.py: 271: Train Epoch: [32][ 0/35] | Batch Time: 7.47 (7.47) | Data Time: 4.43 (4.43) | Mem (GB): 68.00 (68.00/68.00) | Time Elapsed: 00d 00h 34m | Losses/train_all_loss: 5.94e+00 (5.94e+00) INFO 2025-01-05 01:57:27,125 train_utils.py: 271: Train Epoch: [32][10/35] | Batch Time: 1.53 (2.01) | Data Time: 0.00 (0.40) | Mem (GB): 75.00 (71.45/75.00) | Time Elapsed: 00d 00h 34m | 
Losses/train_all_loss: 2.19e+01 (1.22e+01) INFO 2025-01-05 01:57:42,004 train_utils.py: 271: Train Epoch: [32][20/35] | Batch Time: 1.57 (1.76) | Data Time: 0.00 (0.21) | Mem (GB): 76.00 (72.10/76.00) | Time Elapsed: 00d 00h 35m | Losses/train_all_loss: 1.96e+01 (1.31e+01) INFO 2025-01-05 01:57:56,561 train_utils.py: 271: Train Epoch: [32][30/35] | Batch Time: 1.40 (1.66) | Data Time: 0.00 (0.14) | Mem (GB): 70.00 (71.77/76.00) | Time Elapsed: 00d 00h 35m | Losses/train_all_loss: 7.01e+00 (1.27e+01) INFO 2025-01-05 01:58:03,297 trainer.py: 950: Estimated time remaining: 00d 00h 06m INFO 2025-01-05 01:58:03,342 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:58:03,342 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 12.77100260598319, 'Losses/train_all_loss_mask': 0.005347782664466649, 'Losses/train_all_loss_dice': 11.073454679761614, 'Losses/train_all_loss_iou': 0.1097778601632204, 'Losses/train_all_loss_class': 1.480814137798734, 'Losses/train_all_core_loss': 12.77100260598319, 'Trainer/where': 0.8242857142857142, 'Trainer/epoch': 32, 'Trainer/steps_train': 1155} INFO 2025-01-05 01:58:17,583 train_utils.py: 271: Train Epoch: [33][ 0/35] | Batch Time: 8.35 (8.35) | Data Time: 5.58 (5.58) | Mem (GB): 76.00 (76.00/76.00) | Time Elapsed: 00d 00h 35m | Losses/train_all_loss: 2.06e+01 (2.06e+01) INFO 2025-01-05 01:58:32,317 train_utils.py: 271: Train Epoch: [33][10/35] | Batch Time: 1.36 (2.10) | Data Time: 0.00 (0.51) | Mem (GB): 68.00 (72.00/76.00) | Time Elapsed: 00d 00h 35m | Losses/train_all_loss: 7.51e+00 (1.37e+01) INFO 2025-01-05 01:58:46,956 train_utils.py: 271: Train Epoch: [33][20/35] | Batch Time: 1.50 (1.80) | Data Time: 0.00 (0.27) | Mem (GB): 73.00 (71.76/76.00) | Time Elapsed: 00d 00h 36m | Losses/train_all_loss: 1.10e+01 (1.27e+01) INFO 2025-01-05 01:59:01,897 train_utils.py: 271: Train Epoch: [33][30/35] | Batch Time: 1.36 (1.70) | Data Time: 0.00 (0.18) | Mem (GB): 68.00 (72.03/76.00) | Time Elapsed: 00d 00h 36m | 
Losses/train_all_loss: 7.15e+00 (1.30e+01) INFO 2025-01-05 01:59:08,558 trainer.py: 950: Estimated time remaining: 00d 00h 05m INFO 2025-01-05 01:59:08,559 trainer.py: 892: Synchronizing meters INFO 2025-01-05 01:59:08,559 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 12.876578099387032, 'Losses/train_all_loss_mask': 0.014166244934313, 'Losses/train_all_loss_dice': 10.838990374973841, 'Losses/train_all_loss_iou': 0.09881567690504849, 'Losses/train_all_loss_class': 1.6554468802575555, 'Losses/train_all_core_loss': 12.876578099387032, 'Trainer/where': 0.8492857142857142, 'Trainer/epoch': 33, 'Trainer/steps_train': 1190} INFO 2025-01-05 01:59:22,015 train_utils.py: 271: Train Epoch: [34][ 0/35] | Batch Time: 7.70 (7.70) | Data Time: 5.75 (5.75) | Mem (GB): 76.00 (76.00/76.00) | Time Elapsed: 00d 00h 36m | Losses/train_all_loss: 1.55e+01 (1.55e+01) INFO 2025-01-05 01:59:36,389 train_utils.py: 271: Train Epoch: [34][10/35] | Batch Time: 1.41 (2.01) | Data Time: 0.00 (0.52) | Mem (GB): 70.00 (71.45/76.00) | Time Elapsed: 00d 00h 37m | Losses/train_all_loss: 7.87e+00 (1.06e+01) INFO 2025-01-05 01:59:51,034 train_utils.py: 271: Train Epoch: [34][20/35] | Batch Time: 1.40 (1.75) | Data Time: 0.00 (0.27) | Mem (GB): 70.00 (71.33/76.00) | Time Elapsed: 00d 00h 37m | Losses/train_all_loss: 7.86e+00 (1.05e+01) INFO 2025-01-05 02:00:05,741 train_utils.py: 271: Train Epoch: [34][30/35] | Batch Time: 1.50 (1.66) | Data Time: 0.00 (0.19) | Mem (GB): 73.00 (71.48/76.00) | Time Elapsed: 00d 00h 37m | Losses/train_all_loss: 1.30e+01 (1.06e+01) INFO 2025-01-05 02:00:12,420 trainer.py: 950: Estimated time remaining: 00d 00h 04m INFO 2025-01-05 02:00:12,474 trainer.py: 892: Synchronizing meters INFO 2025-01-05 02:00:12,474 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 10.704918697902135, 'Losses/train_all_loss_mask': 0.0019853250996675342, 'Losses/train_all_loss_dice': 8.530561372212, 'Losses/train_all_loss_iou': 0.022953493090088678, 
'Losses/train_all_loss_class': 2.1116973195324786, 'Losses/train_all_core_loss': 10.704918697902135, 'Trainer/where': 0.8742857142857142, 'Trainer/epoch': 34, 'Trainer/steps_train': 1225} INFO 2025-01-05 02:00:26,832 train_utils.py: 271: Train Epoch: [35][ 0/35] | Batch Time: 8.62 (8.62) | Data Time: 4.20 (4.20) | Mem (GB): 73.00 (73.00/73.00) | Time Elapsed: 00d 00h 37m | Losses/train_all_loss: 1.31e+01 (1.31e+01) INFO 2025-01-05 02:00:41,367 train_utils.py: 271: Train Epoch: [35][10/35] | Batch Time: 1.57 (2.11) | Data Time: 0.00 (0.38) | Mem (GB): 76.00 (71.36/76.00) | Time Elapsed: 00d 00h 38m | Losses/train_all_loss: 2.04e+01 (1.17e+01) INFO 2025-01-05 02:00:56,129 train_utils.py: 271: Train Epoch: [35][20/35] | Batch Time: 1.57 (1.81) | Data Time: 0.00 (0.20) | Mem (GB): 76.00 (71.76/76.00) | Time Elapsed: 00d 00h 38m | Losses/train_all_loss: 1.95e+01 (1.27e+01) INFO 2025-01-05 02:01:10,645 train_utils.py: 271: Train Epoch: [35][30/35] | Batch Time: 1.40 (1.69) | Data Time: 0.00 (0.14) | Mem (GB): 70.00 (71.58/76.00) | Time Elapsed: 00d 00h 38m | Losses/train_all_loss: 7.07e+00 (1.22e+01) INFO 2025-01-05 02:01:17,626 trainer.py: 950: Estimated time remaining: 00d 00h 03m INFO 2025-01-05 02:01:17,680 trainer.py: 892: Synchronizing meters INFO 2025-01-05 02:01:17,681 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 12.674189213344029, 'Losses/train_all_loss_mask': 0.004104748995242906, 'Losses/train_all_loss_dice': 11.569254003252302, 'Losses/train_all_loss_iou': 0.013460466006654315, 'Losses/train_all_loss_class': 1.0093798162881285, 'Losses/train_all_core_loss': 12.674189213344029, 'Trainer/where': 0.8992857142857142, 'Trainer/epoch': 35, 'Trainer/steps_train': 1260} INFO 2025-01-05 02:01:31,711 train_utils.py: 271: Train Epoch: [36][ 0/35] | Batch Time: 8.11 (8.11) | Data Time: 6.36 (6.36) | Mem (GB): 74.00 (74.00/74.00) | Time Elapsed: 00d 00h 38m | Losses/train_all_loss: 1.81e+01 (1.81e+01) INFO 2025-01-05 02:01:46,899 train_utils.py: 271: 
Train Epoch: [36][10/35] | Batch Time: 1.40 (2.12) | Data Time: 0.00 (0.58) | Mem (GB): 70.00 (73.00/76.00) | Time Elapsed: 00d 00h 39m | Losses/train_all_loss: 7.67e+00 (1.48e+01) INFO 2025-01-05 02:02:01,260 train_utils.py: 271: Train Epoch: [36][20/35] | Batch Time: 1.36 (1.79) | Data Time: 0.00 (0.30) | Mem (GB): 68.00 (71.86/76.00) | Time Elapsed: 00d 00h 39m | Losses/train_all_loss: 7.59e+00 (1.31e+01) INFO 2025-01-05 02:02:15,928 train_utils.py: 271: Train Epoch: [36][30/35] | Batch Time: 1.36 (1.69) | Data Time: 0.00 (0.21) | Mem (GB): 68.00 (71.77/76.00) | Time Elapsed: 00d 00h 39m | Losses/train_all_loss: 7.00e+00 (1.27e+01) INFO 2025-01-05 02:02:23,031 trainer.py: 950: Estimated time remaining: 00d 00h 02m INFO 2025-01-05 02:02:23,064 trainer.py: 892: Synchronizing meters INFO 2025-01-05 02:02:23,064 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 13.152547141483852, 'Losses/train_all_loss_mask': 0.00731085481404859, 'Losses/train_all_loss_dice': 11.433105080468314, 'Losses/train_all_loss_iou': 0.04558718047472732, 'Losses/train_all_loss_class': 1.5276377756015531, 'Losses/train_all_core_loss': 13.152547141483852, 'Trainer/where': 0.9242857142857142, 'Trainer/epoch': 36, 'Trainer/steps_train': 1295} INFO 2025-01-05 02:02:36,154 train_utils.py: 271: Train Epoch: [37][ 0/35] | Batch Time: 7.17 (7.17) | Data Time: 5.14 (5.14) | Mem (GB): 70.00 (70.00/70.00) | Time Elapsed: 00d 00h 40m | Losses/train_all_loss: 6.15e+00 (6.15e+00) INFO 2025-01-05 02:02:50,306 train_utils.py: 271: Train Epoch: [37][10/35] | Batch Time: 1.35 (1.94) | Data Time: 0.00 (0.47) | Mem (GB): 68.00 (70.00/74.00) | Time Elapsed: 00d 00h 40m | Losses/train_all_loss: 5.35e+00 (8.84e+00) INFO 2025-01-05 02:03:04,799 train_utils.py: 271: Train Epoch: [37][20/35] | Batch Time: 1.36 (1.71) | Data Time: 0.00 (0.25) | Mem (GB): 68.00 (70.43/76.00) | Time Elapsed: 00d 00h 40m | Losses/train_all_loss: 5.26e+00 (9.78e+00) INFO 2025-01-05 02:03:19,499 train_utils.py: 271: Train Epoch: 
[37][30/35] | Batch Time: 1.36 (1.63) | Data Time: 0.00 (0.17) | Mem (GB): 68.00 (70.81/76.00) | Time Elapsed: 00d 00h 40m | Losses/train_all_loss: 6.27e+00 (1.08e+01) INFO 2025-01-05 02:03:26,058 trainer.py: 950: Estimated time remaining: 00d 00h 01m INFO 2025-01-05 02:03:26,304 trainer.py: 892: Synchronizing meters INFO 2025-01-05 02:03:26,304 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 10.936215046473912, 'Losses/train_all_loss_mask': 0.00343067699328198, 'Losses/train_all_loss_dice': 9.400099325180054, 'Losses/train_all_loss_iou': 0.0718443956205322, 'Losses/train_all_loss_class': 1.3956576879734972, 'Losses/train_all_core_loss': 10.936215046473912, 'Trainer/where': 0.9492857142857142, 'Trainer/epoch': 37, 'Trainer/steps_train': 1330} INFO 2025-01-05 02:03:39,442 train_utils.py: 271: Train Epoch: [38][ 0/35] | Batch Time: 7.44 (7.44) | Data Time: 5.90 (5.90) | Mem (GB): 73.00 (73.00/73.00) | Time Elapsed: 00d 00h 41m | Losses/train_all_loss: 1.41e+01 (1.41e+01) INFO 2025-01-05 02:03:54,152 train_utils.py: 271: Train Epoch: [38][10/35] | Batch Time: 1.50 (2.01) | Data Time: 0.00 (0.54) | Mem (GB): 73.00 (72.09/76.00) | Time Elapsed: 00d 00h 41m | Losses/train_all_loss: 1.35e+01 (1.23e+01) INFO 2025-01-05 02:04:09,009 train_utils.py: 271: Train Epoch: [38][20/35] | Batch Time: 1.59 (1.76) | Data Time: 0.00 (0.28) | Mem (GB): 74.00 (71.90/76.00) | Time Elapsed: 00d 00h 41m | Losses/train_all_loss: 1.65e+01 (1.28e+01) INFO 2025-01-05 02:04:24,306 train_utils.py: 271: Train Epoch: [38][30/35] | Batch Time: 1.58 (1.69) | Data Time: 0.00 (0.19) | Mem (GB): 74.00 (72.26/76.00) | Time Elapsed: 00d 00h 41m | Losses/train_all_loss: 1.77e+01 (1.36e+01) INFO 2025-01-05 02:04:30,795 trainer.py: 950: Estimated time remaining: 00d 00h 00m INFO 2025-01-05 02:04:30,848 trainer.py: 892: Synchronizing meters INFO 2025-01-05 02:04:30,848 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 13.038009902409145, 'Losses/train_all_loss_mask': 
0.001936620579467022, 'Losses/train_all_loss_dice': 11.594574492318289, 'Losses/train_all_loss_iou': 0.04597411672626289, 'Losses/train_all_loss_class': 1.3587288303616722, 'Losses/train_all_core_loss': 13.038009902409145, 'Trainer/where': 0.9742857142857142, 'Trainer/epoch': 38, 'Trainer/steps_train': 1365} INFO 2025-01-05 02:04:44,364 train_utils.py: 271: Train Epoch: [39][ 0/35] | Batch Time: 7.81 (7.81) | Data Time: 6.39 (6.39) | Mem (GB): 70.00 (70.00/70.00) | Time Elapsed: 00d 00h 42m | Losses/train_all_loss: 9.72e+00 (9.72e+00) INFO 2025-01-05 02:04:59,209 train_utils.py: 271: Train Epoch: [39][10/35] | Batch Time: 1.55 (2.06) | Data Time: 0.00 (0.58) | Mem (GB): 74.00 (72.27/76.00) | Time Elapsed: 00d 00h 42m | Losses/train_all_loss: 1.57e+01 (1.27e+01) INFO 2025-01-05 02:05:13,432 train_utils.py: 271: Train Epoch: [39][20/35] | Batch Time: 1.50 (1.76) | Data Time: 0.00 (0.31) | Mem (GB): 73.00 (71.43/76.00) | Time Elapsed: 00d 00h 42m | Losses/train_all_loss: 1.44e+01 (1.12e+01) INFO 2025-01-05 02:05:28,354 train_utils.py: 271: Train Epoch: [39][30/35] | Batch Time: 1.55 (1.67) | Data Time: 0.00 (0.21) | Mem (GB): 74.00 (71.87/76.00) | Time Elapsed: 00d 00h 42m | Losses/train_all_loss: 1.68e+01 (1.20e+01) INFO 2025-01-05 02:05:35,121 trainer.py: 950: Estimated time remaining: 00d 00h 00m INFO 2025-01-05 02:05:35,364 trainer.py: 892: Synchronizing meters INFO 2025-01-05 02:05:35,364 trainer.py: 830: Losses and meters: {'Losses/train_all_loss': 12.413451303754535, 'Losses/train_all_loss_mask': 0.003644983982667327, 'Losses/train_all_loss_dice': 10.79392305782863, 'Losses/train_all_loss_iou': 0.06104640555193847, 'Losses/train_all_loss_class': 1.4855821575769887, 'Losses/train_all_core_loss': 12.413451303754535, 'Trainer/where': 0.9992857142857142, 'Trainer/epoch': 39, 'Trainer/steps_train': 1400} INFO 2025-01-20 16:22:35,397 train_utils.py: 108: MACHINE SEED: 4920 INFO 2025-01-20 16:22:35,398 train_utils.py: 154: Logging ENV_VARIABLES INFO 2025-01-20 
16:22:35,398 train_utils.py: 155: BROWSER=/home/hossein/.cursor-server/cli/servers/Stable-316e524257c2ea23b755332b0a72c50cf23e1b00/server/bin/helpers/browser.sh COLORTERM=truecolor CONDA_DEFAULT_ENV=sam2 CONDA_EXE=/home/hossein/miniconda3/bin/conda CONDA_PREFIX=/ephemeral/hossein/envs/sam2 CONDA_PREFIX_1=/home/hossein/miniconda3 CONDA_PREFIX_2=/ephemeral/hossein/envs/sam2 CONDA_PROMPT_MODIFIER=(sam2) CONDA_PYTHON_EXE=/home/hossein/miniconda3/bin/python CONDA_ROOT=/home/hossein/miniconda3 CONDA_SHLVL=2 CUDA_MODULE_LOADING=LAZY DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/2095/bus GIT_ASKPASS=/home/hossein/.cursor-server/cli/servers/Stable-316e524257c2ea23b755332b0a72c50cf23e1b00/server/extensions/git/dist/askpass.sh HF_HOME=/ephemeral/ HISTSIZE=1000 HISTTIMEFORMAT=%F %T HOME=/home/hossein HYDRA_FULL_ERROR=1 LANG=C.UTF-8 LESSCLOSE=/usr/bin/lesspipe %s %s LESSOPEN=| /usr/bin/lesspipe %s LOCAL_RANK=0 LOGNAME=hossein LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.webp=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:
*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36: MASTER_ADDR=localhost MASTER_PORT=56657 MOTD_SHOWN=pam NCCL_TOPO_FILE=/etc/nccl-topo-h100-v1.xml OLDPWD=/home/hossein/hossein/projects/sam2 PATH=/home/hossein/.cursor-server/cli/servers/Stable-316e524257c2ea23b755332b0a72c50cf23e1b00/server/bin/remote-cli:/ephemeral/hossein/envs/sam2/bin:/home/hossein/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/hossein/.cursor-server/cli/servers/Stable-316e524257c2ea23b755332b0a72c50cf23e1b00/server/bin/remote-cli:/home/hossein/miniconda3/bin:/home/hossein/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/hossein/.cursor-server/cli/servers/Stable-316e524257c2ea23b755332b0a72c50cf23e1b00/server/bin/remote-cli:/home/hossein/miniconda3/bin:/home/hossein/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin PS1=\[]633;A\](sam2) (base) \[\e]0;\u@\h: \w\a\]${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$ \[]633;B\] PWD=/home/hossein/hossein/projects/sam2/training PYTHON_PATH=/home/hossein/hossein/projects/hybrid_model_training:/home/hossein/hossein/projects/hybrid_model_training: RANK=0 SHELL=/bin/bash SHLVL=1 SSH_CLIENT=99.251.10.243 63563 22 SSH_CONNECTION=99.251.10.243 63563 10.0.1.99 22 SSL_CERT_DIR=/usr/lib/ssl/certs SSL_CERT_FILE=/usr/lib/ssl/certs/ca-certificates.crt TERM=xterm-256color TERM_PROGRAM=vscode 
TERM_PROGRAM_VERSION=0.44.9 TORCH_NCCL_ASYNC_ERROR_HANDLING=1 USER=hossein VSCODE_GIT_ASKPASS_EXTRA_ARGS= VSCODE_GIT_ASKPASS_MAIN=/home/hossein/.cursor-server/cli/servers/Stable-316e524257c2ea23b755332b0a72c50cf23e1b00/server/extensions/git/dist/askpass-main.js VSCODE_GIT_ASKPASS_NODE=/home/hossein/.cursor-server/cli/servers/Stable-316e524257c2ea23b755332b0a72c50cf23e1b00/server/node VSCODE_GIT_IPC_HANDLE=/run/user/2095/vscode-git-dfa9aeeda3.sock VSCODE_IPC_HOOK_CLI=/run/user/2095/vscode-ipc-a4ea06dd-70ac-4895-899e-ebf10ec4b480.sock WORLD_SIZE=4 XDG_DATA_DIRS=/usr/local/share:/usr/share:/var/lib/snapd/desktop XDG_RUNTIME_DIR=/run/user/2095 XDG_SESSION_CLASS=user XDG_SESSION_ID=11718 XDG_SESSION_TYPE=tty _=/ephemeral/hossein/envs/sam2/bin/python _CE_CONDA= _CE_M= INFO 2025-01-20 16:22:35,398 trainer.py: 989: Setting up components: Model, loss, optim, meters etc. INFO 2025-01-20 16:22:35,399 logger.py: 66: TensorBoard SummaryWriter instantiated. Files will be stored in: /ephemeral/hossein/output/sam2/tensorboard INFO 2025-01-20 16:22:38,920 sam2.py: 81: Training with points (sampled from masks) as inputs with p=0.5 INFO 2025-01-20 16:22:38,924 trainer.py:1059: ==================== INFO 2025-01-20 16:22:38,924 trainer.py:1060: Summary for model INFO 2025-01-20 16:22:38,926 trainer.py:1061: Model is SAM2Train( (image_encoder): ImageEncoder( (trunk): Hiera( (patch_embed): PatchEmbed( (proj): Conv2d(3, 144, kernel_size=(7, 7), stride=(4, 4), padding=(3, 3)) ) (blocks): ModuleList( (0-1): 2 x MultiScaleBlock( (norm1): LayerNorm((144,), eps=1e-06, elementwise_affine=True) (attn): MultiScaleAttention( (qkv): Linear(in_features=144, out_features=432, bias=True) (proj): Linear(in_features=144, out_features=144, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((144,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=144, out_features=576, bias=True) (1): Linear(in_features=576, out_features=144, bias=True) ) (act): 
GELU(approximate='none') ) ) (2): MultiScaleBlock( (norm1): LayerNorm((144,), eps=1e-06, elementwise_affine=True) (pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False) (attn): MultiScaleAttention( (q_pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False) (qkv): Linear(in_features=144, out_features=864, bias=True) (proj): Linear(in_features=288, out_features=288, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((288,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=288, out_features=1152, bias=True) (1): Linear(in_features=1152, out_features=288, bias=True) ) (act): GELU(approximate='none') ) (proj): Linear(in_features=144, out_features=288, bias=True) ) (3-7): 5 x MultiScaleBlock( (norm1): LayerNorm((288,), eps=1e-06, elementwise_affine=True) (attn): MultiScaleAttention( (qkv): Linear(in_features=288, out_features=864, bias=True) (proj): Linear(in_features=288, out_features=288, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((288,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=288, out_features=1152, bias=True) (1): Linear(in_features=1152, out_features=288, bias=True) ) (act): GELU(approximate='none') ) ) (8): MultiScaleBlock( (norm1): LayerNorm((288,), eps=1e-06, elementwise_affine=True) (pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False) (attn): MultiScaleAttention( (q_pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False) (qkv): Linear(in_features=288, out_features=1728, bias=True) (proj): Linear(in_features=576, out_features=576, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((576,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=576, out_features=2304, bias=True) (1): Linear(in_features=2304, out_features=576, bias=True) ) (act): 
GELU(approximate='none') ) (proj): Linear(in_features=288, out_features=576, bias=True) ) (9-43): 35 x MultiScaleBlock( (norm1): LayerNorm((576,), eps=1e-06, elementwise_affine=True) (attn): MultiScaleAttention( (qkv): Linear(in_features=576, out_features=1728, bias=True) (proj): Linear(in_features=576, out_features=576, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((576,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=576, out_features=2304, bias=True) (1): Linear(in_features=2304, out_features=576, bias=True) ) (act): GELU(approximate='none') ) ) (44): MultiScaleBlock( (norm1): LayerNorm((576,), eps=1e-06, elementwise_affine=True) (pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False) (attn): MultiScaleAttention( (q_pool): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), padding=0, dilation=1, ceil_mode=False) (qkv): Linear(in_features=576, out_features=3456, bias=True) (proj): Linear(in_features=1152, out_features=1152, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((1152,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=1152, out_features=4608, bias=True) (1): Linear(in_features=4608, out_features=1152, bias=True) ) (act): GELU(approximate='none') ) (proj): Linear(in_features=576, out_features=1152, bias=True) ) (45-47): 3 x MultiScaleBlock( (norm1): LayerNorm((1152,), eps=1e-06, elementwise_affine=True) (attn): MultiScaleAttention( (qkv): Linear(in_features=1152, out_features=3456, bias=True) (proj): Linear(in_features=1152, out_features=1152, bias=True) ) (drop_path): Identity() (norm2): LayerNorm((1152,), eps=1e-06, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=1152, out_features=4608, bias=True) (1): Linear(in_features=4608, out_features=1152, bias=True) ) (act): GELU(approximate='none') ) ) ) ) (neck): FpnNeck( (position_encoding): PositionEmbeddingSine() (convs): 
ModuleList( (0): Sequential( (conv): Conv2d(1152, 256, kernel_size=(1, 1), stride=(1, 1)) ) (1): Sequential( (conv): Conv2d(576, 256, kernel_size=(1, 1), stride=(1, 1)) ) (2): Sequential( (conv): Conv2d(288, 256, kernel_size=(1, 1), stride=(1, 1)) ) (3): Sequential( (conv): Conv2d(144, 256, kernel_size=(1, 1), stride=(1, 1)) ) ) ) ) (mask_downsample): Conv2d(1, 1, kernel_size=(4, 4), stride=(4, 4)) (memory_attention): MemoryAttention( (layers): ModuleList( (0-3): 4 x MemoryAttentionLayer( (self_attn): RoPEAttention( (q_proj): Linear(in_features=256, out_features=256, bias=True) (k_proj): Linear(in_features=256, out_features=256, bias=True) (v_proj): Linear(in_features=256, out_features=256, bias=True) (out_proj): Linear(in_features=256, out_features=256, bias=True) ) (cross_attn_image): RoPEAttention( (q_proj): Linear(in_features=256, out_features=256, bias=True) (k_proj): Linear(in_features=64, out_features=256, bias=True) (v_proj): Linear(in_features=64, out_features=256, bias=True) (out_proj): Linear(in_features=256, out_features=256, bias=True) ) (linear1): Linear(in_features=256, out_features=2048, bias=True) (dropout): Dropout(p=0.1, inplace=False) (linear2): Linear(in_features=2048, out_features=256, bias=True) (norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (norm3): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (dropout1): Dropout(p=0.1, inplace=False) (dropout2): Dropout(p=0.1, inplace=False) (dropout3): Dropout(p=0.1, inplace=False) ) ) (norm): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) (memory_encoder): MemoryEncoder( (mask_downsampler): MaskDownSampler( (encoder): Sequential( (0): Conv2d(1, 4, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (1): LayerNorm2d() (2): GELU(approximate='none') (3): Conv2d(4, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (4): LayerNorm2d() (5): GELU(approximate='none') (6): Conv2d(16, 64, kernel_size=(3, 3), 
stride=(2, 2), padding=(1, 1)) (7): LayerNorm2d() (8): GELU(approximate='none') (9): Conv2d(64, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (10): LayerNorm2d() (11): GELU(approximate='none') (12): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1)) ) ) (pix_feat_proj): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1)) (fuser): Fuser( (proj): Identity() (layers): ModuleList( (0-1): 2 x CXBlock( (dwconv): Conv2d(256, 256, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3), groups=256) (norm): LayerNorm2d() (pwconv1): Linear(in_features=256, out_features=1024, bias=True) (act): GELU(approximate='none') (pwconv2): Linear(in_features=1024, out_features=256, bias=True) (drop_path): Identity() ) ) ) (position_encoding): PositionEmbeddingSine() (out_proj): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1)) ) (sam_prompt_encoder): PromptEncoder( (pe_layer): PositionEmbeddingRandom() (point_embeddings): ModuleList( (0-3): 4 x Embedding(1, 256) ) (not_a_point_embed): Embedding(1, 256) (mask_downscaling): Sequential( (0): Conv2d(1, 4, kernel_size=(2, 2), stride=(2, 2)) (1): LayerNorm2d() (2): GELU(approximate='none') (3): Conv2d(4, 16, kernel_size=(2, 2), stride=(2, 2)) (4): LayerNorm2d() (5): GELU(approximate='none') (6): Conv2d(16, 256, kernel_size=(1, 1), stride=(1, 1)) ) (no_mask_embed): Embedding(1, 256) ) (sam_mask_decoder): MaskDecoder( (transformer): TwoWayTransformer( (layers): ModuleList( (0-1): 2 x TwoWayAttentionBlock( (self_attn): Attention( (q_proj): Linear(in_features=256, out_features=256, bias=True) (k_proj): Linear(in_features=256, out_features=256, bias=True) (v_proj): Linear(in_features=256, out_features=256, bias=True) (out_proj): Linear(in_features=256, out_features=256, bias=True) ) (norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (cross_attn_token_to_image): Attention( (q_proj): Linear(in_features=256, out_features=128, bias=True) (k_proj): Linear(in_features=256, out_features=128, bias=True) (v_proj): Linear(in_features=256, 
out_features=128, bias=True) (out_proj): Linear(in_features=128, out_features=256, bias=True) ) (norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (mlp): MLP( (layers): ModuleList( (0): Linear(in_features=256, out_features=2048, bias=True) (1): Linear(in_features=2048, out_features=256, bias=True) ) (act): ReLU() ) (norm3): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (norm4): LayerNorm((256,), eps=1e-05, elementwise_affine=True) (cross_attn_image_to_token): Attention( (q_proj): Linear(in_features=256, out_features=128, bias=True) (k_proj): Linear(in_features=256, out_features=128, bias=True) (v_proj): Linear(in_features=256, out_features=128, bias=True) (out_proj): Linear(in_features=128, out_features=256, bias=True) ) ) ) (final_attn_token_to_image): Attention( (q_proj): Linear(in_features=256, out_features=128, bias=True) (k_proj): Linear(in_features=256, out_features=128, bias=True) (v_proj): Linear(in_features=256, out_features=128, bias=True) (out_proj): Linear(in_features=128, out_features=256, bias=True) ) (norm_final_attn): LayerNorm((256,), eps=1e-05, elementwise_affine=True) ) (iou_token): Embedding(1, 256) (mask_tokens): Embedding(4, 256) (obj_score_token): Embedding(1, 256) (output_upscaling): Sequential( (0): ConvTranspose2d(256, 64, kernel_size=(2, 2), stride=(2, 2)) (1): LayerNorm2d() (2): GELU(approximate='none') (3): ConvTranspose2d(64, 32, kernel_size=(2, 2), stride=(2, 2)) (4): GELU(approximate='none') ) (conv_s0): Conv2d(256, 32, kernel_size=(1, 1), stride=(1, 1)) (conv_s1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1)) (output_hypernetworks_mlps): ModuleList( (0-3): 4 x MLP( (layers): ModuleList( (0-1): 2 x Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=32, bias=True) ) (act): ReLU() ) ) (iou_prediction_head): MLP( (layers): ModuleList( (0-1): 2 x Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=4, bias=True) ) (act): ReLU() 
) (pred_obj_score_head): MLP( (layers): ModuleList( (0-1): 2 x Linear(in_features=256, out_features=256, bias=True) (2): Linear(in_features=256, out_features=1, bias=True) ) (act): ReLU() ) ) (obj_ptr_proj): MLP( (layers): ModuleList( (0-2): 3 x Linear(in_features=256, out_features=256, bias=True) ) (act): ReLU() ) (obj_ptr_tpos_proj): Linear(in_features=256, out_features=64, bias=True) ) INFO 2025-01-20 16:22:38,927 trainer.py:1062: Total parameters 224 M INFO 2025-01-20 16:22:38,927 trainer.py:1063: Trainable parameters 224 M INFO 2025-01-20 16:22:38,927 trainer.py:1066: Non-Trainable parameters 0 INFO 2025-01-20 16:22:38,927 trainer.py:1069: ==================== INFO 2025-01-20 16:22:38,930 trainer.py:1023: Finished setting up components: Model, loss, optim, meters etc. INFO 2025-01-20 16:22:38,930 trainer.py: 314: Moving components to device cuda:0 and local rank 0. INFO 2025-01-20 16:22:40,574 trainer.py: 320: Done moving components to device cuda:0 and local rank 0. INFO 2025-01-20 16:22:40,590 optimizer.py: 248: Matches for param_name [image_encoder.*]: {'image_encoder.trunk.blocks.21.mlp.layers.0.weight', 'image_encoder.trunk.blocks.37.attn.proj.weight', 'image_encoder.trunk.blocks.13.mlp.layers.0.weight', 'image_encoder.trunk.blocks.31.mlp.layers.0.weight', 'image_encoder.trunk.blocks.13.norm1.bias', 'image_encoder.trunk.blocks.44.mlp.layers.1.bias', 'image_encoder.trunk.blocks.23.norm1.weight', 'image_encoder.trunk.blocks.0.mlp.layers.1.bias', 'image_encoder.trunk.blocks.38.norm2.weight', 'image_encoder.trunk.blocks.30.mlp.layers.1.bias', 'image_encoder.trunk.blocks.40.norm1.bias', 'image_encoder.trunk.blocks.22.mlp.layers.1.weight', 'image_encoder.trunk.blocks.30.attn.proj.weight', 'image_encoder.trunk.blocks.8.mlp.layers.0.weight', 'image_encoder.trunk.blocks.27.norm2.bias', 'image_encoder.trunk.blocks.11.mlp.layers.1.weight', 'image_encoder.trunk.blocks.16.norm1.weight', 'image_encoder.trunk.blocks.46.mlp.layers.0.weight', 
'image_encoder.trunk.blocks.40.norm1.weight', 'image_encoder.trunk.blocks.23.mlp.layers.1.bias', 'image_encoder.trunk.blocks.28.norm1.weight', 'image_encoder.trunk.blocks.10.norm1.weight', 'image_encoder.trunk.blocks.2.norm2.weight', 'image_encoder.trunk.blocks.35.norm2.weight', 'image_encoder.trunk.blocks.24.mlp.layers.0.bias', 'image_encoder.trunk.blocks.6.norm1.weight', 'image_encoder.trunk.blocks.32.norm2.bias', 'image_encoder.trunk.blocks.20.mlp.layers.0.weight', 'image_encoder.trunk.blocks.31.mlp.layers.1.bias', 'image_encoder.trunk.blocks.41.norm1.bias', 'image_encoder.trunk.blocks.15.attn.qkv.bias', 'image_encoder.trunk.blocks.0.mlp.layers.0.weight', 'image_encoder.trunk.blocks.17.norm1.weight', 'image_encoder.trunk.blocks.3.norm2.bias', 'image_encoder.trunk.blocks.33.attn.qkv.weight', 'image_encoder.trunk.blocks.2.attn.proj.bias', 'image_encoder.trunk.blocks.19.attn.qkv.weight', 'image_encoder.trunk.blocks.10.mlp.layers.1.bias', 'image_encoder.trunk.blocks.44.attn.qkv.weight', 'image_encoder.trunk.blocks.38.norm2.bias', 'image_encoder.trunk.blocks.22.norm2.bias', 'image_encoder.trunk.blocks.0.norm1.weight', 'image_encoder.trunk.blocks.18.norm1.weight', 'image_encoder.trunk.blocks.35.attn.proj.bias', 'image_encoder.trunk.blocks.1.attn.proj.bias', 'image_encoder.trunk.blocks.40.mlp.layers.0.weight', 'image_encoder.trunk.blocks.33.norm1.bias', 'image_encoder.trunk.blocks.31.attn.qkv.bias', 'image_encoder.trunk.blocks.14.mlp.layers.0.weight', 'image_encoder.trunk.blocks.41.norm2.bias', 'image_encoder.trunk.blocks.18.attn.proj.weight', 'image_encoder.trunk.blocks.24.mlp.layers.0.weight', 'image_encoder.trunk.blocks.29.norm1.weight', 'image_encoder.trunk.blocks.39.norm2.bias', 'image_encoder.trunk.blocks.37.norm1.weight', 'image_encoder.trunk.blocks.1.attn.qkv.bias', 'image_encoder.trunk.blocks.4.attn.proj.weight', 'image_encoder.trunk.blocks.7.attn.qkv.bias', 'image_encoder.trunk.blocks.46.attn.proj.weight', 'image_encoder.trunk.blocks.31.norm1.bias', 
'image_encoder.trunk.blocks.32.attn.proj.weight', 'image_encoder.trunk.blocks.30.mlp.layers.1.weight', 'image_encoder.trunk.blocks.38.attn.proj.bias', 'image_encoder.trunk.blocks.15.mlp.layers.1.weight', 'image_encoder.trunk.blocks.26.norm1.weight', 'image_encoder.trunk.blocks.22.mlp.layers.1.bias', 'image_encoder.trunk.blocks.11.mlp.layers.0.bias', 'image_encoder.trunk.blocks.45.attn.proj.weight', 'image_encoder.trunk.blocks.10.norm2.weight', 'image_encoder.trunk.blocks.13.attn.proj.weight', 'image_encoder.trunk.blocks.25.mlp.layers.1.bias', 'image_encoder.trunk.blocks.14.norm1.weight', 'image_encoder.trunk.blocks.6.attn.proj.weight', 'image_encoder.trunk.blocks.25.attn.qkv.bias', 'image_encoder.trunk.blocks.25.mlp.layers.0.weight', 'image_encoder.trunk.blocks.9.attn.qkv.bias', 'image_encoder.trunk.blocks.24.attn.qkv.weight', 'image_encoder.trunk.blocks.14.mlp.layers.1.bias', 'image_encoder.trunk.blocks.31.mlp.layers.0.bias', 'image_encoder.trunk.blocks.41.mlp.layers.1.weight', 'image_encoder.trunk.blocks.5.mlp.layers.1.weight', 'image_encoder.trunk.blocks.22.norm2.weight', 'image_encoder.trunk.blocks.27.attn.qkv.bias', 'image_encoder.trunk.blocks.29.attn.qkv.bias', 'image_encoder.trunk.blocks.44.norm1.weight', 'image_encoder.trunk.blocks.42.attn.proj.bias', 'image_encoder.trunk.blocks.4.norm1.weight', 'image_encoder.trunk.blocks.17.mlp.layers.1.bias', 'image_encoder.trunk.blocks.45.norm2.weight', 'image_encoder.trunk.blocks.21.attn.proj.weight', 'image_encoder.trunk.blocks.24.norm2.bias', 'image_encoder.trunk.blocks.35.mlp.layers.0.bias', 'image_encoder.trunk.blocks.21.attn.qkv.bias', 'image_encoder.trunk.blocks.46.attn.qkv.weight', 'image_encoder.trunk.blocks.27.attn.qkv.weight', 'image_encoder.trunk.blocks.44.attn.proj.weight', 'image_encoder.trunk.blocks.21.norm1.weight', 'image_encoder.trunk.blocks.32.norm1.bias', 'image_encoder.trunk.blocks.38.attn.qkv.bias', 'image_encoder.trunk.blocks.34.mlp.layers.0.weight', 
'image_encoder.trunk.blocks.28.attn.proj.weight', 'image_encoder.trunk.blocks.36.mlp.layers.1.weight', 'image_encoder.trunk.blocks.12.norm2.weight', 'image_encoder.neck.convs.3.conv.bias', 'image_encoder.trunk.blocks.43.attn.proj.bias', 'image_encoder.trunk.blocks.21.attn.proj.bias', 'image_encoder.trunk.blocks.2.attn.proj.weight', 'image_encoder.trunk.blocks.8.norm1.weight', 'image_encoder.trunk.blocks.26.mlp.layers.1.weight', 'image_encoder.trunk.blocks.1.mlp.layers.0.weight', 'image_encoder.trunk.blocks.13.norm2.weight', 'image_encoder.trunk.blocks.22.mlp.layers.0.weight', 'image_encoder.trunk.blocks.29.mlp.layers.1.weight', 'image_encoder.trunk.blocks.21.norm1.bias', 'image_encoder.trunk.blocks.25.attn.proj.weight', 'image_encoder.trunk.blocks.6.attn.qkv.weight', 'image_encoder.trunk.blocks.4.norm2.bias', 'image_encoder.trunk.blocks.38.norm1.bias', 'image_encoder.trunk.blocks.16.attn.qkv.weight', 'image_encoder.trunk.blocks.38.attn.proj.weight', 'image_encoder.trunk.blocks.38.mlp.layers.0.bias', 'image_encoder.trunk.blocks.8.mlp.layers.1.bias', 'image_encoder.trunk.blocks.8.norm1.bias', 'image_encoder.trunk.blocks.18.attn.qkv.bias', 'image_encoder.trunk.blocks.16.norm2.weight', 'image_encoder.trunk.blocks.15.attn.proj.weight', 'image_encoder.trunk.blocks.47.mlp.layers.1.bias', 'image_encoder.trunk.blocks.12.attn.qkv.weight', 'image_encoder.trunk.blocks.45.norm1.weight', 'image_encoder.trunk.blocks.14.attn.proj.bias', 'image_encoder.trunk.blocks.25.mlp.layers.0.bias', 'image_encoder.trunk.blocks.47.attn.proj.weight', 'image_encoder.trunk.blocks.46.attn.qkv.bias', 'image_encoder.trunk.blocks.4.mlp.layers.1.weight', 'image_encoder.trunk.blocks.29.attn.proj.weight', 'image_encoder.trunk.blocks.39.attn.proj.bias', 'image_encoder.trunk.blocks.26.norm2.weight', 'image_encoder.trunk.blocks.13.mlp.layers.1.weight', 'image_encoder.trunk.blocks.39.attn.qkv.bias', 'image_encoder.trunk.blocks.43.norm2.weight', 'image_encoder.trunk.blocks.19.mlp.layers.1.bias', 
'image_encoder.trunk.blocks.47.mlp.layers.0.bias', 'image_encoder.trunk.blocks.28.norm2.bias', 'image_encoder.trunk.blocks.18.attn.qkv.weight', 'image_encoder.trunk.blocks.9.mlp.layers.1.weight', 'image_encoder.trunk.blocks.4.mlp.layers.0.weight', 'image_encoder.trunk.blocks.5.mlp.layers.0.bias', 'image_encoder.trunk.blocks.4.norm2.weight', 'image_encoder.trunk.blocks.20.attn.proj.bias', 'image_encoder.trunk.blocks.31.attn.proj.bias', 'image_encoder.trunk.blocks.30.attn.qkv.weight', 'image_encoder.trunk.blocks.12.mlp.layers.1.bias', 'image_encoder.trunk.blocks.6.mlp.layers.0.bias', 'image_encoder.trunk.blocks.30.mlp.layers.0.weight', 'image_encoder.trunk.blocks.33.norm2.bias', 'image_encoder.trunk.blocks.19.mlp.layers.0.bias', 'image_encoder.trunk.blocks.5.mlp.layers.0.weight', 'image_encoder.neck.convs.3.conv.weight', 'image_encoder.trunk.blocks.8.attn.proj.bias', 'image_encoder.trunk.blocks.14.attn.proj.weight', 'image_encoder.trunk.blocks.39.attn.qkv.weight', 'image_encoder.trunk.blocks.0.mlp.layers.1.weight', 'image_encoder.trunk.blocks.37.norm2.weight', 'image_encoder.trunk.blocks.14.norm2.bias', 'image_encoder.trunk.blocks.5.mlp.layers.1.bias', 'image_encoder.trunk.blocks.18.norm2.weight', 'image_encoder.trunk.blocks.21.norm2.weight', 'image_encoder.trunk.blocks.27.attn.proj.weight', 'image_encoder.trunk.blocks.14.norm2.weight', 'image_encoder.trunk.blocks.34.mlp.layers.1.bias', 'image_encoder.trunk.blocks.35.attn.qkv.weight', 'image_encoder.trunk.blocks.42.norm1.weight', 'image_encoder.trunk.blocks.31.norm2.bias', 'image_encoder.trunk.blocks.42.norm2.bias', 'image_encoder.trunk.blocks.46.norm2.bias', 'image_encoder.trunk.blocks.8.norm2.weight', 'image_encoder.trunk.blocks.8.attn.qkv.bias', 'image_encoder.trunk.blocks.36.attn.proj.weight', 'image_encoder.trunk.blocks.30.attn.proj.bias', 'image_encoder.trunk.blocks.20.mlp.layers.1.bias', 'image_encoder.trunk.blocks.36.attn.proj.bias', 'image_encoder.trunk.blocks.38.mlp.layers.1.bias', 
'image_encoder.trunk.blocks.44.norm2.weight', 'image_encoder.trunk.blocks.42.mlp.layers.0.bias', 'image_encoder.trunk.blocks.8.proj.bias', 'image_encoder.trunk.blocks.25.attn.qkv.weight', 'image_encoder.trunk.blocks.39.norm1.bias', 'image_encoder.trunk.blocks.9.norm1.weight', 'image_encoder.trunk.blocks.17.attn.proj.weight', 'image_encoder.trunk.blocks.38.mlp.layers.1.weight', 'image_encoder.trunk.blocks.17.attn.qkv.weight', 'image_encoder.trunk.blocks.14.attn.qkv.weight', 'image_encoder.trunk.blocks.42.mlp.layers.0.weight', 'image_encoder.trunk.blocks.29.norm2.weight', 'image_encoder.trunk.blocks.32.mlp.layers.0.bias', 'image_encoder.trunk.blocks.4.mlp.layers.0.bias', 'image_encoder.trunk.blocks.16.attn.proj.bias', 'image_encoder.trunk.blocks.13.mlp.layers.0.bias', 'image_encoder.trunk.blocks.43.mlp.layers.1.weight', 'image_encoder.trunk.blocks.16.norm2.bias', 'image_encoder.trunk.blocks.5.norm2.weight', 'image_encoder.trunk.blocks.27.mlp.layers.1.bias', 'image_encoder.trunk.blocks.19.norm1.weight', 'image_encoder.trunk.blocks.7.mlp.layers.0.bias', 'image_encoder.trunk.blocks.11.norm2.weight', 'image_encoder.trunk.blocks.44.norm1.bias', 'image_encoder.trunk.blocks.35.mlp.layers.1.weight', 'image_encoder.trunk.blocks.16.attn.proj.weight', 'image_encoder.trunk.blocks.6.norm1.bias', 'image_encoder.trunk.blocks.42.norm1.bias', 'image_encoder.trunk.blocks.42.attn.qkv.bias', 'image_encoder.trunk.blocks.16.mlp.layers.1.weight', 'image_encoder.trunk.blocks.40.mlp.layers.0.bias', 'image_encoder.trunk.blocks.10.norm2.bias', 'image_encoder.trunk.blocks.16.norm1.bias', 'image_encoder.trunk.blocks.6.attn.proj.bias', 'image_encoder.trunk.blocks.28.attn.qkv.weight', 'image_encoder.trunk.blocks.19.norm1.bias', 'image_encoder.trunk.blocks.47.norm2.weight', 'image_encoder.trunk.blocks.47.mlp.layers.0.weight', 'image_encoder.trunk.blocks.11.mlp.layers.0.weight', 'image_encoder.trunk.blocks.9.norm2.bias', 'image_encoder.trunk.blocks.34.attn.proj.weight', 
'image_encoder.trunk.blocks.2.mlp.layers.1.bias', 'image_encoder.trunk.blocks.23.norm2.weight', 'image_encoder.trunk.blocks.9.norm2.weight', 'image_encoder.trunk.blocks.3.mlp.layers.1.weight', 'image_encoder.trunk.blocks.35.mlp.layers.1.bias', 'image_encoder.trunk.blocks.18.attn.proj.bias', 'image_encoder.trunk.blocks.25.norm2.weight', 'image_encoder.trunk.blocks.27.norm1.weight', 'image_encoder.trunk.blocks.5.norm1.bias', 'image_encoder.trunk.blocks.32.mlp.layers.0.weight', 'image_encoder.trunk.blocks.31.norm1.weight', 'image_encoder.trunk.blocks.1.mlp.layers.1.weight', 'image_encoder.trunk.blocks.7.norm2.weight', 'image_encoder.trunk.blocks.23.norm2.bias', 'image_encoder.trunk.blocks.36.norm1.weight', 'image_encoder.trunk.blocks.40.mlp.layers.1.weight', 'image_encoder.trunk.blocks.9.attn.proj.bias', 'image_encoder.trunk.blocks.39.mlp.layers.1.weight', 'image_encoder.trunk.blocks.37.attn.qkv.weight', 'image_encoder.trunk.blocks.4.attn.proj.bias', 'image_encoder.trunk.blocks.36.mlp.layers.0.bias', 'image_encoder.trunk.patch_embed.proj.bias', 'image_encoder.trunk.blocks.17.attn.qkv.bias', 'image_encoder.trunk.blocks.42.norm2.weight', 'image_encoder.trunk.blocks.28.norm1.bias', 'image_encoder.trunk.blocks.29.mlp.layers.1.bias', 'image_encoder.trunk.blocks.19.attn.qkv.bias', 'image_encoder.trunk.blocks.32.attn.proj.bias', 'image_encoder.trunk.blocks.0.attn.qkv.weight', 'image_encoder.trunk.blocks.5.attn.qkv.bias', 'image_encoder.trunk.blocks.26.mlp.layers.1.bias', 'image_encoder.trunk.blocks.3.norm1.weight', 'image_encoder.trunk.blocks.33.mlp.layers.1.weight', 'image_encoder.trunk.blocks.29.attn.proj.bias', 'image_encoder.trunk.blocks.41.mlp.layers.0.weight', 'image_encoder.trunk.blocks.47.mlp.layers.1.weight', 'image_encoder.trunk.blocks.15.mlp.layers.0.weight', 'image_encoder.trunk.blocks.1.attn.qkv.weight', 'image_encoder.trunk.blocks.46.mlp.layers.1.bias', 'image_encoder.trunk.blocks.36.attn.qkv.bias', 'image_encoder.trunk.blocks.20.norm2.weight', 
'image_encoder.trunk.blocks.25.mlp.layers.1.weight', 'image_encoder.trunk.blocks.17.mlp.layers.0.bias', 'image_encoder.trunk.pos_embed', 'image_encoder.trunk.blocks.35.attn.proj.weight', 'image_encoder.trunk.blocks.41.norm2.weight', 'image_encoder.trunk.blocks.30.mlp.layers.0.bias', 'image_encoder.trunk.blocks.45.attn.proj.bias', 'image_encoder.trunk.blocks.10.attn.qkv.bias', 'image_encoder.neck.convs.1.conv.bias', 'image_encoder.trunk.blocks.7.norm1.bias', 'image_encoder.trunk.blocks.45.mlp.layers.0.weight', 'image_encoder.trunk.blocks.11.mlp.layers.1.bias', 'image_encoder.trunk.blocks.3.mlp.layers.0.bias', 'image_encoder.trunk.blocks.2.norm1.bias', 'image_encoder.trunk.blocks.7.norm1.weight', 'image_encoder.trunk.blocks.30.attn.qkv.bias', 'image_encoder.trunk.blocks.21.mlp.layers.1.weight', 'image_encoder.trunk.blocks.28.attn.proj.bias', 'image_encoder.trunk.blocks.42.mlp.layers.1.bias', 'image_encoder.trunk.blocks.22.attn.qkv.bias', 'image_encoder.trunk.blocks.21.norm2.bias', 'image_encoder.trunk.blocks.1.norm2.bias', 'image_encoder.trunk.blocks.31.mlp.layers.1.weight', 'image_encoder.trunk.blocks.47.norm1.bias', 'image_encoder.trunk.blocks.34.attn.qkv.bias', 'image_encoder.trunk.blocks.34.norm1.weight', 'image_encoder.trunk.blocks.32.attn.qkv.bias', 'image_encoder.trunk.blocks.47.attn.qkv.weight', 'image_encoder.trunk.blocks.17.mlp.layers.0.weight', 'image_encoder.trunk.blocks.12.mlp.layers.0.weight', 'image_encoder.trunk.blocks.15.mlp.layers.1.bias', 'image_encoder.trunk.blocks.9.mlp.layers.0.bias', 'image_encoder.trunk.blocks.6.mlp.layers.0.weight', 'image_encoder.trunk.blocks.39.mlp.layers.0.weight', 'image_encoder.trunk.blocks.24.mlp.layers.1.weight', 'image_encoder.trunk.blocks.29.norm1.bias', 'image_encoder.trunk.blocks.7.attn.qkv.weight', 'image_encoder.trunk.blocks.12.norm1.weight', 'image_encoder.trunk.blocks.40.attn.proj.weight', 'image_encoder.trunk.blocks.37.attn.qkv.bias', 'image_encoder.trunk.blocks.44.attn.proj.bias', 
'image_encoder.trunk.blocks.23.mlp.layers.1.weight', 'image_encoder.trunk.blocks.43.norm1.bias', 'image_encoder.trunk.blocks.37.mlp.layers.1.bias', 'image_encoder.trunk.blocks.3.attn.qkv.bias', 'image_encoder.trunk.blocks.13.attn.qkv.weight', 'image_encoder.trunk.blocks.9.attn.qkv.weight', 'image_encoder.trunk.blocks.8.mlp.layers.0.bias', 'image_encoder.trunk.blocks.27.attn.proj.bias', 'image_encoder.trunk.blocks.28.mlp.layers.1.bias', 'image_encoder.trunk.blocks.0.attn.proj.bias', 'image_encoder.trunk.blocks.35.norm1.weight', 'image_encoder.trunk.blocks.44.proj.bias', 'image_encoder.trunk.blocks.26.attn.proj.bias', 'image_encoder.trunk.blocks.40.norm2.weight', 'image_encoder.trunk.blocks.41.attn.qkv.weight', 'image_encoder.trunk.blocks.0.attn.qkv.bias', 'image_encoder.trunk.blocks.45.norm2.bias', 'image_encoder.trunk.blocks.21.mlp.layers.1.bias', 'image_encoder.trunk.blocks.39.attn.proj.weight', 'image_encoder.trunk.blocks.5.attn.proj.weight', 'image_encoder.trunk.blocks.31.attn.qkv.weight', 'image_encoder.trunk.blocks.22.attn.qkv.weight', 'image_encoder.neck.convs.2.conv.weight', 'image_encoder.trunk.blocks.43.attn.qkv.weight', 'image_encoder.trunk.blocks.18.mlp.layers.1.bias', 'image_encoder.trunk.blocks.1.mlp.layers.1.bias', 'image_encoder.trunk.blocks.40.norm2.bias', 'image_encoder.trunk.blocks.27.mlp.layers.0.bias', 'image_encoder.trunk.blocks.44.mlp.layers.1.weight', 'image_encoder.trunk.blocks.37.mlp.layers.0.bias', 'image_encoder.trunk.blocks.2.attn.qkv.weight', 'image_encoder.trunk.blocks.37.attn.proj.bias', 'image_encoder.trunk.blocks.23.attn.qkv.weight', 'image_encoder.trunk.blocks.4.mlp.layers.1.bias', 'image_encoder.trunk.blocks.35.norm2.bias', 'image_encoder.trunk.blocks.2.attn.qkv.bias', 'image_encoder.trunk.blocks.2.mlp.layers.1.weight', 'image_encoder.trunk.blocks.33.mlp.layers.0.weight', 'image_encoder.trunk.blocks.20.norm1.weight', 'image_encoder.trunk.blocks.33.mlp.layers.1.bias', 'image_encoder.trunk.blocks.38.attn.qkv.weight', 
'image_encoder.trunk.blocks.34.norm2.weight', 'image_encoder.trunk.blocks.31.norm2.weight', 'image_encoder.trunk.blocks.20.attn.qkv.weight', 'image_encoder.trunk.blocks.22.attn.proj.bias', 'image_encoder.trunk.blocks.36.mlp.layers.0.weight', 'image_encoder.trunk.blocks.19.norm2.weight', 'image_encoder.trunk.blocks.13.norm2.bias', 'image_encoder.trunk.blocks.10.mlp.layers.0.bias', 'image_encoder.trunk.pos_embed_window', 'image_encoder.trunk.blocks.41.attn.proj.weight', 'image_encoder.trunk.blocks.37.mlp.layers.0.weight', 'image_encoder.trunk.blocks.2.mlp.layers.0.weight', 'image_encoder.trunk.blocks.18.norm1.bias', 'image_encoder.neck.convs.2.conv.bias', 'image_encoder.trunk.blocks.27.mlp.layers.0.weight', 'image_encoder.trunk.blocks.39.mlp.layers.1.bias', 'image_encoder.trunk.blocks.5.attn.proj.bias', 'image_encoder.trunk.blocks.33.mlp.layers.0.bias', 'image_encoder.trunk.blocks.17.norm2.bias', 'image_encoder.trunk.blocks.8.attn.qkv.weight', 'image_encoder.trunk.blocks.39.mlp.layers.0.bias', 'image_encoder.trunk.blocks.12.mlp.layers.1.weight', 'image_encoder.trunk.blocks.26.mlp.layers.0.bias', 'image_encoder.trunk.blocks.37.norm1.bias', 'image_encoder.trunk.blocks.45.attn.qkv.weight', 'image_encoder.trunk.blocks.33.attn.proj.weight', 'image_encoder.trunk.blocks.6.norm2.weight', 'image_encoder.trunk.blocks.43.mlp.layers.1.bias', 'image_encoder.trunk.blocks.34.attn.proj.bias', 'image_encoder.trunk.blocks.19.norm2.bias', 'image_encoder.trunk.blocks.26.attn.qkv.weight', 'image_encoder.trunk.blocks.17.attn.proj.bias', 'image_encoder.trunk.blocks.22.norm1.bias', 'image_encoder.trunk.blocks.22.mlp.layers.0.bias', 'image_encoder.trunk.blocks.43.norm2.bias', 'image_encoder.trunk.blocks.17.mlp.layers.1.weight', 'image_encoder.trunk.blocks.16.mlp.layers.1.bias', 'image_encoder.trunk.blocks.41.attn.proj.bias', 'image_encoder.trunk.blocks.25.attn.proj.bias', 'image_encoder.trunk.blocks.34.norm2.bias', 'image_encoder.trunk.blocks.0.mlp.layers.0.bias', 
'image_encoder.trunk.blocks.46.mlp.layers.0.bias', 'image_encoder.trunk.blocks.30.norm2.weight', 'image_encoder.trunk.blocks.44.mlp.layers.0.weight', 'image_encoder.trunk.blocks.3.norm1.bias', 'image_encoder.trunk.blocks.3.attn.proj.weight', 'image_encoder.trunk.blocks.11.attn.qkv.weight', 'image_encoder.trunk.blocks.29.mlp.layers.0.bias', 'image_encoder.trunk.blocks.6.mlp.layers.1.bias', 'image_encoder.trunk.blocks.20.attn.qkv.bias', 'image_encoder.trunk.blocks.34.norm1.bias', 'image_encoder.trunk.blocks.4.norm1.bias', 'image_encoder.trunk.blocks.41.mlp.layers.0.bias', 'image_encoder.trunk.blocks.0.norm2.weight', 'image_encoder.trunk.blocks.39.norm1.weight', 'image_encoder.trunk.blocks.16.attn.qkv.bias', 'image_encoder.trunk.blocks.3.mlp.layers.0.weight', 'image_encoder.trunk.blocks.34.attn.qkv.weight', 'image_encoder.trunk.blocks.5.attn.qkv.weight', 'image_encoder.trunk.blocks.3.mlp.layers.1.bias', 'image_encoder.trunk.blocks.11.norm1.bias', 'image_encoder.trunk.blocks.28.norm2.weight', 'image_encoder.neck.convs.0.conv.bias', 'image_encoder.trunk.blocks.26.attn.proj.weight', 'image_encoder.trunk.blocks.47.attn.proj.bias', 'image_encoder.trunk.blocks.0.norm2.bias', 'image_encoder.trunk.blocks.4.attn.qkv.bias', 'image_encoder.trunk.blocks.36.attn.qkv.weight', 'image_encoder.trunk.blocks.44.mlp.layers.0.bias', 'image_encoder.trunk.blocks.7.mlp.layers.1.weight', 'image_encoder.trunk.blocks.10.mlp.layers.0.weight', 'image_encoder.trunk.blocks.2.norm2.bias', 'image_encoder.trunk.blocks.19.attn.proj.bias', 'image_encoder.trunk.blocks.12.mlp.layers.0.bias', 'image_encoder.trunk.blocks.31.attn.proj.weight', 'image_encoder.trunk.blocks.29.attn.qkv.weight', 'image_encoder.trunk.blocks.35.mlp.layers.0.weight', 'image_encoder.trunk.blocks.12.norm2.bias', 'image_encoder.trunk.blocks.23.attn.proj.weight', 'image_encoder.trunk.blocks.6.attn.qkv.bias', 'image_encoder.trunk.blocks.15.mlp.layers.0.bias', 'image_encoder.trunk.blocks.26.norm2.bias', 
'image_encoder.trunk.blocks.17.norm1.bias', 'image_encoder.trunk.blocks.11.attn.proj.weight', 'image_encoder.trunk.blocks.27.mlp.layers.1.weight', 'image_encoder.trunk.blocks.36.norm1.bias', 'image_encoder.trunk.blocks.2.proj.bias', 'image_encoder.trunk.blocks.18.mlp.layers.0.weight', 'image_encoder.trunk.blocks.9.norm1.bias', 'image_encoder.trunk.blocks.45.mlp.layers.0.bias', 'image_encoder.trunk.blocks.23.norm1.bias', 'image_encoder.trunk.blocks.40.attn.qkv.weight', 'image_encoder.trunk.blocks.46.mlp.layers.1.weight', 'image_encoder.trunk.blocks.24.norm1.bias', 'image_encoder.trunk.blocks.33.norm2.weight', 'image_encoder.trunk.blocks.42.attn.qkv.weight', 'image_encoder.trunk.blocks.12.attn.qkv.bias', 'image_encoder.trunk.blocks.33.attn.proj.bias', 'image_encoder.trunk.blocks.20.norm1.bias', 'image_encoder.trunk.blocks.25.norm2.bias', 'image_encoder.trunk.blocks.14.mlp.layers.1.weight', 'image_encoder.trunk.blocks.35.attn.qkv.bias', 'image_encoder.trunk.blocks.10.mlp.layers.1.weight', 'image_encoder.trunk.blocks.35.norm1.bias', 'image_encoder.trunk.blocks.29.norm2.bias', 'image_encoder.trunk.blocks.18.mlp.layers.1.weight', 'image_encoder.trunk.blocks.40.attn.qkv.bias', 'image_encoder.trunk.blocks.41.attn.qkv.bias', 'image_encoder.trunk.blocks.46.norm1.bias', 'image_encoder.trunk.blocks.28.mlp.layers.1.weight', 'image_encoder.trunk.blocks.27.norm1.bias', 'image_encoder.trunk.blocks.32.mlp.layers.1.bias', 'image_encoder.trunk.blocks.23.mlp.layers.0.bias', 'image_encoder.trunk.blocks.0.attn.proj.weight', 'image_encoder.trunk.blocks.15.norm2.weight', 'image_encoder.trunk.blocks.13.norm1.weight', 'image_encoder.trunk.blocks.11.attn.qkv.bias', 'image_encoder.trunk.blocks.34.mlp.layers.0.bias', 'image_encoder.trunk.blocks.8.mlp.layers.1.weight', 'image_encoder.trunk.blocks.23.attn.proj.bias', 'image_encoder.trunk.blocks.46.norm1.weight', 'image_encoder.trunk.blocks.26.attn.qkv.bias', 'image_encoder.trunk.blocks.10.norm1.bias', 
'image_encoder.trunk.blocks.23.attn.qkv.bias', 'image_encoder.trunk.blocks.17.norm2.weight', 'image_encoder.trunk.blocks.20.attn.proj.weight', 'image_encoder.trunk.blocks.40.mlp.layers.1.bias', 'image_encoder.trunk.blocks.36.norm2.weight', 'image_encoder.trunk.blocks.20.mlp.layers.0.bias', 'image_encoder.trunk.blocks.28.attn.qkv.bias', 'image_encoder.trunk.blocks.32.mlp.layers.1.weight', 'image_encoder.trunk.blocks.26.mlp.layers.0.weight', 'image_encoder.trunk.blocks.45.attn.qkv.bias', 'image_encoder.trunk.blocks.30.norm1.weight', 'image_encoder.trunk.blocks.15.norm2.bias', 'image_encoder.trunk.blocks.26.norm1.bias', 'image_encoder.trunk.blocks.14.norm1.bias', 'image_encoder.trunk.blocks.34.mlp.layers.1.weight', 'image_encoder.trunk.blocks.19.mlp.layers.1.weight', 'image_encoder.trunk.blocks.32.attn.qkv.weight', 'image_encoder.trunk.blocks.38.norm1.weight', 'image_encoder.trunk.blocks.25.norm1.bias', 'image_encoder.trunk.blocks.1.mlp.layers.0.bias', 'image_encoder.trunk.blocks.37.norm2.bias', 'image_encoder.trunk.blocks.30.norm1.bias', 'image_encoder.trunk.blocks.43.norm1.weight', 'image_encoder.trunk.blocks.15.attn.qkv.weight', 'image_encoder.trunk.blocks.4.attn.qkv.weight', 'image_encoder.trunk.blocks.10.attn.qkv.weight', 'image_encoder.trunk.blocks.36.mlp.layers.1.bias', 'image_encoder.trunk.blocks.24.norm1.weight', 'image_encoder.trunk.blocks.8.attn.proj.weight', 'image_encoder.trunk.blocks.45.mlp.layers.1.bias', 'image_encoder.trunk.blocks.28.mlp.layers.0.weight', 'image_encoder.trunk.blocks.5.norm2.bias', 'image_encoder.trunk.blocks.12.norm1.bias', 'image_encoder.trunk.blocks.15.attn.proj.bias', 'image_encoder.trunk.blocks.2.proj.weight', 'image_encoder.trunk.blocks.46.norm2.weight', 'image_encoder.trunk.blocks.37.mlp.layers.1.weight', 'image_encoder.trunk.blocks.8.proj.weight', 'image_encoder.trunk.blocks.47.norm2.bias', 'image_encoder.trunk.blocks.29.mlp.layers.0.weight', 'image_encoder.trunk.blocks.43.attn.proj.weight', 
'image_encoder.trunk.blocks.33.attn.qkv.bias', 'image_encoder.trunk.patch_embed.proj.weight', 'image_encoder.trunk.blocks.1.attn.proj.weight', 'image_encoder.trunk.blocks.24.attn.proj.weight', 'image_encoder.trunk.blocks.32.norm1.weight', 'image_encoder.trunk.blocks.24.mlp.layers.1.bias', 'image_encoder.trunk.blocks.8.norm2.bias', 'image_encoder.trunk.blocks.11.attn.proj.bias', 'image_encoder.trunk.blocks.1.norm1.bias', 'image_encoder.trunk.blocks.21.attn.qkv.weight', 'image_encoder.trunk.blocks.3.attn.proj.bias', 'image_encoder.trunk.blocks.20.norm2.bias', 'image_encoder.trunk.blocks.43.attn.qkv.bias', 'image_encoder.trunk.blocks.9.mlp.layers.1.bias', 'image_encoder.trunk.blocks.24.attn.qkv.bias', 'image_encoder.trunk.blocks.3.norm2.weight', 'image_encoder.trunk.blocks.12.attn.proj.bias', 'image_encoder.trunk.blocks.47.attn.qkv.bias', 'image_encoder.trunk.blocks.44.proj.weight', 'image_encoder.trunk.blocks.19.mlp.layers.0.weight', 'image_encoder.trunk.blocks.27.norm2.weight', 'image_encoder.trunk.blocks.10.attn.proj.weight', 'image_encoder.trunk.blocks.40.attn.proj.bias', 'image_encoder.trunk.blocks.19.attn.proj.weight', 'image_encoder.trunk.blocks.13.mlp.layers.1.bias', 'image_encoder.trunk.blocks.1.norm2.weight', 'image_encoder.trunk.blocks.24.attn.proj.bias', 'image_encoder.neck.convs.0.conv.weight', 'image_encoder.trunk.blocks.25.norm1.weight', 'image_encoder.trunk.blocks.13.attn.qkv.bias', 'image_encoder.trunk.blocks.13.attn.proj.bias', 'image_encoder.trunk.blocks.3.attn.qkv.weight', 'image_encoder.trunk.blocks.7.mlp.layers.1.bias', 'image_encoder.trunk.blocks.46.attn.proj.bias', 'image_encoder.trunk.blocks.7.norm2.bias', 'image_encoder.trunk.blocks.43.mlp.layers.0.bias', 'image_encoder.trunk.blocks.36.norm2.bias', 'image_encoder.trunk.blocks.42.mlp.layers.1.weight', 'image_encoder.trunk.blocks.5.norm1.weight', 'image_encoder.trunk.blocks.1.norm1.weight', 'image_encoder.trunk.blocks.42.attn.proj.weight', 'image_encoder.trunk.blocks.22.norm1.weight', 
'image_encoder.trunk.blocks.11.norm2.bias', 'image_encoder.trunk.blocks.32.norm2.weight', 'image_encoder.trunk.blocks.7.attn.proj.weight', 'image_encoder.trunk.blocks.9.mlp.layers.0.weight', 'image_encoder.trunk.blocks.22.attn.proj.weight', 'image_encoder.trunk.blocks.44.norm2.bias', 'image_encoder.trunk.blocks.10.attn.proj.bias', 'image_encoder.trunk.blocks.12.attn.proj.weight', 'image_encoder.trunk.blocks.30.norm2.bias', 'image_encoder.trunk.blocks.6.mlp.layers.1.weight', 'image_encoder.trunk.blocks.44.attn.qkv.bias', 'image_encoder.trunk.blocks.15.norm1.weight', 'image_encoder.trunk.blocks.47.norm1.weight', 'image_encoder.trunk.blocks.21.mlp.layers.0.bias', 'image_encoder.trunk.blocks.2.mlp.layers.0.bias', 'image_encoder.trunk.blocks.18.mlp.layers.0.bias', 'image_encoder.trunk.blocks.41.norm1.weight', 'image_encoder.trunk.blocks.2.norm1.weight', 'image_encoder.trunk.blocks.11.norm1.weight', 'image_encoder.trunk.blocks.6.norm2.bias', 'image_encoder.trunk.blocks.23.mlp.layers.0.weight', 'image_encoder.trunk.blocks.14.attn.qkv.bias', 'image_encoder.trunk.blocks.33.norm1.weight', 'image_encoder.trunk.blocks.38.mlp.layers.0.weight', 'image_encoder.trunk.blocks.20.mlp.layers.1.weight', 'image_encoder.trunk.blocks.14.mlp.layers.0.bias', 'image_encoder.trunk.blocks.41.mlp.layers.1.bias', 'image_encoder.trunk.blocks.7.mlp.layers.0.weight', 'image_encoder.trunk.blocks.7.attn.proj.bias', 'image_encoder.trunk.blocks.0.norm1.bias', 'image_encoder.trunk.blocks.15.norm1.bias', 'image_encoder.trunk.blocks.16.mlp.layers.0.bias', 'image_encoder.trunk.blocks.24.norm2.weight', 'image_encoder.trunk.blocks.18.norm2.bias', 'image_encoder.trunk.blocks.43.mlp.layers.0.weight', 'image_encoder.trunk.blocks.28.mlp.layers.0.bias', 'image_encoder.trunk.blocks.16.mlp.layers.0.weight', 'image_encoder.trunk.blocks.45.mlp.layers.1.weight', 'image_encoder.neck.convs.1.conv.weight', 'image_encoder.trunk.blocks.39.norm2.weight', 'image_encoder.trunk.blocks.9.attn.proj.weight', 
'image_encoder.trunk.blocks.45.norm1.bias'} INFO 2025-01-20 16:22:40,592 optimizer.py: 248: Matches for param_name [*bias*]: {'image_encoder.trunk.blocks.44.mlp.layers.1.bias', 'image_encoder.trunk.blocks.13.norm1.bias', 'image_encoder.trunk.blocks.0.mlp.layers.1.bias', 'image_encoder.trunk.blocks.30.mlp.layers.1.bias', 'image_encoder.trunk.blocks.40.norm1.bias', 'image_encoder.trunk.blocks.27.norm2.bias', 'memory_attention.layers.2.norm2.bias', 'memory_encoder.fuser.layers.1.pwconv1.bias', 'memory_attention.layers.2.linear2.bias', 'sam_mask_decoder.transformer.final_attn_token_to_image.v_proj.bias', 'image_encoder.trunk.blocks.23.mlp.layers.1.bias', 'image_encoder.trunk.blocks.24.mlp.layers.0.bias', 'memory_attention.layers.3.cross_attn_image.out_proj.bias', 'image_encoder.trunk.blocks.32.norm2.bias', 'image_encoder.trunk.blocks.31.mlp.layers.1.bias', 'image_encoder.trunk.blocks.41.norm1.bias', 'image_encoder.trunk.blocks.15.attn.qkv.bias', 'image_encoder.trunk.blocks.3.norm2.bias', 'memory_encoder.mask_downsampler.encoder.4.bias', 'sam_mask_decoder.output_hypernetworks_mlps.0.layers.0.bias', 'image_encoder.trunk.blocks.2.attn.proj.bias', 'image_encoder.trunk.blocks.10.mlp.layers.1.bias', 'sam_mask_decoder.output_hypernetworks_mlps.1.layers.1.bias', 'image_encoder.trunk.blocks.38.norm2.bias', 'sam_mask_decoder.output_hypernetworks_mlps.3.layers.2.bias', 'image_encoder.trunk.blocks.22.norm2.bias', 'image_encoder.trunk.blocks.35.attn.proj.bias', 'image_encoder.trunk.blocks.1.attn.proj.bias', 'sam_mask_decoder.transformer.layers.0.self_attn.q_proj.bias', 'image_encoder.trunk.blocks.33.norm1.bias', 'sam_mask_decoder.pred_obj_score_head.layers.0.bias', 'image_encoder.trunk.blocks.31.attn.qkv.bias', 'image_encoder.trunk.blocks.41.norm2.bias', 'sam_mask_decoder.transformer.layers.0.norm1.bias', 'image_encoder.trunk.blocks.39.norm2.bias', 'memory_attention.layers.0.self_attn.q_proj.bias', 'image_encoder.trunk.blocks.1.attn.qkv.bias', 
'sam_mask_decoder.transformer.layers.1.norm4.bias', 'memory_attention.layers.0.cross_attn_image.out_proj.bias', 'image_encoder.trunk.blocks.7.attn.qkv.bias', 'memory_encoder.fuser.layers.1.pwconv2.bias', 'image_encoder.trunk.blocks.31.norm1.bias', 'memory_encoder.fuser.layers.1.dwconv.bias', 'image_encoder.trunk.blocks.38.attn.proj.bias', 'sam_mask_decoder.pred_obj_score_head.layers.1.bias', 'image_encoder.trunk.blocks.22.mlp.layers.1.bias', 'image_encoder.trunk.blocks.11.mlp.layers.0.bias', 'memory_attention.layers.3.self_attn.k_proj.bias', 'memory_encoder.mask_downsampler.encoder.12.bias', 'image_encoder.trunk.blocks.25.mlp.layers.1.bias', 'image_encoder.trunk.blocks.25.attn.qkv.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_image_to_token.out_proj.bias', 'image_encoder.trunk.blocks.9.attn.qkv.bias', 'image_encoder.trunk.blocks.31.mlp.layers.0.bias', 'image_encoder.trunk.blocks.14.mlp.layers.1.bias', 'image_encoder.trunk.blocks.27.attn.qkv.bias', 'memory_attention.layers.0.self_attn.v_proj.bias', 'image_encoder.trunk.blocks.29.attn.qkv.bias', 'sam_mask_decoder.output_upscaling.3.bias', 'image_encoder.trunk.blocks.42.attn.proj.bias', 'image_encoder.trunk.blocks.17.mlp.layers.1.bias', 'image_encoder.trunk.blocks.24.norm2.bias', 'image_encoder.trunk.blocks.35.mlp.layers.0.bias', 'image_encoder.trunk.blocks.21.attn.qkv.bias', 'image_encoder.trunk.blocks.32.norm1.bias', 'image_encoder.trunk.blocks.38.attn.qkv.bias', 'image_encoder.neck.convs.3.conv.bias', 'image_encoder.trunk.blocks.43.attn.proj.bias', 'image_encoder.trunk.blocks.21.attn.proj.bias', 'sam_mask_decoder.pred_obj_score_head.layers.2.bias', 'memory_attention.layers.0.norm2.bias', 'image_encoder.trunk.blocks.21.norm1.bias', 'image_encoder.trunk.blocks.4.norm2.bias', 'image_encoder.trunk.blocks.38.norm1.bias', 'image_encoder.trunk.blocks.38.mlp.layers.0.bias', 'image_encoder.trunk.blocks.8.mlp.layers.1.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_token_to_image.v_proj.bias', 
'memory_attention.layers.3.self_attn.q_proj.bias', 'image_encoder.trunk.blocks.8.norm1.bias', 'image_encoder.trunk.blocks.18.attn.qkv.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_image_to_token.q_proj.bias', 'memory_attention.layers.3.norm2.bias', 'memory_encoder.pix_feat_proj.bias', 'image_encoder.trunk.blocks.47.mlp.layers.1.bias', 'memory_attention.layers.1.linear2.bias', 'image_encoder.trunk.blocks.14.attn.proj.bias', 'image_encoder.trunk.blocks.25.mlp.layers.0.bias', 'image_encoder.trunk.blocks.46.attn.qkv.bias', 'obj_ptr_proj.layers.2.bias', 'image_encoder.trunk.blocks.39.attn.proj.bias', 'image_encoder.trunk.blocks.39.attn.qkv.bias', 'sam_mask_decoder.output_hypernetworks_mlps.1.layers.2.bias', 'image_encoder.trunk.blocks.19.mlp.layers.1.bias', 'image_encoder.trunk.blocks.47.mlp.layers.0.bias', 'image_encoder.trunk.blocks.28.norm2.bias', 'sam_mask_decoder.transformer.layers.0.mlp.layers.0.bias', 'sam_mask_decoder.transformer.layers.1.norm3.bias', 'memory_encoder.fuser.layers.0.dwconv.bias', 'image_encoder.trunk.blocks.5.mlp.layers.0.bias', 'memory_attention.layers.0.cross_attn_image.q_proj.bias', 'sam_mask_decoder.output_hypernetworks_mlps.3.layers.0.bias', 'image_encoder.trunk.blocks.20.attn.proj.bias', 'image_encoder.trunk.blocks.31.attn.proj.bias', 'memory_attention.layers.1.cross_attn_image.v_proj.bias', 'image_encoder.trunk.blocks.6.mlp.layers.0.bias', 'image_encoder.trunk.blocks.12.mlp.layers.1.bias', 'image_encoder.trunk.blocks.33.norm2.bias', 'image_encoder.trunk.blocks.19.mlp.layers.0.bias', 'image_encoder.trunk.blocks.8.attn.proj.bias', 'image_encoder.trunk.blocks.14.norm2.bias', 'sam_mask_decoder.transformer.layers.1.self_attn.out_proj.bias', 'image_encoder.trunk.blocks.5.mlp.layers.1.bias', 'image_encoder.trunk.blocks.34.mlp.layers.1.bias', 'memory_attention.layers.3.self_attn.out_proj.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_token_to_image.k_proj.bias', 'memory_attention.layers.3.cross_attn_image.k_proj.bias', 
'image_encoder.trunk.blocks.31.norm2.bias', 'image_encoder.trunk.blocks.42.norm2.bias', 'image_encoder.trunk.blocks.46.norm2.bias', 'image_encoder.trunk.blocks.8.attn.qkv.bias', 'image_encoder.trunk.blocks.30.attn.proj.bias', 'image_encoder.trunk.blocks.20.mlp.layers.1.bias', 'obj_ptr_proj.layers.1.bias', 'image_encoder.trunk.blocks.36.attn.proj.bias', 'image_encoder.trunk.blocks.38.mlp.layers.1.bias', 'memory_attention.layers.1.self_attn.q_proj.bias', 'image_encoder.trunk.blocks.42.mlp.layers.0.bias', 'image_encoder.trunk.blocks.8.proj.bias', 'image_encoder.trunk.blocks.39.norm1.bias', 'sam_mask_decoder.output_hypernetworks_mlps.3.layers.1.bias', 'sam_mask_decoder.transformer.final_attn_token_to_image.q_proj.bias', 'sam_mask_decoder.transformer.layers.0.norm3.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_image_to_token.v_proj.bias', 'memory_attention.layers.2.self_attn.k_proj.bias', 'image_encoder.trunk.blocks.32.mlp.layers.0.bias', 'sam_prompt_encoder.mask_downscaling.4.bias', 'image_encoder.trunk.blocks.4.mlp.layers.0.bias', 'image_encoder.trunk.blocks.16.attn.proj.bias', 'image_encoder.trunk.blocks.13.mlp.layers.0.bias', 'image_encoder.trunk.blocks.16.norm2.bias', 'image_encoder.trunk.blocks.27.mlp.layers.1.bias', 'image_encoder.trunk.blocks.7.mlp.layers.0.bias', 'image_encoder.trunk.blocks.44.norm1.bias', 'sam_prompt_encoder.mask_downscaling.3.bias', 'image_encoder.trunk.blocks.6.norm1.bias', 'image_encoder.trunk.blocks.42.norm1.bias', 'image_encoder.trunk.blocks.42.attn.qkv.bias', 'memory_attention.layers.0.norm3.bias', 'image_encoder.trunk.blocks.40.mlp.layers.0.bias', 'image_encoder.trunk.blocks.10.norm2.bias', 'image_encoder.trunk.blocks.16.norm1.bias', 'sam_mask_decoder.transformer.layers.0.self_attn.k_proj.bias', 'image_encoder.trunk.blocks.6.attn.proj.bias', 'image_encoder.trunk.blocks.19.norm1.bias', 'sam_mask_decoder.transformer.norm_final_attn.bias', 'memory_attention.layers.2.cross_attn_image.v_proj.bias', 
'memory_attention.layers.1.norm2.bias', 'image_encoder.trunk.blocks.9.norm2.bias', 'sam_mask_decoder.transformer.layers.1.mlp.layers.0.bias', 'image_encoder.trunk.blocks.2.mlp.layers.1.bias', 'image_encoder.trunk.blocks.35.mlp.layers.1.bias', 'memory_encoder.out_proj.bias', 'image_encoder.trunk.blocks.18.attn.proj.bias', 'image_encoder.trunk.blocks.5.norm1.bias', 'image_encoder.trunk.blocks.23.norm2.bias', 'sam_mask_decoder.transformer.layers.0.self_attn.v_proj.bias', 'image_encoder.trunk.blocks.9.attn.proj.bias', 'image_encoder.trunk.blocks.4.attn.proj.bias', 'image_encoder.trunk.blocks.36.mlp.layers.0.bias', 'image_encoder.trunk.patch_embed.proj.bias', 'image_encoder.trunk.blocks.17.attn.qkv.bias', 'image_encoder.trunk.blocks.28.norm1.bias', 'image_encoder.trunk.blocks.29.mlp.layers.1.bias', 'image_encoder.trunk.blocks.19.attn.qkv.bias', 'image_encoder.trunk.blocks.32.attn.proj.bias', 'memory_attention.layers.2.norm1.bias', 'image_encoder.trunk.blocks.5.attn.qkv.bias', 'image_encoder.trunk.blocks.26.mlp.layers.1.bias', 'image_encoder.trunk.blocks.29.attn.proj.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_token_to_image.v_proj.bias', 'image_encoder.trunk.blocks.46.mlp.layers.1.bias', 'image_encoder.trunk.blocks.36.attn.qkv.bias', 'memory_attention.layers.0.linear2.bias', 'image_encoder.trunk.blocks.17.mlp.layers.0.bias', 'memory_attention.layers.3.cross_attn_image.q_proj.bias', 'image_encoder.trunk.blocks.45.attn.proj.bias', 'image_encoder.trunk.blocks.30.mlp.layers.0.bias', 'image_encoder.trunk.blocks.10.attn.qkv.bias', 'image_encoder.neck.convs.1.conv.bias', 'image_encoder.trunk.blocks.7.norm1.bias', 'image_encoder.trunk.blocks.11.mlp.layers.1.bias', 'image_encoder.trunk.blocks.3.mlp.layers.0.bias', 'image_encoder.trunk.blocks.2.norm1.bias', 'sam_mask_decoder.transformer.layers.1.self_attn.v_proj.bias', 'image_encoder.trunk.blocks.30.attn.qkv.bias', 'image_encoder.trunk.blocks.28.attn.proj.bias', 'image_encoder.trunk.blocks.42.mlp.layers.1.bias', 
'image_encoder.trunk.blocks.22.attn.qkv.bias', 'image_encoder.trunk.blocks.21.norm2.bias', 'memory_attention.layers.3.norm3.bias', 'image_encoder.trunk.blocks.1.norm2.bias', 'image_encoder.trunk.blocks.47.norm1.bias', 'image_encoder.trunk.blocks.34.attn.qkv.bias', 'sam_mask_decoder.transformer.layers.1.self_attn.q_proj.bias', 'image_encoder.trunk.blocks.32.attn.qkv.bias', 'memory_attention.layers.1.self_attn.v_proj.bias', 'sam_mask_decoder.transformer.layers.0.mlp.layers.1.bias', 'sam_mask_decoder.conv_s0.bias', 'image_encoder.trunk.blocks.15.mlp.layers.1.bias', 'image_encoder.trunk.blocks.9.mlp.layers.0.bias', 'memory_encoder.fuser.layers.1.norm.bias', 'image_encoder.trunk.blocks.29.norm1.bias', 'image_encoder.trunk.blocks.37.attn.qkv.bias', 'image_encoder.trunk.blocks.44.attn.proj.bias', 'image_encoder.trunk.blocks.43.norm1.bias', 'image_encoder.trunk.blocks.37.mlp.layers.1.bias', 'image_encoder.trunk.blocks.3.attn.qkv.bias', 'image_encoder.trunk.blocks.8.mlp.layers.0.bias', 'memory_attention.layers.2.norm3.bias', 'image_encoder.trunk.blocks.27.attn.proj.bias', 'image_encoder.trunk.blocks.28.mlp.layers.1.bias', 'image_encoder.trunk.blocks.0.attn.proj.bias', 'image_encoder.trunk.blocks.44.proj.bias', 'image_encoder.trunk.blocks.26.attn.proj.bias', 'image_encoder.trunk.blocks.0.attn.qkv.bias', 'image_encoder.trunk.blocks.45.norm2.bias', 'image_encoder.trunk.blocks.21.mlp.layers.1.bias', 'memory_attention.layers.1.norm1.bias', 'image_encoder.trunk.blocks.18.mlp.layers.1.bias', 'image_encoder.trunk.blocks.1.mlp.layers.1.bias', 'image_encoder.trunk.blocks.40.norm2.bias', 'memory_attention.layers.3.self_attn.v_proj.bias', 'memory_attention.layers.3.linear1.bias', 'image_encoder.trunk.blocks.27.mlp.layers.0.bias', 'image_encoder.trunk.blocks.37.mlp.layers.0.bias', 'image_encoder.trunk.blocks.37.attn.proj.bias', 'image_encoder.trunk.blocks.4.mlp.layers.1.bias', 'image_encoder.trunk.blocks.35.norm2.bias', 'image_encoder.trunk.blocks.2.attn.qkv.bias', 
'memory_encoder.mask_downsampler.encoder.0.bias', 'sam_prompt_encoder.mask_downscaling.1.bias', 'sam_prompt_encoder.mask_downscaling.6.bias', 'image_encoder.trunk.blocks.33.mlp.layers.1.bias', 'obj_ptr_tpos_proj.bias', 'image_encoder.trunk.blocks.22.attn.proj.bias', 'sam_mask_decoder.transformer.layers.1.mlp.layers.1.bias', 'image_encoder.trunk.blocks.13.norm2.bias', 'image_encoder.trunk.blocks.10.mlp.layers.0.bias', 'image_encoder.trunk.blocks.18.norm1.bias', 'image_encoder.neck.convs.2.conv.bias', 'memory_attention.layers.0.linear1.bias', 'image_encoder.trunk.blocks.39.mlp.layers.1.bias', 'image_encoder.trunk.blocks.5.attn.proj.bias', 'image_encoder.trunk.blocks.33.mlp.layers.0.bias', 'image_encoder.trunk.blocks.17.norm2.bias', 'image_encoder.trunk.blocks.39.mlp.layers.0.bias', 'sam_mask_decoder.transformer.layers.0.norm2.bias', 'image_encoder.trunk.blocks.26.mlp.layers.0.bias', 'sam_mask_decoder.transformer.layers.0.norm4.bias', 'image_encoder.trunk.blocks.37.norm1.bias', 'image_encoder.trunk.blocks.43.mlp.layers.1.bias', 'image_encoder.trunk.blocks.34.attn.proj.bias', 'image_encoder.trunk.blocks.19.norm2.bias', 'image_encoder.trunk.blocks.17.attn.proj.bias', 'image_encoder.trunk.blocks.22.norm1.bias', 'image_encoder.trunk.blocks.22.mlp.layers.0.bias', 'image_encoder.trunk.blocks.43.norm2.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_token_to_image.out_proj.bias', 'image_encoder.trunk.blocks.16.mlp.layers.1.bias', 'image_encoder.trunk.blocks.41.attn.proj.bias', 'image_encoder.trunk.blocks.25.attn.proj.bias', 'image_encoder.trunk.blocks.34.norm2.bias', 'image_encoder.trunk.blocks.0.mlp.layers.0.bias', 'image_encoder.trunk.blocks.46.mlp.layers.0.bias', 'memory_attention.layers.2.self_attn.out_proj.bias', 'memory_attention.layers.3.cross_attn_image.v_proj.bias', 'memory_encoder.mask_downsampler.encoder.10.bias', 'image_encoder.trunk.blocks.3.norm1.bias', 'image_encoder.trunk.blocks.29.mlp.layers.0.bias', 'image_encoder.trunk.blocks.6.mlp.layers.1.bias', 
'image_encoder.trunk.blocks.20.attn.qkv.bias', 'image_encoder.trunk.blocks.34.norm1.bias', 'image_encoder.trunk.blocks.4.norm1.bias', 'sam_mask_decoder.output_hypernetworks_mlps.0.layers.1.bias', 'image_encoder.trunk.blocks.41.mlp.layers.0.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_image_to_token.out_proj.bias', 'memory_attention.layers.0.self_attn.k_proj.bias', 'image_encoder.trunk.blocks.16.attn.qkv.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_image_to_token.k_proj.bias', 'image_encoder.trunk.blocks.3.mlp.layers.1.bias', 'image_encoder.trunk.blocks.11.norm1.bias', 'image_encoder.neck.convs.0.conv.bias', 'sam_mask_decoder.iou_prediction_head.layers.0.bias', 'image_encoder.trunk.blocks.47.attn.proj.bias', 'memory_attention.layers.3.norm1.bias', 'sam_mask_decoder.output_hypernetworks_mlps.2.layers.1.bias', 'image_encoder.trunk.blocks.0.norm2.bias', 'image_encoder.trunk.blocks.4.attn.qkv.bias', 'image_encoder.trunk.blocks.44.mlp.layers.0.bias', 'sam_mask_decoder.output_hypernetworks_mlps.2.layers.2.bias', 'sam_mask_decoder.output_hypernetworks_mlps.1.layers.0.bias', 'image_encoder.trunk.blocks.2.norm2.bias', 'image_encoder.trunk.blocks.19.attn.proj.bias', 'image_encoder.trunk.blocks.12.mlp.layers.0.bias', 'image_encoder.trunk.blocks.12.norm2.bias', 'sam_mask_decoder.transformer.layers.1.norm2.bias', 'image_encoder.trunk.blocks.6.attn.qkv.bias', 'image_encoder.trunk.blocks.15.mlp.layers.0.bias', 'image_encoder.trunk.blocks.26.norm2.bias', 'image_encoder.trunk.blocks.17.norm1.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_token_to_image.k_proj.bias', 'image_encoder.trunk.blocks.36.norm1.bias', 'memory_encoder.mask_downsampler.encoder.9.bias', 'image_encoder.trunk.blocks.2.proj.bias', 'image_encoder.trunk.blocks.9.norm1.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_image_to_token.k_proj.bias', 'image_encoder.trunk.blocks.45.mlp.layers.0.bias', 'image_encoder.trunk.blocks.23.norm1.bias', 
'image_encoder.trunk.blocks.24.norm1.bias', 'image_encoder.trunk.blocks.12.attn.qkv.bias', 'sam_mask_decoder.transformer.final_attn_token_to_image.out_proj.bias', 'image_encoder.trunk.blocks.33.attn.proj.bias', 'image_encoder.trunk.blocks.20.norm1.bias', 'obj_ptr_proj.layers.0.bias', 'image_encoder.trunk.blocks.25.norm2.bias', 'sam_mask_decoder.transformer.layers.0.self_attn.out_proj.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_token_to_image.q_proj.bias', 'memory_attention.norm.bias', 'image_encoder.trunk.blocks.35.attn.qkv.bias', 'image_encoder.trunk.blocks.35.norm1.bias', 'image_encoder.trunk.blocks.29.norm2.bias', 'image_encoder.trunk.blocks.40.attn.qkv.bias', 'image_encoder.trunk.blocks.41.attn.qkv.bias', 'image_encoder.trunk.blocks.46.norm1.bias', 'image_encoder.trunk.blocks.27.norm1.bias', 'image_encoder.trunk.blocks.32.mlp.layers.1.bias', 'image_encoder.trunk.blocks.23.mlp.layers.0.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_token_to_image.out_proj.bias', 'sam_mask_decoder.transformer.layers.1.norm1.bias', 'image_encoder.trunk.blocks.11.attn.qkv.bias', 'image_encoder.trunk.blocks.34.mlp.layers.0.bias', 'image_encoder.trunk.blocks.23.attn.proj.bias', 'memory_attention.layers.3.linear2.bias', 'image_encoder.trunk.blocks.26.attn.qkv.bias', 'image_encoder.trunk.blocks.10.norm1.bias', 'image_encoder.trunk.blocks.23.attn.qkv.bias', 'image_encoder.trunk.blocks.40.mlp.layers.1.bias', 'image_encoder.trunk.blocks.20.mlp.layers.0.bias', 'image_encoder.trunk.blocks.28.attn.qkv.bias', 'sam_mask_decoder.output_upscaling.1.bias', 'sam_mask_decoder.output_hypernetworks_mlps.2.layers.0.bias', 'sam_mask_decoder.transformer.final_attn_token_to_image.k_proj.bias', 'image_encoder.trunk.blocks.45.attn.qkv.bias', 'image_encoder.trunk.blocks.15.norm2.bias', 'image_encoder.trunk.blocks.26.norm1.bias', 'image_encoder.trunk.blocks.14.norm1.bias', 'image_encoder.trunk.blocks.25.norm1.bias', 'image_encoder.trunk.blocks.1.mlp.layers.0.bias', 
'image_encoder.trunk.blocks.37.norm2.bias', 'image_encoder.trunk.blocks.30.norm1.bias', 'image_encoder.trunk.blocks.36.mlp.layers.1.bias', 'sam_mask_decoder.conv_s1.bias', 'image_encoder.trunk.blocks.45.mlp.layers.1.bias', 'image_encoder.trunk.blocks.5.norm2.bias', 'image_encoder.trunk.blocks.12.norm1.bias', 'image_encoder.trunk.blocks.15.attn.proj.bias', 'memory_attention.layers.2.self_attn.v_proj.bias', 'image_encoder.trunk.blocks.47.norm2.bias', 'image_encoder.trunk.blocks.33.attn.qkv.bias', 'memory_attention.layers.0.self_attn.out_proj.bias', 'image_encoder.trunk.blocks.24.mlp.layers.1.bias', 'image_encoder.trunk.blocks.8.norm2.bias', 'image_encoder.trunk.blocks.11.attn.proj.bias', 'image_encoder.trunk.blocks.1.norm1.bias', 'image_encoder.trunk.blocks.3.attn.proj.bias', 'memory_encoder.fuser.layers.0.norm.bias', 'memory_attention.layers.1.cross_attn_image.out_proj.bias', 'image_encoder.trunk.blocks.20.norm2.bias', 'image_encoder.trunk.blocks.43.attn.qkv.bias', 'image_encoder.trunk.blocks.9.mlp.layers.1.bias', 'image_encoder.trunk.blocks.24.attn.qkv.bias', 'memory_attention.layers.2.self_attn.q_proj.bias', 'memory_attention.layers.0.cross_attn_image.k_proj.bias', 'image_encoder.trunk.blocks.12.attn.proj.bias', 'image_encoder.trunk.blocks.47.attn.qkv.bias', 'memory_attention.layers.2.cross_attn_image.out_proj.bias', 'sam_mask_decoder.iou_prediction_head.layers.1.bias', 'image_encoder.trunk.blocks.40.attn.proj.bias', 'memory_encoder.fuser.layers.0.pwconv2.bias', 'memory_attention.layers.0.norm1.bias', 'sam_mask_decoder.output_upscaling.0.bias', 'mask_downsample.bias', 'image_encoder.trunk.blocks.13.mlp.layers.1.bias', 'memory_attention.layers.2.cross_attn_image.q_proj.bias', 'image_encoder.trunk.blocks.24.attn.proj.bias', 'memory_attention.layers.1.norm3.bias', 'image_encoder.trunk.blocks.13.attn.qkv.bias', 'memory_attention.layers.2.linear1.bias', 'memory_attention.layers.1.linear1.bias', 'sam_mask_decoder.iou_prediction_head.layers.2.bias', 
'image_encoder.trunk.blocks.13.attn.proj.bias', 'image_encoder.trunk.blocks.7.mlp.layers.1.bias', 'image_encoder.trunk.blocks.46.attn.proj.bias', 'memory_encoder.mask_downsampler.encoder.6.bias', 'image_encoder.trunk.blocks.7.norm2.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_image_to_token.v_proj.bias', 'image_encoder.trunk.blocks.43.mlp.layers.0.bias', 'image_encoder.trunk.blocks.36.norm2.bias', 'sam_mask_decoder.output_hypernetworks_mlps.0.layers.2.bias', 'memory_encoder.mask_downsampler.encoder.7.bias', 'memory_encoder.mask_downsampler.encoder.1.bias', 'memory_encoder.mask_downsampler.encoder.3.bias', 'sam_mask_decoder.transformer.layers.0.cross_attn_token_to_image.q_proj.bias', 'image_encoder.trunk.blocks.11.norm2.bias', 'image_encoder.trunk.blocks.44.norm2.bias', 'image_encoder.trunk.blocks.10.attn.proj.bias', 'image_encoder.trunk.blocks.30.norm2.bias', 'image_encoder.trunk.blocks.44.attn.qkv.bias', 'sam_mask_decoder.transformer.layers.1.cross_attn_image_to_token.q_proj.bias', 'memory_attention.layers.1.self_attn.k_proj.bias', 'image_encoder.trunk.blocks.21.mlp.layers.0.bias', 'image_encoder.trunk.blocks.2.mlp.layers.0.bias', 'memory_encoder.fuser.layers.0.pwconv1.bias', 'image_encoder.trunk.blocks.18.mlp.layers.0.bias', 'image_encoder.trunk.blocks.6.norm2.bias', 'image_encoder.trunk.blocks.14.attn.qkv.bias', 'memory_attention.layers.2.cross_attn_image.k_proj.bias', 'image_encoder.trunk.blocks.14.mlp.layers.0.bias', 'image_encoder.trunk.blocks.41.mlp.layers.1.bias', 'memory_attention.layers.1.cross_attn_image.k_proj.bias', 'image_encoder.trunk.blocks.7.attn.proj.bias', 'image_encoder.trunk.blocks.0.norm1.bias', 'memory_attention.layers.1.cross_attn_image.q_proj.bias', 'memory_attention.layers.0.cross_attn_image.v_proj.bias', 'sam_mask_decoder.transformer.layers.1.self_attn.k_proj.bias', 'image_encoder.trunk.blocks.15.norm1.bias', 'memory_attention.layers.1.self_attn.out_proj.bias', 'image_encoder.trunk.blocks.16.mlp.layers.0.bias', 
'image_encoder.trunk.blocks.18.norm2.bias', 'image_encoder.trunk.blocks.28.mlp.layers.0.bias', 'sam_prompt_encoder.mask_downscaling.0.bias', 'image_encoder.trunk.blocks.45.norm1.bias'} INFO 2025-01-20 16:22:40,592 optimizer.py: 220: Matches for module_cls_name [torch.nn.LayerNorm]: {'image_encoder.trunk.blocks.33.norm2.bias', 'image_encoder.trunk.blocks.10.norm1.bias', 'image_encoder.trunk.blocks.45.norm2.bias', 'image_encoder.trunk.blocks.17.norm2.weight', 'image_encoder.trunk.blocks.13.norm1.bias', 'image_encoder.trunk.blocks.23.norm1.weight', 'image_encoder.trunk.blocks.36.norm2.weight', 'image_encoder.trunk.blocks.38.norm2.weight', 'memory_attention.layers.1.norm1.bias', 'image_encoder.trunk.blocks.37.norm2.weight', 'image_encoder.trunk.blocks.14.norm2.bias', 'image_encoder.trunk.blocks.18.norm2.weight', 'image_encoder.trunk.blocks.21.norm2.weight', 'image_encoder.trunk.blocks.40.norm1.bias', 'image_encoder.trunk.blocks.14.norm2.weight', 'image_encoder.trunk.blocks.40.norm2.bias', 'image_encoder.trunk.blocks.27.norm2.bias', 'sam_mask_decoder.transformer.norm_final_attn.weight', 'image_encoder.trunk.blocks.30.norm1.weight', 'image_encoder.trunk.blocks.15.norm2.bias', 'image_encoder.trunk.blocks.16.norm1.weight', 'image_encoder.trunk.blocks.42.norm1.weight', 'image_encoder.trunk.blocks.26.norm1.bias', 'image_encoder.trunk.blocks.14.norm1.bias', 'image_encoder.trunk.blocks.31.norm2.bias', 'image_encoder.trunk.blocks.35.norm2.bias', 'image_encoder.trunk.blocks.42.norm2.bias', 'image_encoder.trunk.blocks.8.norm2.weight', 'image_encoder.trunk.blocks.46.norm2.bias', 'sam_mask_decoder.transformer.layers.1.norm3.weight', 'image_encoder.trunk.blocks.20.norm1.weight', 'memory_attention.layers.1.norm2.weight', 'memory_attention.layers.2.norm2.weight', 'image_encoder.trunk.blocks.34.norm2.weight', 'image_encoder.trunk.blocks.40.norm1.weight', 'image_encoder.trunk.blocks.44.norm2.weight', 'image_encoder.trunk.blocks.28.norm1.weight', 
'image_encoder.trunk.blocks.31.norm2.weight', 'image_encoder.trunk.blocks.39.norm1.bias', 'sam_mask_decoder.transformer.layers.0.norm2.weight', 'image_encoder.trunk.blocks.9.norm1.weight', 'sam_mask_decoder.transformer.layers.0.norm3.weight', 'image_encoder.trunk.blocks.19.norm2.weight', 'image_encoder.trunk.blocks.38.norm1.weight', 'image_encoder.trunk.blocks.13.norm2.bias', 'image_encoder.trunk.blocks.10.norm1.weight', 'image_encoder.trunk.blocks.2.norm2.weight', 'sam_mask_decoder.transformer.layers.0.norm3.bias', 'image_encoder.trunk.blocks.35.norm2.weight', 'image_encoder.trunk.blocks.25.norm1.bias', 'image_encoder.trunk.blocks.37.norm2.bias', 'image_encoder.trunk.blocks.30.norm1.bias', 'image_encoder.trunk.blocks.43.norm1.weight', 'image_encoder.trunk.blocks.18.norm1.bias', 'memory_attention.layers.1.norm3.weight', 'image_encoder.trunk.blocks.29.norm2.weight', 'image_encoder.trunk.blocks.6.norm1.weight', 'image_encoder.trunk.blocks.24.norm1.weight', 'image_encoder.trunk.blocks.32.norm2.bias', 'image_encoder.trunk.blocks.41.norm1.bias', 'memory_attention.layers.3.norm1.weight', 'image_encoder.trunk.blocks.12.norm1.bias', 'image_encoder.trunk.blocks.5.norm2.bias', 'image_encoder.trunk.blocks.17.norm1.weight', 'image_encoder.trunk.blocks.5.norm2.weight', 'image_encoder.trunk.blocks.16.norm2.bias', 'image_encoder.trunk.blocks.3.norm2.bias', 'image_encoder.trunk.blocks.46.norm2.weight', 'memory_attention.layers.3.norm3.weight', 'image_encoder.trunk.blocks.17.norm2.bias', 'image_encoder.trunk.blocks.11.norm2.weight', 'image_encoder.trunk.blocks.19.norm1.weight', 'image_encoder.trunk.blocks.47.norm2.bias', 'image_encoder.trunk.blocks.38.norm2.bias', 'image_encoder.trunk.blocks.44.norm1.bias', 'memory_attention.layers.0.norm3.weight', 'sam_mask_decoder.transformer.layers.0.norm2.bias', 'image_encoder.trunk.blocks.22.norm2.bias', 'image_encoder.trunk.blocks.0.norm1.weight', 'sam_mask_decoder.transformer.layers.0.norm4.bias', 'image_encoder.trunk.blocks.37.norm1.bias', 
'image_encoder.trunk.blocks.18.norm1.weight', 'image_encoder.trunk.blocks.6.norm1.bias', 'image_encoder.trunk.blocks.42.norm1.bias', 'image_encoder.trunk.blocks.6.norm2.weight', 'memory_attention.layers.0.norm3.bias', 'image_encoder.trunk.blocks.19.norm2.bias', 'image_encoder.trunk.blocks.32.norm1.weight', 'image_encoder.trunk.blocks.33.norm1.bias', 'image_encoder.trunk.blocks.41.norm2.bias', 'image_encoder.trunk.blocks.22.norm1.bias', 'image_encoder.trunk.blocks.10.norm2.bias', 'image_encoder.trunk.blocks.43.norm2.bias', 'image_encoder.trunk.blocks.16.norm1.bias', 'memory_attention.norm.weight', 'image_encoder.trunk.blocks.29.norm1.weight', 'memory_attention.layers.0.norm2.weight', 'image_encoder.trunk.blocks.19.norm1.bias', 'image_encoder.trunk.blocks.47.norm2.weight', 'sam_mask_decoder.transformer.layers.0.norm1.bias', 'sam_mask_decoder.transformer.norm_final_attn.bias', 'image_encoder.trunk.blocks.8.norm2.bias', 'image_encoder.trunk.blocks.39.norm2.bias', 'sam_mask_decoder.transformer.layers.0.norm1.weight', 'image_encoder.trunk.blocks.1.norm1.bias', 'image_encoder.trunk.blocks.37.norm1.weight', 'memory_attention.layers.1.norm2.bias', 'image_encoder.trunk.blocks.9.norm2.bias', 'sam_mask_decoder.transformer.layers.1.norm4.bias', 'image_encoder.trunk.blocks.34.norm2.bias', 'image_encoder.trunk.blocks.23.norm2.weight', 'image_encoder.trunk.blocks.9.norm2.weight', 'image_encoder.trunk.blocks.30.norm2.weight', 'image_encoder.trunk.blocks.20.norm2.bias', 'image_encoder.trunk.blocks.31.norm1.bias', 'image_encoder.trunk.blocks.3.norm1.bias', 'image_encoder.trunk.blocks.25.norm2.weight', 'image_encoder.trunk.blocks.26.norm1.weight', 'image_encoder.trunk.blocks.27.norm1.weight', 'image_encoder.trunk.blocks.10.norm2.weight', 'image_encoder.trunk.blocks.5.norm1.bias', 'image_encoder.trunk.blocks.34.norm1.bias', 'image_encoder.trunk.blocks.3.norm2.weight', 'image_encoder.trunk.blocks.4.norm1.bias', 'image_encoder.trunk.blocks.14.norm1.weight', 
'image_encoder.trunk.blocks.31.norm1.weight', 'memory_attention.layers.2.norm2.bias', 'image_encoder.trunk.blocks.0.norm2.weight', 'image_encoder.trunk.blocks.23.norm2.bias', 'image_encoder.trunk.blocks.7.norm2.weight', 'image_encoder.trunk.blocks.39.norm1.weight', 'image_encoder.trunk.blocks.36.norm1.weight', 'sam_mask_decoder.transformer.layers.1.norm1.weight', 'image_encoder.trunk.blocks.27.norm2.weight', 'image_encoder.trunk.blocks.11.norm1.bias', 'image_encoder.trunk.blocks.28.norm2.weight', 'memory_attention.layers.3.norm1.bias', 'image_encoder.trunk.blocks.22.norm2.weight', 'image_encoder.trunk.blocks.44.norm1.weight', 'memory_attention.layers.0.norm1.weight', 'sam_mask_decoder.transformer.layers.0.norm4.weight', 'image_encoder.trunk.blocks.42.norm2.weight', 'memory_attention.layers.0.norm1.bias', 'image_encoder.trunk.blocks.28.norm1.bias', 'image_encoder.trunk.blocks.4.norm1.weight', 'image_encoder.trunk.blocks.0.norm2.bias', 'image_encoder.trunk.blocks.1.norm2.weight', 'memory_attention.layers.2.norm1.bias', 'memory_attention.layers.2.norm3.weight', 'image_encoder.trunk.blocks.45.norm2.weight', 'image_encoder.trunk.blocks.3.norm1.weight', 'image_encoder.trunk.blocks.24.norm2.bias', 'image_encoder.trunk.blocks.25.norm1.weight', 'memory_attention.layers.1.norm3.bias', 'image_encoder.trunk.blocks.2.norm2.bias', 'memory_attention.layers.1.norm1.weight', 'image_encoder.trunk.blocks.20.norm2.weight', 'image_encoder.trunk.blocks.21.norm1.weight', 'image_encoder.trunk.blocks.32.norm1.bias', 'image_encoder.trunk.blocks.7.norm2.bias', 'image_encoder.trunk.blocks.12.norm2.weight', 'image_encoder.trunk.blocks.12.norm2.bias', 'sam_mask_decoder.transformer.layers.1.norm2.weight', 'sam_mask_decoder.transformer.layers.1.norm2.bias', 'image_encoder.trunk.blocks.41.norm2.weight', 'image_encoder.trunk.blocks.26.norm2.bias', 'image_encoder.trunk.blocks.8.norm1.weight', 'image_encoder.trunk.blocks.17.norm1.bias', 'image_encoder.trunk.blocks.36.norm2.bias', 
'image_encoder.trunk.blocks.36.norm1.bias', 'image_encoder.trunk.blocks.7.norm1.bias', 'memory_attention.layers.0.norm2.bias', 'image_encoder.trunk.blocks.5.norm1.weight', 'image_encoder.trunk.blocks.1.norm1.weight', 'image_encoder.trunk.blocks.13.norm2.weight', 'image_encoder.trunk.blocks.21.norm1.bias', 'image_encoder.trunk.blocks.22.norm1.weight', 'image_encoder.trunk.blocks.9.norm1.bias', 'image_encoder.trunk.blocks.11.norm2.bias', 'image_encoder.trunk.blocks.4.norm2.bias', 'image_encoder.trunk.blocks.2.norm1.bias', 'image_encoder.trunk.blocks.32.norm2.weight', 'image_encoder.trunk.blocks.7.norm1.weight', 'image_encoder.trunk.blocks.23.norm1.bias', 'image_encoder.trunk.blocks.38.norm1.bias', 'image_encoder.trunk.blocks.44.norm2.bias', 'image_encoder.trunk.blocks.30.norm2.bias', 'image_encoder.trunk.blocks.24.norm1.bias', 'image_encoder.trunk.blocks.33.norm2.weight', 'image_encoder.trunk.blocks.15.norm1.weight', 'image_encoder.trunk.blocks.21.norm2.bias', 'image_encoder.trunk.blocks.47.norm1.weight', 'memory_attention.layers.3.norm3.bias', 'image_encoder.trunk.blocks.1.norm2.bias', 'image_encoder.trunk.blocks.8.norm1.bias', 'image_encoder.trunk.blocks.47.norm1.bias', 'image_encoder.trunk.blocks.41.norm1.weight', 'image_encoder.trunk.blocks.34.norm1.weight', 'image_encoder.trunk.blocks.20.norm1.bias', 'image_encoder.trunk.blocks.2.norm1.weight', 'image_encoder.trunk.blocks.11.norm1.weight', 'image_encoder.trunk.blocks.25.norm2.bias', 'memory_attention.layers.3.norm2.bias', 'image_encoder.trunk.blocks.6.norm2.bias', 'image_encoder.trunk.blocks.16.norm2.weight', 'image_encoder.trunk.blocks.33.norm1.weight', 'memory_attention.norm.bias', 'memory_attention.layers.3.norm2.weight', 'image_encoder.trunk.blocks.35.norm1.bias', 'image_encoder.trunk.blocks.29.norm2.bias', 'image_encoder.trunk.blocks.45.norm1.weight', 'image_encoder.trunk.blocks.46.norm1.bias', 'image_encoder.trunk.blocks.0.norm1.bias', 'image_encoder.trunk.blocks.29.norm1.bias', 
'image_encoder.trunk.blocks.26.norm2.weight', 'image_encoder.trunk.blocks.12.norm1.weight', 'image_encoder.trunk.blocks.15.norm1.bias', 'image_encoder.trunk.blocks.27.norm1.bias', 'image_encoder.trunk.blocks.43.norm2.weight', 'image_encoder.trunk.blocks.24.norm2.weight', 'image_encoder.trunk.blocks.18.norm2.bias', 'image_encoder.trunk.blocks.28.norm2.bias', 'image_encoder.trunk.blocks.43.norm1.bias', 'sam_mask_decoder.transformer.layers.1.norm4.weight', 'image_encoder.trunk.blocks.15.norm2.weight', 'sam_mask_decoder.transformer.layers.1.norm3.bias', 'image_encoder.trunk.blocks.13.norm1.weight', 'memory_attention.layers.2.norm3.bias', 'sam_mask_decoder.transformer.layers.1.norm1.bias', 'image_encoder.trunk.blocks.35.norm1.weight', 'image_encoder.trunk.blocks.4.norm2.weight', 'image_encoder.trunk.blocks.39.norm2.weight', 'image_encoder.trunk.blocks.46.norm1.weight', 'image_encoder.trunk.blocks.40.norm2.weight', 'image_encoder.trunk.blocks.45.norm1.bias', 'memory_attention.layers.2.norm1.weight'}
INFO 2025-01-20 16:22:41,023 sam2_datasets.py: 125: Dataset mixing probabilities: [1.0]
INFO 2025-01-20 16:22:41,024 trainer.py: 423: Resuming training from /ephemeral/hossein/output/sam2/checkpoints/checkpoint.pt