Task: object detection with Swin-Transformer-Object-Detection

Step 1: Download Swin-Transformer-Object-Detection.

Step 2: Set up the environment (the most important step).

Required versions:
    CUDA 10.2.89
    cuDNN 7.6.5
    Python 3.8
    Visual Studio 2019
    PyTorch 1.8.0
    mmcv 1.3.17
    mmdet 2.20.0 (ended up as 2.11.0)

Commands:
    conda install cudatoolkit=10.2 -c https://mirrors.ustc.edu.cn/anaconda/pkgs/free/win-64/
    conda install cudnn=7.6.5 -c https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/win-64/
    conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0
    conda install pytorch==1.8.1 torchvision==0.9.1 torchaudio==0.8.1 cudatoolkit=10.2 -c pytorch
    conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=10.2
    conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.0 -c pytorch -c conda-forge
    (installs successfully)

CUDA_PATH CUDA_HOME
envs ls env

    conda install pytorch torchvision cudatoolkit=10.2 -c pytorch

PyTorch 1.8.1 + torchvision 0.9.1 + CUDA 10.2

mmcv wheel index: https://download.openmmlab.com/mmcv/dist/index.html
    pip install mmcv-full==1.3.17 -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html

Environment name: swin-T-pytorch18
    conda create -n swin-T-pytorch18 --clone pytorch18

Log while installing mmdet:
    installing mmdet from https://github.com/open-mmlab/mmdetection.git.
    Cloning into 'C:\Users\zjf\AppData\Local\Temp\tmptf6pnzkx\mmdetection'.
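The version pins listed above are easy to get wrong across several install attempts, so a short script can compare them against what is actually installed in the active environment. This is a minimal sketch, not part of the original notes: the distribution names in `PINS` (`torch`, `mmcv-full`, `mmdet`) and the helper names `installed_version` and `find_mismatches` are assumptions made for illustration, and the mmdet pin uses the 2.11.0 that the notes say the environment ended up with.

```python
from importlib import metadata

# Pins taken from the notes above. The distribution names are assumptions:
# the CUDA build of mmcv is installed under the name "mmcv-full".
PINS = {"torch": "1.8.0", "mmcv-full": "1.3.17", "mmdet": "2.11.0"}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map each package whose installed version differs from its pin to (wanted, found)."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        # Compare only the public part of the version: some builds append
        # a local tag such as "+cu102" after the version number.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINS).items():
        print(f"{name}: wanted {wanted}, found {found}")
```

Run inside the activated conda environment, the script prints nothing when every pin matches. It only checks Python packages; CUDA and cuDNN versions still have to be checked separately, for example with nvcc -V.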
    pip install -r requirements.txt -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host
    pip install mmcv-full==1.3.17 -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html

Before copying: 33.5GB
After copying:

MMDetection V2.24.1 Release

Checking versions:
    conda list | grep cuda
        shows the versions installed in the virtual environment
    conda search cudnn --info
        lists all CUDA versions and the matching cuDNN versions

Querying the GPU compute capability:
    Change to the directory that contains deviceQuery.exe:
        cd D:\360Downloads\CUDA\NVIDIA GPU Computing Toolkit\CUDA\v10.1\extras\demo_suite
    Run deviceQuery.exe:
        ./deviceQuery.exe

My GPU's compute capability is 7.5:
    Device 0: "NVIDIA GeForce GTX 1660 Ti"
    CUDA Driver Version / Runtime Version 11.6 / 11.0
    CUDA Capability Major/Minor version number: 7.5
    deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.6, CUDA Runtime Version = 11.0, NumDevs = 1, Device0 = NVIDIA GeForce GTX 1660 Ti

    E:\CUDA\v11.0\bin>nvcc -V
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2020 NVIDIA Corporation
    Built on Wed_Jul_22_19:09:35_Pacific_Daylight_Time_2020
    Cuda compilation tools, release 11.0, V11.0.221
    Build cuda_11.0_bu.relgpu_drvr445TC445_37.28845127_0

Environment check output:
    True 10.2 0 1 NVIDIA GeForce GTX 1660 Ti 1.8.0 10.2 7605

Running the demos

Image:
    python demo/image_demo.py demo/demo.jpg configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py mask_rcnn_swin_tiny_patch4_window7.pth
    # Switched to detection only. The config file has changed, so the earlier weight file no longer works.
    python demo/image_demo.py demo/test.png configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py mask_rcnn_swin_tiny_patch4_window7.pth

Video:
    Save the result:
    python demo/video_demo.py demo/demo.mp4 configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py mask_rcnn_swin_tiny_patch4_window7.pth --out demo\1.mp4
    Show the result:
    python demo/video_demo.py demo/demo.mp4 configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py mask_rcnn_swin_tiny_patch4_window7.pth --show
    python demo/video_demo.py
        demo/demo.mp4 configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py latest.pth --show
    python demo/image_demo.py demo/test.png configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py latest.pth

My inference test commands:
    python demo/image_demo.py demo/6.jpg configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py checkpoints/epoch_50.pth
    python demo/image_demo.py demo/demo.jpg configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth

2022-05-24 20:40: after two days, the environment finally installed successfully. Versions: PyTorch 1.8.0, CUDA 10.2.89, cuDNN 7.6.5.

COCO datasets use JSON annotations; VOC datasets use XML annotations.

Initial plan: first use Mask R-CNN for instance segmentation, then Faster R-CNN for pure object detection.

Instance segmentation is roughly object detection with masks. It combines object detection and semantic segmentation: objects are first located in the image (object detection), then every pixel of each detected object is labeled (semantic segmentation). Pixels outside the detected objects do not need labels.

The data must be in COCO format.

    CLASSES = ['ignored regions', 'pedestrian', 'people', 'bicycle', 'car', 'van', 'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor', 'others']

Three .py files to modify:
    Modify CLASSES in Swin-Transformer-Object-Detection/mmdet/datasets/coco.py.
    Modify coco_classes in Swin-Transformer-Object-Detection/mmdet/core/evaluation/class_names.py.
    Modify num_classes in <path>/configs/_base_/models/mask_rcnn_swin_fpn.py. It appears in two places, at roughly lines 54 and 73; set both to the number of classes in your own dataset.
Source: https://blog.csdn.net/qq_41964545/article/details/123140485

4. Modify the training parameters
In <path>/configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x(1x)_coco.py:
    Line 3: change '…/_base_/datasets/coco_instance.py' to '…/_base_/datasets/coco_detection.py'.
    Line 69: change max_epochs as needed.
    Line 31: samples_per_gpu is the batch size per GPU; too large a value causes an out-of-memory error.
    Line 32: workers_per_gpu is the number of data-loading workers per GPU; set it to 2, 4, 6, or 8 as needed.
Source: https://blog.csdn.net/qq_36622589/article/details/124355564

Building the dataset: convert the annotations from txt to xml, then from xml to json.

The checkpoints directory holds the model weights to load.

This example uses mask_rcnn_swin_fpn as the backbone with data in COCO format; other models are modified the same way.

Training commands:
    python tools/train.py configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py
    python tools/train.py configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py --cfg-options model.pretrained='checkpoints/latest.pth'
    python tools/train.py configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py --cfg-options model.pretrained='checkpoints/mask_rcnn_swin_tiny_patch4_window7.pth'
    python tools/train.py configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py --resume-from model.pretrained='checkpoints/latest.pth'
    python tools/train.py configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py --resume-from model.pretrained='checkpoints/epoch_50.pth'
    (This does not work: --resume-from expects a checkpoint path, not a key=value pair.)
    python tools/train.py configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py --resume-from 'checkpoints/epoch_75.pth'
    (This works: training resumes from the checkpoint.)

Test commands:
    python demo/image_demo.py demo/000019.jpg configs\swin\mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py work_dirs/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco/latest.pth
    python demo/image_demo.py testfiles/img1.jpg configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_1x_coco.py work_dirs/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_1x_coco/latest.pth --score-thr 0.5
    python demo/video_demo.py demo/demo.mp4 configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py work_dirs/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco/latest.pth --score-thr 0.5 --show

3) Performance evaluation
    python tools/test.py configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py work_dirs/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco/latest.pth --eval bbox

4) Log analysis
    python tools/analysis_tools/analyze_logs.py plot_curve work_dirs/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco/20220527_092210.log.json

Fields of a COCO annotation entry:
    id: the ID of this annotation.
    image_id: the same value as the id in the corresponding image entry.
    category_id: the class ID.
    segmentation: used for segmentation. My dataset does not have this field, so instance segmentation is not possible and I can only do object detection.
    area: the area of the annotated region.
    bbox: the bounding box, given as the top-left corner coordinates followed by the box width and height.
    iscrowd: determines whether the segmentation is stored in RLE format (iscrowd=1) or polygon format (iscrowd=0).

References
https://zhuanlan.zhihu.com/p/451816231
https://github.com/SwinTransformer/Swin-Transformer-Object-Detection/issues/113
https://blog.csdn.net/qq_41964545/article/details/123140485

# Removing instance segmentation, detection only
https://github.com/SwinTransformer/Swin-Transformer-Object-Detection/issues/50
https://github.com/SwinTransformer/Swin-Transformer-Object-Detection/issues/25 (works)

# Training object detection
https://blog.csdn.net/qq_36622589/article/details/124355564?spm=1001.2101.3001.6650.1&utm_medium=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7ERate-1-124355564-blog-123140485.pc_relevant_paycolumn_v3&depth_1-utm_source=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7ERate-1-124355564-blog-123140485.pc_relevant_paycolumn_v3&utm_relevant_index=2

Converting the VisDrone dataset to COCO format and training it on mmdetection, with the converted json files attached
https://blog.csdn.net/S5242/article/details/121114907
https://github.com/VisDrone/VisDrone-Dataset
http://aiskyeye.com/download/object-detection-2/
https://beyonderwei.com/2022/03/23/Swin-Transformer%E7%9B%AE%E6%A0%87%E6%A3%80%E6%B5%8B4%E2%80%94%E2%80%94%E8%AE%AD%E7%BB%83%E8%87%AA%E5%B7%B1%E6%95%B0%E6%8D%AE%E9%9B%86/
https://blog.csdn.net/CSDN_X_W/article/details/123845728
https://blog.csdn.net/m0_37605642/article/details/117932717
https://blog.csdn.net/qq_42138662/article/details/109227007

Introduction to the COCO dataset
https://blog.csdn.net/qq_44554428/article/details/122597358

Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, Baining Guo. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. ICCV 2021.

swin_t: notes from the journey
read://https_blog.51cto.com/?url=https%3A%2F%2Fblog.51cto.com%2Fu_13565704%2F5105836%3Fb%3Dtotalstatistic
https://wenku.baidu.com/view/9aa300a3bfeb19e8b8f67c1cfad6195f312be8cb.html
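
Appendix sketch: the COCO annotation fields described in these notes (id, image_id, category_id, area, bbox, iscrowd) can be produced from txt annotations with a few lines of Python. This is a rough sketch under stated assumptions, not the conversion script actually used: it goes straight from txt to a COCO dictionary instead of through xml as in the notes, it assumes each txt line has the VisDrone-style layout `left,top,width,height,score,category,truncation,occlusion`, and the function names `txt_lines_to_coco` and `build_coco` are hypothetical. Image width and height are passed in by the caller so the example needs no image library.

```python
import json

# Class list copied from the notes; the list index is used as the category id.
CLASSES = ['ignored regions', 'pedestrian', 'people', 'bicycle', 'car', 'van',
           'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor', 'others']


def txt_lines_to_coco(image_id, file_name, width, height, lines, first_ann_id=1):
    """Build the COCO image entry and annotation entries for one image.

    Assumes each line is 'left,top,width,height,score,category,truncation,occlusion'.
    """
    image = {"id": image_id, "file_name": file_name, "width": width, "height": height}
    annotations = []
    ann_id = first_ann_id
    for line in lines:
        fields = line.strip().rstrip(",").split(",")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        left, top, box_w, box_h = (float(v) for v in fields[:4])
        annotations.append({
            "id": ann_id,                       # unique id of this annotation
            "image_id": image_id,               # matches the id of the image entry
            "category_id": int(fields[5]),      # class id
            "bbox": [left, top, box_w, box_h],  # top-left corner, then width and height
            "area": box_w * box_h,              # box area, since there is no mask
            "iscrowd": 0,
        })
        ann_id += 1
    return image, annotations


def build_coco(images, annotations):
    """Assemble the top-level COCO dictionary."""
    categories = [{"id": i, "name": name} for i, name in enumerate(CLASSES)]
    return {"images": images, "annotations": annotations, "categories": categories}


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    image, anns = txt_lines_to_coco(1, "example.jpg", 1360, 765,
                                    ["684,8,273,116,0,0,0,0", "406,119,265,70,1,4,0,0"])
    print(json.dumps(build_coco([image], anns), indent=2))
```

No segmentation field is written, which matches the situation in these notes: the dataset has boxes only, so the result is suitable for detection but not for instance segmentation. When converting many images, pass a running `first_ann_id` so that annotation ids stay unique across the whole file.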