yolo_training

Created At : 2024-11-15 07:11



Between November 3rd and 5th, I helped others research YOLO for recognizing objects in pictures.

I was in a bit of a hurry, so I have pasted my original notes here.

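Environment setup
Segmentation model basics
yolo command line
- Train
- Predict
- Val
- Export
- Special
Python code
- Doing the command-line tasks in Python
Training data and labeling
Conclusion
Data formatting
Training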

Environment setup

git clone https://github.com/ultralytics/ultralytics.git

Then cd into the cloned directory.

conda create -n py12 python=3.12

The new interpreter then appears under the conda\envs path:

E:\anaconda3\envs\py12\python.exe

Copy it into the ultralytics directory.

In PyCharm, add py12 as the Python interpreter, then create the venv library directory.

pip3 install torch torchvision torchaudio

(ultralytics) .\py12\python.exe -m pip install .

It runs successfully:

>>> from ultralytics import YOLO
Creating new Ultralytics Settings v0.0.6 file ✅
View Ultralytics Settings with 'yolo settings' or at 'C:\Users\ranja\AppData\Roaming\Ultralytics\settings.json'
Update Settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'. For help see https://docs.ultralytics.com/quickstart/#ultralytics-setti
>>>

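Segmentation model basics

Out of the box, YOLO recognizes the 80 object classes listed in coco8 (the COCO classes). I only need to recognize a few of them, so I define my own coco8-style dataset whose classes are a subset of YOLO's coco8.

Difference between yolo11n.pt and yolo11n-seg.pt:

The first only draws bounding boxes (detection); the second also cuts each object out with a pixel mask (segmentation).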
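One way to build such a subset is to keep only the label lines for the classes you care about and renumber them from 0. The sketch below assumes the standard YOLO detection label format (`class x_center y_center width height`, one object per line); the chosen classes and the helper name `filter_labels` are illustrative assumptions, not part of the original notes.

```python
# Hypothetical example: keep only a few COCO classes and renumber them from 0.
# COCO class IDs used here: 0 = person, 2 = car, 5 = bus.
KEEP = {0: "person", 2: "car", 5: "bus"}

# Map old class IDs to new consecutive IDs: {0: 0, 2: 1, 5: 2}
REMAP = {old: new for new, old in enumerate(sorted(KEEP))}


def filter_labels(label_text: str) -> str:
    """Drop objects whose class is not in KEEP and renumber the rest."""
    kept = []
    for line in label_text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        class_id = int(parts[0])
        if class_id in REMAP:
            kept.append(" ".join([str(REMAP[class_id])] + parts[1:]))
    return "\n".join(kept)


# Made-up label file content: a bus, a dog (class 16) and a person.
sample = "5 0.50 0.50 0.80 0.60\n16 0.20 0.70 0.10 0.15\n0 0.30 0.40 0.10 0.30"
print(filter_labels(sample))
```

The names list in the custom dataset file would then be person, car, bus, in that order, so that the new IDs 0, 1, 2 line up with it.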

yolo command line

Ultralytics yolo commands use the following syntax:

yolo TASK MODE ARGS

TASK (optional) is one of (detect, segment, classify, pose, obb)

MODE (required) is one of (train, val, predict, export, track, benchmark)

ARGS (optional) are arg=value pairs like imgsz=640 that override defaults.

See all ARGS in the full Configuration Guide or with the yolo cfg CLI command.
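When the CLI is driven from a script, the same TASK MODE ARGS layout can be assembled programmatically. This is only a sketch: `build_yolo_command` is a hypothetical helper, not something Ultralytics provides.

```python
import shlex


def build_yolo_command(mode, task=None, **args):
    """Assemble 'yolo TASK MODE ARGS' as an argument list (hypothetical helper)."""
    command = ["yolo"]
    if task is not None:
        command.append(task)  # TASK is optional
    command.append(mode)      # MODE is required
    command += [f"{key}={value}" for key, value in args.items()]
    return command


cmd = build_yolo_command("train", data="coco8.yaml", model="yolo11n.pt", epochs=10, lr0=0.01)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the yolo executable is on the PATH.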

Train

Train a detection model for 10 epochs with an initial learning_rate of 0.01

yolo train data=coco8.yaml model=yolo11n.pt epochs=10 lr0=0.01

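epochs is the number of training passes over the dataset. Training for too many epochs causes overfitting: the model does well on the training data but poorly on test data.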
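A common guard against overfitting is early stopping: stop once the validation loss has not improved for a set number of epochs. The sketch below shows only the idea, in plain Python with made-up loss values; the function name is an illustrative assumption and this is not Ultralytics code (Ultralytics exposes a comparable `patience` training argument).

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1  # never triggered: ran all epochs


# Made-up validation losses: improving at first, then getting worse.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.67]
print(early_stopping_epoch(losses))
```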

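lr0 is the initial learning rate.

Learning rate too high: training can become unstable or even fail to converge (the model never reaches the desired accuracy).
Learning rate too low: training can be very slow, or the model can hover around a local optimum.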
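The effect of the learning rate can be seen on a toy problem. The sketch below minimises f(x) = x**2 with plain gradient descent; it is not YOLO training, and since this function has a single minimum it only illustrates divergence and slow progress, not local optima.

```python
def gradient_descent(lr, steps=20, x=1.0):
    """Minimise f(x) = x**2 with plain gradient descent; f'(x) = 2 * x."""
    for _ in range(steps):
        x = x - lr * 2 * x
    return x


# Too high, reasonable, and too low learning rates on the same problem.
for lr in (1.1, 0.1, 0.001):
    print(f"lr={lr}: x after 20 steps = {gradient_descent(lr):.4f}")
```

With lr=1.1 each step overshoots the minimum and x grows in magnitude; with lr=0.001 x has barely moved from its starting value after 20 steps.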

Predict

Predict a YouTube video using a pretrained segmentation model at image size 320:

yolo predict model=yolo11n-seg.pt source='https://youtu.be/LNwODJXcvt4' imgsz=320

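This downloads the .pt weights file from the official releases and reads the YouTube video stream. Release download URLs look like this (YOLOv5 shown):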

https://github.com/ultralytics/yolov5/releases/download/v{version}/{model}.pt

https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5s.pt

For a custom .pt file, pass its path directly:

model='/path/to/your_custom_model.pt'

Val

Val a pretrained detection model at batch-size 1 and image size 640:

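The validation set is a subset of the data kept separate from the training set and used to test how the model performs on unseen data. Evaluating on it yields metrics such as precision, recall, F1-score, and mAP (mean Average Precision).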

yolo val model=yolo11n.pt data=coco8.yaml batch=1 imgsz=640
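For reference, precision, recall and F1 follow directly from the counts of correct, spurious and missed detections. The sketch below uses made-up counts and a hypothetical helper name; mAP additionally averages precision over recall levels and classes and is not computed here.

```python
def detection_metrics(true_positives, false_positives, false_negatives):
    """Compute precision, recall and F1 from matched-detection counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up counts: 8 correct detections, 2 spurious boxes, 4 missed objects.
p, r, f1 = detection_metrics(8, 2, 4)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```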

Export

Export a yolo11n classification model to ONNX format at image size 224 by 128 (no TASK required)

yolo export model=yolo11n-cls.pt format=onnx imgsz=224,128

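This exports the model.

format=onnx: the export format is ONNX (Open Neural Network Exchange), an open format for deep learning models that lets them be shared and used across frameworks and platforms.
imgsz=224,128: sets the input image size to 224 pixels (height) by 128 pixels (width); Ultralytics reads a two-value imgsz as height, width. The exported model's input layer is fixed to this size, so images fed to the exported model must match it.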
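To make the size convention concrete, the sketch below turns an imgsz value into the input tensor shape an exported model would expect. It assumes the Ultralytics convention that a pair means (height, width), plus a batch size of 1 and 3 colour channels; `onnx_input_shape` is a hypothetical helper, not a library function.

```python
def onnx_input_shape(imgsz, batch=1, channels=3):
    """Return the (batch, channels, height, width) shape implied by imgsz.

    Assumes a pair is (height, width) and a single integer means a
    square image.
    """
    if isinstance(imgsz, int):
        height = width = imgsz
    else:
        height, width = imgsz
    return (batch, channels, height, width)


print(onnx_input_shape((224, 128)))  # corresponds to imgsz=224,128
print(onnx_input_shape(640))         # corresponds to imgsz=640
```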

Special

Run special commands to see version, view settings, run checks and more:

yolo help
yolo checks
yolo version
yolo settings
yolo copy-cfg
yolo cfg

Python code

Doing the command-line tasks in Python

from ultralytics import YOLO

# Create a new YOLO model from scratch
model = YOLO("yolo11n.yaml")

# Load a pretrained YOLO model (recommended for training)
model = YOLO("yolo11n.pt")

# Train the model using the 'coco8.yaml' dataset for 3 epochs
results = model.train(data="coco8.yaml", epochs=3)

# Evaluate the model's performance on the validation set
results = model.val()

# Perform object detection on an image using the model
results = model("https://ultralytics.com/images/bus.jpg")

# Export the model to ONNX format
success = model.export(format="onnx")

Training data and labeling

https://github.com/Incalos/YOLO-Datasets-And-Training-Methods/blob/master/README_CN.md

pip install labelimg

labelimg

Running the labelimg command starts LabelImg directly.

Conclusion

https://www.kaggle.com/code/creazyeeeeli/acsnansck#Training
https://docs.ultralytics.com/models/yolov8/#performance-metrics

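Data formatting

RLE to mask

At first glance the RLE data had an odd number of values rather than an even number, so I thought the RLE was corrupted.
Later Li Shun suggested the values might be run lengths of alternating pixels: "foreground, background, foreground, background, ..."

That seemed right to me, so I wrote a script.
One image has multiple segments, so each segment becomes its own mask image; I then overlaid these mask images to get the final mask.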
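The decode-and-overlay idea can be sketched in plain Python. The notes do not record the exact RLE convention, so this sketch assumes COCO-style uncompressed RLE: runs alternate starting with background, and pixels are stored column by column. The function names and sample data are made up for illustration.

```python
def rle_to_mask(counts, height, width):
    """Decode alternating run lengths into a height x width mask of 0/1.

    Assumption (COCO-style uncompressed RLE): runs alternate background,
    foreground, background, ... starting with background, and pixels are
    stored column by column.
    """
    flat = []
    value = 0  # first run is background
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("run lengths do not add up to height * width")
    # flat is column-major: pixel (row, col) lives at index col * height + row
    return [[flat[col * height + row] for col in range(width)] for row in range(height)]


def overlay(masks):
    """Union several same-sized masks: a pixel is 1 if any segment covers it."""
    return [[int(any(pixels)) for pixels in zip(*rows)] for rows in zip(*masks)]


# Two made-up segments on a 3 x 4 image; seg_a has an odd number of runs.
seg_a = rle_to_mask([1, 2, 9], height=3, width=4)
seg_b = rle_to_mask([9, 3], height=3, width=4)
for row in overlay([seg_a, seg_b]):
    print(row)
```

An odd number of run lengths is valid under this convention: it simply means the data ends on a background run.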

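1. Pitfall: the images are all black

At first, when I converted the mask from one dimension to two, I represented it with only 0 and 1 and saved that as an image, but every image came out completely black. I was baffled, until I found that a single-channel image uses one byte per pixel, ranging over 0-255 with gray levels in between, not just pure black or white. So I simply multiplied the np.array by 255 to get the image.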
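The fix can be illustrated without any dependencies by writing the mask as a binary PGM (8-bit grayscale) image. This is a sketch in plain Python rather than the NumPy code from the notes, and `mask_to_pgm` is a hypothetical helper.

```python
def mask_to_pgm(mask):
    """Encode a 0/1 mask as a binary PGM (8-bit grayscale) image."""
    height, width = len(mask), len(mask[0])
    # Scale 0/1 to 0/255; without this, 1 is a near-black gray level.
    pixels = bytes(value * 255 for row in mask for value in row)
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + pixels


mask = [[0, 1, 1],
        [0, 0, 1]]
data = mask_to_pgm(mask)
print(data[-6:])  # the six pixel bytes after the header
```

Writing data to a file opened in "wb" mode gives an image most viewers can open. With NumPy, the equivalent scaling step is (mask * 255).astype(np.uint8).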

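2. Pitfall: a repeating pattern

Our hand-written RLE-to-mask conversion was probably wrong; it made the mask repeat across the image like the sine waves we learned about in high school.

Decoding with the COCO tools instead gave a correct RLE-to-mask result.

Li Shun later realized that some existing library should be able to do the decoding, found the COCO one, and it decoded correctly.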
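The notes do not say which COCO library was used; a minimal sketch, assuming it was pycocotools and that the data is uncompressed RLE (a list of run lengths plus the image size), could look like this. Both helper names are hypothetical.

```python
def to_coco_rle(counts, height, width):
    """Wrap raw run lengths in the dict layout used for COCO uncompressed RLE."""
    return {"counts": list(counts), "size": [height, width]}


def decode_with_pycocotools(rle):
    """Decode an uncompressed RLE dict to a (height, width) uint8 array of 0/1."""
    # Imported lazily so the helper above works without pycocotools installed.
    from pycocotools import mask as mask_utils

    height, width = rle["size"]
    compressed = mask_utils.frPyObjects(rle, height, width)
    return mask_utils.decode(compressed)


# Made-up run lengths for a 3 x 4 image.
rle = to_coco_rle([1, 2, 9], height=3, width=4)
print(rle)
```

After pip install pycocotools, calling decode_with_pycocotools(rle) returns the mask as a NumPy array, which still needs the multiply-by-255 step before being saved as a visible image.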

Training

Corrections and feedback are welcome!