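YOLOv8 is an impressive segmentation model: it is easy to train, test, and deploy. In this tutorial, we will learn how to use YOLOv8 on a custom dataset. Before that, though, I want to explain why you might choose YOLOv8 when other excellent segmentation models exist.
I was working on a medical image segmentation project when my collaborator told me, out of the blue, that we had only 600 images and annotations from 175 patients. This is a common problem in medical imaging, because clinicians are among the busiest people around and have many responsibilities. He did assure me, however, that once the model was trained (and fine-tuned), we would receive images and annotations from more than 300 additional patients to serve as an extra test set for evaluating the model.
I started by splitting these patients into training, validation, and test sets in an 80:10:10 ratio. For the model, I first tried UNet and its variants (ResUNet, Attention UNet, Res-Attention UNet). These models performed well on the training, validation, and test sets but poorly on the additional test set. Then I thought, "Let's try YOLOv8; if it works, great, and if not, it will be an interesting learning experience." A few hours later it worked and, to my surprise, far exceeded my expectations on the additional test set. I can't share the exact numbers because the paper is still under review, but I am happy to share how to adapt YOLOv8 to a custom dataset so that you can save a lot of working time. Let's lay out the roadmap.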
Roadmap
Here are the topics we will cover:
1. Introduction to YOLOv8
2. Installing the library
3. Dataset preparation
4. Training preparation
5. Training the model
6. Results
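Introduction to YOLOv8
YOLOv8, developed by Ultralytics, was the latest release in the YOLO family of real-time object detectors at the time of writing. It introduces modifications such as spatial attention and feature fusion to improve accuracy and speed. The architecture combines a modified CSPDarknet53 backbone with an advanced head for processing. These advances make YOLOv8 a state-of-the-art choice for a range of computer vision tasks.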
Installing the library
The library can be installed in either of the following ways.
# Install the ultralytics package using conda
conda install -c conda-forge ultralytics
# or install the ultralytics package from PyPI
pip install ultralytics
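To confirm the installation before moving on, a quick check from Python can help. The sketch below only tests whether a package can be found in the current environment; the helper name `is_installed` is my own and is not part of Ultralytics.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be located in this environment."""
    return importlib.util.find_spec(package) is not None


# Report whether the package installed above is visible to this interpreter
print("ultralytics installed:", is_installed("ultralytics"))
```

If this prints `False`, the package was probably installed into a different environment than the one running the script.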
Dataset preparation
The dataset needs two processing steps:
Step 1: Organize your dataset (images and masks) in the structure below. Ideally, the train, validation (val), and test split is 80:10:10. The dataset folders are arranged as follows:
datasets
|
|---train
|   |-- images
|   |-- labels
|
|---val
|   |-- images
|   |-- labels
|
|---test
|   |-- images
|   |-- labels
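With several images per patient, it is usually safer to split by patient rather than by image, so that images from one patient never land in both the training and test sets. The following is a minimal sketch of such an 80:10:10 split; the function name `split_by_patient` and the patient IDs are hypothetical and only serve as an illustration.

```python
import random


def split_by_patient(patient_ids, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle unique patient IDs and cut them into train/val/test groups."""
    ids = sorted(set(patient_ids))       # de-duplicate and fix the order for reproducibility
    random.Random(seed).shuffle(ids)     # seeded shuffle, so the split can be repeated
    n_train = round(len(ids) * ratios[0])
    n_val = round(len(ids) * ratios[1])
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],   # whatever remains goes to the test set
    }


# Hypothetical patient IDs, for illustration only
splits = split_by_patient([f"patient_{i:03d}" for i in range(20)])
print({name: len(ids) for name, ids in splits.items()})
```

Each patient's images and masks can then be copied into the folder of the split that the patient was assigned to.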
Step 2: Convert the .png (or any other format) masks (labels) in all three labels folders into .txt files. The Python code below converts the label images (.png, .jpg) into .txt files.
Convert each label image to a .txt file
import numpy as np
from PIL import Image
from pathlib import Path

IMAGE_SUFFIXES = ['.jpg', '.jpeg', '.png', '.bmp']

def create_label(image_path, label_path):
    # Load the mask from the given path and convert it to a NumPy array
    mask = np.asarray(Image.open(image_path))
    # Use the first channel of multi-channel masks; single-channel masks are used as-is
    if mask.ndim == 3:
        mask = mask[:, :, 0]
    # Find the coordinates of non-zero (i.e., not black) pixels in the mask
    rows, cols = np.nonzero(mask)
    # If the mask has no non-zero pixels there is nothing to label, so return early
    if len(rows) == 0:
        return
    # Normalize the coordinates by the image dimensions so that they are
    # relative (between 0 and 1) rather than absolute
    normalized_coords = [(col / mask.shape[1], row / mask.shape[0]) for row, col in zip(rows, cols)]
    # Build the label line: class id '0' followed by pairs of normalized x y coordinates
    label_line = '0 ' + ' '.join(f'{x} {y}' for x, y in normalized_coords)
    # Ensure that the directory for label_path exists, creating it if needed
    Path(label_path).parent.mkdir(parents=True, exist_ok=True)
    # Write the label line to the label file
    with open(label_path, 'w') as f:
        f.write(label_line)

for x in ['train', 'val', 'test']:
    masks_dir_path = Path(f'datasets/{x}/labels')
    # List the folder up front, because .txt files are written into it during the loop
    for img_path in sorted(masks_dir_path.iterdir()):
        if img_path.is_file() and img_path.suffix.lower() in IMAGE_SUFFIXES:
            # Write the .txt next to the mask so that it ends up in the 'labels' folder
            label_path = img_path.with_suffix('.txt')
            create_label(img_path, label_path)
        else:
            print(f"Skipping non-image file: {img_path}")
Note: after running the code above, don't forget to delete the label (mask) images from the labels folders.
Training preparation
Create the 'data.yaml' file for training. Just run the code below in Python and it will create the 'data.yaml' file for YOLOv8.
from pathlib import Path

yaml_content = '''
train: train/images
val: val/images
test: test/images

names: ['object']

# Hyperparameters ------------------------------------------------------------------------------------------------------
# lr0: 0.01  # initial learning rate (i.e. SGD=1E-2, Adam=1E-3)
# lrf: 0.01  # final learning rate (lr0 * lrf)
# momentum: 0.937  # SGD momentum/Adam beta1
# weight_decay: 0.0005  # optimizer weight decay 5e-4
# warmup_epochs: 3.0  # warmup epochs (fractions ok)
# warmup_momentum: 0.8  # warmup initial momentum
# warmup_bias_lr: 0.1  # warmup initial bias lr
# box: 7.5  # box loss gain
# cls: 0.5  # cls loss gain (scale with pixels)
# dfl: 1.5  # dfl loss gain
# pose: 12.0  # pose loss gain
# kobj: 1.0  # keypoint obj loss gain
# label_smoothing: 0.0  # label smoothing (fraction)
# nbs: 64  # nominal batch size
# hsv_h: 0.015  # image HSV-Hue augmentation (fraction)
# hsv_s: 0.7  # image HSV-Saturation augmentation (fraction)
# hsv_v: 0.4  # image HSV-Value augmentation (fraction)
degrees: 0.5  # image rotation (+/- deg)
translate: 0.1  # image translation (+/- fraction)
scale: 0.2  # image scale (+/- gain)
shear: 0.2  # image shear (+/- deg) from -0.5 to 0.5
perspective: 0.1  # image perspective (+/- fraction), range 0-0.001
flipud: 0.7  # image flip up-down (probability)
fliplr: 0.5  # image flip left-right (probability)
mosaic: 0.8  # image mosaic (probability)
mixup: 0.1  # image mixup (probability)
# copy_paste: 0.0  # segment copy-paste (probability)
'''

with Path('data.yaml').open('w') as f:
    f.write(yaml_content)
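Before training, it can be worth checking that every image has a matching label file. The sketch below lists image names with no .txt file of the same name, assuming the `datasets/<split>/images` and `datasets/<split>/labels` layout used in this tutorial; the helper name `find_unlabeled_images` is hypothetical.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def find_unlabeled_images(root, split):
    """Return stems of images in <root>/<split>/images with no .txt in <root>/<split>/labels."""
    images_dir = Path(root) / split / "images"
    labels_dir = Path(root) / split / "labels"
    image_stems = {p.stem for p in images_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES}
    label_stems = {p.stem for p in labels_dir.glob("*.txt")}
    return sorted(image_stems - label_stems)


# Report each split that exists under the dataset root used in this tutorial
for split in ("train", "val", "test"):
    if (Path("datasets") / split / "images").is_dir():
        missing = find_unlabeled_images("datasets", split)
        print(f"{split}: {len(missing)} image(s) without a label file")
```

An image without a label file is not necessarily an error, since the conversion step writes nothing for an empty mask, but an unexpectedly large count usually points to a naming mismatch.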
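Training the model
Once the data is ready, the rest is very simple: just run the following code.
from ultralytics import YOLO

# Load the pretrained YOLOv8 nano segmentation weights
model = YOLO("yolov8n-seg.pt")

# imgsz should be a multiple of the model stride (32), so use 256 rather than 255
results = model.train(
    batch=8,
    device="cpu",
    data="data.yaml",
    epochs=100,
    imgsz=256)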
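One caveat on the hyperparameters: as far as I can tell, Ultralytics reads the dataset YAML only for paths and class names, and takes augmentation settings from the training arguments instead. If the values in data.yaml appear to have no effect, a sketch like the following passes them to `train()` directly. The helper name `train_with_augmentation` is hypothetical, and this behavior is an assumption that should be verified against your installed version.

```python
# Augmentation values copied from the data.yaml in this tutorial. Passing them
# to train() is an assumption about where Ultralytics reads them from.
AUGMENTATION = {
    "degrees": 0.5,      # image rotation (+/- deg)
    "translate": 0.1,    # image translation (+/- fraction)
    "scale": 0.2,        # image scale (+/- gain)
    "shear": 0.2,        # image shear (+/- deg)
    "perspective": 0.1,  # the comment in data.yaml gives a range of 0-0.001 for this value
    "flipud": 0.7,       # image flip up-down (probability)
    "fliplr": 0.5,       # image flip left-right (probability)
    "mosaic": 0.8,       # image mosaic (probability)
    "mixup": 0.1,        # image mixup (probability)
}


def train_with_augmentation(data="data.yaml", weights="yolov8n-seg.pt"):
    """Train the segmentation model with the augmentation settings passed explicitly."""
    # Imported here so that the settings can be inspected without the package installed
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(data=data, epochs=100, imgsz=256, batch=8,
                       device="cpu", **AUGMENTATION)
```

Calling `train_with_augmentation()` starts a training run with the same batch size, device, and epoch count as the call above.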
Congratulations, you did it. You will now see a 'runs' folder containing all the training metrics and plots.
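The run folder normally also contains a results.csv file with one row per epoch. The sketch below finds the epoch with the highest value of a chosen metric. The column name `metrics/mAP50-95(M)` and the run path are assumptions on my part, so check the header of your own file; the helper name `best_epoch` is hypothetical.

```python
import csv
from pathlib import Path


def best_epoch(csv_path, metric="metrics/mAP50-95(M)"):
    """Return (epoch, value) for the row with the highest value of `metric`."""
    with Path(csv_path).open(newline="") as f:
        # Some versions pad headers and values with spaces, so strip both
        rows = [
            {k.strip(): (v or "").strip() for k, v in row.items() if k is not None}
            for row in csv.DictReader(f)
        ]
    if not rows:
        raise ValueError(f"no data rows in {csv_path}")
    if metric not in rows[0]:
        raise KeyError(f"column {metric!r} not found; available: {list(rows[0])}")
    best = max(rows, key=lambda r: float(r[metric]))
    return int(float(best["epoch"])), float(best[metric])


# Hypothetical run folder; replace it with the folder of your own training run
results_csv = Path("runs/segment/train/results.csv")
if results_csv.exists():
    print(best_epoch(results_csv))
```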
Results
Now let's check the results on the test data:
import glob
from ultralytics import YOLO

model = YOLO("runs/segment/train13/weights/best.pt")  # load the model (use the folder of your own training run)
file = glob.glob('datasets/test/images/*')  # let's get the images
Now let's run the model on the images.
# let's run the model over every image
for i in range(len(file)):
    result = model(file[i], save=True, save_txt=True)
Convert each Pred.txt file to mask.png
import glob
from pathlib import Path

import cv2
import numpy as np

def convert_label_to_image(label_path, image_path):
    # Set the dimensions of the output image
    width, height = 256, 256
    # Create a blank image
    image = np.zeros((height, width, 3), dtype=np.uint8)
    # Read the .txt label file; each non-empty line describes one predicted polygon
    with open(label_path, 'r') as f:
        label_lines = [line for line in f if line.strip()]
    for label_line in label_lines:
        # Parse the label line to extract the normalized coordinates
        coords = label_line.strip().split()[1:]  # Remove the class label (assuming it's always 0)
        if len(coords) < 6:
            continue  # fewer than three points cannot form a polygon
        # Convert normalized coordinates to pixel coordinates
        coordinates = [(float(coords[i]) * width, float(coords[i + 1]) * height) for i in range(0, len(coords) - 1, 2)]
        coordinates = np.array(coordinates, dtype=np.int32)
        # Draw the polygon using the coordinates
        cv2.fillPoly(image, [coordinates], (255, 255, 255))  # Fill the polygon with white color
    # Save the image
    cv2.imwrite(image_path, image)
    print(f"Image saved successfully: {image_path}")

# Example usage on a single file
label_path = 'runs/segment/predict4/val_labels/img_105.txt'
image_path = 'runs/segment/predict4/val_labels/img_105.png'
convert_label_to_image(label_path, image_path)

# Convert every prediction file; .png keeps the mask lossless
file = glob.glob('runs/segment/predict11/labels/*.txt')
for i in range(len(file)):
    label_path = file[i]
    image_path = str(Path(file[i]).with_suffix('.png'))
    convert_label_to_image(label_path, image_path)
Review editor: 湯梓紅
Original title: Custom Medical Image Segmentation Based on YOLOv8
Source: WeChat ID vision263com, WeChat official account 新机器视觉. Please credit the source when reposting.