VOC Dataset
The PASCAL VOC (Visual Object Classes) dataset is a well-known dataset for object detection, segmentation, and classification. It was designed to encourage research on a wide variety of object categories and is commonly used to benchmark computer vision models. It is an essential dataset for researchers and developers working on object detection, segmentation, and classification tasks.
Key Features
- The VOC dataset includes two main challenges: VOC2007 and VOC2012.
- The dataset comprises 20 object categories, including common objects such as cars, bicycles, and animals, as well as more specific categories such as boats, sofas, and dining tables.
- Annotations include object bounding boxes and class labels for object detection and classification tasks, and segmentation masks for segmentation tasks.
- VOC provides standardized evaluation metrics like mean Average Precision (mAP) for object detection and classification, making it suitable for comparing model performance.
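As a rough illustration of the mAP metric mentioned above, the sketch below computes an 11-point interpolated Average Precision, the definition used in the VOC2007 challenge (later VOC challenges switched to the area under the precision-recall curve). The function names and the toy precision/recall values are assumptions made for this example, not part of the VOC devkit or of Ultralytics, whose evaluation code may compute AP differently.

```python
def average_precision_11pt(recalls, precisions):
    """11-point interpolated AP over matching recall/precision samples."""
    ap = 0.0
    for i in range(11):
        threshold = i / 10  # recall levels 0.0, 0.1, ..., 1.0
        # Interpolated precision: the best precision reached at any recall >= threshold.
        candidates = [p for r, p in zip(recalls, precisions) if r >= threshold]
        ap += max(candidates, default=0.0) / 11
    return ap


def mean_average_precision(per_class_ap):
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(per_class_ap.values()) / len(per_class_ap)


# Toy precision-recall samples for a single class (hypothetical values).
ap_dog = average_precision_11pt(recalls=[0.5, 1.0], precisions=[1.0, 0.5])
map_score = mean_average_precision({"dog": ap_dog, "cat": 1.0})
```

In practice the recall and precision samples come from ranking all detections of one class by confidence and matching them to ground truth boxes at an IoU threshold (0.5 in the VOC challenges).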
Dataset Structure
The VOC dataset is split into three subsets:
- Train: This subset contains images for training object detection, segmentation, and classification models.
- Validation: This subset contains images used for validation during model training.
- Test: This subset consists of images used for testing and benchmarking trained models. For VOC2012, ground truth annotations for this subset are not publicly available, and results are submitted to the PASCAL VOC evaluation server for performance evaluation. The VOC2007 test annotations are included in the download, which is why the YAML configuration can use test2007 for validation.
Applications
The VOC dataset is widely used for training and evaluating deep learning models in object detection (such as YOLO, Faster R-CNN, and SSD), instance segmentation (such as Mask R-CNN), and image classification. The dataset's diverse set of object categories, large number of annotated images, and standardized evaluation metrics make it an essential resource for computer vision researchers and practitioners.
Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant details. For the VOC dataset, the VOC.yaml file is maintained at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/VOC.yaml.
ultralytics/cfg/datasets/VOC.yaml
# Ultralytics YOLO 🚀, AGPL-3.0 license
# PASCAL VOC dataset http://host.robots.ox.ac.uk/pascal/VOC by University of Oxford
# Documentation: https://docs.ultralytics.com/datasets/detect/voc/
# Example usage: yolo train data=VOC.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── VOC  ← downloads here (2.8 GB)
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/VOC
train: # train images (relative to 'path')  16551 images
  - images/train2012
  - images/train2007
  - images/val2012
  - images/val2007
val: # val images (relative to 'path')  4952 images
  - images/test2007
test: # test images (optional)
  - images/test2007

# Classes
names:
  0: aeroplane
  1: bicycle
  2: bird
  3: boat
  4: bottle
  5: bus
  6: car
  7: cat
  8: chair
  9: cow
  10: diningtable
  11: dog
  12: horse
  13: motorbike
  14: person
  15: pottedplant
  16: sheep
  17: sofa
  18: train
  19: tvmonitor

# Download script/URL (optional) ---------------------------------------------------------------------------------------
download: |
  import xml.etree.ElementTree as ET

  from tqdm import tqdm
  from ultralytics.utils.downloads import download
  from pathlib import Path

  def convert_label(path, lb_path, year, image_id):
      def convert_box(size, box):
          dw, dh = 1. / size[0], 1. / size[1]
          x, y, w, h = (box[0] + box[1]) / 2.0 - 1, (box[2] + box[3]) / 2.0 - 1, box[1] - box[0], box[3] - box[2]
          return x * dw, y * dh, w * dw, h * dh

      in_file = open(path / f'VOC{year}/Annotations/{image_id}.xml')
      out_file = open(lb_path, 'w')
      tree = ET.parse(in_file)
      root = tree.getroot()
      size = root.find('size')
      w = int(size.find('width').text)
      h = int(size.find('height').text)

      names = list(yaml['names'].values())  # names list
      for obj in root.iter('object'):
          cls = obj.find('name').text
          if cls in names and int(obj.find('difficult').text) != 1:
              xmlbox = obj.find('bndbox')
              bb = convert_box((w, h), [float(xmlbox.find(x).text) for x in ('xmin', 'xmax', 'ymin', 'ymax')])
              cls_id = names.index(cls)  # class id
              out_file.write(" ".join(str(a) for a in (cls_id, *bb)) + '\n')


  # Download
  dir = Path(yaml['path'])  # dataset root dir
  url = 'https://github.com/ultralytics/assets/releases/download/v0.0.0/'
  urls = [f'{url}VOCtrainval_06-Nov-2007.zip',  # 446MB, 5012 images
          f'{url}VOCtest_06-Nov-2007.zip',  # 438MB, 4953 images
          f'{url}VOCtrainval_11-May-2012.zip']  # 1.95GB, 17126 images
  download(urls, dir=dir / 'images', curl=True, threads=3, exist_ok=True)  # download and unzip over existing paths (required)

  # Convert
  path = dir / 'images/VOCdevkit'
  for year, image_set in ('2012', 'train'), ('2012', 'val'), ('2007', 'train'), ('2007', 'val'), ('2007', 'test'):
      imgs_path = dir / 'images' / f'{image_set}{year}'
      lbs_path = dir / 'labels' / f'{image_set}{year}'
      imgs_path.mkdir(exist_ok=True, parents=True)
      lbs_path.mkdir(exist_ok=True, parents=True)

      with open(path / f'VOC{year}/ImageSets/Main/{image_set}.txt') as f:
          image_ids = f.read().strip().split()
      for id in tqdm(image_ids, desc=f'{image_set}{year}'):
          f = path / f'VOC{year}/JPEGImages/{id}.jpg'  # old img path
          lb_path = (lbs_path / f.name).with_suffix('.txt')  # new label path
          f.rename(imgs_path / f.name)  # move image
          convert_label(path, lb_path, year, id)  # convert labels to YOLO format
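
To make the label conversion in the download script easier to follow, here is a minimal standalone sketch of the same idea: it parses a VOC-style XML annotation and produces YOLO-format rows (class id, then normalized center x, center y, width, height). The sample XML, the shortened class list, and the helper name voc_xml_to_yolo are assumptions made for illustration. The box arithmetic mirrors the script's convert_box, including its -1 offset on the box center, which presumably accounts for VOC's 1-based pixel coordinates.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation, trimmed to the fields the conversion reads.
SAMPLE_XML = """
<annotation>
  <size><width>500</width><height>375</height></size>
  <object>
    <name>dog</name><difficult>0</difficult>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>300</xmax><ymax>250</ymax></bndbox>
  </object>
  <object>
    <name>cat</name><difficult>1</difficult>
    <bndbox><xmin>10</xmin><ymin>10</ymin><xmax>60</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>
"""

NAMES = ["cat", "dog"]  # shortened class list for this example only


def convert_box(size, box):
    """Map (xmin, xmax, ymin, ymax) in pixels to normalized (x_center, y_center, w, h)."""
    dw, dh = 1.0 / size[0], 1.0 / size[1]
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    return x * dw, y * dh, w * dw, h * dh


def voc_xml_to_yolo(xml_text, names):
    """Return one (class_id, x, y, w, h) row per non-difficult object of a known class."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)
    rows = []
    for obj in root.iter("object"):
        cls = obj.find("name").text
        if cls not in names or int(obj.find("difficult").text) == 1:
            continue  # skip unknown classes and objects flagged as difficult
        xmlbox = obj.find("bndbox")
        box = [float(xmlbox.find(k).text) for k in ("xmin", "xmax", "ymin", "ymax")]
        rows.append((names.index(cls), *convert_box((width, height), box)))
    return rows


labels = voc_xml_to_yolo(SAMPLE_XML, NAMES)
for row in labels:
    print(" ".join(str(v) for v in row))  # one YOLO label line per object
```

Only the "dog" object is written here, because the "cat" object is marked difficult and the script filters such objects out.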
Usage
To train a YOLO11n model on the VOC dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model Training page.
Train Example
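A minimal Python sketch of the training run described above, assuming the ultralytics package is installed and that the pretrained weights file is named yolo11n.pt. The wrapper function train_voc and the TRAIN_ARGS dictionary are hypothetical conveniences for this example. The arguments themselves (data, epochs, imgsz) come from the text.

```python
# Training arguments taken from the text: VOC.yaml, 100 epochs, image size 640.
TRAIN_ARGS = {"data": "VOC.yaml", "epochs": 100, "imgsz": 640}


def train_voc(weights="yolo11n.pt"):
    """Load a pretrained YOLO11n model and train it on the VOC dataset."""
    # Imported lazily so this module can be read without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # load pretrained weights (assumed file name)
    return model.train(**TRAIN_ARGS)  # start training and return the results
```

Calling train_voc() starts the run. On first use the download script in VOC.yaml is expected to fetch and convert the dataset (about 2.8 GB according to the YAML comments), so the first call can take a while.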
Sample Images and Annotations
The VOC dataset contains a diverse set of images with various object categories and complex scenes. Here is an example of images from the dataset together with their annotations:
- Mosaiced Image: This image shows a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
The example showcases the variety and complexity of the images in the VOC dataset and the benefits of using mosaicing during training.
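The idea behind mosaicing can be sketched in a few lines. The version below is deliberately simplified and is not the Ultralytics implementation: it assumes four equally sized images placed on a fixed 2x2 grid, whereas real mosaic augmentation typically also uses a random center point, rescaling, and cropping. Images are plain nested lists and boxes are pixel-space (cls, x1, y1, x2, y2) tuples. All names here are hypothetical.

```python
def mosaic_2x2(images, boxes):
    """Tile four equally sized images into one 2x2 canvas and shift their boxes.

    images: four images, each a list of rows (lists of pixel values).
    boxes:  four lists of (cls, x1, y1, x2, y2) pixel boxes, one list per image.
    """
    h, w = len(images[0]), len(images[0][0])
    # Concatenate rows side by side for the top and bottom halves of the canvas.
    top = [a + b for a, b in zip(images[0], images[1])]
    bottom = [c + d for c, d in zip(images[2], images[3])]
    canvas = top + bottom
    # Each tile's boxes move by the tile's (dx, dy) position on the canvas.
    offsets = [(0, 0), (w, 0), (0, h), (w, h)]
    merged = []
    for (dx, dy), tile_boxes in zip(offsets, boxes):
        for cls, x1, y1, x2, y2 in tile_boxes:
            merged.append((cls, x1 + dx, y1 + dy, x2 + dx, y2 + dy))
    return canvas, merged


# Four 2x2 single-channel images filled with 0, 1, 2, 3, each with one box.
tiles = [[[v] * 2 for _ in range(2)] for v in range(4)]
tile_boxes = [[(v, 0, 0, 1, 1)] for v in range(4)]
canvas, merged = mosaic_2x2(tiles, tile_boxes)
```

The point of the sketch is that the labels must be remapped together with the pixels: every box is shifted by the offset of the tile it came from, so the combined image keeps valid annotations.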
Citations and Acknowledgments
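If you use the VOC dataset in your research or development work, please cite the following paper:
Everingham, M., Van Gool, L., Williams, C. K. I., Winn, J., and Zisserman, A. (2010). The PASCAL Visual Object Classes (VOC) Challenge. International Journal of Computer Vision, 88(2), 303-338.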
We would like to acknowledge the PASCAL VOC Consortium for creating and maintaining this valuable resource for the computer vision community. For more information about the VOC dataset and its creators, visit the PASCAL VOC dataset website.
FAQ
What is the PASCAL VOC dataset and why is it important for computer vision tasks?
The PASCAL VOC (Visual Object Classes) dataset is a renowned benchmark for object detection, segmentation, and classification in computer vision. It includes comprehensive annotations like bounding boxes, class labels, and segmentation masks across 20 different object categories. Researchers use it widely to evaluate the performance of models like Faster R-CNN, YOLO, and Mask R-CNN due to its standardized evaluation metrics such as mean Average Precision (mAP).
How do I train a YOLO11 model using the VOC dataset?
To train a YOLO11 model with the VOC dataset, you need the dataset configuration in a YAML file. For example, the following command starts training a YOLO11n model for 100 epochs with an image size of 640: yolo detect train data=VOC.yaml model=yolo11n.pt epochs=100 imgsz=640
What are the primary challenges included in the VOC dataset?
The VOC dataset includes two main challenges: VOC2007 and VOC2012. These challenges test object detection, segmentation, and classification across 20 diverse object categories. Images are annotated with bounding boxes and class labels, and a subset also has segmentation masks. The challenges provide standardized metrics such as mAP, which makes it easier to compare and benchmark different computer vision models.
How does the PASCAL VOC dataset enhance model benchmarking and evaluation?
The PASCAL VOC dataset enhances model benchmarking and evaluation through its detailed annotations and standardized metrics like mean Average Precision (mAP). These metrics are crucial for assessing the performance of object detection and classification models. The dataset's diverse and complex images ensure comprehensive model evaluation across various real-world scenarios.
How do I use the VOC dataset for semantic segmentation in YOLO models?
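To use the VOC dataset for semantic segmentation tasks with YOLO models, you need to configure the dataset properly in a YAML file. The YAML file defines the paths and classes needed to train segmentation models. See the VOC dataset YAML configuration file, VOC.yaml, for the detailed setup. Note that the conversion script in VOC.yaml writes only bounding box labels, so segmentation labels would need to be prepared separately.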