
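Open Images V7 Dataset

Open Images V7 is a versatile, large-scale dataset championed by Google. Built to advance computer vision research, it offers a vast collection of images annotated with image-level labels, object bounding boxes, object segmentation masks, visual relationships, and localized narratives.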



Watch: Object Detection Using the OpenImagesV7 Pretrained Model

Open Images V7 Pretrained Models

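| Model | Size (pixels) | mAPval 50-95 | Speed CPU ONNX (ms) | Speed A100 TensorRT (ms) | Params (M) | FLOPs (B) |
|---------|-----|------|-------|------|------|-------|
| YOLOv8n | 640 | 18.4 | 142.4 | 1.21 | 3.5 | 10.5 |
| YOLOv8s | 640 | 27.7 | 183.1 | 1.40 | 11.4 | 29.7 |
| YOLOv8m | 640 | 33.6 | 408.5 | 2.26 | 26.2 | 80.6 |
| YOLOv8l | 640 | 34.9 | 596.9 | 2.43 | 44.1 | 167.4 |
| YOLOv8x | 640 | 36.3 | 860.6 | 3.56 | 68.7 | 260.6 |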
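
As a rough illustration of how these benchmark figures might guide model choice, the sketch below picks the most accurate model whose CPU ONNX latency fits a given budget. The numbers are copied from the table above; the `pick_model` helper and the budget values are hypothetical and not part of the Ultralytics API.

```python
# Benchmark figures copied from the table above:
# model name -> (mAPval 50-95, CPU ONNX latency in ms, A100 TensorRT latency in ms)
OIV7_MODELS = {
    "YOLOv8n": (18.4, 142.4, 1.21),
    "YOLOv8s": (27.7, 183.1, 1.40),
    "YOLOv8m": (33.6, 408.5, 2.26),
    "YOLOv8l": (34.9, 596.9, 2.43),
    "YOLOv8x": (36.3, 860.6, 3.56),
}


def pick_model(cpu_budget_ms):
    """Return the highest-mAP model whose CPU ONNX latency is within the budget, or None."""
    candidates = [
        (metrics[0], name)
        for name, metrics in OIV7_MODELS.items()
        if metrics[1] <= cpu_budget_ms
    ]
    return max(candidates)[1] if candidates else None


print(pick_model(200.0))  # YOLOv8s
print(pick_model(100.0))  # None (no model is fast enough)
```

Published latencies depend on hardware and export settings, so treat a selection like this as a starting point and measure on your own target device.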

You can use these pretrained models for inference or fine-tuning as follows.

Pretrained Model Usage Example

Python

from ultralytics import YOLO

# Load an Open Images Dataset V7 pretrained YOLOv8n model
model = YOLO("yolov8n-oiv7.pt")

# Run prediction
results = model.predict(source="image.jpg")

# Start training from the pretrained checkpoint
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)

CLI

# Predict using an Open Images Dataset V7 pretrained model
yolo detect predict source=image.jpg model=yolov8n-oiv7.pt

# Start training from an Open Images Dataset V7 pretrained checkpoint
yolo detect train data=coco8.yaml model=yolov8n-oiv7.pt epochs=100 imgsz=640
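
The `predict` call returns a list of result objects, one per image. A minimal sketch of turning one result into per-class detection counts follows. `count_by_class` and `summarize_image` are hypothetical helper names, and the sketch assumes the `names` mapping and `boxes.cls` tensor that Ultralytics result objects expose.

```python
from collections import Counter


def count_by_class(class_ids, names):
    """Count detections per class name, given class IDs and an ID -> name mapping."""
    return Counter(names[int(c)] for c in class_ids)


def summarize_image(source="image.jpg", weights="yolov8n-oiv7.pt"):
    """Run prediction on one image and count detections per class (needs ultralytics installed)."""
    from ultralytics import YOLO  # imported lazily so count_by_class works without the package

    result = YOLO(weights).predict(source=source)[0]
    return count_by_class(result.boxes.cls.tolist(), result.names)


# Standalone demonstration with made-up detections
# (in the dataset YAML, class 90 is "Car" and class 381 is "Person").
demo = count_by_class([381.0, 381.0, 90.0], {90: "Car", 381: "Person"})
print(demo["Person"], demo["Car"])  # 2 1
```

Calling `summarize_image()` downloads the checkpoint on first use, so the demonstration above uses hard-coded class IDs instead.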

Visualization of Open Images V7 classes

Key Features

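  • Contains about 9 million images annotated in various ways to suit multiple computer vision tasks.
  • Provides 16 million bounding boxes across 600 object classes on 1.9 million images. These boxes are largely hand-drawn by experts to ensure high precision.
  • Offers 3.3 million visual relationship annotations, detailing 1,466 unique relationship triplets, object properties, and human activities.
  • V5 introduced segmentation masks for 2.8 million objects across 350 classes.
  • V6 introduced 675k localized narratives that combine voice, text, and mouse traces highlighting the objects being described.
  • V7 introduced 66.4 million point-level labels on 1.4 million images, spanning 5,827 classes.
  • Includes 61.4 million image-level labels across 20,638 distinct classes.
  • Provides a unified platform for image classification, object detection, relationship detection, instance segmentation, and multimodal image descriptions.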

Dataset Structure

Open Images V7 is organized into several components that serve different computer vision challenges:

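  • Images: About 9 million images, often showing complex scenes with an average of 8.3 objects per image.
  • Bounding boxes: More than 16 million boxes marking objects across 600 categories.
  • Segmentation masks: Precise outlines of 2.8 million objects across 350 classes.
  • Visual relationships: 3.3 million annotations indicating object relationships, properties, and actions.
  • Localized narratives: 675k descriptions combining voice, text, and mouse traces.
  • Point-level labels: 66.4 million labels on 1.4 million images, suitable for zero-shot and few-shot semantic segmentation.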
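
To put these counts in perspective, the short calculation below derives approximate annotation densities from the rounded figures quoted on this page. It is back-of-the-envelope arithmetic on rounded numbers, so the results are only indicative.

```python
# Rounded counts quoted on this page, in millions.
boxes_m, boxed_images_m = 16.0, 1.9            # bounding boxes and the images carrying them
point_labels_m, point_images_m = 66.4, 1.4     # point-level labels and their images
image_labels_m, all_images_m = 61.4, 9.0       # image-level labels and all images

boxes_per_image = boxes_m / boxed_images_m
points_per_image = point_labels_m / point_images_m
labels_per_image = image_labels_m / all_images_m

print(f"{boxes_per_image:.1f} boxes per box-annotated image")            # 8.4
print(f"{points_per_image:.1f} point labels per point-annotated image")  # 47.4
print(f"{labels_per_image:.1f} image-level labels per image")            # 6.8
```

The box density is in line with the figure of roughly 8.3 objects per image quoted for the dataset.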

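Applications

Open Images V7 is a cornerstone for training and evaluating state-of-the-art models across computer vision tasks. Its broad scope and high-quality annotations make it indispensable for researchers and developers specializing in computer vision.

Some key applications include: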

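  • Advanced object detection: Train models to identify and locate multiple objects in complex scenes with high accuracy.
  • Semantic understanding: Develop systems that comprehend the visual relationships between objects.
  • Image segmentation: Create precise pixel-level masks for objects, enabling detailed scene analysis.
  • Multimodal learning: Combine visual data with text descriptions for richer AI understanding.
  • Zero-shot learning: Leverage the extensive class coverage to identify objects not seen during training.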

Dataset YAML

Ultralytics maintains an open-images-v7.yaml file that specifies the dataset paths, class names, and other configuration details required for training.

open-images-v7.yaml

# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

# Open Images v7 dataset https://storage.googleapis.com/openimages/web/index.html by Google
# Documentation: https://docs.ultralytics.com/datasets/detect/open-images-v7/
# Example usage: yolo train data=open-images-v7.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── open-images-v7 ← downloads here (561 GB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: open-images-v7 # dataset root dir
train: images/train # train images (relative to 'path') 1743042 images
val: images/val # val images (relative to 'path') 41620 images
test: # test images (optional)

# Classes
names:
  0: Accordion
  1: Adhesive tape
  2: Aircraft
  3: Airplane
  4: Alarm clock
  5: Alpaca
  6: Ambulance
  7: Animal
  8: Ant
  9: Antelope
  10: Apple
  11: Armadillo
  12: Artichoke
  13: Auto part
  14: Axe
  15: Backpack
  16: Bagel
  17: Baked goods
  18: Balance beam
  19: Ball
  20: Balloon
  21: Banana
  22: Band-aid
  23: Banjo
  24: Barge
  25: Barrel
  26: Baseball bat
  27: Baseball glove
  28: Bat (Animal)
  29: Bathroom accessory
  30: Bathroom cabinet
  31: Bathtub
  32: Beaker
  33: Bear
  34: Bed
  35: Bee
  36: Beehive
  37: Beer
  38: Beetle
  39: Bell pepper
  40: Belt
  41: Bench
  42: Bicycle
  43: Bicycle helmet
  44: Bicycle wheel
  45: Bidet
  46: Billboard
  47: Billiard table
  48: Binoculars
  49: Bird
  50: Blender
  51: Blue jay
  52: Boat
  53: Bomb
  54: Book
  55: Bookcase
  56: Boot
  57: Bottle
  58: Bottle opener
  59: Bow and arrow
  60: Bowl
  61: Bowling equipment
  62: Box
  63: Boy
  64: Brassiere
  65: Bread
  66: Briefcase
  67: Broccoli
  68: Bronze sculpture
  69: Brown bear
  70: Building
  71: Bull
  72: Burrito
  73: Bus
  74: Bust
  75: Butterfly
  76: Cabbage
  77: Cabinetry
  78: Cake
  79: Cake stand
  80: Calculator
  81: Camel
  82: Camera
  83: Can opener
  84: Canary
  85: Candle
  86: Candy
  87: Cannon
  88: Canoe
  89: Cantaloupe
  90: Car
  91: Carnivore
  92: Carrot
  93: Cart
  94: Cassette deck
  95: Castle
  96: Cat
  97: Cat furniture
  98: Caterpillar
  99: Cattle
  100: Ceiling fan
  101: Cello
  102: Centipede
  103: Chainsaw
  104: Chair
  105: Cheese
  106: Cheetah
  107: Chest of drawers
  108: Chicken
  109: Chime
  110: Chisel
  111: Chopsticks
  112: Christmas tree
  113: Clock
  114: Closet
  115: Clothing
  116: Coat
  117: Cocktail
  118: Cocktail shaker
  119: Coconut
  120: Coffee
  121: Coffee cup
  122: Coffee table
  123: Coffeemaker
  124: Coin
  125: Common fig
  126: Common sunflower
  127: Computer keyboard
  128: Computer monitor
  129: Computer mouse
  130: Container
  131: Convenience store
  132: Cookie
  133: Cooking spray
  134: Corded phone
  135: Cosmetics
  136: Couch
  137: Countertop
  138: Cowboy hat
  139: Crab
  140: Cream
  141: Cricket ball
  142: Crocodile
  143: Croissant
  144: Crown
  145: Crutch
  146: Cucumber
  147: Cupboard
  148: Curtain
  149: Cutting board
  150: Dagger
  151: Dairy Product
  152: Deer
  153: Desk
  154: Dessert
  155: Diaper
  156: Dice
  157: Digital clock
  158: Dinosaur
  159: Dishwasher
  160: Dog
  161: Dog bed
  162: Doll
  163: Dolphin
  164: Door
  165: Door handle
  166: Donut
  167: Dragonfly
  168: Drawer
  169: Dress
  170: Drill (Tool)
  171: Drink
  172: Drinking straw
  173: Drum
  174: Duck
  175: Dumbbell
  176: Eagle
  177: Earrings
  178: Egg (Food)
  179: Elephant
  180: Envelope
  181: Eraser
  182: Face powder
  183: Facial tissue holder
  184: Falcon
  185: Fashion accessory
  186: Fast food
  187: Fax
  188: Fedora
  189: Filing cabinet
  190: Fire hydrant
  191: Fireplace
  192: Fish
  193: Flag
  194: Flashlight
  195: Flower
  196: Flowerpot
  197: Flute
  198: Flying disc
  199: Food
  200: Food processor
  201: Football
  202: Football helmet
  203: Footwear
  204: Fork
  205: Fountain
  206: Fox
  207: French fries
  208: French horn
  209: Frog
  210: Fruit
  211: Frying pan
  212: Furniture
  213: Garden Asparagus
  214: Gas stove
  215: Giraffe
  216: Girl
  217: Glasses
  218: Glove
  219: Goat
  220: Goggles
  221: Goldfish
  222: Golf ball
  223: Golf cart
  224: Gondola
  225: Goose
  226: Grape
  227: Grapefruit
  228: Grinder
  229: Guacamole
  230: Guitar
  231: Hair dryer
  232: Hair spray
  233: Hamburger
  234: Hammer
  235: Hamster
  236: Hand dryer
  237: Handbag
  238: Handgun
  239: Harbor seal
  240: Harmonica
  241: Harp
  242: Harpsichord
  243: Hat
  244: Headphones
  245: Heater
  246: Hedgehog
  247: Helicopter
  248: Helmet
  249: High heels
  250: Hiking equipment
  251: Hippopotamus
  252: Home appliance
  253: Honeycomb
  254: Horizontal bar
  255: Horse
  256: Hot dog
  257: House
  258: Houseplant
  259: Human arm
  260: Human beard
  261: Human body
  262: Human ear
  263: Human eye
  264: Human face
  265: Human foot
  266: Human hair
  267: Human hand
  268: Human head
  269: Human leg
  270: Human mouth
  271: Human nose
  272: Humidifier
  273: Ice cream
  274: Indoor rower
  275: Infant bed
  276: Insect
  277: Invertebrate
  278: Ipod
  279: Isopod
  280: Jacket
  281: Jacuzzi
  282: Jaguar (Animal)
  283: Jeans
  284: Jellyfish
  285: Jet ski
  286: Jug
  287: Juice
  288: Kangaroo
  289: Kettle
  290: Kitchen & dining room table
  291: Kitchen appliance
  292: Kitchen knife
  293: Kitchen utensil
  294: Kitchenware
  295: Kite
  296: Knife
  297: Koala
  298: Ladder
  299: Ladle
  300: Ladybug
  301: Lamp
  302: Land vehicle
  303: Lantern
  304: Laptop
  305: Lavender (Plant)
  306: Lemon
  307: Leopard
  308: Light bulb
  309: Light switch
  310: Lighthouse
  311: Lily
  312: Limousine
  313: Lion
  314: Lipstick
  315: Lizard
  316: Lobster
  317: Loveseat
  318: Luggage and bags
  319: Lynx
  320: Magpie
  321: Mammal
  322: Man
  323: Mango
  324: Maple
  325: Maracas
  326: Marine invertebrates
  327: Marine mammal
  328: Measuring cup
  329: Mechanical fan
  330: Medical equipment
  331: Microphone
  332: Microwave oven
  333: Milk
  334: Miniskirt
  335: Mirror
  336: Missile
  337: Mixer
  338: Mixing bowl
  339: Mobile phone
  340: Monkey
  341: Moths and butterflies
  342: Motorcycle
  343: Mouse
  344: Muffin
  345: Mug
  346: Mule
  347: Mushroom
  348: Musical instrument
  349: Musical keyboard
  350: Nail (Construction)
  351: Necklace
  352: Nightstand
  353: Oboe
  354: Office building
  355: Office supplies
  356: Orange
  357: Organ (Musical Instrument)
  358: Ostrich
  359: Otter
  360: Oven
  361: Owl
  362: Oyster
  363: Paddle
  364: Palm tree
  365: Pancake
  366: Panda
  367: Paper cutter
  368: Paper towel
  369: Parachute
  370: Parking meter
  371: Parrot
  372: Pasta
  373: Pastry
  374: Peach
  375: Pear
  376: Pen
  377: Pencil case
  378: Pencil sharpener
  379: Penguin
  380: Perfume
  381: Person
  382: Personal care
  383: Personal flotation device
  384: Piano
  385: Picnic basket
  386: Picture frame
  387: Pig
  388: Pillow
  389: Pineapple
  390: Pitcher (Container)
  391: Pizza
  392: Pizza cutter
  393: Plant
  394: Plastic bag
  395: Plate
  396: Platter
  397: Plumbing fixture
  398: Polar bear
  399: Pomegranate
  400: Popcorn
  401: Porch
  402: Porcupine
  403: Poster
  404: Potato
  405: Power plugs and sockets
  406: Pressure cooker
  407: Pretzel
  408: Printer
  409: Pumpkin
  410: Punching bag
  411: Rabbit
  412: Raccoon
  413: Racket
  414: Radish
  415: Ratchet (Device)
  416: Raven
  417: Rays and skates
  418: Red panda
  419: Refrigerator
  420: Remote control
  421: Reptile
  422: Rhinoceros
  423: Rifle
  424: Ring binder
  425: Rocket
  426: Roller skates
  427: Rose
  428: Rugby ball
  429: Ruler
  430: Salad
  431: Salt and pepper shakers
  432: Sandal
  433: Sandwich
  434: Saucer
  435: Saxophone
  436: Scale
  437: Scarf
  438: Scissors
  439: Scoreboard
  440: Scorpion
  441: Screwdriver
  442: Sculpture
  443: Sea lion
  444: Sea turtle
  445: Seafood
  446: Seahorse
  447: Seat belt
  448: Segway
  449: Serving tray
  450: Sewing machine
  451: Shark
  452: Sheep
  453: Shelf
  454: Shellfish
  455: Shirt
  456: Shorts
  457: Shotgun
  458: Shower
  459: Shrimp
  460: Sink
  461: Skateboard
  462: Ski
  463: Skirt
  464: Skull
  465: Skunk
  466: Skyscraper
  467: Slow cooker
  468: Snack
  469: Snail
  470: Snake
  471: Snowboard
  472: Snowman
  473: Snowmobile
  474: Snowplow
  475: Soap dispenser
  476: Sock
  477: Sofa bed
  478: Sombrero
  479: Sparrow
  480: Spatula
  481: Spice rack
  482: Spider
  483: Spoon
  484: Sports equipment
  485: Sports uniform
  486: Squash (Plant)
  487: Squid
  488: Squirrel
  489: Stairs
  490: Stapler
  491: Starfish
  492: Stationary bicycle
  493: Stethoscope
  494: Stool
  495: Stop sign
  496: Strawberry
  497: Street light
  498: Stretcher
  499: Studio couch
  500: Submarine
  501: Submarine sandwich
  502: Suit
  503: Suitcase
  504: Sun hat
  505: Sunglasses
  506: Surfboard
  507: Sushi
  508: Swan
  509: Swim cap
  510: Swimming pool
  511: Swimwear
  512: Sword
  513: Syringe
  514: Table
  515: Table tennis racket
  516: Tablet computer
  517: Tableware
  518: Taco
  519: Tank
  520: Tap
  521: Tart
  522: Taxi
  523: Tea
  524: Teapot
  525: Teddy bear
  526: Telephone
  527: Television
  528: Tennis ball
  529: Tennis racket
  530: Tent
  531: Tiara
  532: Tick
  533: Tie
  534: Tiger
  535: Tin can
  536: Tire
  537: Toaster
  538: Toilet
  539: Toilet paper
  540: Tomato
  541: Tool
  542: Toothbrush
  543: Torch
  544: Tortoise
  545: Towel
  546: Tower
  547: Toy
  548: Traffic light
  549: Traffic sign
  550: Train
  551: Training bench
  552: Treadmill
  553: Tree
  554: Tree house
  555: Tripod
  556: Trombone
  557: Trousers
  558: Truck
  559: Trumpet
  560: Turkey
  561: Turtle
  562: Umbrella
  563: Unicycle
  564: Van
  565: Vase
  566: Vegetable
  567: Vehicle
  568: Vehicle registration plate
  569: Violin
  570: Volleyball (Ball)
  571: Waffle
  572: Waffle iron
  573: Wall clock
  574: Wardrobe
  575: Washing machine
  576: Waste container
  577: Watch
  578: Watercraft
  579: Watermelon
  580: Weapon
  581: Whale
  582: Wheel
  583: Wheelchair
  584: Whisk
  585: Whiteboard
  586: Willow
  587: Window
  588: Window blind
  589: Wine
  590: Wine glass
  591: Wine rack
  592: Winter melon
  593: Wok
  594: Woman
  595: Wood-burning stove
  596: Woodpecker
  597: Worm
  598: Wrench
  599: Zebra
  600: Zucchini

# Download script/URL (optional) ---------------------------------------------------------------------------------------
download: |
  import warnings

  from ultralytics.utils import LOGGER, SETTINGS, Path
  from ultralytics.utils.checks import check_requirements

  check_requirements("fiftyone")

  import fiftyone as fo
  import fiftyone.zoo as foz

  name = "open-images-v7"
  fo.config.dataset_zoo_dir = Path(SETTINGS["datasets_dir"]) / "fiftyone" / name
  fraction = 1.0  # fraction of full dataset to use
  LOGGER.warning("Open Images V7 dataset requires at least 561 GB of free space. Starting download...")
  for split in "train", "validation":  # 1743042 train, 41620 val images
      train = split == "train"

      # Load Open Images dataset
      dataset = foz.load_zoo_dataset(
          name,
          split=split,
          label_types=["detections"],
          max_samples=round((1743042 if train else 41620) * fraction),
      )

      # Define classes
      if train:
          classes = dataset.default_classes  # all classes
          # classes = dataset.distinct('ground_truth.detections.label')  # only observed classes

      # Export to YOLO format
      with warnings.catch_warnings():
          warnings.filterwarnings("ignore", category=UserWarning, module="fiftyone.utils.yolo")
          dataset.export(
              export_dir=str(Path(SETTINGS["datasets_dir"]) / name),
              dataset_type=fo.types.YOLOv5Dataset,
              label_field="ground_truth",
              split="val" if split == "validation" else split,
              classes=classes,
              overwrite=train,
          )
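
The download script delegates the conversion to FiftyOne's YOLOv5 exporter. As an illustration of what a YOLO-format label line contains, the sketch below converts one box from corner form to the `class x_center y_center width height` form. It assumes normalized `XMin, XMax, YMin, YMax` coordinates as used in the Open Images box annotation files, and `to_yolo_line` is a hypothetical helper, not the exporter's actual code.

```python
def to_yolo_line(class_id, xmin, xmax, ymin, ymax):
    """Convert normalized corner coordinates (0..1) to a YOLO label line."""
    x_center = (xmin + xmax) / 2
    y_center = (ymin + ymax) / 2
    width = xmax - xmin
    height = ymax - ymin
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Class 90 is "Car" in the names list above; the coordinates are made up.
print(to_yolo_line(90, 0.25, 0.75, 0.5, 1.0))  # 90 0.500000 0.750000 0.500000 0.500000
```

In a YOLO-format dataset, each image has a matching text file with one such line per object.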

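Usage

To train a YOLO11n model on the Open Images V7 dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a full list of available arguments, see the model Training page.

Warning

The complete Open Images V7 dataset comprises 1,743,042 training images and 41,620 validation images and requires approximately 561 GB of storage once downloaded.

Running the commands below triggers an automatic download of the full dataset if it is not already present locally. Before running the example, make sure to: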

  • Verify that your device has enough storage capacity.
  • Ensure a stable, fast internet connection.
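
One way to run the storage check before starting is sketched below. It mirrors the `fraction` variable from the download script in the dataset YAML to estimate how many images a partial download would fetch. The `plan_download` helper is hypothetical, and scaling the 561 GB figure linearly with the fraction is an assumption rather than a documented guarantee.

```python
import shutil

# Figures quoted in the warning above.
FULL_TRAIN, FULL_VAL, FULL_SIZE_GB = 1_743_042, 41_620, 561


def plan_download(fraction, path="."):
    """Estimate sample counts and disk need for a dataset fraction and compare with free space."""
    train = round(FULL_TRAIN * fraction)  # same expression as max_samples in the download script
    val = round(FULL_VAL * fraction)
    needed_gb = FULL_SIZE_GB * fraction  # assumption: size scales linearly with image count
    free_gb = shutil.disk_usage(path).free / 1e9
    return {
        "train": train,
        "val": val,
        "needed_gb": needed_gb,
        "enough_space": free_gb >= needed_gb,
    }


plan = plan_download(0.01)
print(plan["train"], plan["val"])  # 17430 416
print(f"about {plan['needed_gb']:.2f} GB needed, enough space: {plan['enough_space']}")
```

Pass the directory where datasets are stored as `path` so that the free-space check targets the right disk.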

Train Example

Python

from ultralytics import YOLO

# Load a COCO-pretrained YOLO11n model
model = YOLO("yolo11n.pt")

# Train the model on the Open Images V7 dataset
results = model.train(data="open-images-v7.yaml", epochs=100, imgsz=640)

CLI

# Train a COCO-pretrained YOLO11n model on the Open Images V7 dataset
yolo detect train data=open-images-v7.yaml model=yolo11n.pt epochs=100 imgsz=640

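Sample Data and Annotations

Illustrations of the dataset help convey its richness:

Dataset sample image

  • Open Images V7: This image exemplifies the depth and detail of the available annotations, including bounding boxes, relationships, and segmentation masks.

Researchers can gain insight into the range of computer vision challenges the dataset addresses, from basic object detection to intricate relationship identification. The diversity of annotations makes Open Images V7 particularly valuable for developing models that can understand complex visual scenes.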

Citations and Acknowledgments

If you use Open Images V7 in your work, please cite the relevant paper and acknowledge the creators:

@article{OpenImages,
  author = {Alina Kuznetsova and Hassan Rom and Neil Alldrin and Jasper Uijlings and Ivan Krasin and Jordi Pont-Tuset and Shahab Kamali and Stefan Popov and Matteo Malloci and Alexander Kolesnikov and Tom Duerig and Vittorio Ferrari},
  title = {The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale},
  year = {2020},
  journal = {IJCV}
}

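Sincere thanks go to the Google AI team for creating and maintaining the Open Images V7 dataset. For a deeper look at the dataset and its offerings, visit the official Open Images V7 website.

FAQ

What is the Open Images V7 dataset?

Open Images V7 is an extensive and versatile dataset created by Google to advance research in computer vision. It includes image-level labels, object bounding boxes, object segmentation masks, visual relationships, and localized narratives, making it well suited to computer vision tasks such as object detection, segmentation, and relationship detection.

How do I train a YOLO11 model on the Open Images V7 dataset?

To train a YOLO11 model on the Open Images V7 dataset, you can use either Python or CLI commands. The following example trains a YOLO11n model for 100 epochs with an image size of 640:

Train Example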

Python

from ultralytics import YOLO

# Load a COCO-pretrained YOLO11n model
model = YOLO("yolo11n.pt")

# Train the model on the Open Images V7 dataset
results = model.train(data="open-images-v7.yaml", epochs=100, imgsz=640)

CLI

# Train a COCO-pretrained YOLO11n model on the Open Images V7 dataset
yolo detect train data=open-images-v7.yaml model=yolo11n.pt epochs=100 imgsz=640

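For more details on arguments and settings, see the Training page.

What are the key features of the Open Images V7 dataset?

The Open Images V7 dataset contains about 9 million images with a variety of annotations:

  • Bounding boxes: 16 million bounding boxes across 600 object classes.
  • Segmentation masks: Masks for 2.8 million objects across 350 classes.
  • Visual relationships: 3.3 million annotations indicating relationships, properties, and actions.
  • Localized narratives: 675,000 descriptions combining voice, text, and mouse traces.
  • Point-level labels: 66.4 million labels across 1.4 million images.
  • Image-level labels: 61.4 million labels across 20,638 classes.

Which pretrained models are available for the Open Images V7 dataset?

Ultralytics provides several YOLOv8 models pretrained on the Open Images V7 dataset, each with a different size and performance profile: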

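| Model | Size (pixels) | mAPval 50-95 | Speed CPU ONNX (ms) | Speed A100 TensorRT (ms) | Params (M) | FLOPs (B) |
|---------|-----|------|-------|------|------|-------|
| YOLOv8n | 640 | 18.4 | 142.4 | 1.21 | 3.5 | 10.5 |
| YOLOv8s | 640 | 27.7 | 183.1 | 1.40 | 11.4 | 29.7 |
| YOLOv8m | 640 | 33.6 | 408.5 | 2.26 | 26.2 | 80.6 |
| YOLOv8l | 640 | 34.9 | 596.9 | 2.43 | 44.1 | 167.4 |
| YOLOv8x | 640 | 36.3 | 860.6 | 3.56 | 68.7 | 260.6 |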

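What applications can the Open Images V7 dataset be used for?

The Open Images V7 dataset supports a variety of computer vision tasks, including:

  • Image classification
  • Object detection
  • Instance segmentation
  • Visual relationship detection
  • Multimodal image descriptions

Its comprehensive annotations and broad scope make it well suited to training and evaluating advanced machine learning models, as the practical use cases in the Applications section show.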



📅 Created 2 years ago ✏️ Updated 5 days ago
glenn-jocher, UltralyticsAssistant, RizwanMunawar, Y-T-G, MatthewNoyce
