Reference for hub_sdk/modules/datasets.py
Note
This file is available at https://github.com/ultralytics/hub-sdk/blob/main/hub_sdk/modules/datasets.py. If you spot a problem please help fix it by contributing a Pull Request 🛠️. Thank you 🙏!
hub_sdk.modules.datasets.Datasets
Datasets(dataset_id: str | None = None, headers: dict[str, Any] | None = None)
Bases: CRUDClient
Inheritance: hub_sdk.base.api_client.APIClient → hub_sdk.base.crud_client.CRUDClient → hub_sdk.modules.datasets.Datasets
A class representing a client for interacting with Datasets through CRUD operations.
This class extends the CRUDClient class and provides specific methods for working with Datasets.
Attributes:
| Name | Type | Description |
|---|---|---|
| hub_client | DatasetUpload | An instance of DatasetUpload used for interacting with dataset uploads. |
| id | str \| None | The unique identifier of the dataset, if available. |
| data | Dict | A dictionary to store dataset data. |
Notes
The 'id' attribute is set during initialization and can be used to uniquely identify a dataset. The 'data' attribute is used to store dataset data fetched from the API.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset_id | str | Unique id of the dataset. | None |
| headers | Dict | Headers to include in HTTP requests. | None |
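The relationship between `id` and `data` can be sketched as follows. This is a minimal illustration rather than hub_sdk code: `summarize` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that mimic only the documented `id` and `data` attributes so the snippet runs offline. With the SDK installed, a `Datasets` instance would take their place.

```python
from types import SimpleNamespace


def summarize(dataset) -> str:
    """Describe a Datasets-like object using only its documented `id` and `data` attributes."""
    if dataset.id is None:
        return "no dataset ID set"
    # `data` is documented as a dictionary holding whatever the API returned.
    return f"dataset {dataset.id}: {len(dataset.data)} field(s) loaded"


# Stand-in objects (assumption: these are not real hub_sdk instances).
empty = SimpleNamespace(id=None, data={})
loaded = SimpleNamespace(id="abc123", data={"name": "demo"})

print(summarize(empty))
print(summarize(loaded))
```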
create_dataset
create_dataset(dataset_data: dict) -> None
Create a new dataset with the provided data and set the dataset ID for the current instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset_data | Dict | A dictionary containing the data for creating the dataset. | required |
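Because `create_dataset` returns `None` and records the new ID on the instance, callers read `id` afterwards. The sketch below illustrates that pattern under stated assumptions: `StandInDataset` is an offline stand-in rather than the real class, `create_and_get_id` is a hypothetical helper, and the payload keys are invented for illustration since this reference does not list the fields the API expects.

```python
class StandInDataset:
    """Offline stand-in mimicking the documented create_dataset behaviour (not hub_sdk code)."""

    def __init__(self):
        self.id = None

    def create_dataset(self, dataset_data: dict) -> None:
        # The real method sends the data to the API; this one fakes the assigned ID.
        self.id = "generated-id"


def create_and_get_id(dataset, dataset_data: dict):
    """Create a dataset and return the ID assigned to the instance (None if none was set)."""
    dataset.create_dataset(dataset_data)  # returns None by design
    return dataset.id


# Hypothetical payload, for illustration only.
new_id = create_and_get_id(StandInDataset(), {"name": "my dataset"})
print(new_id)
```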
delete
delete(hard: bool = False) -> Response | None
Delete the dataset resource represented by this instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| hard | bool | If True, perform a hard delete. | False |
Returns:
| Type | Description |
|---|---|
| Optional[Response] | Response object from the delete request, or None if delete fails. |
Notes
The 'hard' parameter determines whether to perform a soft delete (default) or a hard delete. In a soft delete, the dataset might be marked as deleted but retained in the system. In a hard delete, the dataset is permanently removed from the system.
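The soft/hard distinction and the `None`-on-failure return value can be handled as in this sketch. It assumes nothing beyond the documented signature `delete(hard=False)`. `delete_dataset` is a hypothetical wrapper, and `StandInDataset` is an offline stand-in whose return value merely represents a response object.

```python
class StandInDataset:
    """Offline stand-in for a Datasets instance (assumption: not hub_sdk code)."""

    def __init__(self, succeed=True):
        self.succeed = succeed
        self.hard_flags = []  # records the `hard` value of each call

    def delete(self, hard: bool = False):
        self.hard_flags.append(hard)
        # A plain object stands in for the Response; None signals failure.
        return object() if self.succeed else None


def delete_dataset(dataset, permanent: bool = False) -> bool:
    """Return True if the delete call produced a response, False if it returned None."""
    response = dataset.delete(hard=permanent)
    return response is not None


ds = StandInDataset()
soft_ok = delete_dataset(ds)                  # soft delete (default)
hard_ok = delete_dataset(ds, permanent=True)  # hard delete
failed = delete_dataset(StandInDataset(succeed=False))
print(soft_ok, hard_ok, failed)
```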
get_data
get_data() -> None
Retrieve data for the current dataset instance.
If a valid dataset ID has been set, it sends a request to fetch the dataset data and stores it in the instance. If no dataset ID has been set, it logs an error message.
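Since `get_data` returns `None` and only logs an error when no ID is set, a caller that wants a hard failure can check `id` first. The following is a sketch under that assumption. `load_data` is a hypothetical helper, and `StandInDataset` fakes the API response so the example runs offline.

```python
class StandInDataset:
    """Offline stand-in for a Datasets instance (assumption: not hub_sdk code)."""

    def __init__(self, dataset_id=None):
        self.id = dataset_id
        self.data = {}

    def get_data(self) -> None:
        # The real method requests the dataset from the API; this one fakes the payload.
        if self.id is not None:
            self.data = {"id": self.id}


def load_data(dataset) -> dict:
    """Fetch and return dataset data, failing early when no ID is set."""
    if dataset.id is None:
        # The documented method only logs an error here; raising makes the problem visible.
        raise ValueError("no dataset ID set")
    dataset.get_data()  # returns None and stores the result on the instance
    return dataset.data


loaded = load_data(StandInDataset("abc123"))
print(loaded)
```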
get_download_link
get_download_link() -> str | None
Get dataset download link.
Returns:
| Type | Description |
|---|---|
| Optional[str] | The download link, or None if the link is not available. |
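A caller usually needs to handle the `None` case before using the link. The sketch below shows one way to do that. `require_download_link` is a hypothetical helper, `StandInDataset` is an offline stand-in, and the URL is a placeholder rather than a real download address.

```python
class StandInDataset:
    """Offline stand-in exposing only get_download_link (assumption: not hub_sdk code)."""

    def __init__(self, link=None):
        self._link = link

    def get_download_link(self):
        return self._link


def require_download_link(dataset) -> str:
    """Return the download link, raising LookupError when none is available."""
    link = dataset.get_download_link()
    if link is None:
        raise LookupError("download link is not available")
    return link


# Placeholder URL for illustration only.
link = require_download_link(StandInDataset("https://example.com/dataset.zip"))
print(link)
```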
update
update(data: dict) -> Response | None
Update the dataset resource represented by this instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | Dict | The updated data for the dataset resource. | required |
Returns:
| Type | Description |
|---|---|
| Optional[Response] | Response object from the update request, or None if update fails. |
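An update call can be wrapped so that a `None` return is reported as a failure, as in this sketch. `apply_update` is a hypothetical helper, `StandInDataset` is an offline stand-in, and the keys in the update dictionary are invented for illustration because this reference does not specify which fields the API accepts.

```python
class StandInDataset:
    """Offline stand-in for a Datasets instance (assumption: not hub_sdk code)."""

    def __init__(self):
        self.last_update = None

    def update(self, data: dict):
        self.last_update = data
        return object()  # stands in for a Response


def apply_update(dataset, changes: dict) -> bool:
    """Send the changes and report whether a response came back."""
    response = dataset.update(changes)
    return response is not None


ds = StandInDataset()
ok = apply_update(ds, {"name": "renamed"})  # hypothetical field name
print(ok)
```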
upload_dataset
upload_dataset(file: str | None = None) -> Response | None
Upload a dataset file to the hub.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| file | str | The path to the dataset file to upload. | None |
Returns:
| Type | Description |
|---|---|
| Optional[Response] | Response object from the upload request, or None if upload fails. |
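Because `file` is a path, checking that it exists before calling `upload_dataset` avoids a wasted request. The sketch below illustrates this under stated assumptions: `upload_if_present` is a hypothetical helper, `StandInDataset` is an offline stand-in, and the `.zip` file name is only an example, since the accepted file format is not stated in this reference.

```python
import tempfile
from pathlib import Path


class StandInDataset:
    """Offline stand-in for a Datasets instance (assumption: not hub_sdk code)."""

    def upload_dataset(self, file=None):
        self.uploaded = file
        return object()  # stands in for a Response


def upload_if_present(dataset, file: str) -> bool:
    """Upload only when the path points to an existing file; report success as a bool."""
    if not Path(file).is_file():
        return False
    return dataset.upload_dataset(file=file) is not None


with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "dataset.zip"
    archive.write_bytes(b"placeholder")
    uploaded = upload_if_present(StandInDataset(), str(archive))
    missing = upload_if_present(StandInDataset(), str(Path(tmp) / "absent.zip"))

print(uploaded, missing)
```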
hub_sdk.modules.datasets.DatasetList
DatasetList(page_size=None, public=None, headers=None)
Bases: PaginatedList
Inheritance: hub_sdk.base.api_client.APIClient → hub_sdk.base.paginated_list.PaginatedList → hub_sdk.modules.datasets.DatasetList
A class for managing a paginated list of datasets from the Ultralytics Hub API.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| page_size | int | The number of items to request per page. | None |
| public | bool | Whether the items should be publicly accessible. | None |
| headers | Dict | Headers to be included in API requests. | None |
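Reading one page of a dataset list might look like the sketch below. It rests on two assumptions not confirmed by this page: that the list object exposes the current page as a `results` attribute (inherited from `PaginatedList`), and that each item is a dictionary with an `"id"` key. `dataset_ids` is a hypothetical helper, and the `SimpleNamespace` object is an offline stand-in for a `DatasetList` instance.

```python
from types import SimpleNamespace


def dataset_ids(dataset_list) -> list:
    """Collect the IDs on the current page (assumes `results` holds dicts with an "id" key)."""
    return [item.get("id") for item in dataset_list.results]


# Stand-in page (assumption: not a real DatasetList).
page = SimpleNamespace(results=[{"id": "a1"}, {"id": "b2"}])
ids = dataset_ids(page)
print(ids)
```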