Reference for hub_sdk/modules/datasets.py
Note
This file is available at https://github.com/ultralytics/hub-sdk/blob/main/hub_sdk/modules/datasets.py. If you spot a problem please help fix it by contributing a Pull Request 🛠️. Thank you 🙏!
hub_sdk.modules.datasets.Datasets
Bases: CRUDClient
A class representing a client for interacting with Datasets through CRUD operations. This class extends the CRUDClient class and provides specific methods for working with Datasets.
Attributes:
Name | Type | Description |
---|---|---|
`hub_client` | `DatasetUpload` | An instance of DatasetUpload used for interacting with dataset uploads. |
`id` | `(str, None)` | The unique identifier of the dataset, if available. |
`data` | `dict` | A dictionary to store dataset data. |
Note
The 'id' attribute is set during initialization and can be used to uniquely identify a dataset. The 'data' attribute is used to store dataset data fetched from the API.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`dataset_id` | `str` | Unique id of the dataset. | `None` |
`headers` | `dict` | Headers to include in HTTP requests. | `None` |
create_dataset
Creates a new dataset with the provided data and sets the dataset ID for the current instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`dataset_data` | `dict` | A dictionary containing the data for creating the dataset. | required |
Returns:
Type | Description |
---|---|
`None` | The method does not return a value. |
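The description above (create, then store the new ID on the instance) can be sketched roughly as follows. This is a minimal illustration and not the hub_sdk implementation: `CreateSketch`, `fake_create`, and the `"id"` key in the response are hypothetical names assumed for the example.

```python
from typing import Callable, Optional


class CreateSketch:
    """Hypothetical stand-in for a dataset client; not the hub_sdk class."""

    def __init__(self, create: Callable[[dict], dict]) -> None:
        self.id: Optional[str] = None
        self.data: dict = {}
        self._create = create  # assumed backend call that returns the created record

    def create_dataset(self, dataset_data: dict) -> None:
        # Send the payload, then remember the ID the backend assigned.
        record = self._create(dataset_data)
        self.id = record["id"]  # the "id" key is an assumption of this sketch
        self.data = record


def fake_create(payload: dict) -> dict:
    # Pretend backend: echoes the payload and assigns a fixed ID.
    return {"id": "ds-001", **payload}


dataset = CreateSketch(create=fake_create)
dataset.create_dataset({"meta": {"name": "my-dataset"}})
```

As in the documented method, `create_dataset` returns nothing; the effect is visible through `dataset.id` and `dataset.data`.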
delete
Delete the dataset resource represented by this instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`hard` | `bool` | If True, perform a hard delete. | `False` |
Note
The 'hard' parameter determines whether to perform a soft delete (default) or a hard delete. In a soft delete, the dataset might be marked as deleted but retained in the system. In a hard delete, the dataset is permanently removed from the system.
Returns:
Type | Description |
---|---|
`Optional[Response]` | Response object from the delete request, or None if delete fails. |
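To make the soft versus hard distinction concrete, here is a small sketch using an in-memory store. `InMemoryStore` and the `"deleted"` flag are hypothetical and only model the behavior described in the note; how the Hub backend actually tracks soft-deleted datasets is not stated here.

```python
from typing import Dict, Optional


class InMemoryStore:
    """Hypothetical backend used only to illustrate the two delete modes."""

    def __init__(self) -> None:
        self.records: Dict[str, dict] = {}

    def delete(self, record_id: str, hard: bool = False) -> Optional[dict]:
        record = self.records.get(record_id)
        if record is None:
            return None  # mirrors "None if delete fails"
        if hard:
            del self.records[record_id]  # hard delete: the record is gone
        else:
            record["deleted"] = True  # soft delete: flagged but retained
        return {"id": record_id, "hard": hard}


store = InMemoryStore()
store.records["a"] = {"name": "first"}
store.records["b"] = {"name": "second"}

soft = store.delete("a")             # default is a soft delete
hard = store.delete("b", hard=True)  # explicit hard delete
missing = store.delete("unknown")    # nothing to delete
```

After these calls, record `"a"` is still present with its flag set, record `"b"` no longer exists, and `missing` is `None`.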
get_data
Retrieves data for the current dataset instance.
If a valid dataset ID has been set, it sends a request to fetch the dataset data and stores it in the instance. If no dataset ID has been set, it logs an error message.
Returns:
Type | Description |
---|---|
`None` | The method does not return a value. |
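The guard described above (fetch only when an ID is set, otherwise log an error) might look roughly like this. `FetchSketch`, `fake_fetch`, and the log message are hypothetical; the sketch only mirrors the documented control flow.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("dataset_sketch")


class FetchSketch:
    """Hypothetical stand-in for a dataset client; not the hub_sdk class."""

    def __init__(self, dataset_id: Optional[str], fetch: Callable[[str], dict]) -> None:
        self.id = dataset_id
        self.data: dict = {}
        self._fetch = fetch  # assumed backend call keyed by dataset ID

    def get_data(self) -> None:
        if not self.id:
            # No ID: report the problem and leave self.data untouched.
            logger.error("No dataset id has been set. Unable to fetch dataset data.")
            return
        self.data = self._fetch(self.id)


def fake_fetch(dataset_id: str) -> dict:
    # Pretend backend response.
    return {"id": dataset_id, "meta": {"name": "demo"}}


with_id = FetchSketch("abc123", fetch=fake_fetch)
with_id.get_data()

without_id = FetchSketch(None, fetch=fake_fetch)
without_id.get_data()
```

Because the method returns `None` in both cases, callers need to inspect `data` (or the log) to tell whether the fetch happened.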
get_download_link
Get dataset download link.
Returns:
Type | Description |
---|---|
`Optional[str]` | The download link, or None if the link is not available. |
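Since the return type is `Optional[str]`, callers should handle the `None` case before using the link. A minimal sketch, assuming only the documented `get_download_link()` signature; `resolve_download_url`, `StubDataset`, and the example URL are hypothetical.

```python
from typing import Optional, Protocol


class HasDownloadLink(Protocol):
    """Any object exposing get_download_link() with the documented return type."""

    def get_download_link(self) -> Optional[str]: ...


def resolve_download_url(dataset: HasDownloadLink) -> str:
    # Turn the "None means unavailable" convention into an explicit error.
    link = dataset.get_download_link()
    if link is None:
        raise LookupError("No download link is available for this dataset.")
    return link


class StubDataset:
    """Hypothetical stub standing in for a real dataset object."""

    def __init__(self, link: Optional[str]) -> None:
        self._link = link

    def get_download_link(self) -> Optional[str]:
        return self._link


ready = resolve_download_url(StubDataset("https://example.com/dataset.zip"))

try:
    resolve_download_url(StubDataset(None))
    missing_handled = False
except LookupError:
    missing_handled = True
```

Raising is one design choice among several; returning a default or skipping the dataset would work equally well depending on the caller.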
update
Update the dataset resource represented by this instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`data` | `dict` | The updated data for the dataset resource. | required |
Returns:
Type | Description |
---|---|
`Optional[Response]` | Response object from the update request, or None if update fails. |
upload_dataset
Uploads a dataset file to the hub.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`file` | `str` | The path to the dataset file to upload. | `None` |
Returns:
Type | Description |
---|---|
`Optional[Response]` | Response object from the upload request, or None if upload fails. |
hub_sdk.modules.datasets.DatasetList
Bases: PaginatedList
A class for managing a paginated list of datasets from the Ultralytics Hub API.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`page_size` | `int` | The number of items to request per page. | `None` |
`public` | `bool` | Whether the items should be publicly accessible. | `None` |
`headers` | `dict` | Headers to be included in API requests. | `None` |
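To illustrate how `page_size` and `public` could shape the results, the sketch below pages through a local list. It is a generic illustration, not the `PaginatedList` implementation: the `paginate` helper, the sample records, and the reading of `public` as a filter on a `"public"` field are all assumptions.

```python
from typing import Iterator, List, Sequence


def paginate(items: Sequence[dict], page_size: int) -> Iterator[List[dict]]:
    """Yield consecutive pages of at most page_size items (hypothetical helper)."""
    if page_size <= 0:
        raise ValueError("page_size must be a positive integer")
    for start in range(0, len(items), page_size):
        yield list(items[start:start + page_size])


# Hypothetical records; even-numbered ones are marked public.
datasets = [{"id": f"ds{i}", "public": i % 2 == 0} for i in range(5)]

# Assumed meaning of public=True: keep only publicly accessible items.
public_only = [d for d in datasets if d["public"]]

pages = list(paginate(public_only, page_size=2))
first_page_ids = [d["id"] for d in pages[0]]
```

With three public records and a page size of two, the result is one full page followed by a partial page.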