Reference for hub_sdk/modules/datasets.py
Note
This file is available at https://github.com/ultralytics/hub-sdk/blob/main/hub_sdk/modules/datasets.py. If you spot a problem please help fix it by contributing a Pull Request 🛠️. Thank you 🙏!
hub_sdk.modules.datasets.Datasets
Bases: CRUDClient
A client for interacting with Datasets through CRUD operations. It extends CRUDClient with methods specific to Datasets.
Attributes:
Name | Type | Description |
---|---|---|
hub_client | DatasetUpload | An instance of DatasetUpload used for interacting with dataset uploads. |
id | (str, None) | The unique identifier of the dataset, if available. |
data | dict | A dictionary to store dataset data. |
Note
The 'id' attribute is set during initialization and can be used to uniquely identify a dataset. The 'data' attribute is used to store dataset data fetched from the API.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset_id | str | Unique id of the dataset. | None |
headers | dict | Headers to include in HTTP requests. | None |
create_dataset
Creates a new dataset with the provided data and sets the dataset ID for the current instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset_data | dict | A dictionary containing the data for creating the dataset. | required |
Returns:
Type | Description |
---|---|
None | The method does not return a value. |
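Because create_dataset returns None, the new identifier has to be read from the instance afterwards. The following is a minimal sketch of that pattern, written against any object that exposes the documented create_dataset method and id attribute. The helper name create_and_get_id is an illustrative assumption and not part of hub_sdk, and the keys expected in dataset_data are not specified in this reference.

```python
from typing import Any, Dict, Optional


def create_and_get_id(dataset: Any, dataset_data: Dict[str, Any]) -> Optional[str]:
    """Create a dataset and return the ID stored on the instance.

    `dataset` is assumed to be a Datasets-like object. This helper is
    hypothetical and not part of hub_sdk.
    """
    # create_dataset is documented to return None, so its result is ignored.
    dataset.create_dataset(dataset_data)
    # The ID is documented as being set on the instance during creation.
    return getattr(dataset, "id", None)
```

If the returned value is still None after the call, the creation request most likely did not succeed.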
delete
Delete the dataset resource represented by this instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
hard | bool | If True, perform a hard delete. | False |
Note
The 'hard' parameter determines whether to perform a soft delete (default) or a hard delete. In a soft delete, the dataset might be marked as deleted but retained in the system. In a hard delete, the dataset is permanently removed from the system.
Returns:
Type | Description |
---|---|
Optional[Response] | Response object from the delete request, or None if delete fails. |
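One way to wrap the soft/hard choice and the Optional[Response] return value is sketched below. It assumes only the documented delete(hard=...) signature. The helper name delete_dataset and its permanent parameter are hypothetical and not part of hub_sdk.

```python
from typing import Any


def delete_dataset(dataset: Any, permanent: bool = False) -> bool:
    """Delete a dataset and report whether the request produced a response.

    `dataset` is assumed to be a Datasets-like object exposing delete(hard=...).
    This helper is hypothetical and not part of hub_sdk.
    """
    # hard=False (the documented default) requests a soft delete.
    response = dataset.delete(hard=permanent)
    # Documented contract: None means the delete failed.
    return response is not None
```

The explicit `is not None` comparison is deliberate. If Response is a requests.Response (an assumption here), its truthiness reflects the HTTP status rather than whether an object was returned.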
get_data
Retrieves data for the current dataset instance.
If a valid dataset ID has been set, it sends a request to fetch the dataset data and stores it in the instance. If no dataset ID has been set, it logs an error message.
Returns:
Type | Description |
---|---|
None | The method does not return a value. |
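The control flow described above can be sketched with a stand-in class. This is not the hub_sdk implementation: DatasetSketch and its injected fetch callable are assumptions made so the example runs without network access, and the log message text is illustrative.

```python
import logging
from typing import Any, Callable, Dict, Optional

logger = logging.getLogger(__name__)


class DatasetSketch:
    """Stand-in that mirrors the documented get_data flow; not the hub_sdk class."""

    def __init__(self, dataset_id: Optional[str], fetch: Callable[[str], Dict[str, Any]]):
        self.id = dataset_id
        self.data: Dict[str, Any] = {}
        # `fetch` is a hypothetical callable standing in for the HTTP request.
        self._fetch = fetch

    def get_data(self) -> None:
        if not self.id:
            # No ID set: log an error and leave `data` untouched.
            logger.error("No dataset id has been set. Unable to fetch dataset data.")
            return
        # ID set: fetch the record and store it on the instance.
        self.data = self._fetch(self.id)
```

Since the method returns None in both branches, callers inspect the data attribute to see whether anything was fetched.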
get_download_link
Get dataset download link.
Returns:
Type | Description |
---|---|
Optional[str] | The download link, or None if no link is available. |
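Callers need to handle the None case before using the link. A minimal sketch follows, assuming only the documented get_download_link method. The helper describe_download and its message strings are hypothetical.

```python
from typing import Any


def describe_download(dataset: Any) -> str:
    """Return a printable message for a Datasets-like object's download link.

    Hypothetical helper; only get_download_link() comes from the documented API.
    """
    link = dataset.get_download_link()
    if link is None:
        # Documented fallback: None when no link is available.
        return "No download link available."
    return f"Download from: {link}"
```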
update
Update the dataset resource represented by this instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | The updated data for the dataset resource. | required |
Returns:
Type | Description |
---|---|
Optional[Response] | Response object from the update request, or None if update fails. |
upload_dataset
Uploads a dataset file to the hub.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file | str | The path to the dataset file to upload. | None |
Returns:
Type | Description |
---|---|
Optional[Response] | Response object from the upload request, or None if upload fails. |
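A small guard around upload_dataset can catch a missing file before any request is made. The sketch below assumes only the documented upload_dataset(file) method and its Optional[Response] return value. The helper upload_if_present is hypothetical and not part of hub_sdk.

```python
from pathlib import Path
from typing import Any


def upload_if_present(dataset: Any, file: str) -> bool:
    """Upload `file` through a Datasets-like object if the path exists.

    Hypothetical helper; only upload_dataset(file) comes from the documented API.
    """
    if not Path(file).is_file():
        # Fail early instead of sending a request for a missing file.
        return False
    response = dataset.upload_dataset(file)
    # Documented contract: None signals that the upload failed.
    return response is not None
```

Returning a plain bool keeps the caller independent of the Response type. A caller that needs status details would work with the response object directly instead.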
hub_sdk.modules.datasets.DatasetList
Bases: PaginatedList
A class for managing a paginated list of datasets from the Ultralytics Hub API.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
page_size | int | The number of items to request per page. | None |
public | bool | Whether the items should be publicly accessible. | None |
headers | dict | Headers to be included in API requests. | None |