Reference for hub_sdk/modules/datasets.py
Note
This file is available at https://github.com/ultralytics/hub-sdk/blob/main/hub_sdk/modules/datasets.py. If you spot a problem please help fix it by contributing a Pull Request 🛠️. Thank you 🙏!
hub_sdk.modules.datasets.Datasets
Bases: CRUDClient
A class representing a client for interacting with Datasets through CRUD operations. This class extends the CRUDClient class and provides specific methods for working with Datasets.
Attributes:
Name | Type | Description |
---|---|---|
hub_client | DatasetUpload | An instance of DatasetUpload used for interacting with dataset uploads. |
id | (str, None) | The unique identifier of the dataset, if available. |
data | dict | A dictionary to store dataset data. |
Note
The 'id' attribute is set during initialization and can be used to uniquely identify a dataset. The 'data' attribute is used to store dataset data fetched from the API.
__init__(dataset_id=None, headers=None)
Initialize a Datasets client.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset_id | str | Unique id of the dataset. | None |
headers | dict | Headers to include in HTTP requests. | None |
create_dataset(dataset_data)
Creates a new dataset with the provided data and sets the dataset ID for the current instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset_data | dict | A dictionary containing the data for creating the dataset. | required |
Returns:
Type | Description |
---|---|
None | The method does not return a value. |
delete(hard=False)
Delete the dataset resource represented by this instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
hard | bool | If True, perform a hard delete. | False |
Note
The 'hard' parameter determines whether to perform a soft delete (default) or a hard delete. In a soft delete, the dataset might be marked as deleted but retained in the system. In a hard delete, the dataset is permanently removed from the system.
Returns:
Type | Description |
---|---|
Optional[Response] | Response object from the delete request, or None if delete fails. |
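The difference between a soft and a hard delete described in the note above can be illustrated with a small in-memory sketch. The `InMemoryDatasetStore` class and its `deleted` flag are hypothetical stand-ins and are not part of hub_sdk; the actual server-side behavior is not documented here beyond the note.

```python
from typing import Dict, Optional


class InMemoryDatasetStore:
    """Hypothetical stand-in for the server side; not part of hub_sdk."""

    def __init__(self) -> None:
        self.records: Dict[str, dict] = {}

    def delete(self, dataset_id: str, hard: bool = False) -> Optional[dict]:
        record = self.records.get(dataset_id)
        if record is None:
            # Mirrors the documented "None if delete fails" return value.
            return None
        if hard:
            # Hard delete: the record is removed from the store entirely.
            return self.records.pop(dataset_id)
        # Soft delete: the record is only flagged and stays in the store.
        record["deleted"] = True
        return record


store = InMemoryDatasetStore()
store.records["a"] = {"name": "first", "deleted": False}
store.records["b"] = {"name": "second", "deleted": False}

soft_result = store.delete("a")             # soft delete (the default)
hard_result = store.delete("b", hard=True)  # hard delete
missing_result = store.delete("missing")    # nothing to delete
```

After this runs, record `a` is still present but flagged, while record `b` is gone from the store.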
get_data()
Retrieves data for the current dataset instance.
If a valid dataset ID has been set, it sends a request to fetch the dataset data and stores it in the instance. If no dataset ID has been set, it logs an error message.
Returns:
Type | Description |
---|---|
None | The method does not return a value. |
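The control flow described for `get_data()` (fetch and store when an ID is set, log an error otherwise) might look roughly like the sketch below. `DatasetSketch`, `fake_fetch`, and the log message are hypothetical; the real class sends an HTTP request, which is replaced here by an injected callable so the example runs offline.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("datasets_sketch")


class DatasetSketch:
    """Hypothetical stand-in that mirrors the documented get_data() behavior."""

    def __init__(self, dataset_id: Optional[str], fetch: Callable[[str], dict]) -> None:
        self.id = dataset_id
        self.data: dict = {}
        self._fetch = fetch  # assumed callable standing in for the HTTP request

    def get_data(self) -> None:
        if not self.id:
            # No ID set: log an error and leave self.data untouched.
            logger.error("No dataset id has been set. Unable to fetch dataset data.")
            return
        # ID set: fetch the dataset data and store it on the instance.
        self.data = self._fetch(self.id)


def fake_fetch(dataset_id: str) -> dict:
    """Hypothetical replacement for the API call."""
    return {"id": dataset_id, "name": "example"}


with_id = DatasetSketch("abc123", fake_fetch)
with_id.get_data()

without_id = DatasetSketch(None, fake_fetch)
without_id.get_data()
```

Because the method returns nothing in either case, callers need to inspect the `data` attribute to tell whether the fetch happened.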
get_download_link()
Get dataset download link.
Returns:
Type | Description |
---|---|
Optional[str] | The download link, or None if the link is not available. |
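Since `get_download_link()` may return None, callers should check the result before using it. The following is a minimal sketch of that check; `describe_download` and `StubDataset` are hypothetical helpers written for this example, and any object exposing a `get_download_link()` method with the documented return type would work in their place.

```python
from typing import Optional, Protocol


class HasDownloadLink(Protocol):
    """Structural type for anything exposing get_download_link()."""

    def get_download_link(self) -> Optional[str]: ...


def describe_download(dataset: HasDownloadLink) -> str:
    """Return a human-readable message, handling the None case explicitly."""
    link = dataset.get_download_link()
    if link is None:
        return "download link not available"
    return f"download from {link}"


class StubDataset:
    """Hypothetical stub used only to exercise the helper offline."""

    def __init__(self, link: Optional[str]) -> None:
        self._link = link

    def get_download_link(self) -> Optional[str]:
        return self._link


available = describe_download(StubDataset("https://example.com/data.zip"))
unavailable = describe_download(StubDataset(None))
```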
update(data)
Update the dataset resource represented by this instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | dict | The updated data for the dataset resource. | required |
Returns:
Type | Description |
---|---|
Optional[Response] | Response object from the update request, or None if update fails. |
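`update()`, `delete()`, and `upload_dataset()` all return `Optional[Response]`, so one small helper can cover the None check for each of them. The sketch below assumes the response object exposes an integer `status_code` attribute, as `requests.Response` does; this page does not state which `Response` type is used, so treat that as an assumption. `request_succeeded` is a hypothetical helper, and `SimpleNamespace` objects stand in for real responses.

```python
from types import SimpleNamespace
from typing import Any, Optional


def request_succeeded(response: Optional[Any]) -> bool:
    """Return True only for a non-None response with a 2xx status code.

    Assumes the response has a `status_code` attribute (an assumption,
    modeled on requests.Response).
    """
    if response is None:
        # The documented failure case: the method returned None.
        return False
    return 200 <= response.status_code < 300


# Stand-ins for responses returned by update(), delete(), or upload_dataset().
ok_response = SimpleNamespace(status_code=200)
not_found_response = SimpleNamespace(status_code=404)

results = [
    request_succeeded(ok_response),
    request_succeeded(not_found_response),
    request_succeeded(None),
]
```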
upload_dataset(file=None)
Uploads a dataset file to the hub.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
file | str | The path to the dataset file to upload. | None |
Returns:
Type | Description |
---|---|
Optional[Response] | Response object from the upload request, or None if upload fails. |
hub_sdk.modules.datasets.DatasetList
Bases: PaginatedList
__init__(page_size=None, public=None, headers=None)
Initialize a DatasetList instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
page_size | int | The number of items to request per page. | None |
public | bool | Whether the items should be publicly accessible. | None |
headers | dict | Headers to be included in API requests. | None |
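To give a feel for what `page_size` controls, here is a generic pagination sketch over a plain list. It does not reproduce how `PaginatedList` works internally, which this page does not describe; `paginate` and the dataset names are hypothetical.

```python
from typing import List, Sequence


def paginate(items: Sequence[str], page_size: int) -> List[List[str]]:
    """Split items into consecutive pages of at most page_size entries."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return [list(items[i:i + page_size]) for i in range(0, len(items), page_size)]


# Five hypothetical dataset names split into pages of two.
dataset_names = ["d1", "d2", "d3", "d4", "d5"]
pages = paginate(dataset_names, page_size=2)
```

With five items and a page size of two, the last page holds the single leftover item.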