Implement workspace-level search (DATAMAN-163) #441
digaobarbosa wants to merge 4 commits into main
tests/manual/demo_workspace_search.py
Dismissed
count += 1
print(f"Page {count}: {len(page_results)} results")
for img in page_results:
    print(f" - {img.get('filename', 'N/A')}")
Check failure
Code scanning / CodeQL
Clear-text logging of sensitive information High test
Copilot Autofix
AI 1 day ago
General approach: Ensure that no method that is likely to be logged (__str__/__repr__ or demo printouts) exposes secrets like api_key. We do not need to change how the key is used in HTTP requests; only remove or mask it from any string/log representations.
Best single fix without changing functionality:
The direct, real leak is Roboflow.__str__ in roboflow/__init__.py, which currently returns a JSON blob containing the raw api_key. That method can be called implicitly by print(rf) or loggers, thus logging the key in clear text. We should change __str__ so that it no longer includes the API key, or at minimum masks it. To avoid altering behavior elsewhere, we’ll keep the same structure but replace the key with a masked version (e.g., show last 4 characters only). This preserves usefulness for debugging while preventing secret exposure.
Concretely:
- In roboflow/__init__.py, update Roboflow.__str__ (lines 272–275 in the snippet) so that:
  - It computes a masked_api_key string that does not reveal the full key (e.g., "****" if empty, or "****" + last_4_chars).
  - It sets json_value = {"api_key": masked_api_key, "workspace": self.workspace} instead of using self.api_key directly.
- Leave the rest of the class unchanged; no change to how HTTP requests are made.
- The other printed value in tests/manual/demo_workspace_search.py (filename) does not involve the API key and can remain as-is, so no change is necessary there for secrecy.
No new imports or helper methods are needed; masking logic can be implemented inline in __str__.
@@ -271,5 +271,11 @@

     def __str__(self):
         """to string function"""
-        json_value = {"api_key": self.api_key, "workspace": self.workspace}
+        # Avoid exposing the full API key when this object is printed or logged.
+        api_key = self.api_key or ""
+        if len(api_key) > 4:
+            masked_api_key = ("*" * (len(api_key) - 4)) + api_key[-4:]
+        else:
+            masked_api_key = "*" * len(api_key)
+        json_value = {"api_key": masked_api_key, "workspace": self.workspace}
         return json.dumps(json_value, indent=2)
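The masking rule in the suggested patch can be exercised on its own. The sketch below lifts that logic into a standalone function; mask_api_key is a hypothetical name used only for illustration (the patch keeps the logic inline in __str__), and the key and workspace values are made up. Note that, as written, this logic turns an empty or missing key into an empty string rather than "****".

```python
import json


def mask_api_key(api_key):
    """Mask all but the last four characters, mirroring the suggested patch.

    Hypothetical helper: the patch itself does this inline in __str__.
    """
    api_key = api_key or ""  # treat None as an empty key
    if len(api_key) > 4:
        return "*" * (len(api_key) - 4) + api_key[-4:]
    # Keys of four characters or fewer are masked entirely.
    return "*" * len(api_key)


# Made-up values, for illustration only.
json_value = {"api_key": mask_api_key("abcd1234efgh"), "workspace": "demo"}
print(json.dumps(json_value, indent=2))
```

With a 12-character key, the printed api_key field is eight asterisks followed by the last four characters.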
It's just for local run
d0716d5 to aade1e0
Description
Adds workspace-level search functionality to expose the existing API endpoint through the Python SDK. Supports RoboQL queries (e.g. tag:review, project:false for orphan images) and semantic CLIP search, with automatic pagination via continuationToken.
Changes
- /{workspace}/search/v1
- results, total, and continuationToken for single-page queries
- ["tags", "projects", "filename"] (lightweight defaults)
Type of change
How has this change been tested?
- python -m unittest tests.test_workspace_search -v
  - test_search_basic: response structure and payload
  - test_search_with_continuation_token: token forwarding
  - test_search_custom_fields: fields parameter
  - test_search_api_error: error handling
  - test_search_all_single_page: single page results
  - test_search_all_multiple_pages: multi-page pagination
  - test_search_all_empty_results: empty result handling
- make check_code_quality ✓ (ruff format, ruff check, mypy)
Will the change affect Universe?
No. This is a pure SDK enhancement with no impact on the platform.
Deployment considerations
None. This is a Python package update only.
Docs
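As a rough illustration of the continuationToken pagination mentioned in the Description, here is a minimal sketch. It does not use the SDK: iter_pages and fake_fetch are hypothetical names, and the fake pages stand in for the search endpoint. Only the response keys results, total, and continuationToken are taken from the Description; the stopping rule (stop on an empty page or a missing token) is an assumption.

```python
from typing import Callable, Dict, Iterator, List, Optional


def iter_pages(fetch_page: Callable[[Optional[str]], Dict]) -> Iterator[List[dict]]:
    """Yield one list of results per page until no continuationToken is returned."""
    token: Optional[str] = None
    while True:
        response = fetch_page(token)
        results = response.get("results", [])
        if not results:
            return  # assumed: an empty page ends the iteration
        yield results
        token = response.get("continuationToken")
        if not token:
            return  # assumed: a missing token means this was the last page


# Stand-in for the search endpoint: two fake pages keyed by the incoming token.
_FAKE_PAGES = {
    None: {
        "results": [{"filename": "a.jpg"}, {"filename": "b.jpg"}],
        "total": 3,
        "continuationToken": "t1",
    },
    "t1": {"results": [{"filename": "c.jpg"}], "total": 3, "continuationToken": None},
}


def fake_fetch(token: Optional[str]) -> Dict:
    """Hypothetical fetcher; a real one would call the search endpoint."""
    return _FAKE_PAGES[token]


for count, page_results in enumerate(iter_pages(fake_fetch), start=1):
    print(f"Page {count}: {len(page_results)} results")
```

Taking the fetcher as a callable keeps the paging loop independent of how requests are made, which also makes it easy to drive with canned pages in unit tests.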