
Implement workspace-level search (DATAMAN-163) #441

Open
digaobarbosa wants to merge 4 commits into main from feat/DATAMAN-163-workspace-search

@digaobarbosa
Contributor

Description

Adds workspace-level search functionality to expose the existing API endpoint through the Python SDK. Supports RoboQL queries (e.g. tag:review, project:false for orphan images) and semantic CLIP search, with automatic pagination via continuationToken.

Changes

  • rfapi.workspace_search(): New adapter function that POSTs to /{workspace}/search/v1
  • Workspace.search(): Returns full response dict with results, total, and continuationToken for single-page queries
  • Workspace.search_all(): Generator that yields pages of results with automatic pagination
  • Default fields: ["tags", "projects", "filename"] (lightweight defaults)
  • 7 unit tests with 100% coverage
  • Manual demo script for staging API testing
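
The pagination behaviour described for Workspace.search_all() can be sketched as follows. This is a minimal illustration, not the SDK's actual code: iter_pages and fake_fetch are hypothetical names, and the sketch assumes only what the description states, namely that each response is a dict with results and an optional continuationToken.

```python
from typing import Any, Callable, Dict, Iterator, List, Optional

# Hypothetical type: a callable that fetches one page for a given token.
FetchPage = Callable[[Optional[str]], Dict[str, Any]]


def iter_pages(fetch_page: FetchPage) -> Iterator[List[Dict[str, Any]]]:
    """Yield each non-empty page of results until no token is returned."""
    token: Optional[str] = None
    while True:
        response = fetch_page(token)
        results = response.get("results", [])
        if not results:
            return  # empty page: nothing more to yield
        yield results
        token = response.get("continuationToken")
        if not token:
            return  # last page: the API returned no continuation token


def fake_fetch(token: Optional[str]) -> Dict[str, Any]:
    """Stand-in for the HTTP call: serves two canned pages keyed by token."""
    canned = {
        None: {
            "results": [{"filename": "a.jpg"}, {"filename": "b.jpg"}],
            "total": 3,
            "continuationToken": "t1",
        },
        "t1": {"results": [{"filename": "c.jpg"}], "total": 3},
    }
    return canned[token]


pages = list(iter_pages(fake_fetch))
for number, page in enumerate(pages, start=1):
    print(f"Page {number}: {len(page)} results")
```

Passing the fetch function in as a parameter keeps the loop testable without network access, which is the same idea the single-page, multi-page, and empty-result unit tests rely on.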

Type of change

  • New feature

How has this change been tested?

  • 7 unit tests: python -m unittest tests.test_workspace_search -v
    • test_search_basic — response structure and payload
    • test_search_with_continuation_token — token forwarding
    • test_search_custom_fields — fields parameter
    • test_search_api_error — error handling
    • test_search_all_single_page — single page results
    • test_search_all_multiple_pages — multi-page pagination
    • test_search_all_empty_results — empty result handling
  • Code quality: make check_code_quality ✓ (ruff format, ruff check, mypy)

Will the change affect Universe?

No. This is a pure SDK enhancement with no impact on the platform.

Deployment considerations

None. This is a Python package update only.

Docs

  • Updated docstrings with examples

count += 1
print(f"Page {count}: {len(page_results)} results")
for img in page_results:
    print(f" - {img.get('filename', 'N/A')}")

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information High test

This expression logs sensitive data (password) as clear text.

Copilot Autofix

AI 1 day ago

General approach: Ensure that no method that is likely to be logged (__str__/__repr__ or demo printouts) exposes secrets like api_key. We do not need to change how the key is used in HTTP requests; only remove or mask it from any string/log representations.

Best single fix without changing functionality:
The direct, real leak is Roboflow.__str__ in roboflow/__init__.py, which currently returns a JSON blob containing the raw api_key. That method can be called implicitly by print(rf) or loggers, thus logging the key in clear text. We should change __str__ so that it no longer includes the API key, or at minimum masks it. To avoid altering behavior elsewhere, we’ll keep the same structure but replace the key with a masked version (e.g., show last 4 characters only). This preserves usefulness for debugging while preventing secret exposure.

Concretely:

  • In roboflow/__init__.py, update Roboflow.__str__ (lines 272–275 in the snippet) so that:
    • It computes a masked_api_key string that does not reveal the full key (e.g., asterisks in place of every character except the last four, or all asterisks when the key has four or fewer characters).
    • It sets json_value = {"api_key": masked_api_key, "workspace": self.workspace} instead of using self.api_key directly.
  • Leave the rest of the class unchanged; no change to how HTTP requests are made.
  • The other printed value in tests/manual/demo_workspace_search.py (filename) does not involve the API key and can remain as-is, so no change is necessary there for secrecy.

No new imports or helper methods are needed; masking logic can be implemented inline in __str__.


Suggested changeset 1
roboflow/__init__.py
Outside changed files

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/roboflow/__init__.py b/roboflow/__init__.py
--- a/roboflow/__init__.py
+++ b/roboflow/__init__.py
@@ -271,5 +271,11 @@
 
     def __str__(self):
         """to string function"""
-        json_value = {"api_key": self.api_key, "workspace": self.workspace}
+        # Avoid exposing the full API key when this object is printed or logged.
+        api_key = self.api_key or ""
+        if len(api_key) > 4:
+            masked_api_key = ("*" * (len(api_key) - 4)) + api_key[-4:]
+        else:
+            masked_api_key = "*" * len(api_key)
+        json_value = {"api_key": masked_api_key, "workspace": self.workspace}
         return json.dumps(json_value, indent=2)
EOF
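
The masking logic in the patch can be exercised in isolation. The sketch below lifts it into a standalone helper; mask_api_key and describe are hypothetical names used only for illustration and are not part of the roboflow package.

```python
import json
from typing import Optional


def mask_api_key(api_key: Optional[str]) -> str:
    """Replace all but the last four characters with asterisks."""
    key = api_key or ""
    if len(key) > 4:
        return "*" * (len(key) - 4) + key[-4:]
    # Keys of four characters or fewer are masked entirely.
    return "*" * len(key)


def describe(api_key: Optional[str], workspace: Optional[str]) -> str:
    """Build the same JSON shape as the patched __str__, with the key masked."""
    json_value = {"api_key": mask_api_key(api_key), "workspace": workspace}
    return json.dumps(json_value, indent=2)


print(describe("abcd1234", "my-workspace"))
```

Because the mask keeps the original length, the printed value still reveals how long the key is. A fixed-width mask would avoid that, at the cost of departing from the suggested patch.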
Contributor Author

It's just for a local run.

@digaobarbosa force-pushed the feat/DATAMAN-163-workspace-search branch from d0716d5 to aade1e0 on February 27, 2026 18:37
@digaobarbosa requested a review from a team on February 27, 2026 18:44
@digaobarbosa self-assigned this on Feb 27, 2026