fix: adjust index loading timing#14470
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughRemoved i18n wrapping from desktop document titles; added a new loadIndexSnapshots step and delegated refresh responsibility to indexer storage in the indexing loop; introduced a native ftsLoadIndexSnapshots API and handlers; removed immediate in-memory index init after migrations in native connect. Changes
Sequence Diagram(s)mermaid Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 3❌ Failed checks (2 warnings, 1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Tip Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
958268f to
a28b3cd
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/frontend/native/nbstore/src/storage.rs`:
- Around line 66-69: The impl for SqliteDocStorage is missing the async methods
flush_index(&self) -> Result<()> and init_index(&self) -> Result<()>, causing
unresolved method errors where self.flush_index() and self.init_index() are
called; add these two pub async fn methods to the SqliteDocStorage impl that
delegate to the existing indexer logic (or inline the flush/init logic),
ensuring they acquire the same index read/write locks used elsewhere (e.g.,
where index.read().await.has_unpersisted_changes(None) is checked) and return
Result<()> so callers (including connect()) compile correctly; alternatively
move the flush/init calls to a place that has access to the indexer, but keep
the method signatures exactly as pub async fn flush_index(&self) -> Result<()>
and pub async fn init_index(&self) -> Result<()> and reference SqliteDocStorage
in the impl.
a28b3cd to
0366105
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## canary #14470 +/- ##
==========================================
- Coverage 57.51% 56.93% -0.59%
==========================================
Files 2859 2859
Lines 154468 154471 +3
Branches 23336 23199 -137
==========================================
- Hits 88845 87949 -896
- Misses 62720 63613 +893
- Partials 2903 2909 +6
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Head branch was pushed to by a user without write access
|
Force flush index to disk on stop and adjust snapshots loading timing. TODO: |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
packages/common/nbstore/src/impls/idb/indexer/index.ts (1)
228-231: UnimplementedloadIndexSnapshotsfor IDB — track follow-upThis is a permanent no-op with a TODO. If
sync/indexer/index.tsnow callsloadIndexSnapshots()at startup or after reconnection, IDB will silently skip snapshot loading. Confirm whether IDB needs actual snapshot initialization (e.g., loading an in-memory index from stored data) or whether the no-op is intentionally correct for browser storage.Happy to help implement the IDB snapshot loading or open a tracking issue if you'd like.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/common/nbstore/src/impls/idb/indexer/index.ts` around lines 228 - 231, The override of loadIndexSnapshots() in the IDB indexer is a no-op which will silently skip snapshot initialization when sync/indexer/index.ts calls it; implement loading persisted snapshots from IndexedDB inside the IDB indexer (the override async loadIndexSnapshots method) so the in-memory index is reconstructed after startup/reconnect: read the saved snapshot records from the IDB store used by this indexer, validate/deserialize them, and apply them to the in-memory index via the indexer’s existing apply/restore method(s) (or call the same snapshot-applier used by other storage backends); if implementing now is not possible, replace the TODO with a clear comment and create a tracked issue (and log/warn when loadIndexSnapshots is skipped) so callers are aware of the missing behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/common/nbstore/src/sync/indexer/index.ts`:
- Around line 481-482: The call to this.indexer.refreshIfNeed() can throw and
currently sits before unsubscribe(), so if it throws the subscribeDocUpdate
listener remains attached and this.status is retained; fix by wrapping the await
this.indexer.refreshIfNeed() (which triggers native ftsFlushIndex) in its own
try/finally that always calls unsubscribe() in the finally block so cleanup
always runs, ensuring unsubscribe() is executed even if refreshIfNeed() throws.
- Around line 210-211: Move the call to this.indexer.loadIndexSnapshots() out of
the pre-loop area and into retryLoop() immediately after waitForConnected()
resolves so the native FTS snapshot load runs only when the connection is
established and runs on every retry; also protect the unsubscribe() call by
ensuring it runs even if this.indexer.refreshIfNeed() throws (wrap
refreshIfNeed() in a try/catch or call unsubscribe() from a finally block) so
the subscribeDocUpdate listener is always removed; references:
loadIndexSnapshots, retryLoop, waitForConnected, refreshIfNeed, unsubscribe,
subscribeDocUpdate.
---
Nitpick comments:
In `@packages/common/nbstore/src/impls/idb/indexer/index.ts`:
- Around line 228-231: The override of loadIndexSnapshots() in the IDB indexer
is a no-op which will silently skip snapshot initialization when
sync/indexer/index.ts calls it; implement loading persisted snapshots from
IndexedDB inside the IDB indexer (the override async loadIndexSnapshots method)
so the in-memory index is reconstructed after startup/reconnect: read the saved
snapshot records from the IDB store used by this indexer, validate/deserialize
them, and apply them to the in-memory index via the indexer’s existing
apply/restore method(s) (or call the same snapshot-applier used by other storage
backends); if implementing now is not possible, replace the TODO with a clear
comment and create a tracked issue (and log/warn when loadIndexSnapshots is
skipped) so callers are aware of the missing behavior.
| await this.indexer.loadIndexSnapshots(); | ||
|
|
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find NativeDBConnection implementation
echo "=== Locating NativeDBConnection ==="
fd -t f "\.ts$" | xargs grep -l "class NativeDBConnection" 2>/dev/null | head -5
# Also search for NativeDBConnection in the codebase
rg "class NativeDBConnection" -t ts -B 2 -A 10Repository: toeverything/AFFiNE
Length of output: 1080
🏁 Script executed:
#!/bin/bash
# Check loadIndexSnapshots, ftsLoadIndexSnapshots placement and waitForConnected usage
echo "=== loadIndexSnapshots and waitForConnected in index.ts ==="
rg -n "loadIndexSnapshots|ftsLoadIndexSnapshots|waitForConnected" packages/common/nbstore/src/sync/indexer/index.ts -B 2 -A 2
echo "=== Check refreshIfNeed in finally block ==="
rg -n "refreshIfNeed|finally" packages/common/nbstore/src/sync/indexer/index.ts -B 3 -A 3Repository: toeverything/AFFiNE
Length of output: 3783
🏁 Script executed:
#!/bin/bash
# Get context around mainLoop and retryLoop to see exact placement
echo "=== mainLoop function context (lines 205-225) ==="
sed -n '205,225p' packages/common/nbstore/src/sync/indexer/index.ts
echo "=== retryLoop function (find waitForConnected) ==="
rg -n "retryLoop\|waitForConnected" packages/common/nbstore/src/sync/indexer/index.ts -A 50 | head -80Repository: toeverything/AFFiNE
Length of output: 756
🏁 Script executed:
#!/bin/bash
# Get full retryLoop function to see the structure
echo "=== retryLoop function (lines 235-485) ==="
sed -n '235,485p' packages/common/nbstore/src/sync/indexer/index.ts | head -100
echo ""
echo "=== Full finally block and unsubscribe ==="
sed -n '475,485p' packages/common/nbstore/src/sync/indexer/index.tsRepository: toeverything/AFFiNE
Length of output: 3427
🏁 Script executed:
#!/bin/bash
# Check SqliteIndexerStorage.loadIndexSnapshots implementation
echo "=== SqliteIndexerStorage.loadIndexSnapshots ==="
rg -n "loadIndexSnapshots" -B 2 -A 10 --type ts | grep -A 12 "SqliteIndexerStorage\|class.*IndexerStorage"
echo ""
echo "=== ftsLoadIndexSnapshots definition ==="
rg -n "ftsLoadIndexSnapshots" -B 2 -A 5 --type tsRepository: toeverything/AFFiNE
Length of output: 2601
🏁 Script executed:
#!/bin/bash
# Check if there's any connection guard inside loadIndexSnapshots
echo "=== Checking loadIndexSnapshots implementations across indexer storage classes ==="
fd -t f "\.ts$" | xargs grep -l "loadIndexSnapshots" | head -10
echo ""
echo "=== Full loadIndexSnapshots in each storage class ==="
for file in $(fd -t f "\.ts$" | xargs grep -l "loadIndexSnapshots" | head -5); do
echo "File: $file"
rg -n "loadIndexSnapshots" -B 2 -A 8 "$file"
echo ""
doneRepository: toeverything/AFFiNE
Length of output: 1749
Move loadIndexSnapshots() into retryLoop() after waitForConnected() resolves, and protect unsubscribe() from refreshIfNeed() exceptions.
Three critical placement issues:
-
Pre-connection call.
loadIndexSnapshots()at line 210 executes before thewhile (true)loop begins, well beforeretryLoop()and itswaitForConnected()calls. ForSqliteIndexerStorage, this invokes the native APIftsLoadIndexSnapshots()against a potentially unestablished connection. Every other native API call in the codebase (lines 577, 580, 607, etc.) is guarded bywaitForConnected()first. If this call fails,mainLoop()throws, the catch handler logs the error, and the indexer stops permanently. -
Not repeated on retry.
loadIndexSnapshots()runs only once before the loop. On eachretryLoop()failure, the connection may reset and the native FTS in-memory state may reinitialize. Subsequent retries skip the snapshot reload and operate on a potentially empty index. -
Unsubscribe unprotected in finally block. Lines 481–482 call
await this.indexer.refreshIfNeed()beforeunsubscribe(). IfrefreshIfNeed()throws,unsubscribe()is never called, leaving thesubscribeDocUpdatelistener (established at line 270) dangling indefinitely.
🔧 Suggested fixes
// Move loadIndexSnapshots into retryLoop after waitForConnected
private async retryLoop(signal?: AbortSignal) {
await Promise.race([...]);
this.status.errorMessage = null;
this.status.statusUpdatedSubject$.next(true);
+ await this.indexer.loadIndexSnapshots();
+
const indexVersion = await this.indexer.indexVersion();
// Protect unsubscribe in finally block
} finally {
+ try {
await this.indexer.refreshIfNeed();
+ } finally {
unsubscribe();
+ }
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/common/nbstore/src/sync/indexer/index.ts` around lines 210 - 211,
Move the call to this.indexer.loadIndexSnapshots() out of the pre-loop area and
into retryLoop() immediately after waitForConnected() resolves so the native FTS
snapshot load runs only when the connection is established and runs on every
retry; also protect the unsubscribe() call by ensuring it runs even if
this.indexer.refreshIfNeed() throws (wrap refreshIfNeed() in a try/catch or call
unsubscribe() from a finally block) so the subscribeDocUpdate listener is always
removed; references: loadIndexSnapshots, retryLoop, waitForConnected,
refreshIfNeed, unsubscribe, subscribeDocUpdate.
| await this.indexer.refreshIfNeed(); | ||
| unsubscribe(); |
There was a problem hiding this comment.
unsubscribe() is skipped if this.indexer.refreshIfNeed() throws.
Before this PR the finally block only contained unsubscribe(), which is synchronous and cannot throw. The new await this.indexer.refreshIfNeed() (which calls ftsFlushIndex on the native side) can throw — e.g., if the connection was reset. If it does, unsubscribe() is never reached, leaving the subscribeDocUpdate listener (line 270) permanently attached and holding references to this.status, causing a memory/behavioral leak.
Wrap the flush in its own try/finally so cleanup is guaranteed:
🛡️ Proposed fix
} finally {
- await this.indexer.refreshIfNeed();
- unsubscribe();
+ try {
+ await this.indexer.refreshIfNeed();
+ } finally {
+ unsubscribe();
+ }
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| await this.indexer.refreshIfNeed(); | |
| unsubscribe(); | |
| try { | |
| await this.indexer.refreshIfNeed(); | |
| } finally { | |
| unsubscribe(); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/common/nbstore/src/sync/indexer/index.ts` around lines 481 - 482,
The call to this.indexer.refreshIfNeed() can throw and currently sits before
unsubscribe(), so if it throws the subscribeDocUpdate listener remains attached
and this.status is retained; fix by wrapping the await
this.indexer.refreshIfNeed() (which triggers native ftsFlushIndex) in its own
try/finally that always calls unsubscribe() in the finally block so cleanup
always runs, ensuring unsubscribe() is executed even if refreshIfNeed() throws.
There was a problem hiding this comment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@packages/common/nbstore/src/sync/indexer/index.ts`:
- Around line 480-483: The finally block currently awaits
this.indexer.refreshIfNeed() and then calls unsubscribe(), but if
refreshIfNeed() throws the unsubscribe() call is skipped leaving the
subscribeDocUpdate listener attached; change the finally so unsubscribe() always
runs regardless of refreshIfNeed() outcome — e.g. wrap the refreshIfNeed() call
in its own try/catch (or perform refresh in a try and put unsubscribe() in a
nested finally) so that unsubscribe() is invoked unconditionally; reference
this.indexer.refreshIfNeed(), unsubscribe(), and the subscribeDocUpdate listener
to locate and update the code.
|
@darkskygit Hello, could this pr merge into master? |
fix #14467
fix #14191
In Electron apps, the default timer for flushing the FTS index (
fts_flush_index) is 30 seconds. This can lead to a scenario where:This fix calls
fts_flush_indexbefore reconnecting the database when it is running, ensuring any pending index updates are flushed to disk and preventing data loss.Summary by CodeRabbit
New Features
Bug Fixes
Refactor