Fix fallback pull race after abort #167

Merged
hodlbod merged 6 commits from nayan9617/flotilla:fix/pull-fallback-abort-race into dev 2026-04-08 16:43:05 +00:00
Contributor

Description

This PR fixes the sleep/wake sync leak by stopping stale async work in the fallback pull path from continuing after abort.

The investigation showed that welshman cleanup was already completing correctly, so the bug was not in request teardown. The real problem was in the async flow inside sync.ts: pullAndListen() was starting pullWithFallback() without awaiting it, and pullOneWithFallback() could continue past controller.abort() after loadRelay() resolved. That allowed the chain to reach Difference creation and fallback request() calls even though teardown had already won.

Fixes #111

What changed:

-Added an early signal.aborted guard at the top of pullOneWithFallback().
-Kept the abort check before Difference creation so no new listener work starts after teardown.
-Kept the abort check before the fallback request() call so the relay request is not re-issued after abort.
-Added an early guard in pullAndListen() so a stale call does not start the async chain at all.
-Kept the visibility lifecycle support already in place via state.ts.

Validation:

pnpm run check passed with 0 errors and 0 warnings.

## Description This PR fixes the sleep/wake sync leak by stopping stale async work in the fallback pull path from continuing after abort. The investigation showed that welshman cleanup was already completing correctly, so the bug was not in request teardown. The real problem was in the async flow inside sync.ts: pullAndListen() was starting pullWithFallback() without awaiting it, and pullOneWithFallback() could continue past controller.abort() after loadRelay() resolved. That allowed the chain to reach Difference creation and fallback request() calls even though teardown had already won. Fixes #111 ## What changed: -Added an early signal.aborted guard at the top of pullOneWithFallback(). -Kept the abort check before Difference creation so no new listener work starts after teardown. -Kept the abort check before the fallback request() call so the relay request is not re-issued after abort. -Added an early guard in pullAndListen() so a stale call does not start the async chain at all. -Kept the visibility lifecycle support already in place via state.ts. ## Validation: pnpm run check passed with 0 errors and 0 warnings.
hodlbod reviewed 2026-04-07 19:43:46 +00:00
hodlbod left a comment
Owner

Can you add .next to gitignore and remove it from the commit?

Otherwise this looks decent, but it occurs to me we probably actually want to keep syncing in the background, because that's how notifications get triggered. So I think the document visibility stuff is actually not something we should add. Sorry I didn't realize that earlier.

Can you add .next to gitignore and remove it from the commit? Otherwise this looks decent, but it occurs to me we probably actually want to keep syncing in the background, because that's how notifications get triggered. So I think the document visibility stuff is actually not something we should add. Sorry I didn't realize that earlier.
@@ -159,0 +160,4 @@
export const documentVisibility = readable<DocumentVisibilityState>(
typeof document !== "undefined" ? document.visibilityState : "visible",
set => {
if (typeof document === "undefined") return
Owner

No need to check for document, this project doesn't do SSR

No need to check for document, this project doesn't do SSR
@@ -6,6 +6,7 @@ import {PollResponse} from "nostr-tools/kinds"
import {
getListTags,
getRelayTagValues,
type List,
Owner

Use import type rather than mixing type/variable imports

Use `import type` rather than mixing type/variable imports
hodlbod marked this conversation as resolved
@@ -240,3 +241,1 @@
loadFeedsForPubkey($userRelayList.event.pubkey)
}
})
const syncRelayList = ($userRelayList: List | undefined) => {
Owner

You can use PublishedList here for a slightly more narrow type

You can use `PublishedList` here for a slightly more narrow type
hodlbod marked this conversation as resolved
@@ -245,3 +257,2 @@
for (const pubkeys of chunk(10, get(bootstrapPubkeys))) {
// This isn't urgent, avoid clogging other stuff up
await sleep(1000)
void (async () => {
Owner

This actually changes the behavior to flood relays with requests. This needs to be a trickle in the background to prevent interrupting other requests. If we need to cancel it, then we should add that explicitly.

This actually changes the behavior to flood relays with requests. This needs to be a trickle in the background to prevent interrupting other requests. If we need to cancel it, then we should add that explicitly.
hodlbod marked this conversation as resolved
@@ -260,0 +293,4 @@
},
)
const unsubscribeFollows = derived([userFollowList, documentVisibility], identity).subscribe(
Owner

Instead of derived/identity you can use merged from @welshman/store

Instead of derived/identity you can use `merged` from `@welshman/store`
hodlbod marked this conversation as resolved
Owner

.next is still in the diff

.next is still in the diff
Author
Contributor

Actually that was due to earlier commit, and despite adding it to .gitignore it was still there. Though, I've now untracked those files.

Actually that was due to earlier commit, and despite adding it to .gitignore it was still there. Though, I've now untracked those files.
hodlbod added 6 commits 2026-04-08 16:42:56 +00:00
When browser tab is hidden (sleep), tear down all relay socket connections
completely. This forces fresh socket creation and clean auth handshake on wake,
preventing stuck AuthStatus.PendingResponse states.

- Add documentVisibility store tracking document.visibilityState
- Call Pool.get().remove(url) for relay sockets during hidden teardown
- Integrate socket reset in syncUserData, syncSpaces, syncDMs
- Add explanatory comments for visibility guards and race conditions

Fixes: App stuck in "Authenticating" after Mac sleep/wake cycle
Tested with: wss://news.utxo.one/
hodlbod force-pushed fix/pull-fallback-abort-race from d0e23b605c to 3dd5a0915f 2026-04-08 16:42:56 +00:00 Compare
hodlbod merged commit 7f1e98dcb2 into dev 2026-04-08 16:43:05 +00:00
hodlbod deleted branch fix/pull-fallback-abort-race 2026-04-08 16:43:05 +00:00
Owner

Perfect, thanks!

Perfect, thanks!
Collaborator

Hey @hodlbod, I wasn’t experiencing any logging-related hangs before, but after pulling this merge I’m now running into an issue where it gets stuck.

Could you please take a look? I’ve attached a video for reference.

Hey @hodlbod, I wasn’t experiencing any logging-related hangs before, but after pulling this merge I’m now running into an issue where it gets stuck. Could you please take a look? I’ve attached a video for reference. <video src="attachments/6235341c-50bd-4a13-90ce-b0697831d167" title="Screen Recording 2026-04-08 at 11.17.58 PM.mov" controls></video>
Collaborator

update: It worked, but it took quite a while.

**update:** It worked, but it took quite a while.
Author
Contributor

hello @userAdityaa thanks for letting us know. Let me check if this could anyway affect logging delays.

hello @userAdityaa thanks for letting us know. Let me check if this could anyway affect logging delays.
Owner

The login issue is unrelated to this PR, it looks like one of the signers is timing out. I'll look into it.

The login issue is unrelated to this PR, it looks like one of the signers is timing out. I'll look into it.
Sign in to join this conversation.