Flotilla gets stuck authenticating #111

Closed
opened 2026-04-01 17:48:05 +00:00 by mplorentz · 12 comments
Collaborator

I've noticed that when I leave app.flotilla.social running in my web browser all day it tends to get stuck Authenticating and no new events are loaded until I hit refresh. I have seen this at least 4 times in the past week or two. I'm seeing this in Firefox maybe especially after waking from sleep.

Here are some log excerpts that may be a starting point to diagnose:
dozens of these:

MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 1001 receive listeners added. Use emitter.setMaxListeners() to increase limit
    Pe events.js:211
    addListener events.js:227
    Ds Tools.js:1200
    rG adapter.js:24
    YB adapter.js:122
    Tg request.js:14
    gn request.js:112
    gn request.js:112
    Ar sync.ts:121
    ot sync.ts:126
    i sync.ts:278
    Ts Tools.js:901
    onEvent sync.ts:309
    qb sync.ts:81
    Qb sync.ts:114
events.js:46:40

hundreds of these:

Firefox can’t establish a connection to the server at wss://no.str.cr/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
Firefox can’t establish a connection to the server at wss://nostr.land/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
Firefox can’t establish a connection to the server at wss://indexer.coracle.social/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
Firefox can’t establish a connection to the server at wss://nostr-pub.wellorder.net/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
Firefox can’t establish a connection to the server at wss://nostr.bitcoiner.social/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
Firefox can’t establish a connection to the server at wss://purplepag.es/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
Firefox can’t establish a connection to the server at wss://relay.nostr.info/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
Firefox can’t establish a connection to the server at wss://auth.nostr1.com/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
Firefox can’t establish a connection to the server at wss://offchain.pub/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
Firefox can’t establish a connection to the server at wss://nostr.mom/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
Firefox can’t establish a connection to the server at wss://relay.nos.social/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
Firefox can’t establish a connection to the server at wss://relay.damus.io/.

one dozen of these (maybe after waking from sleep? no timestamps in the console 😞)

The connection to wss://relay.scuttle.works/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
The connection to wss://builders-republic.spaces.coracle.social/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
The connection to wss://meta.spaces.coracle.social/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
The connection to wss://relay.nos.social/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
The connection to wss://offchain.pub/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
The connection to wss://relay.damus.io/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
The connection to wss://nostr-pub.wellorder.net/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
The connection to wss://auth.nostr1.com/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
The connection to wss://no.str.cr/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
The connection to wss://pyramid.fiatjaf.com/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js)
The connection to wss://nostr.wine/ was interrupted while the page was loading.
I've noticed that when I leave app.flotilla.social running in my web browser all day it tends to get stuck Authenticating and no new events are loaded until I hit refresh. I have seen this at least 4 times in the past week or two. I'm seeing this in Firefox maybe especially after waking from sleep. Here are some log excerpts that may be a starting point to diagnose: dozens of these: ``` MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 1001 receive listeners added. Use emitter.setMaxListeners() to increase limit Pe events.js:211 addListener events.js:227 Ds Tools.js:1200 rG adapter.js:24 YB adapter.js:122 Tg request.js:14 gn request.js:112 gn request.js:112 Ar sync.ts:121 ot sync.ts:126 i sync.ts:278 Ts Tools.js:901 onEvent sync.ts:309 qb sync.ts:81 Qb sync.ts:114 events.js:46:40 ``` hundreds of these: ``` Firefox can’t establish a connection to the server at wss://no.str.cr/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) Firefox can’t establish a connection to the server at wss://nostr.land/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) Firefox can’t establish a connection to the server at wss://indexer.coracle.social/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) Firefox can’t establish a connection to the server at wss://nostr-pub.wellorder.net/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) Firefox can’t establish a connection to the server at wss://nostr.bitcoiner.social/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) Firefox can’t establish a connection to the server at wss://purplepag.es/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) Firefox can’t establish a connection to the server at wss://relay.nostr.info/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) Firefox can’t establish a connection to the server at wss://auth.nostr1.com/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) Firefox can’t establish a connection to the server at wss://offchain.pub/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) Firefox can’t establish a connection to the server at wss://nostr.mom/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) Firefox can’t establish a connection to the server at wss://relay.nos.social/. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) Firefox can’t establish a connection to the server at wss://relay.damus.io/. ``` one dozen of these (maybe after waking from sleep? no timestamps in the console 😞) ``` The connection to wss://relay.scuttle.works/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) The connection to wss://builders-republic.spaces.coracle.social/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) The connection to wss://meta.spaces.coracle.social/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) The connection to wss://relay.nos.social/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) The connection to wss://offchain.pub/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) The connection to wss://relay.damus.io/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) The connection to wss://nostr-pub.wellorder.net/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) The connection to wss://auth.nostr1.com/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) The connection to wss://no.str.cr/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) The connection to wss://pyramid.fiatjaf.com/ was interrupted while the page was loading. [CFtcGODy.js:62:18205](https://app.flotilla.social/_app/immutable/chunks/CFtcGODy.js) The connection to wss://nostr.wine/ was interrupted while the page was loading. ```
hodlbod added the priority label 2026-04-02 16:06:14 +00:00
hodlbod added this to the Current milestone 2026-04-02 16:06:17 +00:00
hodlbod removed this from the Current milestone 2026-04-02 18:57:08 +00:00
hodlbod added this to the Current milestone 2026-04-02 19:03:48 +00:00
hodlbod added the dev label 2026-04-02 19:12:40 +00:00
Contributor

Hi @mplorentz and @hodlbod — I've traced this to two related problems:

In syncSpaces(), derived([userGroupList, page], identity) fires on every page navigation. Each time roomsKey changes or the guard doesn't fire, pullAndListen() creates new request() subscriptions without cleaning up old ones, eventually accumulating 1000+ receive listeners on the same socket (the MaxListenersExceededWarning).
After sleep-wake, all WebSocket connections drop simultaneously. The auth policy tries to re-authenticate on every reconnected socket, but the accumulated stale listeners from (1) interfere with the auth handshake, leaving sockets stuck in "Authenticating".

My proposed fix:
(a) decouple page navigation from space sync restarts in sync.ts, and (
b) add a visibilitychange handler that cleanly tears down and restarts all sync on sleep-wake.
A secondary fix in policies.ts adds a per-socket auth timeout to prevent permanent stuck state.

This is more complex than a typical bug fix since it touches sync lifecycle. Would appreciate your thoughts on the approach before I open a PR — particularly on whether tearing down/restarting sync on visibilitychange is the right granularity or if you'd prefer a lighter-touch reconnect.

Hi @mplorentz and @hodlbod — I've traced this to two related problems: In syncSpaces(), derived([userGroupList, page], identity) fires on every page navigation. Each time roomsKey changes or the guard doesn't fire, pullAndListen() creates new request() subscriptions without cleaning up old ones, eventually accumulating 1000+ receive listeners on the same socket (the MaxListenersExceededWarning). After sleep-wake, all WebSocket connections drop simultaneously. The auth policy tries to re-authenticate on every reconnected socket, but the accumulated stale listeners from (1) interfere with the auth handshake, leaving sockets stuck in "Authenticating". My proposed fix: (a) decouple page navigation from space sync restarts in sync.ts, and ( b) add a visibilitychange handler that cleanly tears down and restarts all sync on sleep-wake. A secondary fix in policies.ts adds a per-socket auth timeout to prevent permanent stuck state. This is more complex than a typical bug fix since it touches sync lifecycle. Would appreciate your thoughts on the approach before I open a PR — particularly on whether tearing down/restarting sync on visibilitychange is the right granularity or if you'd prefer a lighter-touch reconnect.
nayan9617 self-assigned this 2026-04-04 13:27:59 +00:00
Owner

I think those are good suggestions, but instead of decoupling sync restarts from page navigation I would suggestion turning the visibilitychange handler into a store which gets subscribed to in sync to tear down/set up everything using the same logic. This means we can keep the current idempotent approach, which keeps things easier to understand.

I'm more interested in the memory leak you've found; I see that syncSpace never gets called without the corresponding unsubscriber being called. So I'm not sure how you would end up with duplicate subscriptions on a url based on the logic in sync. Can you clarify where the leak actually is?

I think those are good suggestions, but instead of decoupling sync restarts from page navigation I would suggestion turning the visibilitychange handler into a store which gets subscribed to in `sync` to tear down/set up everything using the same logic. This means we can keep the current idempotent approach, which keeps things easier to understand. I'm more interested in the memory leak you've found; I see that syncSpace never gets called without the corresponding unsubscriber being called. So I'm not sure how you would end up with duplicate subscriptions on a url based on the logic in sync. Can you clarify where the leak actually is?
Contributor

Hi @hodlbod, first of all apologies for the late response - was unaware of your comment.

Nonetheless, that makes sense — using a visibilitychange store and subscribing to it in sync sounds much cleaner and keeps the idempotent flow intact. I’ll switch to that approach.

Regarding the leak — I initially thought along similar lines (that syncSpace unsubscribers are being called), but from the stack traces (adapter.js / Tools.js accumulating receive listeners), it seems the issue is happening at a lower level.
My current hypothesis is that even though the high-level cleanup runs, the underlying request() / adapter layer isn’t fully removing the EventEmitter listeners from the shared socket. This could be due to:

-async timing (listener attached after cleanup fires), or
-incomplete teardown in the adapter (removeListener not being called consistently)

So effectively, each re-run adds new listeners on the same socket, leading to the accumulation.
I’ll trace this more precisely in sync.ts and the adapter layer to pinpoint where the cleanup is being missed and include that in the fix. Does this direction make sense?

Hi @hodlbod, first of all apologies for the late response - was unaware of your comment. Nonetheless, that makes sense — using a visibilitychange store and subscribing to it in sync sounds much cleaner and keeps the idempotent flow intact. I’ll switch to that approach. Regarding the leak — I initially thought along similar lines (that syncSpace unsubscribers are being called), but from the stack traces (adapter.js / Tools.js accumulating receive listeners), it seems the issue is happening at a lower level. My current hypothesis is that even though the high-level cleanup runs, the underlying request() / adapter layer isn’t fully removing the EventEmitter listeners from the shared socket. This could be due to: -async timing (listener attached after cleanup fires), or -incomplete teardown in the adapter (removeListener not being called consistently) So effectively, each re-run adds new listeners on the same socket, leading to the accumulation. I’ll trace this more precisely in sync.ts and the adapter layer to pinpoint where the cleanup is being missed and include that in the fix. Does this direction make sense?
Owner

Yes, I think you're headed in the right direction, thanks for looking into it!

Yes, I think you're headed in the right direction, thanks for looking into it!
Contributor

Hi @hodlbod,

Actually you were right!
I traced the listener accumulation down to the @welshman/net layer. It looks like if options.onClose throws, the cleanup path (including removing listeners) never executes, which leaves listeners attached to the shared socket.

I’m planning a two-part fix:

ensure our onClose callbacks don’t throw (i can implement it in order to get the immediate fix in Flotilla)
and open a PR upstream to guard the cleanup with a try/finally

Let me know if this approach sounds good to you!

Hi @hodlbod, Actually you were right! I traced the listener accumulation down to the @welshman/net layer. It looks like if options.onClose throws, the cleanup path (including removing listeners) never executes, which leaves listeners attached to the shared socket. I’m planning a two-part fix: ensure our onClose callbacks don’t throw (i can implement it in order to get the immediate fix in Flotilla) and open a PR upstream to guard the cleanup with a try/finally Let me know if this approach sounds good to you!
Owner

That all sounds fine, have you validated that an onClose callback is throwing?

That all sounds fine, have you validated that an onClose callback is throwing?
Contributor

I haven’t conclusively seen it throw yet. The hypothesis comes from tracing the control flow where a thrown onClose could skip the adapter cleanup.

I’ll add some instrumentation around that path locally (especially during sleep/wake) to confirm if it’s actually happening, and will share logs once I can reproduce it.

I haven’t conclusively seen it throw yet. The hypothesis comes from tracing the control flow where a thrown onClose could skip the adapter cleanup. I’ll add some instrumentation around that path locally (especially during sleep/wake) to confirm if it’s actually happening, and will share logs once I can reproduce it.
Contributor

@hodlbod I just ran it locally with some trace logs and onClose isn't throwing at all. So that theory is busted 😅. I'm digging into the adapter now to see if the abort signal is dropping early or if removeListener is just missing the reference.Lastly, I'll let you know what I find!

@hodlbod I just ran it locally with some trace logs and onClose isn't throwing at all. So that theory is busted 😅. I'm digging into the adapter now to see if the abort signal is dropping early or if removeListener is just missing the reference.Lastly, I'll let you know what I find!
Contributor

@hodlbod I instrumented the onClose path and confirmed it's not throwing; welshman cleanup completes correctly every time.
The leak is in pullWithFallback in sync.ts. It's called without await inside pullAndListen, so the async chain — including the Difference creation and request() calls in pullOneWithFallback — continues running after controller.abort() has already fired. The signal.aborted check only runs after await loadRelay() resolves, by which point new listeners are already being attached to the socket.

My Plan: add signal.aborted guards before Difference construction and before request() in pullOneWithFallback, alongside the documentVisibility store. Shall I open a PR with this?

@hodlbod I instrumented the onClose path and confirmed it's not throwing; welshman cleanup completes correctly every time. The leak is in pullWithFallback in sync.ts. It's called without await inside pullAndListen, so the async chain — including the Difference creation and request() calls in pullOneWithFallback — continues running after controller.abort() has already fired. The signal.aborted check only runs after await loadRelay() resolves, by which point new listeners are already being attached to the socket. My Plan: add signal.aborted guards before Difference construction and before request() in pullOneWithFallback, alongside the documentVisibility store. Shall I open a PR with this?
Owner

Yes, thanks!

Yes, thanks!
Contributor

@hodlbod Thanks for the clarification on visibility - removed the documentVisibility integration entirely.
The PR now only contains the abort guards in the fallback pull path. Also addressed all other review comments. Please take another look.

@hodlbod Thanks for the clarification on visibility - removed the documentVisibility integration entirely. The PR now only contains the abort guards in the fallback pull path. Also addressed all other review comments. Please take another look.
Owner

I've closed this issue, but I'm not entirely satisfied it's fixed. Feel free to re-open if you're still experiencing it.

I've closed this issue, but I'm not entirely satisfied it's fixed. Feel free to re-open if you're still experiencing it.
Sign in to join this conversation.
3 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: coracle/flotilla#111