feature/23-voice-room/poc #93
Reference in New Issue
Block a user
Delete Branch "feature/23-voice-room/poc"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
#23 This adds the ability to create and join voice rooms from desktop web, following the spec outlined in this NIP-29 PR.
This is not fully finished, but it's already fairly large and feels like a shippable chunk. Areas that still need improvement are:
This is deployed at https://shipwreck.scuttle.works if you want to play with it there.
@hodlbod I'm ready for a review of the code/direction here and for some input on how you want to ship this. It's not done but I think enough is done for it to be useful so we could merge and ship this PR to users and keep iterating on things, but what do you think? I listed areas for improvement in the PR description that I would like to address in future PRs. But if you would rather merge this into a feature branch instead of dev and ship it later I'm happy to do that.
Also this is my first major PR and I don't know svelte so please nitpick all the things.
Could you rebase on dev? I'll take a look now
I'm happy to merge into dev, but probably feature flag it until it's done so we don't forget to finish it (this is what I naturally do).
WRT design, the way discord handles it is it shows voice rooms in a separate area. I think we can do that safely, which means we can also show the user's selected room icon (with a speaker icon instead of a hashtag as a fallback).
As far as the review goes, mostly small things, but the
joinVoiceRoomfunction is terrifying. Stateful logic like this is always nasty, but the cyclomatic complexity of that function is through the roof. A good pattern I've found is to have a synchronous init function which returns an unsubscribe function or takes a signal (in this case, probably the latter). That way you can encapsulate all the race conditions surrounding aborts in the unsubscribe function by doing something likeabort = () => promise.then(cleanup)@@ -18,0 +25,4 @@const hasNoText = tags.some(t => t[0] === "no-text")if (hasLivekit && hasNoText) return "voice"if (hasLivekit) return "both"return "text"[nit] A useful util is
getTagValue("livekit", tags)just for clarity. Also, beef up the type signature, either pass maybe or pass a TrustedEventOk cool. Can you explain why you prefer Maybe over typescripts built-in optional type? Is that a pretty universal preference for you?
The optional type means scattering around a bunch of ad-hoc types that would need to be updated if the model changes (not that it will here, but it's still redundant).
(event?: TrustedEvent)is stronger and allows more assumptions about what's actually happening. If you just want to accept tags so that (for example) callers can call this without having an event, you could do(tags: string[][] = []). But this function should always be able to assume that an event existsSorry, I'm asking why you suggested
(event: Maybe<TrustedEvent>)over(event?: TrustedEvent)?Oh, I see, no optional parameters are better, I was just being vague
@@ -18,0 +32,4 @@const filtered = existingTags.filter(t => t[0] !== "livekit" && t[0] !== "no-text")if (roomMode === "both") return [...filtered, ["livekit"]]if (roomMode === "voice") return [...filtered, ["livekit"], ["no-text"]]return filtered[nit]
Stylistically I normally do something like:
@@ -30,0 +58,4 @@return () => {cancelled = true}})Better would be to add a util to app/state called
deriveHasLiveKit(url)that can cache the request. This complex effect behavior is something I've noticed a lot recently, but we can assume url will not change here.@@ -34,0 +68,4 @@theme: "error",message: "This relay does not support voice rooms.",})}This should never run if we make the form change below.
@@ -44,1 +86,4 @@const existingTags = room.event?.tags ?? []const tags = buildTagsWithRoomMode(existingTags, roomMode)room.event = room.event ? {...room.event, tags} : ({tags} as RoomMeta["event"])createRoomdoesn't use the event property (because it shouldn't exist yet). I think it's probably better to include this tag on the room update event anyhow.Ah, ok. I think you mentioned including the tag on the update event in a comment on the proposed NIP too. I was very confused but now I see that you always try to publish a kind 9007 room create and then immediately a 9002 room metadata edit. Why? Can't you just put all the tags in the 9007? Do all the NIP-29 clients behave this way?
Related question: do I remember correctly that you use VSCode? Do you have some trick for being able to command-click into Welshman functions from the flotilla repo? I have tried and failed to get Cursor to index those functions. Alt-tabbing and ctrl-fing to read welshman function implementations is pretty annoying.
Just the way the spec was written, I agree it's stupid. Most relays will reject meta kinds on 9007, so both events are necessary.
No haha I use kakoune and do everything manually. Intellisense is for the weak. If you're able to fix it go right ahead, there is some squirreliness somewhere in there, sourcemaps don't load either.
This line is still a hack, since it circumvents the welshman data model. I went ahead and released welshman 0.8.8 which adds support for room.livekit and room.noText booleans.
@@ -164,0 +218,4 @@Text and voice{relayHasLivekit === false ? " (not setup)" : ""}</option><option value="voice" disabled={relayHasLivekit === false}>Voice only{relayHasLivekit === false ? " (not setup)" : ""}Instead of showing (not setup) let's only show this field if the relay has livekit support
I will do that, but imo it makes some pretty bad UX if the relay loses livekit support. Either we silently convert voice or voice + text rooms to text only on edit, which may not be what the user wants. Or we don't silently convert and we leave them no way to do the conversion themselves.
Interesting point. The general approach I take for feature-detected stuff is to hide it if it's not supported to keep the UI from being unnecessarily complex. For this, maybe we could swap out the select input for a warning that says "This room requires voice support which isn't currently provided by myrelay.example.com. Convert to a text-only room?" But that seems like an edge case better handled by the administrator. "Why don't my voice rooms work anymore?" is a much more likely question than "Why did my room change to text-only on edit?"
@@ -30,4 +23,1 @@<RoomNameWithImage {url} {h} />{#if showDifferenceIcon}<Icon icon={$shouldNotifyForRoom ? VolumeLoud : VolumeCross} size={4} class="opacity-50" />{/if}I can see why this would be confusing, but we should figure out what to do about muted rooms still
Oh yes that's my mistake. How about we use a bell icon for notifications and speaker icon for voice rooms?
sure, that makes sense
@@ -0,0 +48,4 @@} catch (e) {if (e instanceof Error && e.message === "Join cancelled") returnif (e instanceof DOMException && e.name === "AbortError") returnconst message = e instanceof Error ? e.message : String(e)Why do the bots do this? I have seen this cause bugs more than once. Just do
e.message || String(e). Or, better yet, actually design the error types and don't put anything that gets thrown in a message to the user. Just tell them "Failed to join voice room" and log it if it's not an error we know what to do with.I am cleaning this up but I want to fully understand you. Can you explain how this causes bugs? As a non-javascript guy this doesn't look pretty but makes sense to me because
ecould literally be any type and there is no way to write a catch block for a specific type, so it seems reasonably defensive to check the type before just grabbing themessageproperty off of whatever might have been thrown. Or in the case where some library throws a String it seems reasonable to surface that.I understand the current code risks showing raw error messages to the user (which doesn't bother me actually as long as it has a nice prefix like "Failed to join voice room"). Is there something worse that I am missing?
Idk maybe it is stupid to think that some library would throw some random object with a
messageproperty, but anything seems possible whennode_modulesare involved 😓Oh interesting,
e.message || String(e)actually gives a typescript error:'e' is of type 'unknown'.. But AI found theerrorMessage()function from@lib/utilso I'll use that.The problem is when a pojo with a message gets thrown or rejected, which means the object gets cast to a string, which fails to show the message. This is the case when you probably actually want to show the message to the user, whereas showing an error's message is almost always confusing. Errors being completely untyped are a huge flaw in typescript, so I like to save throws for truly exceptional cases (https://effect.website/ does this more formally, and I think it's a good idea), in which case you want to log and recover if you can, and resolve/return everything else (even errors that are deferred to the caller). But I'm certainly not consistent about this.
errorMessagewas introduced by an LLM too (via Tyson), I don't really believe in it either. I think we should just avoid propagating thrown errors to the user (I don't like the "Failed to do x: [technical jargon]" pattern. It helps to debug some, but not really in that many cases.@@ -0,0 +18,4 @@const spaceName = $derived($currentVoiceSession ? displayRelayUrl($currentVoiceSession.url) : "")const handleDisconnect = () => leaveVoiceRoom()const handleToggleMute = () => toggleMute()Don't do indirection like this, just call the actual function
A stupid way to do this that I like would be
ifLet($currentVoiceSession, ({url, h}) => pushModal(RoomDetail, {url, h}))but you don't have to do thatIt is harder to read at first glance but
ifLetbrings warm feelings from better programming languages so it's done.@@ -0,0 +21,4 @@const handleToggleMute = () => toggleMute()const showRoomDetail = () => {const session = get(currentVoiceSession)if (session) pushModal(RoomDetail, {url: session.url, h: session.h})$currentVoiceSession is better than
get@@ -635,0 +636,4 @@derived(roomsById, $roomsById => {const set = new Set<string>()for (const room of $roomsById.values()) {if (room.url === url && room.event?.tags?.some(t => t[0] === "livekit")) {roomwill always have an event, andgetTagValue('livekit', room.event.tags)would be more strightforward@@ -0,0 +46,4 @@export const currentVoiceSession = writable<VoiceSession | undefined>(undefined)export const speakingPubkeys = writable<Set<string>>(new Set())[nit] simpler would be
writable(new Set<string>())@@ -0,0 +60,4 @@if (signal?.aborted) throw new Error("Join cancelled")const authHeader = await getToken(Use welshman's version instead (makeHttpAuth/makeHttpAuthHeader)
@@ -0,0 +80,4 @@signal,})} catch (e) {if (e instanceof DOMException && e.name === "AbortError") throw new Error("Join cancelled")If we really need to propagate these errors (probably not, they're either triggered by a user action and shouldn't notify the user, or technical in nature), we should return an error instead of throwing one. As it is, we should just assert signer and let errors fly
I tried this several different ways. I think it turned out ok but I added some util functions. Let me know what you think.
@@ -0,0 +195,4 @@if (signal.aborted) {room.disconnect()throw new Error("Join cancelled")}This will never happen because there are no awaits between this and the previous check.
The if (signal) is redundant too, we should just require the caller to provide a signal.
@@ -0,0 +215,4 @@room.disconnect()if (signal?.aborted) {throw new Error("Join cancelled")}too much of this noise
@@ -0,0 +220,4 @@} finally {signal?.removeEventListener("abort", onAbort)}if (signal?.aborted) throw new Error("Join cancelled")aaaaa
haha, yes. Sorry I did do a self-review of this code but clearly I missed this function or my eyes glazed over while reading it. Cleaning it up.
f02b8d8fe7to8d38432598@hodlbod I think I have addressed all PR comments, this is ready for another look.
I started adding a feature flag but realized that with the changes to the room form (hiding room type unless livekit is detected on the server) users won't see any voice room UI unless their relay starts serving the livekit, and currently my relay is the only one that does, so maybe that's enough of a feature flag? But if not I can add a VITE env var or something.
For design, I think we could maybe expand the active voice room and include participants inside the card. The icon variant with sound might look better too. I went ahead and implemented this, see attached.
Got a few more comments for you, I went ahead and added some stuff to welshman that will help with this.
Also, you mentioned your bespoke relay has livekit support, is it using welshman? If so, is that something that we could PR to upstream?
@@ -279,0 +291,4 @@{/if}{#if $roomsWithLivekit.has(h)}<VoiceRoomItem {url} {h} />{/if}I think we talked about putting voice rooms below, separate from text rooms, is that right? I can't remember, but I think it would make sense for any room with livekit support to go there instead of in the text room area regardless of whether the room has text support.
I don't love the labels I landed on but see what you think of the current take:
"Your Rooms"
{ favorited voice and text}
"Other Rooms"
{ text rooms}
Voice Rooms
{ voice rooms}
Or when the user has no favorited rooms it looks like:
"Rooms"
{ text rooms}
Voice Rooms
{ voice rooms}
Idk if you want to say "Text Rooms" or "Chat Rooms" or something. Those feel quite wordy for menu headings to me though.
This surfaces another weird consequence of rooms that are "livekit" but not "no-text": currently favoriting either the text or vioce channel pulls both into the "Your Rooms" section.
That's probably ok, I like how the labels look.
@@ -0,0 +32,4 @@if (isMobileViewport()) {pushToast({theme: "error", message: "Voice rooms are not yet supported on mobile."})return}Instead of this, we should just not render voice rooms on mobile
I tried that, but realized that you could then join a call on desktop web, resize your browser to be small and then lose the UI to leave the call. As I type it out I realize that's too edgy of a case 😂 I will just hide it.
@@ -0,0 +42,4 @@return}joinAbortController = new AbortController()isJoining = true[nit] in situations like this, I like to overload the controller as the isJoining boolean and just check whether joinAbortController is undefined. One fewer variable to keep track of
@@ -0,0 +43,4 @@</Button><Button class="btn btn-sm btn-square btn-error" onclick={leaveVoiceRoom}><Icon icon={PhoneRounded} size={4} /></Button>If we have room here, it might be good to label the buttons with mute/leave
There is not a ton of room. How about a hover state?
Yeah, that's a fine compromise.
@@ -635,0 +638,4 @@derived(roomsById, $roomsById => {const set = new Set<string>()for (const room of $roomsById.values()) {if (room.url === url && getTag("livekit", room.event?.tags ?? [])) {Is the null coalescing necessary here? roomsById should use PublishedRoomMeta, so event should always be defined.
@@ -635,0 +654,4 @@}}return set})This and the previous function can take advantage of the new room.livekit and room.noText properties
@@ -635,0 +664,4 @@export const roomIsNoText = (url: string, h: string) => {const room = getRoom(makeRoomId(url, h))return !!getTag("no-text", room?.event?.tags ?? [])}It looks like these utilities aren't used
I don't have a bespoke relay, I just modified zooid. PR open now.
Ready for another review (the re-request review button doesn't seem to be working 🤷)
Other Rooms is showing up even if there are no text rooms. Also, audio rooms aren't showing the user-selected icon in the sidebar (the speaker should be a fallback, or maybe show the speaker icon on the right, similar to how we do the muted room icon).
The room should mute the mic if permission is denied, and ask for permission again when the user un-mutes.
dummy comment which I can't delete, but gitea lost my review comment? oh well
@hodlbod fixed, sort of. Right now "text and voice" rooms have to share the same icon which is an issue especially if you favorite a "text and voice" room because then they show up together in the "Your Rooms" section. But I'm having trouble coming up with a good way to communicate the room type while supporting custom icons in the side menu. Putting the speaker icon on the right might be the best but then we need some other default icon in place of the hashtag for the left I think, but no matter what we decide it gets muddy. Another idea would be to keep the speaker on the left but display the custom icon after it (maybe with a slash like "🔈/ 🍎"). We could do the same for text channels and hashtags "# / 🍎". I'm questioning how important it is to have voice-only channels but I think they are worth the effort. I wish we could get rid of text and voice channels combined but I understand why fiatjaf needs them for "normal" NIP-29 groups.
Another idea would be to not support "text and voice" type rooms in the UI at all unless your relay has exactly one room. In that case we can probably toss the custom icon (you already get a relay icon) and just show one voice room and one text room with the same name. Flotilla isn't really a stranger to adapting to these different relay configurations (i.e. the General chat room).
Still need to add some sounds on join/leave mute/unmute. Where did you get the sounds for notifications? Maybe I could find some there with a similar vibe.
Ok, I think we should drop the
no-texttag. As it is, anyone can send any event kind to a 39000 room, and will continue to do so regardless of whether no-text is set. Relays can restrict it, but that's a tarpit IMO. Livekit can be added to any room, which means we can also send text to any livekit room. This fixes the room pairs problem and allows us to very simply split rooms into rooms-with-voice and rooms-without-voice with unique icons. This isn't quite how discord does it, but I do think having a chat in a voice channel is usually going to be a nice to have. We might render it differently though to emphasize that it's a different medium. For now, keep the UI as it is, but also navigate to the chat view for that room when the user clicks the voice card. They should still be able to navigate to other parts of the space (or other spaces?) without leaving the room.Also, I remember you doing something hacky to detect support, it looks like fiatjaf updated the nip to say "In order to inform clients about relay support for AV (so they can display that option for users when creating or editing groups) relays should serve a status code
204athttps://relay.tld/.well-known/nip29/livekit", which is better. Can you add support to zooid and flotilla for that?Remmel added the sound, so I don't know where it came from.
@hodlbod that's fine but you didn't weigh in on my icon dilemma 😅 Even if all voice rooms have text chat we still need to clearly communicate to the user that clicking on a voice room in the sidebar is going to join a voice room.
I'm going to try the slash between icons on for size.
I tried out a hybrid solution that I think works visually. Text room icons remain unchanged but voice rooms get a slash and a custom icon if one is set. (haven't pushed yet)
Clicking around doesn't feel quite right yet though. It's weird to be left in the text room if you hang up the voice call, and also weird that you can't get back to the text room to view past history without rejoining the voice channel. These feel solvable but I haven't solved them yet.
I'm having some doubts now about whether I'm overcooking things by trying to shoehorn Discord voice channels into NIP-29 😓 In casual conversations with potential users I'm realizing that while they are excited about audio/video calls in community they tend to use them more like Zoom than Discord voice "third spaces". The data model for voice-only channels has been janky from the start and the UI patterns for starting a "call" in a "channel" (a la Slack huddles) feels easier to integrate into Flotilla's existing UI (maybe the grass is just always greener. There are unsolved UI challenges there too the more I think about it). My instinct is to schedule some in depth user interviews with users to really flesh the needs out, but I'm not sure we have any active flotilla groups that are excited about this? And non-flotilla users feel too generic since needs are different for every community.
I think I want a temperature check from you Jon. Or maybe a conversation (in a voice room 😉). I think I want to go ahead and continue down this voice room path until we can ship it and see what happens. Without
no-textrooms in the data model I think we have full freedom to transform the UI into a more call-like thing later if desired. But I'm feeling more influenceable than I was a couple weeks ago.I tried merging from dev and got a ton of weird merge conflicts. Did you rewrite history on dev? I resolved them all but then couldn't commit the merge because of missing functions in welshman:
I pulled the latest welshman but those functions don't exist. Maybe you have them locally?
I'm working on the kind 39004 presence system as probably the last change before this is ready to deploy. Probably won't have it done until this afternoon though. In the meantime feel free to check out the join and leave sounds. I spent a long time trying to find some decent ones. I'm unsure whether I succeeded.
Sorry about that, I added some stuff to welshman but forgot to release a new version. I've just pushed to dev with the new dependencies in there.
For the conflicts, you should always try to rebase rather than merge. I had released a hotfix on master a while ago, and rebased on master, creating ~70 new commits, which is why the merge was so painful. My advice would be to
git rebase $(git merge-base head dev -i)and squash your work into a single commit, thengit rebase devand fix conflicts again to get back synced up.b9413d9b4atodc2d245f77@hodlbod ok that took longer than I planned but we now use kind 39004 to list participants in a room before joining a call and the livekit participants API while in a call. Zooid has been updated to handle the webhook from livekit and publish the 39004.
Give the tires a good kick and let me know what you think.
79a90f2319tob513982dc3I was doing some testing and realized that at some point my code to hide the voice rooms on mobile broke. I tried just showing the voice widget below the chat input on mobile and it seems ok so I pushed it. Maybe this is too big of an 11th hour change, happy to back it out if so but it seems like a halfway decent mobile solution to me.
This is looking really good with the nav changes I made yesterday. One flow that isn't quite there is when you join a room you auto-join the chat (which is good), but the indicator only shows up once you're connected, and once you hang up there's no way to get it back without leaving and re-joining the room. I think the indicator should always be there, maybe faded or something if you're not connected. Also, what do you think about joining muted? We could show a dialog that orients the user to who is in the room, how to un mute etc. But don't worry about doing that right now.
Ah yes that's a good idea. We can show the widget but say "Voice Not Connected" instead of "Voice Connected" or something.
I go back and forth about how easy it should be to join the room. I like that clicking just joins you and you can start talking without having to click something else. If you have never joined before you should get a permissions prompt anyway, warning you that your microphone is going to be used. On the other hand I never join these types of things without double checking my audio inputs and outputs, and if it's a big meeting I always join muted. So a confirmation screen (like Zoom has) makes sense too. Kind of an entryway before you step into the living room.
One thing I don't like about joining muted is that it might encourage browsing rooms without intention to speak. That to me seems is not the type of space we are trying to cultivate with these, but maybe I'm overthinking it and that isn't such a big deal.
Those are very good points, I'm satisfied. Once we have the widget showing I say we ship it.
I made a follow-up issue btw to keep track of stuff we should do after this is merged: #98
Feel free to edit with more items
@@ -17,1 +19,3 @@const profile = deriveProfile(pubkey, removeUndefined([url]))const readableProfile = ifLet(pubkey, pk => deriveProfile(pk, removeUndefined([url])))const emptyProfile = readable(undefined)const profile: Readable<{picture?: string} | undefined> = readableProfile ?? emptyProfilederiveProfileis designed to acceptundefinedto avoid this kind of mess@@ -16,0 +20,4 @@const isVoiceRoom = $derived($room.livekit)const isVoiceRoomActive = $derived($currentVoiceSession?.url === url && $currentVoiceSession?.h === h,)Now that we're showing the widget in the room, I think we can probably get rid of this change, what do you think?
I'm confused - did you want to show the VoiceWidget only on the room page? Right now I have it showing in the SpaceMenu on desktop and both in SpaceMenu and on the room page on mobile. It seems like a better use of space on desktop to have it down in the corner than to stretch it across the whole width of the chat room.
I guess that's orthogonal to the question of whether we want a different icon for voice rooms once you have joined them. I think it's a nice affordance if you are looking a the sidebar to be able to tell which room you are in without having to read the room name out of the voice widget.
Yeah, I just mean we should just show the room's actual icon at the top of the page regardless of whether the room is active (the widget at the bottom will be indication enough)
@@ -0,0 +76,4 @@"inline-flex shrink-0 items-center justify-center rounded-full transition-shadow",isActive && $isParticipantSpeaking(p) && "ring-2 ring-success",)}><ProfileCircle pubkey={p.pubkey} size={5} class="h-5 w-5" />There's a typescript error here because pubkey may be undefined. Is there a reason for this or can we strengthen the type to guarantee it?
Yes, in the case where there is a voice participant in the room but we can't derive a
pubkeyfrom their LiveKit identity string it seems important to still show that there is someone in the room. I just wanted to reuse the default profile icon for that and this was my way of doing it.@hodlbod it looks like you reverted my changes to the build.sh. I think we need some changes because right now the README.md says to make a
.envfile but thebuild.shdoesn't actually source it (although Vite does) causing logo stuff in there to not work. So I think for consistency and clarity we need to either.envand not.env.localin the build.shREADME.mdto tell people to use.env.localspecifically.env,.env.local,.env.*,.env.*.local.This probably should be in a separate PR though. I'll just open a ticket for now.
Just want to call out in case it was unintentional - I think one of your recent commits removed the room image from the top of the room page, which is a change from
master.5105a9772etob5dd7dd590b5dd7dd590to9364b46e5d