Commit graph

518 commits

Author SHA1 Message Date
Till Faelligen 9e2afb5586
Merge branch 'main' of github.com:matrix-org/dendrite into s7evink/latest-event-updater 2024-01-22 07:59:56 +01:00
Till 8e4dc6b4ae
Optimize PrevEventIDs when getting thousands of backwards extremeties (#3308)
Changes how many `PrevEventIDs` we send to other servers when
backfilling, capped to 100 events.

Unsure about how representative this benchmark is..
```
goos: linux
goarch: amd64
pkg: github.com/matrix-org/dendrite/roomserver/api
cpu: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
                            │    old.txt     │               new.txt               │
                            │     sec/op     │   sec/op     vs base                │
PrevEventIDs/Original1-8         264.9n ± 5%   237.4n ± 7%  -10.36% (p=0.000 n=10)
PrevEventIDs/Original10-8        3.101µ ± 4%   1.590µ ± 2%  -48.72% (p=0.000 n=10)
PrevEventIDs/Original100-8       44.32µ ± 2%   12.80µ ± 4%  -71.11% (p=0.000 n=10)
PrevEventIDs/Original500-8     263.835µ ± 4%   7.907µ ± 4%  -97.00% (p=0.000 n=10)
PrevEventIDs/Original1000-8    578.798µ ± 2%   7.620µ ± 2%  -98.68% (p=0.000 n=10)
PrevEventIDs/Original2000-8   1272.039µ ± 2%   8.241µ ± 9%  -99.35% (p=0.000 n=10)
geomean                          43.81µ        3.659µ       -91.65%

                            │    old.txt     │               new.txt                │
                            │      B/op      │     B/op      vs base                │
PrevEventIDs/Original1-8          72.00 ± 0%     48.00 ± 0%  -33.33% (p=0.000 n=10)
PrevEventIDs/Original10-8        1512.0 ± 0%     500.0 ± 0%  -66.93% (p=0.000 n=10)
PrevEventIDs/Original100-8     11.977Ki ± 0%   7.023Ki ± 0%  -41.36% (p=0.000 n=10)
PrevEventIDs/Original500-8     67.227Ki ± 0%   7.023Ki ± 0%  -89.55% (p=0.000 n=10)
PrevEventIDs/Original1000-8   163.227Ki ± 0%   7.023Ki ± 0%  -95.70% (p=0.000 n=10)
PrevEventIDs/Original2000-8   347.227Ki ± 0%   7.023Ki ± 0%  -97.98% (p=0.000 n=10)
geomean                         12.96Ki        1.954Ki       -84.92%

                            │   old.txt   │              new.txt               │
                            │  allocs/op  │ allocs/op   vs base                │
PrevEventIDs/Original1-8       2.000 ± 0%   1.000 ± 0%  -50.00% (p=0.000 n=10)
PrevEventIDs/Original10-8      6.000 ± 0%   2.000 ± 0%  -66.67% (p=0.000 n=10)
PrevEventIDs/Original100-8     9.000 ± 0%   3.000 ± 0%  -66.67% (p=0.000 n=10)
PrevEventIDs/Original500-8    12.000 ± 0%   3.000 ± 0%  -75.00% (p=0.000 n=10)
PrevEventIDs/Original1000-8   14.000 ± 0%   3.000 ± 0%  -78.57% (p=0.000 n=10)
PrevEventIDs/Original2000-8   16.000 ± 0%   3.000 ± 0%  -81.25% (p=0.000 n=10)
geomean                        8.137        2.335       -71.31%
```
2024-01-20 22:26:57 +01:00
Till edd02ec468
Fix panic if unable to assign a state key NID (#3294) 2023-12-30 18:34:36 +01:00
Till Faelligen 329d15ef44
Revert "Try to fix state reset sentry messages, maybe?"
This reverts commit 925843d05b.
2023-12-29 23:03:53 +01:00
Till Faelligen 925843d05b
Try to fix state reset sentry messages, maybe? 2023-12-29 22:19:45 +01:00
Till Faelligen 2c81b060d6
Merge branch 'main' of github.com:matrix-org/dendrite into s7evink/latest-event-updater 2023-12-29 19:20:49 +01:00
Till f93d1c4790
Use AckExplicitPolicy instead of AckAllPolicy (#3288)
Fixes https://github.com/matrix-org/dendrite/issues/3240 and potentially
a root cause for state resets.

While testing, I've had added some more debug logging:
```
time="2023-12-16T18:13:11.319458084Z" level=warning msg="already processed event" event_id="$qFYMl_F2vb1N0yxmvlFAMhqhGhLKq4kA-o_YCQKH7tQ" kind=KindNew times=2
time="2023-12-16T18:13:14.537389126Z" level=warning msg="already processed event" event_id="$EU-LTsKErT6Mt1k12-p_3xOHfiLaK6gtwVDlZ35lSuo" kind=KindNew times=5
time="2023-12-16T18:13:16.789551206Z" level=warning msg="already processed event" event_id="$dIPuAfTL5x0VyG873LKPslQeljCSxFT1WKxUtjIMUGE" kind=KindNew times=5
time="2023-12-16T18:13:17.383838767Z" level=warning msg="already processed event" event_id="$7noSZiCkzerpkz_UBO3iatpRnaOiPx-3IXc0GPDQVGE" kind=KindNew times=2
time="2023-12-16T18:13:22.091946597Z" level=warning msg="already processed event" event_id="$3Lvo3Wbi2ol9-nNbQ93N-E2MuGQCJZo5397KkFH-W6E" kind=KindNew times=1
time="2023-12-16T18:13:23.026417446Z" level=warning msg="already processed event" event_id="$lj1xS46zsLBCChhKOLJEG-bu7z-_pq9i_Y2DUIjzGy4" kind=KindNew times=4
```

So we did receive the same event over and over again. Given they are
`KindNew`, we don't short circuit if we already processed them, which
potentially caused the state to be calculated with a now wrong state
snapshot.

Also fixes the back pressure metric. We now correctly increment the
counter once we sent the message to NATS and decrement it once we
actually processed an event.
2023-12-19 08:25:47 +01:00
Till Faelligen 4ae7335b01
Be more verbose when logging state resets 2023-11-29 09:15:04 +01:00
Till Faelligen 2f99680c38
Move checking if the event has been sent as well 2023-11-24 10:05:30 +01:00
Till Faelligen 94960c507a
Move SetLatestEvents 2023-11-24 08:11:50 +01:00
Till Faelligen 7db3e9f689
Move context 2023-11-23 20:34:48 +01:00
Till Faelligen 0f74cbfb27
Move historyVisibility 2023-11-23 19:48:53 +01:00
Till Faelligen 7999ef09d5
Move lastEventIDSent 2023-11-23 19:46:28 +01:00
Till Faelligen 2047cca580
Move roomInfo 2023-11-23 19:44:48 +01:00
Till Faelligen 416dbcee76
No naked returns 2023-11-23 19:38:55 +01:00
Till Faelligen 06064e0b48
Move sendAsServer 2023-11-23 19:37:23 +01:00
Till Faelligen 8bbd9406f8
Remove TransactionID from struct 2023-11-23 19:34:57 +01:00
Till Faelligen 53c474f93e
Move makeOutputNewRoomEvent 2023-11-23 19:33:08 +01:00
Till Faelligen a4e111cd40
Let doUpdateLatestEvents return the updates to send to NATS 2023-11-23 19:28:59 +01:00
Till b8f91485b4
Update ACLs when received as outliers (#3008)
This should fix #3004 by making sure we also update our in-memory ACLs
after joining a new room.
Also makes use of more caching in `GetStateEvent`

Bonus: Adds some tests, as I was about to use `GetBulkStateContent`, but
turns out that `GetStateEvent` is basically doing the same, just that it
only gets the `eventTypeNID`/`eventStateKeyNID` once and not for every
call.
2023-11-22 15:38:04 +01:00
Till 7863a405a5
Use IsBlacklistedOrBackingOff to determine if we should try to fetch devices (#3254)
Use `IsBlacklistedOrBackingOff` from the federation API to check if we
should fetch devices.

To reduce back pressure, we now only queue retrying servers if there's
space in the channel.
2023-11-09 08:43:27 +01:00
Till 699f5ca8c1
More rows.Close() and rows.Err() (#3262)
Looks like we missed some `rows.Close()`

Even though `rows.Err()` is mostly not necessary, we should be more
consistent in the DB layer.

[skip ci]
2023-11-09 08:42:33 +01:00
Till 5f872f4a82
Fix panic in QueryNextRoomHierarchyPage (#3253)
Sentry reported the following panic:
```
time="2023-11-01T01:33:56.220583478Z" level=error msg="Request panicked!
goroutine 43763845 [running]:
runtime/debug.Stack()
	runtime/debug/stack.go:24 +0x5e
github.com/matrix-org/dendrite/internal/httputil.MakeExternalAPI.MakeJSONAPI.Protect.func3.1()
	github.com/matrix-org/util@v0.0.0-20221111132719-399730281e66/json.go:98 +0x13e
panic({0x15b5540?, 0x2453560?})
	runtime/panic.go:914 +0x21f
github.com/matrix-org/dendrite/internal/httputil.MakeAuthAPI.func1.1()
	github.com/matrix-org/dendrite/internal/httputil/httpapi.go:91 +0x4a
panic({0x15b5540?, 0x2453560?})
	runtime/panic.go:914 +0x21f
github.com/matrix-org/dendrite/roomserver/internal/query.(*Queryer).QueryNextRoomHierarchyPage(0x413185?, {0x1a576e0, 0xc0436705a0}, {{{0xc01e5fd260, 0x1f}, {0xc01e5fd261, 0x12}, {0xc01e5fd274, 0xb}}, {0xc145cb5200, ...}, ...}, ...)
	github.com/matrix-org/dendrite/roomserver/internal/query/query_room_hierarchy.go:116 +0xbfe
github.com/matrix-org/dendrite/clientapi/routing.QueryRoomHierarchy(0xc0be13b200, 0xc144e65dd0, {0xc01e5fd260?, 0x6?}, {0x7faf140639c8, 0xc00059af20}, 0xc08adca000?)
	github.com/matrix-org/dendrite/clientapi/routing/room_hierarchy.go:141 +0x68b
github.com/matrix-org/dendrite/clientapi/routing.Setup.func35(0xc03e7d5c20?, 0x17c3a57?)
	github.com/matrix-org/dendrite/clientapi/routing/routing.go:534 +0xbe
github.com/matrix-org/dendrite/internal/httputil.MakeAuthAPI.func1(0xc0bd097300)
	github.com/matrix-org/dendrite/internal/httputil/httpapi.go:108 +0x5ed
github.com/matrix-org/util.(*jsonRequestHandlerWrapper).OnIncomingRequest(0xc0bd097200?, 0xc13b7d6fc0?)
	github.com/matrix-org/util@v0.0.0-20221111132719-399730281e66/json.go:79 +0x19
github.com/matrix-org/dendrite/internal/httputil.MakeExternalAPI.MakeJSONAPI.func2({0x1a54880, 0xc138f28b60}, 0xc0bd097200?)
	github.com/matrix-org/util@v0.0.0-20221111132719-399730281e66/json.go:141 +0xaa
github.com/matrix-org/dendrite/internal/httputil.MakeExternalAPI.MakeJSONAPI.Protect.func3({0x1a54880?, 0xc138f28b60?}, 0x17c01d9?)
	github.com/matrix-org/util@v0.0.0-20221111132719-399730281e66/json.go:103 +0x63
net/http.HandlerFunc.ServeHTTP(...)
	net/http/server.go:2136
github.com/matrix-org/dendrite/internal/httputil.MakeExternalAPI.func1({0x1a54880?, 0xc138f28b60?}, 0xc0bd097100)
	github.com/matrix-org/dendrite/internal/httputil/httpapi.go:191 +0x411
net/http.HandlerFunc.ServeHTTP(0xc0bd097000?, {0x1a54880?, 0xc138f28b60?}, 0xbe1348905308878e?)
	net/http/server.go:2136 +0x29
github.com/gorilla/mux.(*Router).ServeHTTP(0xc000000000, {0x1a54880, 0xc138f28b60}, 0xc0bd096f00)
	github.com/gorilla/mux@v1.8.0/mux.go:210 +0x1c5
github.com/matrix-org/dendrite/setup/base.SetupAndServeHTTP.(*Handler).Handle.(*Handler).handle.func5({0x1a54880, 0xc138f28b60}, 0xc0bd096e00)
	github.com/getsentry/sentry-go@v0.14.0/http/sentryhttp.go:103 +0x298
net/http.HandlerFunc.ServeHTTP(0xc0bd096a00?, {0x1a54880?, 0xc138f28b60?}, 0x7fae6812f5d0?)
	net/http/server.go:2136 +0x29
github.com/gorilla/mux.(*Router).ServeHTTP(0xc000000a80, {0x1a54880, 0xc138f28b60}, 0xc0bd096900)
	github.com/gorilla/mux@v1.8.0/mux.go:210 +0x1c5
net/http.serverHandler.ServeHTTP({0xc02884c4e0?}, {0x1a54880?, 0xc138f28b60?}, 0x6?)
	net/http/server.go:2938 +0x8e
net/http.(*conn).serve(0xc1926922d0, {0x1a576e0, 0xc024a6ec90})
	net/http/server.go:2009 +0x5f4
created by net/http.(*Server).Serve in goroutine 16979
	net/http/server.go:3086 +0x5cb
" context=missing panic="runtime error: invalid memory address or nil pointer dereference"
```

[skip ci]
2023-11-08 14:22:02 +01:00
Till da7bca0224
Some tweaks for the device list updater (#3251)
This makes the following changes:
- Adds two new metrics observing the usage of the `DeviceListUpdater`
workers
- Makes the number of workers configurable
- Adds a 30s timeout for DB requests when receiving a device list update
over federation
2023-10-31 16:39:45 +01:00
Till 4fa8512d57
Check event is not rejected (#3243)
Companion PR to https://github.com/matrix-org/gomatrixserverlib/pull/421
2023-10-25 09:47:21 +02:00
Till 8b3adaf244
Fix state resets (#3231)
Needs https://github.com/matrix-org/gomatrixserverlib/pull/419

May fix: https://github.com/matrix-org/dendrite/issues/2508,
https://github.com/matrix-org/dendrite/issues/1760
2023-10-23 15:17:21 +02:00
Till f02d998253
Remove the creator field when upgrading to v11 (#3210)
Minor oversight
2023-09-28 07:36:57 +02:00
Till 05a8f1ede3
Support for room version v11 (#3204)
Fixes #3203
2023-09-27 08:27:08 +02:00
devonh 16d922de70
Complement fixes for pseudoIDs (#3206) 2023-09-26 17:44:49 +00:00
devonh 8245b24100
Update gmsl to use new validated RoomID on PDUs (#3200)
GMSL returns a `spec.RoomID` when calling `PDU.RoomID()`
2023-09-15 14:39:06 +00:00
Sam Wedgwood 9b5be6b9c5
[pseudoIDs] More pseudo ID fixes - Part 2 (#3181)
Fixes include:
- Translating state keys that contain user IDs to their respective room
keys for both querying and sending state events
- **NOTE**: there may be design discussion needed on what should happen
when sender keys cannot be found for users
- A simple fix for kicking guests from rooms properly
- Logic for boundary history visibilities was slightly off (I'm
surprised this only manifested in pseudo ID room versions)

Signed-off-by: `Sam Wedgwood <sam@wedgwood.dev>`
2023-08-24 16:43:51 +01:00
Sam Wedgwood 9a12420428
[pseudoID] More pseudo ID fixes (#3167)
Signed-off-by: `Sam Wedgwood <sam@wedgwood.dev>`
2023-08-15 12:37:04 +01:00
Sam Wedgwood 35804f8493
Add config key for default room version (#3171)
This PR adds a config key `room_server.default_config_key` to set the
default room version for the room server.

Signed-off-by: `Sam Wedgwood <sam@wedgwood.dev>`
2023-08-08 14:20:05 +01:00
Sam Wedgwood c7193e24d0
Use *spec.SenderID for QuerySenderIDForUser (#3164)
There are cases where a dendrite instance is unaware of a pseudo ID for
a user, the user is not a member of that room. To represent this case,
we currently use the 'zero' value, which is often not checked and so
causes errors later down the line. To make this case more explict, and
to be consistent with `QueryUserIDForSender`, this PR changes this to
use a pointer (and `nil` to mean no sender ID).

Signed-off-by: `Sam Wedgwood <sam@wedgwood.dev>`
2023-08-02 11:12:14 +01:00
Sam Wedgwood af13fa1c75
[pseudoIDs] Fixes for room alias tests (#3159)
Some (deceptively) simple fixes for some bugs that caused room alias
tests to fail (sytext `tests/30rooms/05aliases.pl`). Each commit has
details about what it fixes.

Sytest results:

- Sytest before (79d4a0e):
https://gist.github.com/swedgwood/972ac4ef93edd130d3db0930703d6c82
- Sytest after (4b09bed):
https://gist.github.com/swedgwood/504b00ac4ee892acb757b7fac55fa28a

Room aliases go from `8/15` to `15/15`, but looks like these fixes also
managed to fix about `4` other tests, which is a nice bonus :)

Signed-off-by: `Sam Wedgwood <sam@wedgwood.dev>`
2023-07-31 14:39:41 +01:00
Till Faelligen 79d4a0e399
Restore old behaviour of PurgeRoom 2023-07-26 09:09:04 +02:00
devonh c809e95335
Fix event federation with pseudoID rooms (#3156) 2023-07-21 16:08:40 +00:00
Sam Wedgwood 9582827493
de-MSC-ifying space summaries (MSC2946) (#3134)
- This PR moves and refactors the
[code](https://github.com/matrix-org/dendrite/blob/main/setup/mscs/msc2946/msc2946.go)
for
[MSC2946](https://github.com/matrix-org/matrix-spec-proposals/pull/2946)
('Space Summaries') to integrate it into the rest of the codebase.
- Means space summaries are no longer hidden behind an MSC flag
- Solves #3096

Signed-off-by: Sam Wedgwood <sam@wedgwood.dev>
2023-07-20 15:06:05 +01:00
Till 297479ea49
Use pointer when passing the connection manager around (#3152)
As otherwise existing connections aren't reused.
2023-07-19 13:37:04 +02:00
Till 5267cc0f54
Optimise getting local members and membership counts (#3150)
The previous version was getting **ALL** membership events (as
`ClientEvents`, so going through `NewEventFromTrustedJSONWithID`) for a
given room.
Now we are querying only locally joined users as `ClientEvents`, which
should **significantly** reduce allocations.

Take for example a large room with 2k membership events, but only 1
local user - avoiding 1999 `NewEventFromTrustedJSONWithID` calls just to
calculate the `roomSize` which we can also query by other means.

This is also getting called for every `OutputRoomEvent` in the userAPI.

Benchmark with 1 local user and 100 remote users.
```
pkg: github.com/matrix-org/dendrite/userapi/consumers
cpu: 12th Gen Intel(R) Core(TM) i5-12500H
                    │   old.txt   │               new.txt               │
                    │   sec/op    │   sec/op     vs base                │
LocalRoomMembers-16   375.9µ ± 7%   327.6µ ± 6%  -12.85% (p=0.000 n=10)

                    │    old.txt    │               new.txt                │
                    │     B/op      │     B/op      vs base                │
LocalRoomMembers-16   79.426Ki ± 0%   8.507Ki ± 0%  -89.29% (p=0.000 n=10)

                    │   old.txt   │              new.txt               │
                    │  allocs/op  │ allocs/op   vs base                │
LocalRoomMembers-16   1015.0 ± 0%   277.0 ± 0%  -72.71% (p=0.000 n=10)
```
2023-07-13 14:19:08 +02:00
Till eb9e90379d
Add event size checks similar to Synapse (#3140)
Companion to https://github.com/matrix-org/gomatrixserverlib/pull/400
This tries to mimic the logic found in Synapse, as dropping events can
break rooms (and we may end up in endless loops..)
2023-07-07 20:37:23 +02:00
devonh d507c5fc95
Add pseudoID compatibility to Invites (#3126) 2023-07-06 15:15:24 +00:00
Till Faelligen 5a87c703fa
Fix metrics.. 2023-07-05 12:34:53 +02:00
Till 4c3a526e1b
Fix adding state events to the database (#3133)
When we're adding state to the database, we check which eventNIDs are
already in a block, if we already have that eventNID, we remove it from
the list. In its current form we would skip over eventNIDs in the case
we already found a match (we're decrementing `i` twice)
My theory is, that when we later get the state blocks, we are receiving
"too many" eventNIDs (well, yea, we stored too many), which may or may
not can result in state resets when comparing different state snapshots.
(e.g. when adding state we stored a eventNID by accident because we
skipped it, later we add more state and are not adding it because we
don't skip it)
2023-07-04 17:15:44 +02:00
Till Faelligen 939ee325f8
Actually use the parameter 2023-06-29 18:02:11 +02:00
Till 23cd7877a1
Add MXIDMapping for pseudoID rooms (#3112)
Add `MXIDMapping` on membership events when
creating/joining rooms.
2023-06-28 20:29:49 +02:00
Till a734b112c6
Fix backfilling (#3117)
This should fix two issues with backfilling:
1. right after creating and joining a room over federation, we are doing
a `/backfill` request, which would return redacted events, because the
`authEvents` are empty. Even though the spec states that, in the absence
of a history visibility event, it should be handled as `shared`.
2. `gomatrixserverlib: unsupported room version ''` - because, well, we
were never setting the `roomInfo` field..
2023-06-20 16:52:29 +02:00
Devon Hudson 8cf6c381e2
Fix senderID/key conversion unit tests 2023-06-14 17:11:27 +01:00
Devon Hudson 3f4df25b31
Add missing dep 2023-06-14 17:04:19 +01:00
Devon Hudson 5aaa539e3e
Fix senderID/key conversions 2023-06-14 16:42:09 +01:00