Commit graph

202 commits

Author SHA1 Message Date
Daniel Aloni c550c2e8cb Merge remote-tracking branch 'origin' into release/upstream-v0.12.0 2023-03-15 12:30:11 +02:00
Till 70322699ab
Unset RoomServerEvent, since we can't be sure that Set actually updates the cached entry (#3002)
This should deflake UTs and be more correct in terms of getting
`Events`.
`Events` tries to fetch the event from the cache first and may get an
unredacted event from it, while it should already be redacted.
2023-03-09 09:52:13 +01:00
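The reasoning in the commit above (70322699ab) can be illustrated with a small sketch. It assumes a cache whose `Set` may silently decline a write, as caches with an admission policy can, so the reliable way to stop serving an unredacted event is to remove the entry and let the next read fall through to the database. Every name here (`lossyCache`, `fetchEvent`, the `rejectSet` flag) is invented for illustration and is not Dendrite's API.

```go
package main

import "fmt"

// lossyCache is a hypothetical stand-in for a cache with an admission
// policy: Set may be declined, while Unset always takes effect.
type lossyCache struct {
	entries   map[int64]string
	rejectSet bool // simulates the cache declining to admit a write
}

func (c *lossyCache) Set(nid int64, ev string) {
	if c.rejectSet {
		return // the write is dropped and the old entry stays in place
	}
	c.entries[nid] = ev
}

func (c *lossyCache) Unset(nid int64) { delete(c.entries, nid) }

func (c *lossyCache) Get(nid int64) (string, bool) {
	ev, ok := c.entries[nid]
	return ev, ok
}

// fetchEvent reads through the cache and falls back to the database copy.
func fetchEvent(c *lossyCache, db map[int64]string, nid int64) string {
	if ev, ok := c.Get(nid); ok {
		return ev
	}
	return db[nid]
}

func main() {
	db := map[int64]string{1: "redacted"}
	c := &lossyCache{entries: map[int64]string{1: "unredacted"}, rejectSet: true}

	c.Set(1, "redacted")              // dropped by the cache
	fmt.Println(fetchEvent(c, db, 1)) // still the stale, unredacted copy

	c.Unset(1)                        // always removes the entry
	fmt.Println(fetchEvent(c, db, 1)) // now served from the database
}
```

Deleting costs one extra database read on the next fetch, but it does not depend on the cache accepting the new value.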
Till Faelligen 56b28b01db
Update the cache with the redacted event 2023-03-03 14:49:41 +01:00
Till 9bcd0a2105
Make redaction check easier to read (#2995)
We need to check the redaction PL in Dendrite; if we do it in GMSL, we
end up not sending the event to the output stream because it will be
rejected.

---------

Co-authored-by: kegsay <kegan@matrix.org>
2023-03-03 14:03:17 +01:00
Till 6c20f8f742
Refactor StoreEvent, add MaybeRedactEvent, create an EventDatabase (#2989)
This PR changes the following:
- `StoreEvent` now only stores an event (and possibly prev event),
instead of also doing redactions
- Adds a `MaybeRedactEvent` (pulled out from `StoreEvent`), which should
be called after storing events
- a few other things
2023-03-01 17:06:47 +01:00
Till Faelligen 3d31b131fc
Cache all the things 2023-02-24 11:45:01 +01:00
Till ad07b169b8
Refactor StoreEvent and create a new RoomDatabase interface (#2985)
This PR changes a few things:
- It pulls out the creation of several NIDs from the `StoreEvent`
function to make the functions more reusable
- Uses more caching when using those NIDs to avoid DB round trips
2023-02-24 09:40:20 +01:00
Till eb29a31550
Optimize /sync and history visibility (#2961)
Should fix the following issues, or at least make them much less severe,
when using Postgres:

The main issue behind #2911: The client gives up after a certain time,
causing a cascade of context errors, because the response couldn't be
built up fast enough. This mostly happens on accounts with many rooms,
due to the inefficient way we're getting recent events and current state

For #2777: The queries for getting the membership events for history
visibility were being executed for each room (I think 185?), resulting
in a whopping 2k queries for membership events. (Getting the
statesnapshot -> block nids -> actual wanted membership event)

Both should now be better by:
- Using a LATERAL join to get all recent events for all joined rooms in
one go (TODO: maybe do the same for room summary and current state etc)
- If we're lazy loading on initial syncs, we're now not getting the
whole current state, just to drop the majority of it because we're lazy
loading members - we add a filter to exclude membership events on the
first call to `CurrentState`.
- Using an optimized query to get the membership events needed to
calculate history visibility

---------

Co-authored-by: kegsay <kegan@matrix.org>
2023-02-07 14:31:23 +01:00
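The lazy-loading point in the commit above (eb29a31550) can be sketched as follows. This is an in-memory illustration only: in the commit the exclusion is described as a filter passed on the first call to `CurrentState`, whereas here a hypothetical `currentState` function filters a slice. The `stateEvent` and `stateFilter` types and the `initialSyncFilter` helper are assumptions made for the example; `m.room.member` is the Matrix membership event type.

```go
package main

import "fmt"

// stateEvent is a hypothetical, minimal state event.
type stateEvent struct {
	Type     string
	StateKey string
}

// stateFilter is a hypothetical filter; NotTypes lists event types to skip.
type stateFilter struct {
	NotTypes []string
}

// currentState returns the room's state events, skipping filtered types up
// front instead of loading everything and discarding most of it afterwards.
func currentState(all []stateEvent, f stateFilter) []stateEvent {
	skip := make(map[string]struct{}, len(f.NotTypes))
	for _, t := range f.NotTypes {
		skip[t] = struct{}{}
	}
	out := make([]stateEvent, 0, len(all))
	for _, ev := range all {
		if _, excluded := skip[ev.Type]; excluded {
			continue
		}
		out = append(out, ev)
	}
	return out
}

// initialSyncFilter excludes membership events when lazy loading is enabled.
func initialSyncFilter(lazyLoadMembers bool) stateFilter {
	if lazyLoadMembers {
		return stateFilter{NotTypes: []string{"m.room.member"}}
	}
	return stateFilter{}
}

func main() {
	state := []stateEvent{
		{Type: "m.room.create"},
		{Type: "m.room.member", StateKey: "@alice:example.org"},
		{Type: "m.room.member", StateKey: "@bob:example.org"},
		{Type: "m.room.name"},
	}
	fmt.Println(len(currentState(state, initialSyncFilter(true))))  // 2
	fmt.Println(len(currentState(state, initialSyncFilter(false)))) // 4
}
```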
devonh 4738fe656f
Roomserver published pkey migration (#2960)
Adds a missed migration to update the primary key on the
roomserver_published table in postgres.
Primary key was changed in #2836.
2023-02-01 16:32:31 +00:00
Neil 738686ae68
Add /_dendrite/admin/purgeRoom/{roomID} (#2662)
This adds a new admin endpoint `/_dendrite/admin/purgeRoom/{roomID}`. It
completely erases all database entries for a given room ID.

The roomserver will start by clearing all data for that room and then
will generate an output event to notify downstream components (i.e. the
sync API and federation API) to do the same.

It does not currently clear media and it is currently not implemented
for SQLite since it relies on SQL array operations right now.

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
Co-authored-by: Till Faelligen <2353100+S7evinK@users.noreply.github.com>
2023-01-19 21:02:32 +01:00
danielaloni ac514b406c 🐛 Migration to have the correct composite primary key in roomserver_published. 2023-01-04 13:19:01 +02:00
danielaloni c1b2f2514d Merge remote-tracking branch 'origin' into release/upstream_v0.10.8 2023-01-04 10:21:08 +02:00
Till 7d2344049d
Cleanup stale device lists for users we don't share a room with anymore (#2857)
The stale device lists table might contain entries for users we don't
share a room with anymore. This now asks the roomserver about left users
and removes those entries from the table.

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-12-12 08:20:59 +01:00
Till 2a77a910eb
Handle remote room upgrades (#2866)
Makes the following tests pass
```
/upgrade moves remote aliases to the new room
Local and remote users' homeservers remove a room from their public directory on upgrade
```
2022-11-14 12:07:13 +00:00
Till 1e79b0557e
Use a writer to assign state key NIDs (#2877) 2022-11-14 12:06:27 +00:00
Till Faelligen e177e0ae73
Fix oops, add simple UT 2022-11-11 16:44:59 +01:00
Till c648c671a3
Fix issue with missing user NIDs (#2874)
This should fix #2696 and possibly other related issues regarding
missing user NIDs.
(https://github.com/matrix-org/dendrite/issues/2094?)
2022-11-11 10:52:43 +01:00
danielaloni 843f180cc9 Merge remote-tracking branch 'origin' into release/upstream-0.10.6 2022-11-03 13:25:17 +02:00
Neil Alexander 6663728eb1
Fix SQLite roomserver_published migration 2022-11-01 16:08:13 +00:00
Till 2acc1d65fb
Optimize history visibility checks (#2848)
This optimizes history visibility checks by (mostly) avoiding database
hits.
Possibly solves https://github.com/matrix-org/dendrite/issues/2777

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-11-01 15:07:17 +00:00
Till Faelligen a785532463
Fix upgrade appservices 2022-10-27 16:01:51 +02:00
Till 444b4bbdb8
Add AS specific public room list endpoints (#2836)
Adds `PUT
/_matrix/client/v3/directory/list/appservice/{networkId}/{roomId}` and
`DELETE
/_matrix/client/v3/directory/list/appservice/{networkId}/{roomId}`
support, as well as the ability to filter `/publicRooms` on networkID
and including all networks.
2022-10-27 14:40:35 +02:00
Piotr Kozimor b2fcf0e4d9 Merge branch 'main' into release/upstream-0.10.3 2022-10-18 13:20:47 +02:00
Till 3c1474f68f
Fix /get_missing_events for rooms with joined/invited history_visibility (#2787)
Sytest was using a wrong `history_visibility` for `invited`
(https://github.com/matrix-org/sytest/pull/1303), so `invited` was
passing for the wrong reason (-> defaulted to `shared`, as `invite`
wasn't understood).
This change now handles missing events like Synapse: if a server isn't
allowed to see the event, it gets a redacted version of it, making the
`get_missing_events` tests pass.
2022-10-11 16:04:02 +02:00
Till 1ca3f3efb5
Fix issue with DMs shown as normal rooms (#2776)
Fixes #2121, test added in
https://github.com/matrix-org/complement/pull/494
2022-10-07 16:00:12 +02:00
Neil Alexander 8e231130e9
Revert "tDatabase transaction tweaks in roomserver"
This reverts commit 8d8f4689a0.
2022-10-07 14:05:06 +01:00
Neil Alexander 8d8f4689a0
tDatabase transaction tweaks in roomserver 2022-10-07 12:21:55 +01:00
danielaloni 1a5c48b9d0 Merge branch 'main' into release/upstream-0.10.1 2022-10-06 16:35:13 +03:00
Neil Alexander c85bc3434f
Optimise QuerySharedUsers so that we can only work on local users (#2766)
Otherwise the sync API key change consumer wastes a lot of time trying
to wake up the notifiers for non-local users.
2022-10-05 12:47:53 +01:00
Neil Alexander f022fc1397
Remove origin field from PDUs (#2737)
This nukes the `origin` field from PDUs as per
matrix-org/matrix-spec#998, matrix-org/gomatrixserverlib#341.
2022-09-26 17:35:35 +01:00
Till 100fa9b235
Check unique constraint errors when manually inserting migrations (#2712)
This should avoid unnecessary logging on startup if the migration (where
we need `InsertMigration`) was already executed.
This now checks for "unique constraint errors" for SQLite and Postgres
and fails the startup process if the migration couldn't be manually
inserted for some other reason.
2022-09-13 08:07:43 +02:00
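The behaviour described in the commit above (100fa9b235) can be sketched roughly as below. This is a simplified illustration, not the code from the commit: it recognises unique-constraint violations by matching the message text that SQLite and Postgres produce, whereas production code would more likely inspect the driver's typed error or error code. The `recordMigration` helper, the sample errors, and the table and constraint names in them are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Sample driver-style errors used below. The table and constraint names
// are made up for this sketch.
var (
	errSQLiteDup   = errors.New("UNIQUE constraint failed: db_migrations.version")
	errPostgresDup = errors.New(`pq: duplicate key value violates unique constraint "db_migrations_pkey"`)
	errOther       = errors.New("connection refused")
)

// isUniqueConstraintErr reports whether err looks like a unique-constraint
// violation. Matching on message text is a simplification for this sketch.
func isUniqueConstraintErr(err error) bool {
	if err == nil {
		return false
	}
	msg := err.Error()
	return strings.Contains(msg, "UNIQUE constraint failed") || // SQLite
		strings.Contains(msg, "violates unique constraint") // Postgres
}

// recordMigration is a hypothetical wrapper around a manual migration
// insert: an "already recorded" error is ignored quietly, anything else is
// returned so that startup can fail.
func recordMigration(insert func() error) error {
	err := insert()
	if err == nil || isUniqueConstraintErr(err) {
		return nil
	}
	return fmt.Errorf("unable to record migration: %w", err)
}

func main() {
	fmt.Println(recordMigration(func() error { return errSQLiteDup }))   // <nil>
	fmt.Println(recordMigration(func() error { return errPostgresDup })) // <nil>
	fmt.Println(recordMigration(func() error { return errOther }) != nil) // true
}
```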
Neil Alexander c0e17bbe1b
Fix transactions around assigning NIDs 2022-09-09 13:30:09 +01:00
Till 8196b29657
Change detection of already executed migrations (#2665)
This changes the detection of already executed migrations for the
roomserver state block and keychange refactor. It now uses schema tables
provided by the database engine to check if the column was already
removed. We now also store the migration in the migrations table.

This should stop e.g. Postgres from logging errors like `ERROR: column
"event_nid" does not exist at character 8`.
2022-09-09 13:14:52 +01:00
PiotrKozimor 387868e65d
Upstream release v0.9.5 2022-08-26 17:56:12 +02:00
Neil Alexander 522bd2999f
Allow un-rejecting events on reprocessing 2022-08-24 14:03:06 +01:00
Neil Alexander 14fea600bb
Detect types.MissingStateError in CheckServerAllowedToSeeEvent (#2667)
This will hopefully stop some 500 errors on `/event` where there is no state-before known.
2022-08-23 13:57:11 +01:00
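One way to detect a specific error type, as the commit above (14fea600bb) describes, is Go's `errors.As`. The sketch below is an assumption about the general shape rather than the code from the commit: `missingStateError` is a stand-in for `types.MissingStateError`, `loadStateBefore` and `serverAllowedToSee` are hypothetical, and treating missing state as "not allowed" instead of an internal error is this example's choice.

```go
package main

import (
	"errors"
	"fmt"
)

// missingStateError is a hypothetical stand-in for types.MissingStateError.
type missingStateError string

func (e missingStateError) Error() string { return string(e) }

// loadStateBefore is a hypothetical lookup that fails with a wrapped
// missingStateError when no state-before is known for the event.
func loadStateBefore(known map[string]bool, eventID string) error {
	if !known[eventID] {
		return fmt.Errorf("loading state for %s: %w", eventID,
			missingStateError("no state-before known"))
	}
	return nil
}

// serverAllowedToSee treats missing state as "not allowed" instead of
// surfacing it as an internal error (an assumption made for this sketch).
func serverAllowedToSee(known map[string]bool, eventID string) (bool, error) {
	err := loadStateBefore(known, eventID)
	var missing missingStateError
	if errors.As(err, &missing) {
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	known := map[string]bool{"$a": true}
	fmt.Println(serverAllowedToSee(known, "$a")) // true <nil>
	fmt.Println(serverAllowedToSee(known, "$b")) // false <nil>
}
```

Because `errors.As` walks the wrap chain, the check still works when intermediate layers add context with `%w`.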
PiotrKozimor d15a4e4a61
Upstream release v0.9.4 2022-08-22 18:03:50 +02:00
Piotr Kozimor 4aaa80a56e Merge branch 'main' into release/upstream 2022-08-22 14:45:25 +02:00
Neil Alexander 6b48ce0d75
State handling tweaks (#2652)
This tweaks how rejected events are handled in room state and avoids applying checks to outliers that we can't complete.
2022-08-18 17:06:13 +01:00
Neil Alexander 59bc0a6f4e
Reprocess rejected input events (#2647)
* Reprocess outliers that were previously rejected

* Might as well do all events this way

* More useful errors

* Fix queries

* Tweak condition

* Don't wrap errors

* Report more useful error

* Flatten error on `r.Queryer.QueryStateAfterEvents`

* Some more debug logging

* Flatten error in `QueryRestrictedJoinAllowed`

* Revert "Flatten error in `QueryRestrictedJoinAllowed`"

This reverts commit 1238b4184c.

* Tweak `QueryStateAfterEvents`

* Handle MissingStateError too

* Scope to room

* Clean up

* Fix the error

* Only apply rejection check to outliers
2022-08-18 10:37:47 +01:00
Till 03ddd98f5e
Fix issues with migrations not getting executed (#2628)
* Fix issues with migrations not getting executed

* Check actual postgres error

* Return error if it's not "column does not exist"
2022-08-08 10:18:57 +02:00
Till 1b7f84250a
Fix linter issues (#2624)
* Try that again

* All hail the mighty linter?

* And once again

* goimport all the things
2022-08-05 11:12:41 +02:00
Piotr Kozimor 9aceb04b98 Run gofmt over the code 2022-08-05 10:42:14 +02:00
Piotr Kozimor d98c33e733 Merge branch 'main' into release/upstream 2022-08-05 10:25:30 +02:00
Neil Alexander 2250768be1
Remove roominfo cache (#2615)
* Remove roominfo cache

It's the source of a number of race conditions which are seemingly causing bugs and CI failures.

* Make the linter less sad
2022-08-03 17:14:21 +01:00
PiotrKozimor 15cfeb16aa
Upstream release v0.9.0 (#18)
* Correctly redact events over federation (#2526)

* Ensure we check powerlevel/origin before redacting an event

* Add passing test

* Use pl.UserLevel

* Make check more readable, also check for the sender

* Add new next steps page to the documentation

* Highlighting in docs

* Rename the page to "Optimise your installation"

* Attempt to raise the file descriptor limit at startup (#2527)

* Add `--difference` to `resolve-state` tool

* Make the linter happy again

* generic CaddyFile in front of Dendrite (monolith) (#2531)

for Caddy 2.5.x

Co-authored-by: emanuele.aliberti <emanuele.aliberti@mtka.eu>

* Handle state before, send history visibility in output (#2532)

* Check state before event

* Tweaks

* Refactor a bit, include in output events

* Don't waste time if soft failed either

* Tweak control flow, comments, use GMSL history visibility type

* Fix rare panic when returning user devices over federation (#2534)

* Add `InputDeviceListUpdate` to the keyserver, remove old input API (#2536)

* Add `InputDeviceListUpdate` to the keyserver, remove old input API

* Fix copyright

* Log more information when a device list update fails

* Fix nats.go commit (#2540)

Signed-off-by: Jean Lucas <jean@4ray.co>

* Don't return `end` if there are no more messages (#2542)

* Be more spec compliant

* Move lazyLoadMembers to own method

* Return an error if trying to invite a malformed user ID (#2543)

* Add `evacuateUser` endpoint, use it when deactivating accounts (#2545)

* Add `evacuateUser` endpoint, use it when deactivating accounts

* Populate the API

* Clean up user devices when deactivating

* Include invites, delete pushers

* Silence presence logs (#2547)

* Blacklist `Guest users can join guest_access rooms` test until it can be investigated

* Disable WebAssembly builds for now

* Try to fix backfilling (#2548)

* Try to fix backfilling

* Return start/end to not confuse clients

* Update GMSL

* Update GMSL

* Roomserver producers package (#2546)

* Give the roomserver a producers package

* Change init point

* Populate ACLs API

* Fix build issues

* `RoomEventProducer` naming

* Version 0.8.9 (#2549)

* Version 0.8.9

* Update changelog

* feat+fix: Ignore unknown keys and verify required fields are present in appservice registration files (#2550)

* fix: ignore unknown keys in appservice configs

fixes matrix-org/dendrite#1567

* feat: verify required fields in appservice configs

* Use new testrig for key changes tests (#2552)

* Use new testrig for tests

* Log the error message

* Fix QuerySharedUsers for the SyncAPI keychange consumer (#2554)

* Make more use of base.BaseDendrite

* Fix QuerySharedUsers if no UserIDs are supplied

* Return clearer error when no state NID exists for an event (#2555)

* Wrap error from `SnapshotNIDFromEventID`

* Hopefully fix read receipts timestamps (#2557)

This should avoid coercions between signed and unsigned ints which might fix problems like `sql: converting argument $5 type: uint64 values with high bit set are not supported`.

* Fix nil pointer access when redacting events (#2560)

* Fix issue `uint64 values with high bit are not supported` in presence (#2562)

* Fix issue #2528

* Use gomatrixserverlib.Timestamp

* Use ParseUint instead of ParseInt

* Update Pinecone to matrix-org/pinecone@1ce778f

* Ristretto cache (#2563)

* Try Ristretto cache

* Tweak

* It's beautiful

* Update GMSL

* More strict keyable interface

* Fix that some more

* Make less panicky

* Don't enforce mutability checks for now

* Determine mutability using deep equality

* Tweaks

* Namespace keys

* Make federation caches mutable

* Update cost estimation, add metric

* Update GMSL

* Estimate cost for metrics better

* Reduce counters a bit

* Try caching events

* Some guards

* Try again

* Try this

* Use separate caches for hopefully better hash distribution

* Fix bug with admitting events into cache

* Try to fix bugs

* Check nil

* Try that again

* Preserve order jeezo this is messy

* thanks VS Code for doing exactly the wrong thing

* Try this again

* Be more specific

* aaaaargh

* One more time

* That might be better

* Stronger sorting

* Cache expiries, async publishing of EDUs

* Put it back

* Use a shared cache again

* Cost estimation fixes

* Update ristretto

* Reduce counters a bit

* Clean up a bit

* Update GMSL

* 1GB

* Configurable cache sizes

* Tweaks

* Add `config.DataUnit` for specifying friendly cache sizes

* Various tweaks

* Update GMSL

* Add back some lazy loading caching

* Include key in cost

* Include key in cost

* Tweak max age handling, config key name

* Only register prometheus metrics if requested

* Review comments @S7evinK

* Don't return errors when creating caches (it is better just to crash since otherwise we'll `nil`-pointer exception everywhere)

* Review comments

* Update sample configs

* Update GHA Workflow

* Update Complement images to Go 1.18

* Remove the cache test from the federation API as we no longer guarantee immediate cache admission

* Don't check the caches in the renewal test

* Possibly fix the upgrade tests

* Update to matrix-org/gomatrixserverlib#322

* Update documentation to refer to Go 1.18

* Minor SendToDevice fix (#2565)

* Avoid unnecessary marshalling if sending to the local server

* Fix ordering of ToDevice messages

* Revive SendToDevice test

* Use `/v3` to request media from remote servers (update to matrix-org/gomatrixserverlib#324)

* Pointerise `types.RoomInfo` in the cache so we can update it in-place in the latest events updater

* Add a Troubleshooting page

* Update `sytest-whitelist`

* Use sync API database in `filterSharedUsers` (#2572)

* Add function to the sync API storage package for filtering shared users

* Use the database instead of asking the RS API

* Fix unit tests

* Fix map handling in `filterSharedUsers`

* Update 1_createusers.md (#2571)

* Update 1_createusers.md

Added description on how to create user accounts when running in docker.

* Update 1_createusers.md

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>

* Fix connection_string format in dendrite-sample.polylith.yaml (#2574)

* History visibility database changes (#2533)

* Add new history_visibility column

* Update SQL queries to include history_visibility

* Store the history visibility calculated by the roomserver

* Update GMSL

* Update migrations

* Fix migration

* Update GMSL

* Fix `go.sum`

* Update GMSL to use sql.Scanner & sql.Valuer

* Re-order migration/table creation

* Update gomatrixserverlib

* Add history_visibility column to current_room_state

* Fix migrations

* Return error instead of Fatal log

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>

* Tweak cache counters (#2575)

* Tweak cache counters

This makes the number of counters relative to the
maximum cache size. Since the counters
effectively manage the size of the bloom filter,
larger caches need more counters and smaller
caches need less.

10 counters per 1KB data means that the default
cache size of 1GB should result in a bloom filter
and TinyLRU admission set of about 16MB
estimated.

* Remove line left by accident

* Set historyVisibility in rowsToStreamEvents

* Update FAQ

* Add event state key cache (#2576)

* Explain how SRV works in Matrix and discourage using it (#2577)

* Explain how SRV works in Matrix and discourage using it

* Minor tweaks to formatting

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>

* Fix issue with membership event_nid being 0 (#2580)

* docs: Add build page; correct proxy info; fix Caddy example (#2579)

* Add build page; correct proxy info; fix Caddy example

* Improve Caddyfile example

* Apply review comments; add polylith Caddyfile

* Bump tzinfo from 1.2.9 to 1.2.10 in /docs (#2584)

Bumps [tzinfo](https://github.com/tzinfo/tzinfo) from 1.2.9 to 1.2.10.
- [Release notes](https://github.com/tzinfo/tzinfo/releases)
- [Changelog](https://github.com/tzinfo/tzinfo/blob/master/CHANGES.md)
- [Commits](https://github.com/tzinfo/tzinfo/compare/v1.2.9...v1.2.10)

---
updated-dependencies:
- dependency-name: tzinfo
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Membership updater refactoring (#2541)

* Membership updater refactoring

* Pass in membership state

* Use membership check rather than referring to state directly

* Delete irrelevant membership states

* We don't need the leave event after all

* Tweaks

* Put a log entry in that I might stand a chance of finding

* Be less panicky

* Tweak invite handling

* Don't freak if we can't find the event NID

* Use event NID from `types.Event`

* Clean up

* Better invite handling

* Placate the almighty linter

* Blacklist a Sytest which is otherwise fine under Complement for reasons I don't understand

* Fix the sytest after all (thanks @S7evinK for the spot)

* Try to fix HTTP 500s on `/members` (#2581)

* Update database migrations, remove goose (#2264)

* Add new db migration

* Update migrations
Remove goose

* Add possibility to test direct upgrades

* Try to fix WASM test

* Add checks for specific migrations

* Remove AddMigration
Use WithTransaction
Add Dendrite version to table

* Fix linter issues

* Update tests

* Update comments, outdent if

* Namespace migrations

* Add direct upgrade tests, skipping over one version

* Split migrations

* Update go version in CI

* Fix copy&paste mistake

* Use contexts in migrations

Co-authored-by: kegsay <kegan@matrix.org>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>

* Add .well-known/matrix/client to clientapi (#2551)

Signed-off-by: Jonathan Bartlett <jonathan@jonnobrow.co.uk>

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>

* Remove `room_id` field from MSC2946 stripped events (closes #2588)

* Remove `goose` from Dockerfiles

* Make the User API responsible for sending account data output events (#2592)

* Make the User API responsible for sending account data output events

* Clean up producer

* Review comments

* Update NATS Server and nats.go to use upstream

* Set CORS headers for HTTP 404 and 405 errors (#2599)

* Set CORS headers for the 404s

* Use custom handlers, plus one for HTTP 405 too

* Tweak setup

* Add to muxes too

* Tidy up some more

* Use built-in HTTP 404 handler

* Don't bother setting it for federation-facing

* Optimise checking other servers allowed to see events (#2596)

* Try optimising checking if server is allowed to see event

* Fix error

* Handle case where snapshot NID is 0

* Fix query

* Update SQL

* Clean up `CheckServerAllowedToSeeEvent`

* Not supported on SQLite

* Maybe placate the unit tests

* Review comments

* De-race `types.RoomInfo` (#2600)

* De-race `CompleteSync` (#2601)

The `err` was coming from outside of the goroutine and being written to by concurrent goroutines.

* Version 0.9.0 (#2602)

Co-authored-by: Till <2353100+S7evinK@users.noreply.github.com>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
Co-authored-by: Till Faelligen <davidf@element.io>
Co-authored-by: Emanuele Aliberti <dev@mtka.eu>
Co-authored-by: emanuele.aliberti <emanuele.aliberti@mtka.eu>
Co-authored-by: Jean Lucas <jean@4ray.co>
Co-authored-by: Kabir Kwatra <kabir@kwatra.me>
Co-authored-by: andreever <52261463+andreever@users.noreply.github.com>
Co-authored-by: Maximilian Gaedig <38767445+MaximilianGaedig@users.noreply.github.com>
Co-authored-by: Tulir Asokan <tulir@maunium.net>
Co-authored-by: Matt Holt <mholt@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kegsay <kegan@matrix.org>
Co-authored-by: Jonathan Bartlett <34320158+Jonnobrow@users.noreply.github.com>
2022-08-03 13:35:29 +02:00
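The `uint64 values with high bit set are not supported` error quoted in the read-receipt and presence fixes (#2557, #2562) within the squashed release above comes from Go's `database/sql` argument conversion. The sketch below reproduces it using only the standard library; the `bindable` helper is invented for this example and is not part of Dendrite.

```go
package main

import (
	"database/sql/driver"
	"fmt"
)

// bindable reports whether database/sql's default parameter converter
// accepts v as a query argument.
func bindable(v interface{}) bool {
	_, err := driver.DefaultParameterConverter.ConvertValue(v)
	return err == nil
}

func main() {
	// A negative signed value coerced to unsigned has its high bit set.
	neg := int64(-1)
	coerced := uint64(neg)

	fmt.Println(bindable(coerced))    // false: high bit set
	fmt.Println(bindable(uint64(42))) // true: fits in an int64
	fmt.Println(bindable(neg))        // true: int64 is passed through

	_, err := driver.DefaultParameterConverter.ConvertValue(coerced)
	fmt.Println(err)
}
```

Keeping a value in one integer type from parsing through to the query argument avoids the coercion that triggers the error.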
PiotrKozimor 8903184fe8
Return correct membership in GetMembership when user is invited (#19)
* Return correct membership in GetMembership when user is invited

* Update whitelist

* Restore logging to file

* Fix linter issues

* Attempt to fix presence
2022-08-03 12:51:00 +02:00
Neil Alexander ca3fa58388
Various roominfo tweaks (#2607) 2022-08-02 12:27:15 +01:00
Neil Alexander 119cde3766
De-race types.RoomInfo (#2600) 2022-08-01 15:29:19 +01:00
Neil Alexander 05c83923e3
Optimise checking other servers allowed to see events (#2596)
* Try optimising checking if server is allowed to see event

* Fix error

* Handle case where snapshot NID is 0

* Fix query

* Update SQL

* Clean up `CheckServerAllowedToSeeEvent`

* Not supported on SQLite

* Maybe placate the unit tests

* Review comments
2022-08-01 14:11:00 +01:00