Commit graph

251 commits

Author SHA1 Message Date
Neil Alexander 10e1d347e2
Use synchronous contexts, limit time to fetch missing events 2022-01-07 11:57:11 +00:00
Neil Alexander af34b4abe3
Reject instead of soft-fail, don't copy roominfo so much 2022-01-07 10:50:19 +00:00
Neil Alexander eff348bb69
Check missing state if not an outlier before storing the event 2022-01-07 09:47:53 +00:00
Neil Alexander 110ab7b8f3
Don't mark state events with zero snapshot NID as not existing 2022-01-06 17:22:46 +00:00
Neil Alexander ad19c2b81a
Merge branch 'master' into neilalexander/federationinput 2022-01-06 13:14:48 +00:00
S7evinK 161f145176
Add NATS JetStream support (#1866)
* Add NATS JetStream support
Update shopify/sarama

* Fix addresses

* Don't change Addresses in Defaults

* Update saramajetstream

* Add missing error check

Keep typing events for at least one minute

* Use all configured NATS addresses

* Update saramajetstream

* Try setting up with NATS

* Make sure NATS uses own persistent directory (TODO: make this configurable)

* Update go.mod/go.sum

* Jetstream package

* Various other refactoring

* Build fixes

* Config tweaks, make random jetstream storage path for CI

* Disable interest policies

* Try to sane default on jetstream base path

* Try to use in-memory for CI

* Restore storage/retention

* Update nats.go dependency

* Adapt changes to config

* Remove unneeded TopicFor

* Dep update

* Revert "Remove unneeded TopicFor"

This reverts commit f5a4e4a339.

* Revert changes made to streams

* Fix build problems

* Update nats-server

* Update go.mod/go.sum

* Roomserver input API queuing using NATS

* Fix topic naming

* Prometheus metrics

* More refactoring to remove saramajetstream

* Add missing topic

* Don't try to populate map that doesn't exist

* Roomserver output topic

* Update go.mod/go.sum

* Message acknowledgements

* Ack tweaks

* Try to resume transaction re-sends

* Try to resume transaction re-sends

* Update to matrix-org/gomatrixserverlib@91dadfb

* Remove internal.PartitionStorer from components that don't consume keychanges

* Try to reduce re-allocations a bit in resolveConflictsV2

* Tweak delivery options on RS input

* Publish send-to-device messages into correct JetStream subject

* Async and sync roomserver input

* Update dendrite-config.yaml

* Remove roomserver tests for now (they need rewriting)

* Remove roomserver test again (was merged back in)

* Update documentation

* Docker updates

* More Docker updates

* Update Docker readme again

* Fix lint issues

* Send final event in `processEvent` synchronously (since this might stop Sytest from being so upset)

* Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that

* Go 1.16 instead of Go 1.13 for upgrade tests and Complement

* Revert "Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that"

This reverts commit 368675283f.

* Don't report any errors on `/send` to see what fun that creates

* Fix panics on closed channel sends

* Enforce state key matches sender

* Do the same for leave

* Various tweaks to make tests happier

Squashed commit of the following:

commit 13f9028e7a
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 15:47:14 2022 +0000

    Do the same for leave

commit e6be7f05c3
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 15:33:42 2022 +0000

    Enforce state key matches sender

commit 85ede6d64b
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 14:07:04 2022 +0000

    Fix panics on closed channel sends

commit 9755494a98
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 13:38:22 2022 +0000

    Don't report any errors on `/send` to see what fun that creates

commit 3bb4f87b5d
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 13:00:26 2022 +0000

    Revert "Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that"

    This reverts commit 368675283f.

commit fe2673ed7b
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 12:09:34 2022 +0000

    Go 1.16 instead of Go 1.13 for upgrade tests and Complement

commit 368675283f
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 11:51:45 2022 +0000

    Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that

commit b028dfc085
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 10:29:08 2022 +0000

    Send final event in `processEvent` synchronously (since this might stop Sytest from being so upset)

* Merge in NATS Server v2.6.6 and nats.go v1.13 into the in-process connection fork

* Add `jetstream.WithJetStreamMessage` to make ack/nak-ing less messy, use process context in consumers

* Fix consumer component name in  federation API

* Add comment explaining where streams are defined

* Tweaks to roomserver input with comments

* Finish that sentence that I apparently forgot to finish in INSTALL.md

* Bump version number of config to 2

* Add comments around asynchronous sends to roomserver in processEventWithMissingState

* More useful error message when the config version does not match

* Set version in generate-config

* Fix version in config.Defaults

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-01-05 17:44:49 +00:00
Neil Alexander 8faf5bb041
Add verifier 2021-12-16 10:21:38 +00:00
Neil Alexander df6a60f35e
Don't get stuck on mutexes: 2021-12-15 15:45:53 +00:00
Neil Alexander 2a46752d0b
Use event origin failing all else 2021-12-15 14:05:35 +00:00
Neil Alexander 2203dd9d8a
Sorta transplanted the code over 2021-12-15 13:59:58 +00:00
Neil Alexander 69111355d0
Logging 2021-12-14 12:38:47 +00:00
Neil Alexander 18cef55d04
Look for missing auth events in RS input 2021-12-14 11:38:08 +00:00
Neil Alexander 3113210f17
Fix keyring regressions in previous P2P demo 2021-12-13 13:24:49 +00:00
Neil Alexander c3dda0779d
Return event NID from StoreEvent, match PSQL vs SQLite behaviour, tweak backfill persistence (#2071) 2021-12-09 15:03:26 +00:00
Neil Alexander c9419e51af
Don't populate config defaults where it doesn't make sense (#2058)
* Don't populate config defaults where it doesn't make sense

* Fix dendritejs builds
2021-11-24 11:57:39 +00:00
Neil Alexander ec716793eb
Merge federationapi, federationsender, signingkeyserver components (#2055)
* Initial federation sender -> federation API refactoring

* Move base into own package, avoids import cycle

* Fix build errors

* Fix tests

* Add signing key server tables

* Try to fold signing key server into federation API

* Fix dendritejs builds

* Update embedded interfaces

* Fix panic, fix lint error

* Update configs, docker

* Rename some things

* Reuse same keyring on the implementing side

* Fix federation tests, `NewBaseDendrite` can accept freeform options

* Fix build

* Update create_db, configs

* Name tables back

* Don't rename federationsender consumer for now
2021-11-24 10:45:23 +00:00
Neil Alexander 6e93531e94
Don't persist transaction IDs in the roomserver (#2048) 2021-11-22 09:13:12 +00:00
Neil Alexander 403498a85b
Only return non-stub rooms from GetKnownRooms (#2049)
* Only return non-stub rooms from `GetKnownRooms`

This should stop a bunch of errors at startup with invalid server ACLs.

* Fix query
2021-11-18 11:34:19 +00:00
Neil Alexander b9a575919a
Try to reduce re-allocations a bit in resolveConflictsV2 2021-11-04 10:55:07 +00:00
PiotrKozimor dec05c3347
Run gofmt on dendrite - apply go 1.17 preferred build tags (#2021) 2021-11-02 16:48:48 +00:00
Neil Alexander b99f594a93
Fix #2028 (#2036) 2021-11-02 16:47:39 +00:00
Ryan W a624eab309
- Removed double imports (#1989)
- Lower cased error messages

Signed-off-by: Ryan Whittington <twentybitdev@gmail.com>

Co-authored-by: kegsay <kegan@matrix.org>
2021-09-08 17:31:03 +01:00
kegsay 7dc8fb1fe7
Add more logs (#2005)
* Add more logs

To help debug the migration issue in #1924 along with manual data-loss-inducing fixes.
Also log the origin server on processed txns to help debug buggy server origins.

* Fix query
2021-09-07 15:07:14 +01:00
Neil Alexander 51b119107c
Don't return nonsense canonical room aliases in the public rooms responses (#1992) 2021-08-27 16:50:30 +01:00
kegsay 4cc8b28b7f
Ensure all create events have a snapshot NID of 0 (#1961)
Fixes #1924 for postgres users, though the underlying cause of why
they aren't 0 in the first place is unresolved.
2021-08-04 17:48:23 +01:00
kegsay ed04eed441
Fix sqlite migration issues (#1960)
* Do not store 'null' in the database for empty JSON arrays

This can cause issues, though it should be noted that the majority
of the time this will marshal/unmarshal just fine, see
https://play.golang.org/p/Doe2NZUgv7Q

* bugfix: sqlite migration should handle create events as having no 'before' snapshot

The state snapshot for any given event in the roomserver represents the state _before_
the event. For the create event, this is nothing, so the state snapshot nid should be 0.

In some cases this wasn't happening, resulting in a nice mix of possible options including:
 - A state snapshot without any state blocks `[]` or `null`.
 - A state snapshot with a single state block with a single event, the create event, causing
   a circular loop. This is incorrect as it represents the state before the event, not after.

* Add state key check
2021-08-04 17:08:17 +01:00
Kegan Dougal ed4097825b Factor out StatementList to sqlutil and use it in userapi
It helps with the boilerplate.
2021-07-28 18:30:04 +01:00
kegsay 16bf94f239
Not finding the snapshot is not fatal (#1940) 2021-07-26 12:30:44 +01:00
Neil Alexander 39e8d1cc6f
Track knocking in membership updater (#1935)
* Topologically sort outliers in SendEventWithState

* Knock in membership updater

* Update gomatrixserverlib

* Update gomatrixserverlib

* Get the NID of the knock event properly for the membership updater
2021-07-22 12:26:58 +01:00
Neil Alexander c1447a58e5
Various alias fixes (#1934)
* Generate m.room.canonical_alias instead of legacy m.room.aliases

* Add omitempty tags

* Add aliases endpoint to client API

* Check power levels when setting aliases

* Don't return null on /aliases

* Don't return error if the state event fails

* Update sytest-whitelist

* Don't send updated m.room.canonical_alias events

* Don't check PLs after all because for local aliases they are apparently irrelevant

* Fix some bugs

* Allow deleting a local alias with enough PL

* Fix some more bugs

* Update sytest-whitelist

* Fix copyright notices

* Review comments
2021-07-21 16:53:50 +01:00
Neil Alexander f0f8c7f055
Optimise QueryServerJoinedToRoom (#1933)
* Optimise checking if a server is in a room

* Fix queries

* Fix queries
2021-07-21 13:06:32 +01:00
Neil Alexander f63068df3b
Only include go-sqlite3 on the relevant binaries (#1900)
* Only include go-sqlite3 on the relevant binaries

* The driver name is always sqlite3 now

* Update to matrix-org/go-sqlite3-js@e537baa
2021-07-20 11:18:14 +01:00
David Spenler 8d8fe485b4
Fix failing ban tests (#1884)
* Add room membership and powerlevel checks for func SendBan

* Added non-error return to func GetStateEvent when no state events with the specified state key are found

* Add passing tests to whitelist

* Fixed formatting

* Update roomserver/storage/shared/storage.go

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
Co-authored-by: kegsay <kegan@matrix.org>
Co-authored-by: kegsay <kegsay@gmail.com>
2021-07-19 18:33:05 +01:00
Neil Alexander 09d3bab838
Metric fixes
Squashed commit of the following:

commit c6eb4d8bbf80320ec2b6d416c77659b0343e5e47
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Mon Jul 19 16:52:57 2021 +0100

    Fix bug

commit d420966d9ac44936728960a8d38602662b58f1c3
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Mon Jul 19 16:46:12 2021 +0100

    Update metric

commit 0ad6e37846e2ebbbd0e33a38274094bd15b8f11b
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Mon Jul 19 16:30:14 2021 +0100

    Fix observe for calculateStateDurations
2021-07-19 17:20:55 +01:00
Neil Alexander eb2a8e4c0b
Set buckets for dendrite_roomserver_calculate_state_duration_microseconds 2021-07-19 16:07:06 +01:00
Neil Alexander b20d402f39
dendrite_roomserver_calculate_state_duration_microseconds as histogram rather than summary 2021-07-19 15:34:12 +01:00
kegsay e80098e186
bugfix: retire invites even when we cannot talk to the remote server to make/send_leave (#1918)
* bugfix: retire invites even when we cannot talk to the remote server to make/send_leave

Also modify the leave response in /sync to include a fake event as this is ultimately
what clients (and sytest) will use to determine leave-ness.

* hash the event ID

* Base64 not hex
2021-07-14 10:39:17 +01:00
kegsay f8ae391a5b
Expose more data when outputting output room events (#1916)
* Add more logging for content fields

* Fix fields
2021-07-13 11:19:21 +01:00
Neil Alexander acec6fa979
Move a couple of callers to helpers.IsServerCurrentlyInRoom over to the query API (#1912) 2021-07-09 17:49:59 +01:00
Neil Alexander c8408a6387
Add more optimised code path for checking if we're in a room (#1909)
* Add more optimised code path for checking if we're in a room

* Fix database queries

* Fix federation API test

* Fix logging

* Review comments

* Make separate API call for room membership
2021-07-09 16:36:45 +01:00
kegsay 3e50bac944
bugfix: order the state blocks so recreating state snapshots works correctly (#1908)
* Logging

* Revert "Logging"

This reverts commit 23ce334182.

* bugfix: order the state blocks so recreating state snapshots works correctly
2021-07-09 10:49:49 +01:00
Neil Alexander 816e1a402b
Fix bug when rejecting invites (#1907)
* Fix rejecting invites maybe

* Remove comment that is no longer correct

* Review comment on performFederatedRejectInvite
2021-07-08 14:54:03 +01:00
kegsay 5a09290c32
db migration: handle create events with no state blocks from v0.1.0 (#1904) 2021-07-07 17:07:33 +01:00
Neil Alexander 192a7a7923
Roomserver input backpressure metric
Squashed commit of the following:

commit 56e934ac0aeedcfb2c072010959ba49734d4e0cb
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Fri Jul 2 09:39:30 2021 +0100

    Fix metric

commit 3911f3a0c17b164b012e881c085ceca30f5de408
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Fri Jul 2 09:36:29 2021 +0100

    Register correct metric

commit a9ddbfaed421538a701151801e9451198a8be4f3
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Fri Jul 2 09:33:33 2021 +0100

    Try to capture RS input backpressure metric
2021-07-02 09:48:55 +01:00
kegsay c849e74dfc
db migration: fix #1844 and add additional assertions (#1889)
* db migration: fix #1844 and add additional assertions

- Migration scripts will now check to see if there are any unconverted
  snapshot IDs and fail the migration if there are any. This should
  prevent people from getting a corrupt database in the event the root
  cause is still unknown.
- Add an ORDER BY clause when doing batch queries in the postgres
  migration. LIMIT and OFFSET without ORDER BY are undefined and must
  not be relied upon to produce a deterministic ordering (e.g row order).
  See https://www.postgresql.org/docs/current/queries-limit.html

* Linting

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2021-06-29 11:25:17 +01:00
Neil Alexander 7c3991ee2f
Use a custom FIFO queue for the RS input API (#1888)
* Use a FIFO queue instead of a channel to reduce backpressure

* Make sure someone wakes up

* Tweaks

* Add comments
2021-06-28 15:11:36 +01:00
Neil Alexander 5357df36c9
Fix panic in roomserver 2021-06-21 09:41:12 +01:00
Neil Alexander c67d8da3eb
Fix bug in SQLite migration 2021-04-26 13:45:47 +01:00
Neil Alexander 5ce1fe80de
State storage refactor (#1839)
* Hash-deduplicated state storage (and migrations) for PostgreSQL and SQLite

* Refactor droomserver database setup for migrations

* Fix conflict statements

* Update migration names

* Set a boundary for old to new block/snapshot IDs so we don't rewrite them more than once accidentally

* Create sequence if not exists

* Fix boundary queries

* Fix boundary queries

* Use Query

* Break out queries a bit

* More sequence tweaks

* Query parameters are not playing the game

* Injection escaping may not work for CREATE SEQUENCE after all

* Fix snapshot sequence name

* Use boundaried IDs in SQLite too

* Use IFNULL for SQLite

* Use COALESCE in PostgreSQL

* Review comments @Kegsay
2021-04-26 13:25:57 +01:00
Kegsay af41f6d454
Add Sentry support (#1803)
* Add Sentry support

* Use HTTP Sentry properly maybe

* Capture panics

* Log fed Sentry stuff correctly

* British english linter
2021-03-24 10:25:24 +00:00