Commit graph

39 commits

Author SHA1 Message Date
Till 5c0ceec2a6
Don't attempt to send transactions if Dendrite is shutting down (#3356)
This should avoid confusions with logs like:

```
time="2024-04-08T08:38:45.104235081Z" level=error msg="Failed to set \"scs.ems.host\" as assumed offline" func="github.com/matrix-org/dendrite/federationapi/statistics.(*ServerStatistics).Failure" file="github.com/matrix-org/dendrite/federationapi/statistics/statistics.go:204" error="sqlutil.WithTransaction.Begin: sql: database is closed"
time="2024-04-08T08:38:45.104239201Z" level=error msg="Failed to set \"obermui.de\" as assumed offline" func="github.com/matrix-org/dendrite/federationapi/statistics.(*ServerStatistics).Failure" file="github.com/matrix-org/dendrite/federationapi/statistics/statistics.go:204" error="sqlutil.WithTransaction.Begin: sql: database is closed"
```

or 

```
time="2024-04-08T08:38:45.105235411Z" level=error msg="Failed to get pending EDUs for \"retro76.net\"" func="github.com/matrix-org/dendrite/federationapi/queue.(*destinationQueue).getPendingFromDatabase" file="github.com/matrix-org/dendritefederationapi/queue/destinationqueue.go:258" error="sqlutil.WithTransaction.Begin: sql: database is closed"
```

[skip ci]
2024-04-09 07:49:56 +02:00
Till 865fff5f03
Make usage of relays optional, avoid DB roundtrips (#3337)
This should avoid 2 additional DB roundtrips if we don't want to use
relays.

So instead of possibly doing roughly 20k trips to the DB, we are now
"only" doing ~6600.

---------

Co-authored-by: devonh <devon.dmytro@gmail.com>
2024-02-28 20:59:34 +01:00
Till b8f91485b4
Update ACLs when received as outliers (#3008)
This should fix #3004 by making sure we also update our in-memory ACLs
after joining a new room.
Also makes use of more caching in `GetStateEvent`

Bonus: Adds some tests, as I was about to use `GetBulkStateContent`, but
turns out that `GetStateEvent` is basically doing the same, just that it
only gets the `eventTypeNID`/`eventStateKeyNID` once and not for every
call.
2023-11-22 15:38:04 +01:00
devonh 8245b24100
Update gmsl to use new validated RoomID on PDUs (#3200)
GMSL returns a `spec.RoomID` when calling `PDU.RoomID()`
2023-09-15 14:39:06 +00:00
kegsay f5b3144dc3
Use PDU not *Event in HeaderedEvent (#3073)
Requires https://github.com/matrix-org/gomatrixserverlib/pull/376

This has numerous upsides:
 - Less type casting to `*Event` is required.
- Making Dendrite work with `PDU` interfaces means we can swap out Event
impls more easily.
 - Tests which represent weird event shapes are easier to write.

Part of a series of refactors on GMSL.
2023-05-02 15:03:16 +01:00
kegsay b189edf4f4
Remove gmsl.HeaderedEvent (#3068)
Replaced with types.HeaderedEvent _for now_. In reality we want to move
them all to gmsl.Event and only use HeaderedEvent when we _need_ to
bundle the version/event ID with the event (seriailsation boundaries,
and even then only when we don't have the room version).

Requires https://github.com/matrix-org/gomatrixserverlib/pull/373
2023-04-27 12:54:20 +01:00
devonh ed19efc5d7
Move fedclient interface over to gmsl (#3061)
Companion PR: https://github.com/matrix-org/gomatrixserverlib/pull/366
2023-04-24 16:23:25 +00:00
kegsay 1647213fac
Implement new RoomVersionImpl API (#3062)
As outlined in https://github.com/matrix-org/gomatrixserverlib/pull/368

The main change Dendrite side is that `RoomVersion` no longer has any
methods on it. Instead, you need to bounce via `gmsl.GetRoomVersion`.

It's very interesting to see where exactly Dendrite cares about this.
For some places it's creating events (fine) but others are way more
specific. Those areas will need to migrate to GMSL at some point.
2023-04-21 17:06:29 +01:00
kegsay 71eeccf34a
refactor: funnel event creation through room versions (#3060)
In preparation of interfacing up the room version value.
2023-04-20 19:07:31 +01:00
kegsay 72285b2659
refactor: update GMSL (#3058)
Sister PR to https://github.com/matrix-org/gomatrixserverlib/pull/364

Read this commit by commit to avoid going insane.
2023-04-19 15:50:33 +01:00
kegsay 0db43f13a6
refactor: use latest GMSL which splits fed client from matrix room logic (#3051)
Part of a series of refactors on GMSL.
2023-04-06 09:55:01 +01:00
Till 5e85a00cb3
Remove BaseDendrite (#3023)
Removes `BaseDendrite` to, hopefully, make testing and composing of
components easier in the future.
2023-03-22 09:21:32 +01:00
Till 5579121c6f
Preparations for removing BaseDendrite (#3016)
Preparations to actually remove/replace `BaseDendrite`.
Quite a few changes:
- SyncAPI accepts an `fulltext.Indexer` interface (fulltext is removed
from `BaseDendrite`)
- Caches are removed from `BaseDendrite`
- Introduces a `Router` struct (likely to change)
  - also fixes #2903
- Introduces a `sqlutil.ConnectionManager`, which should remove
`base.DatabaseConnection` later on
- probably more
2023-03-17 11:09:45 +00:00
devonh 63df85db6d
Relay integration to pinecone demos (#2955)
This extends the dendrite monolith for pinecone to integrate the s&f
features into the mobile apps.
Also makes a few tweaks to federation queueing/statistics to make some
edge cases more robust.
2023-01-28 23:27:53 +00:00
devonh 5b73592f5a
Initial Store & Forward Implementation (#2917)
This adds store & forward relays into dendrite for p2p.
A few things have changed:
- new relay api serves new http endpoints for s&f federation
- updated outbound federation queueing which will attempt to forward
using s&f if appropriate
- database entries to track s&f relays for other nodes
2023-01-23 17:55:12 +00:00
Till d3db542fbf
Add federation peeking table tests (#2920)
As the title says, adds tests for inbound/outbound peeking federation
table tests.

Also removes some unused code
2022-12-22 10:56:20 +01:00
devonh a8e7ffc7ab
Add p2p wakeup broadcast handling to pinecone demos (#2841)
Adds wakeup broadcast handling to the pinecone demos.
This will reset their blacklist status and interrupt any ongoing
federation queue backoffs currently in progress for this peer.
The end result is that any queued events will quickly be sent to the
peer if they had disconnected while attempting to send events to them.
2022-11-18 00:29:23 +00:00
Neil Alexander 6650712a1c
Federation fixes for virtual hosting 2022-11-15 15:05:23 +00:00
Neil Alexander 529df30b56
Virtual hosting schema and logic changes (#2876)
Note that virtual users cannot federate correctly yet.
2022-11-11 16:41:37 +00:00
devonh 97491a174b
Associate events in db before queueing them to send (#2833)
Fixes a race condition between sending federation events and having them
fully associated in the database.
2022-10-26 17:35:01 +01:00
Neil Alexander f6dea712d2
Initial support for multiple server names (#2829)
This PR is the first step towards virtual hosting by laying the
groundwork for multiple server names being configured.
2022-10-26 12:59:19 +01:00
Till 9e4c3171da
Optimize inserting pending PDUs/EDUs (#2821)
This optimizes the association of PDUs/EDUs to their destination by
inserting all destinations in one transaction.
2022-10-21 12:50:51 +02:00
devonh b58c9bb094
Fix flakey queue test (#2818)
Ensure both events are added to the database, even if the destination is
already blacklisted.
2022-10-20 15:37:35 +00:00
Till Faelligen 6a93858125
Fix race condition 2022-10-20 10:45:59 +02:00
devonh 241d5c47df
Refactor Federation Destination Queues (#2807)
This is a refactor of the federation destination queues.
It fixes a few things, namely:
- actually retry outgoing events with backoff behaviour
- obtain enough events from the database to fill messages as much as
possible
- minimize the amount of running goroutines
  - use pure timers for backoff
  - don't restart queue unless necessary
  - close the background task when backing off
- increase max edus in a transaction to match the spec
- cleanup timers more aggresively to reduce memory usage
- add jitter to backoff timers to reduce resource spikes
- add a bunch of tests (with real and fake databases) to ensure
everything is working
2022-10-19 11:03:16 +01:00
Neil Alexander f3be4b3185
Revert "Federation backoff fixes and tests (#2792)"
This reverts commit dcedd1b6bf.
2022-10-13 16:06:50 +01:00
devonh dcedd1b6bf
Federation backoff fixes and tests (#2792)
This fixes some edge cases where federation queue backoffs and
blacklisting weren't behaving as expected.
It also adds new tests for the federation queues to ensure their
behaviour continues to work correctly.
2022-10-13 14:38:13 +00:00
Till b000db81ca
Send E2EE related errors to sentry (#2784)
Only sends errors if we're not retrying them in NATS.
Not sure if those should be scoped/tagged with something like "E2EE".
2022-10-10 17:36:26 +02:00
Neil Alexander bd39748b5c
Update dependencies (#2729)
This updates Dendrite dependencies.
2022-09-20 15:01:19 +01:00
Till 240ae257de
Add housekeeping function to delete old/expired EDUs (#2399)
* Add housekeeping function to delete old/expired EDUs

* Add migrations

* Evict EDUs from cache

* Fix queries

* Fix upgrade

* Use map[string]time.Duration to specify different expiry times

* Fix copy & paste mistake

* Set expires_at to tomorrow

* Don't allow NULL

* Add comment

* Add tests

* Use new testrig package

* Fix migrations

* Never expire m.direct_to_device

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
Co-authored-by: kegsay <kegan@matrix.org>
2022-08-09 11:15:58 +02:00
Till 9a655cb5e7
Only create a new destinationQueue if we don't have one (#2620) 2022-08-05 07:20:34 +02:00
kegsay 6de29c1cd2
bugfix: E2EE device keys could sometimes not be sent to remote servers (#2466)
* Fix flakey sytest 'Local device key changes get to remote servers'

* Debug logs

* Remove internal/test and use /test only

Remove a lot of ancient code too.

* Use FederationRoomserverAPI in more places

* Use more interfaces in federationapi; begin adding regression test

* Linting

* Add regression test

* Unbreak tests

* ALL THE LOGS

* Fix a race condition which could cause events to not be sent to servers

If a new room event which rewrites state arrives, we remove all joined hosts
then re-calculate them. This wasn't done in a transaction so for a brief period
we would have no joined hosts. During this interim, key change events which arrive
would not be sent to destination servers. This would sporadically fail on sytest.

* Unbreak new tests

* Linting
2022-05-17 13:23:35 +01:00
Neil Alexander 923f789ca3
Fix graceful shutdown 2022-04-27 15:29:49 +01:00
Neil Alexander aad81b7b4d
Only call key update process functions if there are updates, don't send things to ourselves over federation 2022-04-25 14:22:46 +01:00
Till 446819e4ac
Store the EDU type in the database (#2370) 2022-04-25 11:56:50 +02:00
Neil Alexander 9b316ac64c
Slower federation warm-up (#2320)
* Wake destination queues gradually, rather than all at once

* Delay device list updates too

* Maximum two minute warmup period
2022-04-04 15:14:10 +01:00
S7evinK a4e7d471af
Remove FederationDisabled error type (#2174) 2022-02-11 18:15:44 +01:00
Neil Alexander a84f50f4fb
Demote logging entry for backoff 2022-02-08 16:49:49 +00:00
Neil Alexander ec716793eb
Merge federationapi, federationsender, signingkeyserver components (#2055)
* Initial federation sender -> federation API refactoring

* Move base into own package, avoids import cycle

* Fix build errors

* Fix tests

* Add signing key server tables

* Try to fold signing key server into federation API

* Fix dendritejs builds

* Update embedded interfaces

* Fix panic, fix lint error

* Update configs, docker

* Rename some things

* Reuse same keyring on the implementing side

* Fix federation tests, `NewBaseDendrite` can accept freeform options

* Fix build

* Update create_db, configs

* Name tables back

* Don't rename federationsender consumer for now
2021-11-24 10:45:23 +00:00