Commit graph

82 commits

Author SHA1 Message Date
S7evinK 89b7519089
Raise waitTime for network related issues (#2192) 2022-02-17 13:15:49 +00:00
S7evinK e9b672a34e
Make "Device list doesn't change if remote server is down" pass (#2190) 2022-02-16 16:56:45 +00:00
S7evinK ac25065a54
Fix sytest uploading signed devices gets propagated over federation (#2162)
* Remove unneeded logging

* Add MasterKey & SelfSigningKey to update
Avoid panic if signatures are not present

* Add passing test

* Revert "Add MasterKey & SelfSigningKey to update"

This reverts commit 2c81b34884.

* Send MasterKey & SelfSigningKey with update

* Debugging

* Remove delete() so we also query signingkeys
2022-02-09 13:11:43 +01:00
S7evinK 2771d93748
Remove OutputKeyChangeEvent consumer on keyserver (#2160)
* Remove keyserver consumer

* Remove keyserver from eduserver

* Directly upload device keys without eduserver

* Add passing tests
2022-02-08 18:13:38 +01:00
Neil Alexander 00cbe75150
Fix CPU spin from key change consumer when an invalid message is supplied (#2146) 2022-02-04 16:16:50 +00:00
S7evinK 9de7efa0b0
Remove sarama/saramajetstream dependencies (#2138)
* Remove dependency on saramajetstream & sarama

Signed-off-by: Till Faelligen <tfaelligen@gmail.com>

* Remove internal.ContinualConsumer from federationapi

* Remove internal.ContinualConsumer from syncapi

* Remove internal.ContinualConsumer from keyserver

* Move to new Prepare function

* Remove saramajetstream & sarama dependency

* Delete unneeded file

* Remove duplicate import

* Log error instead of silently irgnoring it

* Move `OffsetNewest` and `OffsetOldest` into keyserver types, change them to be more sane values

* Fix comments

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-02-04 13:08:13 +00:00
Neil Alexander ba1a9b98b7
Tweak some logging (#2130)
* Modify some log levels

* Update gomatrixserverlib to matrix-org/gomatrixserverlib@336334f

* Update gomatrixserverlib to matrix-org/gomatrixserverlib@cde7ac8

* Demote warning about key change producer

* Add more useful roomserver logging

* Further tweaking
2022-01-31 10:48:28 +00:00
kegsay 2c581377a5
Remodel how device list change IDs are created (#2098)
* Remodel how device list change IDs are created

Previously we made them using the offset Kafka supplied.
We don't run Kafka anymore, so now we make the SQL table assign
the change ID via an AUTOINCREMENTing ID. Redesign the
`keyserver_key_changes` table to have `UNIQUE(user_id)` so we
don't accumulate key changes forevermore, we now have at most 1
row per user which contains the highest change ID.

This needs a SQL migration.

* Ensure we bump the change ID on sqlite

* Actually read the DeviceChangeID not the Offset in synapi

* Add SQL migrations

* Prepare after migration; fixup dendrite-upgrade-test logging

* Use higher version numbers; fix sqlite query to increment better

* Default 0 on postgres

* fixup postgres migration on fresh dendrite instances
2022-01-21 09:56:06 +00:00
kegsay db7d9cba8a
BREAKING: Remove Partitioned Stream Positions (#2096)
* go mod tidy

* Break complement to check it fails CI

* Remove partitioned stream positions

This was used by the device list stream position. The device list position
now corresponds to the `Offset`, and the partition is always 0, in prep
for removing reliance on Kafka topics for device list changes.

* Linting

* Migrate old style tokens to new style because element-web doesn't soft-logoout on 4xx errors on /sync
2022-01-20 15:26:45 +00:00
S7evinK 161f145176
Add NATS JetStream support (#1866)
* Add NATS JetStream support
Update shopify/sarama

* Fix addresses

* Don't change Addresses in Defaults

* Update saramajetstream

* Add missing error check

Keep typing events for at least one minute

* Use all configured NATS addresses

* Update saramajetstream

* Try setting up with NATS

* Make sure NATS uses own persistent directory (TODO: make this configurable)

* Update go.mod/go.sum

* Jetstream package

* Various other refactoring

* Build fixes

* Config tweaks, make random jetstream storage path for CI

* Disable interest policies

* Try to sane default on jetstream base path

* Try to use in-memory for CI

* Restore storage/retention

* Update nats.go dependency

* Adapt changes to config

* Remove unneeded TopicFor

* Dep update

* Revert "Remove unneeded TopicFor"

This reverts commit f5a4e4a339.

* Revert changes made to streams

* Fix build problems

* Update nats-server

* Update go.mod/go.sum

* Roomserver input API queuing using NATS

* Fix topic naming

* Prometheus metrics

* More refactoring to remove saramajetstream

* Add missing topic

* Don't try to populate map that doesn't exist

* Roomserver output topic

* Update go.mod/go.sum

* Message acknowledgements

* Ack tweaks

* Try to resume transaction re-sends

* Try to resume transaction re-sends

* Update to matrix-org/gomatrixserverlib@91dadfb

* Remove internal.PartitionStorer from components that don't consume keychanges

* Try to reduce re-allocations a bit in resolveConflictsV2

* Tweak delivery options on RS input

* Publish send-to-device messages into correct JetStream subject

* Async and sync roomserver input

* Update dendrite-config.yaml

* Remove roomserver tests for now (they need rewriting)

* Remove roomserver test again (was merged back in)

* Update documentation

* Docker updates

* More Docker updates

* Update Docker readme again

* Fix lint issues

* Send final event in `processEvent` synchronously (since this might stop Sytest from being so upset)

* Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that

* Go 1.16 instead of Go 1.13 for upgrade tests and Complement

* Revert "Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that"

This reverts commit 368675283f.

* Don't report any errors on `/send` to see what fun that creates

* Fix panics on closed channel sends

* Enforce state key matches sender

* Do the same for leave

* Various tweaks to make tests happier

Squashed commit of the following:

commit 13f9028e7a
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 15:47:14 2022 +0000

    Do the same for leave

commit e6be7f05c3
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 15:33:42 2022 +0000

    Enforce state key matches sender

commit 85ede6d64b
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 14:07:04 2022 +0000

    Fix panics on closed channel sends

commit 9755494a98
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 13:38:22 2022 +0000

    Don't report any errors on `/send` to see what fun that creates

commit 3bb4f87b5d
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 13:00:26 2022 +0000

    Revert "Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that"

    This reverts commit 368675283f.

commit fe2673ed7b
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 12:09:34 2022 +0000

    Go 1.16 instead of Go 1.13 for upgrade tests and Complement

commit 368675283f
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 11:51:45 2022 +0000

    Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that

commit b028dfc085
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 10:29:08 2022 +0000

    Send final event in `processEvent` synchronously (since this might stop Sytest from being so upset)

* Merge in NATS Server v2.6.6 and nats.go v1.13 into the in-process connection fork

* Add `jetstream.WithJetStreamMessage` to make ack/nak-ing less messy, use process context in consumers

* Fix consumer component name in  federation API

* Add comment explaining where streams are defined

* Tweaks to roomserver input with comments

* Finish that sentence that I apparently forgot to finish in INSTALL.md

* Bump version number of config to 2

* Add comments around asynchronous sends to roomserver in processEventWithMissingState

* More useful error message when the config version does not match

* Set version in generate-config

* Fix version in config.Defaults

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-01-05 17:44:49 +00:00
Neil Alexander ec716793eb
Merge federationapi, federationsender, signingkeyserver components (#2055)
* Initial federation sender -> federation API refactoring

* Move base into own package, avoids import cycle

* Fix build errors

* Fix tests

* Add signing key server tables

* Try to fold signing key server into federation API

* Fix dendritejs builds

* Update embedded interfaces

* Fix panic, fix lint error

* Update configs, docker

* Rename some things

* Reuse same keyring on the implementing side

* Fix federation tests, `NewBaseDendrite` can accept freeform options

* Fix build

* Update create_db, configs

* Name tables back

* Don't rename federationsender consumer for now
2021-11-24 10:45:23 +00:00
Neil Alexander a9e715b5c5
Guard in all key consumers 2021-11-16 09:27:49 +00:00
PiotrKozimor dec05c3347
Run gofmt on dendrite - apply go 1.17 preferred build tags (#2021) 2021-11-02 16:48:48 +00:00
Neil Alexander 614e67280d
Delete device keys/signatures from key server when deleting devices (#1979)
* Delete device keys/signatures from key server when deleting device from user API

* Move loop to within database transaction

* Don't fall over deleting no rows
2021-08-18 12:07:09 +01:00
Neil Alexander ff21675c5b
Cross-signing fixes, notifications via sync, federation (#1974)
* Initial work on signing key update EDUs

* Fix build

* Produce/consume EDUs

* Producer logging

* Only produce key change notifications for local users

* Better naming

* Try to notify sync

* Enable feature

* Use key change topic

* Don't bother verifying signatures, validate key lengths if we can, notifier fixes

* Copyright notices

* Remove tests from whitelist until matrix-org/sytest#1117

* Some review comment fixes

* Update to matrix-org/gomatrixserverlib@f9416ac

* Remove unneeded parameter
2021-08-17 13:44:30 +01:00
Neil Alexander 125ea75b24
Add type field to DeviceMessage, allow fields to be nullable (#1969) 2021-08-11 09:44:14 +01:00
Neil Alexander b1377d991a
Cross-signing signature handling (#1965)
* Handle other signatures

* Decorate key ID properly

* Match by key IDs

* Tweaks

* Fixes

* Fix /user/keys/query bug, review comments, update sytest-whitelist

* Various wtweaks

* Fix wiring for keyserver in API mode

* Additional fixes
2021-08-09 14:35:24 +01:00
Neil Alexander e95b1fd238
Cross-signing validation for self-sigs, expose signatures over /user/keys/query and /user/devices/{userId} (#1962)
* Enable unstable feature again

* Try to verify when a device signs a key

* Try to verify when a key signs a device

* It's the self-signing key, not the master key

* Fix error

* Try to verify master key uploads

* Actually we can't guarantee we can do that so nevermind

* Add signatures into /devices/list request

* Fix nil pointer

* Reprioritise map creation

* Don't skip devices that don't have signatures

* Add some debug logging

* Fix logic error in QuerySignatures

* Fix bugs

* Expose master and self-signing keys on /devices/list hopefully

* maps are tedious

* Expose signatures via /keys/query

* Upload signatures when uploading keys

* Fixes

* Disable the feature again
2021-08-06 10:13:35 +01:00
Neil Alexander eb0efa4636
Cross-signing groundwork (#1953)
* Cross-signing groundwork

* Update to matrix-org/gomatrixserverlib#274

* Fix gobind builds, which stops unit tests in CI from yelling

* Some changes from review comments

* Fix build by passing in UIA

* Update to matrix-org/gomatrixserverlib@bec8d22

* Process master/self-signing keys from devices call

* nolint

* Enum-ify the key type in the database

* Process self-signing key too

* Fix sanity check in device list updater

* Fix check

* Fix sytest, hopefully

* Fix build
2021-08-04 17:56:29 +01:00
Neil Alexander 7a9a2547b3
Cross-signing storage code (#1959) 2021-08-04 17:31:18 +01:00
S7evinK 89a6787fdb
Try to optimize SelectOneTimeKeys (#1851)
* Try to optimize SelectOneTimeKeys

Signed-off-by: Till Faelligen <tfaelligen@gmail.com>

* Use pg.Array when using ANY...

Co-authored-by: Kegsay <kegan@matrix.org>
2021-06-07 09:17:46 +01:00
Kegsay b769d5a25e
Optimise memory usage when calling /g_m_e (#1819)
* Optimise memory usage when calling /g_m_e

* cache more events

* refactor handling of device list update pokes

* Sigh
2021-04-08 13:50:39 +01:00
Kegsay 802f1c96f8
Add more metrics (#1802)
* Add more metrics

* Linting
2021-03-23 15:22:00 +00:00
Kegsay a1b7e4ef3f
log less for failed key querys, add counters for incoming pdus/edus (#1801)
* log less for failed key querys, add counters for incoming pdus/edus

* use labels

* Blacklist flakey test

* Fix metrics
2021-03-23 11:33:36 +00:00
Kegsay 77fb981da5
device lists: backoff for longer if the wrong error type is returned (#1796) 2021-03-08 17:45:20 +00:00
Neil Alexander 81312b8a78
Return the current OTK count on an empty upload request (#1774)
* Always return OTK counts

* Fix parameter ordering

* Send IDs over to keyserver internal API

* Review comments

* Fix syntax error

* Fix panic, hopefully

* Require user ID to be set

* Fix user API call
2021-03-02 11:40:20 +00:00
Neil Alexander 6757b67a32
NewClient and NewFederationClient updates (#1730)
* Use matrix-org/gomatrixserverlib#252

* Add missing WithSkipVerify to test

* Functions instead

* Update gomatrixserverlib to matrix-org/gomatrixserverlib#252

* Fix disabling TLS validation
2021-01-22 16:09:05 +00:00
Loïck Bonniot 940577cd3c
Fix integer overflow in device_list_update.go (#1717)
Fix #1511
On 32-bits systems, int(hash.Sum32()) can be negative.
This makes the computation of array indices using modulo invalid, crashing dendrite.
Signed-off-by: Loïck Bonniot <git@lesterpig.com>
2021-01-18 12:43:15 +00:00
Neil Alexander fa65c40bae
Reduce device list GetUserDevices timeout (#1704) 2021-01-12 16:13:21 +00:00
Neil Alexander 50963b724b
More sane next batch handling, typing notification tweaks, give invites their own stream position, device list fix (#1641)
* Update sync responses

* Fix positions, add ApplyUpdates

* Fix MarshalText as non-pointer, PrevBatch is optional

* Increment by number of read receipts

* Merge branch 'master' into neilalexander/devicelist

* Tweak typing

* Include keyserver position tweak

* Fix typing next position in all cases

* Tweaks

* Fix typo

* Tweaks, restore StreamingToken.MarshalText which somehow went missing?

* Rely on positions from notifier rather than manually advancing them

* Revert "Rely on positions from notifier rather than manually advancing them"

This reverts commit 53112a62cc.

* Give invites their own position, fix other things

* Fix test

* Fix invites maybe

* Un-whitelist tests that look to be genuinely wrong

* Use real receipt positions

* Ensure send-to-device uses real positions too
2020-12-18 11:11:21 +00:00
Neil Alexander b5aa7ca3ab
Top-level setup package (#1605)
* Move config, setup, mscs into "setup" top-level folder

* oops, forgot the EDU server

* Add setup

* goimports
2020-12-02 17:41:00 +00:00
Neil Alexander 49abe359e6
Start Kafka connections for each component that needs them (#1527)
* Start Kafka connection for each component that needs one

* Fix roomserver unit tests

* Rename to naffkaInstance (@Kegsay review comment)

* Fix import cycle
2020-10-15 13:27:13 +01:00
Neil Alexander fe5d1400bf
Update federation timeouts (#1504)
* Update to matrix-org/gomatrixserverlib#234

* Update gomatrixserverlib

* Update federation timeouts

* Fix dendritejs

* Increase /send context time in destination queue
2020-10-09 17:08:32 +01:00
Sam a6700331ce
Update all usages of tx.Stmt to sqlutil.TxStmt (#1423)
* Replace all usages of txn.Stmt with sqlutil.TxStmt

Signed-off-by: Sam Day <me@samcday.com>

* Fix sign off link in PR template.

Signed-off-by: Sam Day <me@samcday.com>

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2020-09-24 11:10:14 +01:00
Matthew Hodgson 39507bacc3
Peeking via MSC2753 (#1370)
Initial implementation of MSC2753, as tested by https://github.com/matrix-org/sytest/pull/944.
Doesn't yet handle unpeeks, peeked EDUs, or history viz changing during a peek - these will follow.
https://github.com/matrix-org/dendrite/pull/1370 has full details.
2020-09-10 14:39:18 +01:00
Neil Alexander a0f2a4510f
Exclude deleted keys from selectBatchDeviceKeysSQL (#1412) 2020-09-08 17:47:54 +01:00
Neil Alexander 04bc09f591
Defer keyserver and federationsender wakeups to give HTTP listeners time to start (#1389) 2020-09-03 21:17:55 +01:00
Kegsay 29d6481842
Wait for 8h between device list updates for blacklisted servers (#1344) 2020-08-26 15:38:21 +01:00
Kegsay abd16ff4a0
Modify DeviceListUpdater to retry requests according to RetryAfter (#1342)
* Modify DeviceListUpdater to retry requests according to RetryAfter

* Reduce wait time for sytest test pollution
2020-08-26 12:03:09 +01:00
Neil Alexander 720ddce0a8
Use Writer in shared package (#1296) 2020-08-25 10:29:45 +01:00
Neil Alexander 9d53351dc2
Component-wide TransactionWriters (#1290)
* Offset updates take place using TransactionWriter

* Refactor TransactionWriter in current state server

* Refactor TransactionWriter in federation sender

* Refactor TransactionWriter in key server

* Refactor TransactionWriter in media API

* Refactor TransactionWriter in server key API

* Refactor TransactionWriter in sync API

* Refactor TransactionWriter in user API

* Fix deadlocking Sync API tests

* Un-deadlock device database

* Fix appservice API

* Rename TransactionWriters to Writers

* Move writers up a layer in sync API

* Document sqlutil.Writer interface

* Add note to Writer documentation
2020-08-21 10:42:08 +01:00
Kegsay 6d6bb75137
Add FederationClient interface to federationsender (#1284)
* Add FederationClient interface to federationsender

- Use a shim struct in HTTP mode to keep the same API as `FederationClient`.
- Use `federationsender` instead of `FederationClient` in `keyserver`.

* Pointers not values

* Review comments

* Fix unit tests

* Rejig backoff

* Unbreak test

* Remove debug logs

* Review comments and linting
2020-08-20 17:03:07 +01:00
Neil Alexander b24747b305
Transaction writer changes, move roomserver writers (#1285)
* Updated TransactionWriters, moved locks in roomserver, various other tweaks

* Fix redaction deadlocks

* Fix lint issue

* Rename SQLiteTransactionWriter to ExclusiveTransactionWriter

* Fix us not sending transactions through in latest events updater
2020-08-19 15:38:27 +01:00
Kegsay e571e196ce
Summarise key change logs (#1278) 2020-08-18 11:14:37 +01:00
Kegsay 02a8515e99
Only emit key changes which are different from what we had before (#1279)
We did this already for local `/keys/upload` but didn't for
remote `/users/devices`. This meant any resyncs would spam produce
events, hammering disk i/o and spamming the logs.
2020-08-18 11:14:20 +01:00
Kegsay 20c8f252a7
Make 'Device list doesn't change if remote server is down' pass (#1268)
- As a last resort, query the DB when exhausting all possible remote query
  endpoints, but keep the field in `failures` so clients can detect that this
  is stale data.
- Unblock `DeviceListUpdater.Update` on failures rather than timing out.
- Use a mutex when writing directly to `res`, not just for failures.
2020-08-13 16:43:27 +01:00
Kegsay 820c56c165
Fix more E2E sytests (#1265)
* WIP: Eagerly sync device lists on /user/keys/query requests

Also notify servers when a user's device display name changes. Few
caveats:
 - sytest `Device deletion propagates over federation` fails
 - `populateResponseWithDeviceKeysFromDatabase` is called from multiple
   goroutines and hence is unsafe.

* Handle deleted devices correctly over federation
2020-08-12 22:43:02 +01:00
Kegsay d98ec12422
Add sync mechanism to block when updating device lists (#1264)
* Add sync mechanism to block when updating device lists

With a timeout, mainly for sytest to fix the test
"Server correctly handles incoming m.device_list_update"
which is flakey because it assumes that when `/send` 200 OKs
that the server has updated the device lists in prep for
`/keys/query` which is not always true when using workers.

* Fix UT

* Add new working test
2020-08-12 13:50:54 +01:00
Kegsay b8b854d642
Bugfixes for 'If remote user leaves room we no longer receive device updates' (#1262)
* Bugfixes for 'If remote user leaves room we no longer receive device updates'

* Update whitelist and README
2020-08-12 10:50:52 +01:00
Kegsay befccd7d51
Reduce cooldown to make sure sytest doesn't give up (#1257)
* Reduce cooldown to make sure sytest doesn't give up

* More sytests pass weeeeeee
2020-08-11 10:44:59 +01:00