Commit graph

70 commits

Author SHA1 Message Date
Neil Alexander 31799a3b2a
Device list display name fixes (#2405)
* Get device names from `unsigned` in `/user/devices`

* Fix display name updates

* Fix bug

* Fix another bug
2022-04-29 16:02:55 +01:00
Neil Alexander 2ff75b7c80
Ensure signature map exists (fixes #2393) (#2397) 2022-04-28 11:34:19 +01:00
Neil Alexander 5306c73b00
Fix bug when uploading device signatures (#2377)
* Find the complete key ID when uploading signatures

* Try that again

* Try splitting the right thing

* Don't do it for device keys

* Refactor `QuerySignatures`

* Revert "Refactor `QuerySignatures`"

This reverts commit c02832a3e9.

* Both requested key IDs and master/self/user keys

* Fix uniqueness

* Try tweaking GMSL

* Update GMSL again

* Revert "Update GMSL again"

This reverts commit bd6916cc37.

* Revert "Try tweaking GMSL"

This reverts commit 2a054524da.

* Database migrations
2022-04-26 13:08:54 +01:00
Neil Alexander aad81b7b4d
Only call key update process functions if there are updates, don't send things to ourselves over federation 2022-04-25 14:22:46 +01:00
Neil Alexander 6d78c4d67d
Fix retrieving cross-signing signatures in /user/devices/{userId} (#2368)
* Fix retrieving cross-signing signatures in `/user/devices/{userId}`

We need to know the target device IDs in order to get the signatures and we weren't populating those.

* Fix up signature retrieval

* Fix SQLite

* Always include the target's own signatures as well as the requesting user
2022-04-22 14:58:24 +01:00
Neil Alexander 9b316ac64c
Slower federation warm-up (#2320)
* Wake destination queues gradually, rather than all at once

* Delay device list updates too

* Maximum two minute warmup period
2022-04-04 15:14:10 +01:00
S7evinK 49dc49b232
Remove eduserver (#2306)
* Move receipt sending to own JetStream producer

* Move SendToDevice to producer

* Remove most parts of the EDU server

* Fix SendToDevice & copyrights

* Move structs, cleanup EDU Server traces

* Use HeadersOnly subscription

* Missing file

* Fix linter issues

* Move consumers to own files

* Rename durable consumer; Consumer cleanup

* Docs/config cleanup
2022-03-29 14:14:35 +02:00
Neil Alexander d983d17355
Fix lint errors 2022-03-24 10:03:22 +00:00
Neil Alexander e485f9c2bd
64-bit stream IDs for device list updates (#2267) 2022-03-10 13:17:28 +00:00
Kegan Dougal e46a61c49e Skip flakey test for now 2022-03-02 11:38:13 +00:00
Kegan Dougal a4c918ee17 Fix data race in unit tests 2022-03-02 10:49:36 +00:00
kegsay 23f028cf6e
Add unit test for device list update debouncing (#2220)
* Add unit test for device list update debouncing

* bugfix: actually return stale device lists in the test...

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-03-01 17:18:06 +00:00
Neil Alexander 58bf91a585
Check for changes in PerformUploadDeviceKeys (#2233)
* Don't generate key change notifs if nothing changed on cross-signing upload

* Check both directions of changes
2022-03-01 11:00:54 +00:00
S7evinK 41dc651b25
Send device update to local users if remote display name changes (#2215)
* Send device_list update to satisfy sytest

* Fix build issue from merged in change

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-02-22 16:34:53 +00:00
Neil Alexander c7811e9d71
Add DeviceKeysEqual (#2219)
* Add `DeviceKeysEqual`

* Update check order

* Fix check

* Tweak conditions again

* One more time

* Single return value
2022-02-22 15:43:17 +00:00
Neil Alexander 600fbae31f
Only emit key change notifications from federation when changes are made (#2217)
* Only emit key changes when poked over federation

* Remove logging

* Fix unit test possibly
2022-02-22 13:35:06 +00:00
Neil Alexander 9bd5e414c9
Missing commit from #2186 2022-02-18 11:32:45 +00:00
Neil Alexander 153bfbbea5
Merge both user API databases into one (#2186)
* Merge user API databases into one

* Remove DeviceDatabase from config

* Fix tests

* Try that again

* Clean up keyserver device keys when the devices no longer exist in the user API

* Tweak ordering

* Fix UserExists flag, device check

* Allow including empty entries so we can clean them up

* Remove logging
2022-02-18 11:31:05 +00:00
Neil Alexander 140077265e
Make GetUserDevices logging entry more useful 2022-02-17 15:02:06 +00:00
S7evinK 89b7519089
Raise waitTime for network related issues (#2192) 2022-02-17 13:15:49 +00:00
S7evinK e9b672a34e
Make "Device list doesn't change if remote server is down" pass (#2190) 2022-02-16 16:56:45 +00:00
S7evinK ac25065a54
Fix sytest uploading signed devices gets propagated over federation (#2162)
* Remove unneeded logging

* Add MasterKey & SelfSigningKey to update
Avoid panic if signatures are not present

* Add passing test

* Revert "Add MasterKey & SelfSigningKey to update"

This reverts commit 2c81b34884.

* Send MasterKey & SelfSigningKey with update

* Debugging

* Remove delete() so we also query signingkeys
2022-02-09 13:11:43 +01:00
S7evinK 2771d93748
Remove OutputKeyChangeEvent consumer on keyserver (#2160)
* Remove keyserver consumer

* Remove keyserver from eduserver

* Directly upload device keys without eduserver

* Add passing tests
2022-02-08 18:13:38 +01:00
kegsay 2c581377a5
Remodel how device list change IDs are created (#2098)
* Remodel how device list change IDs are created

Previously we made them using the offset Kafka supplied.
We don't run Kafka anymore, so now we make the SQL table assign
the change ID via an AUTOINCREMENTing ID. Redesign the
`keyserver_key_changes` table to have `UNIQUE(user_id)` so we
don't accumulate key changes forevermore, we now have at most 1
row per user which contains the highest change ID.

This needs a SQL migration.

* Ensure we bump the change ID on sqlite

* Actually read the DeviceChangeID not the Offset in synapi

* Add SQL migrations

* Prepare after migration; fixup dendrite-upgrade-test logging

* Use higher version numbers; fix sqlite query to increment better

* Default 0 on postgres

* fixup postgres migration on fresh dendrite instances
2022-01-21 09:56:06 +00:00
kegsay db7d9cba8a
BREAKING: Remove Partitioned Stream Positions (#2096)
* go mod tidy

* Break complement to check it fails CI

* Remove partitioned stream positions

This was used by the device list stream position. The device list position
now corresponds to the `Offset`, and the partition is always 0, in prep
for removing reliance on Kafka topics for device list changes.

* Linting

* Migrate old style tokens to new style because element-web doesn't soft-logoout on 4xx errors on /sync
2022-01-20 15:26:45 +00:00
Neil Alexander ec716793eb
Merge federationapi, federationsender, signingkeyserver components (#2055)
* Initial federation sender -> federation API refactoring

* Move base into own package, avoids import cycle

* Fix build errors

* Fix tests

* Add signing key server tables

* Try to fold signing key server into federation API

* Fix dendritejs builds

* Update embedded interfaces

* Fix panic, fix lint error

* Update configs, docker

* Rename some things

* Reuse same keyring on the implementing side

* Fix federation tests, `NewBaseDendrite` can accept freeform options

* Fix build

* Update create_db, configs

* Name tables back

* Don't rename federationsender consumer for now
2021-11-24 10:45:23 +00:00
Neil Alexander 614e67280d
Delete device keys/signatures from key server when deleting devices (#1979)
* Delete device keys/signatures from key server when deleting device from user API

* Move loop to within database transaction

* Don't fall over deleting no rows
2021-08-18 12:07:09 +01:00
Neil Alexander ff21675c5b
Cross-signing fixes, notifications via sync, federation (#1974)
* Initial work on signing key update EDUs

* Fix build

* Produce/consume EDUs

* Producer logging

* Only produce key change notifications for local users

* Better naming

* Try to notify sync

* Enable feature

* Use key change topic

* Don't bother verifying signatures, validate key lengths if we can, notifier fixes

* Copyright notices

* Remove tests from whitelist until matrix-org/sytest#1117

* Some review comment fixes

* Update to matrix-org/gomatrixserverlib@f9416ac

* Remove unneeded parameter
2021-08-17 13:44:30 +01:00
Neil Alexander 125ea75b24
Add type field to DeviceMessage, allow fields to be nullable (#1969) 2021-08-11 09:44:14 +01:00
Neil Alexander b1377d991a
Cross-signing signature handling (#1965)
* Handle other signatures

* Decorate key ID properly

* Match by key IDs

* Tweaks

* Fixes

* Fix /user/keys/query bug, review comments, update sytest-whitelist

* Various wtweaks

* Fix wiring for keyserver in API mode

* Additional fixes
2021-08-09 14:35:24 +01:00
Neil Alexander e95b1fd238
Cross-signing validation for self-sigs, expose signatures over /user/keys/query and /user/devices/{userId} (#1962)
* Enable unstable feature again

* Try to verify when a device signs a key

* Try to verify when a key signs a device

* It's the self-signing key, not the master key

* Fix error

* Try to verify master key uploads

* Actually we can't guarantee we can do that so nevermind

* Add signatures into /devices/list request

* Fix nil pointer

* Reprioritise map creation

* Don't skip devices that don't have signatures

* Add some debug logging

* Fix logic error in QuerySignatures

* Fix bugs

* Expose master and self-signing keys on /devices/list hopefully

* maps are tedious

* Expose signatures via /keys/query

* Upload signatures when uploading keys

* Fixes

* Disable the feature again
2021-08-06 10:13:35 +01:00
Neil Alexander eb0efa4636
Cross-signing groundwork (#1953)
* Cross-signing groundwork

* Update to matrix-org/gomatrixserverlib#274

* Fix gobind builds, which stops unit tests in CI from yelling

* Some changes from review comments

* Fix build by passing in UIA

* Update to matrix-org/gomatrixserverlib@bec8d22

* Process master/self-signing keys from devices call

* nolint

* Enum-ify the key type in the database

* Process self-signing key too

* Fix sanity check in device list updater

* Fix check

* Fix sytest, hopefully

* Fix build
2021-08-04 17:56:29 +01:00
Kegsay b769d5a25e
Optimise memory usage when calling /g_m_e (#1819)
* Optimise memory usage when calling /g_m_e

* cache more events

* refactor handling of device list update pokes

* Sigh
2021-04-08 13:50:39 +01:00
Kegsay 802f1c96f8
Add more metrics (#1802)
* Add more metrics

* Linting
2021-03-23 15:22:00 +00:00
Kegsay a1b7e4ef3f
log less for failed key querys, add counters for incoming pdus/edus (#1801)
* log less for failed key querys, add counters for incoming pdus/edus

* use labels

* Blacklist flakey test

* Fix metrics
2021-03-23 11:33:36 +00:00
Kegsay 77fb981da5
device lists: backoff for longer if the wrong error type is returned (#1796) 2021-03-08 17:45:20 +00:00
Neil Alexander 81312b8a78
Return the current OTK count on an empty upload request (#1774)
* Always return OTK counts

* Fix parameter ordering

* Send IDs over to keyserver internal API

* Review comments

* Fix syntax error

* Fix panic, hopefully

* Require user ID to be set

* Fix user API call
2021-03-02 11:40:20 +00:00
Neil Alexander 6757b67a32
NewClient and NewFederationClient updates (#1730)
* Use matrix-org/gomatrixserverlib#252

* Add missing WithSkipVerify to test

* Functions instead

* Update gomatrixserverlib to matrix-org/gomatrixserverlib#252

* Fix disabling TLS validation
2021-01-22 16:09:05 +00:00
Loïck Bonniot 940577cd3c
Fix integer overflow in device_list_update.go (#1717)
Fix #1511
On 32-bits systems, int(hash.Sum32()) can be negative.
This makes the computation of array indices using modulo invalid, crashing dendrite.
Signed-off-by: Loïck Bonniot <git@lesterpig.com>
2021-01-18 12:43:15 +00:00
Neil Alexander fa65c40bae
Reduce device list GetUserDevices timeout (#1704) 2021-01-12 16:13:21 +00:00
Neil Alexander fe5d1400bf
Update federation timeouts (#1504)
* Update to matrix-org/gomatrixserverlib#234

* Update gomatrixserverlib

* Update federation timeouts

* Fix dendritejs

* Increase /send context time in destination queue
2020-10-09 17:08:32 +01:00
Kegsay 29d6481842
Wait for 8h between device list updates for blacklisted servers (#1344) 2020-08-26 15:38:21 +01:00
Kegsay abd16ff4a0
Modify DeviceListUpdater to retry requests according to RetryAfter (#1342)
* Modify DeviceListUpdater to retry requests according to RetryAfter

* Reduce wait time for sytest test pollution
2020-08-26 12:03:09 +01:00
Kegsay 6d6bb75137
Add FederationClient interface to federationsender (#1284)
* Add FederationClient interface to federationsender

- Use a shim struct in HTTP mode to keep the same API as `FederationClient`.
- Use `federationsender` instead of `FederationClient` in `keyserver`.

* Pointers not values

* Review comments

* Fix unit tests

* Rejig backoff

* Unbreak test

* Remove debug logs

* Review comments and linting
2020-08-20 17:03:07 +01:00
Kegsay 02a8515e99
Only emit key changes which are different from what we had before (#1279)
We did this already for local `/keys/upload` but didn't for
remote `/users/devices`. This meant any resyncs would spam produce
events, hammering disk i/o and spamming the logs.
2020-08-18 11:14:20 +01:00
Kegsay 20c8f252a7
Make 'Device list doesn't change if remote server is down' pass (#1268)
- As a last resort, query the DB when exhausting all possible remote query
  endpoints, but keep the field in `failures` so clients can detect that this
  is stale data.
- Unblock `DeviceListUpdater.Update` on failures rather than timing out.
- Use a mutex when writing directly to `res`, not just for failures.
2020-08-13 16:43:27 +01:00
Kegsay 820c56c165
Fix more E2E sytests (#1265)
* WIP: Eagerly sync device lists on /user/keys/query requests

Also notify servers when a user's device display name changes. Few
caveats:
 - sytest `Device deletion propagates over federation` fails
 - `populateResponseWithDeviceKeysFromDatabase` is called from multiple
   goroutines and hence is unsafe.

* Handle deleted devices correctly over federation
2020-08-12 22:43:02 +01:00
Kegsay d98ec12422
Add sync mechanism to block when updating device lists (#1264)
* Add sync mechanism to block when updating device lists

With a timeout, mainly for sytest to fix the test
"Server correctly handles incoming m.device_list_update"
which is flakey because it assumes that when `/send` 200 OKs
that the server has updated the device lists in prep for
`/keys/query` which is not always true when using workers.

* Fix UT

* Add new working test
2020-08-12 13:50:54 +01:00
Kegsay b8b854d642
Bugfixes for 'If remote user leaves room we no longer receive device updates' (#1262)
* Bugfixes for 'If remote user leaves room we no longer receive device updates'

* Update whitelist and README
2020-08-12 10:50:52 +01:00
Kegsay befccd7d51
Reduce cooldown to make sure sytest doesn't give up (#1257)
* Reduce cooldown to make sure sytest doesn't give up

* More sytests pass weeeeeee
2020-08-11 10:44:59 +01:00