Commit graph

428 commits

Author SHA1 Message Date
Neil Alexander ba2ffb7da9
Repeatable reads for /sync (#2783)
This puts repeatable reads into all sync streams.

Co-authored-by: kegsay <kegan@matrix.org>
2022-12-06 18:16:17 +00:00
Till f8d1dc521d
Fix m.receipts causing notifications (#2893)
Fixes https://github.com/matrix-org/dendrite/issues/2353
2022-11-29 15:46:28 +01:00
Erik Johnston 31f56ac3f4
Never filter out a user's own membership when using LL (#2887) 2022-11-22 21:38:27 +00:00
Neil Alexander 6650712a1c
Federation fixes for virtual hosting 2022-11-15 15:05:23 +00:00
Till d35a5642e8
Deny guest access on several endpoints (#2873)
Second part for guest access, this adds a `WithAllowGuests()` option to
`MakeAuthAPI`, allowing guests to access the specified endpoints.
Endpoints taken from the
[spec](https://spec.matrix.org/v1.4/client-server-api/#client-behaviour-14)
and by checking Synapse endpoints for `allow_guest=true`.
2022-11-11 10:52:08 +01:00
Till efe28db631
Update latestPosition when getting reversed room delta (#2860)
Regression test added in
https://github.com/matrix-org/complement/pull/551
Should fix https://github.com/matrix-org/dendrite/issues/2514?
2022-11-04 15:39:09 +01:00
Neil Alexander 98d3f88bfb
Move prev_batch calculation (#2856)
This might help #2847.
2022-11-03 16:56:21 +00:00
Neil Alexander 8704e84898
Tweak removeDuplicates calls to use events instead of recentEvents (#2853)
... since `events` is *after* history visibility filtering, not before
it.
2022-11-03 10:19:37 +00:00
Tak Wai Wong 75a508cc27
Fix issue where a member is forced to leave a room when the invite is marked deleted (#2839)
Proposed fix for issue:
https://github.com/matrix-org/dendrite/issues/2838

Suppose bob received invites to spaceA and spaceB.
When Bob joins spaceA, we add an OutputEvent event to retire the invite.
This sets the invite to "deleted" in the database. This makes sense.

The bug is in stream_invites.go. Triggered when bob received a new
invite for spaceB, and does a client sync.

In the block (line 76)
`for roomID := range retiredInvites

   if _, ok := req.Response.Rooms.Invite[roomID]; ok {
	continue
   }

   if _, ok := req.Response.Rooms.Join[roomID]; ok {
	continue
  }
...
` 

Bob is not in either maps even though he had just accepted the invite
for spaceA. Consequently, the spaceA invite is treated as a retired
invite, and a membership Leave event is generated. What bob sees is that
after accepting the invite to spaceB, he lose access to spaceA.


### Pull Request Checklist

<!-- Please read
https://matrix-org.github.io/dendrite/development/contributing before
submitting your pull request -->

* [ ] I have added tests for PR _or_ I have justified why this PR
doesn't need tests.
* [x ] Pull request includes a [sign off below using a legally
identifiable
name](https://matrix-org.github.io/dendrite/development/contributing#sign-off)
_or_ I have already signed off privately

Signed-off-by: `Tak Wai Wong <tak@hntlabs.com>`

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-11-02 10:02:23 +00:00
Neil Alexander 3db9e98456
Don't limit "state" (#2849)
This is apparently some incorrect behaviour that we built as a result of
a spec bug (matrix-org/matrix-spec#1314) where we were applying a filter
to the `"state"` section of the `/sync` response incorrectly. The client
then has no way to know that the state was limited.

This PR removes the state limiting, which probably also helps #2842.
2022-11-02 09:34:19 +00:00
ash lea 5aaa60227a
return required room_id field in /members (#2846)
### Pull Request Checklist

<!-- Please read
https://matrix-org.github.io/dendrite/development/contributing before
submitting your pull request -->

* [ ] I have added tests for PR _or_ I have justified why this PR
doesn't need tests.
* [x] Pull request includes a [sign off below using a legally
identifiable
name](https://matrix-org.github.io/dendrite/development/contributing#sign-off)
_or_ I have already signed off privately

Signed-off-by: `ash lea <example@thisismyactual.email>`
2022-11-01 16:42:07 +00:00
Neil Alexander 0b21cb78aa
Try to fix a panic in the sync API PDU stream 2022-11-01 14:45:15 +00:00
Neil Alexander 7307701a24
Tweak "state" and "timeline" filtering (#2844)
This should stop state events disappearing down a gap where we'd try to
separate out the sections *before* applying history visibility instead
of after.

This may be a better approach than #2843 but I hope @tak-hntlabs will
shout if it isn't.
2022-10-31 15:14:08 +00:00
Till 69aff372f3
Limit recent events when going backwards (#2840)
If we're going backwards, we were selecting potentially thousands of
events, which in turn were fed to history visibility checks, resulting
in bad sync performance.
2022-10-28 13:40:51 +02:00
Till Faelligen f6035822e7
Simplify error checking and check the correct error 2022-10-28 08:17:40 +02:00
Till a169a9121a
Fix /members (#2837)
Fixes a bug introduced in #2827, where the SyncAPI might not have all
requested eventIDs, resulting in too few members returned.
2022-10-27 14:18:22 +02:00
Till c62ac3d6ad
Fix Current state appears in timeline in private history with many messages after (#2830)
The problem was that we weren't getting enough recent events, as most of
them were removed by the history visibility filter. Now we're getting
all events between the given input range and re-slice the returned
values after applying history visibility.
2022-10-25 15:15:24 +02:00
Till Faelligen 8b7bf5e7d7
Return forbidden if not a member anymore (fix #2802) 2022-10-25 15:00:52 +02:00
Till 313cb3fd19
Filter /members, return members at given point (#2827)
Makes the tests
```
Can get rooms/{roomId}/members at a given point
Can filter rooms/{roomId}/members
```
pass, by moving `/members` and `/joined_members` to the SyncAPI.
2022-10-25 12:39:10 +02:00
Till 7506e3303e
Get messages from before user left the room (#2824)
This is going to make `Can get rooms/{roomId}/messages for a departed
room (SPEC-216)` pass, since we now only grep events from before the
user left the room.
2022-10-24 17:03:04 +02:00
Till 3cf42a1d64
Add syncapi_memberships table tests (#2805) 2022-10-21 12:53:04 +02:00
Till 40cfb9a4ea
Fix invite -> leave -> join dance when accepting invites (#2817)
As mentioned in
https://github.com/matrix-org/dendrite/issues/2361#issuecomment-1139394565
and observed by ourselves, this should fix the odd `invite -> leave ->
join` dance when accepting invites.
2022-10-21 09:26:22 +01:00
Till e79bfd8fd5
Get state deltas without filters (#2810)
This makes the following changes:
- get state deltas without the user supplied filter, so we can actually
"calculate" state transitions
- closes `stmt` when using SQLite
- Adds presence for users who newly joined a room, even if the syncing
user already knows about the presence status (should fix
https://github.com/matrix-org/complement/pull/516)
2022-10-19 14:05:39 +02:00
Till a8bc558a60
Always add UnreadNotifications to joined room reponses (#2793)
Fixes a minor bug, where we failed to add `UnreadNotifications` to the
join response, if it wasn't in
`GetUserUnreadNotificationCountsForRooms`.
2022-10-14 10:38:12 +02:00
Neil Alexander 23a3e04579
Event relations (#2790)
This adds support for tracking `m.relates_to`, as well as adding support
for the various `/room/{roomID}/relations/...` endpoints to the CS API.
2022-10-13 14:50:52 +01:00
Neil Alexander 0a9aebdf01
Private read receipts (#2789)
Implement behaviours for `m.read.private` receipts.
2022-10-11 12:27:21 +01:00
Neil Alexander 3920b9f9b6
Tweak GetStateDeltas behaviour (#2788)
Improves the control flow of `GetStateDeltas` for clarity and possibly
also fixes a bug where duplicate state delta entries could be inserted
with different memberships instead of being correctly overridden by
`join`.
2022-10-11 10:58:34 +01:00
Till 0f09e9d196
Move /event to the SyncAPI (#2782)
This allows us to apply history visibility without having to recalculate
it in the roomserver.
Unblocks https://github.com/matrix-org/complement/pull/495, fix missing
part of https://github.com/matrix-org/dendrite/issues/617
2022-10-10 12:19:16 +02:00
Neil Alexander 1b5460a920
Ensure we only wake up a given user once (#2775)
This ensures that the sync API notifier only wakes up a given user once
for a given stream position.
2022-10-07 13:42:35 +01:00
Till 8c5b166784
Use the stream positions of the notifier (#2768)
Use the stream positions of the notifier, which might have advanced
since setting it at the beginning of the loop. This possibly helps in
reducing roundtrips to the SyncAPI, just because we didn't fetch the
latest data.
Also fixes a minor oversight in the receipts stream.
2022-10-06 11:57:13 +01:00
Till 0f777d421c
Remove empty fields from /sync response (#2755)
First attempt at removing empty fields from `/sync` responses. Needs
https://github.com/matrix-org/sytest/pull/1298 to keep Sytest happy.

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-10-05 13:47:13 +01:00
Neil Alexander c85bc3434f
Optimise QuerySharedUsers so that we can only work on local users (#2766)
Otherwise the sync API key change consumer wastes a lot of time trying
to wake up the notifiers for non-local users.
2022-10-05 12:47:53 +01:00
Neil Alexander 21f8881985
Add indexes that optimise selectStateInRangeSQL (#2764)
This gets rid of some expensive scans on `add_state_ids` and
`remove_state_ids`, turning them into much cheaper and faster index
scans instead.
2022-10-04 16:43:10 +01:00
Ashley Nelson c1e16fd41e
Fix fragility of selectEventsWithEventIDsSQL queries (#2757)
This fixes a temporary workaround with the `selectEventsWithEventIDsSQL`
queries where fields need to be artificially added to the queries so the
row results match the format of the `syncapi_output_room_events` table.
I made similar functions that accept row results from the
`syncapi_current_room_state` table and convert them into StreamEvents
without the fields that are specific to output room events.

There is also a unit test in the first commit to ensure the resulting
behavior doesn't change from the modified queries and functions.

Fixes #601.

### Pull Request Checklist

<!-- Please read docs/CONTRIBUTING.md before submitting your pull
request -->

* [x] I have added tests for PR _or_ I have justified why this PR
doesn't need tests.
* [x] Pull request includes a [sign
off](https://github.com/matrix-org/dendrite/blob/main/docs/CONTRIBUTING.md#sign-off)

Signed-off-by: `Ashley Nelson <fant@shley.email>`

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-10-03 11:57:21 +01:00
Neil Alexander d32f60249d
Modify sync transaction behaviour (#2758)
This now uses a transaction per stream, so that errors in one stream
don't propagate to another, and we therefore no longer need to do hacks
to reopen a new transaction after aborting a failed one.
2022-10-03 11:38:20 +01:00
Neil Alexander 0116db79c6
Reset transaction after a failure 2022-09-30 17:07:37 +01:00
Neil Alexander 16048be236
Handle case when applying history visibility failed 2022-09-30 16:55:10 +01:00
Neil Alexander 7d9545ceea
Another /sync fix 2022-09-30 16:34:06 +01:00
Neil Alexander ee40a29e55
Fix broken /sync due to transaction error 2022-09-30 16:07:18 +01:00
Neil Alexander 6348486a13
Transactional isolation for /sync (#2745)
This should transactional snapshot isolation for `/sync` etc requests.

For now we don't use repeatable read due to some odd test failures with
invites.
2022-09-30 12:48:10 +01:00
Neil Alexander 3f9e38e80a
Consistent *sql.Tx usage across sync API (#2744)
This tidies up the `storage` package so that everything takes a
transaction parameter instead of something things that do and some that
don't.
2022-09-28 10:18:03 +01:00
texuf a574ed5369
Fix for `sql: converting argument $1 type: unsupported type []interfa… (#2743)
…ce {}, a slice of interface` in new notifications select

The sqlite3 version was just not working, original pr here:
https://github.com/matrix-org/dendrite/pull/2688

signed off by: austin ellis <austin@hntlabs.com>

This doesn't fix the notification counts, they still only work about 1
out of every 5 times in my tests. I will stick with my other fix locally
for reliable notification delivery:
https://github.com/matrix-org/dendrite/pull/2701
2022-09-28 06:19:34 +02:00
Neil Alexander 083ae01520
Promote reindexing log level 2022-09-27 17:30:40 +01:00
Till 87be32ca26
Fulltext implementation using Bleve (#2675)
Based on #2480

This actually indexes events based on their event type. They are removed
from the index if we receive a `m.room.redaction` event on the
`OutputRoomEvent` stream.
An admin endpoint is added to reindex all existing events.


Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-09-27 18:06:49 +02:00
Till 249b32c4f3
Refactor notifications (#2688)
This PR changes the handling of notifications
- removes the `StreamEvent` and `ReadUpdate` stream
- listens on the `OutputRoomEvent` stream in the UserAPI to inform the
SyncAPI about unread notifications
- listens on the `OutputReceiptEvent` stream in the UserAPI to set
receipts/update notifications
- sets the `read_markers` directly from within the internal UserAPI

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-09-27 15:01:34 +02:00
PiotrKozimor 12649ccedd
Improve selectRoomIDsWithAnyMembershipSQL performance (#2738)
Recently I have observed that dendrite spends a lot of time (~390s) in
`selectRoomIDsWithAnyMembershipSQL` query

```
dendrite_syncapi=# select total_exec_time, left(query,100) from pg_stat_statements order by total_exec_time desc limit 5 ;
  total_exec_time   |                                                 left
--------------------+------------------------------------------------------------------------------------------------------
  747826.5800519128 | SELECT event_id, id, headered_event_json, session_id, exclude_from_sync, transaction_id, history_vis
  389130.5490339942 | SELECT DISTINCT room_id, membership FROM syncapi_current_room_state WHERE type = $2 AND state_key =
 376104.17514700035 | SELECT psd.datname, xact_commit, xact_rollback, blks_read, blks_hit, tup_returned, tup_fetched, tup_
   363644.164092031 | SELECT event_type_nid, event_state_key_nid, event_nid FROM roomserver_events WHERE event_nid = ANY($
  58570.48104699995 | SELECT event_id, headered_event_json FROM syncapi_current_room_state WHERE room_id = $1 AND ( $2::te
(5 rows)
```

Explain analyze showed correct usage of `syncapi_room_state_unique`
index:

```
dendrite_syncapi=#
explain analyze SELECT distinct room_id, membership FROM syncapi_current_room_state WHERE type = 'm.room.member' AND state_key = '@qjfl:dendrite.stg.globekeeper.com';
                                                                               QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=2749.38..2749.56 rows=24 width=52) (actual time=2.933..2.956 rows=65 loops=1)
   ->  Sort  (cost=2749.38..2749.44 rows=24 width=52) (actual time=2.932..2.937 rows=65 loops=1)
         Sort Key: room_id, membership
         Sort Method: quicksort  Memory: 34kB
         ->  Index Scan using syncapi_room_state_unique on syncapi_current_room_state  (cost=0.41..2748.83 rows=24 width=52) (actual time=0.030..2.890 rows=65 loops=1)
               Index Cond: ((type = 'm.room.member'::text) AND (state_key = '@qjfl:dendrite.stg.globekeeper.com'::text))
 Planning Time: 0.140 ms
 Execution Time: 2.990 ms
(8 rows)
```

Multi-column indexes in Postgres shall perform well for leftmost
columns, but I gave it a try and created
`syncapi_current_room_state_type_state_key_idx` index. I could observe
significant performance improvement. Execution time dropped from 2.9 ms
to 0.24 ms:

```
explain analyze SELECT distinct room_id, membership FROM syncapi_current_room_state WHERE type = 'm.room.member' AND state_key = '@qjfl:dendrite.stg.globekeeper.com';
                                                                             QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=96.46..96.64 rows=24 width=52) (actual time=0.199..0.218 rows=65 loops=1)
   ->  Sort  (cost=96.46..96.52 rows=24 width=52) (actual time=0.199..0.202 rows=65 loops=1)
         Sort Key: room_id, membership
         Sort Method: quicksort  Memory: 34kB
         ->  Bitmap Heap Scan on syncapi_current_room_state  (cost=4.53..95.91 rows=24 width=52) (actual time=0.048..0.139 rows=65 loops=1)
               Recheck Cond: ((type = 'm.room.member'::text) AND (state_key = '@qjfl:dendrite.stg.globekeeper.com'::text))
               Heap Blocks: exact=59
               ->  Bitmap Index Scan on syncapi_current_room_state_type_state_key_idx  (cost=0.00..4.53 rows=24 width=0) (actual time=0.037..0.037 rows=65 loops=1)
                     Index Cond: ((type = 'm.room.member'::text) AND (state_key = '@qjfl:dendrite.stg.globekeeper.com'::text))
 Planning Time: 0.236 ms
 Execution Time: 0.242 ms
(11 rows)
```

Next improvement is skipping DISTINCT and rely on map assignment in
`SelectRoomIDsWithAnyMembership`. Execution time drops by almost half:

```
explain analyze SELECT room_id, membership FROM syncapi_current_room_state WHERE type = 'm.room.member' AND state_key = '@qjfl:dendrite.stg.globekeeper.com';
                                                                       QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on syncapi_current_room_state  (cost=4.53..95.91 rows=24 width=52) (actual time=0.032..0.113 rows=65 loops=1)
   Recheck Cond: ((type = 'm.room.member'::text) AND (state_key = '@qjfl:dendrite.stg.globekeeper.com'::text))
   Heap Blocks: exact=59
   ->  Bitmap Index Scan on syncapi_current_room_state_type_state_key_idx  (cost=0.00..4.53 rows=24 width=0) (actual time=0.021..0.021 rows=65 loops=1)
         Index Cond: ((type = 'm.room.member'::text) AND (state_key = '@qjfl:dendrite.stg.globekeeper.com'::text))
 Planning Time: 0.087 ms
 Execution Time: 0.136 ms
(7 rows)
```

In our env we spend only 1s on inserting to table, so the write penalty
of creating an index should be small.
```
dendrite_syncapi=# select total_exec_time, left(query,100) from pg_stat_statements where query like '%INSERT%syncapi_current_room_state%' order by total_exec_time desc;
  total_exec_time   |                                                 left
--------------------+------------------------------------------------------------------------------------------------------
 1139.9057619999971 | INSERT INTO syncapi_current_room_state (room_id, event_id, type, sender, contains_url, state_key, he
(1 row)
``` 

This PR does not require test modifications.

### Pull Request Checklist

<!-- Please read docs/CONTRIBUTING.md before submitting your pull
request -->

* [x] I have added added tests for PR _or_ I have justified why this PR
doesn't need tests.
* [x] Pull request includes a [sign
off](https://github.com/matrix-org/dendrite/blob/main/docs/CONTRIBUTING.md#sign-off)

Signed-off-by: `Piotr Kozimor <p1996k@gmail.com>`
2022-09-27 09:41:36 +01:00
Till c53f284fdb
Get the DeviceListPosition before anything else in complete syncs (#2733)
This should hopefully unflake `Can query remote device keys using POST`
in Complement.
2022-09-22 17:49:35 +02:00
Neil Alexander 97d7cf2232
Remove deleted state logging lines from sync API (they are pointless) 2022-09-20 11:25:18 +01:00
Till e007b8038f
Mark device list as stale, if we don't have the requesting device (#2728)
This hopefully makes E2EE chats a little bit more reliable by re-syncing
devices if we don't have the `requesting_device_id` in our database. (As
seen in
[Synapse](c52abc1cfd/synapse/handlers/devicemessage.py (L157-L201)))
2022-09-20 11:32:03 +02:00
Neil Alexander b05e028f7d
Tweak LoadMembershipAtEvent behaviour when state not known (#2716)
Previously `LoadMembershipAtEvent` would fail if the state before one of
the events was not known, i.e. because it was an outlier. This modifies
it so that it gracefully handles not knowing the state and returns no
memberships instead, so that history visibility doesn't freak out and
kill `/sync` requests dead.
2022-09-13 12:52:09 +01:00
Till c366ccdfca
Send-to-device consumer/producer tweaks (#2713)
Some tweaks for the send-to-device consumers/producers:
- use `json.RawMessage` without marshalling it first
- try further devices (if available) if we failed to `PublishMsg` in the
producers
- some logging changes (to better debug E2EE issues)
2022-09-13 09:35:45 +02:00
Neil Alexander 955e69a3b7
Optimise SharedUsers again by using complete composite index 2022-09-09 14:18:45 +01:00
Neil Alexander 6ee758df63
Optimise shared users query in Synx API slightly by removing a potential sort 2022-09-09 13:50:50 +01:00
Neil Alexander 646de03d60
More writer fixes in the Sync API 2022-09-09 13:06:42 +01:00
Neil Alexander 175f65407a
Allow batching in JetStreamConsumer (#2686)
This allows us to receive more than one message from NATS at a time if we want.
2022-08-31 12:21:56 +01:00
PiotrKozimor 2be43560ca
Index on syncapi_send_to_device table (#2684)
Introduced index improves select query performance. Example execution time of `selectSendToDeviceMessagesSQL` query dropped from 80 ms to 15 ms. No sytest modifications are required.

### Pull Request Checklist

* [x] I have added added tests for PR _or_ I have justified why this PR doesn't need tests.
* [x] Pull request includes a [sign off](https://github.com/matrix-org/dendrite/blob/main/docs/CONTRIBUTING.md#sign-off)

Signed-off-by: `Piotr Kozimor <p1996k@gmail.com>`
2022-08-30 14:47:54 +01:00
Till 7313f56f44
Use existing limit instead of default limit when lazy loading members (#2682)
This should fix an issue where we return less than the expected membership events, when doing an initial sync.
When doing an initial sync, the state limit is set to `math.MaxInt32`, while the default filter is set to 20.
2022-08-30 14:18:47 +02:00
Till 07dd9bd995
SyncAPI tweaks/fixes (#2671)
- Reverts 9dc57122d9 as it was causing issues https://github.com/matrix-org/dendrite/issues/2660
- Updates the GMSL `DefaultStateFilter` to use a limit of 20 events
- Uses the timeline events to determine the new position instead of the state events
2022-08-25 13:42:47 +01:00
Neil Alexander 522bd2999f
Allow un-rejecting events on reprocessing 2022-08-24 14:03:06 +01:00
Till 9dc57122d9
Fetch more data for newly joined rooms in an incremental sync (#2657)
If we've joined a new room in an incremental sync, try fetching more data.
This deflakes the complement server notices test (at least in my tests).
2022-08-19 15:32:24 +02:00
Till 365da70a23
Set historyVisibility for backfilled events over federation (#2656)
This should hopefully deflake Backfill works correctly with history visibility set to joined as we were using the default shared visibility, even if the events are set to joined (or something else)
2022-08-19 11:04:26 +02:00
Till 5cacca92d2
Make SyncAPI unit tests more reliable (#2655)
This should hopefully make some SyncAPI tests more reliable
2022-08-19 11:03:55 +02:00
Till Faelligen 8d9c8f11c5
Add a delay after sending events to the roomserver 2022-08-18 08:56:57 +02:00
Neil Alexander ad4ac2c016
Stop spamming the logs with StateBetween: ignoring deleted state event IDs 2022-08-16 14:42:35 +01:00
Neil Alexander ec16c944eb
Lazy-loading fixes (#2646)
* Use existing current room state if we have it

* Don't dedupe before applying the history vis filter

* Revert "Don't dedupe before applying the history vis filter"

This reverts commit d27c4a0874.

* Revert "Use existing current room state if we have it"

This reverts commit 5819b4a7ce.

* Tweaks
2022-08-16 14:42:00 +01:00
Till 0642ffc0f6
Only return non-retired invites (#2643)
* Only return non-retired invites

* Revert "Only return non-retired invites"

This reverts commit 1150aa7f38.

* Check if we're doing an initial sync in the stream
2022-08-16 10:29:36 +02:00
Till 05cafbd197
Implement history visibility on /messages, /context, /sync (#2511)
* Add possibility to set history_visibility and user AccountType

* Add new DB queries

* Add actual history_visibility changes for /messages

* Add passing tests

* Extract check function

* Cleanup

* Cleanup

* Fix build on 386

* Move ApplyHistoryVisibilityFilter to internal

* Move queries to topology table

* Add filtering to /sync and /context
Some cleanup

* Add passing tests; Remove failing tests :(

* Re-add passing tests

* Move filtering to own function to avoid duplication

* Re-add passing test

* Use newly added GMSL HistoryVisibility

* Update gomatrixserverlib

* Set the visibility when creating events

* Default to shared history visibility

* Remove unused query

* Update history visibility checks to use gmsl
Update tests

* Remove unused statement

* Update migrations to set "correct" history visibility

* Add method to fetch the membership at a given event

* Tweaks and logging

* Use actual internal rsAPI, default to shared visibility in tests

* Revert "Move queries to topology table"

This reverts commit 4f0d41be9c.

* Remove noise/unneeded code

* More cleanup

* Try to optimize database requests

* Fix imports

* PR peview fixes/changes

* Move setting history visibility to own migration, be more restrictive

* Fix unit tests

* Lint

* Fix missing entries

* Tweaks for incremental syncs

* Adapt generic changes

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
Co-authored-by: kegsay <kegan@matrix.org>
2022-08-11 18:23:35 +02:00
Neil Alexander c45d0936b5
Generic-based internal HTTP API (#2626)
* Generic-based internal HTTP API (tested out on a few endpoints in the federation API)

* Add `PerformInvite`

* More tweaks

* Fix metric name

* Fix LookupStateIDs

* Lots of changes to clients

* Some serverside stuff

* Some error handling

* Use paths as metric names

* Revert "Use paths as metric names"

This reverts commit a9323a6a34.

* Namespace metric names

* Remove duplicate entry

* Remove another duplicate entry

* Tweak error handling

* Some more tweaks

* Update error behaviour

* Some more error tweaking

* Fix API path for `PerformDeleteKeys`

* Fix another path

* Tweak federation client proxying

* Fix another path

* Don't return typed nils

* Some more tweaks, not that it makes any difference

* Tweak federation client proxying

* Maybe fix the key backup test
2022-08-11 15:29:33 +01:00
Till e930959e49
Send-to-device/sync tweaks (#2630)
* Always delete send to device messages

* Omit empty to_device

* Tweak /sync response to omit empty values
2022-08-09 10:40:46 +02:00
Till Faelligen 10a151cb55
Don't panic if we fail to upsert account data 2022-08-05 15:37:13 +02:00
Till 3a156a434a
Invalidate lazyLoadCache if we're doing an initial sync (#2623)
* Bypass lazyLoadCache if we're doing an initial sync

* Make the linter happy again?

* Revert "Make the linter happy again?"

This reverts commit 52a5691ba3.

* Try that again

* Invalidate LazyLoadCache on initial syncs

* Remove unneeded check

* Add TODO

* Rename Invalite -> InvalidateLazyLoadedUser

* Thanks IDE
2022-08-05 14:27:27 +02:00
Till cecd11be9a
Partly fix notification counts (#2621)
* Fix notification query

* Also for SQLite

* Move tests to whitelist

* Revert "Move tests to whitelist"

This reverts commit a7d0120019.
2022-08-05 13:44:20 +02:00
Neil Alexander c8935fb53f
Do not use ioutil as it is deprecated (#2625) 2022-08-05 10:26:59 +01:00
Till 1b7f84250a
Fix linter issues (#2624)
* Try that again

* All hail the mighty linter?

* And once again

* goimport all the things
2022-08-05 11:12:41 +02:00
Brian Meek de78eab63a
Add race testing to tests, and fix a few small race conditions in the tests (#2587)
* Add race testing to tests, and fix a few small race conditions in the tests

* Enable run-sytest on MacOS

* Remove deadlock detecting mutex, per code review feedback

* Remove autoformatting related changes and a closure that is not needed

* Adjust to importing nats client as 'natsclient'

Signed-off-by: Brian Meek <brian@hntlabs.com>

* Clarify the use of gooseMutex to proect goose internal state

Signed-off-by: Brian Meek <brian@hntlabs.com>

* Remove no longer needed mutex for guarding goose

Signed-off-by: Brian Meek <brian@hntlabs.com>
2022-08-05 09:19:33 +01:00
Till 9fe509b18d
Fix syncapi shared users query & device lists (#2614)
* Fix query issue, only add "changed" users if we actually share a room

* Avoid log spam if context is done

* Undo changes to filterSharedUsers

* Add logging again..

* Fix SQLite shared users query

* Change query to include invited users
2022-08-03 18:35:17 +02:00
Till df5d4dc7a3
Delete correct Send-to-Device messages (#2608)
* Add send-to-device tests

* Update tests, fix message deletion

* PR comments
2022-08-02 17:00:16 +02:00
sergekh2 6b6b420b9f
Fix issue with sync API not advancing. (#2603)
Issue: During conversation, under some conditions, sync cookie is not advanced, and, as a result, client loops on the same sync API call creating high traffic and CPU load.
Fix: pdu component of cookie was updated incorrectly.
2022-08-02 09:43:48 +01:00
Neil Alexander e94ef84aab
De-race CompleteSync (#2601)
The `err` was coming from outside of the goroutine and being written to by concurrent goroutines.
2022-08-01 15:55:56 +01:00
Till 081f5e7226
Update database migrations, remove goose (#2264)
* Add new db migration

* Update migrations
Remove goose

* Add possibility to test direct upgrades

* Try to fix WASM test

* Add checks for specific migrations

* Remove AddMigration
Use WithTransaction
Add Dendrite version to table

* Fix linter issues

* Update tests

* Update comments, outdent if

* Namespace migrations

* Add direct upgrade tests, skipping over one version

* Split migrations

* Update go version in CI

* Fix copy&paste mistake

* Use contexts in migrations

Co-authored-by: kegsay <kegan@matrix.org>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-07-25 10:39:22 +01:00
Neil Alexander f0c8a03649
Membership updater refactoring (#2541)
* Membership updater refactoring

* Pass in membership state

* Use membership check rather than referring to state directly

* Delete irrelevant membership states

* We don't need the leave event after all

* Tweaks

* Put a log entry in that I might stand a chance of finding

* Be less panicky

* Tweak invite handling

* Don't freak if we can't find the event NID

* Use event NID from `types.Event`

* Clean up

* Better invite handling

* Placate the almighty linter

* Blacklist a Sytest which is otherwise fine under Complement for reasons I don't understand

* Fix the sytest after all (thanks @S7evinK for the spot)
2022-07-22 14:44:04 +01:00
Till Faelligen bcff14adea Set historyVisibility in rowsToStreamEvents 2022-07-18 18:19:44 +02:00
Till a7e92f8cb9
History visibility database changes (#2533)
* Add new history_visibility column

* Update SQL queries to include history_visibility

* Store the history visibilty calculated by the roomserver

* Update GMSL

* Update migrations

* Fix migration

* Update GMSL

* Fix `go.sum`

* Update GMSL to use sql.Scanner & sql.Valuer

* Re-order migration/table creation

* Update gomatrixserverlib

* Add history_visibility column to current_room_state

* Fix migrations

* Return error instead of Fatal log

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-07-18 14:46:15 +02:00
Neil Alexander 90bf01d8b1
Use sync API database in filterSharedUsers (#2572)
* Add function to the sync API storage package for filtering shared users

* Use the database instead of asking the RS API

* Fix unit tests

* Fix map handling in `filterSharedUsers`
2022-07-15 16:25:26 +01:00
Till 09f0ff14c8
Minor SendToDevice fix (#2565)
* Avoid unnecessary marshalling if sending to the local server

* Fix ordering of ToDevice messages

* Revive SendToDevice test
2022-07-12 08:23:58 +02:00
Neil Alexander 3ea21273bc
Ristretto cache (#2563)
* Try Ristretto cache

* Tweak

* It's beautiful

* Update GMSL

* More strict keyable interface

* Fix that some more

* Make less panicky

* Don't enforce mutability checks for now

* Determine mutability using deep equality

* Tweaks

* Namespace keys

* Make federation caches mutable

* Update cost estimation, add metric

* Update GMSL

* Estimate cost for metrics better

* Reduce counters a bit

* Try caching events

* Some guards

* Try again

* Try this

* Use separate caches for hopefully better hash distribution

* Fix bug with admitting events into cache

* Try to fix bugs

* Check nil

* Try that again

* Preserve order jeezo this is messy

* thanks VS Code for doing exactly the wrong thing

* Try this again

* Be more specific

* aaaaargh

* One more time

* That might be better

* Stronger sorting

* Cache expiries, async publishing of EDUs

* Put it back

* Use a shared cache again

* Cost estimation fixes

* Update ristretto

* Reduce counters a bit

* Clean up a bit

* Update GMSL

* 1GB

* Configurable cache sizees

* Tweaks

* Add `config.DataUnit` for specifying friendly cache sizes

* Various tweaks

* Update GMSL

* Add back some lazy loading caching

* Include key in cost

* Include key in cost

* Tweak max age handling, config key name

* Only register prometheus metrics if requested

* Review comments @S7evinK

* Don't return errors when creating caches (it is better just to crash since otherwise we'll `nil`-pointer exception everywhere)

* Review comments

* Update sample configs

* Update GHA Workflow

* Update Complement images to Go 1.18

* Remove the cache test from the federation API as we no longer guarantee immediate cache admission

* Don't check the caches in the renewal test

* Possibly fix the upgrade tests

* Update to matrix-org/gomatrixserverlib#322

* Update documentation to refer to Go 1.18
2022-07-11 14:31:31 +01:00
Till f76f28e6db
Fix issue uint64 values with high bit are not supported in presence (#2562)
* Fix issue #2528

* Use gomatrixserverlib.Timestamp

* Use ParseUint instead of ParseInt
2022-07-07 16:29:25 +02:00
Neil Alexander 460dccf93d
Hopefully fix read receipts timestamps (#2557)
This should avoid coercions between signed and unsigned ints which might fix problems like `sql: converting argument $5 type: uint64 values with high bit set are not supported`.
2022-07-05 17:13:26 +01:00
Till 89cd0e8fc1
Try to fix backfilling (#2548)
* Try to fix backfilling

* Return start/end to not confuse clients

* Update GMSL

* Update GMSL
2022-07-01 11:49:26 +02:00
Till 561c159ad7
Silence presence logs (#2547) 2022-06-30 12:34:37 +02:00
Till 2086992caf
Don't return end if there are not more messages (#2542)
* Be more spec compliant

* Move lazyLoadMembers to own method
2022-06-29 10:49:12 +02:00
Neil Alexander 7120eb6bc9
Add InputDeviceListUpdate to the keyserver, remove old input API (#2536)
* Add `InputDeviceListUpdate` to the keyserver, remove old input API

* Fix copyright

* Log more information when a device list update fails
2022-06-15 14:27:07 +01:00
Neil Alexander 4c2a10f1a6
Handle state before, send history visibility in output (#2532)
* Check state before event

* Tweaks

* Refactor a bit, include in output events

* Don't waste time if soft failed either

* Tweak control flow, comments, use GMSL history visibility type
2022-06-13 15:11:10 +01:00
kegsay 21dd5a7176
syncapi: don't return early for no-op incremental syncs (#2473)
* syncapi: don't return early for no-op incremental syncs

Comments explain why, but basically it's an inefficient use
of bandwidth and some sytests rely on /sync to block.

* Honour timeouts

* Actually return a response with timeout=0
2022-05-19 09:00:56 +01:00
kegsay b3162755a9
bugfix: fix race condition when updating presence via /sync (#2470)
* bugfix: fix race condition when updating presence via /sync

Previously when presence is updated via /sync, we would send the presence update
asyncly via NATS. This created a race condition:
 - If the presence update is processed quickly, the /sync which triggered the presence
   update would see an online presence.
 - If the presence update was processed slowly, the /sync which triggered the presence
   update would see an offline presence.

This is the root cause behind the flakey sytest: 'User sees their own presence in a sync'.

The fix is to ensure we update the database/advance the stream position synchronously
for local users.

* Bugfix for test
2022-05-17 15:53:08 +01:00
kegsay 6de29c1cd2
bugfix: E2EE device keys could sometimes not be sent to remote servers (#2466)
* Fix flakey sytest 'Local device key changes get to remote servers'

* Debug logs

* Remove internal/test and use /test only

Remove a lot of ancient code too.

* Use FederationRoomserverAPI in more places

* Use more interfaces in federationapi; begin adding regression test

* Linting

* Add regression test

* Unbreak tests

* ALL THE LOGS

* Fix a race condition which could cause events to not be sent to servers

If a new room event which rewrites state arrives, we remove all joined hosts
then re-calculate them. This wasn't done in a transaction so for a brief period
we would have no joined hosts. During this interim, key change events which arrive
would not be sent to destination servers. This would sporadically fail on sytest.

* Unbreak new tests

* Linting
2022-05-17 13:23:35 +01:00
Till Faelligen b57fdcc82d Only try to get OTKs if the context isn't done yet 2022-05-13 10:28:00 +02:00
Kegan Dougal 3437adf597 Wait 100ms for events to be processed by syncapi 2022-05-12 10:11:46 +01:00
Till 58af7f61b6
Fix OTK upload spam (#2448)
* Fix OTK spam

* Update comment

* Optimize selectKeysCountSQL to only return max 100 keys

* Return CurrentPosition if the request timed out

* Revert "Return CurrentPosition if the request timed out"

This reverts commit 7dbdda9641.

Co-authored-by: kegsay <kegan@matrix.org>
2022-05-11 17:15:18 +01:00
kegsay 9599b3686e
More syncapi tests (#2451)
* WIP tests for flakey create event

* Uncomment all database test
2022-05-11 13:44:32 +01:00