Commit graph

154 commits

Author SHA1 Message Date
Neil Alexander 134ec18614
Ack tweaks 2021-11-03 15:57:36 +00:00
Neil Alexander eb07c2d5d7
Message acknowledgements 2021-11-03 15:45:51 +00:00
Neil Alexander 1110f830c6
More refactoring to remove saramajetstream 2021-11-03 14:28:40 +00:00
Neil Alexander 6b835b83bf
Roomserver input API queuing using NATS 2021-11-03 11:41:51 +00:00
Neil Alexander 73d6964fb4
Merge branch 'master' into add-nats-support 2021-11-02 17:36:22 +00:00
PiotrKozimor dec05c3347
Run gofmt on dendrite - apply go 1.17 preferred build tags (#2021) 2021-11-02 16:48:48 +00:00
Neil Alexander fbd1a0ab13
Update to matrix-org/gomatrixserverlib@5e02b64 2021-11-02 10:13:38 +00:00
Ryan W a624eab309
- Removed double imports (#1989)
- Lower cased error messages

Signed-off-by: Ryan Whittington <twentybitdev@gmail.com>

Co-authored-by: kegsay <kegan@matrix.org>
2021-09-08 17:31:03 +01:00
Neil Alexander ff21675c5b
Cross-signing fixes, notifications via sync, federation (#1974)
* Initial work on signing key update EDUs

* Fix build

* Produce/consume EDUs

* Producer logging

* Only produce key change notifications for local users

* Better naming

* Try to notify sync

* Enable feature

* Use key change topic

* Don't bother verifying signatures, validate key lengths if we can, notifier fixes

* Copyright notices

* Remove tests from whitelist until matrix-org/sytest#1117

* Some review comment fixes

* Update to matrix-org/gomatrixserverlib@f9416ac

* Remove unneeded parameter
2021-08-17 13:44:30 +01:00
Till Faelligen 5f9996b1c5
Revert "Remove unneeded TopicFor"
This reverts commit f5a4e4a339.
2021-07-26 12:06:09 +02:00
Till Faelligen f5a4e4a339
Remove unneeded TopicFor 2021-07-24 12:17:42 +02:00
Till Faelligen a833f5764a
Merge branch 'master' of https://github.com/matrix-org/dendrite into add-nats-support 2021-07-24 11:27:24 +02:00
Neil Alexander f63068df3b
Only include go-sqlite3 on the relevant binaries (#1900)
* Only include go-sqlite3 on the relevant binaries

* The driver name is always sqlite3 now

* Update to matrix-org/go-sqlite3-js@e537baa
2021-07-20 11:18:14 +01:00
Neil Alexander e2e1a966e1
Merge branch 'master' into add-nats-support 2021-07-20 10:36:13 +01:00
kegsay 728061db03
fedsender: try to satisfy all notary key requests from the cache first (#1925)
* fedsender: try to satisfy all notary key requests from the cache first

* Linting
2021-07-16 11:35:42 +01:00
kegsay c102adaf43
fedsender: add cache tables for notary keys (#1923)
* Add notary server tables for postgres

* Add sqlite tables

* fedsender: GetServerKeys -> QueryServerKeys

As it now checks a cache and can return multiple responses
2021-07-15 17:45:37 +01:00
Neil Alexander 4c45de81b3
Jetstream package 2021-07-14 13:34:42 +01:00
Neil Alexander 79c5485c8d
Allow clearing federation blacklist at startup for P2P demos 2021-05-24 11:43:24 +01:00
Kegsay 656d11ec90
fedsender: tolerate dupe membership events (#1824)
* fedsender: tolerate dupe membership events

Previously if the fedsender got a duplicate membership event it would cause
the entire process to crash. Now it doesn't. This masks an issue with the
roomserver where it can emit duplicate membership events.

* Update joined_hosts_table.go
2021-04-14 11:11:54 +01:00
Kegsay a1b7e4ef3f
log less for failed key querys, add counters for incoming pdus/edus (#1801)
* log less for failed key querys, add counters for incoming pdus/edus

* use labels

* Blacklist flakey test

* Fix metrics
2021-03-23 11:33:36 +00:00
Neil Alexander d15836e260
Increase gocyclo complexity to 25 (and remove all but 2 golint directives related to it) (#1783) 2021-03-03 14:35:57 +00:00
Neil Alexander db637515a5
Update libp2p dependencies 2021-02-18 10:14:24 +00:00
Neil Alexander 4c0103a2d5
Don't close channels when clearing queue (we might race and panic, when the GC will still clean it up for us anyway) 2021-02-18 09:26:40 +00:00
Neil Alexander 8b5cd256cb
Don't hold destination queues in memory forever (#1769)
* Don't hold destination queues in memory forever

* Close channels

* Fix ordering

* Clear more aggressively

* clearQueue only called by defer so should be safe to delete queue in any case

* Wake queue when created, otherwise cleanup doesn't get called in all cases

* Clean up periodically, we hit a race condition otherwise

* Tweaks

* Don't create queues for blacklisted hosts

* Check blacklist properly
2021-02-17 15:16:35 +00:00
Neil Alexander bd72ed50d4
Reduce log level of 'Failed to send transaction' log line, since quite often it is flooding logs for dead servers 2021-02-04 12:25:31 +00:00
Neil Alexander 6099379ea4
Remove rooms table from federation sender (#1751)
* Remove last sent event ID column from federation sender

* Remove EventIDMismatchError

* Remove the federationsender rooms table altogether, it's useless

* Add migration

* Fix migrations

* Fix migrations
2021-02-04 11:52:49 +00:00
Neil Alexander 9f443317bc
Graceful shutdowns (#1734)
* Initial graceful stop

* Fix dendritejs

* Use process context for outbound federation requests in destination queues

* Reduce logging

* Fix log level
2021-01-26 12:56:20 +00:00
Kegsay ef9d5ad4fe
Check peek state response and refactor checking send_join response (#1732) 2021-01-22 17:16:35 +00:00
Matthew Hodgson 0571d395b5
Peeking over federation via MSC2444 (#1391)
* a very very WIP first cut of peeking via MSC2753.

doesn't yet compile or work.
needs to actually add the peeking block into the sync response.
checking in now before it gets any bigger, and to gather any initial feedback on the vague shape of it.

* make PeekingDeviceSet private

* add server_name param

* blind stab at adding a `peek` section to /sync

* make it build

* make it launch

* add peeking to getResponseWithPDUsForCompleteSync

* cancel any peeks when we join a room

* spell out how to run outside of docker if you want speed

* fix SQL

* remove unnecessary txn for SelectPeeks

* fix s/join/peek/ cargocult fail

* HACK: Track goroutine IDs to determine when we write by the wrong thread

To use: set `DENDRITE_TRACE_SQL=1` then grep for `unsafe`

* Track partition offsets and only log unsafe for non-selects

* Put redactions in the writer goroutine

* Update filters on writer goroutine

* wrap peek storage in goid hack

* use exclusive writer, and MarkPeeksAsOld more efficiently

* don't log ascii in binary at sql trace...

* strip out empty roomd deltas

* re-add txn to SelectPeeks

* re-add accidentally deleted field

* reject peeks for non-worldreadable rooms

* move perform_peek

* fix package

* correctly refactor perform_peek

* WIP of implementing MSC2444

* typo

* Revert "Merge branch 'kegan/HACK-goid-sqlite-db-is-locked' into matthew/peeking"

This reverts commit 3cebd8dbfb, reversing
changes made to ed4b3a58a7.

* (almost) make it build

* clean up bad merge

* support SendEventWithState with optional event

* fix build & lint

* fix build & lint

* reinstate federated peeks in the roomserver (doh)

* fix sql thinko

* todo for authenticating state returned by /peek

* support returning current state from QueryStateAndAuthChain

* handle SS /peek

* reimplement SS /peek to prod the RS to tell the FS about the peek

* rename RemotePeeks as OutboundPeeks

* rename remote_peeks_table as outbound_peeks_table

* add perform_handle_remote_peek.go

* flesh out federation doc

* add inbound peeks table and hook it up

* rename ambiguous RemotePeek as InboundPeek

* rename FSAPI's PerformPeek as PerformOutboundPeek

* setup inbound peeks db correctly

* fix api.SendEventWithState with no event

* track latestevent on /peek

* go fmt

* document the peek send stream race better

* fix SendEventWithRewrite not to bail if handed a non-state event

* add fixme

* switch SS /peek to use SendEventWithRewrite

* fix comment

* use reverse topo ordering to find latest extrem

* support postgres for federated peeking

* go fmt

* back out bogus go.mod change

* Fix performOutboundPeekUsingServer

* Fix getAuthChain -> GetAuthChain

* Fix build issues

* Fix build again

* Fix getAuthChain -> GetAuthChain

* Don't repeat outbound peeks for the same room ID to the same servers

* Fix lint

* Don't omitempty to appease sytest

Co-authored-by: Kegan Dougal <kegan@matrix.org>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2021-01-22 14:55:08 +00:00
Kegsay 80aa9aa8b0
Implement MSC2946 over federation (#1722)
* Add fedsender dep on msc2946

* Add MSC2946Spaces to fsAPI

* Add exclude_rooms impl

* Implement fed spaces handler

* Use stripped state not room version

* Call federated spaces at the right time
2021-01-19 17:14:25 +00:00
Neil Alexander 534c29ab02
Log event ID on consumer errors (fixes #1714) 2021-01-18 12:58:48 +00:00
Neil Alexander 56b5847c74
Add prometheus metrics for destination queues, sync requests
Squashed commit of the following:

commit 7ed1c6cfe67429dbe378a763d832c150eb0f781d
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Wed Dec 16 14:53:27 2020 +0000

    Updates

commit 8442099d08760b8d086e6d58f9f30284e378a2cd
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Wed Dec 16 14:43:18 2020 +0000

    Add some sync statistics

commit ffe2a11644ed3d5297d1775a680886c574143fdb
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Wed Dec 16 14:37:00 2020 +0000

    Fix backing off display

commit 27443a93855aa60a49806ecabbf9b09f818301bd
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Wed Dec 16 14:28:43 2020 +0000

    Add some destination queue metrics
2020-12-16 15:02:39 +00:00
Neil Alexander f64c8822bc
Federation sender refactor (#1621)
* Refactor federation sender, again

* Clean up better

* Missing operators

* Try to get overflowed events from database

* Fix queries

* Log less

* Comments

* nil PDUs/EDUs shouldn't happen but guard against them for safety

* Tweak logging

* Fix transaction coalescing

* Update comments

* Check nils more

* Remove channels as they add extra complexity and possibly will deadlock

* Don't hold lock while sending transaction

* Less spam about sleeping queues

* Comments

* Bug-fixing

* Don't try to rehydrate twice

* Don't queue in memory for blacklisted destinations

* Don't queue in memory for blacklisted destinations

* Fix a couple of bugs

* Check for duplicates when pulling things out of the database

* Durable transactions, some more refactoring

* Revert "Durable transactions, some more refactoring"

This reverts commit 5daf924eaa.

* Fix deadlock
2020-12-09 10:03:22 +00:00
Neil Alexander 5d65a879a5
Federation sender event cache (#1614)
* Cache federation sender events

* Store in the correct cache

* Update federation event cache

* Fix Unset

* Give EDUs same caching treatment as PDUs

* Make federationsender_cache_size configurable

* Default caches configuration

* Fix unit tests

* Revert "Fix unit tests"

This reverts commit 24eb5d2252.

* Revert "Default caches configuration"

This reverts commit 464ecd1e64.

* Revert "Make federationsender_cache_size configurable"

This reverts commit 4631f53241.
2020-12-04 14:52:10 +00:00
Kegsay b507312d4c
MSC2836 threading: part 2 (#1596)
* Update GMSL

* Add MSC2836EventRelationships to fedsender

* Call MSC2836EventRelationships in reqCtx

* auth remote servers

* Extract room ID and servers from previous events; refactor a bit

* initial cut of federated threading

* Use the right client/fed struct in the response

* Add QueryAuthChain for use with MSC2836

* Add auth chain to federated response

* Fix pointers

* under CI: more logging and enable mscs, nil fix

* Handle direction: up

* Actually send message events to the roomserver..

* Add children and children_hash to unsigned, with tests

* Add logic for exploring threads and tracking children; missing storage functions

* Implement storage functions for children

* Add fetchUnknownEvent

* Do federated hits for include_children if we have unexplored children

* Use /ev_rel rather than /event as the former includes child metadata

* Remove cross-room threading impl

* Enable MSC2836 in the p2p demo

* Namespace mscs db

* Enable msc2836 for ygg

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2020-12-04 14:11:01 +00:00
Ronnie Ebrin a677a288bd
federationsender/roomserver: don't panic while federation is disabled (#1615) 2020-12-04 14:08:17 +00:00
Neil Alexander b5aa7ca3ab
Top-level setup package (#1605)
* Move config, setup, mscs into "setup" top-level folder

* oops, forgot the EDU server

* Add setup

* goimports
2020-12-02 17:41:00 +00:00
Neil Alexander bdf6490375
Add ability to disable federation (#1604)
* Allow disabling federation

* Don't start federation queues if disabled

* Fix for Go 1.13
2020-12-02 15:10:03 +00:00
Kegsay 6353b0b7e4
MSC2836: Threading - part one (#1589)
* Add mscs/hooks package, begin work for msc2836

* Flesh out hooks and add SQL schema

* Begin implementing core msc2836 logic

* Add test harness

* Linting

* Implement visibility checks; stub out APIs for tests

* Flesh out testing

* Flesh out walkThread a bit

* Persist the origin_server_ts as well

* Edges table instead of relationships

* Add nodes table for event metadata

* LEFT JOIN to extract origin_server_ts for children

* Add graph walking structs

* Implement walking algorithm

* Add more graph walking tests

* Add auto_join for local rooms

* Fix create table syntax on postgres

* Add relationship_room_id|servers to the unsigned section of events

* Persist the parent room_id/servers in edge metadata

Other events cannot assert the true room_id/servers for the
parent event, only make claims to them, hence why this is
edge metadata.

* guts to pass through room_id/servers

* Refactor msc2836 to allow handling from federation

* Add JoinedVia to PerformJoin responses

* Fix tests; review comments
2020-11-19 11:34:59 +00:00
Neil Alexander 20a01bceb2
Pass pointers to events — reloaded (#1583)
* Pass events as pointers

* Fix lint errors

* Update gomatrixserverlib

* Update gomatrixserverlib

* Update to matrix-org/gomatrixserverlib#240
2020-11-16 15:44:53 +00:00
S7evinK bcb89ada5e
Implement read receipts (#1528)
* fix conversion from int to string yields a string of one rune, not a string of digits

* Add receipts table to syncapi

* Use StreamingToken as the since value

* Add required method to testEDUProducer

* Make receipt json creation "easier" to read

* Add receipts api to the eduserver

* Add receipts endpoint

* Add eduserver kafka consumer

* Add missing kafka config

* Add passing tests to whitelist

Signed-off-by: Till Faelligen <tfaelligen@gmail.com>

* Fix copy & paste error

* Fix column count error

* Make outbound federation receipts pass

* Make "Inbound federation rejects receipts from wrong remote" pass

* Don't use errors package

* - Add TODO for batching requests
- Rename variable

* Return a better error message

* - Use OutputReceiptEvent instead of InputReceiptEvent as result
- Don't use the errors package for errors
- Defer CloseAndLogIfError to close rows
- Fix Copyright

* Better creation/usage of JoinResponse

* Query all joined rooms instead of just one

* Update gomatrixserverlib

* Add sqlite3 migration

* Add postgres migration

* Ensure required sequence exists before running migrations

* Clarification on comment

* - Fix a bug when creating client receipts
- Use concrete types instead of interface{}

* Remove dead code
Use key for timestamp

* Fix postgres query...

* Remove single purpose struct

* Use key/value directly

* Only apply receipts on initial sync or if edu positions differ,
otherwise we'll be sending the same receipts over and over again.

* Actually update the id, so it is correctly sent in syncs

* Set receipt on request to /read_markers

* Fix issue with receipts getting overwritten

* Use fmt.Errorf instead of pkg/errors

* Revert "Add postgres migration"

This reverts commit 722fe5a046.

* Revert "Add sqlite3 migration"

This reverts commit d113b03f64.

* Fix selectRoomReceipts query

* Make golangci-lint happy

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2020-11-09 18:46:11 +00:00
Neil Alexander 3afc623098
Fix RewritesState bug (#1557)
* Set RewritesState once

* Check if any new state provided

* Obey rewritesState

* Don't nuke everything the sync API knows when purging state

* Fix panic from duplicate insert

* Consistency

* Use HasState

* Remove nolint

* Clean up joined rooms on state rewrite
2020-10-22 10:39:16 +01:00
Neil Alexander 6e63df1d9a
KindOld (#1531)
* Add KindOld

* Don't process latest events/memberships for old events

* Allow federationsender to ignore duplicate key entries when LatestEventIDs is duplicated by RS output events

* Signal to downstream components if an event has become a forward extremity

* Don't exclude from sync

* Soft-fail checks on KindNew

* Don't run the latest events updater at all for KindOld

* Don't make federation sender change after all

* Kind in federation sender join

* Don't send isForwardExtremity

* Fix syncapi

* Update comments

* Fix SendEventWithState

* Update sytest-whitelist

* Generate old output events

* Sync API consumes old room events

* Update comments
2020-10-19 14:59:13 +01:00
Neil Alexander 49abe359e6
Start Kafka connections for each component that needs them (#1527)
* Start Kafka connection for each component that needs one

* Fix roomserver unit tests

* Rename to naffkaInstance (@Kegsay review comment)

* Fix import cycle
2020-10-15 13:27:13 +01:00
Neil Alexander 9d6b77c58a
Try to retrieve missing auth events from multiple servers (#1516)
* Recursively fetch auth events if needed

* Fix processEvent call

* Ask more servers in lookupEvent

* Don't panic!

* Panic at the Disco

* Find servers more aggressively

* Add getServers

* Fix number of servers to 5, don't bail making RespState if auth events missing

* Fix panic

* Ignore missing state events too

* Report number of servers correctly

* Don't reuse request context for /send_join

* Update federation API tests

* Don't recurse processEvents

* Implement getEvents differently
2020-10-13 11:53:20 +01:00
Kegsay 9096bfcee8
Validate m.room.create events in send_join responses (#1505)
* Validate m.room.create events in send_join responses

For sytest compliance, refs #1315 and #1317

Fixes #1317

* Linting
2020-10-10 00:21:15 +01:00
Neil Alexander fe5d1400bf
Update federation timeouts (#1504)
* Update to matrix-org/gomatrixserverlib#234

* Update gomatrixserverlib

* Update federation timeouts

* Fix dendritejs

* Increase /send context time in destination queue
2020-10-09 17:08:32 +01:00
Neil Alexander bf90db5b60
Remove KindRewrite (#1481)
* Don't send rewrite events

* Remove final traces of rewrite events

* Remove test that is no longer needed

* Revert "Remove test that is no longer needed"

This reverts commit 9a45babff6.

* Update test to use KindOutlier
2020-10-06 11:05:00 +01:00
Neil Alexander d63d7c5640
Tweak log level of a fairly common log line 2020-09-29 17:08:47 +01:00
Neil Alexander a854e3aa18
Fix backoff bug 2020-09-22 14:53:36 +01:00