* Look up servers less often, don't hit API for missing auth events unless there are actually missing auth events
* Remove ResolveConflictsAdhoc (since it is already in GMSL), other tweaks
* Update gomatrixserverlib to matrix-org/gomatrixserverlib#254
* Fix resolve-state
* Initialise t.servers on first use
* Update GMSL
* Add MSC2836EventRelationships to fedsender
* Call MSC2836EventRelationships in reqCtx
* auth remote servers
* Extract room ID and servers from previous events; refactor a bit
* initial cut of federated threading
* Use the right client/fed struct in the response
* Add QueryAuthChain for use with MSC2836
* Add auth chain to federated response
* Fix pointers
* under CI: more logging and enable mscs, nil fix
* Handle direction: up
* Actually send message events to the roomserver..
* Add children and children_hash to unsigned, with tests
* Add logic for exploring threads and tracking children; missing storage functions
* Implement storage functions for children
* Add fetchUnknownEvent
* Do federated hits for include_children if we have unexplored children
* Use /ev_rel rather than /event as the former includes child metadata
* Remove cross-room threading impl
* Enable MSC2836 in the p2p demo
* Namespace mscs db
* Enable msc2836 for ygg
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Add mscs/hooks package, begin work for msc2836
* Flesh out hooks and add SQL schema
* Begin implementing core msc2836 logic
* Add test harness
* Linting
* Implement visibility checks; stub out APIs for tests
* Flesh out testing
* Flesh out walkThread a bit
* Persist the origin_server_ts as well
* Edges table instead of relationships
* Add nodes table for event metadata
* LEFT JOIN to extract origin_server_ts for children
* Add graph walking structs
* Implement walking algorithm
* Add more graph walking tests
* Add auto_join for local rooms
* Fix create table syntax on postgres
* Add relationship_room_id|servers to the unsigned section of events
* Persist the parent room_id/servers in edge metadata
Other events cannot assert the true room_id/servers for the
parent event, only make claims to them, hence why this is
edge metadata.
* guts to pass through room_id/servers
* Refactor msc2836 to allow handling from federation
* Add JoinedVia to PerformJoin responses
* Fix tests; review comments
* Add resolve-state helper
* Tweaks
* Refactor forward extremities, again
* Tweaks
* Minor optimisation
* Make path a bit clearer
* Only process state/membership if forward extremities have changed
* Usage comments in resolve-state
* Initial work oon multipersonality binary
* Remove old binaries
* Monolith and polylith binaries
* Better logging
* dendrite-poly-multi
* Fix path
* Copyright notices etc
* Tweaks
* Update Docker, INSTALL.md
* Take first argument if flags package doesn't find any args
* Postgres 9.6 or later, fix some more Docker stuff
* Don't create unnecessary e2ekey DB
* Run go mod tidy
* Support auto-upgrading accounts DB
* Auto-upgrade device DB deltas
* Support up/downgrading from cmd/goose
* Linting
* Create tables then do migrations then prepare statements
To avoid failing due to some things not existing
* Linting
* Start Kafka connection for each component that needs one
* Fix roomserver unit tests
* Rename to naffkaInstance (@Kegsay review comment)
* Fix import cycle
* Rename serverkeyapi to signingkeyserver
We use "api" for public facing stuff and "server" for internal stuff.
As the server key API is internal only, we call it 'signing key server',
which also clarifies the type of key (as opposed to TLS keys, E2E keys, etc)
* Convert docker/scripts to use signing-key-server
* Rename missed bits
* WIP Event rejection
* Still send back errors for rejected events
Instead, discard them at the federationapi /send layer rather than
re-implementing checks at the clientapi/PerformJoin layer.
* Implement rejected events
Critically, rejected events CAN cause state resolution to happen
as it can merge forks in the DAG. This is fine, _provided_ we
do not add the rejected event when performing state resolution,
which is what this PR does. It also fixes the error handling
when NotAllowed happens, as we were checking too early and needlessly
handling NotAllowed in more than one place.
* Update test to match reality
* Modify InputRoomEvents to no longer return an error
Errors do not serialise across HTTP boundaries in polylith mode,
so instead set fields on the InputRoomEventsResponse. Add `Err()`
function to make the API shape basically the same.
* Remove redundant returns; linting
* Update blacklist
* Add support for database migrations
Closes#1246
This PR does NOT add any migrations as an example. I have
manually tested that the library works with SQL and Go based
upgrades correctly. Documentation should be sufficient for
devs to add migrations.
* Clarifications
* Linting
* Remove QueryBulkStateContent from current state server
Expected fail due to db impl not existing
* Implement query bulk state content
* Fix up rejecting invites over federation
* Fix bulk content marshalling
* Use federation sender for backfill and getting missing events
* Fix internal URL paths
* Update go.mod/go.sum for matrix-org/gomatrixserverlib#218
* Add missing server implementations in HTTP interface
* Add FederationClient interface to federationsender
- Use a shim struct in HTTP mode to keep the same API as `FederationClient`.
- Use `federationsender` instead of `FederationClient` in `keyserver`.
* Pointers not values
* Review comments
* Fix unit tests
* Rejig backoff
* Unbreak test
* Remove debug logs
* Review comments and linting
* Initial pass at refactoring config (not finished)
* Don't forget current state and EDU servers
* More shifting around
* Update server key API tests
* Fix roomserver test
* Fix more tests
* Further tweaks
* Fix current state server test (sort of)
* Maybe fix appservices
* Fix client API test
* Include database connection string in database options
* Fix sync API build
* Update config test
* Fix unit tests
* Fix federation sender build
* Fix gobind build
* Set Listen address for all services in HTTP monolith mode
* Validate config, reinstate appservice derived in directory, tweaks
* Tweak federation API test
* Set MaxOpenConnections/MaxIdleConnections to previous values
* Update generate-config
* Add QueryDeviceMessages to serve up device keys and stream IDs
* Consume key change events in fedsender
Don't yet send them to destinations as we haven't worked them out yet
* Send device list updates to all required servers
* Glue it all together
* Recheck device lists when join/leave events come in
* Add PerformDeviceDeletion
* Notify clients when devices are deleted
* Unbreak things
* Remove debug logging
* Produce kafka events when keys are added
* Consume key changes in syncapi with TODO markers for handling them and catching up
* unbreak tests
* Linting
* Initial persistence of blacklists
* Move statistics folder
* Make MaxFederationRetries configurable
* Set lower failure thresholds for Yggdrasil demos
* Still write events into database for blacklisted hosts (they can be tidied up later)
* Review comments
* Perform outbound federation hits for querying/claiming E2E keys
Untested currently because we need the receiving end to work
before sytest will be happy.
* Linting
Squashed commit of the following:
commit 86c2388e13ffdbabdd50cea205652dccc40e1860
Merge: b0a3ee6c f5e7e751
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Thu Jul 16 13:47:10 2020 +0100
Merge branch 'master' into neilalexander/yggbarequic
commit b0a3ee6c5c063962384bb91c59ec753ddc8cfe5f
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Thu Jul 16 13:42:22 2020 +0100
Add support for broadcasting wake-up EDUs to known hosts
commit 8a5c2020b3a4b705b5d5686a9e71990a49e6d471
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Thu Jul 16 13:42:10 2020 +0100
Bare QUIC demo working
commit d3939b3d6568cf4262c0391486a5203873b68bfc
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Wed Jul 15 11:42:43 2020 +0100
Support bare Yggdrasil sessions with encrypted QUIC
* Add storage layer for postgres/sqlite
* Return OTK counts when inserting new keys
* Hook up the key DB and make a test pass
* Convert postgres queries to be sqlite queries
* Blacklist test due to requiring rejected events
* Unbreak tests
* Update blacklist
* Don't return null to public directory request
* Initial support for finding public rooms in Yggdrasil demo (incomplete)
* Increase QUIC idle time to 15 minutes
* Initial QUIC work
* Update Yggdrasil demo
* Make sure that the federation sender knows how many pending events are in the database when the worker starts
* QUIC tunables
* pprof
* Don't spin
* Set build info for Yggdrasil
* Use content_value instead of membership
* Fix build
* Replace publicroomsapi with a combination of clientapi/roomserver/currentstateserver
- All public rooms paths are now handled by clientapi
- Requests to (un)publish rooms are sent to the roomserver via `PerformPublish`
which are stored in a new `published_table.go`
- Requests for public rooms are handled in clientapi by:
* Fetch all room IDs which are published using `QueryPublishedRooms` on the roomserver.
* Apply pagination parameters to the slice.
* Do a `QueryBulkStateContent` request to the currentstateserver to pull out
required state event *content* (not entire events).
* Aggregate and return the chunk.
Mostly but not fully implemented (DB queries on currentstateserver are missing)
* Fix pq query
* Make postgres work
* Make sqlite work
* Fix tests
* Unbreak pagination tests
* Linting
* Initial work on persistent queues
* Update index for event ID and server name
* Put things into database (postgres for now)
* Duplicate postgres code into sqlite for now just to stop build errors, will fix SQLite soon
* Fix table name
* Fix index
* Fix table name
* Use RETURNING because LastInsertID is not supported by postgres
* Use functions
* Marshal headered event
* Don't error on now rows
* Don't block if there are PDUs waiting
* Try to tidy up JSON
* Debug logging
* Fix query, use transactions in postgres
* Clean up
* Rehydrate more opportunistically
* Fix SQLite
* remove unused types
* Review comments
* Shuffle things around a bit
* Clean up transaction properly
* Don't send empty transactions
* Reduce unnecessary retries
* Count PDUs to make more resilient
* Don't stop when there is work to be done
* Try to limit wakeups
* well this is tedious
* Fix race in incomplete transactions
* Thread safety on transaction ID/count
* Remove membership table from account DB
And make code which needs that data use the currentstate server
* Unbreak tests; use a membership enum for space
* BREAKING: Make eduserver/appservice use userapi
This is a breaking change because this PR restructures how the AS API
tracks its position in Kafka streams. Previously, it used the account DB
to store partition offsets. However, this is also being used by `clientapi`
for the same purpose, which is bad (each component needs to store offsets
independently or else you might lose messages across restarts). This PR
changes this behaviour to now store partition offsets in the `appservice`
database.
This means that:
- Upon restart, the `appservice` component will attempt to replay all
room events from the beginning of time.
- An additional table will be created in the appservice database, which
in and of itself is backwards compatible.
* Return ErrorConflict
* Make userapi responsible for checking access tokens
There's still plenty of dependencies on account/device DBs, but this
is a start. This is a breaking change as it adds a required config
value `listen.user_api`.
* Cleanup
* Review comments and test fix
* More key tweaks
* Start testing stuff
* Move responsibility for generating local keys into server key API, don't register prom in caches unless needed, start tests
* Don't store our own keys in the database
* Don't store our own keys in the database
* Don't run tests for now
* Tweak caching behaviour, update tests
* Update comments, add fixes from forward-merge
* Debug logging
* Debug logging
* Perform final comparison against original set of requests
* oops
* Fetcher timeouts
* Fetcher timeouts
* missing func
* Tweaks
* Update gomatrixserverlib
* Fix Federation API test
* Break up FetchKeys
* Add comments to caching
* Add URL check in test
* Partially revert "Move responsibility for generating local keys into server key API, don't register prom in caches unless needed, start tests"
This reverts commit d7eb54c5b3.
* Fix federation API test
* Fix internal cache stuff again
* Fix server key API test
* Update comments
* Update comments from review
* Fix lint
* Fix rooms v3 url paths for good - with tests
- Add a test rig around `federationapi` to test routing.
- Use `JSONVerifier` over `KeyRing` so we can stub things out more easily.
- Add `test.NopJSONVerifier` which verifies nothing.
- Add `base.BaseMux` which is the original `mux.Router` used to spawn public/internal routers.
- Listen on `base.BaseMux` and not the default serve mux as it cleans paths which we don't want.
- Factor out `ListenAndServe` to `test.ListenAndServe` and add flag for listening on TLS.
* Fix comments
* Linting
This is a wrapper around whatever impl we have which then logs
the function name/request/response/error.
Also tweak when we log on kafka streams: only log on the producer
side not the consumer side: we've never had issues with comms and
having 1 message rather than N would be nice.
* Remove clientapi producers which aren't actually producers
They are actually just convenience wrappers around the internal APIs
for roomserver/eduserver. Move their logic to their respective `api`
packages and call them directly.
* Remove TODO
* unbreak ygg
Previously we had 3 monoliths:
- dendrite-monolith-server
- dendrite-demo-libp2p
- dendritejs
which all had their own of setting up public routes. Factor this
out into a new `setup.Monolith` struct which gets all dependencies
set as fields. This is different to `basecomponent.Base` which
doesn't provide any way to set configured deps (e.g public rooms db)
Part of a larger process to clean up how we initialise Dendrite.
* Split out adding HTTP routes from making internal APIs for clarity
* Split out more components
* Split out more things
* Finish converting
* internal mux for internal routes
* Return newly created error if user already exists (#1002)
Signed-off-by: Till Faelligen <tfaelligen@gmail.com>
* Rename variable
* Remove check for account and use returned error
* Return ErrUserExists
Signed-off-by: Till Faelligen <tfaelligen@gmail.com>
* State that CreateAccount will return err ErrUserExists if the user exists
Signed-off-by: Till Faelligen <tfaelligen@gmail.com>
* Also check sqlite for constraint error
* Revert "Also check sqlite for constraint error"
This reverts commit 7d310514
* Check for sqlite3 constraint error
* Add documentation to CreateAccount
* Move ErrUserExists to accounts package
* Revert "Move ErrUserExists to accounts package"
Import Cycle..
This reverts commit be3d4cda
Co-authored-by: Kegsay <kegan@matrix.org>
* Add a way to force federationsender to retry sending transactions
And use it in P2P mode when we pick up new nodes.
* Linting
* Use atomic bool to stop us blocking on the channel
* Groundwork for send-to-device messaging
* Update sample config
* Add unstable routing for now
* Send to device consumer in sync API
* Start the send-to-device consumer
* fix indentation in dendrite-config.yaml
* Create send-to-device database tables, other tweaks
* Add some logic for send-to-device messages, add them into sync stream
* Handle incoming send-to-device messages, count them with EDU stream pos
* Undo changes to test
* pq.Array
* Fix sync
* Logging
* Fix a couple of transaction things, fix client API
* Add send-to-device test, hopefully fix bugs
* Comments
* Refactor a bit
* Fix schema
* Fix queries
* Debug logging
* Fix storing and retrieving of send-to-device messages
* Try to avoid database locks
* Update sync position
* Use latest sync position
* Jiggle about sync a bit
* Fix tests
* Break out the retrieval from the update/delete behaviour
* Comments
* nolint on getResponseWithPDUsForCompleteSync
* Try to line up sync tokens again
* Implement wildcard
* Add all send-to-device tests to whitelist, what could possibly go wrong?
* Only care about wildcard when targeted locally
* Deduplicate transactions
* Handle tokens properly, return immediately if waiting send-to-device messages
* Fix sync
* Update sytest-whitelist
* Fix copyright notice (need to do more of this)
* Comments, copyrights
* Return errors from Do, fix dendritejs
* Review comments
* Comments
* Constructor for TransactionWriter
* defletions
* Update gomatrixserverlib, sytest-blacklist
- We don't want a serverKeyAPI as fetching keys doesn't need a DB.
- De-dupe rooms so we don't see them multiple times, but shuffle the
alias we join via so we don't all flood a single server.
* Server key API (works for monolith but not for polylith yet)
* Re-enable caching on server key API component
* Groundwork for HTTP APIs for server key API
* Hopefully implement HTTP for server key API
* Simplify public key request marshalling from map keys
* Update gomatrixserverlib
* go mod tidy
* Common -> internal
* remove keyring.go
* Update Docker Hub for server key API
* YAML is funny about indentation
* Wire in new server key API into hybrid monolith mode
* Create maps
* Route server key API endpoints on internal API mux
* Fix server key API URLs
* Add fetcher behaviour into server key API implementation
* Return error if we failed to fetch some keys
* Return results anyway
* Move things about a bit
* Remove unused code
* Fix comments, don't use federation sender URL in polylith mode
* Add server_key_api to sample config
* Review comments
* HTTP API to cache keys that have been requested
* Overwrite server_key_api listen in monolith hybrid mode
* Separate muxes for public and internal APIs
* Update client-api-proxy and federation-api-proxy so they don't add /api to the path
* Tidy up
* Consistent HTTP setup
* Set up prefixes properly
* Limit database connections (#564)
- Add new options to the config file database:
max_open_conns: 100
max_idle_conns: 2
conn_max_lifetime: -1
- Implement connection parameter setup on the *DB (database/sql) in internal/sqlutil/trace.go:Open()
- Propagate the values in the form of DbProperties interface via all the
Open() and NewDatabase() functions
Signed-off-by: Tomas Jirka <tomas.jirka@email.cz>
* Fix wasm builds
* Remove file accidentally added from working tree
Co-authored-by: Tomas Jirka <tomas.jirka@email.cz>
* Consolidation of roomserver APIs
* Comment out alias tests for now, they are broken
* Wire AS API into roomserver again
* Roomserver didn't take asAPI param before so return to that
* Prevent roomserver asking AS API for alias info
* Rename some files
* Remove alias_test, incoherent tests and unwanted appservice integration
* Remove FS API inject on syncapi component
* Define an input API for the federationsender
* Wiring for rooomserver input API and federation sender input API
* Whoops, commit common too
* Merge input API into query API
* Rename FederationSenderQueryAPI to FederationSenderInternalAPI
* Fix dendritejs
* Rename Input to Perform
* Fix a couple of inputs -> performs
* Remove needless storage interface, add comments
* Initial cut for backfilling
The syncserver now asks the roomserver via QueryBackfill (which already
existed to *handle* backfill requests) which then makes federation requests
via gomatrixserverlib.RequestBackfill.
Currently, tests fail on subsequent /messages requests because we don't know
which servers are in the room, because we are unable to get state snapshots
from a backfilled event because that code doesn't exist yet.
* WIP backfill, doesn't work
* Make initial backfill pass checks
* Persist backfilled events with state snapshots
* Remove debug lines
* Linting
* Review comments
* Experimental LRU cache for room versions
* Don't accidentally try to type-assert nil
* Also reduce hits on query API
* Use hashicorp implementation which mutexes for us
* Define const for max cache entries
* Rename to be specifically immutable, panic if we try to mutate a cache entry
* Review comments
* Remove nil guards, give roomserver integration test a cache
* go mod tidy
* Update gomatrixserverlib
* Test matrix.org as perspective key server
* Base64 decode better
* Optional strict validity checking in gmsl
* Update gomatrixserverlib
* Attempt to find missing auth events over federation (this shouldn't happen but I am guessing there is a synapse bug involved where we don't get all of the auth events)
* Update gomatrixserverlib, debug logging
* Remove debugging output
* More verbose debugging
* Print outliers
* Increase timeouts for testing, observe contexts before trying to join over more servers
* Don't block on roomserver (experimental)
* Don't block on roomserver
* Update gomatrixserverlib
* Update gomatrixserverlib
* Configurable perspective key fetchers
* Output number of configured keys for perspective
* Example perspective config included
* Undo debug stack trace
* Undo debug stack trace
* Restore original HTTP listener in monolith
* Fix lint
* Review comments
* Set default HTTP server timeout to 5 minutes now, block again when joining
* Don't use HTTP address for HTTPS whoops
* Update gomatrixserverlib
* Update gomatrixserverlib
* Update gomatrixserverlib
* Actually add perspectives
* Actually add perspectives
* Update gomatrixserverlib
* Add libp2p-go
* Some tweaks, tidying up
(cherry picked from commit 1a5bb121f8121c4f68a27abbf25a9a35a1b7c63e)
* Move p2p dockerfile
(cherry picked from commit 8d3bf44ea1bf37f950034e73bcdc315afdabe79a)
* Remove containsBackwardsExtremity
* Fix some linter errors, update some libp2p packages/calls, other tidying up
* Add -port for dendrite-p2p-demo
* Use instance name as key ID
* Remove P2P demo docker stuff, no longer needed now that we have SQLite
* Remove Dockerfile-p2p too
* Remove p2p logic from dendrite-monolith-server
* Inject publicRoomsDB in publicroomsapi
Inject publicRoomsDB instead of switching on base.libP2P.
See: https://github.com/matrix-org/dendrite/pull/956/files?file-filters%5B%5D=.go#r406276914
* Fix lint warning
* Extract mDNSListener from base.go
* Extract CreateFederationClient into demo
* Create P2PDendrite from BaseDendrite
Extract logic specific to P2PDendrite from base.go
* Set base.go to upstream/master
* Move pubsub to demo cmd
* Move PostgreswithDHT to cmd
* Remove unstable features
* Add copyrights
* Move libp2pvalidator into p2pdendrite
* Rename dendrite-p2p-demo -> dendrite-demo-libp2p
* Update copyrights
* go mod tidy
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Add setting to enable/disable metrics (#461)
Add basic auth to /metric handlers
Signed-off-by: Till Faelligen <tfaelligen@gmail.com>
* Add warning message if metrics are exposed without protection
* Remove redundant type conversion
Signed-off-by: Till Faelligen <tfaelligen@gmail.com>
* SetBasicAuth per test case
* Update warning message and change loglevel to warn
* Update common/config/config.go
* Update dendrite-config.yaml
Co-authored-by: Till Faelligen <tfaelligen@gmail.com>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
This commit replaces the default client from the http lib with a custom one.
The previously used default client doesn't come with a timeout. This could cause
unwanted locks.
That solution chosen here creates a http client in the base component dendrite
with a constant timeout of 30 seconds. If it should be necessary to overwrite
this, we could include the timeout in the dendrite configuration.
Here it would be a good idea to extend the type "Address" by a timeout and
create an http client for each service.
Closes#820
Signed-off-by: Benedikt Bongartz <benne@klimlive.de>
Co-authored-by: Kegsay <kegan@matrix.org>