221 commits

Author SHA1 Message Date
Robert Swain 01c565ddfb mediaapi/writers/fileutils: Store files based on hash, not media ID
This avoids having to sanitize the origin and media ID for files from
remote servers. It also allows us to deduplicate files across all files
uploaded to this homeserver or downloaded from remote homeservers.
2017-05-22 10:34:56 +02:00
Robert Swain 979ce964ab Merge branch 'master' into rob/media-upload 2017-05-22 10:28:37 +02:00
Robert Swain 8f7ce9adc0 mediaapi/writers/upload: Add note about Content-Disposition override 2017-05-22 10:27:48 +02:00
Robert Swain 3cea54db0b mediaapi/writers/download: Simplify user error message
They already know the origin and media ID so it is redundant.
2017-05-22 10:27:02 +02:00
Robert Swain 5f604cc41f mediaapi/writers/upload: Infof -> Info as no formatting in string 2017-05-22 10:26:30 +02:00
Robert Swain 5536fec902 mediaapi/writers: Add base64hash to media_repository table
A SHA-256 hash sum in golang base64 URLEncoding format (contains only
URL-safe characters) is now calculated and stored for every file
transferred to this server.

Uploads to the server use this hash as the MediaID. Downloads from
remote servers retain their MediaID from the remote server, but can use
the hash for local deduplication and integrity checking purposes.
2017-05-22 10:24:03 +02:00
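
The hashing described in the commit above can be sketched as follows. This is a minimal illustration, not the code from the commit: it assumes the hash is computed with Go's `crypto/sha256` and encoded with `base64.URLEncoding`, and the helper names `base64SHA256` and `hashBytes` are hypothetical.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"io"
)

// base64SHA256 is a hypothetical helper: it streams r through SHA-256 and
// returns the digest in Go's padded base64 URLEncoding, which uses '-' and
// '_' instead of '+' and '/'.
func base64SHA256(r io.Reader) (string, error) {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return "", err
	}
	return base64.URLEncoding.EncodeToString(h.Sum(nil)), nil
}

// hashBytes is a convenience wrapper for in-memory data.
func hashBytes(data []byte) (string, error) {
	return base64SHA256(bytes.NewReader(data))
}

func main() {
	hash, err := hashBytes([]byte("example file contents"))
	if err != nil {
		fmt.Println("hashing failed:", err)
		return
	}
	fmt.Println(hash)
}
```

Because identical content always yields the same string, such a value could serve both as a deduplication key and as an integrity check on files fetched from remote servers.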
Robert Swain 370cb74d2d mediaapi/writers: Reuse same writer code for upload and download
This now calculates a hash for downloads from remote servers as well as
uploads to this server.
2017-05-22 10:19:52 +02:00
Robert Swain 9af66a1963 mediaapi/writers: Reuse removeDir throughout the package 2017-05-22 10:13:37 +02:00
Mark Haines 6605333f6f Start implementing the federation server keys API. (#112)
* Start implementing the federation server keys API.

* Fix copyright

* Fix comments

* Comment on the key format

* Better explain what the ValidityPeriod is

* Return a 200 status code
2017-05-19 16:06:41 +01:00
Robert Swain 318531d011 mediaapi/writers/upload: Make assign in-line in if 2017-05-19 12:27:55 +02:00
Robert Swain 86cb8e32f7 mediaapi/writers/upload: Clarify order of moving file and storing metadata 2017-05-19 12:26:27 +02:00
Robert Swain 1242fdba22 mediaapi: Improve logging throughout, leveraging logrus features 2017-05-19 12:21:10 +02:00
Mark Haines aa179d451c Update version of gomatrixserverlib (#111) 2017-05-19 10:46:17 +01:00
Robert Swain 5d5f156500 mediaapi/writers/fileutils: Rework file path layout
From experience with synapse, splitting the files into subdirectories
based on the beginnings of the filenames helps with browsability. As we
are using MediaIDs that are base64-encoded, each character has 64
possibilities, which is a nice upper limit on the number of
subdirectories in a directory in terms of browsing. We have two levels
of single character directories for added convenience, creating up to
4096 buckets.
2017-05-19 11:34:40 +02:00
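
A rough sketch of the two-level layout described in the commit above, under assumptions: the function name `bucketedPath`, the use of the remaining characters as the final path element, and the minimum-length check are all hypothetical and not taken from the commit.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// bucketedPath is a hypothetical helper. It places a file under two levels of
// single-character directories taken from the start of its base64 ID, so each
// level holds at most 64 subdirectories (64 * 64 = 4096 buckets in total).
// It does not validate that id contains only base64 characters.
func bucketedPath(absBasePath, id string) (string, error) {
	if len(id) < 3 {
		return "", fmt.Errorf("id %q is too short to split into buckets", id)
	}
	return filepath.Join(absBasePath, id[0:1], id[1:2], id[2:]), nil
}

func main() {
	path, err := bucketedPath("/var/media", "abcdef")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(path)
}
```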
Kegsay 9d4d18ae7f Add AccountDatabase for storing user accounts (#110)
Including the ability to add new accounts with a user/password and
select accounts using a user/password. Uses bcrypt to hash passwords.
2017-05-19 10:27:03 +01:00
Robert Swain 12b0cdde06 mediaapi/writers/upload: Explain the use of TeeReader 2017-05-19 11:01:44 +02:00
Robert Swain f7d11f87c1 mediaapi/writers/upload: Add comment about why we hash the file data 2017-05-19 10:59:12 +02:00
Robert Swain 3e5ac85ce1 mediaapi/writers/upload: Clarify TODO comment 2017-05-19 10:53:47 +02:00
Robert Swain cdd4222e45 mediaapi/writers/fileutils: Return errors to log using request context 2017-05-19 10:46:50 +02:00
Robert Swain 5dd90fbff3 mediaapi/writers/fileutils: Make note of further file path validation todo 2017-05-18 18:00:56 +02:00
Robert Swain 7af45e4664 mediaapi/writers/upload: Refactor Upload() into three new functions 2017-05-18 17:56:19 +02:00
Robert Swain 00e8fed3a7 mediaapi/writers: Add validation and error handling to getPathFromMediaMetadata 2017-05-18 17:39:30 +02:00
Robert Swain 10a2b2f8e6 mediaapi: Also rename all basePath variables to absBasePath for clarity 2017-05-18 17:37:32 +02:00
Robert Swain 995e1f2c99 cmd/dendrite-media-api-server: Make base path absolute 2017-05-18 17:25:12 +02:00
Robert Swain f5422787a1 mediaapi/writers: Move single-value error return assignment into if 2017-05-18 16:02:43 +02:00
Robert Swain 9fc5abdb3f mediaapi/writers: Rename utils.go to fileutils.go
Better reflects the content of the file.
2017-05-18 16:01:06 +02:00
Robert Swain 2e795ed8aa mediaapi/storage: Improve GetMediaMetadata description 2017-05-18 15:57:07 +02:00
Robert Swain 1f2ac60bee mediaapi/routing: Sync make() to makeAPI() as in clientapi 2017-05-18 15:53:48 +02:00
Robert Swain 04c4a2d05a cmd/dendrite-media-api-server: Move os.Getenv() for consistency 2017-05-18 15:50:09 +02:00
Mark Haines 426a0365cf Rename "make" to "makeAPI" and factor out some more common code into it (#109)
* Rename "make" to "makeAPI" and factor out some more common code into it

Naming a function the same as a go builtin function seems like a bad
idea. Also move the call to `NewJSONRequestHander` inside the function
rather than calling it everywhere.

* Fix typo
2017-05-18 13:47:23 +01:00
Kegan Dougal cf736d746d hook: Make go vet run all tests and fix warnings 2017-05-18 12:27:11 +01:00
Robert Swain 1057e2e117 Merge branch 'master' into rob/media-upload 2017-05-18 12:47:41 +02:00
Robert Swain e4a97d13b3 Merge pull request #108 from matrix-org/rob/golang-1.8
.travis.yml: Bump golang to 1.8
2017-05-18 12:47:08 +02:00
Robert Swain 4df470eab5 .travis.yml: Bump golang to 1.8 2017-05-18 12:38:09 +02:00
Robert Swain ec0d584fe7 cmd/dendrite-media-api-server: Log format string with Infof not Info 2017-05-18 12:23:17 +02:00
Robert Swain 8085c1f863 mediaapi/types: Clarify what is ActiveRemoteRequests.Set's key 2017-05-18 12:19:03 +02:00
Robert Swain 28ef35d36a mediaapi/storage: Rework GetMediaMetadata API to return new MediaMetadata 2017-05-18 12:09:33 +02:00
Robert Swain 2fca4bbd65 mediaapi/config: Fix max_file_size_bytes YAML tag 2017-05-18 11:58:41 +02:00
Robert Swain c5cd5a93b9 mediaapi: Use ServerName type from gomatrixserverlib 2017-05-18 11:57:44 +02:00
Robert Swain bd9db7557a mediaapi/README: Add link to spec section 2017-05-18 11:50:24 +02:00
Robert Swain 7727a8c61e cmd/dendrite-media-api-server: Add MAX_FILE_SIZE_BYTES configuration 2017-05-18 11:44:48 +02:00
Robert Swain 846aece163 mediaapi: MaxFileSize -> MaxFileSizeBytes 2017-05-18 11:36:26 +02:00
Robert Swain 35a0b5d2e9 cmd/dendrite-media-api-server: Add BASE_PATH configuration 2017-05-18 11:34:01 +02:00
Robert Swain ff3009ffdd cmd/dendrite-media-api-server: Add SERVER_NAME configuration 2017-05-18 11:32:30 +02:00
Robert Swain deee6f84c7 mediaapi/writers/upload: Move file first as db is source of truth
The database is the source of truth. If we add the metadata to the
database and it succeeds, and then the file fails to be moved, we think
we have a file when we actually don't.
2017-05-18 11:10:41 +02:00
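
The ordering argued for in the commit above might look roughly like this. It is a sketch under assumptions: `moveThenStore`, `storeMetadataFunc` and `demo` are hypothetical names, the database insert is simulated by a plain function, and the best-effort cleanup on failure is an illustrative choice.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// storeMetadataFunc is a hypothetical stand-in for the database insert.
type storeMetadataFunc func(path string) error

// moveThenStore moves the file into place first and records metadata second.
// If the move fails, no metadata is written, so the database never claims a
// file that does not exist.
func moveThenStore(tmpPath, finalPath string, store storeMetadataFunc) error {
	if err := os.MkdirAll(filepath.Dir(finalPath), 0770); err != nil {
		return err
	}
	if err := os.Rename(tmpPath, finalPath); err != nil {
		return err
	}
	if err := store(finalPath); err != nil {
		// Already in an error case: clean up on a best-effort basis.
		_ = os.Remove(finalPath)
		return err
	}
	return nil
}

// demo runs moveThenStore in a temporary directory and reports whether the
// final file exists afterwards, along with any error from moveThenStore.
func demo(failStore bool) (bool, error) {
	dir, err := os.MkdirTemp("", "mediaapi-sketch")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)

	tmpPath := filepath.Join(dir, "content.tmp")
	if err := os.WriteFile(tmpPath, []byte("data"), 0660); err != nil {
		return false, err
	}
	finalPath := filepath.Join(dir, "a", "b", "cdef")
	store := func(string) error {
		if failStore {
			return errors.New("simulated database failure")
		}
		return nil
	}
	moveErr := moveThenStore(tmpPath, finalPath, store)
	_, statErr := os.Stat(finalPath)
	return statErr == nil, moveErr
}

func main() {
	exists, err := demo(false)
	fmt.Println("store succeeds: file exists =", exists, "error =", err)
	exists, err = demo(true)
	fmt.Println("store fails: file exists =", exists, "error =", err)
}
```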
Robert Swain 3f904e1cdb mediaapi/writer/upload: Remove unnecessary logic 2017-05-18 11:09:09 +02:00
Robert Swain f28235c05d mediaapi/writers/upload: Factor out removeDir
Reduces complexity of Upload. Note that we never care about the error
from os.RemoveAll() beyond logging as we are already in an error case.
2017-05-18 11:07:03 +02:00
Robert Swain 5348b64edc mediaapi/writers/download: Reduce complexity of copyToActiveAndPassive 2017-05-18 10:17:11 +02:00
Robert Swain b80d5ab919 cmd/mediaapi-integration-tests: Test downloading same file 100 times
Spawns a GET request for the same file in 100 parallel go routines and
prints the body (which is some error JSON) in case of not 200 OK. Also
prints the number of successful requests.

This of course should take command line arguments for the URL and number
of requests but that can be done as soon as needed.
2017-05-18 09:12:01 +02:00
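
The test described in the commit above could be sketched along these lines. This is not the integration test from the commit: the function names are hypothetical, and a local `httptest` server stands in for the media API so the example is self-contained.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
)

// countSuccessfulGets issues n GET requests for url in parallel goroutines,
// prints the body of any response that is not 200 OK, and returns the number
// of successful requests.
func countSuccessfulGets(url string, n int) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	successes := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			resp, err := http.Get(url)
			if err != nil {
				fmt.Println("request failed:", err)
				return
			}
			defer resp.Body.Close()
			body, err := io.ReadAll(resp.Body)
			if err != nil {
				fmt.Println("reading body failed:", err)
				return
			}
			if resp.StatusCode != http.StatusOK {
				fmt.Printf("status %d: %s\n", resp.StatusCode, body)
				return
			}
			mu.Lock()
			successes++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return successes
}

// demoServerSuccesses starts a throwaway local server that always answers
// 200 OK and runs n parallel requests against it.
func demoServerSuccesses(n int) int {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "file contents")
	}))
	defer srv.Close()
	return countSuccessfulGets(srv.URL, n)
}

func main() {
	fmt.Println("successful requests:", demoServerSuccesses(100))
}
```

As the commit message notes, the URL and request count would more usefully come from command line arguments; here they are plain function parameters.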
Robert Swain 8cf507f85f mediaapi/writers: Never return server errors to user but log them 2017-05-18 09:04:36 +02:00