Deployment

I want a blog, I have a running server, let's make it happen.

Architecture

Currently, I plan to store my content, including this first entry, in a private BitBucket repository, so I can include non-public entries.

I am using Ansible to manage the deployment, as it's a system I've used before.

I plan on running this elsewhere later, but I'm familiar enough with the idea that "temporary plans are never temporary" to use Ansible and git from the start, as opposed to setting it all up manually.

This means it's easier to move when I get around to it, and what I did is described in an executable format.

Anyways, this is my justification for using caddy to generate things on-the-fly, as opposed to hugo to make a nice, static website I can push. hugo would be a better idea.

Notes

I'm starting from my deploy template and stripping away most of it, and instead using the caddy role from another project.

I'm roughing up the setup role, as I'd still like to keep its behaviour of testing logins, but I don't want to be using root as the administrative account.


Actually, just realized I don't want caddy to have sudo privileges. The admin user will administer the system, including setting up the caddy account, and all the systemd stuff that's sure to be needed.

After some cottage cheese and chickening out of knocking on someone's door to distract them from their domestic dispute during a walk, I feel refreshed.

I'm thinking I can have nginx proxy the one virtual site to caddy's port.
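
Something like this sketch, assuming caddy is listening on localhost (the port is a guess; caddy v1 serves plain HTTP on 2015 by default):

server {
    listen 80;
    server_name matthew.willcockson.family;

    location / {
        # hand everything for this virtual site to caddy
        proxy_pass http://127.0.0.1:2015;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}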

Currently, my issue is that I want caddy's web root to be a child of the global web root. I want to make the file permissions easier by having the caddy user own everything, but I want the web_user to still have access to the folders, since it does the backups.

This sounds like a job for setfacl(1)!
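
A rough sketch of what I mean, with the web root path and user name as stand-ins:

# let web_user read files, and traverse directories, under caddy's web root
setfacl -R -m u:web_user:rX /var/blog
# default ACLs keep new files readable, and only apply to directories
find /var/blog -type d -exec setfacl -d -m u:web_user:rX {} +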

Okie dokie.

Well, actually, the web_user is configured to serve up the entire directory by default, and we'd be fighting the existing configuration system to do our thing. It'd actually be better if web_user couldn't read those files at all: if our configuration failed, the files would still be there, and would be made publicly available to any visitor who could guess the URL. I plan on having caddy fetch the git repository, which would include the private entries, so that's a no-go.

As to my original concern about backing up the directory: All the files would be coming from this repo, or would be pulled from the blog repo, so backing them up would be redundant.

I think I'm going to have caddy manage the files out of its own home directory. That'd make things a bit easier.

Note to self: Definitely need to start using Ansible role defaults, instead of the {{ var | default(...) }} pattern
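
That is, declare each variable once in the role, instead of sprinkling fallbacks through every template. A sketch, with example values:

# roles/caddy/defaults/main.yml
caddy_user: caddy
caddy_home: /var/blog

# which turns template lookups like
#     {{ caddy_home | default("/var/blog") }}
# into a plain
#     {{ caddy_home }}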


Always make a backup of configuration files. Always. Especially when testing out regex with \n-matching . enabled :(

By far the best thing to come out of all this is learning of the AnsiballZ architecture, which I was pointed to by the stack trace from a regexp that wasn't working.

Going to try setting interpreter_python in ansible.cfg before reading too much into the stack trace.

Nope, didn't do the trick. Instead, replaced my use of before and after in the replace module with manual (?P<after>...)(?P<before>...) and \g<after>...\g<before>.
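
For reference, a sketch of the named-group pattern (the file path and regex here are stand-ins, not my real task):

- name: Ensure the caddy proxy line follows the server_name line
  replace:
    path: /etc/nginx/sites-enabled/default
    regexp: '(?P<after>server_name [^;]+;)(?P<before>\n)'
    # re-emit the matched context on either side of the inserted line
    replace: '\g<after>\n    location / { proxy_pass http://127.0.0.1:2015; }\g<before>'
  # note: a real task needs a guard so repeat runs don't insert twice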

This entry won't see the light of day just yet, but caddy is running, and serving up 404 pages! :D


My PGP keys expired the night I wanted to commit this, and I decided to go to sleep instead of looking up the instructions for how to unexpire them.

Anyways, as I suspected, the way everything comes set up on this server means that the nginx configuration is regenerated often, so rather than hooking into this regeneration process, I can just fight it, overwriting its overwrites, in an endless battle.

Basically, I'm looking for a way to modify the configuration file on the remote machine every time the configuration file is changed. I already have a working Ansible task, so I'll probably just put a tiny playbook on the remote machine, and set it up to run every time the configuration file is touched, so that we know our line about proxying requests to caddy is in the modified file.

Google tells me systemd.path(5) is what I'm looking for!

Additionally, I could make a shell script with sed instead of writing a tiny Ansible play, but I do like the picture of an Ansible playbook generating an Ansible playbook.

There's a lot to systemd, so these will be very basic unit files for now, but should get the job done.

I actually do want this to run at boot as well, as I'm not 100% sure systemd will consider the path changed if the nginx file is somehow regenerated after booting but before the modify_nginx.path unit is started.

So this will be an interesting set of unit files, with a .service, .timer, and .path.
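
A minimal sketch of the trio (the watched path and the playbook location are assumptions):

# modify_nginx.path — fires whenever the nginx config is rewritten
[Unit]
Description=Watch the nginx configuration for regeneration

[Path]
PathChanged=/etc/nginx/sites-enabled/default

[Install]
WantedBy=multi-user.target

# modify_nginx.timer — the catch-all run shortly after boot
[Unit]
Description=Re-apply the nginx patch after boot

[Timer]
OnBootSec=2min

[Install]
WantedBy=timers.target

# modify_nginx.service — the unit the other two activate
[Unit]
Description=Re-insert the caddy proxy line into the nginx configuration

[Service]
Type=oneshot
ExecStart=/usr/bin/ansible-playbook /etc/ansible/modify_nginx.yaml
# hide this unit's /tmp from other processes (more on this below)
PrivateTmp=true

Conveniently, both .path and .timer units activate the .service of the same name by default, so nothing extra is needed to wire the three together.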

Ansible creates a lot of temporary files during the course of a play, and needs writable access to a handful of directories. How and where to specify these is spread around Ansible's documentation.

I set both just to be safe. I was worried I might need allow_world_readable_tmpfiles, but systemd.exec(5) has PrivateTmp, which is able to hide Ansible's /tmp from anyone wanting to see what it's up to.
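
For anyone following along, the likely pair is remote_tmp and local_tmp, both settable in ansible.cfg (the paths shown are the stock defaults):

# ansible.cfg
[defaults]
# where modules are copied to and run from on the remote machine
remote_tmp = ~/.ansible/tmp
# where Ansible stages things on the controller
local_tmp = ~/.ansible/tmp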

A private temporary directory is important, as Ansible may pass around passwords and other secret values in files. This play won't, but it doesn't hurt to put this in early, before more features that may do that are added.


Success! 🎉 It works! I have successfully pitted Ansible against whichever system is generating the nginx configuration, in an endless battle, with Ansible successfully getting the last word in each bout.

Alright, onto grabbing the BitBucket repository.

It doesn't look like caddy comes with the http.git plugin that lets it run git in response to webhooks.

That means I need to modify the playbook to:

Note: The first requirement above reminds me that the remote machine is not currently set up to automatically update caddy

Because I'll now be storing BitBucket OAuth token values in my files, I can either keep them in the untracked vars.yaml file, or I can take this opportunity to use Ansible Vault for the first time.

It still doesn't seem like a good idea to store secrets in source control, so I'll do both.

This does mean ansible-playbook must be run with --ask-vault-pass, since this behaviour can't be specified in a configuration somewhere, unlike --ask-become-pass/-K and become_ask_pass.

The advantage of putting all the secrets in vault is that they're not stored in the clear on disk, and you only have to type in one thing for all of them, as opposed to using vars_prompt and typing in each secret individually.
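
The moving parts, sketched with placeholder file names:

# encrypt the secrets file once
ansible-vault encrypt vault.yaml
# edit it later without leaving plain text behind
ansible-vault edit vault.yaml
# every run now needs the vault password (and the become password)
ansible-playbook deploy.yaml --ask-vault-pass --ask-become-pass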


I'm having some trouble with the http.git plugin for caddy. I have a private repository, and it claims to support those, and lists in its documentation that I should be able to give something like the following:

git {
    repo {{ blog_repo }}
    path ./blog_repo
    key {{ bitbucket_access_key.filename }}
    clone_args --depth=1 --single-branch
    pull_args --all -s recursive -X theirs
    hook {{ webhook_path }}
    hook_type bitbucket
}

But this does not work. From caddy, we get:

parse ssh://git@bitbucket.org:mawillcockson/blog: invalid port ":mawillcockson" after host

Alright, not quite what I expected, but I guess we can fix that by using the ssh:// URL format ourselves, from the beginning:

git {
    repo ssh://git@bitbucket.org/{{ bitbucket_user }}/{{ bitbucket_repo }}
    path ./blog_repo
    key {{ bitbucket_access_key.filename }}
    clone_args --depth=1 --single-branch
    pull_args --all -s recursive -X theirs
    hook {{ webhook_path }}
    hook_type bitbucket
}

No dice:

fatal: repository 'git@bitbucket.org/mawillcockson/blog' does not exist

Wait, wut... But you just used the ssh:// URL!

Alright, that's fine, maybe we can sneak in the port! Let's try:

git {
    repo ssh://git@bitbucket.org:22/{{ bitbucket_user }}/{{ bitbucket_repo }}
    path ./blog_repo
    key {{ bitbucket_access_key.filename }}
    clone_args --depth=1 --single-branch
    pull_args --all -s recursive -X theirs
    hook {{ webhook_path }}
    hook_type bitbucket
}

Haha! Take that! Let's see you fail now!

Cloning into '/var/blog/blog_repo'...
Resource not found
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.

Oh.

Well, at least we made some progress. It did create a blog_repo directory:

$ sudo -u caddy test -d {{ caddy_home }}/blog_repo && echo Created!
Created!

Hmm. Well, that's unfortunate. Why can't it clone? What happens if we try the same clone by hand? The command would be something like, uh, let me just check man git-clone to see which flag I'd need to pass to specify the ssh key... nothing. Okay, Google.
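
For the record, the usual answer turns out to be an environment variable rather than a flag. Something like this, with the key path as a placeholder:

GIT_SSH_COMMAND='ssh -i /path/to/key -o IdentitiesOnly=yes' \
    git clone ssh://git@bitbucket.org/{{ bitbucket_user }}/{{ bitbucket_repo }}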

Oh, right! I've done this before! I'll give it an ~/.ssh/config, and then just use an unqualified domain name:

# ~/.ssh/config
Host bitbucket
    HostName bitbucket.org
    User git
    IdentityFile {{ bitbucket_access_key.filename }}
    # From https://stackoverflow.com/a/11251797
    IdentitiesOnly yes

I like to remind myself where I got some information.

Alright, now we can do:

git {
    repo ssh://bitbucket:22/{{ bitbucket_user }}/{{ bitbucket_repo }}
    path ./blog_repo
    key {{ bitbucket_access_key.filename }}
    clone_args --depth=1 --single-branch
    pull_args --all -s recursive -X theirs
    hook {{ webhook_path }}
    hook_type bitbucket
}

Cool. So now, we get:

Cloning into '/var/blog/blog_repo'...
Resource not found
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.

Huh. No change. Well, maybe we don't have access:

$ sudo -u caddy ssh -T ssh://bitbucket:22
ssh: Could not resolve hostname ssh://bitbucket:22: Name or service not known

What? Oh, right: http.git strips off the ssh://:

$ sudo -u caddy ssh -T bitbucket
authenticated via a deploy key.

You can use git or hg to connect to Bitbucket. Shell access is disabled.

This deploy key has read access to the following repositories:
{{ bitbucket_user }}/{{ bitbucket_repo }}

Okay, so our .ssh/config is working. What gives?

Well, with an extraordinarily hacky command, we can see:

sudo systemctl restart caddy ; watch -g -n 0.1 -p pgrep -a git | less -R

I know, there are different ways you could do it, but it does give an answer, and only needs a reset(1) afterwards:

[?1049h^OEvery 0.1s: pgrep -a gitmatthew.willcockson.family: Thu Jan  9 02:59:53 2020^M18645 /usr/bin/git clone -b master --depth=1 --single-branch bitbucket:22/{{ bitbucket_user }}/{{ bitbucket_repo }} {{ caddy_home }}/blog_repo^M

Oh, yeah, that would cause a problem. Looks like http.git formats the URL as ssh://, requires a port to be specified, and then strips the scheme without removing the port, leaving git to fall back on its scp-style host:path syntax, which treats 22/{{ bitbucket_user }}/{{ bitbucket_repo }} as the path of the repository on the host bitbucket.

The really hacky way to fix this would be to try to register a bitbucket.org account, hoping that a username between 1 and 65535 hasn't been registered, and drop {{ bitbucket_user }} from the URL, making ssh://git@bitbucket.org:65535/blog a parseable URL for http.git, and git@bitbucket.org:65535/blog a valid git URL.

The alternative would be to have the playbook build caddy and http.git from source, and modify the bash script http.git generates and runs as part of that process.

Ugh, if only I'd used a public repo, then it would've been an easy https://github.com/....

But nooooo! I wanted to have my private and public entries in the same repo, so I don't have to switch between them, as any additional friction is a barrier to writing at all. And it'd be a bit annoying to have to set up git remotes for both the public and private content, especially since I plan on being able to type out things on different machines.


Easy it is. Just need to switch around some OAuth tokens.

Welp, got it working for now! I'll figure out how to get Ansible to do remote cloning, patching, and building of a Go webserver later.

On a side note, it's a little weird BitBucket doesn't let you set up 2-factor authentication for a new account without creating an SSH key.


Alright, so I can finally include an ultra simple browse directive for caddy, with a markdown directive for the files, and that should be good.
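
In Caddyfile terms, something like this sketch (the path is a guess at my own layout):

browse /blog
markdown /blog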


Well, I may have said I had it working, but it took a few more rewrite statements in the Caddyfile, as well as setting the root of the blog to a directory inside the cloned repository, ensuring-

Wait, double checking if matthew.willcockson.family/../private escapes the root.

-that the only things that can be served are the files in the public folder.

There's still some unwanted behaviour:


The default template of the caddy browse directive appears to generate URLs relative to the URL it's accessed from.

That makes sense: if a subdirectory is browsed, you'd want the links to be for items in that directory. It does mean, though, that without a trailing slash, the links on /blog resolve against the site root instead of inside /blog/.

So, one redir /blog /blog/ later, and tada! 🥳

This page is up, /private is not, and everything is in source-controlled configuration management!


Forgot to set up webhooks with the new repository, so I'm adding this line as a check to make sure the webhook is working properly.


Well, cool. Now I have BitBucket's webhook request logging turned on, because the website isn't updating on pushes.


I'm fairly certain that http.git's IP address filtering is returning a 403 for all webhook requests, since caddy is being proxied.

I can test this by setting the webhook to make requests on an unused port, which also has to be opened in ufw.
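
The firewall half is one command, with an arbitrary port number:

sudo ufw allow 2020/tcp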


Wow. To be honest, I was going to give up when I realized I couldn't get the webhook request to work behind the proxy. Then, while trying to figure out how I could prove it was caddy returning the 403 (and that the nginx in BitBucket's webhook response viewer came from nginx capturing the 403 and throwing its own), I tried changing errors stderr to errors visible in the Caddyfile, but was still getting nothing.

I don't know why trying to figure out "How can I send a request directly to caddy to prove it's the one returning a 403?" worked to help me solve the problem, instead of the original "Oh, we're behind a proxy. How do we get past the IP filtering when all requests come from 127.0.0.1?". Probably because the latter leads to "How to change request IPs?", rather than "How to request caddy?".

Anywho, yes, setting the webhook to a custom port, setting a separate virtual host for the http.git plugin to watch for requests on that unused port, and opening the port in the firewall does do the trick.
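
The separate virtual host ends up looking something like this sketch (the port is a stand-in, matching the ufw rule above):

http://matthew.willcockson.family:2020 {
    git {
        repo {{ blog_repo }}
        path ./blog_repo
        hook {{ webhook_path }}
        hook_type bitbucket
    }
}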


Now that that's been formalized in the blog_deploy, let's test it out 🤞


Works beautifully.

Well, aside from the kludge, but that makes it beautiful in its own w-- No, hacks aren't beautiful. They can be elegant, or clever, but not beautiful. And certainly not these hacks ☹