I am not a DevOps engineer. I appreciate any critique or correction.
Deploying Nextcloud on AWS ECS with Pulumi
This Pulumi programme deploys a highly-available, cost-effective Nextcloud service on AWS Fargate with a serverless Aurora PostgreSQL database.
Deployment Option 1 (GitOps)
The first few items are high-level instructions only. You can follow the instructions from the hyperlinked web pages. They include the best practices as recommended by the authors.
- A Pulumi account. This is for creating a Personal Access Token that is required when provisioning the AWS resources.
- Create a non-root AWS IAM User called `pulumi-user`.
- Create an IAM User Group called `pulumi-group`.
- Add `pulumi-user` to the `pulumi-group` User Group.
- Attach the `IAMFullAccess` policy to `pulumi-group`. `IAMFullAccess` allows your IAM User to add the remaining required IAM policies to the IAM User Group using the automation script later.
- Create an access key for your non-root IAM User.
- On your Pulumi account, go to Personal access tokens and create a token.
- Also create a password for the Aurora Database. You can use a password generator.
- Clone this repository either to your GitLab or GitHub.
- This works on either GitLab CI/CD or GitHub Actions. On GitLab, go to the cloned repository's Settings > CI/CD > Variables. On GitHub, go to the cloned repository's Settings > Secrets and variables > Actions > Secrets.
- Store the credentials from steps 6-8 as `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `PULUMI_ACCESS_TOKEN`, and `POSTGRES_PASSWORD`. These will be used as environment variables by the deployment script.
- On AWS Console, go to EC2 > Load Balancers. The `DNS name` is where you access the Nextcloud Web Interface to establish your administrative credentials.
[!NOTE] The automatic deployment will be triggered if there are changes made to `main.go`, `.gitlab-ci.yml`, or the `ci.yml` file upon doing a `git push`. In `main.go`, you can adjust the specifications of the resources to be manifested. Notable ones are in lines 327, 328, 571, 572, 602, 603, 640.
Deployment Option 2 (Manual)
- Install Go, AWS CLI, and Pulumi.
- Follow steps 1-8 above.
- Add the required IAM policies to the IAM User Group to allow Pulumi to interact with AWS resources:
printf '%s\n' "arn:aws:iam::aws:policy/AmazonS3FullAccess" "arn:aws:iam::aws:policy/AmazonECS_FullAccess" "arn:aws:iam::aws:policy/ElasticLoadBalancingFullAccess" "arn:aws:iam::aws:policy/CloudWatchEventsFullAccess" "arn:aws:iam::aws:policy/AmazonEC2FullAccess" "arn:aws:iam::aws:policy/AmazonVPCFullAccess" "arn:aws:iam::aws:policy/SecretsManagerReadWrite" "arn:aws:iam::aws:policy/AmazonElasticFileSystemFullAccess" "arn:aws:iam::aws:policy/AmazonRDSFullAccess" | xargs -I {} aws iam attach-group-policy --group-name pulumi-group --policy-arn {}
- Add the environment variables.
export PULUMI_ACCESS_TOKEN="value" && export AWS_ACCESS_KEY_ID="value" && export AWS_SECRET_ACCESS_KEY="value" && export POSTGRES_PASSWORD="value"
- Clone the repository locally and deploy.
mkdir pulumi-aws && \
cd pulumi-aws && \
pulumi new aws-go && \
rm * && \
git clone https://gitlab.com/joevizcara/pulumi-aws.git . && \
pulumi up
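Before running `pulumi up`, it can help to confirm that the four variables from the export step are actually set. The following is a minimal Go sketch of such a pre-flight check. `missingVars` and `requiredVars` are hypothetical names used for illustration and are not part of the repository.

```go
package main

import (
	"fmt"
	"os"
)

// requiredVars lists the variables named in the deployment steps.
var requiredVars = []string{
	"PULUMI_ACCESS_TOKEN",
	"AWS_ACCESS_KEY_ID",
	"AWS_SECRET_ACCESS_KEY",
	"POSTGRES_PASSWORD",
}

// missingVars returns every name in required for which lookup reports
// no value or an empty value. lookup has the same shape as os.LookupEnv,
// so the function can be exercised without touching the real environment.
func missingVars(required []string, lookup func(string) (string, bool)) []string {
	var missing []string
	for _, name := range required {
		if v, ok := lookup(name); !ok || v == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// This sketch only reports; a real gate in a script would also
	// exit with a non-zero status when something is missing.
	if missing := missingVars(requiredVars, os.LookupEnv); len(missing) > 0 {
		fmt.Printf("missing environment variables: %v\n", missing)
		return
	}
	fmt.Println("all required environment variables are set")
}
```

Passing the lookup function as a parameter is a design choice made here purely so the check can be tested with a fake environment.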
Deprovisioning
pulumi destroy --yes
Local Testing
The `Pulumi.aws-go-dev.yaml` file contains a code block to use with LocalStack for local testing.
Features
- Subscription-free application - Nextcloud is a free and open-source cloud storage and file-sharing platform.
- Serverless management - using Fargate and Aurora Serverless reduces infrastructure management.
- Reduced cost - can scale and be as highly available as an AWS EKS cluster, but at a lower per-hour cost.
- Go coding language - a popular language for cloud-native applications, eliminating syntax barriers for engineers.
Diagramme
A few suggestions:
- Some of those components may end up costing a lot to operate. You said you’re doing it as a portfolio piece. You may want to create a spreadsheet with all the services, then run a cost simulation. You can use the AWS cost calculator, but it won’t be as flexible for ‘what if’ scenarios. Any prospective employer will appreciate that you’ve given some thought to runtime pricing.
- You may want to split static media out and put it in S3 buckets, plus put a CloudFront CDN in front for regional scaling (and cost). Static media served from the local server uses up processing power, bandwidth, storage, and memory. S3/CloudFront is designed for just this and is a lot cheaper. All fonts, JS scripts, images, CSS stylesheets, videos, etc. can be moved out.
- Definitely expire your CloudWatch log records (maybe keep no more than a week), otherwise they’ll pile up and end up costing a lot.
- Consider where backups and logs may go. Backups should also account for Disaster Recovery (DR). Is the purpose of multiple AZs scaling or DR? If it is DR, you should think about different recovery strategies and how much downtime is acceptable.
- Using Pulumi is good if the goal is to go multi-cloud. But if you’ve hardcoded Aurora or ALBs into the stack, you’re stuck with AWS. If that’s the case, maybe consider going with AWS CDK in a language you like. It would get you farther and let you do more native DevOps.
- Consider how updates and revisions might work, especially once rolled out. What scripts will you need to run to upgrade the Nextcloud stack? What are the implications if only one AZ is updated, but not the other? Etc.
- If this is meant for a business or multiple users, consider where user accounts would go. What about OAuth or 2FA? If it’s a business, they may already have an Identity Provider (IdP) and now you need to tie into it.
- If you’re just tire-kicking, you may also want to script switching to plain old RDS/Postgres so you can stay under the free tier.
- To make this all reusable, you want to take whatever is generated (e.g. Aurora endpoints) and save everything to a JSON or .env file. This way, the whole thing can be zapped and re-created and should work without having to manually do much in the console or CLI.
- Any step that uses the console or CLI adds friction and risk. Either automate them, or document the crap out of them as a favor to your future self.
- All secrets could go in .env files (which should be in .gitignore). Aurora/RDS database passwords could also be auto-generated, kept in Secrets Manager, and periodically rotated. Hardcoded DB passwords are a risk.
- Think about putting WAF in front of everything with web access to help mitigate DDoS attacks.
This is a great learning exercise. Hope you don’t find these suggestions overwhelming. They only apply if you want to show it off to future employers. If it’s just for personal use, ignore all the rest I said and just think about operating costs. See if you can find an AWS sales or support person and get some freebie credits.
Best of luck!
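As an illustration of the point about auto-generated database passwords, here is a minimal Go sketch that builds a random password from the standard library's cryptographic random source. `generatePassword` and `alphabet` are hypothetical names, not part of the repository. Restricting the alphabet to letters and digits is an assumption made to sidestep any special-character restrictions the database may impose. Storing and rotating the result (for example in Secrets Manager) is left out.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

// alphabet is limited to letters and digits (an assumption made for
// this sketch) so the result avoids characters a database might reject.
const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"

// generatePassword returns a random string of the given length, with
// each character drawn uniformly from alphabet using crypto/rand.
func generatePassword(length int) (string, error) {
	if length <= 0 {
		return "", fmt.Errorf("length must be positive, got %d", length)
	}
	limit := big.NewInt(int64(len(alphabet)))
	out := make([]byte, length)
	for i := range out {
		n, err := rand.Int(rand.Reader, limit) // uniform in [0, len(alphabet))
		if err != nil {
			return "", err
		}
		out[i] = alphabet[n.Int64()]
	}
	return string(out), nil
}

func main() {
	pw, err := generatePassword(32)
	if err != nil {
		fmt.Println("could not generate password:", err)
		return
	}
	// Printed only for demonstration; a real setup would hand the value
	// to a secrets store instead of writing it to the terminal.
	fmt.Println(pw)
}
```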
thank you
This is an awesome helpful comment
Pretty cool as a learning exercise. As a follow-up scenario, maybe try moving this infrastructure to another cloud provider because AWS deleted your account without warning, or try a multi-cloud deployment.
Everyone is free to pick their poison, but I have to ask…why? What is the target audience here? This is a massively overkill architecture IMHO. Not to talk about the fact you now need 3 managed services (fargate, s3 and aurora at least) for a single self hosted tool, and that is being generous (not counting cloudwatch, ALBs, etc.).
- Why do you need security groups to allow egress anywhere (or, at all)?
- I would pin the image to a digest, rather than using latest.
- What is the average monthly cost of this infra for you?
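To make the digest suggestion concrete, here is a small Go sketch that tells a tag-only image reference apart from one pinned to a `sha256` digest. `isPinnedToDigest` is a hypothetical helper, the pattern is a simplification rather than a full image-reference parser, and the digest in `main` is a placeholder, not a real Nextcloud image digest.

```go
package main

import (
	"fmt"
	"regexp"
)

// digestRef matches references that end in "@sha256:" followed by
// exactly 64 lowercase hex characters. Simplified for illustration.
var digestRef = regexp.MustCompile(`^[^@\s]+@sha256:[a-f0-9]{64}$`)

// isPinnedToDigest reports whether ref names an image by content digest
// rather than only by a mutable tag such as "latest".
func isPinnedToDigest(ref string) bool {
	return digestRef.MatchString(ref)
}

func main() {
	// Placeholder digest for demonstration only.
	const placeholder = "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"
	refs := []string{
		"nextcloud:latest",
		"nextcloud@sha256:" + placeholder,
	}
	for _, ref := range refs {
		fmt.Printf("%s pinned=%t\n", ref, isPinnedToDigest(ref))
	}
}
```

A check like this could run in CI to reject a task definition that still points at a mutable tag.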
Folks in IT. This is one of those “deploy something enterprise grade because you can” type of scenarios. It’s like asking why somebody would play a dry milsim game like Arma when Call of Duty exists. This will cost you more than a simple VPS on a platform but it wouldn’t exactly break the bank either.
Not that cheap. Both Aurora and Fargate can be pricey, so using this as a personal cloud rather than a business solution is not only overkill but also an expensive setup that you will not fully reuse for other services. I think that in personal self-hosting we agree not to use such an overkill architecture (although for a typical cloud deployment it is fairly simple), which cuts costs massively.
Oh yeah, I am aware. Mostly here I would question the idea of having multi-AZ redundancy and using a managed service for the DB (which is indeed expensive). All of this when a $5 VPS could host the same thing (maybe still using S3 for storage) if you accept a few hours of downtime in the rare event your VPS explodes and you need to restore it from a backup.
So from my PoV this is absolutely overkill but I concede that it depends a lot on the requirements. I can’t ever imagine having requirements so tight that need such infra to run (in fact, I think not even most businesses have these requirements, I have written on the topic at https://loudwhisper.me/blog/hating-clouds/) for my personal stuff…
Yes, just like I said, when running it for personal use, going for a 99.(9)% SLA is too expensive. As for serverless solutions, they can be great and helpful (I can say that from both a SysOps and a DevOps perspective, having worked on many projects), but I don’t think they should be used in a homelab, as they do not allow much customisation, and homelabs are the place where we want to experiment and have some fun, not just deploy something in a way that will “just work”.
Plus, at this point why not use managed Nextcloud (or alternatives) directly… If you use managed storage, runtime and database anyway, with vendor lock-in…
I would not go with managed NC, because you can’t control anything and the provider raises prices over and over. Even with a serverless Nextcloud deployment, the architecture is still like LEGO, and if something goes bad or the price gets too high, you can swap those LEGO bricks, i.e. migrate from Fargate to EC2 with ECS, migrate from Aurora to RDS Postgres or Postgres installed on EC2, and so on.
No sane selfhoster should do this. This is far beyond being overkill.
I’m pretty sure that’s the point.
All comments about overkill are amusing. You do you. Did you learn stuff?
Maybe you can replace some of those tools with less expensive analogs, how’s the cost anyway?
it’s just for my portfolio. it’s like self-hosting for enterprise
If you want it to stand out, don’t automate the compute and networking; that’s so standardized these days that anyone can do it. Automate those IAM permissions.
I know that when hiring nothing gets me more excited about a candidate than them understanding how to securely bootstrap an environment.
In that case, the Pulumi permissions are too broad IMHO for what it has to do; an enterprise should adhere to least privilege. Likewise, as I wrote in another comment, the egress security groups are unclear to me (why is any traffic needed at all?) and the image consumed should be pinned to a digest. Or better yet, it should come from a private enterprise registry, ideally with an attestation that can be verified at runtime.
I am not sure ECS Fargate makes sense vs an ec2 instance to run the workload. This setup alone will cost about $30/month assuming half a vCPU per replica with Fargate, plus about $12 for the memory (1GB/task). 2xt2.micro could be run for ~$20 without even considering reservation discounts etc. Obviously the gap will become even larger at scale, which I suppose might be very interesting for an enterprise.
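The estimate above is simple arithmetic and can be parameterised. The Go sketch below is a hypothetical helper, not part of the repository. The hourly rates and the 730-hour month in `main` are assumptions for illustration only and should be replaced with the current published Fargate pricing for your region before drawing conclusions.

```go
package main

import "fmt"

// Rates holds on-demand Fargate prices in USD per resource-hour.
type Rates struct {
	PerVCPUHour float64
	PerGBHour   float64
}

// monthlyFargateCost returns the CPU and memory cost of running `tasks`
// identical tasks for `hours` hours at the given per-task sizes.
func monthlyFargateCost(tasks int, vcpuPerTask, memGBPerTask, hours float64, r Rates) (cpu, mem float64) {
	n := float64(tasks)
	cpu = n * vcpuPerTask * hours * r.PerVCPUHour
	mem = n * memGBPerTask * hours * r.PerGBHour
	return cpu, mem
}

func main() {
	// Assumed rates, for illustration only; check current regional pricing.
	assumed := Rates{PerVCPUHour: 0.04048, PerGBHour: 0.004445}
	// Two replicas, half a vCPU and 1 GB each, over an assumed 730-hour month.
	cpu, mem := monthlyFargateCost(2, 0.5, 1, 730, assumed)
	fmt.Printf("CPU: $%.2f/month, memory: $%.2f/month, total: $%.2f/month\n", cpu, mem, cpu+mem)
}
```

Changing the task count, sizes, or rates makes it easy to try the "what if" scenarios mentioned elsewhere in this thread.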
If that’s true, then great, and learning with cloud-native technology is perfectly fine. The critical comments were probably made because the post does not indicate that this is just an idea for production architecture or a form of learning, but rather the actual deployment that should be carried out (at least that is how I see it), which in this subreddit could be perceived as a proposal for self-hosting for private individuals (as self-hosting is associated with private individuals).
I started counting, and with just two Fargate ECS tasks (without much CPU power) and the first Aurora DB it is almost 200 USD per month (in Frankfurt). If we add other services, the cost will climb higher and higher.
Did you learn stuff?
Yeah, learning is great, and if you deploy it and kill it on the same day, the cost will be quite low. But if you want to really use it, it is too much; it is better to use self-hostable alternatives, i.e. Load Balancer == HAProxy, Fargate Task == Docker on EC2/VPS (even with ECS), Aurora == burstable-tier RDS or a DB hosted on VPS/EC2. I know that in a business setting you should not host a DB on EC2 or use plain Docker on EC2 (without ECS), and that a production Nextcloud deployment could be more extensive, because availability and scalability are more important than saving some dollars. But for private use, where every penny matters, it is overkill for everyday use.
One availability zone is enough. I am not convinced that Aurora is good value.
This seems like an ad for pulumi, whatever that is.
In my experience, Pulumi can best be described as a waste of time.
This seems like overkill compared to just running it on a VPS and having a second VPS as a hot spare.