Hosting Lemmy experience

nachitima@lemmy.ml · 2 days ago

Hosting Lemmy experience

nachitima@lemmy.ml · 5 hours ago

Hey, super helpful comment.

A few of the details you mentioned are exactly the kind of practical stuff I’m trying to collect, so I wanted to ask a bit more:

When you say you pushed federation workers up to 128, which exact setting are you referring to?
Roughly how big is your instance in practice — users, subscriptions, remote communities, storage size, daily activity?
What were the first signs that federation was falling behind, besides the Waiting for X workers log message?
Did increasing workers fully solve it, or did it just move the bottleneck somewhere else?
What kind of Postgres tuning ended up mattering most for you?
For backups, are you only doing weekly pg_dump + VPS backups, or also separately backing up pictrs, configs, secrets, and proxy setup?
Have you tested full restore end-to-end on another machine?
For pictrs growth, have you found any good way to keep storage under control, or is it mostly just “plan for it to grow”?
For monitoring/logging, if you were starting over, what would you set up from day one?

I’m mostly interested in the boring operational side of running Lemmy long-term: backup/restore, federation lag, storage growth, and early warning signs before things get messy.

Sorry if some of these questions are a bit basic or oddly specific — I’m using AI to help gather as much real-world Lemmy hosting experience as possible, and it generated most of these follow-up questions for me.