System Hackers meeting - Lyon edition

Hackers in their natural working environment. For the picture we took off the black ski masks and gloves.

For the 4th time, and less than 5 months after the last meeting, the FSFE System Hackers met in person to coordinate their activities, work on complex issues, and exchange know-how. This time, we chose yet another town familiar to one of our team members as venue – Lyon in France. What follows is a report of this gathering that happened shortly before #stayhome became the order of the day.

For those who do not know this less visible but important team: The System Hackers are responsible for the maintenance and development of a large number of services. From the fsfe.org website’s deployment to the mail servers and blogs, from Git to internal services like DNS and monitoring, all these services, virtual machines and physical servers are handled by this friendly group that is always looking forward to welcoming new members.

Interestingly, we gathered in the same constellation as at the previous hackathon, so Albert, Florian, Francesco, Thomas, Vincent and I tackled large and small challenges in the FSFE’s systems. We also used the time to exchange knowledge about complex tasks and some interconnected systems. The official part was conducted in the fascinating Astech Fablab, but word has it that Ninkasi, an excellent pub in Lyon, was the actual epicentre of this year’s meeting.

Sharing is caring

On Saturday morning, after reviewing open tasks and setting our priorities, we started to share more knowledge about our services to reduce bottlenecks. For this, I drew a few diagrams to explain how we deploy our Docker containers, how our community database interacts with the mail and lists server, and how DNS works at the FSFE.

To also help the non-present system hackers and “future generations”, I’ve added this information to a public wiki page. This could also be the starting point to transfer more internal knowledge to public pages to make maintenance and onboarding easier.

Todo? Done!

Afterwards, we focused on closing tasks that had been open for a long time:

  • The DNS has been a big issue for a long time. Over the past months we’ve migrated the source for our nameserver entries from SVN to Git, rewritten our deployment scripts, and eventually upgraded the two very sensitive systems to Debian 10. During the meeting, we came closer to perfection: the entire BIND configuration is now cleaned of old entries, uniformly formatted, and features SPF, DMARC and CAA records.
  • For better security monitoring of the 100+ mailing lists the FSFE hosts, we’ve finalised the weekly automatic checks for sane and safe settings, as well as a tool that makes it easy to update the internal documentation.
  • Speaking of monitoring: we lacked proper monitoring of our 20+ hosts for availability, disk usage, TLS certificates, service status and more. After trying for a long time to get Prometheus and Grafana to do what we need, we performed a 180° turn: now, there is an Icinga2 installation running that already monitors a few hosts and their services – deployed with Ansible. In the following weeks we will add more hosts and services to the watched targets.
  • We plan to migrate our user-unfriendly way to share files between groups to Nextcloud, including using some more of the software’s capabilities. During the weekend, we’ve tested the instance thoroughly, and created some more LDAP groups that are automatically transposed to groups in Nextcloud. In the same run, Albert shared some more knowledge about LDAP with Vincent and me, so we get rid of more bottlenecks.
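
To illustrate what the SPF, DMARC and CAA records mentioned in the DNS item look like in a zone file, here is a minimal Python sketch. The domain, report address, policy values and the `Record` and `mail_policy_records` names are hypothetical placeholders chosen for this example; they are not the FSFE's actual entries or tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Record:
    name: str   # fully qualified owner name, with trailing dot
    rtype: str  # record type, e.g. "TXT" or "CAA"
    rdata: str  # record data exactly as written in the zone file

    def to_zone_line(self, ttl: int = 3600) -> str:
        """Render the record as one line of a BIND-style zone file."""
        return f"{self.name} {ttl} IN {self.rtype} {self.rdata}"


def mail_policy_records(domain: str, report_address: str, ca: str) -> List[Record]:
    """Build example SPF, DMARC and CAA records for a domain.

    The policy values below (only MX hosts may send mail, quarantine on
    DMARC failure, a single allowed CA) are assumptions for illustration.
    """
    fqdn = domain.rstrip(".") + "."
    return [
        # SPF: published as a TXT record on the domain itself.
        Record(fqdn, "TXT", '"v=spf1 mx -all"'),
        # DMARC: published as a TXT record under the _dmarc label.
        Record(
            f"_dmarc.{fqdn}",
            "TXT",
            f'"v=DMARC1; p=quarantine; rua=mailto:{report_address}"',
        ),
        # CAA: names the certificate authority allowed to issue certificates.
        Record(fqdn, "CAA", f'0 issue "{ca}"'),
    ]


if __name__ == "__main__":
    records = mail_policy_records(
        "example.org", "dmarc-reports@example.org", "letsencrypt.org"
    )
    for record in records:
        print(record.to_zone_line())
```

Generating such lines from one place is one way to keep entries uniformly formatted across zones; whether the real deployment scripts work like this is not stated in this report.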

Then, it was time to deal with other urgent issues:

  • Some of us worked on making our systems more resilient against DDoS attacks, after we became the target of one over the Christmas season. The idea is to come up with solutions that are easy to deploy on all our web services while keeping complexity low. We’ve tested some approaches and will keep refining them.
  • Regarding webservers, we’ve updated the TLS configurations on various services to the recommended settings, and also improved some other settings while touching the configuration files.
  • We intend to make it easier for people to encrypt their emails with GnuPG. That is why we experimented with WKD/WKS and will work on setting up this service. As it requires some interconnection with other services, this will unfortunately take us some more time.
  • On the maintenance side of things, we have upgraded all servers except one to the latest Debian version, and also updated many of our Docker images and containers to make use of the latest security and stability improvements.
  • The FSFE hosts a few third party services, and unfortunately they have been running on unmaintained systems. That is why we set up a brand new host for our sister organisation in Latin America so they can eventually migrate, and moved the fossmarks.org website to our automatic CI/CD setup via Drone/Docker.
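
As background to the WKD experiments mentioned above: a Web Key Directory client locates a key by hashing the lowercased local part of the email address with SHA-1, encoding the digest in z-base-32, and requesting it from a well-known URL on the domain. The following Python sketch shows that lookup scheme as described in the WKD draft specification. The function names are made up for this example, and it says nothing about how the FSFE's own service will be set up.

```python
import hashlib
from urllib.parse import quote

# Alphabet used by z-base-32, the encoding WKD applies to the SHA-1 digest.
ZBASE32_ALPHABET = "ybndrfg8ejkmcpqxot1uwisza345h769"


def zbase32_encode(data: bytes) -> str:
    """Encode bytes as z-base-32, five bits per output character."""
    bits = "".join(f"{byte:08b}" for byte in data)
    # Pad with zero bits so the length is a multiple of five.
    bits += "0" * (-len(bits) % 5)
    return "".join(
        ZBASE32_ALPHABET[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5)
    )


def wkd_hash(local_part: str) -> str:
    """Hash the lowercased local part of an address for a WKD lookup."""
    digest = hashlib.sha1(local_part.lower().encode("utf-8")).digest()
    return zbase32_encode(digest)


def wkd_urls(address: str) -> dict:
    """Return the 'advanced' and 'direct' WKD lookup URLs for an address."""
    local_part, domain = address.rsplit("@", 1)
    domain = domain.lower()
    hashed = wkd_hash(local_part)
    # The original, unmodified local part is passed along as a query parameter.
    query = "?l=" + quote(local_part)
    return {
        "advanced": (
            f"https://openpgpkey.{domain}/.well-known/openpgpkey/"
            f"{domain}/hu/{hashed}{query}"
        ),
        "direct": f"https://{domain}/.well-known/openpgpkey/hu/{hashed}{query}",
    }


if __name__ == "__main__":
    for method, url in wkd_urls("Joe.Doe@example.org").items():
        print(method, url)
```

Because the lookup depends on both a web server and the key material of each address, it hints at why the service needs to be connected with other systems.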

The next steps and developments

As you can see, we completed and started to tackle a lot of issues again, so it won’t become boring in our team any time soon. However, although we should know better, we intend to “change a running system”!

While the in-person meetings have been highly important and also fun, we are in a state where knowledge and mutual trust are further distributed between the members, the tasks separated more clearly and the systems mostly well documented. So part of our feedback session was the question whether these meetings in the 6-12 month rhythm are still necessary.

Yes, they are, but not more often than once a year. Instead, we would like to try virtual meetings and sprints. Before a sprint session, we would discuss all tasks (basically go through our internal Kan board), plan the challenges, ask for input if necessary, and resolve blockers as early as possible. Then, we would be prepared for a sprint day or afternoon during which everyone can work on their tasks while being able to directly contact other members. All that should happen over a video conference to have a more personal atmosphere.

For the analogue meetings, it was requested that we also plan tasks and priorities together beforehand, and focus on tasks that require more people from the group. We also want to have more training sessions and system introductions like the ones we’ve just had, to reduce dependencies on single persons.

All in all, this gathering has been another successful meeting and will lay a cornerstone for exciting new improvements for both the systems and the team. Thanks to everyone who participated, and big applause to Vincent, who organised the venue and the social activities!


