How to design such a project?
Python with Django was chosen as the main technology simply due to its simplicity and the ability to easily “hack” things together. It turned out well, as virtually every developer in our company knows some Python and, as a result, anyone can fix a bug or add a minor feature to the project. We decided to build a monolithic Django application that leans heavily on Django’s admin, as it is the fastest way (known to us) to get the project running. It’s not really a good choice for a big service, but hey - we were not planning on writing an enterprise CRM.
How to develop such a project?
Thanks to our knowledge of the tools we were working with, we had two project instances running in a matter of days. We started with a simple stack: Django + PostgreSQL + Nginx running in Docker on a VPS. Of course, we had a CI/CD pipeline running from day one, and we used it heavily. Thanks to the pipeline, two environments (staging and production), and an agile approach to development, we had a quick feedback loop between the developers and the users. Features were discussed, implemented, deployed, and approved within a few hours. After about two weeks of development, we started to retire previously used tools that had equivalents in this project, including complex spreadsheets, Jira plugins, and a couple of external services.
How to keep developing such a project?
To ensure the project’s reliability and code quality, we introduced some strict rules. From the beginning, every task needed to be selected for development before work on it started and approved after it was deployed. This approach allowed us to keep our kanban board up to date and as close to reality as possible.
On the development side, all pull requests, apart from needing a code review (duh), needed to pass a couple of checks enforced by the CI pipeline. First, static analysis was run using flake8 - it caught many of the simple issues. Next, the project formatting was checked by a tool called black - this step ensured the whole project was always formatted according to the same convention. Then, unit tests were run to verify that the code worked as intended. Last, a minimal viable environment was spun up, and integration tests were run on it to prevent unexpected failures. After a pull request was merged, the code was automatically deployed to the staging environment. A deployment to production could be requested by anyone and was performed automatically once a code owner approved it.
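As a rough illustration of the "fail on the first broken check" idea (a sketch, not the actual pipeline configuration), the checks can be modelled as an ordered list of commands. The `flake8 .` and `black --check .` invocations are the standard ways to call those tools; the `manage.py test` command and the `run_checks` helper are assumptions made for this example.

```python
import subprocess
import sys

# Ordered list of (name, command). The unit test command is an assumption;
# a project may just as well use pytest or another runner.
CHECKS = [
    ("static analysis", ["flake8", "."]),
    ("formatting", ["black", "--check", "."]),
    ("unit tests", [sys.executable, "manage.py", "test"]),
]


def run_checks(checks, run=subprocess.run):
    """Run each check in order and return the name of the first failing one.

    Returns None when every command exits with status 0. The `run` parameter
    exists only so the function can be exercised without the real tools.
    """
    for name, command in checks:
        result = run(command)
        if result.returncode != 0:
            return name
    return None
```

In a CI job, a non-`None` result from `run_checks(CHECKS)` would translate into a non-zero exit code, which is what blocks the pull request from being merged.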
How to maintain such a project?
With quick deployments come lots of bugs and unexpected failures. The first thing we can do is introduce the prevention tools described in the previous section. Unfortunately, they can’t catch 100% of the bugs; the world would be too simple if they could. Therefore, the next thing we can do is detection and alerting. For that, we used Sentry - a quite popular tool that gathers logs and tracebacks from your application. It has integrations with virtually all languages and frameworks. Sentry allows you to detect failures in your application, alerts you about new issues, helps you estimate their scope, and simplifies debugging and fixing them. On top of that, a simple tool like a website uptime monitor will help you detect the most serious failures, e.g., a server reboot, a deployment crash, or compute provider downtime.
Another thing that can be done to raise project reliability is to use automated failure recovery mechanisms. We started using Kubernetes to take advantage of its ability to monitor running services. As a first step, we added health checks to all services in the project. This allowed Kubernetes to detect crashed or unhealthy containers and restart them. The second thing we did was to leverage Kubernetes’s load balancing capabilities to implement zero-downtime deployments. That decoupled developer deployments from the users, who previously needed to be asked to save their work and stop working for a couple of minutes. With the help of some simple tools (Argo CD and Argo Workflows), Kubernetes can be made to automatically roll your service back to a previous version in case of a faulty deployment. Mind that we are still running a small application on a cheap VPS, not a cloud-based enterprise solution.
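To give an idea of what an uptime monitor does at its core, here is a minimal sketch using only the Python standard library. It is not a replacement for a real monitoring service, and the `is_up` function and its `fetch` parameter are names invented for this example.

```python
from urllib.error import URLError
from urllib.request import urlopen


def is_up(url, timeout=5.0, fetch=urlopen):
    """Return True if the URL answers with a 2xx status within the timeout.

    `fetch` defaults to urllib's urlopen and can be swapped out in tests.
    """
    try:
        with fetch(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (URLError, OSError):
        # Covers HTTP errors, refused connections, DNS failures, and timeouts.
        return False
```

A real monitor would call something like this on a schedule from a machine outside your infrastructure and alert after a few consecutive failures rather than on the first one.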
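The health check idea can be sketched in plain Python. Kubernetes HTTP probes treat any response code from 200 up to (but not including) 400 as healthy, so an endpoint only has to answer 200 when everything is fine and something like 503 otherwise. The `health_status` function and the check names below are hypothetical and only illustrate the pattern.

```python
def health_status(checks):
    """Run named check callables and summarize them for a probe endpoint.

    `checks` maps a name to a zero-argument callable returning a truthy
    value when the dependency is healthy. Returns (http_status, results).
    """
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A crashing check counts as unhealthy instead of breaking the probe.
            results[name] = False
    status = 200 if all(results.values()) else 503
    return status, results
```

In a Django project, a small view could call this function with, for example, a database ping as one of the checks and return the results as JSON with the computed status code.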
How to increase quality of life?
We implemented a couple of things that hugely increased our quality of life with this project. One of those things was a Django application that manages our backups. It exposes a user interface via the classic Django admin to list, create, and delete backups of our media files and persistent databases. It also allows configuring when backups are made and where they are stored. At the moment, we make a database backup every 4 hours and store each one locally for a month, and we do a full system backup every week and store it indefinitely on AWS S3. Such a simple application, which allows setting up, reviewing, and manually creating backups, greatly simplifies the backup flow and access control (it uses the Django permissions framework). Side note: please, make and audit your backups. If you are waiting for a signal from the world, this is it.
Another thing that simplifies life for you and your users is enabling OAuth2 login. Allow your application’s users to log in with their organisation accounts - not having an extra login and password makes everything more secure and simpler to use. Don’t write it yourself - it is difficult to get the security right, and there are a ton of existing and proven solutions that can be used. Just search for OAuth Django packages.
To allow any developer to start implementing features in the project without bothering someone for hours about how to set up a local environment, have a comprehensive readme written. Describe how to set up the environment on your organisation’s most popular operating systems. List the prerequisites (e.g. git client, Python interpreter - nothing is obvious), preferably with the versions that are known to work. Describe how to populate the database[s] or how to connect to a populated one. Briefly describe the conventions and what is (and should be) placed where in the repository. You should also mention what the big parts of the system are and what they do. Consider pointing out some tricky concepts or hacks that might be misunderstood. A good readme saves a lot of time and headaches, but don’t mistake it for proper documentation.
What does it actually do?
The biggest module in our project is used to manage employees. It stores all information about our current and past employees and presents it in many different ways. This includes generating CVs for our clients, graphing rates against skill level, making sure profiles are up to date, counting used and available days off, and much more. This module also ensures that most paperwork is done on time.
Another big module helps our sales team with their everyday tasks. It synchronizes information about potential clients and projects between a couple of external systems and displays it in a nice way. It generates a table with each developer’s availability and the status of their nearest projects. This module also aggregates leads from different sources (the contact form on our website and the contact email inbox) and alerts the right people about new ones via Jira and Slack.
Maybe not so big, but incredibly useful, is the cash flow module. It fetches all our invoices from the external system and compares them against financial statements from the accounting team. It also creates a revenue and cost prediction for the upcoming months based on previous data and project statuses from the sales module, and it alerts us about overdue invoices.
We have a couple of small modules that accomplish simple tasks. For example, we have three integrations with Slack. The one that is used the most is a /thanks Slack command that is used to thank someone for something. It assigns points to both the thanker and the thankee and creates monthly employee leaderboards. You’d be surprised how nice people can be when you make it a competitive activity.
One more module worth mentioning is inventory management. It is straightforward, as it only stores “items” that can be assigned [or not] to particular employees. It allows us to keep track of how many of a given asset we have available and who “has” what. We store information mostly about the big stuff, like assigned laptops and displays.
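The scoring behind such a leaderboard can be very small. Here is a minimal sketch; the point values and the `leaderboard` function are hypothetical, since the exact scoring is not described above.

```python
from collections import Counter

# Hypothetical point values, chosen only for this example.
THANKER_POINTS = 1
THANKEE_POINTS = 2


def leaderboard(thanks, top=3):
    """Build a ranking from (thanker, thankee) pairs collected in one month."""
    points = Counter()
    for thanker, thankee in thanks:
        points[thanker] += THANKER_POINTS
        points[thankee] += THANKEE_POINTS
    return points.most_common(top)


# Usage with three /thanks events.
monthly = leaderboard([("ann", "bob"), ("bob", "cid"), ("ann", "cid")])
# monthly == [("cid", 4), ("bob", 3), ("ann", 2)]
```

Rewarding the thanker as well as the thankee is what makes people actually use the command.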
What is the future?
At the moment, we are focusing on extending the sales module, and we have a couple of extra features planned. This project started as, and still is, our internal-only platform, but more and more of our partners are interested in using it, or at least parts of it. We are considering cleaning it up a bit and making it less tailor-made for us so that we can release it as a PaaS.
What do you think about it? Would you consider this platform?
Feel free to contact us via the contact form if you are interested or if you have any questions.