More Secure Groupware: Self-Hosted Exchange With a Walled Garden
Rarely a month passes when it’s not an issue, but two recent high-profile compromises of large, professionally managed hosted Exchange platforms have brought Microsoft Exchange mail server security back to the forefront of information security news cycles. Rackspace is still in the process of responding to a serious attack against its Hosted Exchange business and appears to have quickly decommissioned its product as it works to find solutions for its customers. More recently, Australia’s TPG Telecom has engaged Mandiant to respond to unauthorized access of its own Exchange platform, home to thousands of customer email accounts. Operating on-premises Exchange clusters can be onerous at the best of times, but securing them is an even more intensive process.
One we’ve experienced previously.
Groupware, But More Secure
Back before Farsight Security was acquired by DomainTools, we were presented with an all-too-familiar challenge: provide for email, calendaring, and all the services that groupware typically provides, and do it securely. We made a list of the requirements and preferences, and we tried to meet as many as possible. Open source solutions were preferred, but ultimately the needs and challenges faced by the users made clear the available systems would not be sufficient. Two choices emerged: cloud-hosted email, or self-hosted Microsoft Exchange. As you might be able to tell from the title of this article, we opted for the greater control and privacy of a self-hosted solution.
There were a number of concerns shared and considered, from security to telemetry to cost, and the Technical Operations and Information Technology teams worked closely to do our best to address each of those concerns. As a result, we ended up with something that worked extremely well for us under the given parameters, and performed admirably on the security front, even preventing the recent Hafnium breach that affected so many other self-hosted Exchange users. Since the acquisition, we no longer have the same specific needs, so we pivoted again and found we no longer needed our Exchange “Walled Garden”. This presented an opportunity to share what we learned without potentially opening ourselves up to additional risk.
Initial Steps To Set Up A Self-Hosted Exchange
Specific constraints (including cost) meant we could not follow all of the external security recommendations. As always, the reader is encouraged to be aware of the manufacturer and security experts’ recommendations on the configuration of the software, and simply consider the information within this article as a supplement or educational material. No guarantee is offered that following this article will in fact secure your environment even as far as it did ours.
Our requirements were as follows:
- Provide dozens of users with redundant service from data centers on both American coasts
- Enable maintenance without impacting users
- Keep our servers up to date
- Keep our data secure
- Allow auto-configuration of desktop and mobile mail clients over HTTPS and IMAPS/SMTPS
- Keep the costs as low as reasonable
Since this was not the primary task for either of the project members, the design, architecture, testing, and implementation ended up taking over a year of calendar time, during which our company continued to use the self-hosted open source solution we were replacing. It cannot be stressed enough that planning something like this takes time, effort, and testing if you want to do it right.
Architecting A More Secure Solution
Scoping the task was the first major step. Early on in the proof of concept and development phase, we made minimizing the number of Windows licenses a primary concern. Part of this decision was cost, part of it was minimizing the workload to secure and maintain an unfamiliar (to us) operating system, and part of it was a preference for open source software. Originally, that meant we planned for just two Microsoft Windows Server 2016 licenses, though it ended up growing to three by the end. Although Windows Server 2019 was newly available, we selected Server 2016 because we felt a more mature version of an unfamiliar operating system was more likely to provide stability and satisfy our security requirements.
We determined that the roughly 50 users had over a terabyte of email, mostly in small files. Since Exchange stores the emails in a database, that means that sufficient IOPS would be critical. This required us to use solid-state drives (SSDs) with a minimum aggregate size of 4 TB for the Exchange servers. We wanted the data to be “multiply redundant”, meaning local Redundant Array of Independent Disks (RAID), remote replication of all data, and backups. The hardware we were using did not have hardware RAID controllers, as we typically use Linux software RAID. Due to the lack of reliable Windows drivers, as well as the absence of a TPM for this specific hardware, the additional work that would have been required to get Windows software RAID working properly exceeded the time we had available. The solution was to install Linux and have it manage the RAID, set up a KVM hypervisor with LVM and provide near-block-level access to the Windows installation as a VM. This gave us familiar Linux tools for troubleshooting hardware, networking connectivity, routing, and even firewall control outside of the Windows environment.
We had to establish secure communications between the two data centers while preventing the Windows servers from talking to the outside world. We set up a Linux virtual machine as a proxy and gateway in each datacenter. This provided access to the public and private virtual local area networks (VLANs), and also to a secured VLAN which would only be accessible to the systems inside the secured Windows network. The only way in or out of the secured environment would be through the proxies. We established an IPsec tunnel between the two networks and enabled communications between the two Windows systems. Counter to the guidance Microsoft provides, we configured both as domain controllers because we were relying on external security to protect Active Directory and Exchange. We installed Microsoft Exchange 2016 and configured replication between the two sites.
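The isolation rule described above can be modeled as a default-deny table of permitted flows between network zones. The Python sketch below is only an illustration of that idea, not the firewall configuration that was deployed; the zone names and the `flow_allowed` helper are hypothetical.

```python
# Zone names are hypothetical labels for the networks described above.
ALLOWED_FLOWS = {
    ("internet", "proxy"),   # outside clients reach only the proxy/gateway
    ("proxy", "internet"),   # the proxy may talk outward on their behalf
    ("proxy", "secured"),    # the proxy forwards vetted traffic inward
    ("secured", "proxy"),    # Windows hosts reply only through the proxy
    ("secured", "secured"),  # site-to-site traffic carried by the IPsec tunnel
}


def flow_allowed(src_zone: str, dst_zone: str) -> bool:
    """Default deny: a flow passes only if it is explicitly listed."""
    return (src_zone, dst_zone) in ALLOWED_FLOWS


print(flow_allowed("internet", "proxy"))    # permitted
print(flow_allowed("secured", "internet"))  # denied: no direct path out
```

The point of the table is what it omits: there is no entry that lets the secured zone reach the Internet directly, in either direction.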
Now we had two Exchange VMs talking to each other, but we weren’t out of the woods yet. Any time the two VMs lost communications with each other, they both thought they were authoritative, which stopped replication, and therefore a manual failover was required for clean resumption. This failed the “redundant service” requirement, so we installed a third Windows Server and set it up as an SMB host, to act as the third party for a database availability group (DAG). It also provided us with the ability to set up a Windows Server Update Services (WSUS) server to enable automated updates. This third server was also denied access to the Internet, except for manually enabled time windows where we synchronized the WSUS catalog, which itself was pared down to only the software we used. With that, the Exchange cluster could now automatically fail over between sites when one of the systems was rebooted for maintenance or updates.
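Why a third voter resolves the stalemate can be shown with a simplified majority-vote model. This Python sketch is an assumption-laden illustration of quorum arithmetic in general, not a description of how Windows failover clustering is implemented; the `has_quorum` helper is hypothetical.

```python
def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """A partition may stay active only if it holds a strict majority."""
    return reachable_votes > total_votes // 2


# Two Exchange nodes, no witness: after a link failure each side
# sees only its own vote, so neither side holds a majority.
print(has_quorum(1, 2))

# Two nodes plus a witness: the side that can still reach the
# witness holds 2 of 3 votes and continues; the other side stops.
print(has_quorum(2, 3))
print(has_quorum(1, 3))
```

With an even number of voters a clean split leaves no majority anywhere, which is why the witness has to sit where at least one site can still reach it during an outage.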
Having an Exchange environment is only useful if it provides services to the outside world, and these servers now completely lacked the ability to interact with the open Internet. The secured environment would not stay secured long with client systems directly accessing it. So we determined the exact stack of protocols and services that would reach the Windows systems and then set about limiting the potential for compromise.
HAProxy provided almost all of the access functionality that we required. We first looked at TLS. We could handle the TLS handshake externally using a specific set of protocols and ciphers to maximize security with the equipment we approved. We blocked TLS below version 1.2 and excluded all ciphers with known weaknesses, preferring elliptic-curve cipher suites. But TLS offloading isn’t sufficient to secure everything. For HTTPS, we limited access to specific subpaths and validated input where possible, allowing for OWA, OMA and RPC over HTTPS. For other protocols, we used other software to handle the connections.
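Limiting HTTPS access to specific subpaths amounts to an allow-list check on the normalized request path. The Python sketch below illustrates that technique in isolation; it is not the HAProxy configuration described above, and the prefixes in `ALLOWED_PREFIXES` are hypothetical stand-ins for the OWA, OMA and RPC over HTTPS paths.

```python
import posixpath
from urllib.parse import unquote, urlsplit

# Hypothetical prefixes standing in for the permitted services.
ALLOWED_PREFIXES = ("/owa/", "/oma/", "/rpc/")


def is_allowed(url: str) -> bool:
    """Return True only if the normalized path falls under an allowed prefix."""
    path = unquote(urlsplit(url).path)
    # Collapse "." and ".." segments so traversal cannot escape a prefix.
    normalized = posixpath.normpath(path)
    if path.endswith("/") and normalized != "/":
        normalized += "/"  # normpath strips the trailing slash; restore it
    normalized = normalized.lower()
    return any(
        normalized.startswith(prefix) or normalized + "/" == prefix
        for prefix in ALLOWED_PREFIXES
    )


print(is_allowed("https://mail.example.com/owa/auth/logon.aspx"))
print(is_allowed("https://mail.example.com/ecp/default.aspx"))
print(is_allowed("https://mail.example.com/owa/../ecp/default.aspx"))
```

Normalizing before matching matters: a naive prefix test on the raw string would accept the third URL, even though it resolves to a path outside the allow-list.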
Second, we needed Lightweight Directory Access Protocol (LDAP) access for authentication with a slew of third-party services. We configured 389-ds, an open source LDAP server for Linux, using onsite redundant CentOS VMs, and used LDAP replication and password sync software to replicate to and from Active Directory. HAProxy pointed all authentication connections from the outside to 389-ds, and only Exchange and IIS used Active Directory directly.
Third, we needed email to get in and out of the environment, so we installed Postfix, with amavisd, ClamAV, and SpamAssassin to filter inbound and outbound email. We configured the MX records to point external email to the Postfix hosts, and configured Exchange to relay all outbound email through them without DNS lookups. SPF and DKIM records allowed recipients to validate the email we sent.
While it protected us from multiple widespread threats occurring during the time the servers were deployed, this setup was not without issues. The first complication was a set of fundamental schema differences between 389-ds and Active Directory / Exchange, which had to be slowly mapped out to ensure all attributes and values were preserved. The second complication was unreliable replication of Distribution Lists and Security Groups from Exchange to 389-ds: we eventually made it standard procedure to create those lists on the Exchange server first, let them replicate back to 389-ds, and verify that they had done so.
Now email could get in and out, desktop and mobile clients were able to access the system, services could authenticate through LDAP, and all that was left was IMAPS. We allowed access from clients connected to the corporate VPN, and handled TLS at HAProxy to enable stronger ciphers than the Windows IMAP Server supported. For the limited audience utilizing this protocol, these protections were determined to be sufficient. We disabled IMAP access for all accounts that had not specifically requested it.
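The IMAPS policy combines two conditions: the account must have opted in, and the client must be on the corporate VPN. A minimal Python sketch of that check follows, assuming a hypothetical VPN address range and user set; neither value comes from the deployment described here.

```python
import ipaddress

# Both values are hypothetical placeholders for illustration.
VPN_NETWORK = ipaddress.ip_network("10.8.0.0/24")
IMAP_ENABLED_USERS = {"alice"}


def imap_allowed(user: str, client_ip: str) -> bool:
    """Permit IMAPS only for opted-in users connecting from the VPN range."""
    on_vpn = ipaddress.ip_address(client_ip) in VPN_NETWORK
    return user in IMAP_ENABLED_USERS and on_vpn


print(imap_allowed("alice", "10.8.0.15"))    # opted in, on VPN
print(imap_allowed("alice", "203.0.113.9"))  # opted in, off VPN
print(imap_allowed("bob", "10.8.0.15"))      # on VPN, never opted in
```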
Then came logging. Having everything pass through the proxy enabled us to control a log that Exchange and Windows had no access to, which gave us an additional layer of security. HAProxy logged all URLs visited, all authentication attempts, and all IMAPS connections. Postfix logged every SMTPS connection, every message to and from every user, and all scanning performed. All of these logs were replicated to an additional syslog server. This became critical later when attacks like Hafnium / ProxyShell spread across the Internet like wildfire. We were able to verify that such attacks were attempted, but ultimately failed due to our Walled Garden approach. The Exchange servers were not able to download exploit payloads, nor would they have been able to upload / exfiltrate data. Even though we still looked vulnerable to scans, log analysis and deep investigation for other indicators of compromise such as specific file creation showed no evidence of compromise.
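Searching proxy logs for exploit attempts can be as simple as extracting the request line from each entry and matching the URL against known-bad patterns. The Python sketch below assumes each log line contains a quoted request of the form `"METHOD URL HTTP/x.y"`. The sample lines are made up, and the single pattern (an autodiscover URL that also references PowerShell or MAPI, as publicly reported for ProxyShell) is illustrative rather than a complete indicator list.

```python
import re

# Matches the quoted request portion of a log line.
REQUEST_RE = re.compile(r'"(?P<method>[A-Z]+) (?P<url>\S+) HTTP/[\d.]+"')

# Illustrative pattern only; a real hunt would use a vetted indicator list.
SUSPICIOUS = (
    re.compile(r"/autodiscover/autodiscover\.json.*(powershell|mapi/nspi)", re.I),
)


def suspicious_requests(lines):
    """Return the URLs of requests that match any suspicious pattern."""
    hits = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and any(p.search(match.group("url")) for p in SUSPICIOUS):
            hits.append(match.group("url"))
    return hits


# Made-up sample lines in a simplified format.
SAMPLE_LINES = [
    '203.0.113.7 "GET /autodiscover/autodiscover.json?@evil.example/powershell/ HTTP/1.1" 403',
    '198.51.100.2 "GET /owa/ HTTP/1.1" 200',
]

for url in suspicious_requests(SAMPLE_LINES):
    print(url)
```

Because the logs live on the proxy and a separate syslog server, a search like this remains trustworthy even if the Windows hosts themselves were in doubt.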
The rest of security came down to creating and following documented procedures. We installed the Windows Updates as quickly as we could, unblocking WSUS internet access long enough to pull the updates, then restoring the block before update installation. We would pull Exchange Cumulative Updates and place them inside an ISO, which we would mount to the Exchange VMs as DVDs for installation. We monitored access logs and event logs. At no point from creation to decommission were the Windows systems that ran Exchange ever allowed to talk directly to the Internet. We were able to use LVM snapshots and qemu-img to get point in time backups of the Windows VMs, including the Exchange databases.
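The "unblock, pull, re-block" step is a pattern where the cleanup must run even if the download fails. One way to express that guarantee is a context manager, sketched below in Python. The `FakeFirewall` class and `egress_window` helper are hypothetical stand-ins; the actual procedure was a documented manual process, not this code.

```python
from contextlib import contextmanager


class FakeFirewall:
    """Hypothetical stand-in for whatever controls WSUS Internet access."""

    def __init__(self):
        self.egress_open = False

    def allow_egress(self):
        self.egress_open = True

    def block_egress(self):
        self.egress_open = False


@contextmanager
def egress_window(firewall):
    """Open outbound access, and always restore the block afterwards."""
    firewall.allow_egress()
    try:
        yield
    finally:
        firewall.block_egress()  # runs even if the sync raises


fw = FakeFirewall()
try:
    with egress_window(fw):
        raise RuntimeError("catalog sync failed")  # simulated failure
except RuntimeError:
    pass
print(fw.egress_open)  # the block is back in place despite the error
```

Tying the re-block to the end of the window, rather than to a separate step someone must remember, is what keeps a failed sync from leaving the server exposed.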
Conclusion
The threat model required for hosting Exchange Server has changed over the last few years as attacks on Exchange have increased in velocity and severity. We at Farsight Security took a unique approach to designing and building our self-hosted Exchange solution, which met our needs for a secure, reliable groupware solution. This solution required the use of Linux hypervisors, VPNs, proxy servers, and LDAP and mail servers to create our “Walled Garden” for Exchange. Even though it was complex to operate, this solution worked extremely well for us under the given parameters, and performed admirably on the security front. Even though this approach differs from other best practices around the use of Exchange, we hope this article can highlight different ways that such critical systems could be secured and remain useful to all employees.
It’s important to remember that each of these elements used in our solution takes time to design, create and deploy. Without the buy-in from management to spend the time to appropriately monitor, update, and manage the systems involved, there is no security system that will remain secure.
This article was co-authored by Ian Campbell and edited by Sean McNee, Ph.D.