Technology

Welcome to the technology section. Here you’ll find information about the infrastructure and code that run Security Industry Cloud.

Resilience and security are at the heart of our technology and we cater for failure at every layer of our application.

Code

We have a small team of developers and a solutions architect, led by a highly experienced architect who has designed and delivered several services similar in nature to Security Industry Cloud.

A mixture of languages and technologies has been used to produce the service, including PHP, HTML, JavaScript, jQuery, Python and NodeJS.

The majority of the service is written in PHP using a custom lightweight framework, and it is hosted on Amazon Linux.

Infrastructure

Imagicloud (the company that owns SI Cloud) specialises in cloud computing and is an Amazon Web Services Consulting Partner. Our infrastructure has been built to a high specification and serves as a showcase, used to demonstrate security and operational excellence to clients considering Imagicloud to help them on their cloud journeys.

We use Amazon Web Services to run the majority of our services in a global spot architecture; we host some monitoring and supporting services on Google Cloud Platform.

Technology Overview

Architecture

Architecture is in the hands of our in-house architects, who are certified Amazon Web Services experts, led by Geraint, who has been delivering high-traffic, complex solutions for high-profile clients using AWS for more than 10 years.

We have opted for a fault-tolerant, cross-region, spot architecture which offers a strong balance of cost and reliability. We may experience very short outages from time to time, but they will generally be brief and limited to a small number of users; any services which are absolutely required to respond due to risk to life are served from a separate set of highly resilient servers.

Amazon Web Services

Amazon Web Services is the primary hosting provider for Security Industry Cloud. The platform is supported by our in-house team of cloud engineers, who are on call 24/7/365. We run a global spot architecture utilising a minimum of 2 regions, with all Availability Zones within each region in use at any given time. Our service defaults to running in eu-west-1 (Dublin, Ireland), eu-west-2 (London, UK) and us-east-2 (Ohio, US); when required, the service can automatically move itself to other areas of the globe in response to demand and failure.

We use the following AWS Services:

  • CloudFront – Caching for busy portals and static assets
  • CloudWatch – Monitoring AWS managed services such as load balancers
  • EC2 – Caching, Kafka, MariaDB, Memcache, Redis, Web Servers, Processing Servers
  • DynamoDB – Caching during high traffic periods
  • IAM – Security
  • Lambda – High traffic volume endpoints
  • S3 – Static assets & document storage
  • SES – Sending Email
  • SNS – Broadcasting to infrastructure components & Sending SMS Messages
  • SQS – Queuing when Kafka is too busy
  • Route 53 – DNS
  • VPC – Security

We have several environments, including Development, Testing, Staging and Production.
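
As an illustration of how the application code consumes some of the services listed above, below is a minimal sketch of storing a generated document in S3 using the AWS SDK for PHP. The bucket name, object key and region shown are illustrative placeholders rather than our actual configuration.

```php
<?php
// Minimal sketch: storing a generated document in S3 with the AWS SDK for PHP.
// Bucket name, object key and region are illustrative placeholders.
require 'vendor/autoload.php';

use Aws\S3\S3Client;
use Aws\Exception\AwsException;

$s3 = new S3Client([
    'region'  => 'eu-west-1',
    'version' => 'latest',
]);

try {
    $s3->putObject([
        'Bucket'               => 'example-sicloud-documents',
        'Key'                  => 'reports/2024/example-report.pdf',
        'SourceFile'           => '/tmp/example-report.pdf',
        'ServerSideEncryption' => 'AES256',
    ]);
} catch (AwsException $e) {
    // Log the failure; the calling code decides whether to retry.
    error_log('S3 upload failed: ' . $e->getAwsErrorMessage());
}
```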

Google Cloud Platform

Google Cloud Platform (GCP) is used minimally within our solution. A monitoring service is deployed here to provide a view from outside Amazon Web Services, and our status dashboard is also hosted here.

Services in use:

  • Cloud Storage
  • Compute
  • Cloud Functions

Monitoring & Security

We have in-depth monitoring of every layer of our infrastructure using a combination of tools, including:

  • CloudWatch – To monitor AWS managed services.
  • Grafana – To visualise metrics.
  • Prometheus – To gather information from various components.
  • Promtail – To capture statistics from log files.
  • Pushover – To notify us of any outages or unusual patterns (an example notification is sketched below).
  • Google Analytics – To observe general user behaviour.
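
Most alerts are raised automatically by the monitoring stack, but to illustrate how lightweight the Pushover integration is, here is a minimal sketch of pushing a notification to the on-call team from a PHP script. The API token, user key and alert text are placeholders.

```php
<?php
// Minimal sketch: sending an on-call alert via the Pushover REST API.
// The token, user key and message below are placeholders.
$fields = [
    'token'   => 'APP_TOKEN_PLACEHOLDER',
    'user'    => 'USER_KEY_PLACEHOLDER',
    'title'   => 'SI Cloud alert',
    'message' => 'Example: web tier latency above threshold in eu-west-2',
];

$ch = curl_init('https://api.pushover.net/1/messages.json');
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($fields));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

if (curl_exec($ch) === false) {
    error_log('Pushover request failed: ' . curl_error($ch));
}
curl_close($ch);
```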

A Web Application Firewall is attached to our edge ingress load balancers; this detects unusual behaviour and blocks common attack types.

The following additional security measures have been implemented:

  • No direct outbound Internet connectivity from any running component; an egress proxy server allows only whitelisted endpoints.
  • Remote access to running instances is disabled.
  • Daily security patching, including immediate patching in response to 0-day vulnerabilities.
  • Individual database passwords restricting what each component can query.
  • XSS & SQL injection protection at both the WAF and application layer (see the sketch after this list).
  • Internal-only services; all components are accessed via secure VPN.
  • Multi-factor authentication where possible.
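
To illustrate the application-layer side of the last two points, here is a minimal sketch of a component connecting with its own restricted database credentials and using prepared statements so that user input is never interpolated into SQL. The host, schema, table and environment variable names are illustrative placeholders.

```php
<?php
// Minimal sketch: a component connects with its own least-privilege credentials
// and only ever issues parameterised queries. Host, schema, table and column
// names are illustrative placeholders.
$dsn  = 'mysql:host=db.internal.example;dbname=sicloud;charset=utf8mb4';
$user = 'portal_readonly';              // component-specific, restricted account
$pass = getenv('PORTAL_DB_PASSWORD');   // injected at deploy time, never hard-coded

$pdo = new PDO($dsn, $user, $pass, [
    PDO::ATTR_ERRMODE          => PDO::ERRMODE_EXCEPTION,
    PDO::ATTR_EMULATE_PREPARES => false, // use native server-side prepares
]);

// User input is bound as a parameter, so it cannot alter the SQL itself.
$siteId = (int) ($_GET['site_id'] ?? 0);
$stmt   = $pdo->prepare('SELECT id, title FROM incidents WHERE site_id = :site_id');
$stmt->execute(['site_id' => $siteId]);

// Output encoding at render time provides the application-layer XSS protection.
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    echo htmlspecialchars($row['title'], ENT_QUOTES, 'UTF-8'), "\n";
}
```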

Caching

There are several layers of caching implemented across the infrastructure; some of the caching methods in use are detailed below.

  • CloudFront – Provides caching for the busiest customer portals.
  • NGINX – This is used to provide caching for our websites and SI Cloud requests; there are two layers of NGINX caching in place.
  • DynamoDB – This is used to provide additional capacity and availability in the event of unexpected high volumes of traffic.
  • Memcache – We use Memcache on all web servers; there is also a common cluster which all web and processing servers share (a read-through example is sketched below).
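
The read-through pattern described in the Memcache item above looks roughly like the following sketch, which assumes the PHP Memcached extension; the server address, cache key, TTL and helper function are illustrative.

```php
<?php
// Minimal sketch of read-through caching with the PHP Memcached extension.
// Server address, cache key, TTL and the helper function are placeholders.
$cache = new Memcached();
$cache->addServer('cache.internal.example', 11211);

$key   = 'portal:123:dashboard';
$value = $cache->get($key);

if ($cache->getResultCode() === Memcached::RES_NOTFOUND) {
    // Cache miss: rebuild the value from the database or another source...
    $value = buildDashboardData(123); // hypothetical helper
    // ...and keep it for 5 minutes so later requests are served from memory.
    $cache->set($key, $value, 300);
}
```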

Databases & Queues

We use a variety of databases to store different types of data.

  • DynamoDB – This is used during high traffic scenarios to provide additional capacity at short notice.
  • MariaDB (Galera) – Our primary database service; we have a global Galera cluster in an eventually consistent write-write configuration, with read replicas in some regions.
  • Memcache – In-memory data store used for caching.
  • Kafka – We use Kafka to queue actions for processing by our processing servers.
  • SNS – Amazon’s Simple Notification Service is used to send control messages to running components globally.
  • SQS – Amazon’s Simple Queue Service is used when Kafka is too busy to handle requests (see the sketch after this list).
  • Prometheus – Prometheus is a time series database which we use to monitor service usage and performance over time.
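
To make the relationship between Kafka and SQS concrete, the sketch below queues an action on Kafka and falls back to SQS if the produce does not complete in time. It assumes the php-rdkafka extension and the AWS SDK for PHP; the broker list, topic name and queue URL are illustrative placeholders, and the real code contains considerably more error handling.

```php
<?php
// Minimal sketch: queue an action on Kafka, falling back to SQS when the
// produce cannot be completed in time. Assumes the php-rdkafka extension and
// the AWS SDK for PHP; broker list, topic and queue URL are placeholders.
require 'vendor/autoload.php';

use Aws\Sqs\SqsClient;

function queueAction(array $action): void
{
    $payload = json_encode($action);

    $conf = new RdKafka\Conf();
    $conf->set('metadata.broker.list', 'kafka.internal.example:9092');

    $producer = new RdKafka\Producer($conf);
    $producer->newTopic('actions')->produce(RD_KAFKA_PARTITION_UA, 0, $payload);

    // Give the producer a short window to hand the message to a broker.
    if ($producer->flush(2000) !== RD_KAFKA_RESP_ERR_NO_ERROR) {
        // Kafka is too busy (or unreachable), so fall back to SQS.
        $sqs = new SqsClient(['region' => 'eu-west-1', 'version' => 'latest']);
        $sqs->sendMessage([
            'QueueUrl'    => 'https://sqs.eu-west-1.amazonaws.com/123456789012/actions-overflow',
            'MessageBody' => $payload,
        ]);
    }
}

queueAction(['type' => 'send_email', 'user_id' => 42]);
```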

Processing Servers

Our processing servers perform larger or long-running tasks such as sending email and SMS messages, cleaning the database, creating PDF files, checking for anomalies and so forth.

Tasks arrive at these servers through a variety of methods, including:

  • Database Tables
  • Kafka
  • SQS
  • HTTP Calls from the Web Application

Services are written in a variety of languages, including PHP, NodeJS and Python.
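
The sketch below shows a simplified example of the SQS arrival method: a processing-server worker long-polling the queue in PHP. The queue URL and task dispatcher are illustrative placeholders; the real services include far more error handling and observability.

```php
<?php
// Minimal sketch: a processing-server worker that long-polls SQS for tasks.
// Queue URL and the task dispatcher are illustrative placeholders.
require 'vendor/autoload.php';

use Aws\Sqs\SqsClient;

$sqs      = new SqsClient(['region' => 'eu-west-1', 'version' => 'latest']);
$queueUrl = 'https://sqs.eu-west-1.amazonaws.com/123456789012/processing-tasks';

while (true) {
    $result = $sqs->receiveMessage([
        'QueueUrl'            => $queueUrl,
        'MaxNumberOfMessages' => 10,
        'WaitTimeSeconds'     => 20, // long polling keeps empty receives cheap
    ]);

    foreach ($result->get('Messages') ?? [] as $message) {
        $task = json_decode($message['Body'], true);

        // Dispatch to the relevant handler (sending email, building a PDF, etc.).
        handleTask($task); // hypothetical dispatcher

        // Only delete the message once the task has completed successfully.
        $sqs->deleteMessage([
            'QueueUrl'      => $queueUrl,
            'ReceiptHandle' => $message['ReceiptHandle'],
        ]);
    }
}
```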

Web Application Layer

The web application layer sits behind the caching layer, and traffic is distributed across it by load balancers. These servers run Amazon Linux 2 and have no direct Internet access; they are entirely isolated for enhanced security.

The bulk of the application code is PHP; however, some endpoints are served by Lambda functions written in Python or NodeJS.

In the event of a local region database failure, traffic automatically redirects to the closest online region while the local region recovers.

The components which form the web application layer include:

  • AWS EC2 – Running the PHP application
  • AWS Lambda – High traffic endpoints
  • AWS S3 – Document & static asset storage
  • NGINX – Passing requests to php-fpm
  • PHP-FPM – Processing PHP requests
  • Grafana – Visual elements are rendered with Grafana and displayed within the portals

Automation

The following tools are used to automate the management of our platform:

  • Ansible – We use Ansible to define our Linux instances as code.
  • Terraform – Terraform is used to manage our cloud infrastructures.
  • Jenkins – We use Jenkins to build application packages, update running instances and perform a variety of backup management tasks.

Backup & Resilience

Our service is globally deployed and has been designed with resilience and security as priorities. Should a region fail, traffic will automatically redirect to the closest online region; some service interruption may be experienced during failover, but this typically lasts no more than 2-3 minutes.

We take daily database dump backups, and snapshots of running database servers once every 4 hours. We retain backups for up to 30 days.

Our monitoring solution is linked to our automation services and can perform automated fixes in some scenarios; when this is not possible, a notification is sent to the Imagicloud 24/7 infrastructure support team.