updated: 2022-09-08
(under construction…)
Introduction
Description: I wanted to review the best practices for inspecting AWS traffic and how such a setup would be architected. I ended up going through a few iterations of different designs, weighing the pros and cons of each. This is a space for some of those designs and the online resources I used.
I hope this becomes a sort of one-stop shop for evaluating the different methodologies that can be leveraged to secure an AWS cloud solution. It is focused on Palo Alto, but also covers what the alternative, natively recommended solutions would leverage.
See also new
Background
I was going back and forth on different design concepts within AWS. Primarily, all the documentation that I came across defaulted to native AWS tools to secure a typical 3-tier design. But I had a few issues with this.
- What if we don’t have enough developers to be able to create the proper security groups needed to secure an AWS infrastructure leveraging only AWS Security?
- What if you have different requirements that need deep level packet inspection and full visibility into traffic leaving your AWS environment to help protect against exfiltration or nodes accessing resources that they shouldn’t?
- Is there a better way to monitor my compute instances and dynamically create rules/policies, similar to what we already have deployed in our more traditional datacenter?
This is where I began looking at how to deploy Palo Alto NGFW into the AWS infrastructure. At first it was a simple outbound inspection requirement. But as things progressed, different designs started to evolve from the same basic design.
Resources
Great video breakdown by Ralph Carter:
I think he does an amazing breakdown of the Palo Alto single-arm methodology, which I prefer over a more traditional dual-arm method. It differs slightly from the version Palo Alto recommends (which I will go over later). It is very powerful, and from it you can solve pretty much any problem. I do prefer separating traffic using subinterfaces, and find it more secure to redirect traffic to a subinterface than to use the main interface. It lets you restrict the GWLB healthchecks to the main interface while keeping the management interface locked down (see VPC sub interface association document).
Design Concepts and Overview
Basic conceptual design overview:
Let’s break this down for a moment, since it may be a lot.
- We have a few simple route tables that are responsible for directing traffic flow.
- Each VPC is considered a “spoke vpc” and therefore should only contain a single default route to the “Security VPC”.
- The “spoke” or “application” VPC should have no clue how to get to anything but the transit gateway, which leads to the “Security VPC” (also known as the “Appliance VPC” or “Shared VPC”, whichever term you want to use).
- The “Security VPC” is a specialized VPC that is set up in appliance mode. See the AWS Modify Transit Gateway VPC Attachment documentation to see how to create this.
As of right now you can only modify this via the AWS CLI. A sample command would look something like this (if you need --no-verify-ssl, or if, like me, you have multiple profiles in your ~/.aws/credentials file, you would also add --profile to the command, but you should be able to figure this out):
# aws ec2 modify-transit-gateway-vpc-attachment --options "ApplianceModeSupport=enable" --transit-gateway-attachment-id <YOUR TGW ATTACHMENT HERE> --region <YOUR REGION HERE>
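If this has to be repeated across several attachments or accounts, the same command can be assembled programmatically. The following is a minimal Python sketch, not an official tool: the function name `appliance_mode_command` and the example attachment ID, region, and profile are made up for illustration. It only builds the argument list for the CLI call shown above and appends `--profile` when one is given.

```python
import shlex
from typing import List, Optional


def appliance_mode_command(attachment_id: str, region: str,
                           profile: Optional[str] = None) -> List[str]:
    """Build the argv for enabling appliance mode on a TGW VPC attachment."""
    cmd = [
        "aws", "ec2", "modify-transit-gateway-vpc-attachment",
        "--options", "ApplianceModeSupport=enable",
        "--transit-gateway-attachment-id", attachment_id,
        "--region", region,
    ]
    if profile:
        # Only needed with multiple profiles in ~/.aws/credentials.
        cmd += ["--profile", profile]
    return cmd


if __name__ == "__main__":
    # Hypothetical values; substitute your own attachment ID and region.
    argv = appliance_mode_command("tgw-attach-0123456789abcdef0", "us-east-1",
                                  profile="security")
    # Print rather than execute, so nothing changes until you run it yourself.
    print(shlex.join(argv))
```

To actually apply the change, the returned list can be handed to `subprocess.run(argv, check=True)` on a machine where the AWS CLI is installed and authenticated.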
- The “Security VPC” is also set up to allow cross-availability-zone traffic in order to offer redundancy between the firewalls.
- Once the traffic hits the “Security VPC”, all traffic is redirected via the Transit Gateway attachment interface (ENI) to the corresponding Gateway Load Balancer Endpoint (GWLBe). You accomplish this by using a separate subnet for your TGW attachment (which should be standard practice for all your VPCs).
- The GWLBe is now the target that is added to the route table of the Transit Gateway attachment subnet, where you direct all traffic to the GWLBe.
- In subdesigns you can split out the traffic by using multiple GWLBe and sending specific subnets to each GWLBe.
- This is still the design Palo Alto recommends in their most recent centralized security VPC documentation, which advises splitting up east/west and outbound traffic via two separate GWLBe (reference: Deep Dive).
- There is debate on this design and some may consider it incorrect, but I have gone back and forth.
- The design adds complications, which makes it more difficult to manage.
- You now need to ensure that you are associating your VPC endpoint with the correct subinterface on the Palo Alto, which will now include new zone configurations and a rule base built on that subinterface.
- If you choose not to do this and just direct all traffic to a single Ethernet interface on the Palo Alto, you no longer need to create subinterfaces or manage them.
- Yet for the GWLB to operate, you need to enable a management profile on the main Ethernet interface, which is used to monitor the health of the interface. That profile could therefore be used to manage the firewall if someone gains access to the firewall.
- So, my argument for creating the subinterfaces and linking the VPC endpoints to them is twofold:
- You control the interface management profile, and no actual traffic will be going through this interface, which keeps it from being accessed by any other traffic.
- You can now create two different zones, an “outbound” zone and an “east_west” zone, and control your policies accordingly.
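The traffic flow described in the list above can be modeled as a handful of route tables with longest-prefix matching. This is a conceptual sketch only: the table names, the CIDR ranges, and the target labels (`tgw`, `gwlbe-east-west`, `gwlbe-outbound`) are hypothetical, and it assumes the split design with 10.0.0.0/8 covering every spoke VPC. It does not call any AWS API.

```python
import ipaddress

# Hypothetical route tables: destination CIDR -> next hop label.
# The implicit local VPC route is left out to keep the model small.
ROUTE_TABLES = {
    # Spoke VPC: a single default route pointing at the transit gateway.
    "spoke-vpc": {"0.0.0.0/0": "tgw"},
    # Security VPC, TGW attachment subnet: hand everything to a GWLBe.
    # Split design: internal ranges to one endpoint, the rest to another.
    "security-tgw-attach-subnet": {
        "10.0.0.0/8": "gwlbe-east-west",
        "0.0.0.0/0": "gwlbe-outbound",
    },
}


def next_hop(table: str, destination: str) -> str:
    """Return the target of the most specific route matching destination."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in ROUTE_TABLES[table].items()
        if dest in ipaddress.ip_network(cidr)
    ]
    if not matches:
        raise LookupError(f"no route to {destination} in {table}")
    # Longest prefix wins, as in a VPC route table.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


if __name__ == "__main__":
    for dst in ("10.2.0.15", "8.8.8.8"):
        print(dst, "->", next_hop("spoke-vpc", dst), "->",
              next_hop("security-tgw-attach-subnet", dst))
```

Collapsing the two entries in the attachment subnet table into a single default route gives the simpler single-GWLBe variant, at the cost of losing the separate zones.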
How it works and the problems
The design is widely used and works out very well, especially for inspecting outbound and east/west traffic. But let’s focus for a moment on one of the most critical pieces: the inbound VPC.
Inbound Inspection
In this design you are going through a centralized point for inspection. But here are the caveats:
- You need to ensure your certificate management system is up to date and all certificates are
- loaded into your panorama device
- pushed to all firewalls
- all rules are referencing the new certificate each time it changes
- there is an issue with Palo Alto if you use the same name on a certificate; I’ve noticed it may cause Panorama to bug out.
See Cloud NGFW Certificate Management
- Now that you have all that down, you need a load balancer between your IGW and your GWLBe (Gateway Load Balancer Endpoint) to achieve redundancy across the AZs inside the region the application is deployed in. Yet what AWS doesn’t tell you is that it’s impossible to build an NLB in front without stripping out the source IP.
I’ve been in discussions about this and have yet to set it up in a lab, but apparently there may be a way around it: use an Application Load Balancer and preserve the X-Forwarded-For header. This should preserve the IP and let the application know where the traffic is coming from, but I have yet to fully iron out this design.
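With an ALB in front, the application would read the client address from the `X-Forwarded-For` request header instead of the TCP source address. As a hedged sketch of that step, assuming the ALB runs in its default mode (where it appends the address it saw to the end of the header) and is the only proxy in the path; the function name `client_ip_from_xff` is made up for illustration:

```python
import ipaddress
from typing import Optional


def client_ip_from_xff(header: Optional[str]) -> Optional[str]:
    """Return the address appended by the closest proxy, or None.

    Entries further left may have been supplied by the client itself,
    so only the right-most entry is treated as trustworthy here.
    """
    if not header:
        return None
    candidate = header.split(",")[-1].strip()
    try:
        # Validate and normalise; rejects junk such as "unknown".
        return str(ipaddress.ip_address(candidate))
    except ValueError:
        return None
```

If more proxies sit between the ALB and the application, each one appends its own view, and the entry to trust moves further left accordingly.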
It is for the above reasons I sit back and still ponder what Palo Alto and AWS are thinking in offering this as your end all solution for Inbound traffic.
This leads to a new design leveraging an external CDN that does the deep-level decryption on Palo Alto’s behalf, allowing you to decrypt, inspect, and load balance the traffic between availability zones for full regional redundancy.
Deep Dive
Considerations:
- In this design I did decide to split out the VPCEs so that they terminate on subinterfaces. There is a debate to be had here. You really don’t need to, and it adds a layer of complexity that may not be what you want, but in my case I like it because:
- I’m able to open up a management profile only on the main interface, which is necessary to permit the healthchecks to connect to the NGFW instance. This lets me maintain only a management profile open on port 80, because the GWLB, which speaks GENEVE to the firewall, sends those healthchecks to the main interface.
- I’m able to create different zones to better manage my rules. Since we are dealing with a single arm, it can be hard to tell where the traffic is heading. I can now specify one subinterface for only east/west and another for outbound.
All in all, you can always decide to just send everything to the main interface, build out the security management profile on it, and keep everything in the same zone. Ultimately it comes down to what works best for you.
Automation
How do we dynamically deploy and create rules around this design to simplify and keep up with cloud development?
Limitation
Panorama can retrieve a total of 32 tags for each virtual machine: 11 predefined tags and up to 21 user-defined tags. The number of tags used impacts the total number of IP addresses you can monitor. For example, Panorama can retrieve 10,000 IP addresses with 13 tags for each, or it can retrieve 5000 IP addresses with 25 tags for each.
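The two data points above suggest that the limit behaves roughly like a budget of IP-and-tag pairs. The sketch below only does that arithmetic. Treating the smaller of the two products as the budget is an assumption made here for rough planning, not a figure published by Palo Alto, and the function names are made up.

```python
# The two capacity points quoted above: (tags per IP, max IP addresses).
DOCUMENTED_POINTS = [(13, 10_000), (25, 5_000)]
MAX_TAGS_PER_VM = 32  # 11 predefined + 21 user-defined


def ip_tag_pairs(tags_per_ip: int, ip_count: int) -> int:
    """Total number of IP-and-tag pairs for a given load."""
    return tags_per_ip * ip_count


def estimated_max_ips(tags_per_ip: int) -> int:
    """Rough estimate, ASSUMING the smaller documented product is the budget."""
    if not 1 <= tags_per_ip <= MAX_TAGS_PER_VM:
        raise ValueError(f"tags_per_ip must be between 1 and {MAX_TAGS_PER_VM}")
    budget = min(ip_tag_pairs(t, n) for t, n in DOCUMENTED_POINTS)
    return budget // tags_per_ip


if __name__ == "__main__":
    for tags, ips in DOCUMENTED_POINTS:
        print(f"{ips} IPs x {tags} tags = {ip_tag_pairs(tags, ips)} pairs")
    print("estimate at the full 32 tags:", estimated_max_ips(32), "IPs")
```

The estimate is deliberately conservative: at 13 tags it lands slightly below the documented 10,000 addresses, so it is only a planning aid and the vendor figures take precedence.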
WARNING: The tagging you are able to handle depends on the version of the AWS plugin you use (2.0.0 is the last to support this). The standard plugin is centered on EC2 instance monitoring only, and once you migrate to version 3.0.0+ you lose the ability to monitor things like EKS, which is critical for EKS migration. This gets handled in the EKS plugin, but there are more complications around this that would need another blog post to go over.
EKS Plugin
Caution if you are looking to monitor EKS: Palo Alto removed support for EKS tag monitoring in version 3+, requiring you to purchase the Palo CN-Series firewall, which integrates with the Panorama plugin but is a whole new product line that handles east/west traffic and secures traffic within the workload itself. This would require an additional process to run inside the workload, which takes resources away from your workloads depending on the way it gets deployed.
An alternative way of protecting your EKS workloads would be to leverage The Complete Guide to Kubernetes Security with Prisma Cloud, although it doesn’t allow migrating EKS clusters between environments. I’ve been in conversations with
AWS Plugin
AWS Plugin major release version 2 is the original plugin that handles EC2 and EKS monitoring. It is old, and Palo has since released major releases 3 and 4 with other enhancements as well as updated source code. But see the EKS Plugin section above for what Palo Alto decided to do when they upgraded to version 3+.
Palo Alto documentation on how to install and configure AWS Plugin 2.x is located at Set Up the AWS Plugin for VM Monitoring on Panorama.
Special note: if your instance sits behind a firewall, you need to create a custom URL Category object and allow the following (they don’t inform you of this):
- iam.amazonaws.com/
- ec2.^.amazonaws.com/
- sts.amazonaws.com/
- elasticloadbalancing.^.amazonaws.com/
Note: I use ^ instead of * to permit only a single value between the subdomains. But if you are sure of the regions you are monitoring you can replace that with the specific regions you are setting up your plugin to monitor.
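To make the difference between the two wildcards concrete, here is a simplified Python model of the matching. It is a sketch under stated assumptions, not the PAN-OS matching engine: it only looks at hostnames, treats `^` as exactly one dot-separated label and `*` as one or more labels, and ignores the trailing `/` and every other separator. The function names are made up.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a hostname pattern with ^ and * wildcards into a regex.

    Simplified model: ^ stands for exactly one dot-separated label,
    * for one or more labels. The trailing "/" of the entries is dropped.
    """
    host = pattern.rstrip("/")
    parts = []
    for label in host.split("."):
        if label == "^":
            parts.append(r"[^.]+")
        elif label == "*":
            parts.append(r"[^.]+(?:\.[^.]+)*")
        else:
            parts.append(re.escape(label))
    return re.compile(r"\.".join(parts), re.IGNORECASE)


def matches(pattern: str, host: str) -> bool:
    """True when the whole hostname fits the pattern."""
    return pattern_to_regex(pattern).fullmatch(host) is not None


if __name__ == "__main__":
    for host in ("ec2.us-east-1.amazonaws.com", "ec2.a.b.amazonaws.com"):
        print(host,
              "| ^ :", matches("ec2.^.amazonaws.com/", host),
              "| * :", matches("ec2.*.amazonaws.com/", host))
```

In this model the `^` entry accepts a single region label and rejects anything with extra subdomain levels, which is the narrowing the note above is after.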