I have a 5-node Greenplum cluster in Amazon Web Services with Ambari. I have the following ports open in my security group (all TCP): 80, 50030, 28080, 8080, 5432, 50075, 8441, 50020, 50010, 443, 9000, 50070, 22, 50475, 8021, 8440, 50470, 8020, 50060, 50090, 51111. I can't start services in Ambari; they only start when I add an "all traffic" rule to my AWS security group. Which ports do I have to open, and which ports in my security group can I close? Thanks for the help!

1 Answer

When you deploy in AWS, you'll need to create several resources.

  1. Availability Zone: You will want to use just one AZ when deploying any big data product in Amazon.
  2. VPC: Just like the AZ, create just one VPC. You can also configure the VPC to enable DNS hostnames, which makes things easier.
  3. Subnet: A subnet specifies the IP address range for the hosts you deploy. You'll also want the subnet to automatically assign an IP address on launch.
  4. Security Group: When you create a VPC, a security group will automatically get created. You can use this or create another one. This is likely where you are having problems.

Allow TCP ports 0-65535, UDP ports 0-65535, and ping (ICMP, port -1) with the security group itself as the source, so that the cluster nodes can reach each other on any port. Do not open all of this to the world (0.0.0.0/0).

For example:

aws ec2 authorize-security-group-ingress --group-id $security_group_id --protocol tcp --port 0-65535 --source-group $security_group_id
aws ec2 authorize-security-group-ingress --group-id $security_group_id --protocol udp --port 0-65535 --source-group $security_group_id
aws ec2 authorize-security-group-ingress --group-id $security_group_id --protocol icmp --port -1 --source-group $security_group_id

Next, get your IP address. Here is a neat way to do it:

security_cidr=$(wget http://ipecho.net/plain -O - -q ; echo "/32")

Now allow SSH (port 22) and database connections (port 5432) from your IP address only.

aws ec2 authorize-security-group-ingress --group-id $security_group_id --protocol tcp --port 22 --cidr $security_cidr
aws ec2 authorize-security-group-ingress --group-id $security_group_id --protocol tcp --port 5432 --cidr $security_cidr

  5. Gateway: Create a gateway and attach it to your VPC.
  6. Route: The VPC will have a route table created automatically. Add a route to that table with a destination of 0.0.0.0/0 that goes through the gateway. This allows the cluster to communicate over the Internet, but your security group still only allows your IP address to connect.
  7. Placement Group: Creating this puts all your nodes together for better performance.
  8. Deploy instances: Use the EBS-optimized option and dedicated tenancy. If using EBS storage, use the st1 disk type. If using ephemeral storage, use RAID 0 with as many volumes as possible. Don't create just a single RAID 0 mount.
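
The VPC, subnet, gateway, route, and placement group steps could be scripted roughly as follows. This is a minimal sketch, not a tested deployment script: the function name `create_cluster_network`, the Availability Zone, the CIDR ranges, and the placement group name are all placeholder assumptions, and "assign the IP address on launch" is read here as auto-assigning public IPs. It does not launch any instances.

```shell
#!/usr/bin/env bash
# Sketch only: creates one VPC, one subnet, an Internet gateway, a default
# route, and a cluster placement group. All argument values are assumptions.

create_cluster_network() {
  local az="$1" vpc_cidr="$2" subnet_cidr="$3" pg_name="$4"
  local vpc_id subnet_id igw_id rtb_id

  # One VPC, with DNS hostnames enabled.
  vpc_id=$(aws ec2 create-vpc --cidr-block "$vpc_cidr" \
    --query 'Vpc.VpcId' --output text) || return 1
  aws ec2 modify-vpc-attribute --vpc-id "$vpc_id" \
    --enable-dns-hostnames '{"Value":true}' >/dev/null || return 1

  # One subnet in a single AZ that assigns public IPs at launch.
  subnet_id=$(aws ec2 create-subnet --vpc-id "$vpc_id" \
    --cidr-block "$subnet_cidr" --availability-zone "$az" \
    --query 'Subnet.SubnetId' --output text) || return 1
  aws ec2 modify-subnet-attribute --subnet-id "$subnet_id" \
    --map-public-ip-on-launch >/dev/null || return 1

  # Internet gateway attached to the VPC.
  igw_id=$(aws ec2 create-internet-gateway \
    --query 'InternetGateway.InternetGatewayId' --output text) || return 1
  aws ec2 attach-internet-gateway --internet-gateway-id "$igw_id" \
    --vpc-id "$vpc_id" >/dev/null || return 1

  # Default route (0.0.0.0/0) through the gateway in the VPC's route table.
  rtb_id=$(aws ec2 describe-route-tables \
    --filters "Name=vpc-id,Values=$vpc_id" \
    --query 'RouteTables[0].RouteTableId' --output text) || return 1
  aws ec2 create-route --route-table-id "$rtb_id" \
    --destination-cidr-block 0.0.0.0/0 --gateway-id "$igw_id" >/dev/null || return 1

  # Cluster placement group to keep the nodes close together.
  aws ec2 create-placement-group --group-name "$pg_name" \
    --strategy cluster >/dev/null || return 1

  # Print the IDs needed later when launching instances.
  echo "$vpc_id $subnet_id"
}

# Example call (placeholder values):
# create_cluster_network us-east-1a 10.0.0.0/16 10.0.0.0/24 gp-cluster
```

The function prints the VPC and subnet IDs so they can be passed on to whatever launches the instances. The security group rules shown earlier still need to be applied separately.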

There are still many optimizations you'll need to do, such as enabling 10 Gb networking, installing the Intel network drivers, configuring the operating system, and formatting and mounting the disks.

The much easier solution is to use the Pivotal Greenplum Amazon Marketplace offering which has all of this already configured for you. It deploys the cluster in minutes too.

Jon Roberts
  • If I enable TCP ports 0-65535, UDP ports 0-65535, and ping (ICMP port -1) to my security group only, I can't access port 8080 from my browser (entering ***.***.***.***:8080). I can only do this when I allow all traffic :( – Joan Sánchez Escudero May 29 '17 at 10:28
  • I don't know why you are using Ambari with Greenplum, but if you want to access port 8080 from your computer, add an entry to the security group for TCP port 8080 and allow access from your IP address only. I would also change the default Ambari password. – Jon Roberts May 30 '17 at 14:21
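
The port 8080 rule suggested in the last comment might look like the sketch below. It follows the same pattern as the SSH and 5432 rules in the answer and assumes `$security_group_id` and `$security_cidr` are already set as shown there; the helper name `allow_tcp_from_cidr` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: opens a single TCP port to a single CIDR block.
# allow_tcp_from_cidr is a hypothetical helper, not part of the AWS CLI.

allow_tcp_from_cidr() {
  local group_id="$1" port="$2" cidr="$3"
  aws ec2 authorize-security-group-ingress --group-id "$group_id" \
    --protocol tcp --port "$port" --cidr "$cidr"
}

# Example call: Ambari web UI (8080) reachable from your own IP only.
# allow_tcp_from_cidr "$security_group_id" 8080 "$security_cidr"
```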