Genvid Forum

Timeout when doing terraform deploy


#1

Is there a built-in one-hour timeout while waiting for an AMI image to be installed during a terraform apply?

I ask because on today’s third attempt to apply the terraform infrastructure, it waits about an hour for the game instance to start up in AWS, then gives up, and I have to start over (running apply again destroys everything and starts from the beginning). After the timeout, the log reads:

Error applying plan:

5 error(s) occurred:

* module.cluster.module.services.aws_instance.internal_worker: 1 error(s) occurred:

* dial tcp <ipaddressremoved>:22: i/o timeout
* module.cluster.module.services.aws_instance.server: 1 error(s) occurred:

* dial tcp <ipaddressremoved>:22: i/o timeout
* module.cluster.module.services.aws_instance.public_worker: 1 error(s) occurred:

* dial tcp <ipaddressremoved>:22: i/o timeout
* module.cluster.module.game.aws_instance.game: 1 error(s) occurred:

* timeout
* module.cluster.module.services.aws_instance.encoding_worker: 1 error(s) occurred:

* dial tcp <ipaddressremoved>:22: i/o timeout

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.

I'm not sure why it is taking so long, as exactly the same image previously installed in about 35 minutes. Obviously that needs to be fixed, but it would be useful if this timeout could be changed somewhere, just in case it was actually going to complete in 65 minutes.

The game instance on EC2 says it is running, but I’m unable to connect to it via VNC, and the EC2 dashboard will not let me take a screenshot.

Adrian


#2

I believe this is occurring because the security group it creates isn’t adding an inbound rule for my public IP address. I believe it did that automatically previously. Where can this setting be updated?
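One way to confirm that the port 22 timeouts come from a missing inbound rule rather than from a slow instance is to test the TCP connection directly from the machine running terraform. The following is a minimal sketch using only the Python standard library; `can_reach` is a hypothetical helper name, and the host you pass in would be the instance's public IP.

```python
import socket


def can_reach(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for diagnosis only; it is not part of terraform or Genvid.
    """
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If `can_reach("<instance public IP>")` returns False while EC2 reports the instance as running, a blocked inbound rule is a plausible cause, though a service that has not started listening yet would look the same.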

Thanks,

Adrian


#3

Hi Adrian,

I’ve emailed you so we can schedule a call.

Thanks,
Sophie


#4

Hi Adrian,

Yes, you’re right. The inbound connections are set up based on two terraform variables: trusted_cidr and trusted_cidrs.

The first one is a string and is usually set to the right value when you run genvid-bastion update-global-tfvars, or simply when you add the -u option to genvid-bastion install (as in genvid-bastion install -u).

The second one is a list of strings representing a set of CIDR IP ranges, like the first one. Nothing modifies it automatically, but it lets you maintain a list of trusted IPs that can access your cluster.

I will add a quick fix on our system to detect, during genvid-bastion install, when the external IP of the host is not included in the trusted_cidr range, and to add it. That should help prevent this problem.
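The kind of check described above can be sketched with Python's standard ipaddress module. This is only an illustration under assumptions, not the actual genvid-bastion implementation: the function names `is_trusted` and `ensure_trusted` are hypothetical, and appending the missing address to trusted_cidrs as a single-host range is one possible choice.

```python
import ipaddress
from typing import Iterable, List


def is_trusted(host_ip: str, trusted_cidr: str, trusted_cidrs: Iterable[str]) -> bool:
    """Return True if host_ip falls inside trusted_cidr or any range in trusted_cidrs."""
    ip = ipaddress.ip_address(host_ip)
    # Empty strings (an unset trusted_cidr) are skipped.
    ranges = [r for r in [trusted_cidr, *trusted_cidrs] if r]
    # strict=False accepts ranges written with host bits set, e.g. "203.0.113.7/24".
    return any(ip in ipaddress.ip_network(r, strict=False) for r in ranges)


def ensure_trusted(host_ip: str, trusted_cidr: str, trusted_cidrs: Iterable[str]) -> List[str]:
    """Return trusted_cidrs, extended with a single-host range if host_ip is not covered.

    Assumption: the missing address is appended to the list; the real fix may
    update trusted_cidr instead.
    """
    current = list(trusted_cidrs)
    if is_trusted(host_ip, trusted_cidr, current):
        return current
    # ip_network on a bare address yields /32 for IPv4 and /128 for IPv6.
    return current + [str(ipaddress.ip_network(host_ip))]
```

For example, with trusted_cidr set to "203.0.113.0/24", the address 203.0.113.7 is already covered, while 198.51.100.9 is not and would be added as "198.51.100.9/32".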

I hope this helps. We will still have our call tomorrow morning about your other problem.

Sorry for all those complications,
Fabien