How I blew away Bigtable while upgrading the Terraform Google provider to 2.0

TL;DR

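The syntax of bigtable_instance and container_node_pool changed significantly, so be careful if you reuse the same resource name and separate environments with variables.
Read the plan output carefully so that upgrading to v2 does not recreate any resources.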

Overview

v2 of terraform-google-provider has been officially released.
https://www.terraform.io/docs/providers/google/version_2_upgrade
https://github.com/terraform-providers/terraform-provider-google/blob/master/CHANGELOG.md

With this release, the syntax of several resource definitions has changed. Two of them need particular care: google_container_node_pool and google_bigtable_instance.
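Since these changes arrive with the provider's major version, one way to control when they take effect is a version constraint in the provider block. This is a minimal sketch, not something from the Guide; the constraint value is an assumption and should match whatever 1.x release you are actually running.

```hcl
# Assumed constraint: stay on the 1.x series (1.20 or later, below 2.0)
# until the v2 plan has been reviewed.
provider "google" {
  version = "~> 1.20"
}
```

When you are ready to upgrade, change the constraint on purpose and review the resulting plan before applying.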

google_container_node_pool

The big change here is that name_prefix can no longer be used.

Previously, you would write something like this:

resource "google_container_node_pool" "example" {
  name_prefix = "example-np-"
  zone        = "us-central1-a"
  cluster     = "${google_container_cluster.example.name}"
  node_count  = 1

  node_config {
    machine_type = "${var.machine_type}"
  }

  lifecycle {
    create_before_destroy = true
  }
}

In v2, if you want a prefix, the Guide shows generating the name from a random_id resource instead:

variable "machine_type" {}

resource "google_container_cluster" "example" {
  name               = "example-cluster"
  zone               = "us-central1-a"
  initial_node_count = 1

  remove_default_node_pool = true
}

resource "random_id" "np" {
  byte_length = 11
  prefix      = "example-np-"
  keepers = {
    machine_type = "${var.machine_type}"
  }
}

resource "google_container_node_pool" "example" {
  name               = "${random_id.np.dec}"
  zone               = "us-central1-a"
  cluster            = "${google_container_cluster.example.name}"
  node_count         = 1

  node_config {
    machine_type = "${var.machine_type}"
  }

  lifecycle {
    create_before_destroy = true
  }
}

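As a result, the node pool defined with the old syntax and the one written in the v2 syntax are no longer the same: the generated name differs from the existing one, so the node pool is recreated. To avoid this, set name to the name of the node pool that already exists, hard-coded through a variable or similar.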
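The workaround can be sketched roughly as follows. This is a minimal sketch, assuming a hypothetical variable node_pool_name (not from the Guide) that holds the exact name of the node pool v1 already created from name_prefix. The cluster definition is copied from the example above.

```hcl
# Hypothetical variable: set it to the exact name of the existing node pool
# so that the v2 definition matches what is already deployed.
variable "node_pool_name" {}

variable "machine_type" {}

resource "google_container_cluster" "example" {
  name               = "example-cluster"
  zone               = "us-central1-a"
  initial_node_count = 1

  remove_default_node_pool = true
}

resource "google_container_node_pool" "example" {
  # Reuse the existing name instead of generating a new one with random_id.
  name       = "${var.node_pool_name}"
  zone       = "us-central1-a"
  cluster    = "${google_container_cluster.example.name}"
  node_count = 1

  node_config {
    machine_type = "${var.machine_type}"
  }

  lifecycle {
    create_before_destroy = true
  }
}
```

With the name unchanged, the plan should no longer show a replacement caused by the name, but confirm that in the plan output before applying.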

google_bigtable_instance

This one is really nasty. Quoting the Guide, a definition like this:

resource "google_bigtable_instance" "instance" {
  name         = "tf-instance"
  cluster_id   = "tf-instance-cluster"
  zone         = "us-central1-b"
  num_nodes    = 3
  storage_type = "HDD"
}

becomes:

resource "google_bigtable_instance" "instance" {
  name = "tf-instance"
  cluster {
    cluster_id   = "tf-instance-cluster"
    zone         = "us-central1-b"
    num_nodes    = 3
    storage_type = "HDD"
  }
}

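In other words, the cluster settings are now defined inside a cluster block.

That alone is manageable.

The real problem arises if you were defining instance_type.

For example, suppose that in v1 you created the development and production environments like this: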

# dev.tfvars
bigtable = {
  instance_type = "DEVELOPMENT"
  num_nodes = "0"
}

# prod.tfvars
bigtable = {
  instance_type = "PRODUCTION"
  num_nodes = "3"
}

### Load one of these two files depending on the environment
variable bigtable {
  type = "map"
  default = {
    instance_type = "OVERWRITE"
    num_nodes = "OVERWRITE"
  }
}

resource "google_bigtable_instance" "instance" {
  name          = "tf-instance"
  instance_type = "${var.bigtable["instance_type"]}"
  cluster_id    = "tf-instance-cluster"
  zone          = "us-central1-b"
  num_nodes     = "${var.bigtable["num_nodes"]}"
  storage_type  = "HDD"
}

In v2, however, num_nodes must not be defined at all for a DEVELOPMENT instance. From the official documentation:

num_nodes - (Optional) The number of nodes in your Cloud Bigtable cluster. Required, with a minimum of 3 for a PRODUCTION instance. Must be left unset for a DEVELOPMENT instance.

So if you had the definition above, you now have to define two resources, one for dev and one for prod.

# dev.tfvars
bigtable = {
  instance_type = "DEVELOPMENT"
}

# prod.tfvars
bigtable = {
  instance_type = "PRODUCTION"
  num_nodes = "3"
}

### Load one of these two files depending on the environment
### Use count to branch on the instance type
variable bigtable {
  type = "map"
  default = {
    instance_type = "OVERWRITE"
    num_nodes = "OVERWRITE"
  }
}

resource "google_bigtable_instance" "instance" {
  count = "${var.bigtable["instance_type"] == "PRODUCTION" ? 1 : 0}"

  name          = "tf-instance"
  instance_type = "${var.bigtable["instance_type"]}"

  cluster {
    cluster_id   = "tf-instance-cluster"
    zone         = "us-central1-b"
    num_nodes    = "${var.bigtable["num_nodes"]}"
    storage_type = "HDD"
  }
}

# dev does not define num_nodes
resource "google_bigtable_instance" "instance_dev" {
  count = "${var.bigtable["instance_type"] == "DEVELOPMENT" ? 1 : 0}"

  name          = "tf-instance"
  instance_type = "${var.bigtable["instance_type"]}"

  cluster {
    cluster_id   = "tf-instance-cluster"
    zone         = "us-central1-b"
    storage_type = "HDD"
  }
}

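The same resource cannot be defined twice, so dev and prod each get their own resource. The new resource name is then treated as a change, and the instance is recreated even though name and cluster_id are identical. (I overlooked this and blew the instance away. The moral is to read the plan carefully...) In the example above, prod keeps its resource name and is reported as changed, while dev gets a new instance created and the old instance deleted.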
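Beyond reading the plan, one extra guard that might help is Terraform's lifecycle setting prevent_destroy, which makes a plan that would destroy the resource fail with an error. This is a minimal sketch applied to the prod definition from above, not what the original setup used. It only applies while the resource block is still in the configuration, so it may not cover every case described here (such as an instance dropped because its count became 0), and it is no substitute for checking the plan.

```hcl
variable "bigtable" {
  type = "map"
}

resource "google_bigtable_instance" "instance" {
  count = "${var.bigtable["instance_type"] == "PRODUCTION" ? 1 : 0}"

  name          = "tf-instance"
  instance_type = "${var.bigtable["instance_type"]}"

  cluster {
    cluster_id   = "tf-instance-cluster"
    zone         = "us-central1-b"
    num_nodes    = "${var.bigtable["num_nodes"]}"
    storage_type = "HDD"
  }

  # Added guard: Terraform refuses any plan that would destroy this resource.
  lifecycle {
    prevent_destroy = true
  }
}
```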