MySQL MHA-Go Deployment Guide

This article explains the support boundaries and basic usage of mha_go.yml. mha_go.yml deploys the Go rewrite of MHA (mha-go): it drops the Perl MHA toolchain and runs as a single static binary plus one YAML config on the manager node.

1. How it works

mha-go is built around a controller main loop plus replica-side observers, with failover logic centered on GTID replication:

  • Main loop: the manager probes every node on a controller.monitor.interval cadence (default 2s). It declares a node unhealthy only after failure_threshold consecutive misses (default 3) followed by a reconfirm_timeout re-check (default 5s), filtering out transient noise.
  • Secondary observer: in addition to the manager’s own probes, cluster.yaml registers a secondary_checks entry per replica, so replicas independently confirm the primary is alive. This prevents a single network partition at the manager from triggering an erroneous failover.
  • Lease: controller.lease.backend: local-memory with a 15s TTL ensures only one manager instance owns the switch decision at any time.
  • GTID-only recovery: replication mode is fixed to gtid, so recovery does not rely on binlog-position or relay-log diff patching. replication.salvage.policy: salvage-if-possible governs whether to attempt to rescue transactions lost on a failed primary.
  • Semi-sync preference: semi_sync.policy: preferred plus wait_for_replica_count keeps semi-sync on the happy path and degrades gracefully on timeout.
  • Candidate priority: the order of slave_ips maps directly to candidate_priority (100, 90, 80, …) and drives new-primary selection at failover time.
  • Writer endpoint abstraction: writer_endpoint is a pluggable layer. It defaults to none; enabling vip mode makes the manager invoke /usr/local/bin/mha_ip_failover.sh. The same abstraction can later target proxy- or DNS-based endpoints without touching the control plane.
  • Runtime shape: the manager runs as the mysql user under systemd, emitting JSON logs to the journal. All CLI actions are consolidated under a single mha binary with subcommands.
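
Pulled together, the knobs above would land in cluster.yaml roughly as follows. This is a sketch reconstructed from the field names in this section, not a verbatim file; the playbook-generated schema may nest or name these differently:

```yaml
# Hypothetical cluster.yaml sketch assembled from the settings named above.
controller:
  monitor:
    interval: 2s            # probe cadence (default 2s)
    failure_threshold: 3    # consecutive misses before a node is suspect
    reconfirm_timeout: 5s   # re-check window before declaring it down
  lease:
    backend: local-memory   # single-manager lease
    ttl: 15s
replication:
  mode: gtid                # GTID-only; no binlog-position recovery
  salvage:
    policy: salvage-if-possible
semi_sync:
  policy: preferred
  wait_for_replica_count: 1 # assumed value, for illustration only
writer_endpoint:
  kind: none                # default; vip mode invokes mha_ip_failover.sh
```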

2. Comparison with the legacy MHA

mha.yml deploys the community Perl MHA; mha_go.yml deploys the Go rewrite. Both target the same “one primary + multiple replicas + dedicated manager” topology, but they differ materially in implementation and support scope:

Dimension          | Legacy MHA (mha.yml)                               | MHA-Go (mha_go.yml)
-------------------|----------------------------------------------------|--------------------
Language           | Perl                                               | Go, single static binary
OS support         | CentOS 7.5 / EL7 only                              | CentOS 7/8, RHEL 7/8, Rocky 9, BigCloud 7/8/21, openEuler 20/22/24, Anolis OS 8, Kylin V10
MySQL support      | 5.7                                                | 8.0 / 8.4
External deps      | Perl + MHA RPM + DBD::mysql, …                     | None beyond the shipped binary
Runtime shape      | masterha_manager foreground / nohup                | systemd service (mha-manager.service)
Runtime user       | Usually root                                       | mysql user (least privilege)
Config format      | INI (app1.cnf)                                     | YAML (cluster.yaml)
Logs               | Plain text                                         | Structured JSON, ingested by the journal
Replication        | Binlog-position or GTID                            | GTID only
Failure detection  | Single-point manager ping                          | Manager main loop plus replica-side secondary_checks reconfirmation
Failure threshold  | ping_interval × count                              | failure_threshold and reconfirm_timeout decoupled
Lost-txn rescue    | SSH into the dead primary, diff binlog / relay log | Native GTID-based recovery gated by salvage.policy
Candidate priority | candidate_master / no_master tags                  | candidate_priority 0–100, auto-decreased by slave_ips order
Writer endpoint    | Hard-coded VIP script                              | writer_endpoint abstraction (none / vip, extensible)
CLI                | Multiple masterha_* binaries                       | Single mha <subcommand> (check-repl / manager / switch / failover-plan / failover-execute / version)
Dry-run            | None                                               | --dry-run is a first-class flag
Distribution       | Separate RPM, independent version                  | Bundled with dbbot releases

Rule of thumb: prefer mha_go.yml for new clusters. Stick with mha.yml only when you must keep supporting legacy MySQL 5.7 + CentOS 7.5 topologies.

3. Support boundaries

  • Target architecture: one primary + multiple replicas + one dedicated MHA-Go manager
  • Replication: GTID replication is mandatory, semi_sync is preferred
  • Applicable versions: MySQL 8.0 / 8.4
  • Supported OS matches the rest of mysql_ansible: CentOS 7/8, RHEL 7/8, Rocky 9, BigCloud 7/8/21, openEuler 20/22/24, Anolis OS 8, Kylin V10

4. Topology conventions

mha_go.yml reuses the [dbbot_mysql] host group and distinguishes roles through three variables — master_ip, slave_ips, and manager_ip:

  • master_ip: the primary that accepts writes. Registered as db1 with role primary in cluster.yaml.
  • slave_ips: a list of at least one replica. Registered as db2, db3, … with role replica.
  • manager_ip: the node that runs the mha-manager process. It must be one of slave_ips and cannot equal master_ip. Defaults to the last entry in slave_ips.

In cluster.yaml, replicas’ candidate_priority decreases with position in slave_ips — first replica 100, second 90, and so on. This priority drives candidate selection when the primary fails.
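
As a sketch, the generated node entries would carry decreasing priorities in slave_ips order. The exact key nesting is an assumption; only the priority rule (100, 90, …) comes from the documented behavior:

```yaml
# Hypothetical cluster.yaml excerpt illustrating the priority ordering rule.
nodes:
  - name: db1
    host: 192.168.199.131
    role: primary
  - name: db2
    host: 192.168.199.132
    role: replica
    candidate_priority: 100   # first entry in slave_ips
  - name: db3
    host: 192.168.199.133
    role: replica
    candidate_priority: 90    # second entry in slave_ips
```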

5. Inventory example

[dbbot_mysql]
192.168.199.131 ansible_user=root ansible_ssh_pass="'<your_ssh_password>'"
192.168.199.132 ansible_user=root ansible_ssh_pass="'<your_ssh_password>'"
192.168.199.133 ansible_user=root ansible_ssh_pass="'<your_ssh_password>'"

[all:vars]
ansible_python_interpreter=auto_silent

6. Key variables

Edit mysql_ansible/playbooks/vars/var_mha_go.yml:

master_ip: 192.168.199.131
slave_ips:
  - 192.168.199.132
  - 192.168.199.133

# manager defaults to the last slave — override explicitly if you want a different node
manager_ip: "{{ slave_ips[-1] }}"

# Cluster name surfaced in cluster.yaml, mha log lines, and the systemd unit description
mha_go_cluster_name: app1

# --- Writer endpoint (optional VIP failover script) ---
# mha_go_writer_endpoint_enabled: true
# vip: 192.168.199.130
# vip_netmask: "32"
# net_work_interface: "ens33"

Other role variables you can override (defaults are usually fine):

Variable                       | Default            | Purpose
-------------------------------|--------------------|--------
mha_go_binary_dest             | /usr/local/bin/mha | Destination of the mha binary on the manager node
mha_go_config_dir              | /etc/mha           | Directory holding cluster.yaml
mha_go_log_dir                 | /var/log/mha       | Manager log directory
mha_go_service_enabled         | true               | Whether to systemctl enable the manager service
mha_go_writer_endpoint_enabled | false              | Enable VIP failover (requires vip / vip_netmask / net_work_interface)
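
These are ordinary role variables, so any of them can be overridden in vars/var_mha_go.yml (or via -e on the command line). For example, the values below are illustrative overrides, not recommended settings:

```yaml
# Example overrides using the variables from the table above.
mha_go_binary_dest: /opt/mha/bin/mha   # non-default install path (example)
mha_go_log_dir: /data/logs/mha         # move logs onto a data volume
mha_go_service_enabled: false          # install the unit but do not enable it at boot
```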

7. Prerequisites

Before executing the main tasks, make_mha_go validates the following. Any unmet condition fails the run immediately:

  • Every node’s datadir must contain master_slave_finish.flag — i.e. MySQL was installed by dbbot and the primary/replica topology was built.
  • The manager node must have /tmp/ssh_finish.flag — i.e. make_ssh_passwordless has already run.
  • SELECT @@gtid_mode must return ON on every node.
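
In Ansible terms, those pre-flight checks amount to something like the sketch below. The task names, module choices, and variable names are illustrative, not the real make_mha_go tasks:

```yaml
# Illustrative pre-flight checks; the actual role's tasks may differ.
- name: Check replication build flag on every node
  ansible.builtin.stat:
    path: "{{ mysql_datadir }}/master_slave_finish.flag"   # mysql_datadir is assumed
  register: repl_flag

- name: Fail fast when the topology was not built by dbbot
  ansible.builtin.assert:
    that: repl_flag.stat.exists
    fail_msg: "master_slave_finish.flag missing; run make_replication first"

- name: Read gtid_mode on every node
  ansible.builtin.command: mysql -NBe "SELECT @@gtid_mode"
  register: gtid
  changed_when: false

- name: Abort unless GTID replication is enabled
  ansible.builtin.assert:
    that: gtid.stdout == "ON"
```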

mha_go.yml itself chains the required roles so you do not need to run anything by hand first:

pre_check_and_set → mysql_server → make_replication → make_ssh_passwordless → make_mha_go

Running mha_go.yml alone is enough.

8. Entry point

cd /usr/local/dbbot/mysql_ansible/playbooks
ansible-playbook mha_go.yml

For non-interactive runs (CI or automation), bypass the confirmation.yml pause prompt:

ansible-playbook mha_go.yml -e dbbot_confirmation_input=confirm

9. Artifacts produced

On the manager node after a successful run:

  • /usr/local/bin/mha: the Go static binary. mha version prints mha-go 0.x.y.
  • /etc/mha/cluster.yaml: cluster definition with db1 as primary and db2, db3, … as replicas, replication mode gtid, semi-sync policy preferred.
  • /etc/systemd/system/mha-manager.service: a Type=simple systemd unit that starts mha manager --config /etc/mha/cluster.yaml --log-format json as {{ mysql_user }}.
  • /var/log/mha/: log directory (JSON output, systemd also captures it via journal).

Every node (primary, replicas, manager) gets mha_go_finish.flag in its datadir so downstream playbooks can recognize it.

10. Common commands

All of these run on the manager node, where mha is /usr/local/bin/mha:

# Print version
mha version

# One-shot config and replication health check, no manager loop
mha check-repl --config /etc/mha/cluster.yaml

# Print the failover plan only (dry run, no action)
mha failover-plan --config /etc/mha/cluster.yaml

# Execute failover promoting a specific candidate
mha failover-execute --config /etc/mha/cluster.yaml --candidate db2

# Controlled switchover to a chosen new primary
mha switch --config /etc/mha/cluster.yaml --new-primary db2 --dry-run

Service status and logs:

systemctl status mha-manager
journalctl -u mha-manager -f

11. Enabling the VIP writer endpoint

By default the deployment does not use a VIP and cluster.yaml writes writer_endpoint.kind: none. To expose a stable write VIP:

  1. Enable it in vars/var_mha_go.yml:

    mha_go_writer_endpoint_enabled: true
    vip: 192.168.199.130
    vip_netmask: "32"
    net_work_interface: ens33
    
  2. The playbook then runs edit_sudoer.yml and deploy_vip_script.yml, adding ip addr / arping sudo rules for the mysql user on every MySQL node and dropping /usr/local/bin/mha_ip_failover.sh on the manager.

  3. cluster.yaml’s writer_endpoint becomes:

    writer_endpoint:
      kind: vip
      target: <vip>
      command: /usr/local/bin/mha_ip_failover.sh
    
  4. vip / vip_netmask / net_work_interface must match the real network. vip must be a valid IP and vip_netmask must be in 0–32.
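
Regarding step 2: the exact sudo rules are not spelled out here, but a plausible shape for what edit_sudoer.yml drops (file location and binary paths are assumptions) is:

```
# Hypothetical /etc/sudoers.d/ entry; paths and file name are assumed.
mysql ALL=(root) NOPASSWD: /usr/sbin/ip, /usr/sbin/arping
```

The point is least privilege: the mysql user gets passwordless sudo only for the two commands the VIP script needs (moving the address and broadcasting gratuitous ARP), not a blanket root grant.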

12. Things to note

  • manager_ip must be a member of slave_ips. validate_mha_go.yml hard-fails on inventory/vars mismatch or a manager pointed at the primary.
  • Every IP in slave_ips must also appear in inventory/hosts.ini, and vice versa — “in inventory but not in vars” is rejected.
  • mha-manager.service runs as the mysql user; cluster.yaml is mode 0640 owned by mysql:mysql.
  • Before re-running after a failure, check: the mha-manager service state, whether cluster.yaml was edited by hand, and that GTID / replication state is consistent across nodes.
  • Air-gapped environments must pre-stage yum dependencies (python3-libselinux, ncurses-compat-libs, numactl, libaio, tar); otherwise pre_check_and_set will fail when yum tries to refresh metadata.