Hello,

We have a cluster of 3 machines running Debian 10 / MariaDB 10.3 / Galera 3.

I tried to upgrade one of the cluster nodes to Debian 11, and therefore to MariaDB 10.5 and Galera 4.

The MariaDB docs (https://mariadb.com/kb/en/upgrading-galera-cluster/) don't cover upgrading to 10.5, only 10.3 -> 10.4. My main concern was the move from Galera 3 to Galera 4, but the docs are fairly reassuring on that point:
"Galera 3 and Galera 4 should be compatible for the purposes of a rolling upgrade, as long as you are using Galera 26.4.2 or later"
which is the case on Debian 11 (galera 26.4.44).
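
For anyone who wants to check the same requirement on their own nodes, here is a minimal sketch of the comparison against the 26.4.2 minimum quoted above. It assumes the version string comes from the wsrep_provider_version status variable and that it may carry a revision suffix in parentheses; the function names are made up for illustration.

```python
def parse_version(text):
    """Turn a dotted version string such as '26.4.2' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Minimum Galera version quoted in the MariaDB docs for a rolling upgrade.
MIN_GALERA_FOR_ROLLING_UPGRADE = parse_version("26.4.2")


def meets_minimum(provider_version):
    """Return True if the provider version is at least 26.4.2.

    Assumption: the value may look like '26.4.2(r1234)', so anything
    from the first '(' onwards is dropped before comparing.
    """
    base = provider_version.split("(")[0]
    return parse_version(base) >= MIN_GALERA_FOR_ROLLING_UPGRADE


if __name__ == "__main__":
    # Version strings here are examples only.
    for candidate in ("26.4.44", "25.3.25(r3836)"):
        print(candidate, meets_minimum(candidate))
```

Comparing tuples of integers rather than raw strings avoids the trap where "26.4.10" would sort before "26.4.2" as text.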

Since Debian doesn't provide MariaDB 10.4 packages, I figured going straight from 10.3 to 10.5 might work.

Except it doesn't.
The 2 remaining nodes do see wsrep_cluster_size at 3, but the upgraded node is in a bad way. I get "lock wait timeout" errors all over the place, for example:
MariaDB [(none)]> SHOW GLOBAL STATUS LIKE '%wsrep%';
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

The systemd journal says:

Feb 24 11:02:03 c7000-pa2-mysqlcluster2 systemd[1]: Starting MariaDB 10.5.15 database server...
Feb 24 11:02:04 c7000-pa2-mysqlcluster2 sh[52249]: WSREP: Recovered position 8a9a12bd-6240-11eb-803b-a210a81100fb:8208686
Feb 24 11:02:04 c7000-pa2-mysqlcluster2 mariadbd[52379]: 2022-02-24 11:02:04 0 [Note] /usr/sbin/mariadbd (mysqld 10.5.15-MariaDB-1:10.5.15+maria~bullseye-log) starting as process 52379 ...
Feb 24 11:02:05 c7000-pa2-mysqlcluster2 rsyncd[52504]: rsyncd version 3.2.3 starting, listening on port 4444
Feb 24 11:02:05 c7000-pa2-mysqlcluster2 rsyncd[52516]: connect from c7000-pa2-mysqlcluster1 (192.168.155.35)
Feb 24 11:02:05 c7000-pa2-mysqlcluster2 rsyncd[52516]: rsync allowed access on module rsync_sst from c7000-pa2-mysqlcluster1 (192.168.155.35)
Feb 24 11:02:05 c7000-pa2-mysqlcluster2 rsyncd[52516]: rsync to rsync_sst/ from c7000-pa2-mysqlcluster1 (192.168.155.35)
Feb 24 11:02:05 c7000-pa2-mysqlcluster2 rsyncd[52516]: receiving file list
Feb 24 11:02:05 c7000-pa2-mysqlcluster2 rsyncd[52516]: sent 48 bytes  received 199 bytes  total size 47
Feb 24 11:02:07 c7000-pa2-mysqlcluster2 rsyncd[52504]: sent 0 bytes  received 0 bytes  total size 0
Feb 24 11:02:07 c7000-pa2-mysqlcluster2 systemd[1]: Started MariaDB 10.5.15 database server.
Feb 24 11:02:07 c7000-pa2-mysqlcluster2 debian-start[52656]: ERROR 1047 (08S01) at line 1: WSREP has not yet prepared node for application use


and the MySQL error.log fills up with messages like these (I enabled wsrep debug):

2022-02-24 11:11:33 1417 [Note] WSREP: wsrep_commit_empty(1417)
2022-02-24 11:11:33 1417 [Note] WSREP: wsrep_after_statement for 1417 client_state exec  client_mode local trans_state aborted
2022-02-24 11:11:33 1417 [Note] WSREP: wsrep_after_statement for 1417 client_state exec  client_mode local trans_state aborted
2022-02-24 11:11:33 1417 [Note] WSREP: dispatch_command leave
    thd: 1417 thd_ptr: 0x7f4188000c58 client_mode: local client_state: result trx_state: aborted
    next_trx_id: 3289 trx_id: -1 seqno: -1
    is_streaming: 0 fragments: 0
    sql_errno: 1205 message: Lock wait timeout exceeded; try restarting transaction
    command: 0 query: SELECT (SELECT VARIABLE_VALUE FROM INFORMATION_SCHEMA.GLOBAL_STATUS WHER
2022-02-24 11:11:33 1417 [Note] WSREP: assigned new next trx id: 3480


If anyone has a lead, or has already run into this kind of problem, I'm all ears!

Thanks,
Cédric