MySQL 5.5.8 in production – First impressions

As I recently noted, we upgraded our Wix.com production databases to the newly released MySQL 5.5.8 and moved it to Solaris 10.

Here are my first impressions and notes:

  1. "mysqldump --single-transaction --flush-logs" no longer produces a consistent dump
    This is how I usually set up the first new slave. After restoring the first three databases, I tried to set up replication using the "--master-data=2" output from the dump file. Starting replication on the newly created slave immediately failed with error 1062; "show slave status\G" says: "1062 | Error 'Duplicate entry '...' for key 1' on query. Default database: '...'. Query: 'INSERT INTO ...". This never happened in previous versions. I don't know yet exactly what happens there. A brief check of a few tables shows that all records are in place. Perhaps several transactions were flushed twice and a small portion of the transaction log was replayed on existing data. I carefully checked the new mysqldump documentation; there is no change in --single-transaction behavior. I made a post in the MySQL forum. The forum is somewhat sleepy, waiting for replies…
    Reading the mysqldump documentation revealed a few quite useful options that are new in version 5.5:

    • --dump-slave[=value] – This option is very useful for cloning an existing slave that replicates from the same master. It causes mysqldump to include a CHANGE MASTER statement, exactly as --master-data does, except that the master info is that of the dumped slave's master. When the running MySQL slave cannot be stopped for a faster cloning procedure, this option can be very handy.
    • --apply-slave-statements – This one makes life easier in conjunction with --dump-slave when the resulting dump should be reapplied on an existing slave. It tells mysqldump to include a STOP SLAVE statement prior to the CHANGE MASTER statement and a START SLAVE statement at the end of the resulting dump. A sketch of cloning a slave with both options is shown right after this list.
  2. InnoDB Table Compression works even better than expected
    Compression seems to work quite smoothly and stably. Using 16K pages for a table containing large XML blobs reduced a 1TB database to approximately 250GB, which was expected. But the performance gain is quite surprising. Look at the New Relic chart:

    [New Relic chart: response time of application database calls before and after the switch]

    Note the brown graph: this is the behavior of our most popular get-XML call. We switched the app server to the new compressed database at 3:00, and since then fetching blobs from the database has been much faster.
  3. Solaris packages still do not have DTrace probes enabled
    This is a bad surprise. One of my big expectations after Sun acquired MySQL was that Solaris support in MySQL would be significantly improved. I reported the lack of DTrace probes in 5.5.7-RC in the past, and I was really hoping that it was just a matter of the release candidate build and would be solved when 5.5 reached the GA line. After 5.5.8 was deployed on our servers, we found that "dtrace -l | grep mysql" still comes up empty. I made another post about it in the MySQL forum. No replies so far.
    So while the "What Is New in MySQL 5.5" article announces "Enhanced Solaris Support", in practice it is not that enhanced. There is also some good news regarding Solaris support: "innodb_flush_method = O_DIRECT" now works (it caused MySQL to produce an error on startup in 5.5.7-RC).
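
Here is the sketch promised above: a minimal example of cloning an existing slave using the two new mysqldump options. The host names are hypothetical; the flags are as described in the mysqldump 5.5 documentation:

# Dump from the existing slave. --dump-slave=1 writes an active CHANGE MASTER
# statement pointing at the dumped slave's master, and
# --apply-slave-statements wraps the dump in STOP SLAVE / START SLAVE.
mysqldump --host=existing_slave --single-transaction \
  --dump-slave=1 --apply-slave-statements --all-databases > clone_slave.sql

# Loading the dump on the new slave repoints and restarts replication by itself.
mysql --host=new_slave < clone_slave.sql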

The bottom line so far is that MySQL 5.5.8 is ready for production. This is a big jump for the MySQL community. We can say goodbye to the buggy 5.1 line and really enjoy the next generation of this amazing piece of software.

Using InnoDB table compression in MySQL 5.5

We have a database that stores XML BLOBs. The database is big; it takes more than 900GB of disk space. A compressed mysqldump, in contrast, takes about 60GB. Since MySQL 5.5 went GA last Friday, it is quite obvious that this database is a good candidate for the new InnoDB data compression feature. We have been playing with MySQL 5.5 since Google released its version of MySQL and it was adopted and published by Sun as the 5.4 beta. I was excited by the new ability to compress data in an InnoDB table, and I tried to enable it on one of our replicas. It was not that successful: MySQL segfaulted on importing a very simple data chunk into a compression-enabled table. Now that we are promised that 5.5 has reached production quality, I decided to try it once again on our XML BLOB database.

There are a few things we should know before we just run "alter table blobs ROW_FORMAT=COMPRESSED;" on a database running on a MySQL 5.5.8 server. Table compression can be enabled only on tables stored in the new InnoDB file format called Barracuda. MySQL documentation is traditionally messed up and confusing. While one documentation entry says that "In MySQL 5.5.5 and higher, the default value is Barracuda", it is actually not true: a newly installed MySQL 5.5.8 server defaults to the old file format, called "Antelope". This is noted in the new manual document: Barracuda was the default in versions 5.5.0 through 5.5.6, but 5.5.7 and later default back to the old "Antelope" file format. Needless to say, "This applies only for tables that have their own tablespace, so for it to have an effect, innodb_file_per_table must be enabled."

So after installing the new 5.5.8 GA on our server, the following lines must be present in the /etc/my.cnf file:

innodb_file_per_table = 1
innodb_file_format = Barracuda
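
Both variables are dynamic in 5.5, so as a sketch they can also be flipped on a running server (this affects only tables created afterwards, and the change is lost on restart, hence the my.cnf lines above):

mysql> SET GLOBAL innodb_file_per_table = ON;
mysql> SET GLOBAL innodb_file_format = 'Barracuda';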

Currently running settings can be validated this way:

mysql> show variables like 'innodb_file%';
+--------------------------+-----------+
| Variable_name            | Value     |
+--------------------------+-----------+
| innodb_file_format       | Barracuda |
| innodb_file_format_check | ON        |
| innodb_file_format_max   | Barracuda |
| innodb_file_per_table    | ON        |
+--------------------------+-----------+
4 rows in set (0.01 sec)

After the server is started with this config, we are able to create a compressed table with the following statement:

CREATE TABLE blobs (
  id int NOT NULL PRIMARY KEY AUTO_INCREMENT,
  payload longtext,
  dateupdated timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=16 DEFAULT CHARSET=utf8;

I specified a compressed page size of 16K since the database is used for storing XML blobs. As noted in the MySQL documentation, "Using a 16K compressed page size can reduce storage and I/O costs for BLOB, VARCHAR or TEXT columns, because such data often compress well, and might therefore require fewer 'overflow' pages, even though the B-tree nodes themselves take as many pages as in the uncompressed form".

This is it, our blobs table is now compressed. It can be verified by:

mysql> show table status like 'blobs'\G
*************************** 1. row ***************************
           Name: blobs
         Engine: InnoDB
        Version: 10
     Row_format: Compressed
           Rows: 4
 Avg_row_length: 4096
    Data_length: 16384
Max_data_length: 0
   Index_length: 0
      Data_free: 0
 Auto_increment: 5
    Create_time: 2010-12-18 17:56:57
    Update_time: NULL
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options: row_format=COMPRESSED KEY_BLOCK_SIZE=16
        Comment:
1 row in set (0.01 sec)
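
To check how effective compression actually is at runtime, InnoDB exposes compression statistics in INFORMATION_SCHEMA (present in 5.5); a minimal sketch:

mysql> SELECT page_size, compress_ops, compress_ops_ok, uncompress_ops
    -> FROM information_schema.INNODB_CMP;

A compress_ops_ok value close to compress_ops means pages rarely fail to fit into the chosen KEY_BLOCK_SIZE; a large gap suggests the block size is too small for the data.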

Converting an existing database using mysqldump

Instead of altering the existing tables, which would not free up already-allocated space, we can dump the existing database with mysqldump, change the CREATE TABLE statements inside the dump file, and restore it on another server. After that we can set up replication and swap the databases when everything is in sync.

Dumping the existing database:

mysqldump --single-transaction --flush-logs --master-data=2 -R blob_db | sed 's%ENGINE=InnoDB%ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=16%g' > blob_db_compressed.sql

In case we already have an existing dump file, it can be converted this way (the dump is stored gzipped, so we stream it with gzcat):

gzcat blob_db.sql.gz | sed 's%ENGINE=InnoDB%ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=16%g' > blob_db_compressed.sql

On the destination server:

  1. Create the database. Run the mysql shell and execute:

    mysql> create database blob_db default charset=utf8;
    Query OK, 1 row affected (0.00 sec)

  2. Restore the database from the dump file:

    mysql blob_db < blob_db_compressed.sql

  3. Set up replication using the master info from the dump file. In the mysql shell, run:

    CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.014093', MASTER_LOG_POS=107, MASTER_HOST='source_server', MASTER_USER='replication_user', MASTER_PASSWORD='';

Now, when "show slave status" indicates that the databases are in sync, we can switch the application server over to the newly migrated compressed database.
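
As a final check before the switchover, both replication threads should be running and the slave should have caught up. The relevant fields in the "show slave status\G" output look like this:

mysql> show slave status\G
...
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
...
        Seconds_Behind_Master: 0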