How partitioning affects performance of MySQL tables

Partitioning in MySQL allows one to distribute tables accross filesystem and to split these tables into several files which in most cases leads to greater performance when accessing and modifying data.

First of all most MySQL databases are quite small, therefore partitioning is not necessary unless your project belongs to those large ones. Even a table with say one million of rows usually doesn’t need partitions, however it’s good to know of this feature. Moreover it’s also good to test it (which is below).

Prerequisites

Of course there should be a MySQL database installed on your machine. If you’re missing it, get one here. In addition you need around 1GB free on your database filesystem and even more space is needed if binary logs are being generated. We will also create few procedures and two tables - one with partitioning, another one without partitioning. All steps are described below, let’s start!

The test itself

In this case we’re going to test SELECT statement in order to determine advantages of partitioning.

#login to mysql console and enter password when prompted mysql -u root -p --now connect to the database test use test; --let’s create a procedure - this one will insert 10M of rows into both tables delimiter $$ CREATE PROCEDURE `randomizer_nopart`() begin declare i int Default 0 ; declare random char(20) ; declare mydate DATE Default '2013-04-04'; myloop: loop set random=conv(floor(rand() * 99999999999999), 20, 36) ; insert into `test_nopart` (`id`, `name`, `date`) VALUES ('',random,mydate + interval -i minute) ; insert into `test_part` (`id`, `name`, `date`) VALUES ('',random,mydate + interval -i minute) ; set i=i+1; if i=10000000 then leave myloop; end if; end loop myloop; end $$ delimiter ; --now create two tables - first one with partitions and second one without partitions CREATE TABLE `test_part` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(20) NOT NULL, `date` year(4) NOT NULL, KEY `id` (`id`), KEY `date1` (`date`) ) ENGINE=MyISAM DEFAULT CHARSET=utf8 /*!50100 PARTITION BY RANGE (`date`) (PARTITION p0 VALUES LESS THAN (1995) ENGINE = MyISAM, PARTITION p1 VALUES LESS THAN (1996) ENGINE = MyISAM, PARTITION p2 VALUES LESS THAN (1997) ENGINE = MyISAM, PARTITION p3 VALUES LESS THAN (1998) ENGINE = MyISAM, PARTITION p4 VALUES LESS THAN (1999) ENGINE = MyISAM, PARTITION p5 VALUES LESS THAN (2000) ENGINE = MyISAM, PARTITION p6 VALUES LESS THAN (2001) ENGINE = MyISAM, PARTITION p7 VALUES LESS THAN (2002) ENGINE = MyISAM, PARTITION p8 VALUES LESS THAN (2003) ENGINE = MyISAM, PARTITION p9 VALUES LESS THAN (2004) ENGINE = MyISAM, PARTITION p10 VALUES LESS THAN (2005) ENGINE = MyISAM, PARTITION p11 VALUES LESS THAN (2006) ENGINE = MyISAM, PARTITION p12 VALUES LESS THAN (2007) ENGINE = MyISAM, PARTITION p13 VALUES LESS THAN (2008) ENGINE = MyISAM, PARTITION p14 VALUES LESS THAN (2009) ENGINE = MyISAM, PARTITION p15 VALUES LESS THAN (2010) ENGINE = MyISAM, PARTITION p16 VALUES LESS THAN (2011) ENGINE = MyISAM, PARTITION p17 VALUES LESS THAN (2012) ENGINE = MyISAM, PARTITION p18 VALUES LESS THAN (2013) ENGINE = MyISAM, PARTITION p19 VALUES LESS THAN (2014) ENGINE = MyISAM) */; CREATE TABLE `test_nopart` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(20) NOT NULL, `date` year(4) NOT NULL, KEY `id` (`id`), KEY `date1` (`date`) ) ENGINE=MyISAM DEFAULT CHARSET=utf8; --now insert data into both tables using our procedure - it might take some while until 10M of rows is inserted --this action took almost one hour in my virtualbox call randomizer_nopart(); --now perform simple SELECT statement over non-partitioned table SELECT COUNT(*) FROM test_nopart WHERE date = '2002' AND name LIKE '13%'; --this gives a result such as --1 row in set (4.06 sec) --let’s test the same query with partitioned table SELECT COUNT(*) FROM test_part WHERE date = '2002' AND name LIKE '13%'; --this gives a result such as --1 row in set (0.65 sec))

Conclusion

This simple test has shown that SQL statements executed over partitioned tables return results faster. Of course there are cases when partitioning cannot be used due to structure of table, it’s also important to use properly designed queries when accessing data in partitions. In order to find out whether our query goes directly to some partition, use command EXPLAIN which is used below.

--let’s explain use of partitions EXPLAIN PARTITIONS SELECT COUNT(*) FROM test_part WHERE date = ’2002’ AND name LIKE ’13%’; --we should get a result which is similar to the below: +------+-------------+-----------+------------+------+---------------+------+---------+------+--------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-----------+------------+------+---------------+------+---------+------+--------+-------------+ | 1 | SIMPLE | test_part | p8 | ALL | date1 | NULL | NULL | NULL | 525600 | Using where | +------+-------------+-----------+------------+------+---------------+------+---------+------+--------+-------------+

As you can see, our SELECT statement is forwarded to the partition p8 directly, therefore only a small portion of data is going to be accessed. All in all our SQL won’t have to scan entire table.

Cleanup

If everything is clear, execute following commands in order to cleanup all above.

--drop the tables DROP table test.test_part; DROP table test.test_nopart; --remove the procedure drop procedure test.randomizer_nopart;

Further reading & help

If you need help with your MySQL database, drop me a line. You’re also welcome to visit other articles devoted to MySQL - they’re listed below.