How partitioning affects performance of MySQL tables
Partitioning in MySQL allows one to distribute tables accross filesystem and to split these tables into several files which in most cases leads to greater performance when accessing and modifying data.
First of all most MySQL databases are quite small, therefore partitioning is not necessary unless your project belongs to those large ones. Even a table with say one million of rows usually doesn’t need partitions, however it’s good to know of this feature. Moreover it’s also good to test it (which is below).
Prerequisites
Of course there should be a MySQL database installed on your machine. If you’re missing it, get one here. In addition you need around 1GB free on your database filesystem and even more space is needed if binary logs are being generated. We will also create few procedures and two tables - one with partitioning, another one without partitioning. All steps are described below, let’s start!
The test itself
In this case we’re going to test SELECT statement in order to determine advantages of partitioning.
#login to mysql console and enter password when prompted
mysql -u root -p
--now connect to the database test
use test;
--let’s create a procedure - this one will insert 10M of rows into both tables
delimiter $$
CREATE PROCEDURE `randomizer_nopart`()
begin
declare i int Default 0 ;
declare random char(20) ;
declare mydate DATE Default '2013-04-04';
myloop: loop
set random=conv(floor(rand() * 99999999999999), 20, 36) ;
insert into `test_nopart` (`id`, `name`, `date`) VALUES ('',random,mydate + interval -i minute) ;
insert into `test_part` (`id`, `name`, `date`) VALUES ('',random,mydate + interval -i minute) ;
set i=i+1;
if i=10000000 then
leave myloop;
end if;
end loop myloop;
end $$
delimiter ;
--now create two tables - first one with partitions and second one without partitions
CREATE TABLE `test_part` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(20) NOT NULL,
`date` year(4) NOT NULL,
KEY `id` (`id`),
KEY `date1` (`date`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE (`date`)
(PARTITION p0 VALUES LESS THAN (1995) ENGINE = MyISAM,
PARTITION p1 VALUES LESS THAN (1996) ENGINE = MyISAM,
PARTITION p2 VALUES LESS THAN (1997) ENGINE = MyISAM,
PARTITION p3 VALUES LESS THAN (1998) ENGINE = MyISAM,
PARTITION p4 VALUES LESS THAN (1999) ENGINE = MyISAM,
PARTITION p5 VALUES LESS THAN (2000) ENGINE = MyISAM,
PARTITION p6 VALUES LESS THAN (2001) ENGINE = MyISAM,
PARTITION p7 VALUES LESS THAN (2002) ENGINE = MyISAM,
PARTITION p8 VALUES LESS THAN (2003) ENGINE = MyISAM,
PARTITION p9 VALUES LESS THAN (2004) ENGINE = MyISAM,
PARTITION p10 VALUES LESS THAN (2005) ENGINE = MyISAM,
PARTITION p11 VALUES LESS THAN (2006) ENGINE = MyISAM,
PARTITION p12 VALUES LESS THAN (2007) ENGINE = MyISAM,
PARTITION p13 VALUES LESS THAN (2008) ENGINE = MyISAM,
PARTITION p14 VALUES LESS THAN (2009) ENGINE = MyISAM,
PARTITION p15 VALUES LESS THAN (2010) ENGINE = MyISAM,
PARTITION p16 VALUES LESS THAN (2011) ENGINE = MyISAM,
PARTITION p17 VALUES LESS THAN (2012) ENGINE = MyISAM,
PARTITION p18 VALUES LESS THAN (2013) ENGINE = MyISAM,
PARTITION p19 VALUES LESS THAN (2014) ENGINE = MyISAM) */;
CREATE TABLE `test_nopart` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(20) NOT NULL,
`date` year(4) NOT NULL,
KEY `id` (`id`),
KEY `date1` (`date`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
--now insert data into both tables using our procedure - it might take some while until 10M of rows is inserted
--this action took almost one hour in my virtualbox
call randomizer_nopart();
--now perform simple SELECT statement over non-partitioned table
SELECT COUNT(*) FROM test_nopart WHERE date = '2002' AND name LIKE '13%';
--this gives a result such as
--1 row in set (4.06 sec)
--let’s test the same query with partitioned table
SELECT COUNT(*) FROM test_part WHERE date = '2002' AND name LIKE '13%';
--this gives a result such as
--1 row in set (0.65 sec))
Conclusion
This simple test has shown that SQL statements executed over partitioned tables return results faster. Of course there are cases when partitioning cannot be used due to structure of table, it’s also important to use properly designed queries when accessing data in partitions. In order to find out whether our query goes directly to some partition, use command EXPLAIN which is used below.
--let’s explain use of partitions
EXPLAIN PARTITIONS SELECT COUNT(*) FROM test_part WHERE date = ’2002’ AND name LIKE ’13%’;
--we should get a result which is similar to the below:
+------+-------------+-----------+------------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-----------+------------+------+---------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | test_part | p8 | ALL | date1 | NULL | NULL | NULL | 525600 | Using where |
+------+-------------+-----------+------------+------+---------------+------+---------+------+--------+-------------+
As you can see, our SELECT statement is forwarded to the partition p8 directly, therefore only a small portion of data is going to be accessed. All in all our SQL won’t have to scan entire table.
Cleanup
If everything is clear, execute following commands in order to cleanup all above.
--drop the tables
DROP table test.test_part;
DROP table test.test_nopart;
--remove the procedure
drop procedure test.randomizer_nopart;
Further reading & help
If you need help with your MySQL database, drop me a line. You’re also welcome to visit other articles devoted to MySQL - they’re listed below.
- Performing regular backups and point-in-time recovery
- Enabling logging of issued SQL statements
- Understanding indexes
- Memory tuning & filesystem tuning & data optimisation
- Moving datafiles between two databases
- Difference between MYISAM and InnoDB engines
- Partitioning
- Enabling remote access to a MySQL database
- How to setup replication
- Rebuilding indexes and checking health of tables





