How and when to use indexes in MySQL

Indexes in MySQL improve speed and decrease time which is needed to retrieve data from tables.

If you have ever created a table with millions of rows then you must have noticed that sorting or maintaining data becomes time consuming if there are no indexes over columns in such a table. Of course it would be ideal if we had indexes over every column of each table, however this is not always possible due to limited space on filesystem - indexes indeed cost space.

Prerequisites

Of course there should be a MySQL database installed on your machine. If you’re missing it, get one here. Here below are instructions how to create two tables, insert a reasonable amount of data into them, then we’re going to perform some operations in order to compare the difference with and without indexes. Make sure there is around 1GB free space on your database filesystem, additional space might be needed if binary logs are being generated.

The test itself

As mentioned above, let’s connect to the database and let’s create our procedure and tables.

#login to mysql console and enter password when prompted mysql -u root -p --now connect to the database test use test; --let’s create a procedure - this one will insert 10M of rows into our tables delimiter $$ CREATE PROCEDURE `randomizer_idx`() begin declare i int Default 0 ; declare random char(20) ; myloop: loop set random=conv(floor(rand() * 99999999999999), 20, 36) ; INSERT INTO `test_index` VALUES ( NOW() , random ); INSERT INTO `test_indexw` VALUES ( NOW() , random ); set i=i+1; if i=10000000 then leave myloop; end if; end loop myloop; end $$ delimiter ; --now let’s create our table without indexes CREATE TABLE `test_index` ( `datecol` datetime NOT NULL, `name` varchar(25) NOT NULL ) ENGINE=MyISAM DEFAULT CHARSET=utf8; --and here’s our table with indexes CREATE TABLE `test_indexw` ( `datecol` datetime NOT NULL, `name` varchar(25) NOT NULL, KEY `datecol` (`datecol`), KEY `name` (`name`) ) ENGINE=MyISAM DEFAULT CHARSET=utf8; --let’s insert rows into both tables --this will take some time call randomizer_idx(); --once we’re done, let’s perform some SELECT statement - this is done over indexes-free table SELECT COUNT(*) FROM test_index WHERE name LIKE 'CY%' AND datecol LIKE '%15%' ORDER BY datecol ASC, name ASC; --we will get something like --1 row in set (8.52 sec) --now let’s try the same statement over our table with indexes SELECT COUNT(*) FROM test_indexw WHERE name LIKE 'CY%' AND datecol LIKE '%15%' ORDER BY datecol ASC, name ASC; --we will get something like --1 row in set (2.13 sec) --OK, SELECT is much faster with indexes, let’s check size of indexes SELECT table_name AS "TABLE NAME", ROUND(data_length/1024/1024,2) AS "size in MB", ROUND(index_length/1024/1024,2) AS "INDEXES SIZE" FROM information_schema.TABLES WHERE table_schema LIKE 'test' AND table_name LIKE 'test_indexw'; --we’ll get following output: +-------------+------------+--------------+ | TABLE NAME | size in MB | INDEXES SIZE | +-------------+------------+--------------+ | test_indexw | 259.40 | 248.76 | +-------------+------------+--------------+

Cleanup

If everything is clear, execute following commands in order to cleanup all above.

--drop the tables DROP table test.test_index; DROP table test.test_indexw; --remove the procedure DROP procedure test.randomizer_idx;

Conclusion

Indexes are great when it comes to performance, however they may consume more space than data alone! It’s always important to tell whether we need indexes over particular columns, actually it’s necessary to design the whole database structure properly so we know what data is going to be selected even prior to creating first table.

Further reading & help

If you need help with your MySQL database, drop me a line. You’re also welcome to visit other articles devoted to MySQL - they’re listed below.