// you’re reading...

MySQL

MySQL EXPLAIN Deception and Fraud

Okay, let’s get this straight. I wrote the headline to get some attention. Now that you’re hooked, I have to admit for full disclosure, I’m a MySQL fanboy. This post is simply to draw some attention to the fact that EXPLAIN is just one (very important) tool in a developer’s arsenal for diagnosing performance problems. Yet, there are times when EXPLAIN can seem deceptively positive.

For example, let’s examine the following SQL query:

SELECT org_id, 
location_id, 
DATE(for_datehour) AS theday, 
SUM(total_actions_incomplete) AS tai, 
SUM(total_actions_accepted) AS taa, 
SUM(total_actions_completed) AS tac, 
SUM(total_actions_past_due) AS tapd 
FROM analytics_location_hourly 
WHERE DATE(for_datehour) = '2013-12-06' 
GROUP BY location_id, theday;

Running an EXPLAIN on this query tells us that it’s SIMPLE and using an INDEX. …and it’s going to look at 663542 rows… which shouldn’t be that bad, right?

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: analytics_location_hourly
type: index
possible_keys: for_date_UNIQUE,fk_location_id_idx,fk_org_id_idx
key: fk_location_id_idx
key_len: 4
ref: NULL
rows: 663542
Extra: Using where
1 row in set (0.00 sec)

However, when we run this beast, it takes almost 6 seconds to return.

78 rows in set (5.90 sec)

If we look at the query itself, we can see there’s a function call in the WHERE clause. This is introducing some seriously latency that the EXPLAIN completely ignored.

WHERE DATE(for_datehour) = '2013-12-06' 

But what if we factor that function call out. Our query looks like this:

SELECT org_id, 
location_id, 
DATE(for_datehour) AS theday, 
SUM(total_actions_incomplete) AS tai, 
SUM(total_actions_accepted) AS taa, 
SUM(total_actions_completed) AS tac, 
SUM(total_actions_past_due) AS tapd 
FROM analytics_location_hourly 
WHERE for_datehour BETWEEN '2013-12-06 00:00:00' AND '2013-12-06 23:59:59' 
GROUP BY location_id, theday;

Now EXPLAIN tells us this query is a mess, making a temp table, using filesort… oh my! Yet this small change allowed the index to actually do its job and examine 1855 rows rather than 663542.

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: analytics_location_hourly
type: range
possible_keys: for_date_UNIQUE,fk_location_id_idx,fk_org_id_idx
key: for_date_UNIQUE
key_len: 5
ref: NULL
rows: 1855
Extra: Using where; Using temporary; Using filesort
1 row in set (0.00 sec)

However, when we run this query:

78 rows in set (0.00 sec)

The moral of my short story is to:

  • Use EXPLAIN, but understand that it doesn’t give you a complete picture.
  • Use your brain, but understand that tools are made for a reason.
  • Construct queries that utilize indexes to limit examined rows.
  • Trust no one.

I hope this was fun.

Discussion

No comments for “MySQL EXPLAIN Deception and Fraud”

Post a comment

Help support my site and buy a domain name at http://domainsemailhosting.com/