DevHeads.net

Optimize fails due to duplicate rows error but no duplicates found

Good day guys,

I am hoping this mail finds you well.

I am at a bit of a loss here...

We are trying to run optimize against a table in order to reclaim disk space from archived data which has been removed.

However, after running for over an hour , the optimize fails stating there is a duplicate entry in the table.

We have now spent 2 days using various methods but we are unable to find any duplicates in the primary key and also nothing on the unique key fields.

Any idea on why optimize would still be failing ?

Regards

Comments

Re: Optimize fails due to duplicate rows error but no duplicates

By Shawn Green (MySQL) at 02/13/2018 - 15:51

Hello Machiel,

On 2/13/2018 3:02 AM, Machiel Richards wrote:
Is it possible that the duplicate keys were the result of
re-partitioning your data where one of the "older" copies was in the
wrong partition as part of an upgrade from an earlier version?

See the entry in
<a href="https://dev.mysql.com/doc/relnotes/mysql/5.5/en/news-5-5-31.html" title="https://dev.mysql.com/doc/relnotes/mysql/5.5/en/news-5-5-31.html">https://dev.mysql.com/doc/relnotes/mysql/5.5/en/news-5-5-31.html</a> that
start with...
A normal indexed lookup against a partitioned table will use
(particularly for a PK value) "partition pruning" . To see all of your
PK values regardless of which partition they are in, you need to scan
the table and avoid all indexes.

example:
# for a numeric PK column
CREATE TABLE myPK_list SELECT pk from sourcetable WHERE pk+0 > 0;

Then you can check the list in the generated table to find any duplicate
values.

Then you can modify a SELECT command to search each partition or
subpartition individually until you find the rows that are in the wrong
spots.
<a href="https://dev.mysql.com/doc/refman/5.7/en/partitioning-selection.html" title="https://dev.mysql.com/doc/refman/5.7/en/partitioning-selection.html">https://dev.mysql.com/doc/refman/5.7/en/partitioning-selection.html</a>

Yours,

Re: Optimize fails due to duplicate rows error but no duplicates

By Machiel Richards at 02/13/2018 - 17:00

ok, so we have managed to get an id out of the errors etc... however when we look in the table that id does not even exist at all.

no idea what is going on here though.

Hello Machiel,

On 2/13/2018 3:02 AM, Machiel Richards wrote:
Is it possible that the duplicate keys were the result of
re-partitioning your data where one of the "older" copies was in the
wrong partition as part of an upgrade from an earlier version?

See the entry in
<a href="https://dev.mysql.com/doc/relnotes/mysql/5.5/en/news-5-5-31.html" title="https://dev.mysql.com/doc/relnotes/mysql/5.5/en/news-5-5-31.html">https://dev.mysql.com/doc/relnotes/mysql/5.5/en/news-5-5-31.html</a> that
start with...
A normal indexed lookup against a partitioned table will use
(particularly for a PK value) "partition pruning" . To see all of your
PK values regardless of which partition they are in, you need to scan
the table and avoid all indexes.

example:
# for a numeric PK column
CREATE TABLE myPK_list SELECT pk from sourcetable WHERE pk+0 > 0;

Then you can check the list in the generated table to find any duplicate
values.

Then you can modify a SELECT command to search each partition or
subpartition individually until you find the rows that are in the wrong
spots.
<a href="https://dev.mysql.com/doc/refman/5.7/en/partitioning-selection.html" title="https://dev.mysql.com/doc/refman/5.7/en/partitioning-selection.html">https://dev.mysql.com/doc/refman/5.7/en/partitioning-selection.html</a>

Yours,

Re: Optimize fails due to duplicate rows error but no duplicates

By Shawn Green (MySQL) at 02/23/2018 - 11:28

(please do not top post - see my answer below)

On 2/13/2018 4:00 PM, Machiel Richards wrote:
Then another thing to consider is that someone (during the lifetime of
this table) changed the character set of your table (possibly changing
it from using a case-sensitive collation to a case-insensitive
collation) without actually converting the data on the table to use the
new character set.

Is the key being duplicated numeric or character-based?

If numeric, is the value being reported as the duplicate at the high end
of the permitted range of values for that column?

Regards,

Re: Optimize fails due to duplicate rows error but no duplicates

By Machiel Richards at 02/13/2018 - 16:20

Thank you for the response.

There are no partitioning involved at all...

I will try the suggestion though and see if we get anything...

Regards

Hello Machiel,

On 2/13/2018 3:02 AM, Machiel Richards wrote:
Is it possible that the duplicate keys were the result of
re-partitioning your data where one of the "older" copies was in the
wrong partition as part of an upgrade from an earlier version?

See the entry in
<a href="https://dev.mysql.com/doc/relnotes/mysql/5.5/en/news-5-5-31.html" title="https://dev.mysql.com/doc/relnotes/mysql/5.5/en/news-5-5-31.html">https://dev.mysql.com/doc/relnotes/mysql/5.5/en/news-5-5-31.html</a> that
start with...
A normal indexed lookup against a partitioned table will use
(particularly for a PK value) "partition pruning" . To see all of your
PK values regardless of which partition they are in, you need to scan
the table and avoid all indexes.

example:
# for a numeric PK column
CREATE TABLE myPK_list SELECT pk from sourcetable WHERE pk+0 > 0;

Then you can check the list in the generated table to find any duplicate
values.

Then you can modify a SELECT command to search each partition or
subpartition individually until you find the rows that are in the wrong
spots.
<a href="https://dev.mysql.com/doc/refman/5.7/en/partitioning-selection.html" title="https://dev.mysql.com/doc/refman/5.7/en/partitioning-selection.html">https://dev.mysql.com/doc/refman/5.7/en/partitioning-selection.html</a>

Yours,