Commit 31ba582

Author: Shlomi Noach
Merge pull request #358 from github/tests-shared-unique-key
Testing and clarifying shared unique key requirement
2 parents 8d8ef34 + e020b9c commit 31ba582

17 files changed

Lines changed: 205 additions & 9 deletions

doc/requirements-and-limitations.md

Lines changed: 7 additions & 6 deletions
@@ -30,11 +30,12 @@ The `SUPER` privilege is required for `STOP SLAVE`, `START SLAVE` operations. Th
3030

3131
- MySQL 5.7 `JSON` columns are not supported. They are likely to be supported shortly.
3232

33-
- The two _before_ & _after_ tables must share some `UNIQUE KEY`. Such key would be used by `gh-ost` to iterate the table.
34-
- As an example, if your table has a single `UNIQUE KEY` and no `PRIMARY KEY`, and you wish to replace it with a `PRIMARY KEY`, you will need two migrations: one to add the `PRIMARY KEY` (this migration will use the existing `UNIQUE KEY`), another to drop the now redundant `UNIQUE KEY` (this migration will use the `PRIMARY KEY`).
35-
36-
- The chosen migration key must not include columns with `NULL` values.
37-
- `gh-ost` will do its best to pick a migration key with non-nullable columns. It will by default refuse a migration where the only possible `UNIQUE KEY` includes nullable-columns. You may override this refusal via `--allow-nullable-unique-key` but **you must** be sure there are no actual `NULL` values in those columns. Such `NULL` values would cause a data integrity problem and potentially a corrupted migration.
33+
- The two _before_ & _after_ tables must share a `PRIMARY KEY` or other `UNIQUE KEY`. This key will be used by `gh-ost` to iterate through the table rows when copying. [Read more](shared-key.md)
34+
- The migration key must not include columns with NULL values. This means either:
35+
1. The columns are `NOT NULL`, or
36+
2. The columns are nullable but don't contain any NULL values.
37+
- By default, `gh-ost` will not run if the only `UNIQUE KEY` includes nullable columns.
38+
- You may override this via `--allow-nullable-unique-key` but make sure there are no actual `NULL` values in those columns. If any exist, data integrity on the migrated table cannot be guaranteed.
3839

3940
- It is not allowed to migrate a table where another table exists with the same name and different upper/lower case.
4041
- For example, you may not migrate `MyTable` if another table called `MYtable` exists in the same schema.
@@ -48,4 +49,4 @@ The `SUPER` privilege is required for `STOP SLAVE`, `START SLAVE` operations. Th
4849

4950
- If you have an `enum` field as part of your migration key (typically the `PRIMARY KEY`), migration performance will be degraded and potentially bad. [Read more](https://github.com/github/gh-ost/pull/277#issuecomment-254811520)
5051

51-
- Migrating a `FEDERATED` table is unsupported and is irrelevant to the problem `gh-ost` tackles.
52+
- Migrating a `FEDERATED` table is unsupported and is irrelevant to the problem `gh-ost` tackles.

doc/shared-key.md

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
1+
# Shared key
2+
3+
A requirement for a migration to run is that the two _before_ and _after_ tables have a shared unique key. This document elaborates on and illustrates that requirement.
4+
5+
### Introduction
6+
7+
Consider a classic, simple migration. The table is an ordinary one:
8+
9+
```
10+
CREATE TABLE tbl (
11+
id bigint unsigned not null auto_increment,
12+
data varchar(255),
13+
more_data int,
14+
PRIMARY KEY(id)
15+
)
16+
```
17+
18+
And the migration is a simple `add column ts timestamp`.
19+
20+
In such a migration there is no change in indexes, and in particular no change to any unique key, and specifically no change to the `PRIMARY KEY`. To run this migration, `gh-ost` would iterate the `tbl` table using the primary key, copy rows from `tbl` to the _ghost_ table `_tbl_gho` by order of `id`, and then apply binlog events onto `_tbl_gho`.
21+
22+
Applying the binlog events assumes the existence of a shared unique key. For example, an `UPDATE` statement in the binary log translates to a `REPLACE` statement which `gh-ost` applies to the _ghost_ table. Such a statement adds or replaces a row based on the given row data. In particular, it _replaces_ an existing row if a unique key violation is met.
23+
24+
So `gh-ost` correlates `tbl` and `_tbl_gho` rows using a unique key. In the above example that would be the `PRIMARY KEY`.
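The role of the unique key in applying row events can be sketched in a few lines. This is only an illustration, not `gh-ost` code: it uses Python's built-in `sqlite3` module as a stand-in for MySQL (an assumption; SQLite's `REPLACE` also removes the conflicting row and inserts the new one), and `apply_row_image` and the `no_key` table are hypothetical names introduced for the example.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL here. In both, REPLACE removes the
# row that conflicts on a unique key and then inserts the new row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE _tbl_gho ("
    " id INTEGER NOT NULL PRIMARY KEY,"
    " data TEXT,"
    " more_data INTEGER)"
)


def apply_row_image(row):
    """Apply a (hypothetical) binlog row image to the ghost table."""
    conn.execute(
        "REPLACE INTO _tbl_gho (id, data, more_data) VALUES (?, ?, ?)", row
    )


apply_row_image((1, "first", 10))    # no row with id=1 yet: REPLACE inserts it
apply_row_image((1, "updated", 11))  # id=1 exists: REPLACE overwrites the row
apply_row_image((2, "second", 20))

rows = conn.execute(
    "SELECT id, data, more_data FROM _tbl_gho ORDER BY id"
).fetchall()

# Without any unique key there is nothing to conflict on, so REPLACE
# degenerates into a plain INSERT and stale copies accumulate.
conn.execute("CREATE TABLE no_key (id INTEGER, data TEXT)")
conn.execute("REPLACE INTO no_key VALUES (1, 'first')")
conn.execute("REPLACE INTO no_key VALUES (1, 'updated')")
dupes = conn.execute("SELECT COUNT(*) FROM no_key").fetchone()[0]

print(rows)   # one row per id, holding the latest values
print(dupes)  # both copies of id=1 are still present
```

With the key in place, replaying the same row twice is harmless. Without it, the ghost table has no way to tell that the two row images describe the same row.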
25+
26+
### Rules
27+
28+
There must be a shared set of not-null columns for which there is a unique constraint in both the original table and the migration (_ghost_) table.
29+
30+
### Interpreting the rules
31+
32+
The same columns must be covered by a unique key in both tables. This doesn't have to be the `PRIMARY KEY`. This doesn't have to be a key of the same name.
33+
34+
Upon migration, `gh-ost` inspects both the original and _ghost_ table and attempts to find at least one such unique key (or rather, a set of columns) that is shared between the two. Typically this would just be the `PRIMARY KEY`, but sometimes you may change the `PRIMARY KEY` itself, in which case `gh-ost` will look for other options.
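The search for a shared key can be modelled roughly as a set intersection. The sketch below is not how `gh-ost` is implemented; `shared_unique_keys` and the dictionary layout (key name mapped to its columns) are hypothetical, and it makes two simplifying assumptions: column order within a key is ignored, and nullability is not checked.

```python
def shared_unique_keys(before, after):
    """Return the column sets covered by a unique key in both tables.

    Each table is modelled as {key_name: (column, ...)}. Key names are
    ignored on purpose: only the covered columns matter.
    """
    before_sets = {frozenset(cols) for cols in before.values()}
    after_sets = {frozenset(cols) for cols in after.values()}
    return before_sets & after_sets


# Illustrative table definitions (unique keys only).
before = {"PRIMARY": ("id",), "its_uidx": ("i", "ts")}

# The two keys swap roles and names; both column sets are still unique.
swapped = {"PRIMARY": ("i", "ts"), "id_uidx": ("id",)}

# The primary key is widened; {id} alone is no longer unique in the new table.
widened = {"PRIMARY": ("id", "i")}

candidates_after_swap = shared_unique_keys(before, swapped)
candidates_after_widen = shared_unique_keys(before, widened)

print(len(candidates_after_swap))   # number of usable shared keys
print(len(candidates_after_widen))  # zero means the migration cannot run
```

Note that a key covering a superset of the original columns does not count: `(id, i)` being unique says nothing about `id` being unique on its own.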
35+
36+
`gh-ost` expects unique keys where no `NULL` values are found, i.e. all columns covered by the unique key are defined as `NOT NULL`. This is implicitly true for `PRIMARY KEY`s. If no such key can be found, `gh-ost` bails out. In the event there is no such key, but you happen to _know_ your columns have no `NULL` values even though they're `NULL`-able, you may take responsibility and pass the `--allow-nullable-unique-key` flag. The migration will run well as long as no `NULL` values are found in the unique key's columns. Any actual `NULL`s may corrupt the migration.
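Before overriding the check, it may be worth verifying the "no actual `NULL`s" claim rather than assuming it. A minimal sketch follows, again using Python's `sqlite3` as a stand-in for a MySQL connection (an assumption). `count_null_key_rows` and the table `t` are hypothetical; the generated `SELECT COUNT(*) ... IS NULL` query uses backtick quoting so that the same statement shape carries over to MySQL.

```python
import sqlite3


def count_null_key_rows(conn, table, key_columns):
    """Count rows in which at least one unique key column is NULL."""
    # Identifiers cannot be bound as query parameters, so this sketch
    # assumes `table` and `key_columns` come from a trusted source.
    predicate = " OR ".join(f"`{col}` IS NULL" for col in key_columns)
    query = f"SELECT COUNT(*) FROM `{table}` WHERE {predicate}"
    return conn.execute(query).fetchone()[0]


conn = sqlite3.connect(":memory:")
# A unique key over nullable columns: NULLs do not conflict with each
# other, so the key cannot identify rows that contain them.
conn.execute("CREATE TABLE t (i INTEGER, ts TEXT, UNIQUE (i, ts))")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(11, "a"), (13, None), (None, None)],
)

null_rows = count_null_key_rows(conn, "t", ["i", "ts"])
print(null_rows)  # any value above zero means the override is unsafe
```

A count of zero only describes the table at the time of the query; rows written later could still introduce `NULL` values.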
37+
38+
### Examples: allowed and not allowed
39+
40+
```
41+
create table some_table (
42+
id int auto_increment,
43+
ts timestamp,
44+
name varchar(128) not null,
45+
owner_id int not null,
46+
loc_id int,
47+
primary key(id),
48+
unique key name_uidx(name)
49+
)
50+
```
51+
52+
Following are examples of migrations that are _good to run_:
53+
54+
- `add column i int`
55+
- `add key owner_idx(owner_id)`
56+
- `add unique key owner_name_idx(owner_id, name)` - though you need to make sure to not write conflicting rows while this migration runs
57+
- `drop key name_uidx` - `primary key` is shared between the tables
58+
- `drop primary key, add primary key(owner_id, loc_id)` - `name_uidx` is shared between the tables and is used for migration
59+
- `change id bigint unsigned` - the `primary key` is used. The change of type still makes the `primary key` workable.
60+
- `drop primary key, drop key name_uidx, add primary key(name), add unique key id_uidx(id)` - swapping the two keys. `gh-ost` is still happy because `id` is still unique in both tables. So is `name`.
61+
62+
63+
Following are examples of migrations that _cannot run_:
64+
65+
- `drop primary key, drop key name_uidx` - no unique key on the _ghost_ table, so clearly cannot run
66+
- `drop primary key, drop key name_uidx, add primary key(name, owner_id)` - no unique key covers the same columns in both tables. Even though `name` exists in the _ghost_ table's `primary key`, it is only part of the key and in itself does not guarantee uniqueness in the _ghost_ table.
67+
68+
Also, you cannot run a migration on a table that doesn't have some form of `unique key` in the first place, such as `some_table (id int, ts timestamp)`.

localtests/fail-drop-pk/create.sql

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
1+
drop table if exists gh_ost_test;
2+
create table gh_ost_test (
3+
id int auto_increment,
4+
i int not null,
5+
ts timestamp,
6+
primary key(id)
7+
) auto_increment=1;
8+
9+
drop event if exists gh_ost_test;
10+
delimiter ;;
11+
create event gh_ost_test
12+
on schedule every 1 second
13+
starts current_timestamp
14+
ends current_timestamp + interval 60 second
15+
on completion not preserve
16+
enable
17+
do
18+
begin
19+
insert into gh_ost_test values (null, 11, now());
20+
insert into gh_ost_test values (null, 13, now());
21+
insert into gh_ost_test values (null, 17, now());
22+
end ;;
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
No PRIMARY nor UNIQUE key found in table

localtests/fail-drop-pk/extra_args

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
--alter="change id id int, drop primary key"
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
1+
drop table if exists gh_ost_test;
2+
create table gh_ost_test (
3+
id int auto_increment,
4+
i int not null,
5+
ts timestamp,
6+
primary key(id)
7+
) auto_increment=1;
8+
9+
drop event if exists gh_ost_test;
10+
delimiter ;;
11+
create event gh_ost_test
12+
on schedule every 1 second
13+
starts current_timestamp
14+
ends current_timestamp + interval 60 second
15+
on completion not preserve
16+
enable
17+
do
18+
begin
19+
insert into gh_ost_test values (null, 11, now());
20+
insert into gh_ost_test values (null, 13, now());
21+
insert into gh_ost_test values (null, 17, now());
22+
end ;;
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
No shared unique key can be found after ALTER
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
--alter="drop primary key, add primary key (id, i)"

localtests/swap-pk-uk/create.sql

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
1+
drop table if exists gh_ost_test;
2+
create table gh_ost_test (
3+
id bigint,
4+
i int not null,
5+
ts timestamp(6),
6+
primary key(id),
7+
unique key its_uidx(i, ts)
8+
) ;
9+
10+
drop event if exists gh_ost_test;
11+
delimiter ;;
12+
create event gh_ost_test
13+
on schedule every 1 second
14+
starts current_timestamp
15+
ends current_timestamp + interval 60 second
16+
on completion not preserve
17+
enable
18+
do
19+
begin
20+
insert into gh_ost_test values ((unix_timestamp() << 2) + 0, 11, now(6));
21+
insert into gh_ost_test values ((unix_timestamp() << 2) + 1, 13, now(6));
22+
insert into gh_ost_test values ((unix_timestamp() << 2) + 2, 17, now(6));
23+
insert into gh_ost_test values ((unix_timestamp() << 2) + 3, 19, now(6));
24+
end ;;

localtests/swap-pk-uk/extra_args

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
--alter="drop primary key, drop key its_uidx, add primary key (i, ts), add unique key id_uidx(id)"
