Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add helper to assert that tables are logically identical #42658

Closed
akshayjshah opened this issue Nov 21, 2019 · 2 comments
Closed

Add helper to assert that tables are logically identical #42658

akshayjshah opened this issue Nov 21, 2019 · 2 comments

Comments

@akshayjshah
Copy link
Contributor

akshayjshah commented Nov 21, 2019

Is your feature request related to a problem? Please describe.
Like many people, we need a way to manage SQL migrations in our codebase. Each migration includes an upward migration statement (ALTER TABLE foo ADD COLUMN bar INT) and a revert statement (ALTER TABLE foo DROP COLUMN bar). In CI, we run tests to ensure that each migration is idempotent. Those tests set up the table(s), query for the initial schema (SHOW CREATE TABLE foo), apply the migration, roll it back, query the schema again, and then assert that the two CREATE TABLE statements are the same.

In many cases, the two schemas aren't exactly identical. For example, dropping and re-adding a column changes ordering in a few places. We'd like a principled, correct way to determine that two CREATE TABLE statements describe logically-equivalent schemas.

Describe the solution you'd like
Our code happens to be written in Go. I'd love it if CRDB exposed an externally-consumable package of test helpers that included a function like this.

Describe alternatives you've considered
I've tried to use github.com/cockroachdb/cockroach/pkg/sql/parser, but it's not installable with modules:

../../pkg/mod/github.com/cockroachdb/[email protected]+incompatible/pkg/roachpb/metadata_replicas.go:17:2: module go.etcd.io/etcd@latest found (v3.3.17+incompatible), but does not contain package go.etcd.io/etcd/raft/confchange
../../pkg/mod/github.com/cockroachdb/[email protected]+incompatible/pkg/roachpb/metadata_replicas.go:18:2: module go.etcd.io/etcd@latest found (v3.3.17+incompatible), but does not contain package go.etcd.io/etcd/raft/quorum
../../pkg/mod/github.com/cockroachdb/[email protected]+incompatible/pkg/roachpb/metadata_replicas.go:20:2: module go.etcd.io/etcd@latest found (v3.3.17+incompatible), but does not contain package go.etcd.io/etcd/raft/tracker

(Migrating the whole CockroachDB project to modules is tracked in #35426.) Even if I could install the parser in my project, it takes quite a bit of code to make the assertion I'd like. Code using the parser is also somewhat difficult to follow unless you're familiar with the parser package: I'm sure I'd have trouble understanding my own code a year from now.

Instead, we've settled on the pragmatic-but-gross approach of string manipulation: with a bunch of parens-counting and comma-splitting operations, we chop the SQL up into "parts", drop the FAMILY definitions, and assert that the remaining sets of "parts" are identical. It's short and fairly easy to read, but subtly wrong in a hundred ways.

Additional context
Current string hackery that I'd like to replace:

type create struct {
	Name       string
	ColumnsEtc stringset.Set
}

func (c *create) Equal(other *create) bool {
	if c.Name != other.Name {
		return false
	}
	return c.ColumnsEtc.Equal(other.ColumnsEtc)
}

func parse(ddl string) (*create, error) {
	if !strings.HasPrefix(ddl, "CREATE TABLE") {
		return nil, fmt.Errorf("DDL isn't CREATE TABLE: %s", ddl)
	}
	// Strip off the CREATE TABLE.
	unparsed := strings.TrimPrefix(ddl, "CREATE TABLE")
	unparsed = strings.TrimSuffix(unparsed, ";")
	unparsed = strings.TrimSpace(unparsed)
	// Lop off the table name.
	parts := strings.SplitN(unparsed, " ", 2)
	stmt := &create{
		Name:       parts[0],
		ColumnsEtc: stringset.New(),
	}
	// Remainder should be table definition
	defn, err := elements(parts[1])
	if err != nil {
		return nil, fmt.Errorf("can't parse %q: %w", unparsed, err)
	}
	for _, col := range defn {
		if strings.HasPrefix(col, `FAMILY "primary"`) {
			// We don't care about the default column family...
			continue
		}
		if strings.HasPrefix(col, "FAMILY") {
			// ...but if we've introduced another family, we need to add family-comparison logic.
			return nil, fmt.Errorf("parser can't handle FAMILY clauses, found %q", col)
		}
		// This is a column definition, a constraint, or some other meaningful DDL clause.
		stmt.ColumnsEtc.Insert(col)
	}
	return stmt, nil
}

func elements(ddl string) ([]string, error) {
	// Trim whitespace and surrounding parens.
	unparsed := strings.TrimSpace(ddl)
	if unparsed[0] == '(' && unparsed[len(unparsed)-1] == ')' {
		unparsed = unparsed[1 : len(unparsed)-1]
	}
	// Split into elements, leaving parenthesized groups unsplit.
	var elems []string
	var el strings.Builder
	var parens int
	for _, r := range unparsed {
		switch r {
		case '(':
			parens++
			el.WriteRune(r)
		case ')':
			parens--
			if parens < 0 {
				return nil, errors.New("unmatched parens")
			}
			el.WriteRune(r)
		case ',':
			if parens == 0 {
				// Top-level element.
				elems = append(elems, strings.TrimSpace(el.String()))
				el.Reset()
			} else {
				// Separator protected by parens within element.
				el.WriteRune(r)
			}
		default:
			el.WriteRune(r)
		}
	}
	elems = append(elems, strings.TrimSpace(el.String()))
	if parens != 0 {
		return nil, errors.New("unmatched parens")
	}
	return elems, nil
}
@lopezator
Copy link
Contributor

@akshayjshah I've went through similar problems when trying to vendorize cockroachdb using modules for: https:/lopezator/sqlfmt

I've updated it recently to support v19.2.0 statements formatting, and had to add a few replaces in there, specifically:

replace (
	github.com/cockroachdb/apd => github.com/cockroachdb/apd/v2 v2.0.2-0.20190505045354-5843f9b6a583
	github.com/cockroachdb/cockroach => github.com/cockroachdb/cockroach-gen v0.0.0-20191119173932-8db914c12b6a
	go.etcd.io/etcd => go.etcd.io/etcd v0.5.0-alpha.5.0.20191113212143-63dd73c1869f
)

https:/lopezator/sqlfmt/blob/master/go.mod#L91-L95

Btw, your approach to migrations is very interesting!

Do you have some code uploaded somewhere? Maybe I could help.

We are using https:/lopezator/migrator for our db migrations and we've been thinking about testing migrations somehow in here: lopezator/migrator#7

Your approach give us more ideas to think of and research, so thank you for opening it!

@github-actions
Copy link

github-actions bot commented Jun 4, 2021

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
5 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants