Fixing Database tests and Snowflake Dialect - part 3 out of ... #10458

Merged
merged 83 commits into from
Jul 10, 2024
83 commits
32ab547
notes
radeusgd Jun 27, 2024
56638c8
fixing select columns tests
radeusgd Jun 28, 2024
7dd93c3
update conflict after #10372 - use sort now instead of order_by in tests
radeusgd Jun 28, 2024
76d977d
fix iif test
radeusgd Jun 28, 2024
036c012
test for https://github.com/enso-org/enso/issues/10402
radeusgd Jun 28, 2024
9a7e80f
update how we check value type for clearer errors
radeusgd Jun 28, 2024
7db1e29
fail hard if cannot connect to Snowflake instead of silently running …
radeusgd Jun 28, 2024
bb1c4bc
better message in test suite
radeusgd Jun 28, 2024
c9f6fcc
fix test
radeusgd Jul 1, 2024
d4d14ce
fixing COUNT DISTINCT and trying to fix FIRST
radeusgd Jul 1, 2024
722c2b2
disable first / last
radeusgd Jul 1, 2024
64d1c06
fix COUNT DISTINCT ignoring NULLs
radeusgd Jul 1, 2024
a13437e
checkpoint
radeusgd Jul 1, 2024
4a45ab9
optimize aggregate by sharing tables/connection - create a 2.5k rows …
radeusgd Jul 1, 2024
17e45dd
fixing shortest/longest
radeusgd Jul 1, 2024
ec4050f
unused var
radeusgd Jul 1, 2024
63b04d5
workaround for https://github.com/enso-org/enso/issues/10412
radeusgd Jul 1, 2024
8c4e8a6
fix empty COUNT aggs
radeusgd Jul 1, 2024
ca11ce3
naming tests
radeusgd Jul 1, 2024
d271f3b
re-using same connection in tests
radeusgd Jul 1, 2024
be9aa99
almost 50% faster Core_Spec by more tables sharing
radeusgd Jul 1, 2024
b708e76
sort tables in Core_Spec
radeusgd Jul 1, 2024
67bc315
must be same connection
radeusgd Jul 1, 2024
45e65ee
more sharing
radeusgd Jul 1, 2024
493897c
sharing in Date_Time_Spec
radeusgd Jul 1, 2024
4ab02fd
cross tab share connection
radeusgd Jul 1, 2024
ed16ea9
support setting Date column
radeusgd Jul 2, 2024
d2d03da
wip
radeusgd Jul 2, 2024
cf30d26
WIP
radeusgd Jul 2, 2024
3abd986
correctly escape regex characters in test name patterns - otherwise t…
radeusgd Jul 2, 2024
375ebed
round trip Date_Time column
radeusgd Jul 2, 2024
65aeff3
add a test for round-trip of Date_Time without TZ in supported DBs
radeusgd Jul 2, 2024
c2c2d4a
fixing timestamp without TZ round trip, adding implicit conversion no…
radeusgd Jul 2, 2024
fcf9b46
ignore internal problems
radeusgd Jul 2, 2024
49c0e05
fix timestamp without TZ round trip
radeusgd Jul 2, 2024
0a529a1
avoid warning for implicit conversion
radeusgd Jul 2, 2024
2d8a257
remove print
radeusgd Jul 2, 2024
8e8b5a0
fix test - a date time coercion warning in update is actually expecte…
radeusgd Jul 2, 2024
ca48da6
implementing date operations in Snowflake (date_add, date_part, date_…
radeusgd Jul 3, 2024
eea4618
merge default in-memory setup with the inline one
radeusgd Jul 3, 2024
b36e6b0
tell tests Snowflake actually supports nanos
radeusgd Jul 3, 2024
9673cda
supported aggregates
radeusgd Jul 3, 2024
1267cca
align Time_Period.Day to become Date_Period.Day in Date operations to…
radeusgd Jul 3, 2024
bb77691
updating date add edge cases, DST tests...
radeusgd Jul 3, 2024
b5ddad9
extract edge cases for https://github.com/enso-org/enso/issues/10438 …
radeusgd Jul 3, 2024
73d5007
more informative value type check
radeusgd Jul 4, 2024
afc8394
Revert "more informative value type check"
radeusgd Jul 4, 2024
8a801ef
checkpoint: fixing date operation types and integer type checks etc.
radeusgd Jul 4, 2024
31079f7
migrating to backend specific definition of Integer
radeusgd Jul 4, 2024
2de71da
fix
radeusgd Jul 4, 2024
32a5387
batch rounding
radeusgd Jul 4, 2024
2aa121b
fixing batching
radeusgd Jul 5, 2024
c1034d4
not run some advanced tests
radeusgd Jul 5, 2024
6bb64d5
improve iif
radeusgd Jul 5, 2024
8affc51
parametrized needs execute
radeusgd Jul 5, 2024
550fa0e
add a test
radeusgd Jul 5, 2024
6ac17e0
regex
radeusgd Jul 5, 2024
a038bd2
misc
radeusgd Jul 5, 2024
f397f3b
testing literals
radeusgd Jul 5, 2024
1a68567
more tests for literal types
radeusgd Jul 5, 2024
4d3b68d
even more tests for literal types
radeusgd Jul 5, 2024
ce31419
fix literals with some casts
radeusgd Jul 5, 2024
91bd957
fixing tests
radeusgd Jul 5, 2024
ea2ab6a
disable variable length as not supported, sort table
radeusgd Jul 5, 2024
99db14f
fixing By_Type selection and Conversion_Spec
radeusgd Jul 5, 2024
cb76cfc
Merge branch 'refs/heads/develop' into wip/radeusgd/snowflake-dialect-3
radeusgd Jul 5, 2024
66be4ba
fix a few types
radeusgd Jul 5, 2024
edf512f
remove commented out code
radeusgd Jul 5, 2024
9ba1e41
one more workaround for https://github.com/enso-org/enso/issues/10438
radeusgd Jul 6, 2024
63c5111
Merge branch 'refs/heads/develop' into wip/radeusgd/snowflake-dialect-3
radeusgd Jul 6, 2024
8dab4b8
fix
radeusgd Jul 6, 2024
81b2c82
workaround for https://github.com/enso-org/enso/issues/10465
radeusgd Jul 6, 2024
9f4fcad
Merge branch 'refs/heads/develop' into wip/radeusgd/snowflake-dialect-3
radeusgd Jul 8, 2024
f9bbbe0
cache types from the query to avoid re-fetching queries for each sepa…
radeusgd Jul 8, 2024
f6cbe2e
Merge branch 'refs/heads/develop' into wip/radeusgd/snowflake-dialect-3
radeusgd Jul 8, 2024
cb679b8
fix secret test
radeusgd Jul 8, 2024
ea71e27
update spec to treat integer roundtrip accordingly
radeusgd Jul 8, 2024
49484f4
Merge branch 'refs/heads/develop' into wip/radeusgd/snowflake-dialect-3
radeusgd Jul 8, 2024
ff21e8c
Merge branch 'refs/heads/develop' into wip/radeusgd/snowflake-dialect-3
radeusgd Jul 10, 2024
19c458f
addressing CR comments
radeusgd Jul 10, 2024
5f4f102
adapt after #10483
radeusgd Jul 10, 2024
dd682e0
adapt after #10474
radeusgd Jul 10, 2024
f2cabe6
fix typo
radeusgd Jul 10, 2024
@@ -122,8 +122,10 @@ type Redshift_Dialect
Internal_Column.Value column.name new_sql_type_reference new_expression

## PRIVATE
needs_execute_query_for_type_inference : Boolean
needs_execute_query_for_type_inference self = False
needs_execute_query_for_type_inference : Text | SQL_Statement -> Boolean
needs_execute_query_for_type_inference self statement =
_ = statement
False

## PRIVATE
supports_separate_nan : Boolean
@@ -198,3 +200,9 @@ type Redshift_Dialect
## TODO special behaviour for big integer columns should be added here, once we start testing this dialect again
See: https://docs.aws.amazon.com/redshift/latest/dg/r_Numeric_types201.html#r_Numeric_types201-decimal-or-numeric-type
column.value_type

## PRIVATE
needs_literal_table_cast : Value_Type -> Boolean
needs_literal_table_cast self value_type =
_ = value_type
False
@@ -318,18 +318,26 @@ type Connection

Arguments:
- statement: SQL_Statement to execute.
- column_type_suggestions: A vector of SQL type references that can act
as suggested column types. By default, the overrides are respected and
types that should be computed by the database are passed as `Nothing`
to ensure that default `ResultSet` metadata is used for these columns.
- column_types: A vector of SQL type references that can act as suggested
column types. Only `Override` references override the type. Other kinds
of references do not influence the result. `Computed_By_Database`
references may get updated to cache the types fetched from the Database.
- last_row_only: If set true, only the last row of the query is fetched.
Defaults to false.
read_statement : SQL_Statement -> (Nothing | Vector SQL_Type_Reference) -> Boolean -> Table
read_statement self statement column_type_suggestions=Nothing last_row_only=False =
type_overrides = self.dialect.get_type_mapping.prepare_type_overrides column_type_suggestions
read_statement self statement column_types=Nothing last_row_only=False =
type_overrides = self.dialect.get_type_mapping.prepare_type_overrides column_types
statement_setter = self.dialect.get_statement_setter
self.jdbc_connection.with_prepared_statement statement statement_setter stmt->
rs = stmt.executeQuery

# If column types were provided, we will cache the types that were not yet cached.
column_types.if_not_nothing <|
metadata = rs.getMetaData
column_types.each_with_index ix-> sql_type_reference->
sql_type_reference.cache_computed_type <| SQL_Type.from_metadata metadata ix+1

# And finally, materialize the results.
SQL_Warning_Helper.process_warnings stmt <|
result_set_to_table rs self.dialect.get_type_mapping.make_column_fetcher type_overrides last_row_only
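
The caching step added to `read_statement` can be modelled outside Enso. The following is a minimal Python sketch, not the Enso implementation: `TypeReference`, `slow_fetch`, and the string type names are hypothetical stand-ins, and only the rule described in the doc comment is modelled (overrides stay as they are, references without a cached type pick it up from the result metadata).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class TypeReference:
    """Hypothetical stand-in for SQL_Type_Reference (not the Enso API)."""
    is_override: bool = False
    cached: Optional[str] = None
    fetch: Optional[Callable[[], str]] = None  # models an extra database round trip

    def cache_computed_type(self, sql_type: str) -> None:
        # Overrides and already-cached references are left untouched.
        if not self.is_override and self.cached is None:
            self.cached = sql_type

    def get(self) -> str:
        # Falls back to the expensive fetch only when nothing was cached.
        if self.cached is None:
            self.cached = self.fetch()
        return self.cached


def read_statement(metadata_types: List[str],
                   column_types: Optional[List[TypeReference]] = None) -> List[str]:
    """Models only the caching step; `metadata_types` plays the role of the result set metadata."""
    if column_types is not None:
        for ix, ref in enumerate(column_types):
            ref.cache_computed_type(metadata_types[ix])
    return metadata_types


fetch_calls = []


def slow_fetch() -> str:
    fetch_calls.append("fetch")
    return "NUMBER(38,0)"


override = TypeReference(is_override=True, cached="VARCHAR")
computed = TypeReference(fetch=slow_fetch)
read_statement(["TEXT", "NUMBER(38,0)"], [override, computed])
```

In this model, reading the type of `computed` after `read_statement` no longer triggers `slow_fetch`, which is the saving the commit "cache types from the query to avoid re-fetching queries" appears to aim for.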

@@ -338,7 +346,7 @@ type Connection
result set.
fetch_columns : Text | SQL_Statement -> Statement_Setter -> Any
fetch_columns self statement statement_setter =
needs_execute_query = self.dialect.needs_execute_query_for_type_inference
needs_execute_query = self.dialect.needs_execute_query_for_type_inference statement
self.jdbc_connection.raw_fetch_columns statement needs_execute_query statement_setter

## PRIVATE
23 changes: 16 additions & 7 deletions distribution/lib/Standard/Database/0.0.0-dev/src/DB_Column.enso
@@ -181,6 +181,13 @@ type DB_Column
inferred_precise_value_type self =
self.value_type

## PRIVATE
Internal hook that says if a given column should be selected by a
specific type in a `By_Type` selection.
should_be_selected_by_type self (value_type : Value_Type) -> Boolean =
type_mapping = self.connection.dialect.get_type_mapping
type_mapping.is_same_type self.value_type value_type

## ICON convert
Returns an SQL statement that will be used for materializing this column.
to_sql : SQL_Statement
@@ -1299,7 +1306,7 @@ type DB_Column
Examples.text_column_1.text_left 5
text_left : DB_Column|Integer -> DB_Column
text_left self n =
Value_Type.expect_text self <| Value_Type.expect_integer n <|
Value_Type.expect_text self <| Helpers.expect_dialect_specific_integer_type self n <|
n2 = n.max 0
new_name = self.naming_helper.function_name "text_left" [self, n]
self.make_binary_op "LEFT" n2 new_name
@@ -1320,7 +1327,7 @@ type DB_Column
Examples.text_column_1.text_right 5
text_right : DB_Column|Integer -> DB_Column
text_right self n =
Value_Type.expect_text self <| Value_Type.expect_integer n <|
Value_Type.expect_text self <| Helpers.expect_dialect_specific_integer_type self n <|
n2 = n.max 0
new_name = self.naming_helper.function_name "text_right" [self, n]
self.make_binary_op "RIGHT" n2 new_name
@@ -1618,9 +1625,10 @@ type DB_Column
Value_Type.expect_type self .is_date_or_time "date/time" <|
my_type = self.inferred_precise_value_type
Value_Type.expect_type end (== my_type) my_type.to_display_text <|
Date_Time_Helpers.check_period_aligned_with_value_type my_type period <|
aligned_period = Date_Time_Helpers.align_period_with_value_type my_type period
aligned_period.if_not_error <|
new_name = self.naming_helper.function_name "date_diff" [self, end, period.to_display_text]
metadata = self.connection.dialect.prepare_metadata_for_period period my_type
metadata = self.connection.dialect.prepare_metadata_for_period aligned_period my_type
self.make_op "date_diff" [end] new_name metadata

## GROUP Standard.Base.DateTime
@@ -1647,10 +1655,11 @@ type DB_Column
date_add self amount (period : Date_Period | Time_Period = default_date_period self) =
Value_Type.expect_type self .is_date_or_time "date/time" <|
my_type = self.inferred_precise_value_type
Value_Type.expect_integer amount <|
Date_Time_Helpers.check_period_aligned_with_value_type my_type period <|
Helpers.expect_dialect_specific_integer_type self amount <|
aligned_period = Date_Time_Helpers.align_period_with_value_type my_type period
aligned_period.if_not_error <|
new_name = self.naming_helper.function_name "date_add" [self, amount, period.to_display_text]
metadata = self.connection.dialect.prepare_metadata_for_period period my_type
metadata = self.connection.dialect.prepare_metadata_for_period aligned_period my_type
self.make_op "date_add" [amount] new_name metadata

## GROUP Standard.Base.Logical
39 changes: 24 additions & 15 deletions distribution/lib/Standard/Database/0.0.0-dev/src/DB_Table.enso
@@ -1085,7 +1085,13 @@ type DB_Table
_ -> type_mapping.value_type_to_sql argument_value_type Problem_Behavior.Ignore
expr = SQL_Expression.Constant value
new_type_ref = SQL_Type_Reference.from_constant sql_type
DB_Column.Value value.pretty self.connection new_type_ref expr self.context
base_column = Internal_Column.Value value.pretty new_type_ref expr
needs_cast = argument_value_type.is_nothing.not && self.connection.dialect.needs_literal_table_cast argument_value_type
result_internal_column = if needs_cast.not then base_column else
infer_type_from_database new_expression =
SQL_Type_Reference.new self.connection self.context new_expression
self.connection.dialect.make_cast base_column sql_type infer_type_from_database
self.make_column result_internal_column

## PRIVATE
Create a unique temporary column name.
@@ -1153,8 +1159,8 @@ type DB_Table
last_row self =
if self.internal_columns.is_empty then Error.throw (Illegal_Argument.Error "Cannot create a table with no columns.") else
sql = self.to_sql
column_type_suggestions = self.internal_columns.map .sql_type_reference
table = self.connection.read_statement sql column_type_suggestions last_row_only=True
column_types = self.internal_columns.map .sql_type_reference
table = self.connection.read_statement sql column_types last_row_only=True
table.rows.first

## ALIAS sort
@@ -2596,8 +2602,8 @@ type DB_Table
Rows_To_Read.First_With_Warning n -> self.limit n+1

sql = preprocessed.to_sql
column_type_suggestions = preprocessed.internal_columns.map .sql_type_reference
materialized_table = self.connection.read_statement sql column_type_suggestions . catch SQL_Error sql_error->
column_types = preprocessed.internal_columns.map .sql_type_reference
materialized_table = self.connection.read_statement sql column_types . catch SQL_Error sql_error->
Error.throw (self.connection.dialect.get_error_mapper.transform_custom_errors sql_error)

warnings_builder = Builder.new
@@ -3055,23 +3061,26 @@ make_literal_table connection column_vectors column_names alias =
if total_size == 0 then Error.throw (Illegal_Argument.Error "Vectors cannot be empty") else
if total_size > MAX_LITERAL_ELEMENT_COUNT then Error.throw (Illegal_Argument.Error "Too many elements for table literal ("+total_size.to_text+"): materialize a table into the database instead") else
type_mapping = connection.dialect.get_type_mapping

values_to_type_ref column_vector =
value_type = Value_Type_Helpers.find_common_type_for_arguments column_vector
sql_type = case value_type of
Nothing -> SQL_Type.null
_ -> type_mapping.value_type_to_sql value_type Problem_Behavior.Ignore
SQL_Type_Reference.from_constant sql_type

from_spec = From_Spec.Literal_Values column_vectors column_names alias
context = Context.for_subquery from_spec

infer_type_from_database new_expression =
SQL_Type_Reference.new connection context new_expression

internal_columns = 0.up_to column_vectors.length . map i->
column_vector = column_vectors.at i
column_name = column_names.at i

type_ref = values_to_type_ref column_vector.to_vector
value_type = Value_Type_Helpers.find_common_type_for_arguments column_vector.to_vector
sql_type = case value_type of
Nothing -> SQL_Type.null
_ -> type_mapping.value_type_to_sql value_type Problem_Behavior.Ignore
type_ref = SQL_Type_Reference.from_constant sql_type
sql_expression = SQL_Expression.Column alias column_name
Internal_Column.Value column_name type_ref sql_expression
base_column = Internal_Column.Value column_name type_ref sql_expression

needs_cast = value_type.is_nothing.not && connection.dialect.needs_literal_table_cast value_type
if needs_cast.not then base_column else
connection.dialect.make_cast base_column sql_type infer_type_from_database

DB_Table.Value alias connection internal_columns context
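
The `needs_cast` decision in `make_literal_table` can be illustrated with a small Python model. This is only a sketch under stated assumptions: `literal_column_sql`, `never_cast`, and `always_cast` are hypothetical names, the generated SQL text is simplified, and the real code goes through `dialect.make_cast` rather than building strings.

```python
from typing import Callable, Optional


def literal_column_sql(column_name: str,
                       value_type: Optional[str],
                       sql_type: str,
                       needs_literal_table_cast: Callable[[str], bool]) -> str:
    """Return simplified SQL text for one column of a literal table."""
    # Naive identifier quoting, for illustration only.
    base = '"' + column_name.replace('"', '""') + '"'
    # Mirrors: value_type.is_nothing.not && dialect.needs_literal_table_cast value_type
    needs_cast = value_type is not None and needs_literal_table_cast(value_type)
    if not needs_cast:
        return base
    return f"CAST({base} AS {sql_type})"


def never_cast(value_type: str) -> bool:
    # Matches the Postgres and Redshift implementations in this diff, which return False.
    return False


def always_cast(value_type: str) -> bool:
    # Hypothetical dialect that wants an explicit cast on every typed literal column.
    return True
```

A column whose common value type could not be determined (modelled here as `None`) is never cast, regardless of what the dialect answers.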
@@ -127,8 +127,12 @@ type Dialect
executing the query. In some however, like SQLite, this is insufficient
and will yield incorrect results, so the query needs to be executed (even
though the full results may not need to be streamed).
needs_execute_query_for_type_inference : Boolean
needs_execute_query_for_type_inference self =

The function takes the statement as an argument which can be used in
heuristics telling whether the execute is needed.
needs_execute_query_for_type_inference : Text | SQL_Statement -> Boolean
needs_execute_query_for_type_inference self statement =
_ = statement
Unimplemented.throw "This is an interface only."

## PRIVATE
@@ -2,6 +2,8 @@ from Standard.Base import all
import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
from Standard.Base.Runtime import assert

from Standard.Table import Value_Type

import project.DB_Column.DB_Column
import project.DB_Table.DB_Table
import project.Internal.IR.Internal_Column.Internal_Column
@@ -81,3 +83,10 @@ rename_internal_columns : Vector Internal_Column -> Vector Text -> Vector Intern
rename_internal_columns columns new_names =
columns.zip new_names col-> name->
col.rename name

## PRIVATE
Checks if the `argument` has an integer type (as defined by the dialect associated with `related_column`).
See `SQL_Type_Mapping.is_integer_type` for details.
expect_dialect_specific_integer_type related_column argument ~action =
type_mapping = related_column.connection.dialect.get_type_mapping
Value_Type.expect_type argument type_mapping.is_integer_type "Integer" action
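
To show how delegating the integer check to the type mapping changes behaviour, here is a minimal Python model. It is an assumption-laden sketch, not the Enso code: `ValueType`, `DefaultTypeMapping`, and `DecimalAsIntegerTypeMapping` are hypothetical, and the rule "a Decimal with scale 0 counts as an integer" is a guess at what a Snowflake-like mapping might do, since that mapping is not part of the hunks shown here.

```python
from dataclasses import dataclass
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class ValueType:
    """Hypothetical, simplified stand-in for Value_Type."""
    name: str
    scale: Optional[int] = None


class DefaultTypeMapping:
    def is_integer_type(self, value_type: ValueType) -> bool:
        # Corresponds to the `.is_integer` default used by Postgres in this diff.
        return value_type.name == "Integer"


class DecimalAsIntegerTypeMapping(DefaultTypeMapping):
    def is_integer_type(self, value_type: ValueType) -> bool:
        # Assumed rule: scale-0 decimals are accepted as integers.
        return value_type.name == "Integer" or (
            value_type.name == "Decimal" and value_type.scale == 0
        )


def expect_dialect_specific_integer_type(type_mapping: DefaultTypeMapping,
                                         argument_type: ValueType,
                                         action: Callable[[], T]) -> T:
    """Run `action` only if the dialect treats `argument_type` as an integer."""
    if not type_mapping.is_integer_type(argument_type):
        raise TypeError(f"Expected an Integer type, got {argument_type.name}.")
    return action()
```

With the default mapping, a `Decimal` argument to something like `text_left` is rejected; with the decimal-as-integer mapping, the same call proceeds.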
@@ -158,8 +158,10 @@ type Postgres_Dialect
Internal_Column.Value column.name new_sql_type_reference new_expression

## PRIVATE
needs_execute_query_for_type_inference : Boolean
needs_execute_query_for_type_inference self = False
needs_execute_query_for_type_inference : Text | SQL_Statement -> Boolean
needs_execute_query_for_type_inference self statement =
_ = statement
False

## PRIVATE
supports_separate_nan : Boolean
@@ -302,6 +304,12 @@ type Postgres_Dialect
False -> base_type
_ -> base_type

## PRIVATE
needs_literal_table_cast : Value_Type -> Boolean
needs_literal_table_cast self value_type =
_ = value_type
False

## PRIVATE
make_dialect_operations =
cases = [["LOWER", Base_Generator.make_function "LOWER"], ["UPPER", Base_Generator.make_function "UPPER"]]
@@ -664,6 +672,8 @@ make_date_add arguments (metadata : Date_Period_Metadata) =
"secs=>0.001"
Time_Period.Microsecond ->
"secs=>0.000001"
Time_Period.Nanosecond ->
Panic.throw (Illegal_State.Error "Nanosecond precision is not supported by Postgres.")
interval_expression = SQL_Builder.code "make_interval(" ++ interval_arg ++ ")"
shifted = SQL_Builder.code "(" ++ expr ++ " + (" ++ amount ++ " * " ++ interval_expression ++ "))"
case metadata.input_value_type of
@@ -125,6 +125,19 @@ type Postgres_Type_Mapping
value_type = self.sql_type_to_value_type sql_type
Column_Fetcher_Module.default_fetcher_for_value_type value_type

## PRIVATE
is_implicit_conversion (source_type : Value_Type) (target_type : Value_Type) -> Boolean =
# Currently, we do not have any implicit conversions.
_ = [source_type, target_type]
False

## PRIVATE
is_integer_type (value_type : Value_Type) -> Boolean = value_type.is_integer

## PRIVATE
is_same_type (value_type1 : Value_Type) (value_type2 : Value_Type) -> Boolean =
value_type1.is_same_type value_type2

## PRIVATE
simple_types_map = Dictionary.from_vector <|
ints = [[Types.SMALLINT, Value_Type.Integer Bits.Bits_16], [Types.BIGINT, Value_Type.Integer Bits.Bits_64], [Types.INTEGER, Value_Type.Integer Bits.Bits_32]]
@@ -89,6 +89,40 @@ type SQL_Type_Mapping
_ = column_type_suggestions
Unimplemented.throw "This is an interface only."

## PRIVATE
Checks if the conversion between the two types is one to be done implicitly in the given backend.
Conversions marked as implicit will not raise Inexact_Type_Coercion warnings.

For example, the Snowflake database converts all integer types to NUMERIC(38, 0).
This conversion is a property of the database, so warning about it would only be annoying.
is_implicit_conversion (source_type : Value_Type) (target_type : Value_Type) -> Boolean =
_ = [source_type, target_type]
Unimplemented.throw "This is an interface only."

## PRIVATE
Specifies if this backend recognizes the given type as an integer type.

For most backends, this should just be `.is_integer`.
However, in some backends (e.g. Snowflake), the Decimal type is treated
as the main Integer type, so this method can be used to specify that.
We don't make Decimal type an integer type by default, as in other
backends we do want to keep the distinction (for example in Postgres,
`date_add` function will work with Integer but not with Decimal types).
is_integer_type (value_type : Value_Type) -> Boolean =
_ = value_type
Unimplemented.throw "This is an interface only."

## PRIVATE
Checks if the two types are to be considered the same by the `By_Type`
selector.

In most backends this can just delegate to `Value_Type.is_same_type`. But
e.g. in Snowflake this can be used to make Decimal and Integer types
interchangeable.
is_same_type (value_type1 : Value_Type) (value_type2 : Value_Type) -> Boolean =
_ = [value_type1, value_type2]
Unimplemented.throw "This is an interface only."

## PRIVATE
default_sql_type_to_text sql_type =
suffix = case sql_type.precision of
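
The `is_same_type` and `is_implicit_conversion` hooks documented in this hunk can be sketched in Python as follows. Everything here is illustrative: value types are plain strings, `StrictMapping` and `IntegerAsDecimalMapping` are hypothetical classes, and the second one only approximates the Snowflake behaviour mentioned in the doc comments (integers stored as `NUMERIC(38, 0)`).

```python
from typing import List, Tuple

INTEGER_LIKE = {"Integer", "Decimal"}


class StrictMapping:
    """Models the Postgres mapping in this diff: no implicit conversions."""

    def is_same_type(self, a: str, b: str) -> bool:
        return a == b

    def is_implicit_conversion(self, source: str, target: str) -> bool:
        return False


class IntegerAsDecimalMapping(StrictMapping):
    """Hypothetical backend where Integer and Decimal are interchangeable."""

    def is_same_type(self, a: str, b: str) -> bool:
        return a == b or {a, b} <= INTEGER_LIKE

    def is_implicit_conversion(self, source: str, target: str) -> bool:
        return source == "Integer" and target == "Decimal"


def select_by_type(columns: List[Tuple[str, str]], wanted: str,
                   mapping: StrictMapping) -> List[str]:
    # Models a By_Type selection: keep columns the backend considers `wanted`.
    return [name for name, value_type in columns
            if mapping.is_same_type(value_type, wanted)]


def coercion_warnings(source: str, target: str, mapping: StrictMapping) -> List[str]:
    # Implicit conversions are a property of the backend, so they stay silent.
    if source == target or mapping.is_implicit_conversion(source, target):
        return []
    return [f"Inexact_Type_Coercion: {source} -> {target}"]
```

Under the strict mapping, selecting by `Integer` skips `Decimal` columns and an Integer-to-Decimal change is reported; under the hypothetical interchangeable mapping both are treated as expected behaviour.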