New JDBC driver #747

Merged
merged 10 commits into from
Nov 28, 2021

@zhicwu (Contributor) commented Oct 17, 2021

I dropped most of the changes in #743 because 1) the mixed APIs are confusing and 2) upgrading would be risky.

This pull request adds a brand-new JDBC driver, com.clickhouse.jdbc.ClickHouseDriver, with minimal changes to legacy code, for better backward compatibility and visibility into what's coming next.

TODOs:

  • primitive array support
  • new connection string
    • support both jdbc:clickhouse: and jdbc:ch: as prefix
    • protocol defaults to http, so jdbc:ch://localhost is the same as jdbc:ch:http://localhost
    • format: RowBinaryWithNamesAndTypes
  • new implementation under com.clickhouse.jdbc
  • move SQL parser to com.clickhouse.jdbc.parser
  • implement clickhouse-http-client and make it default provider for the new driver
  • new and enhanced JDBC driver built on top of clickhouse-client:
    • ClickHouseProperties/Map/Properties -> ClickHouseConfig
    • ClickHouseConnection -> ClickHouseClient
    • ClickHouseStatement -> ClickHouseRequest
    • ClickHouseResultSet -> ClickHouseRecord
  • multiple shaded jars (like in clickhouse-grpc-client) for the JDBC driver
    • default: both old and new drivers with http client, < 3MB
    • all: both old and new drivers with http and grpc (netty + okhttp) clients, < 20MB
    • http: new driver and http client (excluding Apache http client), < 1MB
    • grpc: new driver and grpc (netty only) client, < 18MB
  • better support of metadata and SQLState
    • more data types
    • more table types: DICTIONARY, LOG TABLE, MEMORY TABLE, REMOTE TABLE, TABLE, VIEW, SYSTEM TABLE, TEMPORARY TABLE
    • return data skipping indices and projection from DatabaseMetaData.getIndexInfo
    • return JDBC datasources from DatabaseMetaData.getSchemas
  • setClientInfo is supported, and a leading comment in a query can be extracted as log_comment (in system.query_log)
  • new options for customizing headers and query parameters (only work when using the http protocol)
    • custom_http_headers: "User-Agent=New Client, X-Forwarded-For=1\,2"
    • custom_http_params: "max_execution_time=30, mutations_sync=2"
  • jdbc_compliant option to enable:
    • fake transaction (including savepoint) support
    • enhanced parser that supports standard synchronous UPDATE and DELETE statements
      • update a set c1=1 where c2=2 -> alter table a update c1=1 where c2=2 settings mutations_sync=1
      • delete from a -> truncate table a
      • delete from a where c1=1 -> alter table a delete where c1=1 settings mutations_sync=1
  • support timestamp with timezone
  • support queries with a custom format (e.g. select * from system.query_log limit 5 format JSONEachRow) - only works for text formats
  • improve prepared statement
    • ternary operator support - select 1 ? 'a' : 'b', 2 ? (select 1) : 2, ? contains only one parameter (the last ?)
    • type inference from the input function, and maybe magic comments?
    • parameter templating based on the first value?
  • shared test cases for clickhouse-grpc-client and clickhouse-http-client
  • better support of AggregateFunction
  • optimize Bitmap support by implementing the ClickHouseValue interface and using InputStream/OutputStream instead of an in-memory byte array
  • make Gson optional
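
To illustrate the connection string item in the list above, here is a minimal sketch of how the two prefixes and the http default could be normalized. It is not the driver's actual implementation: the class name JdbcUrlSketch and the method normalize are hypothetical, and accepting other protocol names (for example grpc) in the URL is an assumption.

```java
import java.util.Locale;

/** Hypothetical helper (not part of the driver) that normalizes the two URL prefixes. */
final class JdbcUrlSketch {
    private static final String LONG_PREFIX = "jdbc:clickhouse:";
    private static final String SHORT_PREFIX = "jdbc:ch:";
    private static final String DEFAULT_PROTOCOL = "http";

    /** Returns "protocol://host..." for an accepted URL, or throws IllegalArgumentException. */
    static String normalize(String url) {
        String rest;
        if (url.startsWith(LONG_PREFIX)) {
            rest = url.substring(LONG_PREFIX.length());
        } else if (url.startsWith(SHORT_PREFIX)) {
            rest = url.substring(SHORT_PREFIX.length());
        } else {
            throw new IllegalArgumentException("Unsupported JDBC URL: " + url);
        }
        // No explicit protocol ("//host..."): fall back to http.
        if (rest.startsWith("//")) {
            return DEFAULT_PROTOCOL + ":" + rest;
        }
        // Explicit protocol ("http://host...", or another protocol name by assumption).
        int sep = rest.indexOf("://");
        if (sep <= 0) {
            throw new IllegalArgumentException("Missing '//' after prefix: " + url);
        }
        return rest.substring(0, sep).toLowerCase(Locale.ROOT) + rest.substring(sep);
    }

    public static void main(String[] args) {
        System.out.println(normalize("jdbc:ch://localhost"));
        System.out.println(normalize("jdbc:ch:http://localhost"));
    }
}
```

Both calls in main print the same URL, which is the equivalence the list item describes.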
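
The custom_http_headers and custom_http_params values in the list above are comma-separated key=value pairs in which a literal comma is written as \, (as in X-Forwarded-For=1\,2). A rough sketch of how such a value could be parsed follows. The class OptionMapSketch is hypothetical, and trimming whitespace around keys and values is an assumption rather than documented driver behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parser for "k1=v1, k2=v2" option values with "\," as an escaped comma. */
final class OptionMapSketch {
    static Map<String, String> parse(String value) {
        Map<String, String> result = new LinkedHashMap<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '\\' && i + 1 < value.length() && value.charAt(i + 1) == ',') {
                current.append(','); // escaped comma stays inside the current pair
                i++;
            } else if (c == ',') {
                addPair(current.toString(), result); // unescaped comma ends the pair
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        addPair(current.toString(), result);
        return result;
    }

    private static void addPair(String pair, Map<String, String> out) {
        int eq = pair.indexOf('=');
        if (eq < 0) {
            return; // assumption: silently skip fragments without '='
        }
        String key = pair.substring(0, eq).trim();
        if (!key.isEmpty()) {
            out.put(key, pair.substring(eq + 1).trim());
        }
    }

    public static void main(String[] args) {
        // The Java literal "1\\,2" is the three characters 1\,2 from the PR description.
        System.out.println(parse("User-Agent=New Client, X-Forwarded-For=1\\,2"));
    }
}
```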
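
The three UPDATE/DELETE rewrites listed under jdbc_compliant can be reproduced with a deliberately naive, regex-based sketch. The driver itself relies on a SQL parser, so the class MutationRewriteSketch below is only a hypothetical illustration of the mapping; it ignores comments, quoting, trailing semicolons and multi-statement input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical, regex-based illustration of the jdbc_compliant statement rewrites. */
final class MutationRewriteSketch {
    private static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.DOTALL;
    // update <table> set <assignments and optional where clause>
    private static final Pattern UPDATE =
            Pattern.compile("^\\s*update\\s+(\\S+)\\s+set\\s+(.+)$", FLAGS);
    // delete from <table> [where <condition>]
    private static final Pattern DELETE =
            Pattern.compile("^\\s*delete\\s+from\\s+(\\S+)(?:\\s+(where\\s+.+))?\\s*$", FLAGS);

    static String rewrite(String sql) {
        Matcher m = UPDATE.matcher(sql);
        if (m.matches()) {
            return "alter table " + m.group(1) + " update " + m.group(2).trim()
                    + " settings mutations_sync=1";
        }
        m = DELETE.matcher(sql);
        if (m.matches()) {
            if (m.group(2) == null) {
                return "truncate table " + m.group(1); // no WHERE clause: remove all rows
            }
            return "alter table " + m.group(1) + " delete " + m.group(2).trim()
                    + " settings mutations_sync=1";
        }
        return sql; // anything else passes through unchanged
    }

    public static void main(String[] args) {
        System.out.println(rewrite("update a set c1=1 where c2=2"));
        System.out.println(rewrite("delete from a"));
        System.out.println(rewrite("delete from a where c1=1"));
    }
}
```

The setting mutations_sync=1 is what makes the rewritten statement wait for the mutation to finish, which is why the list calls these statements synchronous.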

Update:

  • performance test on my laptop
    50 concurrent users, each issuing the query select * from numbers(10000) 10,000 times; the connection pool size is 50, and select 1 is used to validate connections.

    Driver                    Samples  Average  Median  90% Line  95% Line  99% Line  Min   Max  Error %  Throughput  Received KB/sec
    0.3.2-SNAPSHOT(1st time)   500000       32      26        63        79       129    3  1151        0  1515.15152      72349.96449
    clickhouse4j 1.4.4         500000       45      37        91       104       149    4   726        0  1067.12653      50956.33378
    0.3.1-patch                500000       45      37        89       101       167    5   854        0  1067.42269      50970.47569
    Native JDBC 2.5.6          500000       46      40        75        97       163    4  1494        0   1050.2238      50149.21219
    0.3.2-SNAPSHOT(2nd time)   500000       33      27        66        84       135    3  2143        0  1469.60706      70175.17211

    Note: JMeter was used for the testing (closed and reopened for each run).
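
One way to read the throughput column is as a gain relative to 0.3.1-patch. The small sketch below does that arithmetic with the numbers from the table. The class and method names are hypothetical, and the result (roughly a 40% gain for both 0.3.2-SNAPSHOT runs) describes a single laptop run only.

```java
/** Hypothetical helper for comparing the throughput figures reported above. */
final class ThroughputComparison {
    /** Percentage gain of candidate over baseline, e.g. 200 vs 100 gives 100.0. */
    static double relativeGainPercent(double candidate, double baseline) {
        return (candidate / baseline - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        double patch031 = 1067.42269;          // 0.3.1-patch
        double snapshotFirstRun = 1515.15152;  // 0.3.2-SNAPSHOT (1st time)
        double snapshotSecondRun = 1469.60706; // 0.3.2-SNAPSHOT (2nd time)
        System.out.printf("1st run: %.1f%%%n", relativeGainPercent(snapshotFirstRun, patch031));
        System.out.printf("2nd run: %.1f%%%n", relativeGainPercent(snapshotSecondRun, patch031));
    }
}
```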

@github-actions

Benchmark                           (client)  (connection)  (statement)   Mode  Cnt    Score     Error  Units
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20  255.344 ±  23.862  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20  252.867 ±  30.822  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20  256.272 ±  31.537  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20  245.739 ±  29.416  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20  813.968 ±  59.776  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20  810.547 ± 102.419  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20  747.209 ± 106.703  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20  808.086 ±  78.744  ops/s

@github-actions

Benchmark                           (client)  (connection)  (statement)   Mode  Cnt    Score     Error  Units
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20  263.747 ±  27.532  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20  260.837 ±  30.861  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20  256.529 ±  35.431  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20  257.364 ±  27.079  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20  935.103 ± 103.960  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20  957.903 ±  97.434  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20  863.016 ± 105.992  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20  913.743 ± 123.475  ops/s

@github-actions

Benchmark                           (client)  (connection)  (statement)   Mode  Cnt     Score     Error  Units
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20   281.761 ±  30.613  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20   281.392 ±  30.516  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20   276.713 ±  27.903  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20   275.474 ±  31.496  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20  1076.904 ± 161.589  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20  1047.279 ± 159.435  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20  1101.816 ± 106.802  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20  1065.013 ± 142.951  ops/s

@github-actions

Benchmark                           (client)  (connection)  (statement)   Mode  Cnt    Score    Error  Units
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20  248.054 ± 27.872  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20  247.728 ± 30.041  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20  250.852 ± 28.662  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20  248.888 ± 32.246  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20  710.596 ± 75.300  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20  759.606 ± 61.576  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20  772.855 ± 70.034  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20  769.686 ± 75.950  ops/s

@sriram-adpusuhup

Rooting for this feature to go live soon.... Native array support would be very very useful

@zhicwu (Contributor, Author) commented Oct 29, 2021

> Rooting for this feature to go live soon.... Native array support would be very very useful

Thanks but native array support is not a new feature - it's been there for a while...

@github-actions

Benchmark                           (client)  (connection)  (statement)   Mode  Cnt    Score     Error  Units
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20  240.227 ±  29.466  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20  231.565 ±  30.623  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20  238.046 ±  32.928  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20  238.954 ±  29.862  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20  871.643 ± 108.276  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20  872.251 ± 121.699  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20  898.340 ±  78.182  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20  882.122 ± 116.631  ops/s

@github-actions

Benchmark                           (client)  (connection)  (statement)   Mode  Cnt    Score    Error  Units
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20  264.316 ± 31.018  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20  255.312 ± 34.004  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20  247.712 ± 33.833  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20  249.528 ± 25.614  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20  761.262 ± 79.712  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20  738.904 ± 53.579  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20  760.557 ± 75.642  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20  751.698 ± 89.213  ops/s

@github-actions

Benchmark                           (client)  (connection)  (statement)   Mode  Cnt     Score     Error  Units
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20   276.534 ±  27.450  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20   267.769 ±  33.274  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20   268.201 ±  28.424  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20   261.112 ±  32.914  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20   980.856 ± 111.892  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20  1007.318 ± 123.171  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20   945.274 ± 108.789  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20   895.885 ± 137.363  ops/s

@github-actions

Benchmark                           (client)  (connection)  (statement)   Mode  Cnt     Score     Error  Units
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20   282.624 ±  33.603  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20   273.285 ±  29.705  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20   274.188 ±  32.914  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20   271.985 ±  30.366  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20  1097.392 ±  94.994  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20  1072.369 ± 144.691  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20   980.367 ±  99.753  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20  1081.616 ± 166.831  ops/s

@github-actions

Benchmark                           (client)  (connection)  (statement)   Mode  Cnt     Score     Error  Units
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20   269.293 ±  34.622  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20   264.608 ±  30.681  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20   263.739 ±  29.701  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20   265.491 ±  30.191  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20  1129.307 ±  92.684  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20  1027.484 ± 158.988  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20  1096.134 ± 137.774  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20  1069.247 ± 136.395  ops/s

@github-actions

Benchmark                           (client)  (connection)  (statement)   Mode  Cnt    Score     Error  Units
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20  234.483 ±  27.824  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20  229.260 ±  34.784  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20  229.891 ±  32.569  ops/s
Basic.insertOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20  224.245 ±  31.813  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse       normal  thrpt   20  725.417 ±  62.071  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc         reuse     prepared  thrpt   20  685.442 ± 101.599  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new       normal  thrpt   20  722.138 ±  77.278  ops/s
Basic.selectOneRandomNumber  clickhouse-jdbc           new     prepared  thrpt   20  739.166 ±  58.385  ops/s

@zhicwu (Contributor, Author) commented Nov 28, 2021

The key features of the new JDBC driver are ready and have been tested locally for weeks. It's time to merge this pull request into develop so that test builds can be created.

I'll create a separate PR to complete the features below:

  • basic JDBC escape syntax support
  • enhance prepared statement - 1) multiple values for batch insertion; 2) leverage JDBC escape syntax and external tables for non-insert queries (similar to the input function based approach)
  • shared test cases for all types of clients
  • better support of AggregateFunction (and further optimization of bitmap)
  • implement unwrap methods to expose Java client API for JDBC users
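
For the last item, the standard JDBC hook is java.sql.Wrapper, which declares unwrap and isWrapperFor. The sketch below shows the general pattern only, not what the follow-up PR will ship: NativeClientStub stands in for whatever Java client object the driver would expose, and both class names are hypothetical.

```java
import java.sql.SQLException;
import java.sql.Wrapper;

/** Hypothetical stand-in for the underlying Java client object. */
final class NativeClientStub {
    String ping() {
        return "pong";
    }
}

/** Hypothetical JDBC-side object that exposes its delegate through java.sql.Wrapper. */
final class UnwrapSketch implements Wrapper {
    private final NativeClientStub client = new NativeClientStub();

    @Override
    public boolean isWrapperFor(Class<?> iface) {
        return iface != null && (iface.isInstance(this) || iface.isInstance(client));
    }

    @Override
    public <T> T unwrap(Class<T> iface) throws SQLException {
        if (iface != null) {
            if (iface.isInstance(this)) {
                return iface.cast(this);   // caller asked for the wrapper itself
            }
            if (iface.isInstance(client)) {
                return iface.cast(client); // caller asked for the underlying client
            }
        }
        throw new SQLException("Cannot unwrap to " + iface);
    }

    public static void main(String[] args) throws SQLException {
        UnwrapSketch connectionLike = new UnwrapSketch();
        NativeClientStub nativeClient = connectionLike.unwrap(NativeClientStub.class);
        System.out.println(nativeClient.ping());
    }
}
```

With this pattern, JDBC users stay on the standard API and only reach for the client API where they need something JDBC does not cover.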
