
[core] workload D reports read failures when used with client side batched inserts #1356

Open
VinayBanakar opened this issue Sep 23, 2019 · 5 comments


VinayBanakar commented Sep 23, 2019

Hi, I am using PG 13 (I have seen this on older PG versions as well). I cloned YCSB last week and am using the jdbc binding 0.16. Regardless of what I set operationcount and recordcount to, I always observe the same failure.

The command I use (I have already run the load with workload A by this point):
./bin/ycsb run jdbc -P workloads/workloadd -P jdbc/src/main/conf/db.properties -p recordcount=100 -p operationcount=100 -cp jdbc/target/postgresql-9.0-802.jdbc4.jar

Result:

# ./bin/ycsb run jdbc -P workloads/workloadd -P jdbc/src/main/conf/db.properties -p recordcount=100 -p operationcount=100 -cp jdbc/target/postgresql-9.0-802.jdbc4.jar
[WARN]  Running against a source checkout. In order to get our runtime dependencies we'll have to invoke Maven. Depending on the state of your system, this may take ~30-45 seconds
[DEBUG]  Running 'mvn -pl com.yahoo.ycsb:jdbc-binding -am package -DskipTests dependency:build-classpath -DincludeScope=compile -Dmdep.outputFilterFile=true'
java -cp jdbc/target/postgresql-9.0-802.jdbc4.jar:/home/cc/YCSB/jdbc/conf:/home/cc/YCSB/jdbc/target/jdbc-binding-0.16.0-SNAPSHOT.jar:/home/cc/YCSB/jdbc/target/jdbc-binding-0.17.0-SNAPSHOT.jar:/root/.m2/repository/org/apache/geronimo/specs/geronimo-jta_1.1_spec/1.1.1/geronimo-jta_1.1_spec-1.1.1.jar:/root/.m2/repository/org/apache/htrace/htrace-core4/4.1.0-incubating/htrace-core4-4.1.0-incubating.jar:/root/.m2/repository/net/sourceforge/serp/serp/1.13.1/serp-1.13.1.jar:/root/.m2/repository/org/hdrhistogram/HdrHistogram/2.1.4/HdrHistogram-2.1.4.jar:/root/.m2/repository/org/apache/openjpa/openjpa-jdbc/2.1.1/openjpa-jdbc-2.1.1.jar:/root/.m2/repository/org/apache/geronimo/specs/geronimo-jms_1.1_spec/1.1.1/geronimo-jms_1.1_spec-1.1.1.jar:/home/cc/YCSB/core/target/core-0.17.0-SNAPSHOT.jar:/root/.m2/repository/org/apache/openjpa/openjpa-kernel/2.1.1/openjpa-kernel-2.1.1.jar:/root/.m2/repository/org/codehaus/jackson/jackson-core-asl/1.9.4/jackson-core-asl-1.9.4.jar:/root/.m2/repository/commons-collections/commons-collections/3.2.1/commons-collections-3.2.1.jar:/root/.m2/repository/commons-lang/commons-lang/2.4/commons-lang-2.4.jar:/root/.m2/repository/org/codehaus/jackson/jackson-mapper-asl/1.9.4/jackson-mapper-asl-1.9.4.jar:/root/.m2/repository/org/apache/openjpa/openjpa-lib/2.1.1/openjpa-lib-2.1.1.jar:/root/.m2/repository/commons-pool/commons-pool/1.5.4/commons-pool-1.5.4.jar com.yahoo.ycsb.Client -db com.yahoo.ycsb.db.JdbcDBClient -P workloads/workloadd -P jdbc/src/main/conf/db.properties -p recordcount=100 -p operationcount=100 -t
Command line: -db com.yahoo.ycsb.db.JdbcDBClient -P workloads/workloadd -P jdbc/src/main/conf/db.properties -p recordcount=100 -p operationcount=100 -t
YCSB Client 0.17.0-SNAPSHOT

Loading workload...
Starting test.
Adding shard node URL: jdbc:postgresql://localhost:5432/abc?reWriteBatchedInserts=true
Using shards: 1, batchSize:1000, fetchSize: 1000
DBWrapper: report latency for each error is false and specific error codes to track for latency are: []
[OVERALL], RunTime(ms), 203
[OVERALL], Throughput(ops/sec), 492.61083743842363
[TOTAL_GCS_PS_Scavenge], Count, 0
[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 0
[TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.0
[TOTAL_GCS_PS_MarkSweep], Count, 0
[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 0
[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.0
[TOTAL_GCs], Count, 0
[TOTAL_GC_TIME], Time(ms), 0
[TOTAL_GC_TIME_%], Time(%), 0.0
[READ], Operations, 71
[READ], AverageLatency(us), 862.4084507042254
[READ], MinLatency(us), 301
[READ], MaxLatency(us), 27935
[READ], 95thPercentileLatency(us), 922
[READ], 99thPercentileLatency(us), 1168
[READ], Return=OK, 71
[READ], Return=NOT_FOUND, 21
[CLEANUP], Operations, 1
[CLEANUP], AverageLatency(us), 7366.0
[CLEANUP], MinLatency(us), 7364
[CLEANUP], MaxLatency(us), 7367
[CLEANUP], 95thPercentileLatency(us), 7367
[CLEANUP], 99thPercentileLatency(us), 7367
[INSERT], Operations, 8
[INSERT], AverageLatency(us), 573.0
[INSERT], MinLatency(us), 311
[INSERT], MaxLatency(us), 1881
[INSERT], 95thPercentileLatency(us), 1881
[INSERT], 99thPercentileLatency(us), 1881
[INSERT], Return=BATCHED_OK, 8
[READ-FAILED], Operations, 21
[READ-FAILED], AverageLatency(us), 285.57142857142856
[READ-FAILED], MinLatency(us), 210
[READ-FAILED], MaxLatency(us), 409
[READ-FAILED], 95thPercentileLatency(us), 388
[READ-FAILED], 99thPercentileLatency(us), 409

This is my db.properties file:

db.driver=org.postgresql.Driver
jdbc.fetchsize=1000
#db.url=jdbc:postgresql://localhost:5432/abc?reWriteBatchedInserts=true&sslmode='verify-ca'&ssl='true'&sslcert='/home/cc/postFix/pg13Data/data/client/postgresql.crt'&sslkey='/home/cc/postFix/pg13Data/data/client/postgresql.key'&sslrootcert='/home/cc/postFix/pg13Data/data/server.crt'
db.url=jdbc:postgresql://localhost:5432/abc?reWriteBatchedInserts=true
db.user=abc
db.passwd=PASSWORD
db.batchsize=1000
jdbc.batchupdateapi=true

I don't see these failures in any other workload. Let me know if you need any further information; I have been able to verify the same failure on different machines as well.

And I have made no changes to the workload D file:

recordcount=1000
operationcount=1000
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=true
readproportion=0.95
updateproportion=0
scanproportion=0
insertproportion=0.05
requestdistribution=latest
VinayBanakar changed the title from "Seeing [READ-FAILED] always for a proportion of READs in workload D for PostgreSQL via Jdbc" to "consistently seeing [READ-FAILED] for a proportion of READs in workload D for PostgreSQL via Jdbc" Sep 23, 2019
VinayBanakar changed the title from "consistently seeing [READ-FAILED] for a proportion of READs in workload D for PostgreSQL via Jdbc" to "Consistently seeing [READ-FAILED] for a proportion of READs in workload D for PostgreSQL via Jdbc" Sep 23, 2019
busbey (Collaborator) commented Sep 24, 2019

what commands did you run prior to the run command listed?

VinayBanakar (Author) commented:

Hi, before running workload D's run phase, I started with:

./bin/ycsb load jdbc -P workloads/workloada -P jdbc/src/main/conf/db.properties -p recordcount=100 -p operationcount=100 -cp jdbc/target/postgresql-9.0-802.jdbc4.jar

So to narrow down what's going on, I enabled PG logging (and I also have a function policy on usertable that logs every read).
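A function/policy pair along these lines would produce the INFO and CONTEXT output quoted further down; this is a reconstruction for context, not the exact definitions from this setup:

-- Reconstruction; the actual function/policy definitions were not shared in this issue.
CREATE FUNCTION log_record(r usertable) RETURNS boolean AS $$
BEGIN
  RAISE INFO 'log: %', r;  -- emits the INFO/CONTEXT lines seen in the psql output below
  RETURN true;
END;
$$ LANGUAGE plpgsql;

ALTER TABLE usertable ENABLE ROW LEVEL SECURITY;
CREATE POLICY log_reads ON usertable FOR SELECT USING (log_record(usertable));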

./bin/ycsb run jdbc -P workloads/workloadd -P jdbc/src/main/conf/db.properties -p recordcount=100 -p operationcount=20 -cp jdbc/target/postgresql-9.0-802.jdbc4.jar

[OVERALL], RunTime(ms), 175
[OVERALL], Throughput(ops/sec), 114.28571428571429
[TOTAL_GCS_PS_Scavenge], Count, 0
[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 0
[TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.0
[TOTAL_GCS_PS_MarkSweep], Count, 0
[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 0
[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.0
[TOTAL_GCs], Count, 0
[TOTAL_GC_TIME], Time(ms), 0
[TOTAL_GC_TIME_%], Time(%), 0.0
[READ], Operations, 16
[READ], AverageLatency(us), 2444.25
[READ], MinLatency(us), 434
[READ], MaxLatency(us), 29519
[READ], 95thPercentileLatency(us), 1165
[READ], 99thPercentileLatency(us), 29519
[READ], Return=OK, 16
[READ], Return=NOT_FOUND, 1
[CLEANUP], Operations, 1
[CLEANUP], AverageLatency(us), 5774.0
[CLEANUP], MinLatency(us), 5772
[CLEANUP], MaxLatency(us), 5775
[CLEANUP], 95thPercentileLatency(us), 5775
[CLEANUP], 99thPercentileLatency(us), 5775
[INSERT], Operations, 3
[INSERT], AverageLatency(us), 1051.3333333333333
[INSERT], MinLatency(us), 460
[INSERT], MaxLatency(us), 1771
[INSERT], 95thPercentileLatency(us), 1771
[INSERT], 99thPercentileLatency(us), 1771
[INSERT], Return=BATCHED_OK, 3
[READ-FAILED], Operations, 1
[READ-FAILED], AverageLatency(us), 389.0
[READ-FAILED], MinLatency(us), 389
[READ-FAILED], MaxLatency(us), 389
[READ-FAILED], 95thPercentileLatency(us), 389
[READ-FAILED], 99thPercentileLatency(us), 389

I have attached the data as xlsx (GitHub does not support csv attachments). You can see the one read failure, for user879817313296471393, but I can also see inserts for the same YCSB_KEY. So I am wondering whether this READ was supposed to happen after the INSERT of that user.

1ReadFailed.xlsx

VinayBanakar (Author) commented:

I say this because:

SELECT * FROM usertable WHERE YCSB_KEY ='user879817313296471393';
INFO:  log: (user879817313296471393,"*'*)#6.E%,=b=5x2Wm*?|/R9,(*=F92N'!H#$]o""@s''~+B)*^#(Uq*Jm37d*6*8G!(Y7,Ua*(*!1<?Uy*4j#N-1Mo&Jk.Pi6.d>",".W' ""j*Aq.Hy:P?1?:>4`'5l(]s;Y1?B{&?b<S5-3r/%`:Z;&7  ,>)R;)Dk)4l+R5'^'54`<:,!0$,<<-Q17=p/)l""I5=^a=Z75","0O'53(5&n;'~ 7l:8z /x6&h=]%1N!=T5= ` Wm':"",6f&K7;T90O98Ry6Iy4_u>-((12$Rc4[w:-v>0 !(b3(v&)0-%(!/"";>v;","9Q/,74<R152r,#0,Do*0`:Jc4Vc2D{!"">=!d:=p'Gm=B9 )8:4h:]{&Q5('d))2 ..5=:1+6-@i/26$7`+^5$O-)^o!""0!T9*L!%","5O%=)p!+h;$j<A/8D-7:t#3t/@s''<,Ck/\\5&400T)(""$9=$0- $Q%#^c>.8<!h'Nc Kg6Z-4""z;%`""-~*3$1?~'D;<E50-,$+","$0l:3~2V?(Gi446=J?,N}:.d#W+1)(#+h;$(/&>!M? - 18.>*z2Sc1&*1Ys ]k#[?*E7<Xw'W+-,4)U7<7U50Q;>G1(Yy&=l+","""_#(Y=!#z*Tw66v?Oi*4z:0$- b69b6Fg#3b/V=,Uo(J51=#h>=v6Jo& l5'& T!=+h.[w,Cc&_#4Ti*D+ Nu3+6'Js>,j=Qo6","+5p>C{;Ww:0<8_/5^k7&9v$'x""I!%!b?T= 50$A;'Hy;-<<&,$H?.?:)2>4;f::0&..5.n,S% !z>74<&&#]k""^a, r""Li>S!(","=P#0M/1Ra>Jo2Mg/(p7Sq:Jg'Di/Yk,P}#Ay8)v>_1""7""![-60""%M5/@y&Ji6Nu&Kw8M'$Jc0K%5Ng+Vy2Lc4C}:@==?x6 z>.>=",")S+,M5<648T52 l>3$7E9!Dq(8.1^e+:f#Ak)S}&;v""669A'1-f6;$  $!-*4, ,4*9Pc6U-(O'83""=G}>6~!Ei:0b7(4 406O/8","2019-09-24 03:44:20.579415")
CONTEXT:  PL/pgSQL function log_record(usertable) line 1 at RAISE

Which means the row for user879817313296471393 exists when I query from psql, but it did not exist at the time of the READ from YCSB.

VinayBanakar (Author) commented:

Okay, so removing reWriteBatchedInserts from db.url seems to fix the issue. There is likely a problem with batched inserts, but I haven't been able to debug it further.
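For anyone wanting the same workaround, the only change is the URL line in the db.properties shown earlier:

db.url=jdbc:postgresql://localhost:5432/abc

Given the explanation in the next comment, turning off the client-side batching instead (jdbc.batchupdateapi=false) would presumably avoid the failures as well, though that was not tested in this thread.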

busbey (Collaborator) commented Oct 2, 2019

Oh! It's probably because the batched insert hasn't actually finished inserting by the time the shared state in workload D updates the key range to include the new records.

I'll add this to the known issues on the next release.
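To make the race concrete, here is a toy, self-contained sketch (hypothetical names, not the actual YCSB or jdbc-binding source): the insert is reported as BATCHED_OK and the key range advances while the row is still sitting in an unflushed client-side batch, so a read under the "latest" distribution can pick a key PostgreSQL has never seen.

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy simulation of the race; names are hypothetical, not YCSB code.
public class BatchedInsertRace {
  static final int BATCH_SIZE = 1000;                          // analogous to db.batchsize
  static final Set<String> visibleRows = new HashSet<>();      // rows the server can actually serve
  static final List<String> clientBatch = new ArrayList<>();   // rows queued client side (addBatch)
  static long latestAckedKey = 99;                             // load phase inserted user0..user99

  // Analogous to the binding's insert(): queue the row, flush only every BATCH_SIZE rows.
  static void insert(String key) {
    clientBatch.add(key);                 // reported as success, but nothing sent yet (BATCHED_OK)
    if (clientBatch.size() >= BATCH_SIZE) {
      visibleRows.addAll(clientBatch);    // executeBatch(): rows finally reach the server
      clientBatch.clear();
    }
  }

  public static void main(String[] args) {
    // Workload D transaction insert: the key is acknowledged as soon as insert() returns...
    insert("user" + (++latestAckedKey));
    // ...so the "latest" distribution may immediately read that key and miss.
    String readKey = "user" + latestAckedKey;
    System.out.println(visibleRows.contains(readKey)
        ? "READ OK"
        : "READ NOT_FOUND: row still in the unflushed client-side batch");
  }
}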

busbey changed the title from "Consistently seeing [READ-FAILED] for a proportion of READs in workload D for PostgreSQL via Jdbc" to "[core] workload D reports read failures when used with client side batched inserts" Oct 2, 2019
busbey mentioned this issue Oct 2, 2019