
NetworkInputStream #23

Closed · wants to merge 8 commits into from

Conversation

@sagenschneider (Contributor) commented Aug 27, 2018

@alexanderkjall thanks for merging the other PR. So that I don't diverge too much, I've raised this PR with the NetworkInputStream to give you an idea of where I'm going with this.

There are test failures on this branch; however, as per the other PR, I'm raising it early for you to see the changes and where they are, to avoid too many merge conflicts down the line once I can get it working.

One thing I believe I have wrong is the Portal / Query relationship. I think the objects need to be the other way round: a Query is submitted and associated with a potential portal. Once the query has been executed more than 5 times (following pgjdbc), we can start naming the portal to associate with it.

Interestingly (as per @davecramer on #19 ), a portal is created:

  • for re-using the query and avoiding reparsing (prepared statements)
  • for cursors (not clear on how this works, but assuming the creation of a cursor requires naming or something along these lines)

For the first above, an LRU cache would be OK. For the second, an LRU cache could create some "interesting" hard-to-reproduce errors. I'm thinking each would need to be managed separately. However, for now prepared statements are my interest for performance.
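The LRU idea for prepared statements can be sketched with `LinkedHashMap` in access order. This is a minimal illustration, not the driver's actual API; the class and naming scheme (`S_1`, `S_2`, ...) are made up for the example, and the eviction hook only marks where a server-side `Close` would be sent.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch: an LRU cache of server-side prepared-statement names
// keyed by SQL text. Eviction is where a Close message would be sent to the
// server -- fine for statement re-use, but (as noted above) unsafe for open
// cursors, which is why the two cases would need separate management.
class PreparedStatementCache {
  private final int capacity;
  private final Map<String, String> sqlToName;
  private int counter = 0;

  PreparedStatementCache(int capacity) {
    this.capacity = capacity;
    // accessOrder = true makes LinkedHashMap iterate least-recently-used first
    this.sqlToName = new LinkedHashMap<String, String>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        // a real driver would send Close('S', eldest.getValue()) here
        return size() > PreparedStatementCache.this.capacity;
      }
    };
  }

  /** Returns the server-side name for this SQL, "parsing" it on first use. */
  String nameFor(String sql) {
    String name = sqlToName.get(sql);
    if (name == null) {
      name = "S_" + (++counter);
      sqlToName.put(sql, name); // may evict the least-recently-used statement
    }
    return name;
  }

  int size() {
    return sqlToName.size();
  }
}
```

Re-executing a cached statement touches its entry, so frequently used statements survive while rarely used ones are closed first.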

sagenschneider and others added 6 commits August 17, 2018 17:28
fixed the test in PGConnectionTLSTest
Conflicts:
	src/main/java/org/postgresql/sql2/communication/network/DescribeResponse.java
	src/test/java/org/postgresql/sql2/communication/BEFrameReaderTest.java
Conflicts:
	src/main/java/org/postgresql/sql2/communication/BEFrameParser.java
	src/main/java/org/postgresql/sql2/communication/NetworkConnection.java
	src/main/java/org/postgresql/sql2/communication/NetworkReadContext.java
	src/main/java/org/postgresql/sql2/communication/TableCell.java
	src/main/java/org/postgresql/sql2/communication/network/AuthenticationResponse.java
	src/main/java/org/postgresql/sql2/communication/network/BindResponse.java
	src/main/java/org/postgresql/sql2/communication/network/DescribeResponse.java
	src/main/java/org/postgresql/sql2/communication/network/ExecuteResponse.java
	src/main/java/org/postgresql/sql2/communication/network/NetworkConnectRequest.java
	src/main/java/org/postgresql/sql2/communication/network/ParseResponse.java
	src/main/java/org/postgresql/sql2/communication/network/ReadyForQueryResponse.java
	src/main/java/org/postgresql/sql2/communication/packets/AuthenticationRequest.java
	src/main/java/org/postgresql/sql2/communication/packets/CommandComplete.java
	src/main/java/org/postgresql/sql2/communication/packets/DataRow.java
	src/main/java/org/postgresql/sql2/communication/packets/ErrorPacket.java
	src/main/java/org/postgresql/sql2/communication/packets/ParameterStatus.java
	src/main/java/org/postgresql/sql2/communication/packets/RowDescription.java
	src/main/java/org/postgresql/sql2/communication/packets/parsers/BinaryParser.java
	src/main/java/org/postgresql/sql2/communication/packets/parts/ColumnTypes.java
	src/test/java/org/postgresql/sql2/execution/NioLoopTest.java
</properties>
<distributionManagement>
<repository>
<id>repo.officefloor.sf.net</id>

Lots of commits keep coming in with some of this stuff

@sagenschneider (Contributor Author)

Sorry, this is my local deployment to make the jar available for download. This is necessary for TechEmpower to access the jar when building the OfficeFloor server that uses pgsql2. We should possibly consider starting to release to Maven Central to make the driver available to early adopters.

@alexanderkjall (Collaborator)

Nice, I will read through it, but one thing regarding portals: I now realize that it's not mandatory to give portals names.

But regarding named portals: those are state held in the connection, and I think the server can deallocate them if it runs out of resources or by some other rule, so we need to determine what should happen if we try to use a named portal that no longer exists on the server end. I think we have two options:

  1. Retry the statement/transaction.
  2. Propagate the error to the user.

Both have drawbacks: with 1, statements might be executed in the wrong order, and with 2, the user is exposed to something that should be the responsibility of the connection library.

This needs to be investigated further; maybe it's enough to set an upper limit on the number of named portals per connection.

@sagenschneider (Contributor Author) commented Aug 28, 2018

@alexanderkjall I'm not sure about this; however, I'm wondering if the database sends a notification when it deallocates a portal. But I'm guessing that if we follow the patterns in pgjdbc (i.e. create the portal only after 5 executions of the query) this would be a good start. I don't believe we need to "reinvent the wheel" for this driver. For me, we are just creating a new driver that takes all the pgjdbc patterns but opens up:

  • asynchronous calls
  • pipelining of queries, made possible by asynchronous calls

Hence, for me, I'm going to fix the Portal/Query relationship to follow pgjdbc as a starting point (to re-use all their experience). Then once we have it working (and deployed in a few applications for "real-world" use) we can look for enhancements.

Note: please excuse me if this sounds a little simplistic for what we are trying to achieve. I'm just thinking that, as a starting point, we re-use as much as possible of pgjdbc within the context of asynchronous calls. pgjdbc is a "proven" driver, and I'm thinking the default designs should be based on lessons learned from it, and then diverge to allow the asynchronous functionality.

@alexanderkjall (Collaborator)

I agree that reusing as much as possible from pgjdbc is a good idea, but I'm not sure that this logic can be reused, as we do pipelining of requests and pgjdbc does not.

Imagine the following scenario:

  1. We have two queries, q1 and q2; q1 has a named portal.
  2. We write q1 to the network and the server responds that the named portal is no longer available.
  3. Before the server response arrives, we write q2 to the network.
  4. The server error message for q1 arrives.
  5. The server response for q2 arrives.

At point 4 we have the dilemma from my last post.
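The scenario above hinges on how pipelined responses are correlated with their queries. A minimal sketch (types and names are illustrative, not pgsql2's): in-flight queries sit in a FIFO, and each server message is matched to the head of that FIFO, so by the time q1's error arrives, q2 has already been written to the socket and cannot simply be held back and retried first.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative sketch of response correlation under pipelining: responses
// arrive in submission order, so each one is matched to the oldest in-flight
// query. An error for q1 therefore arrives *after* q2 is already on the wire,
// which is exactly the retry-vs-propagate dilemma described above.
class InFlightQueue {
  static final class Query {
    final String sql;
    String outcome = "pending";
    Query(String sql) { this.sql = sql; }
  }

  private final Queue<Query> inFlight = new ArrayDeque<>();

  /** Steps 2 and 3 above: the query has been written to the network. */
  void written(Query q) {
    inFlight.add(q);
  }

  /** Steps 4 and 5 above: a server message resolves the oldest in-flight query. */
  Query onServerMessage(String outcome) {
    Query q = inFlight.remove();
    q.outcome = outcome;
    return q;
  }
}
```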

@davecramer (Member)

The server does not de-allocate Portals, nor does it send any message saying it did de-allocate them

@alexanderkjall (Collaborator)

Aha, my mistake; I based that on getting the message "prepared statement "q" does not exist" from this test: https://github.com/pgjdbc/pgsql2/blob/bd92a7d0272fdaaa23b1311a92dc53a65698c688/src/test/java/org/postgresql/sql2/SelectDataTypesTest.java#L321

But after closer inspection of the network packets, it turns out that something produces broken Bind packets. Let's track that issue in another thread: #24

Daniel Sagenschneider added 2 commits September 3, 2018 08:41
Conflicts:
	src/main/java/org/postgresql/sql2/communication/network/AuthenticationResponse.java
	src/main/java/org/postgresql/sql2/communication/network/NetworkConnectRequest.java
@sagenschneider (Contributor Author)

OK, I tried to merge the latest master into this branch and found some significant changes to the NetworkOutputStream... it seems to have a lot of synchronization around the write buffer (so much so that I'm thinking it would significantly degrade performance; synchronizing on each get of the buffer is a lot of overhead at this level of performance).

The idea I was trying to achieve:

  • the application runs with its own threads
  • the application requests a submission of the network layer
  • this submission works in producer/consumer fashion with a separate consumer thread (in other words, the application threads can continue working without having to handle network details)
  • the NioLoop thread is responsible for translating the submission into buffer data to send over the wire; it is also responsible for reading in that data and parsing it back onto the submission (as this is single-threaded, no synchronizing is required for it)

Basically, the NioLoop does very little in the way of synchronizing its state, as it runs on a single thread (so no memory synchronization is necessary). Only when this thread interacts with other threads (an application thread submitting a submission, or a registered response handler being invoked on another thread) does there need to be any synchronizing.
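The threading model described above can be sketched as follows. The class and method names are illustrative, not the actual pgsql2 types: the point is that synchronization exists only at the boundary where application threads hand work to the loop thread, while everything the loop thread owns is plain single-threaded state.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch of the single-threaded event-loop idea: a concurrent queue at the
// boundary between application threads and the loop thread; loop-private
// state needs no locks because only the loop thread ever touches it.
class NioLoopSketch {
  // Boundary: touched by application threads and the loop thread.
  private final Queue<Runnable> submissions = new ConcurrentLinkedQueue<>();

  // Loop-private state: only the single loop thread touches this,
  // so no synchronization is needed.
  private final Queue<Runnable> pending = new ArrayDeque<>();

  /** Called from any application thread. */
  void submit(Runnable work) {
    submissions.add(work);
    // a real loop would also wake the selector here
  }

  /** One iteration of the loop thread. */
  int runOnce() {
    Runnable r;
    while ((r = submissions.poll()) != null) {
      pending.add(r); // cross the boundary once, then work lock-free
    }
    int done = 0;
    while ((r = pending.poll()) != null) {
      r.run();
      done++;
    }
    return done;
  }
}
```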

@alexanderkjall (Collaborator)

@sagenschneider I would also like to reduce the amount of synchronization, and I discovered that the code still has a race condition even after I added the synchronization: sometimes the READ / WRITE registration happens in the wrong order, so the selector gets a read-only registration while there are still bytes to be written. I'm debugging that right now.

The problem that led me to add the synchronization was in the usage of the getCurrentBuffer function: sometimes a buffer could end up being used by both the query-producing thread and the network thread, and that caused memory corruption and incorrect packets being written to the network.

One way to reduce the synchronization would be to have a ConcurrentLinkedQueue as a handover point between the threads: the query-producing thread puts buffers that it's finished with on it, and the network thread pulls from it when it needs more data to write to the network.
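The handover point suggested above could look like this sketch (class name made up for the example). The invariant it enforces is what fixes the getCurrentBuffer corruption: the producing thread only touches a buffer before offering it, and the network thread only after polling it, so no buffer is ever shared between the two threads at the same time.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch of a ConcurrentLinkedQueue handover between the query-producing
// thread and the network thread. Ownership of each buffer transfers exactly
// once, at the offer/poll boundary, so neither thread sees a buffer the
// other is still mutating.
class BufferHandover {
  private final ConcurrentLinkedQueue<ByteBuffer> ready = new ConcurrentLinkedQueue<>();

  /** Producer side: hand over a finished buffer, flipped for reading. */
  void handOver(ByteBuffer finished) {
    finished.flip();
    ready.offer(finished); // producer must not touch the buffer after this
  }

  /** Network side: pull the next buffer to write, or null if none is ready. */
  ByteBuffer nextToWrite() {
    return ready.poll();
  }
}
```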

@alexanderkjall (Collaborator)

Thought about this a bit more; one way to do this would be to move the written buffers to the queue that the network thread reads when flush() is called. That way we shouldn't need any synchronization on writes. And if flush() also wakes up the selector, I think we can get around that other race condition.
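A sketch of that flush() idea, with buffers modeled as Strings for brevity and all names made up for the example: writes accumulate in a list owned by the writing thread, and flush() moves them to the shared queue in one step and wakes the selector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch: written buffers stay in a thread-local list until flush(), which
// hands them over to the network thread's queue in one step and wakes the
// selector -- so individual writes need no locks, and the wakeup avoids the
// READ/WRITE registration race described above.
class FlushingOutput {
  private final List<String> written = new ArrayList<>();   // writer thread only
  private final ConcurrentLinkedQueue<String> forNetwork = new ConcurrentLinkedQueue<>();
  private final Runnable wakeSelector;

  FlushingOutput(Runnable wakeSelector) {
    this.wakeSelector = wakeSelector;   // e.g. selector::wakeup in a real loop
  }

  void write(String chunk) {
    written.add(chunk);                 // no synchronization on the hot path
  }

  void flush() {
    forNetwork.addAll(written);         // single handover point
    written.clear();
    wakeSelector.run();                 // ensure the selector sees the bytes
  }

  String pollForNetwork() {
    return forNetwork.poll();
  }
}
```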

@sagenschneider (Contributor Author)

Possibly, though I would like to keep the network thread without too much synchronizing internally.

Question: can we make the submissions effectively immutable?

Ideally, I would be thinking:

  1. Submission submission = new Submission(...) // is immutable with final fields, so no threading issues
  2. submit(submission) is therefore able to put it onto a concurrent queue for the network thread to pull off
  3. the network thread is free to run internally without synchronization
  4. on results for the submission, the network thread locks on the submission to make changes to it (e.g. loading rows into the referenced mutable result, with possible result handlers; changes to this mutable state, such as registering a result handler, go via the same lock)

The result is that there is no "coarse-grained" lock slowing down performance. The only lock is within each submission, on its results. As these are less likely to change frequently, there should be little lock contention causing slowdown.

Note: if we start putting locks in the network layer, they typically end up being coarse-grained. As many threads will contend over the locks (the network thread will contend with application threads submitting), this will show up as a noticeable slowdown.
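The submission shape proposed in steps 1-4 above can be sketched like this (illustrative types, not the actual driver's): the request half is immutable final fields, so publishing it across the concurrent queue is safe without locks; only the mutable result half is guarded, and only by a lock private to that one submission.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch of an immutable submission with fine-grained, per-submission
// locking on its mutable result state. The network thread and the
// application thread only ever contend on this one object's lock.
final class Submission {
  final String sql;                    // immutable request: safe to publish
  final Object[] parameters;

  private final Object resultLock = new Object();   // per-submission lock
  private final List<String> rows = new ArrayList<>();
  private Consumer<String> rowHandler;

  Submission(String sql, Object... parameters) {
    this.sql = sql;
    this.parameters = parameters;
  }

  /** Network thread: feed back a result row under this submission's lock. */
  void addRow(String row) {
    synchronized (resultLock) {
      rows.add(row);
      if (rowHandler != null) {
        rowHandler.accept(row);
      }
    }
  }

  /** Application thread: register a handler via the same lock, replaying any rows already received. */
  void onRow(Consumer<String> handler) {
    synchronized (resultLock) {
      this.rowHandler = handler;
      rows.forEach(handler);
    }
  }
}
```

Because each submission has its own lock, the network thread only blocks if the application thread happens to be registering a handler on that exact submission at that exact moment, which should be rare.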

@alexanderkjall (Collaborator)

Do you feel that using a ConcurrentLinkedQueue is too much synchronization?

@sagenschneider (Contributor Author) commented Sep 12, 2018

I would be thinking:

  • application threads submit to the network thread (immutable submissions on a concurrent queue, so no heavy synchronization overheads)
  • the network-internal buffer queues are LinkedLists (no concurrency required, as it is all the same thread)
  • for network thread results (adding rows), synchronize on the submission to load the results (fine-grained locking that is least likely to cause lock contention, the most expensive cost of locks)

@alexanderkjall (Collaborator)

That sounds like a working plan; the only thing I can think of that would be tricky is that a submission can have futures as parameters.

How the results of the queries are supplied to the user is really out of scope for this issue, but I think the best thing might be to have an ExecutorPool, since the callback is a function supplied by the user.

@sagenschneider (Contributor Author)

I would likely expect the parameter futures to be resolved before the submission.

This way the network layer does not have to deal with the "promises" of the asynchronous layer.

Basically then:

  • application threads submit to the resolution layer
  • the resolution layer resolves all parameter futures and creates the immutable submission
  • the resolution layer submits the submission to the network layer (via a concurrent queue)
  • the network layer, on its single thread, organises sending, receiving and parsing the network traffic
  • the network layer synchronizes on the submission to feed results back to it
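The resolution layer described above can be sketched with CompletableFuture (all names are illustrative): only when every parameter future has completed is the immutable submission built and placed on the network queue, so the network layer never sees an unresolved promise — and, since the handover is callback-driven, nothing ever polls isDone().

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch of a resolution layer: parameter futures are resolved first, then
// an immutable submission is created and handed to the network layer via a
// concurrent queue. Submissions whose parameters are still pending simply
// haven't reached the queue yet, so nothing blocks behind them.
class ResolutionLayer {
  static final class Resolved {          // immutable submission: final fields only
    final String sql;
    final Object[] params;
    Resolved(String sql, Object[] params) {
      this.sql = sql;
      this.params = params;
    }
  }

  private final ConcurrentLinkedQueue<Resolved> networkQueue = new ConcurrentLinkedQueue<>();

  void submit(String sql, CompletableFuture<?>... paramFutures) {
    // The callback fires once the final parameter future resolves -- no polling.
    CompletableFuture.allOf(paramFutures).thenRun(() -> {
      Object[] params = new Object[paramFutures.length];
      for (int i = 0; i < paramFutures.length; i++) {
        params[i] = paramFutures[i].join();   // already complete at this point
      }
      networkQueue.add(new Resolved(sql, params));
    });
  }

  Resolved pollNetworkQueue() {
    return networkQueue.poll();
  }
}
```

Note that two submissions whose futures resolve in the opposite order of submission will reach the network queue in resolution order, which is exactly the ordering question debated below.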

@alexanderkjall (Collaborator)

Isn't that over-complicating things a bit? Wouldn't it be enough to check whether the future is completed with isDone(), and if that is false just not pull anything from the queue of submissions?

@sagenschneider (Contributor Author)

It's a complicated way of saying: don't submit anything to the network queue unless it is ready for execution.

For me, there could be multiple "parallel" execution trees for the same client. I would not assume that there is any relationship between each branch of this execution tree (think of it like a promise tree). As each node is resolved, it can then submit its "immutable network submission".

I would not queue one submission behind another due to future resolution. This could:

  • starve a tree from being executed while a long-running query is running
  • create the potential for starvation, as the future could be waiting on the next query submitted to the network

Basically, what I'm saying is that the network layer only receives "immutable submissions" that are ready for execution. If a submission contains a future that needs resolution, this should be handled at the higher layers.

@sagenschneider (Contributor Author)

Note: we also want to avoid "polling". Once the final future is resolved for a node, the submission is sent to the network. Otherwise, we end up having to poll the head of the queue waiting for isDone() to return true.

@alexanderkjall (Collaborator)

I don't think we can start reordering queries that the user sends; that could severely break the user's application.

Imagine two update queries that operate on the same rows: if we reorder them, a completely different result would be produced.

@sagenschneider (Contributor Author)

It is not "re-ordering" queries; it is executing queries when they are ready to execute. If a query has an input parameter with a future not yet resolved, then unless there is a "then" relationship, that query should be held and other ready queries (that have all their parameters resolved) submitted instead.

For me, this is all part of the "promise tree" of asynchronous execution. The order of asynchronous operations is not guaranteed unless some "then" relationship exists between the queries. If I submit two queries without any relationship between them, then from an asynchronous programming point of view there is no guarantee of the order in which they are executed.

For me, this is all part of the asynchronous query layer problem. It should not be put into the Network Layer.

@sagenschneider (Contributor Author)

(but then again... I'm not totally up on the ADBA specs... however, my take on asynchronous programming is that order is not something that is guaranteed, unless the client explicitly sets up some form of relationship between the operations, e.g. Promise.then(..).then(...) etc.)

@alexanderkjall (Collaborator)

You can always ask on the http://mail.openjdk.java.net/pipermail/jdbc-spec-discuss/ list; I might have understood the API wrong :)

@alexanderkjall (Collaborator)

The latest version of the API has a comment section that specifically mentions this use case; let me quote:

 Not all {@link Operation}s need to be executed in the order submitted. The
 most common example is a mass insert. The order in which the records are
 inserted doesn’t matter. A parallel {@link OperationGroup} gives the
 implementation the freedom to execute the {@link Operation}s in any order. If
 some of the {@link Operation}s have
 {@link java.util.concurrent.CompletionStage} parameters this can be
 especially valuable.</p>

4 participants