Implement parser functions for all types #10

alexanderkjall · 2018-04-29T14:20:52Z

We need to be able to parse the data that the database returns into Java types.

davecramer · 2018-04-29T14:29:33Z

Don't we get the types back from the server?


alexanderkjall · 2018-04-30T14:13:51Z

I was a bit imprecise. I have just implemented sending Describe messages to the server to ask for the return types of queries.
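For readers following along, here is a minimal sketch of how a frontend Describe message could be laid out on the wire, following the PostgreSQL protocol documentation: a 'D' tag, an int32 length that counts itself, a target byte ('S' for a prepared statement, 'P' for a portal), and a NUL-terminated name. The class and method names are hypothetical and are not taken from this driver.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not the driver's actual code.
final class DescribeMessage {

    // target is 'S' (prepared statement) or 'P' (portal).
    // An empty name refers to the unnamed statement or portal.
    static byte[] encode(char target, String name) {
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        // The length field counts itself (4), the target byte (1),
        // the name, and the NUL terminator (1). The tag byte is excluded.
        int length = 4 + 1 + nameBytes.length + 1;
        ByteBuffer buf = ByteBuffer.allocate(1 + length); // big-endian by default
        buf.put((byte) 'D');
        buf.putInt(length);
        buf.put((byte) target);
        buf.put(nameBytes);
        buf.put((byte) 0);
        return buf.array();
    }
}
```

For a statement, the server answers with a ParameterDescription followed by a RowDescription (or NoData), which is where the column type OIDs come from.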

The different types can then be encoded as text or in binary by the server, and the default seems to be text. There might be a significant performance gain to be had by using binary, especially for types such as bytea, but that can be a future improvement.
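To make the text/binary distinction concrete, here is a small sketch of parsing an int4 value in both formats. In text format the value arrives as ASCII decimal digits; in binary format it arrives as four bytes in network (big-endian) order. `Int4Parser` is a hypothetical name used only for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not the driver's actual code.
final class Int4Parser {

    // Text format: ASCII decimal digits with an optional leading minus, e.g. "-42".
    static int fromText(byte[] value) {
        return Integer.parseInt(new String(value, StandardCharsets.US_ASCII));
    }

    // Binary format: exactly four bytes, big-endian.
    static int fromBinary(byte[] value) {
        if (value.length != 4) {
            throw new IllegalArgumentException("int4 needs 4 bytes, got " + value.length);
        }
        return ByteBuffer.wrap(value).getInt(); // ByteBuffer is big-endian by default
    }
}
```

The text path allocates an intermediate String, which the binary path avoids. That is one place where a difference could show up, although this sketch measures nothing.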

We then get sent the DataRow messages from the server. I store the columns in a Map<String, Object>, and the Collector that the user has supplied can then query that map through the ResultMap.get method.

I would like to have as few memory allocations as possible through this process, but when parsing the DataRow message I don't know what class the user wants the column represented as. So my thought was to first parse each value to an appropriate type to store in the Map<String, Object> (PostgreSQL int4 -> Java Integer, for example) and then have a second set of transformations in the get() method.

Another approach would be to convert everything to strings, store the original OID from the database, and do all the conversions when the user asks for them in the get method. This saves a conversion if the user doesn't want the class that we parse the result into, but it requires the driver to keep more state about each column, and it will make binary harder to implement in the future.

cretz · 2018-04-30T15:02:18Z

> There might be significant performance gain to be had by using binary

Not sure the gain would be that significant, but I concur binary should be implemented after text (if at all). I recommend storing the row data as a two-dimensional byte array (or you can just keep the entire byte buffer and store offsets, but you have to traverse it anyway) and the row metadata as a separate object that you query as needed. Keep the columns in two forms: a Column[], which lets you get the columns by index and carries more information than just their names, and a Map<String, Integer>, which maps column names (lowercased to provide case-insensitive lookups) to indices. Sometimes you'll need more than the column name to do the conversion the user asks for, so don't eagerly build a Map<String, Object>; wait for them to ask.
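A rough sketch of that layout, assuming text-format values and using hypothetical class names (`Column`, `RowMeta`, `Row`) rather than anything from the linked code. The raw bytes stay untouched until a getter is called, and the metadata is shared by every row of the result.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// All names below are hypothetical and exist only for this illustration.
final class Column {
    final String name;
    final int typeOid; // e.g. 23 = int4, 25 = text

    Column(String name, int typeOid) {
        this.name = name;
        this.typeOid = typeOid;
    }
}

// Built once per result set from the RowDescription.
final class RowMeta {
    final Column[] columns;
    private final Map<String, Integer> indexByName = new HashMap<>();

    RowMeta(Column... columns) {
        this.columns = columns;
        for (int i = 0; i < columns.length; i++) {
            // Lowercased keys give case-insensitive lookups; the first duplicate wins.
            indexByName.putIfAbsent(columns[i].name.toLowerCase(Locale.ROOT), i);
        }
    }

    int indexOf(String name) {
        Integer index = indexByName.get(name.toLowerCase(Locale.ROOT));
        if (index == null) {
            throw new IllegalArgumentException("no such column: " + name);
        }
        return index;
    }
}

// One DataRow: raw column bytes, converted only when asked for.
final class Row {
    private static final int INT4_OID = 23;
    private final RowMeta meta;
    private final byte[][] values; // a null entry represents SQL NULL

    Row(RowMeta meta, byte[][] values) {
        this.meta = meta;
        this.values = values;
    }

    Integer getInt(String column) {
        int index = meta.indexOf(column);
        byte[] raw = values[index];
        if (raw == null) {
            return null;
        }
        if (meta.columns[index].typeOid != INT4_OID) {
            throw new IllegalStateException(column + " is not an int4 column");
        }
        return Integer.valueOf(new String(raw, StandardCharsets.US_ASCII));
    }

    String getString(String column) {
        byte[] raw = values[meta.indexOf(column)];
        return raw == null ? null : new String(raw, StandardCharsets.UTF_8);
    }
}
```

Columns that are never requested are never decoded, and the type OID is available at conversion time for cases where the name alone is not enough.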

> Another method would be to convert everything to strings

Don't do this eagerly. Not everyone wants all of their columns.

cretz · 2018-04-30T15:14:51Z

Also, if you want to save yourself some time or just need some inspiration (no credit wanted/needed), you can take a peek at Converters.java that has some things like date formats instead of hand typing, DataType.java that has a bunch of parsers implemented for non-standard types, and QueryTest.java which has a bunch of interesting values that are good to test with (tests params, output, arrays, multidimensional arrays, etc too).

alexanderkjall · 2018-04-30T17:46:51Z

That not all columns will be read by the user is a very good point. I'll refactor the code so that the conversions happen when the get method is called. I think I'll try to store the row as the original byte sequence with offsets; there might even be some upside to implementing the CharSequence interface to help with the parsing.
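A minimal sketch of the CharSequence idea: a view over a range of the original buffer that exposes one char per byte without copying. `AsciiSlice` is a hypothetical name, and the sketch assumes the viewed bytes are single-byte characters (ASCII digits, punctuation), which holds for numeric and date text formats but not for arbitrary UTF-8 text columns.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical class for illustration; valid only for single-byte data.
final class AsciiSlice implements CharSequence {
    private final byte[] buf;
    private final int offset;
    private final int length;

    AsciiSlice(byte[] buf, int offset, int length) {
        if (offset < 0 || length < 0 || length > buf.length - offset) {
            throw new IndexOutOfBoundsException();
        }
        this.buf = buf;       // shared with the caller, not copied
        this.offset = offset;
        this.length = length;
    }

    @Override
    public int length() {
        return length;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException();
        }
        return (char) (buf[offset + index] & 0xFF);
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        if (start < 0 || end > length || start > end) {
            throw new IndexOutOfBoundsException();
        }
        return new AsciiSlice(buf, offset + start, end - start);
    }

    @Override
    public String toString() {
        // Allocates; only needed when a real String is requested.
        return new String(buf, offset, length, StandardCharsets.US_ASCII);
    }
}
```

On Java 9 and later such a view can be passed straight to `Integer.parseInt(CharSequence, int, int, int)`, which would avoid building an intermediate String for integer columns.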


davecramer · 2018-07-31T21:02:55Z

Binary has a significant performance benefit. I would argue you should prioritize binary and leave text for things like timestamps, which have historically been difficult to deal with in binary.
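One way to express that policy is to pick the result format code per column once the type OIDs are known. The Bind message carries one int16 format code per result column (0 = text, 1 = binary). The sketch below is only an illustration with hypothetical names; defaulting every other type to binary assumes the driver has a binary parser for each of them, which this thread does not establish.

```java
// Hypothetical helper for illustration; not the driver's actual code.
final class ResultFormats {
    static final short TEXT = 0;
    static final short BINARY = 1;

    // Type OIDs as listed in pg_type.
    static final int TIMESTAMP_OID = 1114;
    static final int TIMESTAMPTZ_OID = 1184;

    // Text for timestamps, binary for everything else (an assumption, see above).
    static short formatFor(int typeOid) {
        switch (typeOid) {
            case TIMESTAMP_OID:
            case TIMESTAMPTZ_OID:
                return TEXT;
            default:
                return BINARY;
        }
    }

    // One format code per result column, in column order.
    static short[] formatsFor(int[] typeOids) {
        short[] formats = new short[typeOids.length];
        for (int i = 0; i < typeOids.length; i++) {
            formats[i] = formatFor(typeOids[i]);
        }
        return formats;
    }
}
```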

alexanderkjall added a commit that referenced this issue May 1, 2018

moved the parsing logic into the get method based on advice from @cretz…

9bf382d

… in #10
