
Feature: Load data in slices (pagination under the hood) #77

Open
CyborTronik opened this issue Jun 8, 2017 · 6 comments

@CyborTronik

Considering that this library is meant to be used in an Rx architecture, there is a need to load and process data in slices (pagination).

Let's say you have to process a million records, so you cannot handle all of it with a single 'select' operation. You have to load the data in chunks, process just a few rows at a time, then load the next few rows, process those, and so on.

@thomasnield
Contributor

thomasnield commented Jun 8, 2017

It sounds like there are many ways to achieve this already using existing RxJava operators: buffer(), flatMap(), etc. Even backpressure natively will prevent all data from being pushed at once. It's not clear what you want RxJava-JDBC to do that Rx doesn't do already.
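
For illustration, here is a minimal sketch of that approach, assuming rxjava-jdbc's `Database.from(...)` / `select(...).getAs(...)` API with RxJava 1.x; the JDBC URL, the `person` table, and the `processBatch` helper are hypothetical:

```java
import com.github.davidmoten.rx.jdbc.Database;

import java.util.List;

import rx.Observable;

public class SliceExample {

    public static void main(String[] args) {
        // hypothetical in-memory database; substitute your own JDBC URL
        Database db = Database.from("jdbc:h2:mem:demo");

        db.select("select name from person")     // assumes a 'person' table exists
          .getAs(String.class)                    // Observable<String>, one element per row
          .buffer(500)                            // plain RxJava operator: emit rows in slices of 500
          .flatMap(SliceExample::processBatch)    // process each slice before requesting more
          .toBlocking()
          .forEach(count -> System.out.println("processed a slice of " + count + " rows"));
    }

    // hypothetical per-slice processing; returns the number of rows handled
    private static Observable<Integer> processBatch(List<String> batch) {
        return Observable.just(batch.size());
    }
}
```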

@CyborTronik
Author

If you run a SQL query it requests all the data at once and then transforms it to Rx.
But what I mean is to have a way to request data in parts via JDBC and process it part by part.
For example, what happens when you run select * from a_table and the table has a few million rows? Do you have enough resources to load all that data into memory? The point is to load and process the data using pagination or some kind of cursor, so it would be nice to have a built-in approach for that.

@davidmoten
Owner

Sounds like you are talking about fetchSize. This is supported now, see

https://github.com/davidmoten/rxjava-jdbc#fetch-size.

If you don't set fetchSize then your JDBC driver will use its default (which can be quite small; for Oracle it is 10!), so you are unlikely to run out of memory, but if you are fetching a lot of records you will make fewer calls to the database by setting fetchSize to a larger value.
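
Roughly, assuming the select builder exposes a `fetchSize(int)` method as described in that README section (the URL, table, and the value of 1000 here are hypothetical):

```java
import com.github.davidmoten.rx.jdbc.Database;

public class FetchSizeExample {
    public static void main(String[] args) {
        Database db = Database.from("jdbc:h2:mem:demo");   // hypothetical URL

        db.select("select name from person")
          .fetchSize(1000)               // hint to the JDBC driver: fetch 1000 rows per round trip
          .getAs(String.class)
          .toBlocking()
          .forEach(System.out::println);
    }
}
```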

@stanisfun

Cool. Somehow I've missed that in the documentation.
Thanks

@litalk

litalk commented Jun 27, 2017

@CyborTronik @stanisfun note that currently fetchSize cannot be set to Integer.MIN_VALUE, which is critical for getting the query result as a stream, at least on MySQL.
#78
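
For context, this is the plain-JDBC incantation the MySQL driver (Connector/J) expects for row-by-row streaming, which is what a fetch size of Integer.MIN_VALUE unlocks; the connection details and table are hypothetical:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class MySqlStreamingExample {
    public static void main(String[] args) throws SQLException {
        // hypothetical connection details
        try (Connection con = DriverManager.getConnection("jdbc:mysql://localhost/demo", "user", "pass");
             Statement st = con.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            st.setFetchSize(Integer.MIN_VALUE);   // Connector/J interprets this as "stream rows one at a time"
            try (ResultSet rs = st.executeQuery("select * from a_table")) {
                while (rs.next()) {
                    // process one row at a time without buffering the whole result set in memory
                }
            }
        }
    }
}
```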

@jad7

jad7 commented Apr 26, 2019

I assume that @CyborTronik means an alternative to JdbcCursorItemReader (from Spring Batch). This really makes sense, because the consumer can work significantly more slowly than the producer, and the pauses between fetching new data can be longer than the connection timeout, in which case you will get an exception. A cursor reader would fix this problem, but it's not easy to implement, because different DBs have different syntax (approaches) for pagination.
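
Not something the library provides, but as an illustration of the kind of cursor/paging reader described above, here is a keyset-pagination sketch in plain JDBC that re-runs a bounded query per slice, so a slow consumer between slices never holds an idle connection; the URL, table, columns, and page size are all hypothetical:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class KeysetPagingExample {

    private static final int PAGE_SIZE = 1000;
    private static final String URL = "jdbc:h2:mem:demo";   // hypothetical URL

    public static void main(String[] args) throws SQLException {
        long lastId = 0;   // keyset cursor: highest id processed so far
        List<String> page;
        do {
            page = new ArrayList<>();
            // open a short-lived connection per slice, so slow processing between
            // slices cannot run into a connection/idle timeout
            try (Connection con = DriverManager.getConnection(URL);
                 PreparedStatement ps = con.prepareStatement(
                         "select id, payload from a_table where id > ? order by id limit ?")) {
                ps.setLong(1, lastId);
                ps.setInt(2, PAGE_SIZE);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        lastId = rs.getLong("id");
                        page.add(rs.getString("payload"));
                    }
                }
            }
            // the connection is already closed before the (possibly slow) consumer runs
            for (String payload : page) {
                process(payload);
            }
        } while (page.size() == PAGE_SIZE);
    }

    private static void process(String payload) {
        // placeholder for the slow consumer
    }
}
```

The `limit ?` clause here is H2/MySQL/PostgreSQL syntax; Oracle and SQL Server need different paging clauses, which is exactly the portability problem mentioned above.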
