
Merge pull request #14215 from ihipop/feature/add_doc_for_issue_#8420 [skip ci]

Add doc for issue #8420
tags/2.0.12
Alexander Makarov 7 years ago committed by GitHub
parent
commit
137a936a60
  1. 79
      docs/guide/db-query-builder.md


@@ -652,10 +652,9 @@ value which will be used as the index value for the current row.
 ### Batch Query <span id="batch-query"></span>
-When working with large amounts of data, methods such as [[yii\db\Query::all()]] are not suitable
-because they require loading all data into the memory. To keep the memory requirement low, Yii
-provides the so-called batch query support. A batch query makes use of the data cursor and fetches
-data in batches.
+When working with large amounts of data, methods such as [[yii\db\Query::all()]] are not suitable because they require loading the whole query result into the client's memory. To solve this issue Yii provides batch query support. The server holds the query result, and the client uses a cursor to iterate over the result set one batch at a time.
+> Warning: There are known limitations and workarounds for the MySQL implementation of batch queries. See below.
 Batch query can be used like the following:
@@ -670,23 +669,23 @@ foreach ($query->batch() as $users) {
     // $users is an array of 100 or fewer rows from the user table
 }
-// or if you want to iterate the row one by one
+// or to iterate the row one by one
 foreach ($query->each() as $user) {
-    // $user represents one row of data from the user table
+    // data is being fetched from the server in batches of 100,
+    // but $user represents one row of data from the user table
 }
 ```
-The method [[yii\db\Query::batch()]] and [[yii\db\Query::each()]] return an [[yii\db\BatchQueryResult]] object
-which implements the `Iterator` interface and thus can be used in the `foreach` construct.
-During the first iteration, a SQL query is made to the database. Data are then fetched in batches
+The method [[yii\db\Query::batch()]] and [[yii\db\Query::each()]] return an [[yii\db\BatchQueryResult]] object which implements the `Iterator` interface and thus can be used in the `foreach` construct.
+During the first iteration, a SQL query is made to the database. Data is then fetched in batches
 in the remaining iterations. By default, the batch size is 100, meaning 100 rows of data are being fetched in each batch.
 You can change the batch size by passing the first parameter to the `batch()` or `each()` method.
 Compared to the [[yii\db\Query::all()]], the batch query only loads 100 rows of data at a time into the memory.
-If you process the data and then discard it right away, the batch query can help reduce memory usage.
-If you specify the query result to be indexed by some column via [[yii\db\Query::indexBy()]], the batch query
-will still keep the proper index. For example,
+If you specify the query result to be indexed by some column via [[yii\db\Query::indexBy()]], the batch query will still keep the proper index.
+For example:
 ```php
 $query = (new \yii\db\Query())
@@ -701,3 +700,59 @@ foreach ($query->each() as $username => $user) {
     // ...
 }
 ```
+#### Limitations of batch query in MySQL <span id="batch-query-mysql"></span>
+MySQL implementation of batch queries relies on the PDO driver library. By default, MySQL queries are [`buffered`](http://php.net/manual/en/mysqlinfo.concepts.buffering.php). This defeats the purpose of using the cursor to get the data, because it doesn't prevent the whole result set from being loaded into the client's memory by the driver.
+> Note: When `libmysqlclient` is used (typical of PHP5), PHP's memory limit won't count the memory used for result sets. It may seem that batch queries work correctly, but in reality the whole dataset is loaded into client's memory, and has the potential of using it up.
+To disable buffering and reduce client memory requirements, PDO connection property `PDO::MYSQL_ATTR_USE_BUFFERED_QUERY` must be set to `false`. However, until the whole dataset has been retrieved, no other query can be made through the same connection. This may prevent `ActiveRecord` from making a query to get the table schema when it needs to. If this is not a problem (the table schema is cached already), it is possible to switch the original connection into unbuffered mode, and then roll back when the batch query is done.
+```php
+Yii::$app->db->pdo->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
+// Do batch query
+Yii::$app->db->pdo->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, true);
+```
+> Note: In the case of MyISAM, for the duration of the batch query, the table may become locked, delaying or denying write access for other connections. When using unbuffered queries, try to keep the cursor open for as little time as possible.
+If the schema is not cached, or it is necessary to run other queries while the batch query is being processed, you can create a separate unbuffered connection to the database:
+```php
+$unbufferedDb = new \yii\db\Connection([
+    'dsn' => Yii::$app->db->dsn,
+    'username' => Yii::$app->db->username,
+    'password' => Yii::$app->db->password,
+    'charset' => Yii::$app->db->charset,
+]);
+$unbufferedDb->open();
+$unbufferedDb->pdo->setAttribute(\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY, false);
+```
+If you want to ensure that the `$unbufferedDb` has exactly the same PDO attributes like the original buffered `$db` but the `PDO::MYSQL_ATTR_USE_BUFFERED_QUERY` is `false`, [consider a deep copy of `$db`](https://github.com/yiisoft/yii2/issues/8420#issuecomment-301423833), set it to false manually.
+Then, queries are created normally. The new connection is used to run batch queries and retrieve results either in batches or one by one:
+```php
+// getting data in batches of 1000
+foreach ($query->batch(1000, $unbufferedDb) as $users) {
+    // ...
+}
+// data is fetched from server in batches of 1000, but is iterated one by one
+foreach ($query->each(1000, $unbufferedDb) as $user) {
+    // ...
+}
+```
+When the connection is no longer necessary and the result set has been retrieved, it can be closed:
+```php
+$unbufferedDb->close();
+```
+> Note: unbuffered query uses less memory on the PHP-side, but can increase the load on the MySQL server. It is recommended to design your own code with your production practice for extra massive data, [for example, divide the range for integer keys, loop them with Unbuffered Queries](https://github.com/yiisoft/yii2/issues/8420#issuecomment-296109257).
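The final note's suggestion to divide the range of integer keys could be sketched roughly as follows. This is only an illustration and not part of the committed guide text: `splitKeyRange()` is a hypothetical helper, and the `id` column and the chunk size in the usage comment are assumptions.

```php
<?php
/**
 * Split the inclusive integer range [$min, $max] into consecutive
 * sub-ranges covering at most $size keys each.
 * Hypothetical helper for illustration only; not part of Yii.
 *
 * @return array list of [from, to] pairs, empty if $min > $max
 */
function splitKeyRange(int $min, int $max, int $size): array
{
    if ($size < 1) {
        throw new InvalidArgumentException('$size must be at least 1.');
    }
    $ranges = [];
    for ($from = $min; $from <= $max; $from += $size) {
        // The last sub-range is clipped so it never runs past $max.
        $ranges[] = [$from, min($from + $size - 1, $max)];
    }
    return $ranges;
}

// Assumed usage with the unbuffered connection from the guide
// ($minId and $maxId would have to be looked up beforehand):
//
// foreach (splitKeyRange($minId, $maxId, 10000) as [$from, $to]) {
//     $query = (new \yii\db\Query())
//         ->from('user')
//         ->where(['between', 'id', $from, $to]);
//     foreach ($query->each(1000, $unbufferedDb) as $user) {
//         // ...
//     }
// }
```

Each sub-range then becomes its own, shorter unbuffered query, which keeps any single cursor open for less time.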
