Halfgaar Asks: Reduce 'wait/sql/table/io/handler' load of high frequent updates on the same rows
We have a time-series application that also writes all records to a kind of 'head' table in MySQL. This is somewhat of a legacy design, but very interwoven and hard to get rid of. The problem is that it's becoming a bottleneck, mostly IO. I'm hoping there are (clever) tricks to make it faster. For instance, data can remain dirty in the buffer pool for a long time; that's not too important.
It's currently running on a 96-CPU Amazon RDS instance (db.m5.24xlarge, 384 GB RAM), Multi-AZ. It still has general-purpose IO with 6000 IOPS for now, but that will be changed to provisioned IOPS. I'm not sure it will have much effect though; we don't reach the 6000 yet.
The table is this:
Code:
CREATE TABLE `headData` (
`idInstallation` int unsigned NOT NULL,
`idDataAttribute` smallint unsigned NOT NULL,
`instance` smallint unsigned NOT NULL,
`secondsToNextLog` int DEFAULT NULL,
`valueFloat` float DEFAULT NULL,
`valueString` varchar(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL,
`valueEnum` smallint DEFAULT NULL,
`timestamp` int unsigned DEFAULT NULL,
`alarmStart` int unsigned DEFAULT NULL,
`alarmChangeTimer` int unsigned DEFAULT NULL,
`alarmCleared` int unsigned DEFAULT NULL,
PRIMARY KEY (`idInstallation`,`idDataAttribute`,`instance`),
KEY `idDataAttribute` (`idDataAttribute`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8_unicode_ci
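For reference, the on-disk size quoted further down can be read from information_schema (a sketch; data_length + index_length is an approximation InnoDB maintains per table):

```sql
-- Approximate data + index size of the head table, in MB.
SELECT ROUND((data_length + index_length) / 1024 / 1024, 2) AS size_mb
FROM information_schema.tables
WHERE table_schema = DATABASE() AND table_name = 'headData';
```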
The INSERT ... ON DUPLICATE KEY UPDATE statements are multi-valued, with one idInstallation per statement. It's several hundred rows per query. 40 of those statements are batched inside a transaction. One statement is:
Code:
INSERT INTO headData
(idInstallation, idDataAttribute, instance, timestamp,
secondsToNextLog, valueFloat, valueString, valueEnum)
VALUES
(1, 2, 0, unix_timestamp(), 60, 4, null, null),
(1, 3, 0, unix_timestamp(), 60, 66, null, null),
(1, 4, 0, unix_timestamp(), 60, 3, null, null),
(1, 5, 0, unix_timestamp(), 60, 1, null, null),
...
(1, 100, 0, unix_timestamp(), 60, 4, null, null)
ON DUPLICATE KEY UPDATE
timestamp = VALUES(timestamp),
secondsToNextLog = VALUES(secondsToNextLog),
valueFloat = VALUES(valueFloat),
valueString = VALUES(valueString),
valueEnum = VALUES(valueEnum)
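The 40-statement batching described above amounts to something like this (a sketch; each INSERT is a multi-row upsert like the one shown, one per idInstallation):

```sql
-- Each queue runner groups its upserts to amortize commit overhead.
START TRANSACTION;
INSERT INTO headData (idInstallation, idDataAttribute, instance, timestamp,
                      secondsToNextLog, valueFloat, valueString, valueEnum)
VALUES (1, 2, 0, UNIX_TIMESTAMP(), 60, 4, NULL, NULL)
ON DUPLICATE KEY UPDATE timestamp = VALUES(timestamp),
                        secondsToNextLog = VALUES(secondsToNextLog),
                        valueFloat = VALUES(valueFloat),
                        valueString = VALUES(valueString),
                        valueEnum = VALUES(valueEnum);
-- ... 39 more such multi-row statements, one per idInstallation ...
COMMIT;
```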
In other words, it mostly finds existing rows and updates the values.
And currently about 2000/s of these are done.
It's currently updating about 500k rows per second, but it's normally lower. It's reporting 40k queries/s, but that number doesn't make sense to me (long story).
innodb_flush_log_at_trx_commit is 0. sync_binlog is 0.
The table is currently 'only' 3932.45 MB big.
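For completeness, these relaxed-durability settings can be applied at runtime like this (on RDS they would normally be set in the parameter group instead; shown as plain SQL for illustration):

```sql
-- Flush the redo log to disk roughly once per second instead of at
-- every commit, and let the OS buffer binlog writes.
SET GLOBAL innodb_flush_log_at_trx_commit = 0;
SET GLOBAL sync_binlog = 0;
```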
This is the load from an incident, where there is a lot of back-log to process:
You can see the light blue, which is IO.
You can also see the non-CPU load in this peak:
Because new rows are infrequently added and only existing values overwritten, theoretically, it could make do with very little IO. If it were to sync every few minutes, I'd be fine with that.
I tried attaining that a little bit with an innodb_log_file_size of 8 GB (I could increase this more). According to the log sequence numbers, about 2 GB per minute is currently being written. If I were to hold an hour's worth of redo as recommended here, it would be 120 GB...

One specific question on a side note: can a high enough innodb_log_file_size make a whole table fit in memory without IO?

Any other ways to reduce the impact of IO?
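The ~2 GB/minute figure can be cross-checked from the server's status counters rather than the log sequence numbers (a sketch; Innodb_os_log_written is a cumulative byte count since startup):

```sql
-- Redo bytes written since server start; sample this twice, 60 s
-- apart, and subtract to get the per-minute redo generation rate.
SHOW GLOBAL STATUS LIKE 'Innodb_os_log_written';
```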
Edit: also good to know there's a lot of parallelism. At the time of the incident, all 128 queue runners were active. They are set up so that they never update each other's rows: one idInstallation is only ever in one queue.

Edit: more graphs below, of the last 24 hours:
History length:
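The history length graphed here is InnoDB's undo history (purge lag); it can also be read directly, assuming the counter is enabled (it is by default on recent MySQL versions):

```sql
-- Number of undo log records not yet purged.
SELECT count FROM information_schema.innodb_metrics
WHERE name = 'trx_rseg_history_len';
```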