The world is brimming with the ideas of self-driving cars and all sorts of

other concepts where computers are supposed to take care of

decision making.

This makes a bit worried, for one because I simply like to drive and

would not want a computer to interfere with my driving. I am already

quite irritated by many automatic things in cars that don't really work

when winter is upon in Sweden :)

Anyways this post is not about that, this post is more about the general

problem of designing adaptive algorithms.

I've been designing NDB software for more than 20 years. During the

course of these years I have learned a bit about what is optimal

when executing NDB. Most of the software I write today is about

putting this knowledge into the NDB software itself.

This is a trend in databases today to automate configuration handling

in a DBMS. In NDB we started this trend in MySQL Cluster 7.4

when we implemented a "top" facility inside the NDB data nodes.

At the same time we also keep track of lags in writing to disk.

We used this knowledge to design an adaptive algorithm that changes

the speed of writing local checkpoints based on the current CPU usage

and IO lag.

We moved on in 7.5 and implemented an adaptive algorithm to control

from where sending will happen. This algorithm is also based on

keeping track of CPU usage in the threads in the data node.

The new partial LCP algorithm is also highly adaptive where it decides

how incremental the LCP should be based on the writing in the

database.

There is also work ongoing on some adaptiveness in the NDB API

where some threads will start up to assist the receive thread in the NDB

API when it gets overloaded.

There is even more work ongoing to ensure that the checkpoint speed

adapts also to conditions where we are starting to run out of REDO log.

Now the idea of adaptive algorithms is clearly a good idea, but, and there

is a big but, there are two problems with ANY adaptive algorithm.

The first problem is oscillation. Adaptive algorithms works by changing

the environment based on input from the environment. When you look

at an adaptive algorithm that works it is actually quite impressive. By

merely finding the proper conditions based on the input you can get a

system that quickly adapts to any working condition and finds a new

optimal spot to work in.

My original study at the university was mathematical statistics.

One important fact in most mathematical statistics is that you

have stable states and you have transient states.

An adaptive algorithm will work fine as long as the frequency of

changes in the environment is not faster than the time it takes to

find a new stable state.

As an example in the algorithms in NDB, most of them takes

decisions to change the environment about once per second.

One important thing to make those adaptive algorithms better

at adapting is to not change the controls to much. If one base

the decision on what to do the next second only on the last

second the adaptive algorithm is quite likely to

self-oscillate.

Thus it is important to build in some inertia in the adaptive

algorithm. This protects the algorithm from going wild.

But it doesn't make it adapt to conditions that change

quicker than the change frequency. Adaptive algorithms

cannot handle that.

So this is the first problem, to ensure that the adaptive

algorithm is quick enough to change to handle the

changing environment, but not so quick that it starts to

self-oscillate.

The second problem is when two adaptive algorithms

crash into each other. As an example in NDB we have a

problem when CPU load is extremely high due to

application activity while at the same time we are

coming close to the limit of the REDO log. In this case

we have two adaptive algorithms that conflict, one wants

to decrease the checkpoint speed to keep the application

activity while the other algorithm tries to slow down the

checkpoint activity to avoid running out of REDO log.

Now in a car the bets are higher, its human lifes involved.

Almost the same problem a self-driving car will have to

solve when the driver has decided on the speed he wants

to travel while at the same time the control of the car sees

dangers coming up ahead. These dangers could be other

cars, cliffs or any other thing.

Sometimes cars even have to make decision on whether

its own passengers should survive or whether the by-stander

should survive.

So the software of a self-driving car and any other

self-controlling software suffers from two big problems

to solve.

1) How often should I take input from the environment and

decide to change the controller parameters.

2) How should I handle conflicting requirements

Failure in handling 1) will lead to self-oscillating

behaviour and failure to handle 2) will lead to

crashes.

So hopefully any developer of self-driving cars has read up

a lot on adaptive algorithms and know exactly when the

algorithm is safe and when it isn't.

Personally I always feel a bit uneasy about any adaptive

algorithm since I know that it is almost impossible to

predict exactly how it is going to behave in all situations.

The mathematics involved in understanding adaptive

algorithms requires a lot of understanding of differential

equations.

other concepts where computers are supposed to take care of

decision making.

This makes a bit worried, for one because I simply like to drive and

would not want a computer to interfere with my driving. I am already

quite irritated by many automatic things in cars that don't really work

when winter is upon in Sweden :)

Anyways this post is not about that, this post is more about the general

problem of designing adaptive algorithms.

I've been designing NDB software for more than 20 years. During the

course of these years I have learned a bit about what is optimal

when executing NDB. Most of the software I write today is about

putting this knowledge into the NDB software itself.

This is a trend in databases today to automate configuration handling

in a DBMS. In NDB we started this trend in MySQL Cluster 7.4

when we implemented a "top" facility inside the NDB data nodes.

At the same time we also keep track of lags in writing to disk.

We used this knowledge to design an adaptive algorithm that changes

the speed of writing local checkpoints based on the current CPU usage

and IO lag.

We moved on in 7.5 and implemented an adaptive algorithm to control

from where sending will happen. This algorithm is also based on

keeping track of CPU usage in the threads in the data node.

The new partial LCP algorithm is also highly adaptive where it decides

how incremental the LCP should be based on the writing in the

database.

There is also work ongoing on some adaptiveness in the NDB API

where some threads will start up to assist the receive thread in the NDB

API when it gets overloaded.

There is even more work ongoing to ensure that the checkpoint speed

adapts also to conditions where we are starting to run out of REDO log.

Now the idea of adaptive algorithms is clearly a good idea, but, and there

is a big but, there are two problems with ANY adaptive algorithm.

The first problem is oscillation. Adaptive algorithms works by changing

the environment based on input from the environment. When you look

at an adaptive algorithm that works it is actually quite impressive. By

merely finding the proper conditions based on the input you can get a

system that quickly adapts to any working condition and finds a new

optimal spot to work in.

My original study at the university was mathematical statistics.

One important fact in most mathematical statistics is that you

have stable states and you have transient states.

An adaptive algorithm will work fine as long as the frequency of

changes in the environment is not faster than the time it takes to

find a new stable state.

As an example in the algorithms in NDB, most of them takes

decisions to change the environment about once per second.

One important thing to make those adaptive algorithms better

at adapting is to not change the controls to much. If one base

the decision on what to do the next second only on the last

second the adaptive algorithm is quite likely to

self-oscillate.

Thus it is important to build in some inertia in the adaptive

algorithm. This protects the algorithm from going wild.

But it doesn't make it adapt to conditions that change

quicker than the change frequency. Adaptive algorithms

cannot handle that.

So this is the first problem, to ensure that the adaptive

algorithm is quick enough to change to handle the

changing environment, but not so quick that it starts to

self-oscillate.

The second problem is when two adaptive algorithms

crash into each other. As an example in NDB we have a

problem when CPU load is extremely high due to

application activity while at the same time we are

coming close to the limit of the REDO log. In this case

we have two adaptive algorithms that conflict, one wants

to decrease the checkpoint speed to keep the application

activity while the other algorithm tries to slow down the

checkpoint activity to avoid running out of REDO log.

Now in a car the bets are higher, its human lifes involved.

Almost the same problem a self-driving car will have to

solve when the driver has decided on the speed he wants

to travel while at the same time the control of the car sees

dangers coming up ahead. These dangers could be other

cars, cliffs or any other thing.

Sometimes cars even have to make decision on whether

its own passengers should survive or whether the by-stander

should survive.

So the software of a self-driving car and any other

self-controlling software suffers from two big problems

to solve.

1) How often should I take input from the environment and

decide to change the controller parameters.

2) How should I handle conflicting requirements

Failure in handling 1) will lead to self-oscillating

behaviour and failure to handle 2) will lead to

crashes.

So hopefully any developer of self-driving cars has read up

a lot on adaptive algorithms and know exactly when the

algorithm is safe and when it isn't.

Personally I always feel a bit uneasy about any adaptive

algorithm since I know that it is almost impossible to

predict exactly how it is going to behave in all situations.

The mathematics involved in understanding adaptive

algorithms requires a lot of understanding of differential

equations.