Fun With Math

I’ve been reading a paper about artificial intelligence. It describes a set of algorithms that build cooperative relationships with humans, along with the various strategies the program can employ.

One strategy, called “expectant followers,” interested me. In this strategy, the computer attempts to cooperate with its human partner. If the human goes back on his word, the algorithm punishes him by choosing moves that give the human less than he would have gotten if he had cooperated. In other words, there is a price for failure to cooperate.

This strategy is a mathematical model of how FLR with discipline works. Mrs. Lion gives me regular sexual stimulation. As long as I follow my rules and obey her, she provides pleasure to me. If I break a rule or disobey (go back on my cooperative behavior), I am punished.

That much is pretty obvious. What caught my eye is that there is a calculation of how much punishment should be administered. In the mathematical model, at each move in the game (all this AI stuff uses various games to model behavior), the system computes the outcome if it cooperates. It also computes which move would give the human the least reward. For punishment, it picks that least-reward move. The system keeps punishing and resumes cooperating only after it computes that the punishment has cost the human more than he gained by not cooperating. Then it “forgives” the human and goes back to computing the best return for both of them.
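For the curious, here is a rough Python sketch of that punish-then-forgive loop. It is a toy under stated assumptions, not the paper's code: the two-move game, the payoff numbers, and the names PAYOFFS, punishing_move, PunishThenForgive, and play are all invented for illustration.

```python
# Hypothetical payoff table (an assumption, not from the paper):
# PAYOFFS[(agent_move, human_move)] = (agent_reward, human_reward)
# "C" means cooperate, "D" means defect (go back on your word).
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
MOVES = ("C", "D")


def human_best(agent_move):
    """Best reward the human can still get against a given agent move."""
    return max(PAYOFFS[(agent_move, h)][1] for h in MOVES)


def punishing_move():
    """Agent move that leaves the human the smallest best-case reward."""
    return min(MOVES, key=human_best)


class PunishThenForgive:
    def __init__(self):
        self.punishing = False
        self.gain = 0  # what the human gained by defecting
        self.loss = 0  # what punishment has cost the human so far

    def choose(self):
        return punishing_move() if self.punishing else "C"

    def observe(self, agent_move, human_move):
        coop_reward = PAYOFFS[("C", "C")][1]
        got = PAYOFFS[(agent_move, human_move)][1]
        if self.punishing:
            # Count how far below the cooperative reward the human landed.
            self.loss += coop_reward - got
            if self.loss > self.gain:
                # The offense has cost more than it paid: forgive.
                self.punishing = False
                self.gain = self.loss = 0
        elif human_move == "D":
            # A broken promise starts a punishment phase.
            self.punishing = True
            self.gain = got - coop_reward
            self.loss = 0


def play(human_moves):
    """Return the agent's move for each round against a fixed human script."""
    agent = PunishThenForgive()
    agent_moves = []
    for human_move in human_moves:
        agent_move = agent.choose()
        agent.observe(agent_move, human_move)
        agent_moves.append(agent_move)
    return agent_moves


print(play(["C", "D", "D", "C", "C"]))  # ['C', 'C', 'D', 'D', 'C']
```

With these made-up numbers, one defection earns the human 2 extra points. The first punishment round costs him exactly 2, which is not yet more than he gained, so a second round follows before the agent forgives and cooperates again.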

Other than being pretty cool math, this algorithm demonstrates a behavior that is very difficult for humans. We can agree that it’s pretty easy to determine when a punishment is necessary: it is triggered by a broken rule or disobedience. But is there a way to objectively determine how much punishment should be administered before forgiveness?

This is a problem for Mrs. Lion. So far, she has elected to administer a fairly constant punishment for any offense: a spanking of medium intensity. She doesn’t appear to consider prior acts in determining how severe she should be.

To me, at least, her strategy may be an indication of how she feels about the offenses themselves. A fairly uniform punishment regardless of infraction history suggests that she doesn’t have a real desire to extinguish the bad behavior. At least, that’s my hypothesis. She wants to punish me because I did something wrong, but she isn’t strongly committed to changing me.

If, on the other hand, she increased the duration and intensity of punishment for repeat offenses, then at some point I would decide that repeating the offense wasn’t worth the resulting punishment. Then my behavior would improve, at least for a while. If I reoffended later, she would again administer the level of punishment that had been effective in stopping repeat offenses.

I’m not saying that Mrs. Lion should or shouldn’t do this. I am illustrating a very successful algorithm in artificial intelligence. I’m pretty sure that this strategy would be effective if applied to me. See? Math can be useful and fun.