Common Lisp用の遺伝的アルゴリズムのライブラリGECOを使ってみる

GECO (Genetic Evolution through Combination of Objects)

ASDFが使えるので、インストールするには、このサイトからパッケージをダウンロードして展開し、その中のgeco.asdへのリンクを*central-registry*に登録されているディレクトリに置く。SBCLの場合、~/.sbcl/site以下にgecoのパッケージを展開し、~/.sbcl/systemsにシンボリックリンクを張る。

あとはREPLから

(asdf:oos 'asdf:load-op :geco)

とすればおk。

付属のサンプルプログラムをいじってフロイド問題を解かせてみよう。

フロイド問題とは、1から50までの自然数の集合が与えられたときに、それを2つの排他的な集合に分けて、それぞれ集合の要素の平方根の和を計算し、この2つの値の差を最小化するような分け方は何かという問題だ。
例えば、一つの集合を
 A = \{7, 8, 12, 13, 15, 16, 25, 27, 30, 31, 33, 36, 37, 38, 41, 43, 44, 45, 46, 47, 48, 49\}
として、Aに排反な集合をBとすれば、それぞれ集合の平方根の和を計算すると、
 \sum_{a \in A} \sqrt{a} = 119.517900301760320754230296092
 \sum_{b \in B} \sqrt{b} = 119.517900301760463739810150134
となってその差は0.00000000000014になる。

これをGAで解くときは、Aの集合に含まれる数字のビットを1とするビット列を遺伝子として使う。上の例なら00000011000110110000000010100110100111001011111110。

バイナリ遺伝子を用いるサンプルコードのうち、遺伝子の長さを50に変更して、評価関数をフロイド問題に合わせてやる。突然変異や交叉の起きる確率なんかはPROB-MUTATEやPROB-CROSSといったメソッド定義の中で指定するようだ。

(in-package :GECO-USER)

(defclass BINARY-CHROMOSOME-10 (binary-chromosome)
  ()
  (:documentation
   "A 10-bit binary chromosome."))

(defmethod SIZE ((self binary-chromosome-10))
  "So GECO will know how large to make the chromosome."
  50)

(defclass SIMPLE-BINARY-10-ORGANISM (organism)
  ()
  (:documentation
   "An organism with only a 10-bit binary chromosome."))

(defmethod CHROMOSOME-CLASSES ((self simple-binary-10-organism))
  "So GECO will know what chromosomes to make."
  '(binary-chromosome-10))

(defclass BINARY-POPULATION-STATISTICS (population-statistics)
  ((ALLELE-COUNTS
    :accessor allele-counts
    :initform nil
    :type (or null (vector fixnum 50))
    :documentation
    "The number of non-zero alleles, by locus, for our population."))
  (:documentation
   "Our population-statistics also contains allele counts."))

(defmethod COMPUTE-STATISTICS :AFTER ((pop-stats binary-population-statistics))
  "Compute the allele statistics for the population and save them."
  (setf (allele-counts pop-stats)
        (compute-binary-allele-statistics (population pop-stats))))

(defclass SIMPLE-BINARY-POPULATION (generational-population maximizing-score-mixin)
  ()
  (:documentation
   "Our populations are generational, and the scores are maximized."))

(defmethod ORGANISM-CLASS ((self simple-binary-population))
  "So GECO knows how to make the organisms in our population."
  'simple-binary-10-organism)

(defmethod POPULATION-STATISTICS-CLASS ((self simple-binary-population))
  "So GECO knows how to make our population statistics instances."
  'binary-population-statistics)

(defclass SIMPLE-PLAN (genetic-plan)
  ((STATISTICS
    :accessor statistics
    :initarg :statisics
    :initform nil))      ; so we can push instances
  (:documentation
   "Abstract class to allow method sharing for initialization & regeneration."))

(defun froid-eval (bit-vec)
  (wiz-util:nlet iter ((i 1)
		       (sum-a 0d0)
		       (sum-b 0d0))
    (if (> i (length bit-vec))
	(abs (- sum-a sum-b))
	(if (= (aref bit-vec (1- i)) 1)
	    (iter (1+ i) (+ sum-a (sqrt i)) sum-b)
	    (iter (1+ i) sum-a (+ sum-b (sqrt i)))))))

;; 評価関数EVALUATEをフロイド問題に合わせて書き換える
(defmethod EVALUATE ((self simple-binary-10-organism) (plan simple-plan)
                     &AUX (chromosome (first (genotype self))))
  "The score for our organisms is the number of non-zero alleles."
  #+:mcl (declare (ignore plan))
  (setf (score self)
	(let ((froid-score (froid-eval (loci chromosome))))
	  (/ 1.0 froid-score))))

(defmethod REGENERATE ((plan simple-plan) (old-pop simple-binary-population) &AUX
                       (new-pop (make-population (ecosystem old-pop)
                                                 (class-of old-pop)
                                                 :size (size old-pop))))
  "Create a new generation from the previous one, and record statistics."
  (setf (ecosystem new-pop) (ecosystem old-pop))
  ;; selectively reproduce, crossover, and mutate
  (operate-on-population plan old-pop new-pop)
  ;; record old-pop's statistics
  (push (statistics old-pop)     ; impractical for real-world problems
        (statistics plan))
  new-pop)

(defclass SIMPLE-PLAN-1 (simple-plan)
  ())

(defmethod PROB-MUTATE ((self SIMPLE-PLAN-1))
  "This is the probability of mutating an organism, not a single locus as is often used."
  0.05)

(defmethod PROB-CROSS ((self SIMPLE-PLAN-1))
  "The probability of crossover for an organism."
  0.9)

;; ここで世代ごとの操作(交叉や突然変異)を具体的にどのようにするかを定義している。
;; simple-plan-1とsimple-plan-2でここがどう違うかを確認する。
(defmethod OPERATE-ON-POPULATION
           ((plan simple-plan-1) old-population new-population &AUX
            (new-organisms (organisms new-population))
            (p-cross (prob-cross plan))
            (p-mutate (+ p-cross (prob-mutate plan)))
            (orphan (make-instance (organism-class old-population)))) ; not in any population
  "Apply the genetic operators on selected organisms from the old population."
  (let ((selector (stochastic-remainder-preselect old-population)))
    (do ((org1 (funcall selector) (funcall selector))
         org2
         (random# (geco-random-float 1.0) (geco-random-float 1.0))
         (i 0 (1+ i)))
        ((null org1))
      (cond
       ((> p-cross random#)
        (if (< 1 (hamming-distance (first (genotype org1))
                                   (first (genotype (setf org2 (pick-random-organism
                                                                old-population))))))
            (uniform-cross-organisms
             org1 org2
             (setf (aref new-organisms i)
                   (copy-organism org1 :new-population new-population))
             orphan) ;a throw-away, not in any population so it can be GC'd
          ;; hamming distances < 2 will produce eidetic offspring anyway, so bypass crossover & evaluation
          (setf (aref new-organisms i)
                (copy-organism-with-score org1 :new-population new-population))))
       ((> p-mutate random#)
        (mutate-organism
         (setf (aref new-organisms i)
               (copy-organism org1 :new-population new-population))))
       (T ;; copying the score bypasses the need for a redundant evaluate
        (setf (aref new-organisms i)
              (copy-organism-with-score org1 :new-population new-population)))))))

;;; Some stuff to test the algorithms:

(defvar *SBE* nil "a simple binary ecosystem")

(defun TEST-PLAN (times plan &KEY
                        (stream t)
                        (pop-size 20)
                        (evaluation-limit 400))
  (let (maxs avgs gens evals)
    (flet ((test ()
             (dotimes (i times)
               (setq *sbe* (make-instance 'ecosystem
                             :pop-class 'simple-binary-population
                             :pop-size pop-size
                             :plan-class plan
                             :evaluation-limit evaluation-limit))
               (evolve *sbe*)
               (format t "~& -- Max=~F, Avg=~F, #gens=~D, #evals=~D, Best=~A."
                       (max-score (statistics (population *sbe*)))
                       (avg-score (statistics (population *sbe*)))
                       (generation-number *sbe*)
                       (evaluation-number *sbe*)
		       (loci (car (genotype (best-organism (population *sbe*)))))
		       )
               (push (max-score (statistics (population *sbe*))) maxs)
               (push (avg-score (statistics (population *sbe*))) avgs)
               (push (generation-number *sbe*) gens)
               (push (evaluation-number *sbe*) evals))))
      (format stream "~&~A:" plan)
      (time (test)))
    (format stream "~& ==> Avg max=~F, Avg avg=~F, Avg #gens=~F, Avg #evals=~F"
            (/ (float (reduce #'+ maxs)) times)
            (/ (float (reduce #'+ avgs)) times)
            (/ (float (reduce #'+ gens)) times)
            (/ (float (reduce #'+ evals)) times))))

これでGAを実行できるようになる。実行は以下のようにする。これを初期値を変えながら100回実行する。個体数は3000、評価回数の上限は1000000000とする。

(test-plan 100 'simple-plan-1 :pop-size 3000 :evaluation-limit 1000000000)

結果は

SIMPLE-PLAN-1:
 -- Max=142.61489289357363, Avg=137.15552032970106, #gens=10, #evals=19851, Best=#*11101000110001000111100010011010010110001111111000.
 -- Max=5391.136246786632, Avg=5134.166416579917, #gens=7, #evals=14304, Best=#*10000010010101011111100101010100000100110101111100.
 -- Max=321.82183687562343, Avg=308.63278835266544, #gens=11, #evals=24536, Best=#*00110001110101111100111100000100010100001110101011.
 -- Max=3039.350724637681, Avg=2927.9129416460914, #gens=18, #evals=40850, Best=#*10001110101000001110110011110011100011011001100010.
 -- Max=3086.316409124356, Avg=2974.185924951546, #gens=8, #evals=14849, Best=#*00101000110101011101110000001000101011100101110110.
 -- Max=2501.075730471079, Avg=2400.2046481735915, #gens=8, #evals=14853, Best=#*10100111100101010011000100100011110010101001011110.
 -- Max=240.58185155443385, Avg=230.96363448202854, #gens=10, #evals=19397, Best=#*01101010001110101101100001011000111001000111011001.
 -- Max=760.8024668964266, Avg=734.6861283552338, #gens=10, #evals=19848, Best=#*11000110010101011011001100110010100000100111111100.
 -- Max=113.20964128586466, Avg=107.56115507851729, #gens=10, #evals=21065, Best=#*11011011110101110111001000100111000000110110010101.
 -- Max=48770.976744186046, Avg=47194.052766792905, #gens=16, #evals=36461, Best=#*10001111101010101100111100110100011010100100011001.
 -- Max=367.7923535601543, Avg=352.2279566215395, #gens=9, #evals=18251, Best=#*00000010101000000000111110110010101011011011100101.
 -- Max=144.89598231250216, Avg=139.5894589745756, #gens=10, #evals=19413, Best=#*00101100110011100110011010111100101000110011001010.
 -- Max=998.4060937871935, Avg=956.812256633669, #gens=8, #evals=15297, Best=#*01010001100000100001101010001111001100011101100111.
 -- Max=144.9110005527916, Avg=140.5674017258869, #gens=9, #evals=17246, Best=#*00001011100100110010000011100001010100101101101111.
 -- Max=2634.613065326633, Avg=2514.3065220883204, #gens=11, #evals=21514, Best=#*01100001110000001111100111100010000101000110011111.
 -- Max=433.9683393688567, Avg=415.31702442800355, #gens=10, #evals=21329, Best=#*01011011100001001001011000010001100001001111011111.
 -- Max=321.92063857548544, Avg=311.730311934609, #gens=9, #evals=17198, Best=#*11000010001010001101010110001001111101011011110000.
 -- Max=105.58082867643357, Avg=100.97763609447222, #gens=10, #evals=21104, Best=#*01111000110110010011101111001011000110011001011000.
 -- Max=281.2703862660944, Avg=271.89971514833843, #gens=15, #evals=32500, Best=#*00011110111010111001000011000101000011001011110011.
 -- Max=1671.0374501992032, Avg=1605.3150850833645, #gens=10, #evals=21083, Best=#*11110101000111111100001110010111010110100001101000.
 -- Max=621.8389918458117, Avg=599.0423787666093, #gens=20, #evals=37655, Best=#*00110000111001100110001101000100111100011111000110.
 -- Max=5737.761969904241, Avg=5491.044407643343, #gens=20, #evals=43692, Best=#*00111100100011100011110101100001101101100000001111.
 -- Max=574.2475355969332, Avg=553.7706584580071, #gens=12, #evals=25066, Best=#*01101001000011100000101011100011111000100111010101.
 -- Max=611.8605397520058, Avg=593.9162043065234, #gens=35, #evals=35561, Best=#*00100010011001000001110110010011010110100110110011.
 -- Max=145.93451863191956, Avg=138.7454583176363, #gens=8, #evals=17173, Best=#*11010000010010101011100110101111011111000110100000.
 -- Max=75.48328114314509, Avg=72.87191070970853, #gens=12, #evals=23561, Best=#*11101001110111110110101010100000100111101000011010.
 -- Max=212.83320647485664, Avg=205.46000889763192, #gens=13, #evals=27487, Best=#*01011010111001001110000011000011110011110000101101.
 -- Max=25890.765432098764, Avg=25053.635760838708, #gens=10, #evals=20425, Best=#*10001101101100110011001001110100101000100011101101.

(中略)

Evaluation took:
  105.236 seconds of real time
  105.162170 seconds of total run time (103.998552 user, 1.163618 system)
  [ Run times consist of 6.668 seconds GC time, and 98.495 seconds non-GC time. ]
  99.93% CPU
  447 lambdas converted
  280,637,441,808 processor cycles
  2 page faults
  4,723,809,104 bytes consed
  
 ==> Avg max=1662.6308595252744, Avg avg=1602.621294944806, Avg #gens=11.98, Avg #evals=24154.88

このうち評価値が最大のものは10001101101100110011001001110100101000100011101101だ。2つの集合の要素の平方根の和は0.0000386でまずまずといえる。