Database performance tuning for App Dev

บทความนี้เกี่ยวกับการ tuning database สำหรับ Application Developer เพื่อเป็นแนวทาง และสร้างความเข้าใจในการใช้งาน database ให้มีประสิทธิภาพ ตัวอย่างนี้จะใช้ PostgreSQL ซึ่งสามารถ apply ใช้กับ database อื่นๆ ได้

เริ่มจาก connect ไปยัง PostgreSQL และสร้าง table financial พร้อมกับ insert ข้อมูล 100,000,000 random records ตาม script (สามารถใช้ PostgreSQL ที่ run ด้วย container ในบทความ Template for PostgreSQL HA with Patroni and HA test)

“แก้จำนวน random records ให้เหลือ 10,000,000 เพื่อที่จะไม่ต้องรอนานในแต่ละคำสั่ง”

$ psql -h localhost -p 5000 -U postgres
Password for user postgres:
psql (16.1, server 16.2 (Debian 16.2-1.pgdg120+2))
Type "help" for help.

postgres=# create database test;
CREATE DATABASE

postgres=# \c test
$ psql (16.1, server 16.2 (Debian 16.2-1.pgdg120+2))
You are now connected to database "test" as user "postgres".

test=# CREATE TABLE transactions (id SERIAL PRIMARY KEY,
test(#   acc_from  TEXT   NOT NULL,
test(#   acc_to    TEXT   NOT NULL,
test(#   value     INT    NOT NULL);
CREATE TABLE

test=# INSERT INTO transactions (acc_from, acc_to, value) SELECT  'person' || CAST(FLOOR(RANDOM() * 10000) AS TEXT),  'person' || CAST(FLOOR(RANDOM() * 10000) AS TEXT),  FLOOR(RANDOM() * RANDOM() * 1000000) + 1 FROM GENERATE_SERIES(1, 100000000) AS i;

INSERT 0 100000000

test=#
test=# select count(*) from transactions;
   count
-----------
 100000000
(1 row)

test=# select * from transactions limit 1;
 id |  acc_from  |   acc_to   | value
----+------------+------------+-------
  1 | person1633 | person8258 |  2914

ทดสอบ query เพื่อให้ return ค่า account ที่ถูกโอนจาก person1633

test=# SELECT * FROM transactions WHERE acc_from='person1633' limit 2;
  id  |  acc_from  |   acc_to   | value
------+------------+------------+--------
    1 | person1633 | person8258 |   2914
 7276 | person1633 | person6042 | 214390
(2 rows)

PostgreSQL มี feature ที่ทำให้เราสามารถเห็น plan สำหรับ query ได้ด้วยการเพิ่ม EXPLAIN ไว้หน้า query statement

test=# EXPLAIN SELECT * FROM transactions WHERE acc_from='person1633';
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Gather  (cost=1000.00..1258126.36 rows=9974 width=28)
   Workers Planned: 2
   ->  Parallel Seq Scan on transactions  (cost=0.00..1256128.96 rows=4156 width=28)
         Filter: (acc_from = 'person1633'::text)
 JIT:
   Functions: 2
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(7 rows)

สิ่งที่น่าสนใจของ query นี้คือ plan จะ run ในแบบ “Sequential Scan” ด้วย Filter condition โดยการทำงานจะ loop ในทุกๆ row เพื่อเช็ค criteria ว่าตรงกับที่ต้องการค้นหาหรือไม่

นอกจากการใช้ EXPLAIN เพื่อดู plan แล้ว เราสามารถเพิ่ม option ANALYSE เพื่อให้ query ทำงานจริงและได้ข้อมูลเวลาที่ใช้ในการทำงาน query ออกมาด้วย

test=# EXPLAIN (ANALYSE) SELECT * FROM transactions WHERE acc_from='person1633';
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..1258126.36 rows=9974 width=28) (actual time=38.793..3755.765 rows=10037 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on transactions  (cost=0.00..1256128.96 rows=4156 width=28) (actual time=57.632..3710.918 rows=3346 loops=3)
         Filter: (acc_from = 'person1633'::text)
         Rows Removed by Filter: 33329988
 Planning Time: 0.249 ms
 JIT:
   Functions: 6
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 1.361 ms, Inlining 127.730 ms, Optimization 22.775 ms, Emission 19.870 ms, Total 171.737 ms
 Execution Time: 3757.601 ms
(12 rows)

ข้อมูล Execution Time: 3757.601 ms (3.8 sec) คือเวลาที่ใช้ในการ run query และข้อมูล actual time=57.632..3710.918 แบ่งเป็นเวลาสำหรับ first result คือ 57.632 ms และเวลาของผลลัพธ์ทั้งหมดคือ 3710.918 ms จากการที่ first result เป็นค่าที่ไม่มากนักคือ 58 ms เมื่อเทียบกับเวลาทั้งหมด โดยทั่วไป developer มักจะรอให้ได้ result ทั้งหมดเพื่อส่งกลับให้กับ Users แต่ถ้าเป็น application ลักษณะ streaming data จะได้ประโยชน์จากตรงนี้มาก

ค่าต่อไปที่ควรพิจารณาคือค่าประมาณ (predict) และค่าจริง (actual) ที่ได้จาก plan ตามข้อมูล result ค่า predict ได้ข้อมูลกลับมาจำนวน 9974 rows แต่ได้กลับมาจริง 10047 rows

อีกค่าที่เห็นได้จาก sub-query เป็นค่า predict ที่จะได้ผลลัพธ์กลับมาที่จำนวน 4156 rows แต่ได้กลับมาจริงที่จำนวน 3346 rows

ทั้งนี้ค่าดังกล่าวได้มาจากสถิติ (stat) ที่ควรจะไม่ต่างกันมากนัก ค่า stat ได้มาจาก PostgreSQL คาดการณ์และคำนวณกับ table ซึ่งเกิดขึ้นอัตโนมัติในทุกครั้งที่มีการ query บางครั้งค่าสถิติอาจจะไม่ล่าสุด เลยจำเป็นต้อง refresh เพื่อ update ค่าใหม่เสมอ และจำเป็นต้องทำทุกครั้งก่อนที่จะทำ performance tuning ด้วย cmd analyse

test=# analyse transactions;
ANALYZE

ทดลองดึงค่า plan และ result อีกรอบหลังจาก update stat

test=# EXPLAIN (ANALYSE) SELECT * FROM transactions WHERE acc_from='person1633';
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..1258126.26 rows=9973 width=28) (actual time=64.676..3620.414 rows=10037 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on transactions  (cost=0.00..1256128.96 rows=4155 width=28) (actual time=57.300..3593.221 rows=3346 loops=3)
         Filter: (acc_from = 'person1633'::text)
         Rows Removed by Filter: 33329988
 Planning Time: 0.588 ms
 JIT:
   Functions: 6
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 0.534 ms, Inlining 140.111 ms, Optimization 13.482 ms, Emission 14.104 ms, Total 168.231 ms
 Execution Time: 3707.546 ms
(12 rows)

ค่าที่ได้จะสังเกตุว่าค่า predict กับ actual time มีค่าที่ใกล้เคียงกัน อย่างไรก็ตาม query นี้ก็ยังใช้เวลาค่อนข้างนาน ถ้าดูที่ filter จะเห็นว่ามีถึง 33,329,988 rows ที่ถูก remove จาก filter ซึ่งทำให้เสียเวลาสำหรับการ query จากข้อมูลที่ได้รับจริงเพียง 3346 rows ถ้าเราสามารถจำกัดข้อมูลให้มีการ load เฉพาะบางส่วนของ data แทนที่จะดึงมาทั้งหมดก็น่าจะช่วยในการค้นหาได้เร็วยิ่งขึ้น ซึ่งตรงนี้คือการทำ index ของ table นั่นเอง

test=# CREATE INDEX transactions_from_idx ON transactions (acc_from);
CREATE INDEX
test=# ANALYSE transactions;
ANALYZE

ทดสอบดึงค่า plan และ stat อีกรอบหลังจากสร้าง index

test=# EXPLAIN (ANALYSE) SELECT * FROM transactions WHERE acc_from='person1633';
                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on transactions  (cost=114.02..36486.62 rows=9994 width=28) (actual time=5.034..296.537 rows=10037 loops=1)
   Recheck Cond: (acc_from = 'person1633'::text)
   Heap Blocks: exact=9975
   ->  Bitmap Index Scan on transactions_from_idx  (cost=0.00..111.52 rows=9994 width=0) (actual time=2.109..2.109 rows=10037 loops=1)
         Index Cond: (acc_from = 'person1633'::text)
 Planning Time: 1.519 ms
 Execution Time: 297.486 ms
(7 rows)

จากข้อมูลใหม่จะเห็นว่า Execution Time ลดลงจาก 3757.601 ms เหลือ 297.486 ms เร็วขึ้นมากกว่า 12 เท่า

การใช้ Index จะเหมาะสำหรับ table ขนาดใหญ่ สำหรับ table ขนาดเล็กและ query ข้อมูลที่ตรงกับข้อมูลส่วนใหญ่นั้น การใช้วิธีการแบบ Sequential Scan อาจจะได้ความเร็วมากกว่า และการเลือกว่าจะใช้ index หรือไม่ควรมีข้อมูล stat ของ table ณ เวลานั้นด้วยเสมอ

กรณีที่ query มีความซับซ้อนมากขึ้น เราอาจจำเป็นต้องมี index มากกว่าหนึ่ง index

test=# EXPLAIN (ANALYSE) SELECT * FROM transactions WHERE acc_from='person1633' AND acc_to='person6042';
                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on transactions  (cost=111.52..36509.10 rows=1 width=28) (actual time=3.898..34.561 rows=2 loops=1)
   Recheck Cond: (acc_from = 'person1633'::text)
   Filter: (acc_to = 'person6042'::text)
   Rows Removed by Filter: 10035
   Heap Blocks: exact=9975
   ->  Bitmap Index Scan on transactions_from_idx  (cost=0.00..111.52 rows=9994 width=0) (actual time=1.257..1.257 rows=10037 loops=1)
         Index Cond: (acc_from = 'person1633'::text)
 Planning Time: 0.304 ms
 Execution Time: 34.643 ms
(9 rows)

จะเห็นว่ามีการเรียกใช้ index เพื่อให้ได้ข้อมูลชุดแรก แล้วส่งต่อให้ heap scan เพื่อวนหา person6042 และมีการ filter ข้อมูลออกถึง 10035 rows กรณีนี้เราสามารถสร้าง index ที่ column acc_to เพื่อให้การ filter ทำงานได้เร็วขึ้น

test=# CREATE INDEX transactions_to_idx ON transactions (acc_to);
CREATE INDEX
test=# ANALYSE transactions;
ANALYZE

ทดสอบดึงค่า stat หลังจากสร้าง index

test=# EXPLAIN (ANALYSE) SELECT * FROM transactions WHERE acc_from='person1633' AND acc_to='person6042';
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on transactions  (cost=223.18..227.20 rows=1 width=28) (actual time=11.115..11.150 rows=2 loops=1)
   Recheck Cond: ((acc_from = 'person1633'::text) AND (acc_to = 'person6042'::text))
   Heap Blocks: exact=2
   ->  BitmapAnd  (cost=223.18..223.18 rows=1 width=0) (actual time=10.941..10.943 rows=0 loops=1)
         ->  Bitmap Index Scan on transactions_from_idx  (cost=0.00..111.33 rows=9969 width=0) (actual time=5.265..5.265 rows=10037 loops=1)
               Index Cond: (acc_from = 'person1633'::text)
         ->  Bitmap Index Scan on transactions_to_idx  (cost=0.00..111.60 rows=10004 width=0) (actual time=4.751..4.751 rows=9912 loops=1)
               Index Cond: (acc_to = 'person6042'::text)
 Planning Time: 1.313 ms
 Execution Time: 11.391 ms
(10 rows)

ผลลัพธ์ที่ได้มีการใช้ bitmap เพื่อ matching rows ของทั้งสอง index ที่สร้างขึ้นแล้วใช้ BitmapAnd เพื่อรวมผลลัพธ์ ซึ่งผลลัพธ์ที่ได้ Execution Time: 11.3191 ms ดีกว่าใช้แค่ index เดียว

ทั้งนี้ มีอีกวิธีที่จะทำให้ผลลัพธ์ดีกว่าการแยก index คือการใช้ compound index การใช้ index แบบนี้จะเป็นการสร้าง index จากการรวมกันของหลายๆ column

test=# CREATE INDEX transactions_from_to_idx ON transactions (acc_from, acc_to);
CREATE INDEX
test=# ANALYSE transactions;
ANALYZE

ทดสอบดึงค่า stat หลังจากสร้าง index

test=# EXPLAIN (ANALYSE) SELECT * FROM transactions WHERE acc_from='person1633' AND acc_to='person6042';
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using transactions_from_to_idx on transactions  (cost=0.57..8.59 rows=1 width=28) (actual time=1.159..1.172 rows=2 loops=1)
   Index Cond: ((acc_from = 'person1633'::text) AND (acc_to = 'person6042'::text))
 Planning Time: 0.582 ms
 Execution Time: 1.239 ms
(4 rows)

ผลลัพธ์ที่ได้จะเร็วกว่าเดิมมาก และ plan ที่ได้ก็ดูง่ายไม่ซับซ้อนด้วย

ต่อไปลองเปลี่ยน operation จาก AND เป็น OR หรือ UNION ALL เพื่อดูว่า transactions_from_to_idx จะถูกเรียกใช้หรือไม่

test=# EXPLAIN (ANALYSE) SELECT * FROM transactions WHERE acc_from='person1633' OR acc_to='person6042';
                                                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on transactions  (cost=232.97..69698.17 rows=19978 width=28) (actual time=13.400..596.243 rows=19947 loops=1)
   Recheck Cond: ((acc_from = 'person1633'::text) OR (acc_to = 'person6042'::text))
   Heap Blocks: exact=19692
   ->  BitmapOr  (cost=232.97..232.97 rows=19979 width=0) (actual time=4.631..4.633 rows=0 loops=1)
         ->  Bitmap Index Scan on transactions_from_idx  (cost=0.00..111.60 rows=10004 width=0) (actual time=3.021..3.021 rows=10037 loops=1)
               Index Cond: (acc_from = 'person1633'::text)
         ->  Bitmap Index Scan on transactions_to_idx  (cost=0.00..111.38 rows=9975 width=0) (actual time=1.607..1.607 rows=9912 loops=1)
               Index Cond: (acc_to = 'person6042'::text)
 Planning Time: 0.510 ms
 Execution Time: 597.583 ms
(10 rows)

test=# EXPLAIN (ANALYSE) SELECT * FROM transactions WHERE acc_from='person1633' UNION ALL SELECT * FROM transactions WHERE acc_from='person6042';
                                                                    QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 Append  (cost=114.10..73143.22 rows=20008 width=28) (actual time=3.422..412.339 rows=19879 loops=1)
   ->  Bitmap Heap Scan on transactions  (cost=114.10..36521.59 rows=10004 width=28) (actual time=3.419..126.569 rows=10037 loops=1)
         Recheck Cond: (acc_from = 'person1633'::text)
         Heap Blocks: exact=9975
         ->  Bitmap Index Scan on transactions_from_idx  (cost=0.00..111.60 rows=10004 width=0) (actual time=1.444..1.444 rows=10037 loops=1)
               Index Cond: (acc_from = 'person1633'::text)
   ->  Bitmap Heap Scan on transactions transactions_1  (cost=114.10..36521.59 rows=10004 width=28) (actual time=4.545..284.231 rows=9842 loops=1)
         Recheck Cond: (acc_from = 'person6042'::text)
         Heap Blocks: exact=9789
         ->  Bitmap Index Scan on transactions_from_idx  (cost=0.00..111.60 rows=10004 width=0) (actual time=3.600..3.600 rows=9842 loops=1)
               Index Cond: (acc_from = 'person6042'::text)
 Planning Time: 0.531 ms
 Execution Time: 413.466 ms
(13 rows)

ผลลัพธ์ที่ได้จะพบว่าไม่สามารถใช้ compound index ในกรณีที่เป็นลักษณะการรวมผลลัพธ์ (OR , UNION) แต่ plan สามารถเลือกใช้ index ที่มีอยู่ แล้วใช้ BitmapOR เพื่อรวมผลลัพธ์

กรณีที่ไม่มี index เลยหรือมีแค่ index เดียว การทำงานจะกลับไปเป็นแบบ Sequential scan กับ data ทั้งหมด ซึ่งทำให้ได้ผลลัพธ์ช้ามาก ถึงแม้จะมี index เดียว แต่การเชื่อมด้วย OR ก็ยังเกิด Sequential scan ของ column ที่เหลือ

การมี index จะช่วยให้การอ่านสามารถทำได้อย่างรวดเร็ว แต่จะส่งผลเสียกับการเขียนที่ช้าลง เพราะต้องเสียเวลากับการ update ข้อมูล index ดังนั้น ต้องใช้ index ให้เหมาะสมกับงาน เช่น ใช้หลาย index สำหรับงาน Data Warehouse แต่งานที่เป็น online application ที่ต้องมีการเขียนบ่อยๆ ก็จะไม่เหมาะ เพราะจะส่งผลต่อ performance ของ write transaction

การสร้าง index เองก็ต้องระวังเรื่องความซ้ำซ้อน (overlapping)

test=# SELECT indexname, indexdef FROM pg_indexes WHERE tablename='transactions';
        indexname         |                                          indexdef
--------------------------+---------------------------------------------------------------------------------------------
 transactions_pkey        | CREATE UNIQUE INDEX transactions_pkey ON public.transactions USING btree (id)
 transactions_from_idx    | CREATE INDEX transactions_from_idx ON public.transactions USING btree (acc_from)
 transactions_to_idx      | CREATE INDEX transactions_to_idx ON public.transactions USING btree (acc_to)
 transactions_from_to_idx | CREATE INDEX transactions_from_to_idx ON public.transactions USING btree (acc_from, acc_to)
(4 rows)

การสร้าง index ของ postgresql จะใช้ btree ซึ่งเป็น default ของ SQL index type และ Sorted list ดังนั้นค่า index ของ column จะถูก sort ตามลำดับ ตัวอย่างที่ผ่านมาเมื่อเราใช้ compound index จะได้ผลลัพธ์ดีกว่าเพราะการ sort จะทำที่ column แรกแล้ว sort ที่ column ที่สองสัมพันธ์กับ column แรก การค้นหาในลักษณะนี้จะได้ผลลัพธ์เร็วกว่าเพราะข้อมูลถูกเรียงไว้ให้แล้ว เมื่อเทียบกับข้อมูลที่ถูกเรียงแต่ไม่สัมพันธ์กันเหมือนแบบแรก

ขณะเดียวกัน ถ้าเราดู index ที่สร้างขึ้นมาจะพบว่า index column acc_from ถูกสร้างไว้สองที่ คือ transactions_from_idx และ transactions_from_to_idx เราสามารถลดความซ้ำซ้อนนี้ได้ด้วยการ drop transactions_from_idx และให้ plan เลือกใช้ index ของ column acc_from จาก index transactions_from_to_idx

test=# DROP INDEX transactions_from_idx;
DROP INDEX

test=# EXPLAIN (ANALYSE) SELECT * FROM transactions WHERE acc_from='person1633';
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on transactions  (cost=234.10..36641.59 rows=10004 width=28) (actual time=18.149..224.544 rows=10037 loops=1)
   Recheck Cond: (acc_from = 'person1633'::text)
   Heap Blocks: exact=9975
   ->  Bitmap Index Scan on transactions_from_to_idx  (cost=0.00..231.60 rows=10004 width=0) (actual time=16.133..16.133 rows=10037 loops=1)
         Index Cond: (acc_from = 'person1633'::text)
 Planning Time: 0.793 ms
 Execution Time: 225.338 ms
(7 rows)

จากผลลัพธ์ที่ได้ plan มีการใช้ index transactions_from_to_idx และผลลัพธ์ Execution Time ก็ไม่ต่างจากการมี index transactions_from_idx

กรณีนี้ไม่สามารถใช้กับ transactions_to_idx เพราะ transactions_from_to_idx เป็นการ sort column acc_to ตาม column acc_from จึงไม่ได้เป็นการ sort แบบอิสระเหมือน column acc_from ที่ไม่สนใจ column acc_to ดังนั้นกรณีนี้จึงใช้ได้เฉพาะ column แรกของ compound index เท่านั้น

ตัวอย่างที่ผ่านมาเรา query ข้อมูลทุกอย่างใน record ด้วย ‘*’ จะพบว่ามีสองขั้นตอนคืออ่านข้อมูลที่ตรงกับ index แล้วจึงดึงข้อมูลจาก heap (main table) ซึ่งก็คือ Bitmap Heap Scan on transactions

ถ้าเราเปลี่ยนจากอ่านข้อมูลทุกอย่างใน record เป็นอ่านเฉพาะ index column

test=# EXPLAIN (ANALYSE) SELECT acc_from, acc_to FROM transactions WHERE acc_from='person1633';
                                                                       QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using transactions_from_to_idx on transactions  (cost=0.57..335.64 rows=10004 width=20) (actual time=0.460..10.586 rows=10037 loops=1)
   Index Cond: (acc_from = 'person1633'::text)
   Heap Fetches: 0
 Planning Time: 0.358 ms
 Execution Time: 11.276 ms
(5 rows)

ผลลัพธ์จะได้ Execution Time ที่น้อยมาก เพราะมีแค่ขั้นตอนเดียวคือ match ข้อมูลตาม index โดยไม่ต้องอ่านข้อมูลใน main table โดยใน plan จะเห็นว่ามีการใช้ Index Only Scan

การใช้ Index Only Scan จะได้ performance ที่ดีมากๆ โดยที่เราไม่ต้องกังวลว่า data ที่ได้มาจะตรงกับ main table หรือไม่ เพราะกระบวนการ update ใดๆ ข้อมูล index และ main table จะมีการ update ให้ตรงกันอยู่แล้ว แต่ถ้าเมื่อไรที่เราเพิ่ม column ที่ไม่ใช่ index ใน select statement จะทำให้ plan กับมาใช้ Bitmap Heap Scan ซึ่งจะทำให้ query กลับมาอ่านข้อมูลใน main table และจะทำให้ query เรากลับมาช้าเหมือนเดิม

test=# EXPLAIN (ANALYSE) SELECT acc_from, acc_to, value FROM transactions WHERE acc_from='person1633';
                                                                QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on transactions  (cost=234.10..36641.59 rows=10004 width=24) (actual time=6.518..99.584 rows=10037 loops=1)
   Recheck Cond: (acc_from = 'person1633'::text)
   Heap Blocks: exact=9975
   ->  Bitmap Index Scan on transactions_from_to_idx  (cost=0.00..231.60 rows=10004 width=0) (actual time=4.501..4.502 rows=10037 loops=1)
         Index Cond: (acc_from = 'person1633'::text)
 Planning Time: 0.319 ms
 Execution Time: 100.680 ms
(7 rows)

วิธีที่จะจัดการปัญหานี้คือการใช้ Index-only queries (+INCLUDE) ซึ้งเป็นวิธีทั่วไปที่ support ใน database อื่นๆ นอกจาก PostgreSQL ด้วย

test=# DROP INDEX transactions_from_to_idx;
DROP INDEX
test=# CREATE INDEX transactions_from_to_idx
test-# ON transactions (acc_from, acc_to)
test-# INCLUDE (value);
CREATE INDEX
test=# ANALYSE transactions;
ANALYZE

test=# EXPLAIN (ANALYSE) SELECT acc_from, acc_to, value FROM transactions WHERE acc_from='person1633';
                                                                      QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Only Scan using transactions_from_to_idx on transactions  (cost=0.57..422.66 rows=9948 width=24) (actual time=2.391..8.608 rows=10037 loops=1)
   Index Cond: (acc_from = 'person1633'::text)
   Heap Fetches: 0
 Planning Time: 4.168 ms
 Execution Time: 9.156 ms
(5 rows)

เมื่อทดสอบอีกครั้งจะพบว่า plan กลับมาใช้ index only scan และ Execution Time ทำงานได้เวลาที่ดีเช่นเดิม

ทีนี้อาจจะเกิดความสงสัยว่าเราน่าจะมีอีกวิธีคือใช้ compound index ให้มี column value เข้าไปด้วย เช่น CREATE INDEX transactions_from_to_idx ON transactions (acc_from, acc_to, value); วิธีนี้มีข้อเสียคือ column value จะต้องถูก sort ด้วย จะเป็นการเพิ่ม process เทียบกับการที่ไม่ต้อง sort จะทำให้ลดขั้นตอนและสูญเสียเวลาน้อยกว่า

วิธีการใช้ INCLUDE เป็นการบอก database ว่าไม่ต้อง order column ซึ่งจะช่วยลดงานในขั้นตอน rebalance ของ Tree

INCLUDE สามารถใช้กับ UNIQUE indices ได้ แต่ต้องไม่ถูกใช้ใน unique constraint

สิ่งที่ต้องระวังอีกอย่างคือ Cache Memory เนื่องจาก index เองก็มี limit เกี่ยวกับ data ที่จะจัดเก็บใน record การเก็บ data ทุกอย่างไว้ใน index จะทำให้ index ใหญ่เกินไป และส่งผลทั้งต่อการเขียน และอ่านได้ ต้องพิจารณาการใช้งานอย่างเหมาะสม

นอกจากการ Tuning ด้วย index แล้วต้องพิจารณาเรื่อง Pages และ Processor Cache เพราะการ load data จาก main table จะต้องอ่าน file ที่อยู่ใน Disk มาที่ RAM ที่ละส่วน (chunks) หรือเรียกว่า pages (blocks ของ data มีขนาดมาตรฐานที่ 4 kB) ปัจุบัน CPU รุ่นใหม่ๆ สามารถอ่านข้อมูลขนาดใหญ่ (megabytes) ได้จึงควร tuning ให้เหมาะสม

การบันทึกข้อมูล PostgreSQL ทำการเก็บข้อมูลของ Table ในรูปแบบของ pages โดยที่ 1 record จะอยู่ใน page เดียวเท่านั้น และใน 1 page อาจจะมีหลาย 100 records ขึ้นอยู่กับขนาดของ record

กรณีอ่านข้อมูล page ที่ต้องการจะถูกอ่านจาก Disk ไปยัง RAM โดยที่ performance จะขึ้นอยู่กับความเร็วการอ่านของ Disk เมื่อมีการเรียกใช้ page เดิม ข้อมูลจะถูกอ่านจาก RAM แทนที่จะอ่านจาก Disk อีกรอบ แต่ถ้า space ของ RAM มีน้อย database จะทำการลบข้อมูลจาก RAM เพื่อให้ได้เนื้อที่กลับมาในการ load page ใหม่เข้ามาที่ RAM การลบข้อมูล page เก่าจาก RAM และการอ่านจาก Disk ส่งผลต่อ performance โดยรวมของ database เป็นอย่างมาก

การเพิ่มขนาดของ RAM ให้เหมาะสมเช่น database ขนาดเล็กทุก pages อาจจะอยู่ใน RAM ทำให้ทำงานเร็ว แต่ถ้า data มีขนาดใหญ่จะเกิดการอ่านจาก disk และขณะเดียวกันก็ต้องลบ pages จาก RAM ด้วย performance ก็จะช้า จึงต้อง assign RAM ให้เหมาะสม

การ query ข้อมูล อย่างเช่น SELECT * FROM transactions WHERE acc_from=’person1633′; อาจจะต้องการข้อมูลจากทั้ง page 1 และ page 2 โดยแต่ละ page อาจจะมีหลาย 100 records ซึ่งจะใช้เวลามากถ้าต้องอ่านจาก page ทั้งหมด กลับกันถ้า query ต้องการข้อมูลจาก page เดียวก็จะได้ผลลัพธ์ที่ดีกว่า จึงเป็นที่มาว่าทำไม index-only ถึงเร็ว เพราะการที่ index ถูก sort จะทำให้โอกาสข้อมูลจะกระจายไปยังหลายๆ page นั้นน้อยกว่านั่นเอง

ย้อนกลับไปดู analysis เดิม จะพบข้อมูลที่น่าสนใจคือ

test=# EXPLAIN (ANALYSE) SELECT * FROM transactions WHERE acc_from='person1633';
                                                                 QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on transactions  (cost=234.10..36641.59 rows=10004 width=28) (actual time=18.149..224.544 rows=10037 loops=1)
   Recheck Cond: (acc_from = 'person1633'::text)
   Heap Blocks: exact=9975
   ->  Bitmap Index Scan on transactions_from_to_idx  (cost=0.00..231.60 rows=10004 width=0) (actual time=16.133..16.133 rows=10037 loops=1)
         Index Cond: (acc_from = 'person1633'::text)
 Planning Time: 0.793 ms
 Execution Time: 225.338 ms
(7 rows)

จะพบว่ามีการดึงข้อมูลจำนวน 10037 records จาก table transactions ที่กระจายอยู่ใน page จำนวน 9975 pages หมายความว่าข้อมูลที่ต้องการถูกกระจายไปยังทุกๆ page page ละ 1 record query จะเร็วขึ้นมากถ้าข้อมูลไม่กระจายและถูก pack อยู่ในไม่กี่ page

การจัดการปัญหานี้ค่อนข้างยาก วิธีที่ง่ายคือการแยก read/write ข้อมูล และการ archive data ที่ไม่มีการเปลี่ยนแปลง แยก node เฉพาะสำหรับงาน อีกข้อแนะนำคือการลดขนาดของ record ให้เล็กลงเพื่อให้ 1 page สามารถเก็บ records ได้มากที่สุด

การกระจายของข้อมูลใน table เองก็มีผลต่อ performance ของการอ่านข้อมูล สามารถทดสอบได้จากการสร้าง table ใหม่คือ transaction2 แต่ทำการ order column acc_from หลังจากนั้น create index และดูผลลัพธ์ของ Execution Time

test=# CREATE TABLE transactions2 (id INT PRIMARY KEY,
  acc_from  TEXT   NOT NULL,
  acc_to    TEXT   NOT NULL,
  value     INT    NOT NULL);

test=# INSERT INTO transactions2 (id, acc_from, acc_to, value)
  SELECT id, acc_from, acc_to, value FROM transactions
  ORDER BY acc_from;

test=# CREATE INDEX transactions2_from_idx ON transactions2 (acc_from);

test=# ANALYSE transactions2;
ANALYZE

test=# EXPLAIN (ANALYSE) SELECT * FROM transactions2 WHERE acc_from='person1633';

อีกปัจจัยที่มีผลต่อ performance ของ database คือ data fragment ที่เกิดจากการ delete หรือ update บาง record ใน table ทั้งนี้ข้อมูล master table จะมี column ID ที่บอกว่า record นั้นถูกเขียนที่ page number และ record index อะไร เพื่อให้อ่านข้อมูลได้ถูกที่ ซึ่งจริงๆ แล้วข้อมูลไม่ได้ถูกลบออกทันที แต่จะถูก mark ว่า record นั้นเป็น dead record และข้อมูลใหม่หรือข้อมูล update จะถูกเพิ่มเข้าไป PostgreSQL เพิ่มกระบวนการในตรวจสอบ dead record ที่อยู่ใน page ด้วยการ track ข้อมูล dead record จาก memory ซึ่งในกระบวนการอ่านข้อมูลจาก page ถ้าพบว่า page นั้นมี dead record ก็จะมีกระบวนการตรวจสอบเพิ่มเติม แต่ถ้าเช็คแล้ว page นั้นไม่มี dead record ก็จะ return ทั้ง page ได้เลย ทำให้ performance จะดีขึ้นมาก

กระบวนการที่จะ scan dead record แล้วนำ space มาใช้ใหม่เรียกว่า Vacuum โดยที่กระบวนการนี้จะทำให้สามารถเขียนข้อมูลใหม่ที่ตำแหน่ง dead record ได้ แต่ก็ต้องเป็น record ของ table เดียวกันถึงจะใช้ space นี้ได้พอดี

ดังนั้นจำเป็นต้อง run process Vacuum อย่างสม่ำเสมอ เพื่อให้ได้ space กลับมาใช้อีกครั้ง อีกอย่างคือการที่ page มี space ว่างเยอะจากการที่ space ไม่ได้ถูกใช้เพราะไม่มี record ของ table เดียวกันสร้างใหม่ ก็จะเกิด space ใน page มากเกินไป ทำให้ 1 page มีจำนวน record น้อยลง ส่งผลให้การอ่านต้องอ่านจากหลาย page ซึ่งก็จะส่งผลต่อ performance เป็นเหตุการณ์ data fragmentation เช่นเดียวกับ Disk ที่เราต้องใช้ defragmenter utiltity ในการลด fragmentation ของ drive

การป้องกันปัญหานี้ จะต้องหลีกเลี่ยงการใช้งาน live data (เปลี่ยนแปลงบ่อย) กับ achived data (data ที่ไม่ค่อยเปลี่ยนแปลง) ใน node เดียวกัน การแยก node จะทำให้ปัญหานี้น้อยลง

…end…

Template for PostgreSQL HA with Patroni and HA test

ในการเริ่มต้นสร้าง PostgreSQL ด้วย patroni จะพบว่ามี config ค่อนข้างเยอะ จะมีวิธีไหนบ้างที่ง่ายๆ ที่จะ start database เพื่อทดลอง feature และ evaluate การทำงานใน case ต่างๆ เช่น กรณี fail over, read split-brain รวมถึงการ tuning เพื่อหา parameter ที่เหมาะสมไปใช้สำหรับงานจริง ในลักษณะของ sandbox ส่วนตัว แนะนำให้ใช้ template จาก Patroni: A Template for PostgreSQL HA with ZooKeeper, etcd or Consul โดยขั้นตอนตาม script ดังนี้

$ git clone https://github.com/zalando/patroni.git
$ cd patroni
$ docker build -t patroni .
$ docker-compose up -d

$ docker ps
CONTAINER ID   IMAGE     COMMAND                  CREATED          STATUS          PORTS                              NAMES
4505a559a9e3   patroni   "/bin/sh /entrypoint…"   43 seconds ago   Up 42 seconds                                      demo-etcd2
3b5183d6cee9   patroni   "/bin/sh /entrypoint…"   43 seconds ago   Up 42 seconds                                      demo-etcd1
bd982fed4baa   patroni   "/bin/sh /entrypoint…"   43 seconds ago   Up 42 seconds                                      demo-patroni1
425d51032cb2   patroni   "/bin/sh /entrypoint…"   43 seconds ago   Up 42 seconds   0.0.0.0:5000-5001->5000-5001/tcp   demo-haproxy
07cc2bfec72c   patroni   "/bin/sh /entrypoint…"   43 seconds ago   Up 42 seconds                                      demo-etcd3
cfa97b608d07   patroni   "/bin/sh /entrypoint…"   43 seconds ago   Up 42 seconds                                      demo-patroni3
96a5b49cd787   patroni   "/bin/sh /entrypoint…"   43 seconds ago   Up 42 seconds                                      demo-patroni2

จะเห็นว่า docker-compose.yml ได้ start patroni ขึ้นมา 3 node, etcd 3 node และ haproxy 1 node โดย expose port 5000 สำหรับ write และ 5001 สำหรับ read มาให้เลย

ทดสองเรียกใช้ cmd ของ patroni ใน cluster

$ docker exec -it demo-patroni1 patronictl list
+ Cluster: demo (7351235807499665431) --------+----+-----------+
| Member   | Host       | Role    | State     | TL | Lag in MB |
+----------+------------+---------+-----------+----+-----------+
| patroni1 | 172.19.0.6 | Leader  | running   |  1 |           |
| patroni2 | 172.19.0.7 | Replica | streaming |  1 |         0 |
| patroni3 | 172.19.0.5 | Replica | streaming |  1 |         0 |
+----------+------------+---------+-----------+----+-----------+

ทดสอบ edit config ของ postgresql instance

$ docker exec -i demo-patroni1 patronictl edit-config --apply - --force <<'JSON'
{
 synchronous_mode: "on",
 synchronous_mode_strict: "on",
 "postgresql":
   {
   "parameters":{
     "synchronous_commit": "on",
     "synchronous_standby_names": "*"
   }
  }
}
JSON

$ docker exec -it demo-patroni2 patronictl show-config
loop_wait: 10
maximum_lag_on_failover: 1048576
postgresql:
  parameters:
    max_connections: 100
    synchronous_commit: 'on'
    synchronous_standby_names: '*'
  pg_hba:
  - local all all trust
  - host replication replicator all md5
  - host all all all md5
  use_pg_rewind: true
retry_timeout: 10
synchronous_mode: 'on'
synchronous_mode_strict: 'on'
ttl: 30

ทดสอบ switch over primary node

$ docker exec -it demo-patroni1 patronictl list
+ Cluster: demo (7351235807499665431) -+-----------+----+-----------+
| Member   | Host       | Role         | State     | TL | Lag in MB |
+----------+------------+--------------+-----------+----+-----------+
| patroni1 | 172.19.0.6 | Sync Standby | streaming |  3 |         0 |
| patroni2 | 172.19.0.7 | Leader       | running   |  3 |           |
| patroni3 | 172.19.0.5 | Replica      | streaming |  3 |         0 |
+----------+------------+--------------+-----------+----+-----------+

$ docker exec -it demo-patroni1 patronictl switchover --candidate patroni1 --force
Current cluster topology
+ Cluster: demo (7351235807499665431) -+-----------+----+-----------+
| Member   | Host       | Role         | State     | TL | Lag in MB |
+----------+------------+--------------+-----------+----+-----------+
| patroni1 | 172.19.0.6 | Sync Standby | streaming |  3 |         0 |
| patroni2 | 172.19.0.7 | Leader       | running   |  3 |           |
| patroni3 | 172.19.0.5 | Replica      | streaming |  3 |         0 |
+----------+------------+--------------+-----------+----+-----------+
2024-03-28 03:11:15.87873 Successfully switched over to "patroni1"
+ Cluster: demo (7351235807499665431) ------+----+-----------+
| Member   | Host       | Role    | State   | TL | Lag in MB |
+----------+------------+---------+---------+----+-----------+
| patroni1 | 172.19.0.6 | Leader  | running |  3 |           |
| patroni2 | 172.19.0.7 | Replica | stopped |    |   unknown |
| patroni3 | 172.19.0.5 | Replica | running |  3 |         0 |
+----------+------------+---------+---------+----+-----------+

ทดสอบใช้งาน postgresql

$ docker exec -i demo-patroni1 psql <<'SQL'
create table test(name varchar(50));
insert into test(name) values ('tao');
SQL
CREATE TABLE
INSERT 0 1

$ docker exec -i demo-patroni1 psql <<'SQL'
select * from test;
SQL
 name
------
 tao
(1 row)

ทดสอบ Postgresql HA ที่ setup ด้วย Patroni

Standby Server Tests

No.	Test Scenario	Observation
1	Kill Postgresql process	Patroni จะ start process Postgresql เองโดยอัตโนมัติเมื่อเจอว่า postgresql process หายไป จากกระบวนการ health check – เนื่องจากเป็น process ที่ standby server จะไม่ส่งผลต่อ application ขณะทำการ write data
2	Stop Postgresql process	Patroni ทำการ start process เช่นเดียวกับ case 1 โดยไม่ส่งผลต่อ application ขณะทำการ write data
3	Reboot server	กรณีนี้จะต้อง set patroni service ให้เริ่มทำงานหลังจาก server boot เสร็จแล้ว ซึ่ง postgresql จะทำงานตามปกติหลังจาก service start – ไม่ส่งผลต่อ application ขณะทำการ write data
4	Stop patroni process	– Postgresql process ยังคงทำงานปกติ – ไม่สามารถใช้ cmd patronictl – ไม่ส่งผลต่อ application ขณะทำการ write data

Primary Server Tests

No	Test Scenario	Observation
1	Kill Postgresql process	Patroni ทำการ start postgresql process จากกระบวนการ health check แต่เนื่องจากเป็น node primary ที่มี primary lock อยู่แล้ว จึงไม่เกิดการกระบวนการ switch over ไปยัง replica node – ส่งผลต่อ application ไม่สามารถ write data ได้ในขณะ start process
2	Reboot server	ทำการ fail over ไปยัง standby server ด้วยกระบวนการเลือก standby server ที่อยู่ใน member หลังจากที่ server boot และ process postgresql ของ primary เดิมเริ่มทำงาน postgresql จะทำกระบวนการ pg_rewind เพื่อ update ข้อมูลล่าสุดและเปลี่ยนสถานะตัวเองเป็น standby server – ส่งผลต่อ application ไม่สามารถ write data ได้
3	Stop/kill patroni process	– Standby Server จะ promote ตัวเองเป็น primary จากกระบวนการ acquired DCS Lock – Primary server เดิมยังคงทำงาน รวมถึงการเขียนก็ยังเขียนที่ Primary server เดิม ซึ่งทำให้เกิด 2 Primary server ทำงานในเวลาเดียวกัน – หลังจากที่ Patroni กลับมาทำงานอีกครั้งที่ Primary Server เดิม จะมีกระบวนการ pg_rewind เพื่อ update ข้อมูลล่าสุด และเปลี่ยนสถานะตัวเองเป็น Follower * ในระหว่างที่เกิดเหตุการณ์ 2 primary server ทำงานพร้อมกันอาจจะมี data loss ได้ จะต้อง set replication parameter ให้เหมาะสม

Network Isolation Tests

No.	Test Scenario	Observation
1	Primary server ไม่สามารถ connect กับ Standby server ได้	– Postgresql หลุดจากการเป็น Primary server – เกิดกระบวนการเลือก Primary server ใหม่จาก member ที่เหลืออยู่ – ส่งผลให้ application ไม่สามารถเขียนข้อมูลได้
2	Standby server ไม่สามารถ connect กับ Server ใน member ได้	– Postgresql ยังคงทำงาน แต่จะไม่ถูกเลือกในกระบวนการเลือก Primary server ใหม่ – ไม่ส่งผลต่อการเขียนข้อมูลของ Applicaiton

Reference

Patroni for PostgreSQL

Patroni: A Template for PostgreSQL HA with ZooKeeper, etcd or Consul

Postgresql with Patroni installation recipe

เนื้อหาในบทความนี้เกี่ยวกับการ Install database postgresql ในแบบ HA mode ที่ใช้ Patroni ซึ่งเป็น package ที่ช่วยในการ replicate data และ fail over ระหว่าง node รวมถึงการจัดการ configuration ของ Postgresql หลาย node ไว้ที่เดียว โดยใช้ etcd เพื่อเป็น distributed configuration store (ใช้ sotfware ตัวอื่นได้ เช่น consul หรือ zookeeper) และมีการใช้ pgbackrest สำหรับ solution การ back up และ restore database

Requirements:

จำนวน Host ทั้งหมด 4 Host โดยจะมีการ set Postgresql node จำนวน 3 host และ 1 Host สำหรับ haproxy
S3 storage สำหรับเก็บ backup ของ Postgresql
Network connectivity ระหว่าง Host
Root access
Internet access สำหรับ system update และ download 3rd party packages

Installation Diagram

ทำการ tuning OS เพื่อให้มี config ที่เหมาะสำหรับ run database ตามคำแนะนำ Configuring your systems สรุปเป็น script ได้ดังนี้

# check selinux status
sestatus

#disable selinux
/etc/selinux/config
SELINUX=disabled

/etc/sssd/sssd.conf
selinux_provider=none

# check firewall and stop firewall
/sbin/chkconfig --list iptables
/sbin/chkconfig iptables off
# systemctl stop firewalld.service
# systemctl disable firewalld.service

#tuning OS parameter
/etc/sysctl.conf
kernel.shmall = echo $(expr $(getconf _PHYS_PAGES) / 2)
kernel.shmmax = echo $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))
kernel.shmmni = 4096
vm.overcommit_memory = 2
vm.overcommit_ratio = 95
net.ipv4.ip_local_port_range = 10000 65535 # See Port Settings
kernel.sem = 500 2048000 200 4096
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
vm.dirty_background_bytes = 1610612736
vm.dirty_bytes = 4294967296

vm.dirty_background_ratio = 3
vm.dirty_ratio = 10

$ awk 'BEGIN {OFMT = "%.0f";} /MemTotal/ {print "vm.min_free_kbytes =", $2 * .03;}'
               /proc/meminfo >> /etc/sysctl.conf

#tuning process and file open
# /etc/security/limits.conf
* soft nofile 524288
* hard nofile 524288
* soft nproc 131072
* hard nproc 131072

# check /etc/security/limits.d/20-nproc.conf
# same as above or nothing

#create disk for postgres seperate from OS, this is for performance improvement
# XFS Mount Options
rw,nodev,noatime,nobarrier,inode64

#CHECK
# /sbin/blockdev --getra devname
# chmod +x /etc/rc.d/rc.local
# /sbin/blockdev --setra 16384 /dev/sdb

# CHECK
# cat /sys/block/devname/queue/scheduler
# check for deadline

# CHECK
$ cat /sys/kernel/mm/*transparent_hugepage/enabled
always [never]

#CHECK
/etc/systemd/logind.conf
RemoveIPC=no

ถ้าไม่ disable firewall จะต้อง manual open port ของแต่ละ components เช่น
$ sudo ufw allow 2379 && sudo ufw allow 2380 && sudo ufw allow 5432 && sudo ufw allow 8008

setup host file เพื่อให้ host สามารถเรียกผ่านชื่อได้ หรืออาจจะใช้ DNS แต่กรณีนี้เพื่อให้ง่ายจึงเพิ่ม host file ในทุกเครื่อง หลักการตั้งชื่อจะใช้ชื่อ host ตามด้วยตัวเลขลำดับของ host และ “-” แล้วตามด้วยลำดับ postgres instance กรณีถ้าต้องการมี postgresql มากกว่าหนึ่ง instance ใน เครื่องเดียวกัน

เพิ่ม ip และ host name ใน file /etc/hosts โดยทำทุกเครื่อง pg1-1, pg2-1 , pg3-1

[ec2-user@ip-172-31-47-229 ~]$ sudo vim /etc/hosts 172.31.47.229 pg1-1 pg1-1.example.com 172.31.34.169 pg2-1 pg2-1.example.com 172.31.39.118 pg3-1 pg3-1.example.com

Download VMware Postgres RPM package จาก VMware Tanzu Network ซึ่ง version ที่ใช้จะเป็น version 16.2.0 ข้อดีของการใช้ package จาก VMware เพราะ tools ที่ต้องใช้ในการ setup ได้ถูก pack มาครบแล้วไม่ต้อง download module เพิ่มเติม รวมถึงเป็น package ที่ได้มีการ verify จาก vendor และแหล่งที่มาที่ปลอดภัยในการนำไปใช้จริงบน Production

file format ที่ download ได้จะอยู่ในรูปแบบ vmware-postgres<version>-<vmware-version>.<platform>.<architecture>.zip ตัวอย่างที่ download ได้คือ vmware-postgres-16.2.0.el9.x86_64.zip โดยที่ RPM จะอยู่ใน zip file ให้ unzip แล้ว install postgres ด้วย yum command

yum install ./vmware-postgres16-16.2-1.el9.x86_64.rpm

command yum install จะสร้าง user postgres และ set environment path ไว้ที่ user path ~postgres/.bash_profile พร้อมทั้งสร้าง dir /var/lib/pgsql/data สำหรับ user postgres ให้ด้วย ขั้นตอนต่อไป ทำการ login ไปที่ user postgres และเรียกใช้ path

su --login postgres
source ~/.bash_profile

ติดตั้ง patroni เพื่อให้ patroni สร้าง และจัดการ postgres instance

yum install ./vmware-postgres16-patroni-3.2.2-1.el9.x86_64.rpm

เป็นการเสร็จขั้นตอนการ install postgresql และ patroni ก่อนที่จะเริ่มใช้งาน postgresql จะต้องติดตั้ง etcd ก่อน เพื่อเป็นที่เก็บ state และ config ที่ patroni ต้องใช้ และ dir /var/lib/pgsql/data ต้องถูกสร้างและไม่มีข้อมูลใดๆ เนื่องจากเมื่อ patroni start ขึ้นมาแล้วไม่เจอข้อมูลใน dir /var/lib/pgsql/data จะทำการเรียก initdb เพื่อสร้าง database และ start database ขึ้นมาให้ แต่ถ้ามีข้อมูล patroni จะไม่สร้าง database

Download etcd version ล่าสุดจาก repository โดยใช้ script

ETCD_RELEASE=$(curl -s https://api.github.com/repos/etcd-io/etcd/releases/latest|grep tag_name | cut -d '"' -f 4)
echo $ETCD_RELEASE
wget https://github.com/etcd-io/etcd/releases/download/${ETCD_RELEASE}/etcd-${ETCD_RELEASE}-linux-amd64.tar.gz

unzip และย้าย binary file ไปยัง bin directory

tar xvf etcd-${ETCD_RELEASE}-linux-amd64.tar.gz
cd etcd-${ETCD_RELEASE}-linux-amd64
sudo mv etcd* /usr/local/bin

ตรวจสอบว่า file etcd etcdctl และ etcdutl ถูกย้ายไปที่ /usr/local/bin ตรวจเช็ค version ของ etcd binary file

[ec2-user@ip-172-31-47-229 ~]$ etcd --version
etcd Version: 3.5.12
Git SHA: e7b3bb6cc
Go Version: go1.20.13
Go OS/Arch: linux/amd64

[ec2-user@ip-172-31-47-229 ~]$ etcdctl version
etcdctl version: 3.5.12
API version: 3.5

[ec2-user@ip-172-31-47-229 ~]$ etcdutl version
etcdutl version: 3.5.12
API version: 3.5

เพื่อให้ง่ายในการจัดการ etcd service เราจะ config systemd เพื่อจัดการ etcd service ตาม script ดังนี้

#create data directory for etcd.
$ sudo mkdir -p /var/lib/etcd/
$ sudo mkdir /etc/etcd

#Create etcd system user
$ sudo groupadd --system etcd
$ sudo useradd -s /sbin/nologin --system -g etcd etcd

#Set /var/lib/etcd/ directory ownership to etcd user.
$ sudo chown -R etcd:etcd /var/lib/etcd/
$ sudo chmod 0775 /var/lib/etcd/

#Create a new config file for etcd.

[ec2-user@ip-172-31-47-229 ~]$ sudo vim /etc/etcd/etcd.conf
ETCD_NAME=pg1-1.example.com
ETCD_LISTEN_PEER_URLS=http://172.31.47.229:2380
ETCD_LISTEN_CLIENT_URLS=http://172.31.47.229:2379,http://127.0.0.1:2379
ETCD_INITIAL_ADVERTISE_PEER_URLS=http://172.31.47.229:2380
ETCD_ADVERTISE_CLIENT_URLS=http://172.31.47.229:2379
ETCD_INITIAL_CLUSTER=pg1-1.example.com=http://172.31.47.229:2380,pg2-1.example.com=http://172.31.34.169:2380,pg3-1.example.com=http://172.31.3
9.118:2380
ETCD_INITIAL_CLUSTER_STATE=new
ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster

#Create a new systemd service file for etcd
[ec2-user@ip-172-31-47-229 ~]$ sudo vim /etc/systemd/system/etcd.service
[Unit]
Description=etcd key-value store
Documentation=https://github.com/etcd-io/etcd
After=network.target

[Service]
User=etcd
Type=notify
Environment=ETCD_DATA_DIR=/var/lib/etcd
EnvironmentFile=/etc/etcd/etcd.conf
ExecStart=/usr/local/bin/etcd --enable-v2=true
Restart=always
RestartSec=10s
LimitNOFILE=40000

[Install]
WantedBy=multi-user.target

ทำการ reload config และ start etcd service ทุกครั้งที่ start host

#Reload systemd service and start etcd
$ sudo systemctl daemon-reload
$ sudo systemctl start etcd.service

#Enable service to start when system is rebooted
$ sudo systemctl enable etcd.service
Created symlink /etc/systemd/system/multi-user.target.wants/etcd.service → /etc/systemd/system/etcd.servic

#Check service status to confirm it is running.
[ec2-user@ip-172-31-47-229 ~]$ systemctl status etcd
● etcd.service - etcd key-value store
     Loaded: loaded (/etc/systemd/system/etcd.service; enabled; preset: disabled)
     Active: active (running) since Tue 2024-03-26 04:52:56 UTC; 1h 38min ago
       Docs: https://github.com/etcd-io/etcd
   Main PID: 673 (etcd)
      Tasks: 7 (limit: 4329)
     Memory: 182.2M
        CPU: 38.828s
     CGroup: /system.slice/etcd.service
             └─673 /usr/local/bin/etcd --enable-v2=true

#The service will start on localhost address port 2379
[ec2-user@ip-172-31-47-229 ~]$ ss -tunelp | grep 2379
tcp   LISTEN 0      4096       127.0.0.1:2379      0.0.0.0:*    uid:990 ino:20857 sk:3 cgroup:/system.slice/etcd.service <->
tcp   LISTEN 0      4096   172.31.47.229:2379      0.0.0.0:*    uid:990 ino:20858 sk:7 cgroup:/system.slice/etcd.service <->

[ec2-user@ip-172-31-47-229 ~]$ etcdctl -w table member list
+------------------+---------+-------------------+---------------------------+---------------------------+------------+
|        ID        | STATUS  |       NAME        |        PEER ADDRS         |       CLIENT ADDRS        | IS LEARNER |
+------------------+---------+-------------------+---------------------------+---------------------------+------------+
| 380e452f5fdbe4c2 | started | pg2-1.example.com | http://172.31.34.169:2380 | http://172.31.34.169:2379 |      false |
| 7fe19ee2a7299906 | started | pg3-1.example.com | http://172.31.39.118:2380 | http://172.31.39.118:2379 |      false |
| 8787d70676a8fc71 | started | pg1-1.example.com | http://172.31.47.229:2380 | http://172.31.47.229:2379 |      false |
+------------------+---------+-------------------+---------------------------+---------------------------+------------+

#Checking health status of the etcd node:
[ec2-user@ip-172-31-47-229 ~]$ etcdctl -w table endpoint health
+----------------+--------+------------+-------+
|    ENDPOINT    | HEALTH |    TOOK    | ERROR |
+----------------+--------+------------+-------+
| 127.0.0.1:2379 |   true | 4.518605ms |       |
+----------------+--------+------------+-------+

[ec2-user@ip-172-31-47-229 ~]$ etcdctl -w table endpoint status
+----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|    ENDPOINT    |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 127.0.0.1:2379 | 8787d70676a8fc71 |  3.5.12 |   33 kB |     false |      false |        29 |     434175 |             434175 |        |
+----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

[ec2-user@ip-172-31-47-229 ~]$ ETCDCTL_API=3 etcdctl endpoint status --cluster -w table
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|         ENDPOINT          |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| http://172.31.34.169:2379 | 380e452f5fdbe4c2 |  3.5.12 |   37 kB |     false |      false |        29 |     434341 |             434341 |        |
| http://172.31.39.118:2379 | 7fe19ee2a7299906 |  3.5.12 |   25 kB |      true |      false |        29 |     434341 |             434341 |        |
| http://172.31.47.229:2379 | 8787d70676a8fc71 |  3.5.12 |   33 kB |     false |      false |        29 |     434341 |             434341 |        |
+---------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

#check log
journalctl -xeu etcd.service -l --no-pager|less

ทดสอบใช้งาน etcd service

$ etcdctl put welcome "Hello World"
OK
Read the value of message back:

$ etcdctl get welcome
welcome
Hello World
To delete the key run:

$ etcdctl del welcome
1

$ etcdctl get welcome
Empty-output

เนื่องจากต้องทำที่ทุกเครื่อง เพื่อลดขึ้นตอนการพิมพ์ สามารถทำเป็น script โดยแก้ variable instance_hostname, instance_address ก่อนหน้าบรรทัด cat – <<HEREDOC > etcd.conf ให้ตรงตามเครื่องที่จะ generate อ่านข้อมูลเพิ่มเติมได้จาก Setup and Run etcd

instance_hostname=pg3-1.example.com
instance_address=172.31.39.118
etcd_hostname_a=pg2-1.example.com
etcd_address_a=172.31.34.169
etcd_hostname_b=pg1-1.example.com
etcd_address_b=172.31.47.229
cat - <<HEREDOC > etcd.conf
ETCD_DATA_DIR=/var/lib/etcd
ETCD_NAME=$instance_hostname
ETCD_LISTEN_PEER_URLS=http://$instance_address:2380
ETCD_LISTEN_CLIENT_URLS=http://$instance_address:2379,http://127.0.0.1:2379
ETCD_INITIAL_ADVERTISE_PEER_URLS=http://$instance_address:2380
ETCD_ADVERTISE_CLIENT_URLS=http://$instance_address:2379
ETCD_INITIAL_CLUSTER=$instance_hostname=http://$instance_address:2380,$etcd_hostname_a=http://$etcd_address_a:2380,$etcd_hostname_b=http://$etcd_address_b:2380
ETCD_INITIAL_CLUSTER_STATE=new
ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
HEREDOC
sudo mv etcd.conf /etc/etcd/etcd.conf

หลังจากที่ etcd ทำงานได้แล้ว ต่อไปคือการ setup Patroni และ pgbackrest สำหรับ backup solution ทำการติดตั้งตาม script นี้ของทุกเครื่อง

$ yum install -y python3 python3-devel gcc
$ su - postgres
$ cat - <<HEREDOC > requirements.txt
PyYAML
click>=4.1
prettytable>=0.7
psutil>=2.0.0
python-dateutil
python-etcd>=0.4.3,<0.5
requests
six >= 1.7
urllib3>=1.19.1,!=1.21
ydiff>=1.2.0
HEREDOC

#install an extra package required to connect etcd
$ pip3 install --user -r requirements.txt

สร้าง patroni.yml file โดยแก้ ip ให้ตรงตามเครื่องที่ติดตั้ง ทั้งสามเครื่อง

[postgres@ip-172-31-47-229 ~]$ vim patroni.yml
scope: pg_cluster
name: pg1-1
restapi:
  listen: '172.31.47.229:8008'
  connect_address: '172.31.47.229:8008'
etcd:
  hosts: 'pg1-1.example.com:2379,pg2-1.example.com:2379,pg3-1.example.com:2379'
bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
      use_slots: true
      parameters:
        hot_standby: 'on'
        wal_keep_segments: 20
        max_wal_senders: 8
        max_replication_slots: 8
        archive_command: pgbackrest --stanza=pg_cluster archive-push "%p"
        archive_mode: 'on'
      recovery_conf:
        recovery_target_timeline: latest
        restore_command: pgbackrest --stanza=pg_cluster archive-get %f "%p"
    slots:
      patroni_standby_leader:
        type: physical
  initdb:
    - encoding: UTF8
    - locale: en_US.UTF-8
    - data-checksums
  pg_hba:
    - local all all trust
    - host all postgres 0.0.0.0/0 trust
    - host replication replicator 172.31.47.229/32 md5
    - host replication replicator 172.31.34.169/32 md5
    - host replication replicator 172.31.39.118/32 md5
    - host all all 0.0.0.0/0 md5
  users:
    admin:
      password: admin
      options:
        - createrole
        - createdb
postgresql:
  listen: '172.31.47.229:5432'
  connect_address: '172.31.47.229:5432'
  data_dir: /var/lib/pgsql/data
  pgpass: /tmp/pgpass0
  authentication:
    replication:
      username: replicator
      password: rep-pass
    superuser:
      username: postgres
      password: postgres
    rewind:
      username: rewind_user
      password: rewind_password
  create_replica_methods:
    - pgbackrest
    - basebackup
  pgbackrest:
    command: pgbackrest --stanza=pg_cluster --delta restore
    keep_data: true
    no_params: true
  basebackup:
    checkpoint: 'fast'
tags:
  nofailover: false
  noloadbalance: false
  clonefrom: false
  nosync: false

สร้าง config file สำหรับ pgbackrest ตาม script

$ sudo mkdir -p /var/log/pgbackrest
$ sudo mkdir -p /etc/pgbackrest
$ sudo mkdir -p /var/lib/pgbackrest

$ sudo chmod 0750 /var/lib/pgbackrest
$ sudo chmod 770 /var/log/pgbackrest

$ sudo touch /etc/pgbackrest/pgbackrest.conf
$ sudo chmod 640 /etc/pgbackrest/pgbackrest.conf

$ sudo chown -R postgres:postgres /var/lib/pgbackrest
$ sudo chown -R postgres:postgres /var/log/pgbackrest
$ sudo chown postgres:postgres /etc/pgbackrest/pgbackrest.conf

ตัวอย่าง configuration สำหรับ pgbackrest โดยตัวอย่างนี้จะใช้ s3 สำหรับเก็บ backup file ของ postgres ถ้าไม่มี s3 สามารถใช้ minio ทดแทนได้

[postgres@ip-172-31-47-229 ~]$ more /etc/pgbackrest/pgbackrest.conf
[pg_cluster]
pg1-path=/var/lib/pgsql/data
pg1-port=5432
pg1-user=postgres

[global]
start-fast=y
process-max=4
archive-timeout=300

repo1-type=s3
repo1-s3-uri-style=path
repo1-path=/var/lib/pgbackrest
repo1-retention-full=4
repo1-s3-bucket=pgbackrest1
repo1-s3-endpoint=s3.ap-southeast-1.amazonaws.com
repo1-s3-key=xxx
repo1-s3-key-secret=xxxx
repo1-s3-region=ap-southeast-1
log-level-console=info
log-level-file=debug
log-path=/var/log/pgbackrest

[global:archive-push]
compress-level=3

สร้างไฟล์ systemd service สำหรับจัดการ patroni service

[postgres@ip-172-31-47-229 ~]$ vim /etc/systemd/system/patroni.service
[Unit]
Description=Runners to orchestrate a high-availability PostgreSQL
After=syslog.target network.target

[Service]
Type=simple
User=postgres
Group=postgres
Environment=PATH=/opt/vmware/postgres/16/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin
Environment=PGDATA=/var/lib/pgsql/data
ExecStart=/opt/vmware/postgres/16/bin/patroni /var/lib/pgsql/patroni.yml
KillMode=process
TimeoutSec=30
Restart=no

[Install]
WantedBy=multi-user.target

Start Patroni service และกำหนดให้เริ่มทำงานเมื่อ start host

#Reload systemd service and start patroni
sudo systemctl daemon-reload
sudo systemctl start patroni.service

#Enable service to start when system is rebooted
$ sudo systemctl enable patroni.service
Created symlink /etc/systemd/system/multi-user.target.wants/etcd.service → /etc/systemd/system/etcd.service.

#check patroni log
sudo journalctl -xeu patroni -f

check patroni service status

[ec2-user@ip-172-31-47-229 etcd]$ sudo systemctl status patroni
● patroni.service - Runners to orchestrate a high-availability PostgreSQL
     Loaded: loaded (/etc/systemd/system/patroni.service; enabled; preset: disabled)
     Active: active (running) since Wed 2024-03-27 03:30:50 UTC; 1h 22min ago
   Main PID: 678 (patroni)
      Tasks: 12 (limit: 4329)
     Memory: 158.2M
        CPU: 4.086s
     CGroup: /system.slice/patroni.service
             ├─ 678 /opt/vmware/postgres/16/python/bin/python3 /opt/vmware/postgres/16/bin/patroni /var/lib/pgsql/patroni.yml
             ├─1834 postgres -D /var/lib/pgsql/data --config-file=/var/lib/pgsql/data/postgresql.conf --listen_addresses=172.31.47.229 --port=5432 --cluster_n>
             ├─1835 "postgres: pg_cluster: checkpointer "
             ├─1836 "postgres: pg_cluster: background writer "
             ├─1837 "postgres: pg_cluster: startup recovering 000000020000000000000007"
             ├─1853 "postgres: pg_cluster: postgres postgres 172.31.47.229(46466) idle"
             └─1858 "postgres: pg_cluster: walreceiver streaming 0/7009CD0"

[postgres@ip-172-31-47-229 ~]$ patronictl -c patroni.yml list
+ Cluster: pg_cluster (7350532227659130439) ---+----+-----------+
| Member | Host          | Role    | State     | TL | Lag in MB |
+--------+---------------+---------+-----------+----+-----------+
| pg1-1  | 172.31.47.229 | Replica | streaming |  2 |         0 |
| pg2-1  | 172.31.34.169 | Leader  | running   |  2 |           |
| pg3-1  | 172.31.39.118 | Replica | streaming |  2 |         0 |
+--------+---------------+---------+-----------+----+-----------+

ทำการทดสอบ WAL replication ด้วยการสร้าง database และ insert test data ที่ primary node และเรียกดู query ที่ replica node

[postgres@ip-172-31-47-229 ~]$ psql -c 'create database test;'
CREATE DATABASE

postgres=# \c test
You are now connected to database "test" as user "postgres".
test=# create table test(name varchar(50));
CREATE TABLE
test=# insert into test (name) values ('tao');
INSERT 0 1
test=# select * from test;
 name
------
 tao
(1 row)

#for each ost
$ psql -U postgres -d test -c "select * from test;"

ทดสอบ switch over ด้วยการย้าย primary node ไปยัง replica node

patronictl -c patroni.yml switchover

เนื่องจาก patroni จะเป็นตัวจัดการ postgres ดังนั้นข้อมูล config ต่างๆ จะถูกเก็บใน etcd แล้ว apply ไปยัง postgres ที่อยู่ใน member ดังนั้นการ manual แก้ไข postgres.conf จึงไม่ส่งผลต่อระบบ ถ้าต้องการแก้ไข config ต้องแก้ผ่าน patronictl

#change postgres config parameter
$ patronictl -c patroni.yml edit-config

#reload the change
patronictl -c patroni.yml reload pg_cluster

ตัวอย่าง edit config เพื่อให้การ replicate ข้อมูลระหว่าง primary กับ replica เป็นแบบ synchronize กรณีถ้ามีการแยกการอ่านข้อมูลที่ replica แต่เขียนที่ primary แล้ว mode การ replicate เป็น asynchronize จะทำให้ข้อมูลที่ได้จาก replica ไม่เป็นข้อมูล update ล่าสุดได้

patronictl -c patroni.yml edit-config --apply - --force <<'JSON'
{
 synchronous_mode: "on",
 synchronous_mode_strict: "on",
 "postgresql": 
   {
   "parameters":{
     "synchronous_commit": "on",
     "synchronous_standby_names": "*"
   }
  }
}
JSON

วิธีการ Manual backup database โดยใช้ pgbackrest

[postgres@ip-172-31-47-229 ~]$ pgbackrest --stanza=pg_cluster stanza-create
2024-03-26 05:03:30.425 P00   INFO: stanza-create command begin 2.50: --exec-id=1747-6d3c27b2 --log-level-console=info --log-level-file=debug --log-path=/var/log/pgbackrest --pg1-path=/var/lib/pgsql/data --pg1-port=5432 --pg1-user=postgres --repo1-path=/var/lib/pgbackrest --repo1-s3-bucket=pgbackrest1 --repo1-s3-endpoint=s3.ap-southeast-1.amazonaws.com --repo1-s3-key=<redacted> --repo1-s3-key-secret=<redacted> --repo1-s3-region=ap-southeast-1 --repo1-s3-uri-style=path --repo1-type=s3 --stanza=pg_cluster
2024-03-26 05:03:31.029 P00   INFO: stanza-create for stanza 'pg_cluster' on repo1
2024-03-26 05:03:31.297 P00   INFO: stanza-create command end: completed successfully (875ms)

[postgres@ip-172-31-47-229 ~]$ pgbackrest --stanza=pg_cluster backup
2024-03-26 05:09:06.384 P00   INFO: backup command begin 2.50: --archive-timeout=300 --exec-id=1779-d07de746 --log-level-console=info --log-level-file=debug --log-path=/var/log/pgbackrest --pg1-path=/var/lib/pgsql/data --pg1-port=5432 --pg1-user=postgres --process-max=4 --repo1-path=/var/lib/pgbackrest --repo1-retention-full=4 --repo1-s3-bucket=pgbackrest1 --repo1-s3-endpoint=s3.ap-southeast-1.amazonaws.com --repo1-s3-key=<redacted> --repo1-s3-key-secret=<redacted> --repo1-s3-region=ap-southeast-1 --repo1-s3-uri-style=path --repo1-type=s3 --stanza=pg_cluster --start-fast
WARN: no prior backup exists, incr backup has been changed to full
2024-03-26 05:09:07.209 P00   INFO: execute non-exclusive backup start: backup begins after the requested immediate checkpoint completes
2024-03-26 05:09:07.911 P00   INFO: backup start archive = 000000010000000000000005, lsn = 0/5000028
2024-03-26 05:09:07.911 P00   INFO: check archive for prior segment 000000010000000000000004
2024-03-26 05:09:14.194 P00   INFO: execute non-exclusive backup stop and wait for all WAL segments to archive
2024-03-26 05:09:14.399 P00   INFO: backup stop archive = 000000010000000000000005, lsn = 0/5000138
2024-03-26 05:09:14.544 P00   INFO: check archive for segment(s) 000000010000000000000005:000000010000000000000005
2024-03-26 05:09:14.734 P00   INFO: new backup label = 20240326-050907F
2024-03-26 05:09:15.072 P00   INFO: full backup size = 22.3MB, file total = 972
2024-03-26 05:09:15.073 P00   INFO: backup command end: completed successfully (8692ms)
2024-03-26 05:09:15.073 P00   INFO: expire command begin 2.50: --exec-id=1779-d07de746 --log-level-console=info --log-level-file=debug --log-path=/var/log/pgbackrest --repo1-path=/var/lib/pgbackrest --repo1-retention-full=4 --repo1-s3-bucket=pgbackrest1 --repo1-s3-endpoint=s3.ap-southeast-1.amazonaws.com --repo1-s3-key=<redacted> --repo1-s3-key-secret=<redacted> --repo1-s3-region=ap-southeast-1 --repo1-s3-uri-style=path --repo1-type=s3 --stanza=pg_cluster
2024-03-26 05:09:15.171 P00   INFO: expire command end: completed successfully (98ms)


[postgres@ip-172-31-47-229 ~]$ pgbackrest info
stanza: pg_cluster
    status: ok
    cipher: none

    db (current)
        wal archive min/max (16): 000000010000000000000001/000000010000000000000005

        full backup: 20240326-050907F
            timestamp start/stop: 2024-03-26 05:09:07+00 / 2024-03-26 05:09:14+00
            wal start/stop: 000000010000000000000005 / 000000010000000000000005
            database size: 22.3MB, database backup size: 22.3MB
            repo1: backup set size: 3.0MB, backup size: 3.0MB

ถ้าต้องการตั้ง schedule backup สามารถใช้ crontab ได้ตามตัวอย่าง

00 01 * * 0 sudo -u postgres --type=full --stanza=pg_cluster backup &> /dev/null
00 01 * * 1-6 sudo -u postgres --type=diff --stanza=pg_cluster backup &> /dev/null
0 */4 * * * sudo -u postgres --type=incr --stanza=pg_cluster backup &> /dev/null

กรณีที่ต้องการ reinit postgres node ใหม่ ด้วย pgbackrest ทำได้จาก command reinit หลังจากนั้นระบบจะให้เลือก node ที่ต้องการ reinit

$ patronictl -c patroni.yml reinit pg_cluster

กรณีที่ต้องการสร้าง replica node ใหม่ด้วย pgbackrest จะต้องเพิ่ม bootstrap method เข้าไปใน patroni.yml เพื่อให้ patroni เลือกใช้ pgbackrest แทน pgbasebackup ซึ่งจะทำให้การสร้าง replica ใช้เวลาน้อยกว่า

bootstrap:
  method: pgbackrest
  pgbackrest:
      command: /var/lib/pgsql/custom_bootstrap.sh
      keep_existing_recovery_conf: True
      no_params: False
      recovery_conf:
          recovery_target_timeline: latest
          recovery_target_action: promote
          restore_command: pgbackrest --stanza=pg_cluster --pg1-path=/var/lib/pgbackrest archive-get %f %p

#Create /var/lib/pgsql/custom_bootstrap.sh:

#!/bin/sh
mkdir -p /var/lib/pgbackrest
pgbackrest --stanza=pg_cluster --pg1-path=/var/lib/pgbackrest --log-level-console=info --delta restore 
Note: You only have to add --pg1-path if your new PGDATA does not match the path from the old path

#Add execution permission to the script:
chmod +x /var/lib/pgsql/custom_bootstrap.sh

HAProxy

Install haproxy สำหรับเป็น loadbalancer ของ postgresql โดยที่สามารถแยก request ให้การเขียนทำงานอยู่บน primary node ส่วนการอ่านทำงานที่ replica node ทั้งนี้เพื่อให้ทุก node สามารถช่วยกันทำงานได้ การเลือกว่าเป็น primary หรือ replica haproxy จะ call Patroni REST API ของทุก node เพื่ออ่าน status

อีกทั้งการมี HAProxy จะทำให้ง่ายในการทำ migrate (switch ระหว่าง postgres version เดิม กับ version ใหม่) หรือ mantenance database ด้วยการให้ proxy เป็นตัว forward request ไปยัง Postgresql โดยการแยก port ตามตัวอย่างนี้ User สามารถเรียกเข้ามา ที่ port 5000 สำหรับการเขียน และ 5001 สำหรับการอ่าน การติดตั้งสามารถติดตั้ง haproxy ได้มากกว่าหนึ่ง host ทั้งแบบ active/standby หรือ active/active

ทำการติดตั้งที่เครื่อง haproxy

#install haproxy package
$ yum install -y epel-release
$ yum install -y haproxy

#config haproxy
[ec2-user@ip-172-31-21-208 ~]$ sudo vim /etc/haproxy/haproxy.cfg
global
    maxconn 100
defaults
    log global
    mode tcp
    retries 2
    timeout client 30m
    timeout connect 4s
    timeout server 30m
    timeout check 5s
listen stats
    mode http
    bind *:7000
    stats enable
    stats uri /
listen postgres
    bind *:5000
    option httpchk OPTIONS/master
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server pg1-1 172.31.47.229:5432 maxconn 100 maxqueue 100 check port 8008
    server pg2-1 172.31.34.169:5432 maxconn 100 maxqueue 100 check port 8008
    server pg3-1 172.31.39.118:5432 maxconn 100 maxqueue 100 check port 8008
listen postgres_read
    bind *:5001
    balance leastconn
    option httpchk OPTIONS/replica  #/read-only
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server pg1-1 172.31.47.229:5432 maxconn 100 maxqueue 100 check port 8008
    server pg2-1 172.31.34.169:5432 maxconn 100 maxqueue 100 check port 8008
    server pg3-1 172.31.39.118:5432 maxconn 100 maxqueue 100 check port 8008

start systemd haproxy service และตรวจสอบ status

systemctl start haproxy

เรียกไปยัง url ของเครื่อง haproxy, http://haproxyip:7000 เพื่อดู status ของ enpoint สำหรับเขียนและอ่านข้อมูล postgresql

ข้อมูลอ้างอิง

Installing VMware Postgres

VMware Postgres High Availability with Patroni

How to configure a Patroni Cluster to use pgbackrest

How to use pgbackrest to bootstrap or add replica to HA Patroni

Create a High-Availability PostgreSQL Cluster with Patroni, Vultr Load Balancer, and Object Storage

Business Outcomes Lean Canvas

Leave a reply

Lean Canvas เป็นเครื่องมือที่จะทำให้ Team เข้าใจปัญหาและเลือกแนวทางที่เหมาะสมสำหรับการเปลี่ยนแปลง ด้วยการวิเคราะห์อย่างปัญหารอบด้าน ตัวอย่างนี้จะเป็นการนำ Lean Canvas มาใช้สำหรับการ Change ในกระบวนการพัฒนา Software โดยพิจารณาถึงสิ่งที่องค์กรจะได้รับจากการเปลี่ยนแปลง (Business Outcomes)

เริ่มต้นด้วยการกำหนดหัวข้อในรูปแบบตาราง เพื่อให้ผู้ที่เกี่ยวข้องได้เสนอความคิดในรูปแบบ sticky board และหารือกันในแต่ละประเด็นได้ง่าย เพื่อให้ตกผลึก จนเป็นข้อมูลสุดท้ายที่ทุกคนเห็นพ้องกัน

Problem

Risk

Current State

Operating Model

People

Process

Technology

Metrics

Future State

Operating Model

People

Process

Technology

Business Outcomes

Enablers/Partners

Costs (Operation efficiency)

Revenue (Innovation/Productivity)

รายละเอียดในแต่ละแผ่น

PROBLEM

ทำไมเราต้องทำสิ่งนี้ สิ่งที่เป็นเบื้องหลัง หรือแรงผลักทำให้ต้องทำ
สิ่งที่เป็นปัจจัยทั้งภายในและภายนอกองค์กร ที่ส่งผลกระทบกับ Business ทำให้ต้องทำสิ่งนี้ หรือต้องพัฒนาให้ดีขึ้น
คู่แข่งหลักๆ ที่มองว่าจะกระทบกับผลดำเนินการ และเหตุผลว่าทำไมต้องเปลี่ยน
อะไรที่เป็นความยากหรือความท้าทายที่จะทำให้ผลการดำเนินการไม่ตรงตามเป้า
อะไรที่เป็นอุปสรรคต่อการเริ่มต้นส่ิงใหม่ๆ (initiatives) วิสัยทัศน์ (vision) ความคืบหน้า (progress) และพัฒนาการ (evolution)

Risk

ทีมงาน และองค์กรจะมีผลกระทบอะไรบ้าง ถ้าเราไม่ทำสิ่งนี้
อะไรที่เป็นความเสี่ยงในกลุ่มองค์กรและหน่วยงาน IT ถ้าต้องทำสิ่งนี้
กลยุทธ์ในการรักษาความปลอดภัยในอันดับต้นๆ คืออะไร
ในการเปลี่ยนแปลงครั้งนี้ต้องมีการปรับเปลี่ยนข้อกำหนด หรือมาตรฐานขององค์กรหรือไม่
อะไรที่เป็นความเสี่ยง และผลกระทบกับองค์กร จากความเสี่ยงเรื่องการละเมิดความปลอดภัย (Security Breach)

Current State

Operating Model อธิบายสภาพที่เป็นอยู่ในปัจจุบัน

People หน่วยงานและองค์กร (ผังองค์กร ตำแหน่ง และความรับผิดชอบ)

Process ความซับซ้อนของกระบวนการ และสายงาน ระยะเวลาในการส่งมอบบริการ ความจำเป็นต้องเปลี่ยนให้ดีขึ้น รวมถึงการเสียโอกาสทางธุรกิจ (concept to cash)

Technology อธิบายโครงสร้างพื้นฐานของ IT และ Tools ที่ใช้ในปัจจุบัน ธุรกิจสามารถใช้ความสามารถของระบบได้อย่างเต็มประสิทธิภาพหรือไม อะไรที่เป็นปัญหาหรืออุปสรรคในการให้บริการของระบบ IT

Metrics

เป้าหมายและวิธีการวัดผล
การประเมินผลสำหรับโดยรวมในทางธุรกิจ
หน่วยวัดที่สำคัญสำหรับประเมินทีม IT
หน่วยวัดเพื่อประเมินผลลัพธ์ร่วมกันระหว่างหน่วยงาน IT และธุรกิจ
หน่วยวัดเพื่อประเมินประสิทธิภาพ สำหรับเป็นส่วนหนึ่งในการประเมินตาม KPI ส่วนบุคคล

Future State

Operating Model วิสัยทัศน์ และแนวทางดำเนินงาน

People เป้าหมายที่ต้องการบรรลุ ความสนใจและประโยชน์ที่ได้รับจากโมเดลในการจัดการในแบบ Agile/Lean ทีม Development พัฒนา Software ด้วยตัวเองหรือต้องพึ่งพา Outsorce

Process อธิบายผลลัพธ์ที่ต้องการเกี่ยวกับระบบ Automation การเพิ่มผลลัพธ์ (Productivity) ประสิทธิภาพ (efficiency) การลดความเสี่ยง (risk) และการกำกับดูแลที่ง่าย (simplified governance)

Technology อธิบายกลยุทธ์สำหรับการนำบริการ cloud ไปใช้ การย้าย workload ไปยัง cloud ส่งผลต่อความสามารถในการสร้างนวัตกรรมและการเติบโตทางธุรกิจอย่างไร

Business Outcomes

อะไรคือผลลัพธ์ทางธุรกิจที่ต้องการบรรลุ (business outcomes)
ความคิดริเริ่มที่จะทำให้สามารถบรรลุผลลัพธ์ทางธุรกิจได้
ความสามารถในการเติมโตและแซงหน้าคู่แข่ง

Enablers/Partners

สิ่งที่เราต้องการจากพันธมิตรทางธุรกิจ (ที่ปรึกษา หรือ ซัพพลายเออร์)
ที่ผ่านมา อะไรคือความสำเร็จ และล้มเหลวจากพันธมิตรทางธุรกิจ
คำแนะนำในประเด็นสำคัญที่กำลังมามองหา

Costs (Operational efficiency)

การประหยัดต้นทุนจากประสิทธิภาพที่เพิ่มขึ้น
ขนาดของทีม และค่าใช้จ่ายเฉลี่ยต่อพนักงาน
ค่าใช้จ่ายรวมในการดำเนินงานตลอดทั้งปี
การลดต้นทุนจาก license และการใช้ resource ให้มีประสิทธิภาพมากขึ้น รวมถึงการยกเลิกบาง resource ที่เกินความจำเป็น

Revenue (Innovation/Productivity)

ช่องทางที่ทำรายได้ให้กับธุรกิจมากที่สุดในปัจจุบัน
สรรหาช่องทางสร้างรายได้อื่นๆ
แนวความคิดใหม่ๆ เพื่อเพิ่มโอกาสในการสร้างรายได้มากยิ่งขึ้น

…end…

Run Docker without Docker Desktop on macOS

Leave a reply

จากที่ Docker Inc ได้ประกาศเรื่อง Docker Desktop ที่จะไม่ free สำหรับหน่วยงานที่มีรายได้มากกว่า 10 ล้านดอลลาร์ แต่ก็คงใช้ได้ฟรีสำหรับหน่วยงานที่มีการใช้งานน้อยกว่า 250 คน และมีรายได้น้อยกว่า 10 ล้านดอลลาร์ ใช้งานส่วนตัว การศึกษา หรืองานที่ไม่แสวงหาผลประโยชน์ เช่น open-source project เป็นต้น โดยจะเริ่มบังคับตั้งแต่ 31 มกราคม 2565 เป็นต้นไป ซึ้งบริษัทใหญ่ๆ ที่เข้าเกณท์ก็ต้องจ่ายต่อการใช้งานที่ 5$ ต่อคน ต่อเดือน และก็มี plan อื่นๆ เช่นจ่ายต่อ Team หรือ ต่อหน่วยงาน บริษัทที่มีรายได้ขนาดนี้ก็ไม่น่ามีปัญหาอะไรในการจ่าย เพราะ Docker Inc เองก็ลงทุนเรื่อง product development ไปพอสมควร รายได้ตรงนี้จะทำให้ product พัฒนาได้อีกมาก

พอเราพูดถึง docker ก็จะมีหลายคำที่เราต้องทำความเข้าใจ ก่อนจะเข้าเนื้อหาการติดตั้ง docker บน macOS

Container เป็นรูปแบบของ software ที่สามารถทำงานได้บนเครื่องที่มีสภาพแวดล้อมแตกต่างกันได้ จากการที่เราได้ pack application และสิ่งที่ application ต้องการใช้ให้อยู่ด้วยกัน (package) เช่น application code, libraries, configuration, runtime/server, operating system เป็นต้น

Docker Inc เป็นบริษัทที่พัฒนา docker software ที่ช่วยให้เรา develop, test และ run application ในแบบ container ได้ง่าย ทั้งที่เป็น open source และไม่ใช่ open source

Docker Engine เป็น open-source software ทำงานบน Linux OS ช่วยทำให้ container ทำงานบน Linux kernel ได้ หน้าที่ของ software ตัวนี้คือจัดการ container life cycle เช่น start, stop, delete และจัดการ resource ของเครื่อง (compute, memory, storage) สำหรับให้ container ทำงานได้แบบแยกจากกัน ป้องกันไม่ให้ process อื่น สามารถเข้าถึง หรือ share resource กันได้ เพื่อจุดประสงค์ด้านความปลอดภัย ทั้งนี้ Docker Engine ทำงานได้เฉพาะใน Linux OS

Docker CLI เป็น command line สำหรับเพื่อติดต่อกับ Docker Engine เพื่อใช้งาน container ประกอบด้วย docker และ docker-compose ซึ้งเป็นเป็น open-source software

Docker Desktop เนื่องจาก Docker Engine ทำงานอยู่บน Linux เท่านั้น ถ้าเครื่องเราเป็น Windows หรือ macOS ก็จะไม่สามารถใช้งาน container ได้ ทั้งนี้เราต้องสร้าง Virtual Machine (VM) ที่เป็น Linux OS สำหรับ docker engine ในเครื่องของเราก่อน ถึงจะสามารถใช้ container ผ่าน VM ที่สร้างขึ้นได้ ตรงนี้จึงเป็นต้นกำเนิดของ Docker Desktop ที่ทำให้เราสามารถใช้ container บน Windows/macOS ได้ โดยจะช่วยเรื่องการจัดการ VM และอื่นๆ ที่เกี่ยวข้องกับการใช้งาน เช่น Networking, Virtualization, สั่งงาน Linux OS เป็นต้น ทั้งนี้ Docker Desktop ถูกออกแบบมาให้ใช้สำหรับเครื่อง Desktop เพื่อ Develop software ไม่ใช่สำหรับ run container บน Production หรือใช้งานจริง

โดยสรุปแล้ว Docker Desktop ไม่ใช่ core technology สำหรับ run container จุดประสงค์คือเป็น Software ที่อำนวยความสะดวกให้เรา Develop software ในแบบ container บน OS ที่เป็น Windows หรือ macOS ได้ง่ายขึ้น ไม่ต้องจัดการอะไรให้ยุ่งยาก ดังนั้น Docker Inc เองก็เลยพยายามที่จะทำรายได้จาก product ตัวนี้กับบริษัทใหญ่ๆ เพราะตัวเองก็ Develop อย่างยากลำบาก รวมถึงค่าใช้จ่ายที่ได้ลงทุนไปนั่นเอง

จากที่อ่านถึงต้องนี้คงตอบคำถามได้แล้วว่า ถ้าเราไม่ใช้ Docker Desktop ได้มั้ย และต้องทำยังไง (อย่าลืมว่า docker, docker-compose, docker engine เป็น open-source) เราต้องมีเครื่องที่เป็น Linux สำหรับ docker engine โดยต้องมี software เพื่อทำ virtualization สำหรับ macOS สามารถใช้ hyperkit หรือ virtualbox ก็ได้ ตัวอย่างนี้ผมใช้ virtualbox โดยทำได้ตามขั้นตอนต่อไปนี้

ทำการ uninstall docker desktop ออกก่อน ด้วยการลบ /Applications/Docker.app สำหรับ macOS แต่จะมี file ค้างอยู่ แนะนำให้ทำตามขึ้นตอนนี้ เพื่อ clean ข้อมูลที่ค้างอยู่ออก หรือ search google ด้วย “uninstall docker desktop on macos”
Install docker, docker-machine และ docker-compose (ถ้าต้องการใช้)

❯ brew install docker docker-compose docker-machine

3. Install VirtualBox

❯ brew install --cask virtualbox
❯ brew install --cask virtualbox-extension-pack

4. ถ้าเราสั่ง create docker-machine เลยหลังจากนี้จะติดปัญหาเรื่องข้อจำกัดของ IP สำหรับ Host-Only network ใน VirtualBox version ใหม่ โดยจะได้ error “VBoxManage: error: Code E_ACCESSDENIED (0x80070005) – Access denied (extended info not available)” อ่านเพิ่มเติมได้จาก redit สามารถแก้ปัญหาด้วยขั้นตอนดังนี้ อ้างอิง

สร้าง folder vbox ใน directory /etc
สร้าง file networks.conf ภายใน folder vbox และเขียนข้อมูลในไฟล์ตามด้านล่าง

* 0.0.0.0/0 ::/0

5. create docker-machine

❯ docker-machine create -d virtualbox default
Running pre-create checks...
Creating machine...
(default) Copying /Users/pongsakk/.docker/machine/cache/boot2docker.iso to /Users/pongsakk/.docker/machine/machines/default/boot2docker.iso...
(default) Creating VirtualBox VM...
(default) Creating SSH key...
(default) Starting the VM...
(default) Check network to re-create if needed...
(default) Found a new host-only adapter: "vboxnet0"
(default) Waiting for an IP...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with boot2docker...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
Checking connection to Docker...
Docker is up and running!
To see how to connect your Docker Client to the Docker Engine running on this virtual machine, run: docker-machine env default

ตรวจสอบหลังจากสร้างเสร็จ

❯ docker-machine ls
NAME      ACTIVE   DRIVER       STATE     URL                         SWARM   DOCKER      ERRORS
default   *        virtualbox   Running   tcp://192.168.99.100:2376           v19.03.12

ลองเล่น command อื่นๆ > docker-machine –help

6. setup docker environment เพื่อให้ docker CLI เชื่อมต่อไปยัง docker engine ที่สร้างใหม่

❯ eval $(docker-machine env default)

7. ทดสอบใช้ command docker cli

❯ docker version
Client: Docker Engine - Community
 Version:           20.10.12
 API version:       1.40
 Go version:        go1.17.5
 Git commit:        e91ed5707e
 Built:             Sun Dec 12 06:28:24 2021
 OS/Arch:           darwin/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          19.03.12
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.10
  Git commit:       48a66213fe
  Built:            Mon Jun 22 15:49:35 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          v1.2.13
  GitCommit:        7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

เท่านี้เราก็สามารถใช้ docker CLI ได้ปกติโดยไม่ต้องมี Docker Desktop

Credit: ข้อมูลบางส่วนได้สรุปเนื้อหาจาก https://dhwaneetbhatt.com/blog/run-docker-without-docker-desktop-on-macos

KIND – Kubernetes in Docker

Leave a reply

kind เป็น tool ตัวหนึ่งที่น่าสนใจสำหรับ run Kubernetes อยู่ที่เครื่องเราเองโดยใช้ Docker container เป็น nodes ของ Kubernetes ข้อดีคือจัดการง่ายและทำงานได้เร็ว ถ้าต้องใช้ script เพื่อสร้าง cluster เพื่อทำ integration test แล้ว tool ตัวนี้ตอบโจทย์ได้ดีทีเดียว ติดตั้งเพื่อใช้งานทำได้ตาม quick start

ตัวอย่าง command ต่าง ๆ สำหรับใช้งาน kind

สร้าง cluster ที่มี Master และ Worker จำนวน 1 node

kind create cluster --name test1

สร้าง cluster ที่มี Master และ Work อย่างละ Node จาก configuration file

kind create cluster --name=test2 --config kind-2node.yaml

# ข้อมูลใน File kind-2node.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker

สร้าง cluster ที่มี Master จำนวน 3 nodes และ Worker จำนวน 2 nodes

kind create cluster --name=test3 --config kind-5node.yaml

# ข้อมูลของ file kind-5node.yaml
kind: Cluster 
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: control-plane
- role: control-plane
- role: worker
- role: worker

ในกรณีที่ใช้ docker-machine เพื่อสร้าง docker engine จำเป็นต้องระบุ ip ของ docker engine ใน file config ด้วย กรณีต้องการ config ค่าอื่นๆ ดูได้จาก kind document

# หา ip ของ docker engine ด้วย
❯ docker-machine ip
192.168.99.102

# จากนั้นสร้าง config file ดังนี้
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  apiServerAddress: "192.168.99.102"
  apiServerPort: 6443
nodes:
- role: control-plane
- role: worker

การ access เข้าไปที่ node

docker ps

# test1-control-plane เป็นชื่อ Master node จากผลลัพธ์ docker ps
docker exec -it test1-control-plane bash

การ access container ที่ run ที่เครื่อง test1-control-plan ซึ้ง container runtime ใน node จะเป็น containerd นะครับ

root@test-control-plane:/# crictl ps
CONTAINER           IMAGE               CREATED             STATE               NAME                      ATTEMPT             POD ID
fd7b628af2359       296a6d5035e2d       20 minutes ago      Running             coredns                   0                   f068e76bb133e
67db85b7b5293       296a6d5035e2d       20 minutes ago      Running             coredns                   0                   1d773ed216c4e
1e94aff11ee8e       e422121c9c5f9       20 minutes ago      Running             local-path-provisioner    0                   6c9467c5cdb84
9599b04fdad77       6de166512aa22       20 minutes ago      Running             kindnet-cni               0                   1a9a398e62418
f55a566a3f556       0e124fb3c695b       20 minutes ago      Running             kube-proxy                0                   c075e7b86e928
6927ff1f38141       0369cf4303ffd       20 minutes ago      Running             etcd                      0                   92b0750af4e4c
6eaab86c31d42       94ffe308aeff9       20 minutes ago      Running             kube-apiserver            0                   34bbfc2183651
5a861837a686e       96a295389d472       20 minutes ago      Running             kube-controller-manager   0                   571c1b03041e7
7f5d355589866       1248d2d503d37       20 minutes ago      Running             kube-scheduler            0                   9d92fc037f002

สร้าง Kubernetes pod

kubectl run nginx --image=nginx

สร้าง Kubernetes service

❯ kubectl expose pod nginx --port=80 --target-port=80 --name=nginx-http --type=NodePort


# ทดสอบเรียก service
root@test-control-plane:~# kubectl get nodes -o wide
NAME                 STATUS   ROLES                  AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE       KERNEL-VERSION         CONTAINER-RUNTIME
test-control-plane   Ready    control-plane,master   18m   v1.21.1   172.18.0.3    <none>        Ubuntu 21.04   4.19.130-boot2docker   containerd://1.5.2
test-worker          Ready    <none>                 18m   v1.21.1   172.18.0.2    <none>        Ubuntu 21.04   4.19.130-boot2docker   containerd://1.5.2

root@test-control-plane:~# curl http://172.18.0.2:30108
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
html { color-scheme: light dark; }
body { width: 35em; margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif; }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
root@test-control-plane:~# exit

สร้าง kube config ที่ master

cp /etc/kubernetes/admin.conf ~/.kube/config

แสดง log ของ cluster จาก kind cli

kind export logs --name=test2

ลบ cluster

kind delete cluster --name test1

Restart docker, ในกรณีที่ restart เครื่อง ต้อง start docker image ของ Master และ Worker nodes

docker start <name>-control-plane
docker start <name>-worker
export KUBECONFIG="$(kind get kubeconfig-path --name="<name>")"

Cloud Native, WHY WHAT HOW

Leave a reply

เพื่อให้เข้าใจเรื่อง cloud native เราจะต้องทำความเข้าใจที่มาที่ไปว่าทำไมถึงมี cloud native แล้ว cloud native คืออะไร รวมถึงเราจะสร้าง cloud native ต้องทำอย่างไรบ้าง

WHY?

เริ่มต้นจากที่เรามี pain เกี่ยวกับ infrastructure ที่ต้องการ scale เพื่อให้รองรับ workload ที่มากขึ้นจาก model ธุรกิจที่ต้องแข่งขัน และการ transform business ให้เป็น digital business ทำให้ infrastructure ไม่สามารถรองรับการเติมโตได้ทัน จึงเป็นโอกาสให้ service provider ต่างๆ ออกมาให้บริการ infrastruture ในแบบ on demand ที่เราคุ้นเคยก็จะเป็นบริการ cloud service ของเจ้าตลาดอย่าง aws , google หรือ azure รวมถึง local provider อย่าง AIS, Cloud HM เป็นต้น ซึ้ง model service พวกนี้ก็คือการให้บริการ cloud computing โดยมีคุณลักษณะเบื่องต้นคือ

Resource pooling ให้บริการผ่าน network ในแบบ on-demand และ self-service
Resource ถูกสร้างขึ้น ยกเลิกตามความต้องการ (ephemeral) รวมถึงสามารถปรับแต่งและ scale ได้ (elasticity)
Resource ไม่เกี่ยวพันโดยตรงกับ physical hardware ระบบสามารถนำ server ที่เหมาะสมมาให้บริการได้
เข้าถึงได้ง่าย
มีระบบบันทึกและแจ้งปริมาณการใช้

จากความสามารถที่ได้กล่าวข้างต้นจะเห็นว่า infrastructure มีรูปแบบที่ง่ายในการใช้งาน และทำให้ application เองสามารถใช้ความสามารถใหม่ๆ ของบริการในแบบ cloud ได้มากยิ่งขึ้น ทำให้เกิด demand ในฝั่ง Application โดยเกิดคำนิยามใหม่คือ Cloud Native Application คือ application ที่ถูกสร้างขึ้นจากการใช้ความสามารถของ cloud computing ทำให้ได้ความสามารถใหม่ๆ เช่น

Build for change ทำให้ Apps สามารถ change ได้ง่าย และไม่สงผลกระทบกับการใช้งาน
Build for scale ทำให้ Apps สามารถรองรับการใช้งานที่มากขึ้นจากการเพิ่มจำนวน process
Fault tolerance ทำให้ Apps ยังคงให้บริการได้แม้จะเกิดปัญหาของระบบ โดยที่ cloud platform จะช่วยให้ Apps สามารถทำงานได้บน Resource ที่ Available อยู่ได้
Continuous Delivery ทำให้ Apps สามารถ deploy ได้อย่างต่อเนื่อง
Agile ทำให้ Development Team สามารถแบ่งส่วนของ Apps และส่งมอบในแต่ feature ย่อยๆ ได้อย่างต่อเนื่อง รวมถึงปรับเปลี่ยนตามความต้องการใหม่ๆ ได้เร็ว

WHAT?

ความคาดหวังจากผู้ใช้ Application และการใช้งานที่มากขึ้นทำให้ Application เองก็ต้องพัฒนาให้ตอบสนองคนใช้งานที่เรียกว่า Reactive Design โดยมีปัจจัยต่างๆ ที่ควรพิจารณาคือ

React to Events ออกแบบให้ Application ทำงานโดยตอบสนองกับเหตุการณ์ต่างๆ ได้เองโดยไม่ต้องรอให้มีการเรียกใช้ ในรูปแบบ Event Driven Architecture
React to Load ออกแบบให้ Application สามารถ scale ด้วยการเพิ่มจำนวน process ได้ (scaleable)
React to Failure ส่วนของ platform ที่ทำให้ Application สามารถทำงานได้ในขณะที่อาจจะมีบาง process หยุดทำงานไป (Resilient)
React to Users ออกแบบให้การตอบสนองการใช้งาน Application ได้ราบรื่น ไม่กระตุก หรือรอนาน (Responsive) ถึงแม้จะมีจำนวนผู้ใช้งานที่มากขึ้นก็ตาม

ยังมีปัจจัยอื่นๆ นอกเหนือจากเรื่องของการตอบสนองต่อการใช้งาน (Reactive) ที่ต้องพิจารณาอย่างเช่น Speed & Safety โดยคำนึงถึงความรวดเร็วในการ Recovery ในกรณีที่ระบบมีปัญหา

HOW?

ในการจะสร้าง Cloud Native Application จำเป็นต้องอาศัย Design Patterns และ Framework เพื่อให้สามารถเริ่มต้นการสร้างระบบได้เร็วและการจัดการที่มีประสิทธิภาพ โดยประกอบได้ key หลักๆ คือ

Application Framework ส่วนนี้จะต้องพิจารณาใช้ Framework เพื่อช่วยพัฒนา application เช่นการใช้หลักการ API First มีการใช้ API Gateway และ Integration Pattern เข้ามาช่วยแก้ปัญหาความเข้ากันได้ของระบบ และ Transaction ระหว่างระบบ การใช้ Reactive Design การ Develop Application ตามรูปแบบ 12Factor และ Microservice เป็นต้น
Runtime Platform ใช้ Platform จัดการ runtime application เช่น Kubernetes Platform หรือ Serverless
Infrastructure Automation มีระบบช่วยในเรื่องการทำ self-service และ automate provision resource การทำ resource planing, monitor และระบบช่วยบริหารจัดการ infrastructure สำหรับ day 2 operation
Infrastructure ที่พร้อมต่อการใช้งาน

ในการออกแบบ Application ในแบบที่เป็น Distributed System ยังต้องมีเรื่องที่ต้องพิจารณาเพิ่มเติมเช่น

Configuration Management การจัดการ Application Configuration แบบรวมศูนย์ เพื่อความง่ายในการเปลี่ยนแปลง Configuration ของ Application
Registration and Discovery การ Register Application บนระบบ และการค้นหาบริการของ Application
Routing and Load-Balancing การเรียกใช้และการเข้าถึง Application
Fault Tolerance and Isolation การที่ application จะต้องทำงานได้ในภาวะที่ระบบมีปัญหา และการใช้งาน Resource ที่แยกจากกันอย่างจัดเจน
Aggregation and Transaction การรวมผลลัพธ์ที่ได้จากการทำงานและการควบคุมกระบวนการทำงานที่แยกกันหลาย Process
Monitoring and Distributed Tracing การ monitor และเข้าถึง log ของ Application ที่กระจายออกไปหลายที่
Process Management การเข้าถึงและควบคุม process การทำงานของ application

Cloud Native Application Portfolio Assessment Guide

Leave a reply

กระบวนการในการพิจารณา เพื่อประเมินความเหมาะสมสำหรับการย้าย Application ไปยัง Cloud Native Platform นั้นมี criteria หลายอย่างด้วยกัน ซึ้งบทความนี้จะให้คำแนะนำเพื่อเป็นแนวทางให้สามารถตรวจสอบ Application จัดลำดับความสำคัญ สิ่งที่ต้องปรับเปลี่ยน (refactore) เพื่อทำเป็น backlog ในขั้นตอนของ Implement ต่อไป

สำหรับการประเมินนั้น หลักๆ จะใช้หลักการของ 12 Factors ที่เป็นหลักการในการออกแบบ Cloud Native Application ซึ้งสามารถอ่านข้อมูลเพิ่มเติมได้จาก http://12factor.net

โดยแต่ละรายการที่จะกล่าวต่อไปในหัวข้อ 12 Factors ที่ส่งผลกระทบกับกระบวนการออกแบบ Application ที่ถ้า Application สามารถ comply ได้ ก็จะจัดได้ว่าเหมาะสมในการย้าย platform (replatform) แต่ถ้ากรณีที่ Factor ที่สำคัญๆ Application ไม่สามารถ comply ได้ก็จะพิจารณาว่าไม่เหมาะสมในการย้าย Application ไปยัง Cloud Native Platform

1. Codebase

ในหนึ่ง Codebase หรือ project ต้องมีระบบ Source Code Management (SCM) เพื่อติดตามและบันทึกประวัติการเปลี่ยนแปลงของ source code รวมถึงการทำ versioning ของ source code ทั้งนี้สามารถ deploy application จาก codebase ไปยัง environment ที่แตกต่างกันได้

โดยทั่วไป Platform จะช่วยให้ทีม Agile Development สามารถทำงานได้ง่าย แต่ความสามารถของ Platform ไม่ได้รวมถึงการ migrate หรือ deploy application ผู้ใช้ยังคงใช้วิธี manual proces หรือ automate ผ่านกระบวนการ CI/CD pipeline สำหรับ application ที่มีกระบวนการ build และ packaging (ควบคุมด้วย CI) ผ่านทางระบบ build system สามารถพิจารณาได้ว่ามีความเหมาะสมในการย้ายไปยัง Cloud แต่ในกรณีที่ไม่มีระบบ build system ก็จะมีความยากและปัญหาจากปัจจัยอื่นๆ ตามมา บางกรณี application อาจจะมี script เพื่อดึง dependencies และจัดการการเข้ากันได้กับระบบข้างเคียง ก็อาจจะพิจารณาได้ว่ามีความเหมาะสมระดับปานกลาง

Application ที่ไม่เข้าข่ายข้างต้นจะถูกพิจารณาว่าไม่มีความเหมาะสมในการย้ายไปยัง Cloud Native Platform อย่างไรก็ตามควรคำนึงถึงความยากในการปรับเปลี่ยน และประโยชน์ที่ได้เมื่อ Applicaiton ทำงานบน Cloud Native Platform ประกอบการพิจารณาด้วย

ตัวอย่างปัญหาที่พบ

Application Code ไม่ได้ถูกจัดการด้วย Source Control
Application Code ถูกจัดการด้วย Source Control แต่ Code กระจายไปยังหลายๆ Repositories หรือ SCM
ไม่มีระบบ CI/CD pipeline ตัวอย่างเช่นไม่มี Build service ที่ถูกจัดการด้วย automate tool เช่น TeamCity, Bamboo, Jenkins หรือ build automation tool อื่นๆ

2. Dependencies

ระบุการใช้ dependencies ใน code ให้ชัดเจนและไม่มีการใช้ code ร่วมกันจากหลายๆ project (isolate)

เป็น Factor ที่จำเป็น ถ้า Application ไม่ได้ระบุ Dependencies ที่จัดเจน หรือมีการใช้ code ร่วมกัน จะทำให้การทำ automate ด้วย CI/CD เฉพาะ Application ทำไม่ได้ เพราะต้องรวม Application อื่นเข้ามาใน Process ด้วย ซึ้งจำเป็นต้องปรับเปลี่ยนค่อนค่างมาก

ตัวอย่าง Application ที่มีการระบุ dependencies ที่ดี ส่วนใหญ่จะจัดการผ่าน tool เช่น Gradle, Maven, Ant, NuGet สำหรับ .NET ก็จะไม่ค่อยมีปัญหาเพราะ tool จะสามารถระบุ dependencies ต่างๆ ที่จำเป็นสำหรับ Application ให้ได้

ตัวอย่างปัญหาที่พบ

Application ไม่สามารถแยก deploy ได้ จำเป็นต้อง deploy application อื่นร่วมด้วยถึงจะสามารถทำงานได้
ไม่ได้ใช้ build หรือ dependencies management system เช่น maven, ant, gradle
Application release เดียวกันทำงานได้เฉพาะบาง environment เนื่องจากไม่พบ dependencies ที่ต้องการใน environment ที่มีปัญหา
.NET Application พึ่ง dependencies จาก Global Assembly Cache (GAC) ส่งผลกับการให้ deploy application ไปยัง environment อื่นมีปัญหาจากการหา dependencies ที่ต้องการไม่พบ

3. Configuration

ให้ Environment จัดการ configuration ของ application

Application จะต้องใช้ configuration จาก environment ตัวอย่างเช่น java application จะเขียน config ใว้ใน properties file โดยเก็บไว้ใน disk ของ environment ซึ้งก็จะไม่เข้ากับคำแนะนำนี้ เช่นเดียวกับ .NET ที่เขียน application config ไว้ใน web.config file

แม้จะเป็นไปได้ว่า configuration สามารถ deploy ร่วมกับ application ซึ้งมองว่าเป็น dependencies เฉพาะของ application ก็จะทำให้ยากมากขึ้นในการจัดการ ทั้งนี้ Application ที่ไม่ได้ดึง configuration มาจาก environment อาจจะสามารถ replatform ได้ขึ้นอยู่กับ configuration ที่เก็บใน properties file หรือสถานที่เก็บอื่นนอก environment

สำหรับ application ที่ไม่ตรงตามข้อแนะนำข้างต้น จำเป็นต้องประมาณการความยากในการ migrate เพื่อมาใช้ environment configuration อาจจะพิจารณาร่วมกับการใช้ Service เพื่อติดต่อกับ backend service (เพิ่มเติมในหัวข้อ Backing service Factor) ทั้งนี้การให้น้ำหนักเรื่องการปรับเปลี่ยน application ให้ตรงตามข้อกำหนด เป็นส่วนหนึ่งที่นำมาใช้ในการลำดับ application ที่เหมาะสมกับการ replaform

ตัวอย่างปัญหาที่พบ

การใช้งาน *.properties file ใน code java
มีการใช้ properties class ใน code java
อ่านข้อมูลจำเพาะต่างๆ ของ environment จาก resource และ files
Configuration ทุกอย่างถูกเก็บใน web.config และไฟล์เกี่ยวข้องอื่นๆ (.NET) เช่นเก็บใน configuration manager และ class object

4. Backing Services

ออกแบบให้ services ที่ application ต้องการเรียกใช้ (backing service) เสมือน resource หนึ่งของ application ที่สามารถใช้งานด้วยการนำเข้าและออกได้ง่าย

ตัวอย่างเช่น URL หรือ Locator/Credentials ที่ application ใช้ในการเรียก backing service ถูกเก็บไว้ใน configuration เพื่อให้ถูกต้องตามข้อกำหนดนี้ Application จะต้องทำตามข้อกำหนด Configuration Factor

เพื่อให้เป็นไปตามข้อกำหนด Application ต้องสามารถปรับเปลี่ยนเพื่อใช้ Service (นำเข้า/นำออก) เช่น data store, messing/queueing system, email, caching โดย service พวกนี้จะถูกจัดการผ่านระบบภายนอก (Third party) โดยที่ไม่ส่งผลกระทบกับ application code ทั้งนี้รวมถึง resource ที่สามารถปรับเปลี่ยนได้ในขั้นตอน deployment

ทั้งนี้ Application จะคงสามารถทำงานได้ปกติถึงแม้จะไม่ตรงตามข้อกำหนดโดยขึ้นอยู่กับ Application และ Service ที่เรียกใช้ เช่นกำหนด connection ใน configuration ที่จัดเก็บไว้ใน environment ก็ยังคงถือว่าใช้ได้เหมือนกัน

แนะนำให้แก้ไข code ของ application ที่ไม่ตรงตามข้อกำหนดนี้โดยให้มีการใช้ environment variable เพื่อเรียกใช้ backing service โดยปรับ code ให้ง่ายในการปรับเปลี่ยนมากขึ้น (abstract) เช่นการใช้วิธีการ Dependencies Injection ช่วย ทั้งนี้รวมถึง credentials และ connection parameter สำหรับ legacy application หรือ service ที่ยังไม่ได้อยู่บน cloud environment

ตัวอย่างปัญหาที่พบ

มีการ hard code ค่า configuration (หรือไม่ใช้ environment configuration) เช่น database connection string
กำหนดข้อมูลตายตัวสำหรับค่า URL และ Port สำหรับเชื่อมต่อ backing service
Application เรียกใช้งานได้กับบาง backing service เช่น ทำงานได้เฉพาะ database driver เพราะไม่ได้ใช้ adapters/JDBC/abstractions ช่วย
เก็บ credentials สำหรับ backing service ใน configuration setting ที่อาจมองว่า application ขึ้นอยู่กับ environment มากเกินไป (tight environmental coupling)
Backing services ใน Docker-to-Docker โดยไม่ได้ใช้ Docker image linking จำเป็นต้องพึ่ง /etc/hosts file ที่จะส่งผลเสียกับ platform ตามมาภายหลัง

5. Build, Release, Run

แยกขั้นตอน build และ Run ออกจากกันอย่างชัดเจน

Code ถูกนำไปใช้ในการ deploy ผ่านขั้นตอนหลักๆ อยู่ 3 ขึ้นตอนคือ build , release และก็ run โดยที่ build จะเปลี่ยน code ให้กลายเป็น application ที่สามารถทำงานบน computer ได้ release จะรวมผลลัพธ์จากการ build รวมกับ deployment configuration ที่ application ต้องใช้ในการ deploy และ Run ก็จะ run application บน environment โดยที่ application จะสามารถเข้าถึง configuration ที่ต้องการจาก environment และทำงานเป็น process ในระบบ

ทั้งนี้กระบวนการ CI/CD pipeline จะรวมขั้นตอนข้างต้น เพื่อให้ทำงานสอดคล้องกันในแต่ละขึ้นตอนจากเริ่มจน application สามารถทำงานได้

สิ่งสำคัญที่ต้องพิจารณาคือ ทุกๆ releae จะต้องมีการระบุอย่างชัดเจน เช่นทำ version (uniquely identifiable) สามารถสร้างขึ้นใหม่ได้ (immutable artifact) ส่วนขั้นต้อนการ run จะต้องเล็ก (small) ง่าย (small simple thing) และมีส่วนประกอบที่น้อยที่สุด (few moving part)

ตัวอย่างปัญหาที่พบ

การ deploy จะสำเร็จได้จะต้องอาศัยการตั้งค่าเพื่อติดตั้งบน environment
Application ไม่เป็นไปตามกฏ configuration factor
ไม่สามารถแยกขั้นตอน build, release, run ออกจากกันได้

6. Processes

Application ทำงานโดยไม่เก็บ state (Stateless processes) และไม่ share state ให้ process อื่น ในกรณีที่ต้องการเก็บ data จะต้องใช้ backing service ช่วย

File system ที่ให้บริการสำหรับ application ที่ทำงานบน platform จะไม่คงอยู่เสมอไป เป็นไปได้ที่ application ไม่สามารถทำงานได้สำหรับข้อกำหนดนี้คือ file system ถูกลบทิ้งหลังจาก process ของ application จบการทำงาน และไม่สามารถทำงานใหม่ด้วย file system เดิมได้ ดังนั้นการเก็บข้อมูลใดๆ ของ application จะต้องทำผ่าน backingn service (เพิ่มเติมจาก Backing Service Factor)

Application ที่ทำงานได้อย่างมีประสิทธิภาพบน platform จะต้องเป็นในรูปแบบ stateless หมายถึง application จะต้องไม่เก็บ state ใดๆ ไว้ใน memory, local file system หรือ http session state และจะต้องหาแนวทางที่เหมาะสมถ้าในกรณีที่ developer ต้องการเก็บ state ของ application

ในการตรวจสอบ application ว่าเป็นไปตามข้อกำหนดนี้ สามารถทำได้ด้วยวิธีการ “Smoke test” โดย start application ขึ้นมา 2 instance และทำการ rundom request ไปยัง 2 instance เพื่อทดสอบว่า application สามารถทำงานได้ตามปกติ

กรณีที่ใช้ sticky sessions โดยให้ transaction ทำงานที่ process เดียว ก็ถือว่าไม่ตรงตามข้อกำหนดนี้ และไม่ควรเลือกแนวทางนี้ถ้าไม่จำเป็น ในกรณีที่ application ไม่ตรงตามข้อกำหนดนี้ แต่ว่ามี code base ที่สามารถเพิ่มเติมส่วนการจัดการ session ได้ ก็สามารถใช้ session management ในแบบ backing sevice โดยการใช้ Redis ช่วย

การพิจารณาตามข้อกำหนดนี้จะเป็นในมุมของ application process ว่าทำงานอย่างไร รูปแบบการหยุด มีการใช้ state และ process จัดการ state ในระหว่างการทำงานหรือไม่

ตัวอย่างปัญหาที่พบ

Application มีการจัดการ state (รวมถึง session) ใน memory ส่งผลให้ state หาย ทำให้ application ไม่สามารถทำงานใหม่ได้หลังจากถูก terminate
Application แชร์ data ให้กับ application instance อื่น
Application สร้าง data บน file system เพื่อที่จะใช้ต่อเมื่อ process หยุดการทำงาน
Application ทำงานได้จาการใช้ docker data volumes เพื่อที่จะให้ data สามารถเก็บไว้ใช้ได้นอก container process
Application ไม่สามารถทำงานได้เมื่อ start พร้อมกันหลาย instance หรือต้องอาศัย sticky session ช่วย

7. Port Binding

ให้บริการ service ด้วย port ที่ไม่ถูกกำหนดตอนเริ่ม process (runtime injection) ทั้งนี้ web server ต้องให้บริการ service กับผู้ใช้ โดยส่งต่อไปยัง application port ที่ถูกต้อง

บริการที่ไม่ใช่ HTTP Platform จะต้องมีการจัดการ TCP routing เพื่อที่จะสามารถ replatform ได้

ตัวอย่างปัญหาที่พบ

Application ให้บริการผ่าน protocal ที่ไม่ใช่ http วิธีลดปัญหานี้ควรระวังเรื่องการใช้ protocal jmx, multicast, udp, custom tcp protocal, rmi-iiop
มีการระบุ port เป็น requirement เฉพาะ เช่น application จะไม่สามารถ start ได้ถ้าไม่อนุญาติให้ทำงานที่ port 8081
Application ต้องการ process หรือต้องมี port พร้อมใช้งานก่อนถึงทำงานได้

8. Concurrency

การ scale application เพื่อรองรับการทำงานที่มากขึ้น ด้วยวิธีการเพิ่มจำนวน process

Platform สามารถที่จะ scale application ด้วยการเพิ่มจำนวน instance (scale out) ในขณะที่การเพิ่ม resource เช่น RAM และ CPU (vertical scaling) ก็เป็นอีกทางเลือกแต่จะมีข้อจำกัดเรื่อง hardware

ในการย้าย application ที่เป็นชนิด monolith ขึ้นไปยัง cloud native platform ด้วยวิธีการแก้ปัญหาต่างๆ เพื่อให้สามารถทำงานบน platform ใหม่ได้ เราจะพบว่า monolith application ส่วนใหญ่มีความต้องการ memory มากกว่าทั่วไป เนื่องจากอาจจะต้องจัดการ state ใน application เอง แต่ถ้าสามารถย้ายการจัดการ state ไปยัง backing service ก็จะช่วยเรื่องความต้องการ resource บน platform ที่น้อยลง

ในกรณีที่ application ถูกสร้างตามหลักการ concurrency คือไม่แชร์ resource (share-nothing) สามารถทำงานพร้อมกันหลายน instance ได้ (horizontally partitionable application) การย้ายไปยัง cloud native platform ก็จะสามารถทำได้ง่าย

ตัวอย่างปัญหาที่พบ

Application ไม่สารถทำงานพร้อมๆ กันหลาย instance ได้
การ scale application ให้ทำงานพร้อมกันต้องอาศัย sticky session ในการจัดการ roting ของ request
Application ต้องการ memory จำนวนมากเพื่อที่จะ scale
ใช้หรือพึ่งพาการการกระจาย transaction เพื่อทำงานร่วมกันจากหลาย process (distributed transaction) อาจจะเป็นปัญหาในการ scale (horizontal) บน platform
Application ไม่ตรงตามกฏ configuration factor และ processes factors
Application ถูก deploy ด้วย package ที่เป็น EAR/WAR อาจจะเป็นปัญหากับการ packaging ในรูปแบบ container และการ scale ในแบบเพิ่มจำนวน process

9. Disposability

การเพิ่มความเสถียรด้วยออกแบบให้ appliation เริ่มทำงาน (startup time) และหยุดทำงานอย่างสมบูรณ์ได้เร็ว (graceful shutdown) ทั้งนี้การ start หรือ stop สามารถเกิดขึ้นได้ตลอดเวลาโดยไม่ได้กำหนดล่วงหน้า

ข้อแนะนำนี้ application จะต้องอาศัย ความเร็ว (speed) และความสมบูรณ์ (grace) ทั้งนี้ application จะต้อง shutdown ได้อย่างสมบูรณ์เมื่อได้รับ SIGTERM signal จาก process manager

สำหรับความเร็วนั้น Platform เองสามารถกำหนดได้อย่างเช่น application จะต้องให้บริการทาง port ที่เปิดไว้ภายใน 5 นาที ไม่เช่นนั้น platform จะถือว่า application มีปัญหา หรืออาจจะหยุดการทำงานอย่างสมบูรณ์ไม่ได้ภายในเวลาที่กำหนด ในกรณีนี้ application จะไม่สามารถทำงานบน platform ได้

ตัวอย่างปัญหาที่พบ

Application ใช้เวลามากว่า 5 นาทีเพื่อพร้อมในการทำงาน
Application หยุดการทำงานหลังจากได้ SIGTERM signal ทำให้ข้อมูลหาย หรือทำงานได้ไม่สมบูรณ์
หยุดการทำงาน webserver IIS หรือ terminate process หรือ pool (.NET) ทำให้ข้อมูลหาย หรือทำงานไม่สมบูรณ์
การหยุด application ใช้เวลานาน หรือต้องการขั้นตอนที่ซับซ้อนในการยกเลิกการจอง resource

10. Dev/Prod Parity

ให้การ deploy application และ environment setup ในขั้นตอน development, staging และ production ในรูปแบบเดียวกันให้มากที่สุด เนื่องจาก 12 Factor ถูกออกแบบเพื่อให้ ส่งมอบ application ได้อย่างต่อเนื่อง (continuous deployment) ด้วยการทำให้ความแตกต่างของการ deploy application ในขั้นตอน development มีน้อยที่สุด

ถ้า application ต้องการที่จะต้องปรับแต่งเพื่อให้สามารถทำงานได้บน environment ที่แตกต่างกัน หรือ codebase ต้องมีการแยกเพื่อให้เหมาะสมกับการ deploy ในแต่ละ environment เช่นมี branch QA หรือ PROD อาจจะมองได้ว่าเป็นการ design ที่ไม่ดี และไม่ตรงตามกฏข้างต้น

ในกรณีของ Docker image อาจจะเป็นปัญหากับข้อกำหนดนี้ เพราะอาจจะมีการ pack บาง config เพื่อให้เหมาะสมกับ environment จำเป็นต้องตรวจสอบและป้องกันไม่ให้เกิดปัญหา conflict กับ platform

กรณีที่ application ไม่ตรงตามข้อกำหนดนี้ ก็อาจจะไม่ตรงตาม Backing Service Factor ได้เหมือนกัน เนื่องจากบางกรณี developer จำเป็นต้องใช้ backing service ที่แตกต่างกันระหว่าง development และ production

ตัวอย่างปัญหาที่พบ

มีการใช้ Docker image ที่มีการ pack configuration สำหรับเฉพาะ environment เช่น มีการแยก QA Docker หรือ Prod Docker
การทำงานของ Application แตกต่างกัน หรือ feature ไม่เหมือนกันระหว่าง Production และ Development
เวลาในการ deploy application ใน Dev/QA และ Produciton ค่อนข้างเยอะ จากการที่ต้องปรับแต่ง เพื่อให้ application สามารถทำงานได้ใน environment ที่แตกต่างกัน

11. Logs

จัดการ log ของ application ในแบบเหตุการณ์ที่เกิดต่อเนื่อง (event stream) โดยไม่ควบคุมหรือจัดการไดๆ

Platform มีหน้าที่ต้องจัดการ log ของ application และ system เพื่อแสดงบน console ให้ Developer สามารถเข้าถึงได้ ทั้งนี้ Application เองจะต้องเขียน log ไปยัง standard output และ standard error เพื่อที่ Platform จะสามารถตรวจจับ รวบรวมและแสดงผลได้ในที่เดียว

ดังนั้น application ต้องไม่เขียน log บน file และเก็บไว้ใน file system เอง ในกรณีที่ appliction มีการใช้งาน SLF4j, Log4j, logback หรือ tool อื่นๆ เช่น ELMAH สำหรับ ASP.NET application ที่ ELMAH เองจะต้องการ backing service เช่น SQL server ทั้งนี้การเปลี่ยนให้ application มาใช้ stdout/stderr ก็จะส่งผลกระทบน้อย

Application อาจจะพิจารณาใช้ syslog หรือ log service อื่นๆ เพื่อจัดการเรื่องความปลอดภัยของข้อมูลได้ ในกรณีที่จัดการ stdout/stderr ผ่าน platform มีความยากในการจัดการ

ตัวอย่างปัญหาที่พบ

Application ไม่ได้ใช้ tool ช่วยในการจัดการ log (log4j, logback, slf4j, log4net) และไม่ได้ใช้ stdout
Application มี code เพื่อทำการรวม log และเขียน log โดยไม่อาศัย tool จากภายนอก

12. Admin Processes

สร้าง process ให้ทำงานแยกส่วนเฉพาะงาน เพื่อจัดการ process ของ application ที่ทำงานบน environment ตัวอย่างเช่น มี shell script เพื่อควบคุม proces หรือ database migration เป็นต้น

Factor Assessment Chart

ตัวอย่างตารางสำหรับทำ Application Assessment ว่า application มีความเหมาะสมที่จะ replatform ไปยัง cloud native platform หรือไม่ โดยถ้า Application ตรงตามข้อกำหนดก็จะได้ 1 แต้ม

Factor	Compliant?	Effort to Comply
Codebase
Dependencies
Configuration
Backing Services
Build, Release, Run
Processes
Port Binding
Concurrency
Disposability
Dev/Prod Parity
Logs
Admin Processes

หลังจากได้ทำ assessment กับทุก application แล้ว ก็สามารถลำดับ application ที่เหมาะสำหรับ replatform จากแต้มเยอะสุด ในกรณีได้แต้มที่เท่ากัน ให้ดูพิจารณาจากความยากในการปรับเปลี่ยน

จากข้อมูลนี้ก็จะทราบว่า application ไหนเหมาะที่จะย้ายไปยัง cloud native platform ทั้งนี้ยังต้องพิจารณา Factor อย่างอื่นรวมด้วยเช่น Application ที่ย้ายไปยัง cloud native platform ช่วย Business ให้บริการลูกค้าหรือผลตอบประกอบการที่มากขึ้น รวมถึงความพร้อมของทีม คนที่เกี่ยวข้องและเวลา เพื่อใช้ในการตัดสินใจ replatform application ให้ทำงานบน Cloud

Tech Notes

Technology is changing over time like the human age, documenting the finding helps me remember my footstep

Database performance tuning for App Dev

Template for PostgreSQL HA with Patroni and HA test

Postgresql with Patroni installation recipe

Business Outcomes Lean Canvas

PROBLEM

Risk

Current State

Metrics

Future State

Business Outcomes

Enablers/Partners

Costs (Operational efficiency)

Revenue (Innovation/Productivity)

Run Docker without Docker Desktop on macOS

KIND – Kubernetes in Docker

Cloud Native Application Portfolio Assessment Guide

1. Codebase

2. Dependencies

3. Configuration

4. Backing Services

5. Build, Release, Run

6. Processes

7. Port Binding

8. Concurrency

9. Disposability

10. Dev/Prod Parity

11. Logs

12. Admin Processes

Factor Assessment Chart