# cloud
# tools
# Docker Engine
- workload orchestrator
Kubernetes
YARN
kubevirt libvirt
swarm
redis cluster
- redis pipeline
anna
openresty/nginx
gvisor
- user-space kernel for containers
- runsc
- runtime-spec
- image-spec
Gofer
- file proxy
# moby project
- containerd
- runc
- infrakit
- buildkit
runtime: cri-o rktnetes containerd dockershim
Rkt
etcd
capos
titus
BoltDB Badger LevelDB etcd
kubelet
fluentd logstash
Container Networking Interface (CNI)
# workflow
https://airflow.apache.org/start.html
# orgs
moby project
- sdn toolkit
# dataset
# algo
CAP v2: In a distributed system (a collection of interconnected nodes that share data), you can have only two of the following three guarantees across a write/read pair: Consistency, Availability, and Partition Tolerance. One of them must be sacrificed.
- Consistency: A read is guaranteed to return the most recent write for a given client.
- Availability: A non-failing node will return a reasonable response within a reasonable amount of time (no error or timeout).
- Partition Tolerance: The system will continue to function when network partitions occur.
raft, etcd
- write ahead log
- leader
paxos
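
A minimal sketch of the write-ahead-log idea, assuming a single node and a JSON-lines file: every entry is appended and fsynced before the in-memory state changes, so the state can be rebuilt by replaying the log after a restart. The `WriteAheadLog` class and `demo` helper are hypothetical names for illustration; this is not etcd's WAL format, and Raft additionally tracks terms, indexes, and replication to followers.

```python
import json
import os
import tempfile


class WriteAheadLog:
    """Append-only log: an entry is made durable before it is applied."""

    def __init__(self, path):
        self.path = path
        self.state = {}
        self._replay()
        self._file = open(path, "a", encoding="utf-8")

    def _replay(self):
        # Rebuild the in-memory state from the log after a restart.
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                entry = json.loads(line)
                self.state[entry["key"]] = entry["value"]

    def put(self, key, value):
        # Step 1: append the entry and force it to disk.
        self._file.write(json.dumps({"key": key, "value": value}) + "\n")
        self._file.flush()
        os.fsync(self._file.fileno())
        # Step 2: only now mutate the in-memory state.
        self.state[key] = value

    def close(self):
        self._file.close()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "wal.log")
    wal = WriteAheadLog(path)
    wal.put("x", 1)
    wal.put("x", 2)
    wal.close()
    # Simulate a restart: a fresh instance replays the same file.
    recovered = WriteAheadLog(path)
    state = dict(recovered.state)
    recovered.close()
    return state
```

Calling `demo()` writes two entries, reopens the log, and returns the recovered state.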
b+ tree
lsm tree
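
A toy sketch of the LSM-tree write path, assuming everything lives in memory: writes land in a memtable, a full memtable is flushed as an immutable sorted run, reads check the memtable and then the runs from newest to oldest, and compaction merges the runs. `TinyLSM` and `memtable_limit` are hypothetical names; real engines such as LevelDB add an on-disk format, a WAL, bloom filters, and leveled compaction.

```python
import bisect


class TinyLSM:
    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit
        self.memtable = {}
        self.sstables = []  # oldest first; each is a sorted list of (key, value)

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Freeze the memtable into an immutable sorted run.
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Newest run wins, so search from the end.
        for table in reversed(self.sstables):
            keys = [k for k, _ in table]
            i = bisect.bisect_left(keys, key)
            if i < len(table) and table[i][0] == key:
                return table[i][1]
        return None

    def compact(self):
        # Merge all runs; later (newer) runs overwrite older values.
        merged = {}
        for table in self.sstables:
            merged.update(table)
        self.sstables = [sorted(merged.items())] if merged else []
```

The contrast with a B+ tree is that nothing is updated in place: an overwrite is just a newer entry that shadows the old one until compaction drops it.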
# toolchains
# docker
https://github.com/Yelp/dumb-init
# kafka
one-liner: at-least-once delivery, written in Scala, messages batched and addressed by logical offset, log segment files, one partition to one consumer (within a consumer group), relies on the OS page cache (no in-process cache), brokers, ZooKeeper for coordination instead of a master node, rebalance process, clients handle duplicates, CRC per message, monitoring events, Avro as the serialization protocol
ActiveMQ, RabbitMQ, ZeroMQ, JMS spec
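
A rough sketch of two ideas from the one-liner: a partition as an append-only log addressed by offset, and a consumer that tolerates redelivery by skipping offsets it has already applied. `PartitionLog` and `DedupingConsumer` are hypothetical names, not Kafka client APIs, and the offsets here are plain list indexes. `last_applied` stands in for progress that a real application would store together with its output.

```python
class PartitionLog:
    """One partition: an append-only sequence addressed by offset."""

    def __init__(self):
        self._messages = []

    def append_batch(self, batch):
        # Append a whole batch; return the offset of its first message.
        first = len(self._messages)
        self._messages.extend(batch)
        return first

    def read(self, offset, max_messages=10):
        end = min(offset + max_messages, len(self._messages))
        return [(i, self._messages[i]) for i in range(offset, end)]


class DedupingConsumer:
    """Consumer that survives redelivery by remembering the last applied offset."""

    def __init__(self, log):
        self.log = log
        self.committed = 0      # offset to resume from after a restart
        self.last_applied = -1  # highest offset whose effect was applied
        self.applied = []

    def poll(self, commit=True):
        for offset, message in self.log.read(self.committed):
            if offset <= self.last_applied:
                continue  # duplicate caused by redelivery; skip it
            self.applied.append(message)
            self.last_applied = offset
        if commit:
            self.committed = self.last_applied + 1
```

Polling with `commit=False` imitates a crash between processing and committing: the next poll re-reads the same offsets, and the offset check keeps them from being applied twice.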
# Spark SQL
one-liner: R-dataframe-like API, Catalyst as the query optimizer, nested data model based on Hive, logical plans analyzed eagerly, RDDs evaluated lazily. Internally, it creates a logical data scan operator that points to the RDD. Columnar compression: dictionary encoding, run-length encoding.
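
A small illustration of the two columnar compression schemes named above, written in Python for readability (Spark's own implementation is in Scala and differs in detail). The function names are hypothetical.

```python
def dictionary_encode(column):
    """Replace each value with a small integer code; return (dictionary, codes)."""
    dictionary = {}
    codes = []
    for value in column:
        # A new value gets the next free code; a seen value reuses its code.
        codes.append(dictionary.setdefault(value, len(dictionary)))
    return list(dictionary), codes


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(r) for r in runs]


def run_length_decode(runs):
    return [v for v, n in runs for _ in range(n)]
```

The two compose naturally: dictionary-encode a low-cardinality string column, then run-length-encode the resulting codes.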
logical optimizer: constant folding, predicate pushdown, projection pruning, null propagation, boolean expr simplification.
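
A toy sketch of two of these rewrite rules, constant folding and boolean simplification, applied bottom-up to a tiny expression tree. The node classes (`Lit`, `Col`, `Add`, `And`) and `optimize` are hypothetical and only loosely modeled on how a rule-based optimizer walks a tree; Catalyst itself expresses rules as Scala pattern matches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lit:
    value: object


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Add:
    left: object
    right: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


def _is_lit(expr, value):
    # Identity check so that Lit(1) is not mistaken for Lit(True).
    return isinstance(expr, Lit) and expr.value is value


def optimize(expr):
    """Rewrite children first, then try to simplify the current node."""
    if isinstance(expr, Add):
        left, right = optimize(expr.left), optimize(expr.right)
        if isinstance(left, Lit) and isinstance(right, Lit):
            return Lit(left.value + right.value)  # constant folding
        return Add(left, right)
    if isinstance(expr, And):
        left, right = optimize(expr.left), optimize(expr.right)
        if _is_lit(left, False) or _is_lit(right, False):
            return Lit(False)  # false AND x is false
        if _is_lit(left, True):
            return right       # true AND x is x
        if _is_lit(right, True):
            return left
        return And(left, right)
    return expr
```

For example, `optimize(Add(Col("x"), Add(Lit(1), Lit(2))))` folds the inner addition and leaves the column reference untouched.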
physical planning: pipeline projection
codegen: Scala quasiquotes, AST to code
user-defined types (UDTs) for ML
# dataflow stream model
MillWheel watermark: a lower bound (often heuristic) on the event times processed by the pipeline
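
One simple way to get a heuristic watermark, sketched under the assumption that events arrive at most `max_skew` time units out of order: take the largest event time seen so far and subtract the skew. This is an illustrative heuristic, not MillWheel's actual low-watermark computation; `HeuristicWatermark` and `max_skew` are hypothetical names.

```python
class HeuristicWatermark:
    def __init__(self, max_skew):
        self.max_skew = max_skew
        self.max_event_time = None

    def current(self):
        """Estimated lower bound on event times still to arrive (None if no data yet)."""
        if self.max_event_time is None:
            return None
        return self.max_event_time - self.max_skew

    def observe(self, event_time):
        """Record an event; return True if it arrived behind the watermark (late)."""
        watermark = self.current()
        late = watermark is not None and event_time < watermark
        if self.max_event_time is None or event_time > self.max_event_time:
            self.max_event_time = event_time
        return late
```

Because the bound is only a guess, some records can still show up behind it; a pipeline has to decide whether to drop such late data or refine earlier results.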
# Kubernetes
Kubernetes in Action
https://google.qwiklabs.com/focuses/878?locale=en&parent=catalog&qlcampaign=77-18-gcpd-236&utm_source=gcp&utm_campaign=kubernetes&utm_medium=documentation