docs: add technical diagrams explaining core components interactions
This commit is contained in:
parent
d296ff8205
commit
758f4271c6
@ -9,6 +9,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
|
||||
|
||||
### Added
|
||||
|
||||
- [#621](https://github.com/spegel-org/spegel/pull/621) Added Mermaid diagrams documentation to help explain Spegel's inner workings.
|
||||
|
||||
### Changed
|
||||
|
||||
- [#608](https://github.com/spegel-org/spegel/pull/608) Use custom proxy transport and increase idle connections per host.
|
||||
|
323
docs/FLOW-DIAGRAMS.md
Normal file
323
docs/FLOW-DIAGRAMS.md
Normal file
@ -0,0 +1,323 @@
|
||||
# Spegel: Visual Architecture Guide
|
||||
|
||||
This document provides a comprehensive set of diagrams explaining Spegel's architecture, flows, and operations.
|
||||
|
||||
## 1. High-Level Cluster Architecture
|
||||
|
||||
Shows how Spegel pods form a P2P network within the cluster, with fallback to external registry. Each node runs a Spegel pod that interacts with the local containerd instance.
|
||||
|
||||
```mermaid
|
||||
graph TB
|
||||
subgraph "External"
|
||||
ER["External Registry"]
|
||||
end
|
||||
|
||||
subgraph "Kubernetes Cluster"
|
||||
subgraph "Node 1"
|
||||
SP1["Spegel Pod"]
|
||||
CD1["Containerd"]
|
||||
SP1 <-->|interacts| CD1
|
||||
end
|
||||
|
||||
subgraph "Node 2"
|
||||
SP2["Spegel Pod"]
|
||||
CD2["Containerd"]
|
||||
SP2 <-->|interacts| CD2
|
||||
end
|
||||
|
||||
subgraph "Node 3"
|
||||
SP3["Spegel Pod"]
|
||||
CD3["Containerd"]
|
||||
SP3 <-->|interacts| CD3
|
||||
end
|
||||
|
||||
SP1 <-->|P2P Network| SP2
|
||||
SP2 <-->|P2P Network| SP3
|
||||
SP3 <-->|P2P Network| SP1
|
||||
end
|
||||
|
||||
SP1 -->|fallback| ER
|
||||
SP2 -->|fallback| ER
|
||||
SP3 -->|fallback| ER
|
||||
```
|
||||
|
||||
## 2. Pod Component Architecture
|
||||
|
||||
Details the internal components of a Spegel pod and their relationships, showing how the registry service, P2P components, and state management interact with each other and with containerd.
|
||||
|
||||
```mermaid
|
||||
graph TB
|
||||
subgraph "Spegel Pod"
|
||||
subgraph "Registry Service"
|
||||
RS[HTTP Server /v2/]
|
||||
RH[Request Handler]
|
||||
RS --> RH
|
||||
end
|
||||
|
||||
subgraph "P2P Components"
|
||||
P2P[P2P Router]
|
||||
DHT[DHT Provider]
|
||||
BS[Bootstrapper]
|
||||
P2P --> DHT
|
||||
BS --> P2P
|
||||
end
|
||||
|
||||
subgraph "State Management"
|
||||
ST[State Tracker]
|
||||
MT[Metrics]
|
||||
ST --> MT
|
||||
end
|
||||
|
||||
CD[Containerd Client]
|
||||
|
||||
RH --> P2P
|
||||
ST --> P2P
|
||||
CD --> ST
|
||||
end
|
||||
|
||||
subgraph "Node Components"
|
||||
CDD[Containerd Daemon]
|
||||
CS[Content Store]
|
||||
CDD --> CS
|
||||
end
|
||||
|
||||
CD --> CDD
|
||||
```
|
||||
|
||||
## 3. Image Pull Flow
|
||||
|
||||
Shows the sequence of operations during an image pull request, demonstrating both successful peer pulls and fallback to external registry.
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant CD as Containerd
|
||||
participant SR as Spegel Registry
|
||||
participant P2P as P2P Router
|
||||
participant PR as Peer Registry
|
||||
participant ER as External Registry
|
||||
|
||||
Note over SR,P2P: 20ms default resolve timeout
|
||||
Note over SR,P2P: 3 default resolve retries
|
||||
|
||||
CD->>SR: GET /v2/{name}/manifests/{ref}
|
||||
SR->>P2P: Resolve(key, allowSelf, retries)
|
||||
|
||||
alt Peer Found
|
||||
P2P-->>SR: Return Peer Address
|
||||
SR->>PR: Request Content
|
||||
PR-->>SR: Stream Content
|
||||
SR-->>CD: Return Content
|
||||
CD->>CS: Store Content
|
||||
else No Peers Available (within 20ms)
|
||||
SR-->>CD: 404 Not Found
|
||||
CD->>ER: Request from External
|
||||
ER-->>CD: Return Content
|
||||
CD->>CS: Store Content
|
||||
end
|
||||
```
|
||||
|
||||
## 4. P2P Network Formation
|
||||
|
||||
Shows how nodes discover each other and form the P2P network through leader election and peer sharing.
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant N1 as Node 1
|
||||
participant N2 as Node 2
|
||||
participant N3 as Node 3
|
||||
participant LE as Leader Election
|
||||
|
||||
Note over N1,LE: 10s lease duration
|
||||
Note over N1,LE: 5s renew deadline
|
||||
Note over N1,LE: 2s retry period
|
||||
|
||||
N1->>LE: Participate in Election
|
||||
N2->>LE: Participate in Election
|
||||
N3->>LE: Participate in Election
|
||||
LE->>N1: Elected Leader
|
||||
N2->>N1: Discover Leader
|
||||
N3->>N1: Discover Leader
|
||||
N1->>N2: Share Peer List
|
||||
N1->>N3: Share Peer List
|
||||
N2->>N3: Establish P2P Connection
|
||||
|
||||
Note over N1,N3: P2P Network Formed
|
||||
```
|
||||
|
||||
## 5. State Management and Content Advertisement
|
||||
|
||||
Shows how content availability is maintained and advertised in the P2P network, including periodic refresh cycles and event-driven updates.
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant ST as State Tracker
|
||||
participant CD as Containerd
|
||||
participant P2P as P2P Router
|
||||
participant DHT as DHT Network
|
||||
participant MT as Metrics
|
||||
|
||||
Note over ST,DHT: Content TTL: 10 minutes
|
||||
Note over ST,DHT: Refresh: Every 9 minutes
|
||||
|
||||
loop Every 9 minutes
|
||||
ST->>CD: List Images
|
||||
CD-->>ST: Image List
|
||||
|
||||
loop For each image
|
||||
ST->>P2P: Advertise(image_keys)
|
||||
P2P->>DHT: Provide(keys)
|
||||
end
|
||||
|
||||
ST->>MT: Update Metrics
|
||||
end
|
||||
|
||||
CD-->>ST: Image Event (Create/Update/Delete)
|
||||
ST->>P2P: Update Advertisement
|
||||
ST->>MT: Update Metrics
|
||||
```
|
||||
|
||||
## 6. Content Resolution Process
|
||||
|
||||
Shows how content is located and retrieved from peers in the network, including peer selection and retry mechanisms.
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant SR as Spegel Registry
|
||||
participant P2P as P2P Router
|
||||
participant DHT as DHT Network
|
||||
participant PR1 as Peer 1
|
||||
participant PR2 as Peer 2
|
||||
|
||||
SR->>P2P: Resolve(content_key)
|
||||
P2P->>DHT: FindProviders(key)
|
||||
|
||||
par Parallel Resolution
|
||||
DHT-->>P2P: Found Peer 1
|
||||
DHT-->>P2P: Found Peer 2
|
||||
end
|
||||
|
||||
P2P->>SR: Return First Available Peer
|
||||
|
||||
Note over SR,PR2: Default 20ms timeout
|
||||
Note over SR,PR2: 3 retry attempts
|
||||
|
||||
alt Try Peer 1
|
||||
SR->>PR1: Request Content
|
||||
PR1-->>SR: Stream Content
|
||||
else Peer 1 Fails
|
||||
SR->>PR2: Request Content
|
||||
PR2-->>SR: Stream Content
|
||||
end
|
||||
```
|
||||
|
||||
## 7. Data Flow Paths
|
||||
|
||||
Shows the content paths and system control flows, including peer transfers and fallback mechanisms.
|
||||
|
||||
```mermaid
|
||||
graph LR
|
||||
subgraph "Content Paths"
|
||||
CD[Containerd]
|
||||
SP[Spegel]
|
||||
P[Peers]
|
||||
ER[External Registry]
|
||||
CS[Content Store]
|
||||
|
||||
CD -->|Request| SP
|
||||
SP -->|Check| P
|
||||
P -->|Content| SP
|
||||
SP -->|Return| CD
|
||||
CD -->|Store| CS
|
||||
|
||||
SP -->|404| CD
|
||||
CD -->|Fallback| ER
|
||||
end
|
||||
|
||||
subgraph "P2P Operations"
|
||||
P2P[P2P Network]
|
||||
DHT[DHT]
|
||||
ST[State Tracker]
|
||||
|
||||
P2P -->|Advertise| DHT
|
||||
DHT -->|Discover| P2P
|
||||
ST -->|Update| P2P
|
||||
end
|
||||
```
|
||||
|
||||
## 8. Failure Handling
|
||||
|
||||
Shows how different types of failures are handled in the system.
|
||||
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant CD as Containerd
|
||||
participant SR as Spegel Registry
|
||||
participant P2P as P2P Router
|
||||
participant PR as Peer
|
||||
participant ER as External Registry
|
||||
|
||||
Note over SR,ER: Failure Scenarios
|
||||
|
||||
alt Peer Not Found
|
||||
CD->>SR: Request Content
|
||||
SR->>P2P: Resolve(key)
|
||||
P2P--xSR: No Peers Available
|
||||
SR-->>CD: 404 Not Found
|
||||
CD->>ER: Fallback Request
|
||||
end
|
||||
|
||||
alt Peer Connection Failed
|
||||
SR->>PR: Request Content
|
||||
PR--xSR: Connection Failed
|
||||
SR->>P2P: Resolve(key) Retry
|
||||
P2P-->>SR: Alternative Peer
|
||||
end
|
||||
|
||||
alt Content Corrupted
|
||||
SR->>PR: Request Content
|
||||
PR-->>SR: Stream Content
|
||||
SR--xCD: Verification Failed
|
||||
CD->>ER: Fallback Request
|
||||
end
|
||||
```
|
||||
|
||||
## 9. Metrics Collection
|
||||
|
||||
Shows how metrics are collected and organized across the system components.
|
||||
|
||||
```mermaid
|
||||
graph TB
|
||||
subgraph "Metrics Sources"
|
||||
RQ[Registry Requests]
|
||||
P2P[P2P Operations]
|
||||
ST[State Changes]
|
||||
end
|
||||
|
||||
subgraph "Metric Types"
|
||||
CT[Counters]
|
||||
HT[Histograms]
|
||||
GT[Gauges]
|
||||
end
|
||||
|
||||
subgraph "Prometheus Metrics"
|
||||
MR[mirror_requests_total]
|
||||
RD[resolve_duration_seconds]
|
||||
AI[advertised_images]
|
||||
AK[advertised_keys]
|
||||
RL[request_latency]
|
||||
IF[requests_inflight]
|
||||
end
|
||||
|
||||
RQ --> CT
|
||||
RQ --> HT
|
||||
P2P --> HT
|
||||
P2P --> GT
|
||||
ST --> GT
|
||||
|
||||
CT --> MR
|
||||
HT --> RD
|
||||
HT --> RL
|
||||
GT --> AI
|
||||
GT --> AK
|
||||
GT --> IF
|
||||
```
|
Loading…
x
Reference in New Issue
Block a user