Gateway
This page covers diagnosing common issues with the Hedgehog Gateway, including connectivity problems and NAT issues.
Health Checks
Start by verifying the gateway has picked up its current configuration:
$ kubectl get gatewayagents
NAME APPLIED APPLIEDG CURRENTG VERSION PROTOCOLIP VTEPIP AGE
gateway-1 10 minutes ago 3 3 v1.2.0 ... ... 2d
AppliedG should equal CurrentG. If they differ, the gateway has not yet
applied the latest configuration; check the dataplane pod logs.
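The comparison can also be scripted. The following is a minimal sketch, assuming the column layout shown above. Because the APPLIED column contains spaces (for example "10 minutes ago"), it counts fields from the right. The function name check_gateway_generation is hypothetical, and the parsing breaks if VERSION, PROTOCOLIP, VTEPIP, or AGE is empty.

```shell
# Hypothetical helper: reads `kubectl get gatewayagents` output on stdin and
# reports every gateway whose APPLIEDG and CURRENTG columns differ.
# Assumes the column order shown above with non-empty VERSION, PROTOCOLIP,
# VTEPIP and AGE columns, so APPLIEDG and CURRENTG are the 6th and 5th
# fields counted from the right.
check_gateway_generation() {
  awk 'NR > 1 && NF >= 7 {
    applied = $(NF - 5)
    current = $(NF - 4)
    if (applied != current) {
      printf "%s: applied generation %s, current generation %s\n", $1, applied, current
      drift = 1
    }
  }
  END { exit drift + 0 }'   # exit status 1 if any gateway is behind
}

# Usage:
#   kubectl get gatewayagents | check_gateway_generation
```

The function prints nothing and returns success when every gateway has applied its current generation.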
If the gateway is not reporting in at all, check that both pods are running:
$ kubectl get pods -n fab -l app.kubernetes.io/component=gateway
NAME READY STATUS RESTARTS AGE
gw--gateway-1--dataplane-7v9ss 1/1 Running 0 12h
gw--gateway-1--frr-c9kwc 2/2 Running 0 12h
If either pod is not Running, inspect its logs:
$ kubectl logs -n fab gw--gateway-1--dataplane-7v9ss
$ kubectl logs -n fab gw--gateway-1--frr-c9kwc -c frr
$ kubectl logs -n fab gw--gateway-1--frr-c9kwc -c frr-agent
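With several gateways deployed, it can help to filter the pod listing before pulling logs. The sketch below assumes the default `kubectl get pods` column layout shown above; the function name unhealthy_pods is hypothetical.

```shell
# Hypothetical helper: reads `kubectl get pods` output on stdin and prints
# the names of pods that are not Running or whose READY count is incomplete.
unhealthy_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")   # READY looks like "1/1" or "2/2"
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Usage: list unhealthy gateway pods, then show recent logs for each one.
#   kubectl get pods -n fab -l app.kubernetes.io/component=gateway | unhealthy_pods |
#     while read -r pod; do
#       kubectl logs -n fab "$pod" --all-containers --tail=50
#     done
```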
Common Issues
Traffic not flowing through gateway
- Check peering is configured: Verify the GatewayPeering object exists and is not rejected.
- Check routes on the leaf: Verify gateway routes are installed on the leaf switches. Look for routes pointing to the gateway's VTEP IP.
- Check FRR is advertising routes: Use the FRR pod to verify BGP is advertising the peering prefixes (see FRR and BGP State).
- Check flow filter: Run show flow-filter table in the dataplane CLI to verify the peering policy is loaded. If the flow filter is empty, the dataplane configuration may not have been applied yet; check the FRR agent logs.
NAT not working as expected
- Check flow table: Run show flow-table entries in the dataplane CLI to see if flows are being created. If the flow table is empty while traffic is flowing, the packets may be dropped by the flow filter before reaching the NAT stage.
- Check NAT state: Run show masquerading state, show static-nat rules, or show port-forwarding rules to verify the NAT configuration is loaded.
- Idle timeout: If connections work briefly then stop, the flow may be expiring. Check the idleTimeout setting in the GatewayPeering spec. Use TCP or application-layer keepalives for long-lived connections.
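For the idle timeout case, a keepalive only helps if it fires before the flow expires. The sketch below is an illustration for a Linux client host, not part of the gateway tooling: it compares a keepalive time with an idle timeout, both in seconds. The function name keepalive_within_timeout is hypothetical, and the idleTimeout value has to be converted to seconds by hand. Note that the Linux sysctl default applies only to sockets that enable SO_KEEPALIVE.

```shell
# Hypothetical helper: succeeds if the TCP keepalive time is shorter than the
# flow idle timeout, so keepalive probes refresh the flow before it expires.
#   $1: idle timeout in seconds (converted by hand from the idleTimeout setting)
#   $2: keepalive time in seconds; defaults to the Linux sysctl value
keepalive_within_timeout() {
  idle_timeout=$1
  keepalive=${2:-$(cat /proc/sys/net/ipv4/tcp_keepalive_time)}
  [ "$keepalive" -lt "$idle_timeout" ]
}

# Usage (300 is an example value, not a documented default):
#   keepalive_within_timeout 300 || echo "keepalive fires after the flow expires"
```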
Gateway failover
- Check both gateways are running: Verify both gateway pods are healthy.
- Check gateway group membership: Verify both gateways are members of the expected group with correct priorities.
- Check BGP on leaves: After a failover, the leaf switches should withdraw routes from the failed gateway and install routes from the backup. Use kubectl fabric inspect bgp to check.
Diagnostics
Dataplane CLI
The dataplane includes an interactive CLI for inspecting internal state. Access it by exec'ing into the dataplane pod:
Key commands:
| Command | Description |
|---|---|
| show flow-filter table | Peering policy loaded on the dataplane |
| show flow-table entries | Active stateful NAT sessions |
| show masquerading state | Masquerade NAT configuration and pool state |
| show static-nat rules | Static NAT mappings |
| show port-forwarding rules | Port-forwarding rules |
| show ip fib | IPv4 forwarding table |
| show config summary | Configuration generation and apply status |
| show tech | Full diagnostic dump (for support) |
Use help in the CLI to see all available commands.
FRR and BGP State
FRR runs in a separate pod. Use vtysh to inspect BGP state:
Check BGP neighbors:
All neighbors should be in Established state. If a neighbor is in Active
or Idle, the BGP session is not established; check physical connectivity
and IP configuration.
Check routes advertised by the gateway:
VPC peering prefixes should appear as BGP routes pointing to the gateway's VTEP IP.
Check VRF routing tables:
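The three checks above can be run without an interactive shell. This is a minimal sketch, assuming the fab namespace and the frr container name used in the log commands earlier on this page. The wrapper name frr_vtysh is hypothetical, and the vtysh commands in the usage comments are standard FRR commands chosen as assumptions; in particular, the EVPN command assumes the peering prefixes are advertised over EVPN.

```shell
# Hypothetical wrapper: run a single vtysh command in the frr container of
# an FRR pod.
#   $1: FRR pod name, for example gw--gateway-1--frr-c9kwc
#   $2: vtysh command to run
frr_vtysh() {
  kubectl exec -n fab "$1" -c frr -- vtysh -c "$2"
}

# Usage:
#   frr_vtysh gw--gateway-1--frr-c9kwc 'show bgp summary'           # neighbor states
#   frr_vtysh gw--gateway-1--frr-c9kwc 'show bgp l2vpn evpn route'  # advertised routes (assumes EVPN)
#   frr_vtysh gw--gateway-1--frr-c9kwc 'show ip route vrf all'      # VRF routing tables
```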
Metrics
The dataplane exposes Prometheus metrics. The Alloy agent on the gateway node scrapes them and forwards them to the Fabric Proxy.
Each metric is emitted with three label variants:
- {total="<vpc>"}: all traffic in or out of the VPC
- {drops="<vpc>"}: traffic dropped for the VPC
- {from="<src>",to="<dst>"}: directional traffic between two VPCs
Available metrics:
| Metric | Type | Description |
|---|---|---|
| vpc_packet_count | Gauge | Packet count |
| vpc_packet_rate | Gauge | Packet rate |
| vpc_byte_count | Gauge | Byte count |
| vpc_byte_rate | Gauge | Byte rate |
To inspect metrics directly, run on the gateway node itself (the dataplane uses host networking, so the endpoint is accessible on the node at port 9442):
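A minimal sketch of such a check, assuming the endpoint serves the standard Prometheus text format under the conventional /metrics path (the path is an assumption; only port 9442 comes from this page). The helper name vpc_drops is hypothetical. It keeps only the drops-labelled samples with a non-zero value, which is usually the first thing to look at when traffic is missing.

```shell
# Hypothetical helper: reads Prometheus text-format metrics on stdin and
# prints the vpc_* samples carrying a drops label whose value is non-zero.
# Assumes label values contain no spaces, so the sample value is field 2.
vpc_drops() {
  awk '/^vpc_[a-z_]+[{]drops="/ && $2 + 0 != 0'
}

# Usage on the gateway node (the /metrics path is an assumption):
#   curl -s http://localhost:9442/metrics | vpc_drops
```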