twenty-four: service consolidation

The gym service used to be Python. Then it was Python + Go. Now it’s just Go, merged into the workouts service.

Fewer pods, less complexity, same functionality.

The Before State

October 11, 2025:

  • calendar - Go service (ICU ↔ Google Calendar sync, gym reservation sync)
  • gym - Python service (Selenium web scraper, Chrome sidecar)
  • strava - Python service (activity processing)
  • workouts - Go service (workout plan generator)

Four services. Two languages, plus a headless-Chrome runtime on top. Separate deployments, separate LoadBalancers, separate health checks.

The gym service was particularly heavy: Python + Selenium + headless Chrome in a sidecar container. It worked, but it was resource-intensive and had occasional stability issues.

The Porting Work

Phase 1: Python → Go (gym service)

Rewrote the gym scraper from Selenium to chromedp:

# Old: Selenium with Python
driver = webdriver.Chrome()
driver.get("https://gym-website.example.com")
elem = driver.find_element(By.ID, "username")
elem.send_keys(username)

// New: chromedp with Go
ctx, cancel := chromedp.NewContext(context.Background())
defer cancel()
err := chromedp.Run(ctx,
    chromedp.Navigate("https://gym-website.example.com"),
    chromedp.SendKeys("#username", username),
    chromedp.Click("#login-button"),
)

Benefits:

  • No Python runtime needed
  • No Selenium dependencies
  • No Chrome sidecar (uses remote Chrome DevTools in browser namespace)
  • Static binary, smaller image
  • Better error handling (Go’s type system caught edge cases)

The rewrite took a day. chromedp’s API is cleaner than Selenium’s, and the explicit context management made timeout handling straightforward.

Phase 2: Consolidation (gym → workouts)

Once the gym service was in Go, merging it into the workouts service was obvious. Both services:

  • Process fitness data
  • Talk to Intervals.icu
  • Run periodic sync jobs
  • Have similar resource needs

The merge:

  • Moved gym scraping code into workouts/gym_ssp.go and workouts/gym_models.go
  • Added /gym/* endpoints to workouts service
  • Updated routine service to fetch gym data from workouts.twenty-four.home/gym/
  • Removed standalone gym service deployment
// workouts service now handles:
http.HandleFunc("/gym/latest", handleGymLatest)
http.HandleFunc("/gym/all", handleGymAll)
http.HandleFunc("/gym/reserve", handleGymReserve)
http.HandleFunc("/gym/cancel", handleGymCancel)
http.HandleFunc("/gym/status", handleGymStatus)

Commit message excerpt:

Service consolidation:

  • Integrated gym scraping functionality into workouts service at /gym/* endpoints
  • Migrated Python Selenium scraper to Go chromedp implementation
  • Unified gym, Strava, workout plans, and stretching under single workouts deployment
  • Removed standalone gym-python service (Python + Selenium + Chrome sidecar)

Phase 3: Cleanup

Deleted the old gym service:

  • Removed gym/ directory (Python code, requirements.txt, Dockerfile)
  • Removed gym k8s resources (Deployment, Service, CronJobs)
  • Updated documentation to reflect new architecture
# Old architecture
$ kubectl get svc -n twenty-four
calendar    LoadBalancer   10.233.66.6
gym         LoadBalancer   10.233.66.8
strava      LoadBalancer   10.233.66.7
workouts    LoadBalancer   10.233.66.4

# New architecture
$ kubectl get svc -n twenty-four
calendar    LoadBalancer   10.233.66.6
strava      LoadBalancer   10.233.66.7
workouts    LoadBalancer   10.233.66.4

One less service to manage.

What Changed in the Code

Scraping logic:

Python’s Selenium waits were implicit:

driver.implicitly_wait(10)
elem = driver.find_element(By.CLASS_NAME, "fc-event")

chromedp makes timeouts explicit:

ctx, cancel := context.WithTimeout(ctx, 10*time.Second)
defer cancel()

err := chromedp.Run(ctx,
    chromedp.WaitVisible(".fc-event"),
    chromedp.Click(".fc-event"),
)

This caught bugs where the Python version would silently time out and return stale data.

Data structures:

Python used dictionaries:

class_data = {
    "Type": "HIIT",
    "Date": "2025-10-15",
    "Reserved": True,
}

Go uses typed structs:

type GymClass struct {
    Type     string    `json:"Type"`
    Date     string    `json:"Date"`
    Reserved bool      `json:"Reserved"`
    Position *string   `json:"Position,omitempty"`
}

The Position pointer caught a bug where Python was returning None but the calendar service expected an empty string.

Error handling:

Python’s exception handling was broad:

try:
    elem = driver.find_element(By.ID, "username")
    elem.send_keys(username)
except Exception as e:
    log.error(f"Login failed: {e}")
    return None

Go forces granularity:

if err := chromedp.Run(ctx, chromedp.SendKeys("#username", username, chromedp.ByQuery)); err != nil {
    return fmt.Errorf("failed to enter username: %w", err)
}

if err := chromedp.Run(ctx, chromedp.Click("#login-button", chromedp.ByQuery)); err != nil {
    return fmt.Errorf("failed to click login button: %w", err)
}

This made debugging much easier when the gym website changed their DOM structure.

Remote Chrome DevTools

The original gym service ran Chrome as a sidecar container:

# Old deployment
containers:
- name: gym
  image: gym-python:latest
- name: chrome
  image: selenium/standalone-chrome:latest
  resources:
    limits:
      memory: 2Gi

The new version uses remote Chrome DevTools running in the browser namespace:

# New deployment
containers:
- name: workouts
  image: workouts:latest
  env:
  - name: GYM_REMOTE_URL
    value: "http://10.233.66.3:3000"

Chrome runs in a separate deployment with 2 replicas for redundancy. All services that need browser automation share it. No more per-service Chrome sidecars.

Resource Impact

Before consolidation:

  • gym: 1 pod (Python + Chrome sidecar) - 256Mi + 2Gi = 2.25Gi memory
  • workouts: 1 pod - 256Mi memory
  • Total: 2.5Gi memory, 2 pods

After consolidation:

  • workouts: 1 pod - 512Mi memory
  • Total: 512Mi memory, 1 pod

80% reduction in memory usage. Freed up resources for other services.

What Stayed the Same

API compatibility:

The calendar service still calls:

resp, err := http.Get("http://workouts.twenty-four.home/gym/latest")

Just the hostname changed. The JSON response format is identical.

Scraping logic:

The DOM selectors, click sequences, and extraction logic stayed the same. chromedp translated directly from Selenium’s concepts:

  • find_element(By.CLASS_NAME, "foo")chromedp.WaitVisible(".foo")
  • elem.click()chromedp.Click(".foo")
  • elem.get_attribute("class")chromedp.AttributeValue(".foo", "class", &val)

Scheduling:

The hourly CronJob moved from the standalone gym service to the workouts service:

# Now in workouts.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: hourly-gym-update
  namespace: twenty-four
spec:
  schedule: "0 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: curl
            image: curlimages/curl:latest
            args:
            - "-X"
            - "POST"
            - "http://workouts/gym/update"

Same schedule, same effect.

Why This Mattered

Simpler deployments:

  • One less service to monitor
  • Fewer LoadBalancer IPs to track
  • Consolidated logs

Better resource utilization:

  • No duplicate Chrome instances
  • Shared Go runtime
  • More efficient memory usage

Easier development:

  • Single codebase for fitness data processing
  • Shared types and utilities
  • Consistent error handling

No functional loss:

  • All endpoints still work
  • Same data format
  • Same reliability

The Calendar Service Changes

The calendar service had its own gym sync logic (gym_sync.go, ~850 lines). After consolidation, I didn’t move it into the workouts service - it stayed in the calendar service because:

  1. Tight coupling with Google Calendar: The gym sync logic creates/updates calendar events immediately after processing gym data
  2. Notification handling: Waitlist notifications are sent during sync, not during scraping
  3. Auto-reservation logic: The calendar service decides when to reserve classes based on ICU workouts

The workouts service provides the data (/gym/latest), the calendar service orchestrates the sync.

This separation of concerns works well:

  • Workouts service: “Here’s what’s reserved/waitlisted”
  • Calendar service: “Based on ICU workouts, reserve X, notify about Y, update Z”

Lessons from Porting

chromedp is solid:

The Go ecosystem for browser automation is mature. chromedp’s API is clean, the docs are good, and it handles edge cases (stale elements, timeouts) better than Selenium.

Static typing caught bugs:

Moving from Python dicts to Go structs exposed assumptions. The Position field being *string instead of string caught a nil dereference that would’ve been a runtime error in Python.

Context management is powerful:

Go’s context.Context for timeouts/cancellation is more explicit than Python’s implicit waits. This made the scraper more resilient to slow page loads.

Multi-arch was easier in Go:

Building multi-arch images with Python requires careful dependency management (numpy, lxml, etc.). Go’s static binaries made arm64 + amd64 builds trivial:

ARG TARGETARCH
RUN CGO_ENABLED=0 GOOS=linux GOARCH=${TARGETARCH} \
    go build -o workouts .

No platform-specific dependencies, just compile for the target architecture.
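
Put together, the whole build can be a small multi-stage Dockerfile; this is a sketch, not the actual file - the base image, paths, and the `scratch` final stage are assumptions:

```dockerfile
# Stage 1: compile a static binary for the target platform.
FROM --platform=$BUILDPLATFORM golang:1.22 AS build
ARG TARGETARCH
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 GOOS=linux GOARCH=${TARGETARCH} \
    go build -o /workouts .

# Stage 2: ship only the binary.
FROM scratch
COPY --from=build /workouts /workouts
ENTRYPOINT ["/workouts"]
```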

Strava Consolidation

The Strava service was next. It was Python, ~990 lines handling activity processing:

  • Emoji injection (🏃 🚴 💪)
  • Commute detection (home ↔ work routes)
  • Gear assignment (roadie, gravel, trainer)
  • Walk muting (< 2km activities)
  • OAuth token management
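
Most of these rules reduce to small predicates. The walk-muting rule, for example, is a distance threshold; a sketch with a hypothetical helper name (Strava's API reports distance in meters, and the 2 km cutoff comes from the list above):

```go
package main

import "fmt"

// shouldMuteWalk reports whether a walk is short enough to hide
// from followers' feeds. distanceMeters comes from the activity API.
func shouldMuteWalk(activityType string, distanceMeters float64) bool {
	return activityType == "Walk" && distanceMeters < 2000
}

func main() {
	fmt.Println(shouldMuteWalk("Walk", 1500)) // true: under 2 km
	fmt.Println(shouldMuteWalk("Walk", 5000)) // false: long walk stays visible
	fmt.Println(shouldMuteWalk("Run", 1500))  // false: only walks are muted
}
```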

The port:

Converted Python API calls to Go using the Strava client library:

# Old: Python stravalib
from stravalib.client import Client

client = Client(access_token=token)
activities = client.get_activities(after=cutoff)

for activity in activities:
    if activity.type == "Run" and not activity.name.startswith("🏃"):
        client.update_activity(activity.id, name=f"🏃 {activity.name}")
// New: Go strava client
import "github.com/strava/go.strava"

service := strava.NewClient(token)
activities, _ := service.GetLoggedInAthleteActivities().After(cutoff).Do()

for _, activity := range activities {
    if activity.Type == strava.ActivityTypeRun && !strings.HasPrefix(activity.Name, "🏃") {
        updater := service.UpdateActivity(activity.Id)
        updater.Name(fmt.Sprintf("🏃 %s", activity.Name))
        updater.Do()
    }
}

Where it went:

Into the workouts service. Makes sense - Strava activities sync to Intervals.icu, which the workouts service already talks to.

Added /strava/* endpoints:

  • /strava/sync - Process recent activities
  • /strava/activities - List activities
  • /strava/gear - Get gear list
  • /strava/status - Processing status
  • /strava/process - Process specific activity
  • /strava/trainer - Detect trainer rides
  • /strava/force - Force OAuth refresh

The 15-minute CronJob moved from standalone strava service to workouts service.

OAuth token management:

The Python service stored tokens in S3 and refreshed them when expired. Go version does the same:

func RefreshTokenIfNeeded() error {
    token, _ := loadTokenFromS3()

    if time.Now().After(token.Expiry) {
        newToken, err := oauthConfig.TokenSource(context.Background(), token).Token()
        if err != nil {
            return fmt.Errorf("token refresh failed: %w", err)
        }

        saveTokenToS3(newToken)
    }

    return nil
}

Transparent. The service just works.

Calendar Consolidation

The calendar service was the orchestrator - 586 lines of sync logic handling:

  • ICU ↔ Google Calendar bi-directional sync
  • Gym reservation → ICU workout creation
  • Transit time buffers (leave 15 min early for gym)
  • Waitlist notifications
  • Auto-reservation when classes open up
  • Deletion sync (delete from calendar → delete from ICU)

Where it went:

Into the routine service. This one surprised me, but it made sense:

  • Routine service already displayed calendar data (cached Google Calendar events)
  • Routine service showed gym stats and workout summaries
  • Having both read + write operations in one service simplified the architecture

The merge added ~2,760 lines to routine:

  • calendar_sync.go - Google Calendar sync
  • gym_sync.go - Gym reservation sync (~850 lines of edge case handling)
  • intervals.go - Intervals.icu API client
  • google_write.go - Google Calendar write operations
  • pushover.go - Notifications
  • s3_storage.go - Cache persistence
  • constants.go - Transit buffers, config
  • notifications.go - Notification deduplication
  • weights.go - Weight data sync

The routine service went from “display cached data” to “display + orchestrate sync operations.”

API compatibility:

Added /api endpoint for backward compatibility. The old calendar service served sync data at the root (/), routine service uses /api:

// Old: calendar.twenty-four.home/
// New: routine.twenty-four.home/api

http.HandleFunc("/api", handleCalendarAPI)
http.HandleFunc("/api/sync", handleSync)
http.HandleFunc("/api/status", handleStatus)

Dashboard and other services updated to call routine instead of calendar.

Deployment changes:

The routine service deployment gained environment variables:

  • Google Calendar credentials
  • Intervals.icu API keys
  • Gym login credentials
  • Pushover notification tokens
  • S3 bucket names

Resource limits increased from 256Mi to 512Mi to handle the additional sync logic.

The Final Architecture

Before (October 11, 2025):

$ kubectl get svc -n twenty-four
calendar    LoadBalancer   10.233.66.6
gym         LoadBalancer   10.233.66.8
strava      LoadBalancer   10.233.66.7
workouts    LoadBalancer   10.233.66.4
routine     LoadBalancer   10.233.66.5
dashboard   LoadBalancer   10.233.66.2
dining      LoadBalancer   10.233.66.9

Seven services (including routine, dashboard, dining).

After (October 12, 2025):

$ kubectl get svc -n twenty-four
routine     LoadBalancer   10.233.66.5  # now includes calendar sync
workouts    LoadBalancer   10.233.66.4  # now includes gym + strava
dashboard   LoadBalancer   10.233.66.2
dining      LoadBalancer   10.233.66.9

Four services. All Go. All multi-arch. All running on ARM nodes.

What each service does now:

  • routine - Dashboard UI + calendar sync + gym sync + Google Calendar + ICU sync + notifications
  • workouts - Workout plan generation + gym scraping + Strava processing + ICU integration
  • dining - Meal tracking + AI recommendations
  • dashboard - Status monitoring across all services

Resource impact:

Before:

  • calendar: 256Mi
  • gym: 256Mi + 2Gi (Chrome sidecar) = 2.25Gi
  • strava: 256Mi
  • workouts: 256Mi
  • Total for these 4: 3Gi

After:

  • routine: 512Mi (was 256Mi, now includes calendar)
  • workouts: 512Mi (was 256Mi, now includes gym + strava)
  • Total: 1Gi

67% reduction in memory usage.

Three fewer LoadBalancers to manage. Three fewer deployments to monitor. Three fewer sets of logs to check.


See also: Part 0: The Platform | Part 1: Building with Claude | Part 2: Calendar Service | Part 3: Gym Service | Part 4: Strava Service | Part 5: Workout Generator | Part 6: AI Recommendations | Part 8: What’s Next

🤖 Generated with Claude Code

Co-Authored-By: Claude noreply@anthropic.com