Add initial MQTT scrubber service scaffold

2026-03-12 18:12:16 +01:00
parent 957b2c41b3
commit 464f4c3ec4
22 changed files with 4150 additions and 1 deletions
+6
@@ -0,0 +1,6 @@
.git
bin
dist
config.json
mqqt-scrubber
docs/All connections.json
+4
@@ -0,0 +1,4 @@
config.json
mqqt-scrubber
bin/
dist/
+28
@@ -0,0 +1,28 @@
FROM golang:1.24-alpine AS build
WORKDIR /src
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o /out/mqqt-scrubber ./cmd/mqqt-scrubber
FROM alpine:3.20
RUN addgroup -S app && adduser -S -G app app \
&& apk add --no-cache ca-certificates wget
WORKDIR /app
COPY --from=build /out/mqqt-scrubber /usr/local/bin/mqqt-scrubber
COPY config.example.json /app/config.json
USER app
EXPOSE 8080
HEALTHCHECK --interval=30s --timeout=5s --start-period=15s --retries=3 CMD wget -q -O- http://127.0.0.1:8080/healthz >/dev/null || exit 1
ENTRYPOINT ["/usr/local/bin/mqqt-scrubber"]
CMD ["-config", "/app/config.json"]
+115 -1
@@ -1,3 +1,117 @@
# mqqt-scrubber
Mqtt scrubber in golang.
Small Go daemon for collecting MQTT topics, normalizing mostly Tasmota payloads, and writing them into InfluxDB v3.
## Current status
This repository now contains:
- a milestone work plan in `docs/WORKPLAN.md`
- a technical design in `docs/ARCHITECTURE.md`
- an initial runnable scaffold for the background service
## Scope for v1
- subscribe to MQTT topics, primarily `tele/+/SENSOR` and `tele/+/STATE`
- parse Tasmota JSON payloads into normalized records
- batch writes to InfluxDB v3 using the HTTP `write_lp` endpoint
- run as a long-lived background process with reconnect and graceful shutdown
## Project layout
```text
cmd/mqqt-scrubber/ application entrypoint
internal/config/ config loading and validation
internal/influx/ InfluxDB v3 HTTP writer
internal/model/ internal message and record types
internal/mqtt/ MQTT subscriber wrapper
internal/parser/ Tasmota payload parsing and flattening
internal/pipeline/ batching, flushing, and service orchestration
docs/ work plan and architecture notes
```
## Configuration
The service loads configuration from a JSON file passed with `-config` and then applies environment variable overrides.
Start with:
```bash
cp config.example.json config.json
```
Supported environment variables:
- `MQTT_SCRUBBER_MQTT_BROKER`
- `MQTT_SCRUBBER_MQTT_USERNAME`
- `MQTT_SCRUBBER_MQTT_PASSWORD`
- `MQTT_SCRUBBER_MQTT_CLIENT_ID`
- `MQTT_SCRUBBER_MQTT_TOPICS`
- `MQTT_SCRUBBER_MQTT_QOS`
- `MQTT_SCRUBBER_INFLUX_URL`
- `MQTT_SCRUBBER_INFLUX_DATABASE`
- `MQTT_SCRUBBER_INFLUX_TOKEN`
- `MQTT_SCRUBBER_INFLUX_PRECISION`
- `MQTT_SCRUBBER_APP_BATCH_SIZE`
- `MQTT_SCRUBBER_APP_BUFFER_SIZE`
- `MQTT_SCRUBBER_APP_FLUSH_INTERVAL`
- `MQTT_SCRUBBER_APP_FLUSH_TIMEOUT`
- `MQTT_SCRUBBER_APP_LOG_LEVEL`
- `MQTT_SCRUBBER_APP_HEALTH_ADDRESS`
`MQTT_SCRUBBER_MQTT_TOPICS` expects a comma-separated list.
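The comma-separated value is split, trimmed, and stripped of empty entries before it replaces the configured topic list. The sketch below mirrors that behavior under an illustrative name, `splitTopics`, which is an assumption and not the function name used by the service:

```go
package main

import (
	"fmt"
	"strings"
)

// splitTopics splits a comma-separated value into trimmed, non-empty topics.
// The name is illustrative; only the splitting behavior is taken from the config loader.
func splitTopics(value string) []string {
	parts := strings.Split(value, ",")
	topics := make([]string, 0, len(parts))
	for _, part := range parts {
		if trimmed := strings.TrimSpace(part); trimmed != "" {
			topics = append(topics, trimmed)
		}
	}
	return topics
}

func main() {
	// Extra spaces and an empty entry are tolerated.
	topics := splitTopics("tele/+/SENSOR, tele/+/STATE,,tele/+/LWT ")
	fmt.Println(len(topics), topics)
}
```

An empty or whitespace-only value yields an empty list, which then fails validation because at least one topic is required.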
## Run
```bash
go run ./cmd/mqqt-scrubber -config config.json
```
The process also exposes simple health endpoints when `health_address` is set:
- `/healthz`
- `/readyz`
- `/metrics`
## Build
```bash
go build ./cmd/mqqt-scrubber
```
## Docker
Build the runtime image:
```bash
docker build -t mqqt-scrubber .
```
Run it with your config mounted in:
```bash
docker run --rm \
-p 8080:8080 \
-v "$PWD/config.json:/app/config.json:ro" \
mqqt-scrubber
```
The container health check uses `http://127.0.0.1:8080/healthz`.
## Docker Compose
Bring the service up with the included compose file:
```bash
docker-compose up --build -d
```
It expects a local `config.json` next to the compose file and exposes port `8080` for health and metrics.
## Notes
- The repo name is kept as `mqqt-scrubber` to match the existing folder.
- The current parser is intentionally narrow and optimized for Tasmota telemetry first.
- Writes use line protocol over HTTP against InfluxDB v3 `/api/v3/write_lp`.
- The Docker image includes a non-root runtime user and a simple HTTP health endpoint.
- Runtime counters are exposed in Prometheus-style text format at `/metrics`.
+27
@@ -0,0 +1,27 @@
{
"mqtt": {
"broker": "tcp://127.0.0.1:1883",
"username": "",
"password": "",
"client_id": "mqqt-scrubber",
"topics": [
"tele/+/SENSOR",
"tele/+/STATE"
],
"qos": 0
},
"influx": {
"url": "http://127.0.0.1:8181",
"database": "home",
"token": "",
"precision": "ns"
},
"app": {
"batch_size": 200,
"buffer_size": 1000,
"flush_interval": "10s",
"flush_timeout": "10s",
"log_level": "info",
"health_address": ":8080"
}
}
+20
@@ -0,0 +1,20 @@
version: "3.8"
services:
mqqt-scrubber:
build:
context: .
container_name: mqqt-scrubber
restart: unless-stopped
ports:
- "8080:8080"
volumes:
- ./config.json:/app/config.json:ro
environment:
MQTT_SCRUBBER_APP_HEALTH_ADDRESS: ":8080"
healthcheck:
test: ["CMD", "wget", "-q", "-O-", "http://127.0.0.1:8080/healthz"]
interval: 30s
timeout: 5s
retries: 3
start_period: 15s
+175
@@ -0,0 +1,175 @@
# Architecture
## Problem statement
The service needs to sit in the background between an MQTT broker and InfluxDB v3. Its main job is to reliably consume selected MQTT topics, normalize mainly Tasmota payloads, and write time-series data into InfluxDB using a schema that is stable and query-friendly.
## Design goals
- keep the binary small and operationally simple
- treat MQTT consumption and Influx writes as independent failure domains
- avoid blocking in MQTT callbacks
- keep memory bounded with explicit buffering and batching
- optimize for Tasmota first, not generic ETL
## Non-goals for v1
- message replay from broker history
- general-purpose transformation DSL
- UI, dashboards, or query APIs
- support for every possible MQTT device type
## High-level flow
1. Load config from file and environment.
2. Start the batch pipeline.
3. Connect to MQTT and subscribe to configured topics.
4. Receive raw MQTT messages and enqueue them.
5. Parse supported Tasmota payloads into internal records.
6. Batch records and flush them to InfluxDB v3 using line protocol.
7. On shutdown, stop intake and flush remaining buffered records.
## Components
### `cmd/mqqt-scrubber`
Application entrypoint. Handles flags, startup logging, signal handling, and top-level wiring.
### `internal/config`
Loads config from JSON, applies environment overrides, and validates required fields.
### `internal/mqtt`
Owns broker connection management and topic subscriptions. The MQTT callback path should never perform parsing or network writes directly. It only enqueues raw messages into the internal pipeline.
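One way to keep the callback path non-blocking is a bounded channel with a `select`/`default` send. This is a minimal sketch under assumptions: the `intake` type, its `enqueue` method, and the drop-on-full policy are hypothetical and may differ from what the pipeline actually does when its buffer is full.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// rawMessage is a stand-in for the service's raw message type.
type rawMessage struct {
	Topic   string
	Payload []byte
}

// intake is a hypothetical bounded queue between the MQTT callback and the pipeline.
type intake struct {
	queue   chan rawMessage
	dropped atomic.Uint64 // callbacks may run concurrently, so count atomically
}

// enqueue never blocks: when the buffer is full the message is dropped and counted.
func (in *intake) enqueue(msg rawMessage) bool {
	select {
	case in.queue <- msg:
		return true
	default:
		in.dropped.Add(1)
		return false
	}
}

func main() {
	in := &intake{queue: make(chan rawMessage, 2)}
	for i := 0; i < 3; i++ {
		in.enqueue(rawMessage{Topic: "tele/demo/STATE"})
	}
	fmt.Println("queued:", len(in.queue), "dropped:", in.dropped.Load())
}
```

Dropping trades completeness for liveness: the broker connection stays responsive even when the downstream writer is slow.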
### `internal/parser`
Transforms supported Tasmota JSON payloads into normalized records. The parser is strict about JSON validity but tolerant of unknown keys.
### `internal/pipeline`
Receives raw messages, invokes the parser, batches records, and triggers flushes by size or time.
### `internal/influx`
Serializes records into line protocol and writes them to InfluxDB v3 through `/api/v3/write_lp`.
## Internal record model
Each parsed message becomes one or more records with:
- measurement
- tags
- fields
- timestamp
The current scaffold emits one record per Tasmota message, with flattened field names.
Example:
```text
measurement: tasmota_sensor
tags:
device: kitchen_plug
message_type: sensor
source: tasmota
fields:
energy_power: 42
energy_voltage: 230
si7021_temperature: 21.4
si7021_humidity: 44
timestamp: payload Time or message receive time
```
## Schema guidance
- tags should stay low-cardinality
- device identity belongs in tags
- sensor readings belong in fields
- avoid putting dynamic keys or values in tags
- flatten nested JSON keys into stable underscore-separated field names
## Finalized v1 schema
### Topic families
- `tele/<device>/LWT`
- `tele/<device>/STATE`
- `tele/<device>/SENSOR`
### Measurements
- `tasmota_lwt`
- `tasmota_state`
- `tasmota_sensor`
### Tags
- `device`: sanitized device identifier, lowercased, with hyphens and spaces normalized to underscores
- `message_type`: `lwt`, `state`, or `sensor`
- `source`: always `tasmota`
### Field naming
- nested JSON objects are flattened with underscore separators
- PascalCase and camelCase Tasmota keys are normalized to snake_case
- examples:
- `UptimeSec` -> `uptime_sec`
- `TempUnit` -> `temp_unit`
- `TotalStartTime` -> `total_start_time`
- `Berry.HeapUsed` -> `berry_heap_used`
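The key normalization can be reproduced with two boundary regexes followed by lowercasing and cleanup. The sketch below uses the same patterns as the parser in this commit; the function name `toSnakeCase` is illustrative.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// Splits an acronym from a following capitalized word, e.g. "BSSId" -> "BSS_Id".
	acronymBoundary = regexp.MustCompile(`([A-Z]+)([A-Z][a-z])`)
	// Splits a lowercase letter or digit from a following capital, e.g. "UptimeSec" -> "Uptime_Sec".
	camelBoundary = regexp.MustCompile(`([a-z0-9])([A-Z])`)
	// Anything outside [a-z0-9_] becomes an underscore after lowercasing.
	invalidChars = regexp.MustCompile(`[^a-z0-9_]+`)
)

// toSnakeCase normalizes a Tasmota key to a snake_case field name.
func toSnakeCase(key string) string {
	name := strings.TrimSpace(key)
	name = acronymBoundary.ReplaceAllString(name, `${1}_${2}`)
	name = camelBoundary.ReplaceAllString(name, `${1}_${2}`)
	name = strings.ToLower(name)
	name = invalidChars.ReplaceAllString(name, "_")
	return strings.Trim(name, "_")
}

func main() {
	for _, key := range []string{"UptimeSec", "TempUnit", "TotalStartTime", "BSSId"} {
		fmt.Printf("%s -> %s\n", key, toSnakeCase(key))
	}
}
```

The acronym rule runs first so that runs of capitals stay together instead of being split letter by letter.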
### Current field families
- `LWT`:
- `state` as string
- `online` as boolean
- `STATE`:
- base fields like `uptime`, `uptime_sec`, `heap`, `sleep`, `sleep_mode`, `load_avg`, `mqtt_count`
- relay state fields like `power`, `power1` through `power4`
- Wi-Fi fields like `wifi_rssi`, `wifi_signal`, `wifi_channel`, `wifi_link_count`, `wifi_mode`
- Berry fields like `berry_heap_used`, `berry_objects`
- `SENSOR`:
- energy fields like `energy_total`, `energy_today`, `energy_power`, `energy_voltage`, `energy_total_start_time`
- analog fields like `analog_temperature` and `analog_a0`
- optional `temp_unit`
### Timestamp handling
- prefer payload `Time` when present
- accept RFC3339 and timezone-less Tasmota timestamps in the form `2006-01-02T15:04:05`
- current implementation interprets timezone-less timestamps as UTC
## Failure handling
### MQTT unavailable
- rely on automatic reconnect
- resubscribe in the connect handler
- keep logs explicit about reconnect state
### Influx unavailable
- keep records in the in-memory batch until flush attempt returns
- retry on the next scheduled flush
- bound total in-memory intake with a channel capacity
### Invalid payload
- log parse failure with topic context
- skip the payload
- continue processing subsequent messages
## Initial deployment shape
- one process per broker or environment
- systemd service or Docker container
- config file mounted locally with secrets overridden by environment variables where practical
## Immediate next improvements
- add parser fixtures based on real Tasmota payloads
- add counters and metrics
- add integration testing against a local broker and InfluxDB
- evaluate whether a persistent spool is needed for outage tolerance
File diff suppressed because it is too large
+86
@@ -0,0 +1,86 @@
# Work Plan
## Goal
Build a simple Go daemon that runs continuously, subscribes to MQTT topics from home automation devices with Tasmota as the primary source, normalizes payloads, and stores the resulting time-series data in InfluxDB v3.
## Milestones
### Milestone 1: Ingestion skeleton
Estimate: 1 to 2 days
- create runnable Go service with config loading and structured logging
- connect to MQTT with reconnect support
- subscribe to `tele/+/SENSOR` and `tele/+/STATE`
- enqueue incoming messages without blocking the MQTT callback path
- batch and write records to InfluxDB v3
- implement graceful shutdown with final flush
Definition of done:
- service can run unattended for several hours
- on restart it reconnects and resumes subscriptions
- valid payloads reach InfluxDB
### Milestone 2: Tasmota normalization
Estimate: 1 to 2 days
- collect sample payloads from real devices
- normalize nested Tasmota JSON into stable field keys
- derive a schema for measurement, tags, and fields
- support `SENSOR` and `STATE` payload families cleanly
- add parser unit tests for the most common device payloads
Definition of done:
- representative payloads from real devices parse consistently
- field names stay stable across restarts and devices
- parse failures are visible in logs and do not crash the daemon
### Milestone 3: Operational hardening
Estimate: 1 to 2 days
- expose counters for received, parsed, written, dropped, and failed messages
- improve retry behavior and failure logging for Influx writes
- ensure bounded buffering and predictable memory use
- add sample deployment instructions for systemd or Docker
- document configuration and failure modes
Definition of done:
- service behavior under broker or Influx outages is understood and documented
- logs are enough to diagnose the common failure modes
- memory and buffering remain bounded under load
### Milestone 4: Broader topic support
Estimate: 2 to 4 days
- add more Tasmota topics if needed, for example `stat/+/STATUS10`
- add device-specific enrichments only when justified by real payloads
- optionally separate measurements by domain such as energy, climate, and state
- add integration tests with a local MQTT broker and InfluxDB instance
Definition of done:
- new topic families are added without destabilizing the core pipeline
- schema remains low-cardinality and query-friendly
## Backlog
- persistent local spool for outage tolerance
- dead-letter handling for invalid payloads
- health endpoint or Prometheus metrics
- config reload without restart
- per-topic parser routing for non-Tasmota devices
- retention and schema review once real data volume is known
## Delivery order
1. Make the daemon run reliably with a narrow scope.
2. Lock in the Tasmota schema from real samples.
3. Add visibility and failure handling.
4. Broaden topic coverage only after the core path is stable.
+11
@@ -0,0 +1,11 @@
module mqqt-scrubber
go 1.24.0
require github.com/eclipse/paho.mqtt.golang v1.5.1
require (
github.com/gorilla/websocket v1.5.3 // indirect
golang.org/x/net v0.44.0 // indirect
golang.org/x/sync v0.17.0 // indirect
)
+8
@@ -0,0 +1,8 @@
github.com/eclipse/paho.mqtt.golang v1.5.1 h1:/VSOv3oDLlpqR2Epjn1Q7b2bSTplJIeV2ISgCl2W7nE=
github.com/eclipse/paho.mqtt.golang v1.5.1/go.mod h1:1/yJCneuyOoCOzKSsOTUc0AJfpsItBGWvYpBLimhArU=
github.com/gorilla/websocket v1.5.3 h1:saDtZ6Pbx/0u+bgYQ3q96pZgCzfhKXGPqt7kZ72aNNg=
github.com/gorilla/websocket v1.5.3/go.mod h1:YR8l580nyteQvAITg2hZ9XVh4b55+EU/adAjf1fMHhE=
golang.org/x/net v0.44.0 h1:evd8IRDyfNBMBTTY5XRF1vaZlD+EmWx6x8PkhR04H/I=
golang.org/x/net v0.44.0/go.mod h1:ECOoLqd5U3Lhyeyo/QDCEVQ4sNgYsqvCZ722XogGieY=
golang.org/x/sync v0.17.0 h1:l60nONMj9l5drqw6jlhIELNv9I0A4OFgRsG9k2oT9Ug=
golang.org/x/sync v0.17.0/go.mod h1:9KTHXmSnoGruLpwFjVSX0lNNA75CykiMECbovNTZqGI=
+229
@@ -0,0 +1,229 @@
package config
import (
"encoding/json"
"errors"
"fmt"
"os"
"strconv"
"strings"
"time"
)
const envPrefix = "MQTT_SCRUBBER_"
type Config struct {
MQTT MQTTConfig `json:"mqtt"`
Influx InfluxConfig `json:"influx"`
App AppConfig `json:"app"`
}
type MQTTConfig struct {
Broker string `json:"broker"`
Username string `json:"username"`
Password string `json:"password"`
ClientID string `json:"client_id"`
Topics []string `json:"topics"`
QoS byte `json:"qos"`
}
type InfluxConfig struct {
URL string `json:"url"`
Database string `json:"database"`
Token string `json:"token"`
Precision string `json:"precision"`
}
type AppConfig struct {
BatchSize int `json:"batch_size"`
BufferSize int `json:"buffer_size"`
FlushInterval DurationValue `json:"flush_interval"`
FlushTimeout DurationValue `json:"flush_timeout"`
LogLevel string `json:"log_level"`
HealthAddress string `json:"health_address"`
}
type DurationValue struct {
time.Duration
}
func (value *DurationValue) UnmarshalJSON(data []byte) error {
var raw string
if err := json.Unmarshal(data, &raw); err != nil {
return fmt.Errorf("duration must be a string: %w", err)
}
parsed, err := time.ParseDuration(raw)
if err != nil {
return fmt.Errorf("invalid duration %q: %w", raw, err)
}
value.Duration = parsed
return nil
}
func Load(path string) (Config, error) {
cfg := defaultConfig()
if path != "" {
contents, err := os.ReadFile(path)
if err != nil {
return Config{}, fmt.Errorf("read config file: %w", err)
}
if err := json.Unmarshal(contents, &cfg); err != nil {
return Config{}, fmt.Errorf("parse config file: %w", err)
}
}
if err := applyEnvOverrides(&cfg); err != nil {
return Config{}, err
}
if err := cfg.Validate(); err != nil {
return Config{}, err
}
return cfg, nil
}
func (cfg Config) Validate() error {
if cfg.MQTT.Broker == "" {
return errors.New("mqtt broker is required")
}
if cfg.MQTT.ClientID == "" {
return errors.New("mqtt client_id is required")
}
if len(cfg.MQTT.Topics) == 0 {
return errors.New("at least one mqtt topic is required")
}
if cfg.Influx.URL == "" {
return errors.New("influx url is required")
}
if cfg.Influx.Database == "" {
return errors.New("influx database is required")
}
if cfg.Influx.Precision == "" {
return errors.New("influx precision is required")
}
if cfg.App.BatchSize <= 0 {
return errors.New("app batch_size must be greater than zero")
}
if cfg.App.BufferSize <= 0 {
return errors.New("app buffer_size must be greater than zero")
}
if cfg.App.FlushInterval.Duration <= 0 {
return errors.New("app flush_interval must be greater than zero")
}
if cfg.App.FlushTimeout.Duration <= 0 {
return errors.New("app flush_timeout must be greater than zero")
}
return nil
}
func defaultConfig() Config {
return Config{
MQTT: MQTTConfig{
Broker: "tcp://127.0.0.1:1883",
ClientID: "mqqt-scrubber",
Topics: []string{
"tele/+/SENSOR",
"tele/+/STATE",
},
QoS: 0,
},
Influx: InfluxConfig{
URL: "http://127.0.0.1:8181",
Database: "home",
Precision: "ns",
},
App: AppConfig{
BatchSize: 200,
BufferSize: 1000,
FlushInterval: DurationValue{Duration: 10 * time.Second},
FlushTimeout: DurationValue{Duration: 10 * time.Second},
LogLevel: "info",
HealthAddress: ":8080",
},
}
}
func applyEnvOverrides(cfg *Config) error {
setString(&cfg.MQTT.Broker, envPrefix+"MQTT_BROKER")
setString(&cfg.MQTT.Username, envPrefix+"MQTT_USERNAME")
setString(&cfg.MQTT.Password, envPrefix+"MQTT_PASSWORD")
setString(&cfg.MQTT.ClientID, envPrefix+"MQTT_CLIENT_ID")
setString(&cfg.Influx.URL, envPrefix+"INFLUX_URL")
setString(&cfg.Influx.Database, envPrefix+"INFLUX_DATABASE")
setString(&cfg.Influx.Token, envPrefix+"INFLUX_TOKEN")
setString(&cfg.Influx.Precision, envPrefix+"INFLUX_PRECISION")
setString(&cfg.App.LogLevel, envPrefix+"APP_LOG_LEVEL")
setString(&cfg.App.HealthAddress, envPrefix+"APP_HEALTH_ADDRESS")
if raw, ok := os.LookupEnv(envPrefix + "MQTT_TOPICS"); ok {
cfg.MQTT.Topics = splitAndTrim(raw)
}
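if raw, ok := os.LookupEnv(envPrefix + "MQTT_QOS"); ok {
parsed, err := strconv.Atoi(raw)
if err != nil {
return fmt.Errorf("parse %sMQTT_QOS: %w", envPrefix, err)
}
// MQTT defines only QoS 0, 1, and 2; reject other values before the byte conversion can wrap.
if parsed < 0 || parsed > 2 {
return fmt.Errorf("parse %sMQTT_QOS: value %d is outside the range 0-2", envPrefix, parsed)
}
cfg.MQTT.QoS = byte(parsed)
}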
if raw, ok := os.LookupEnv(envPrefix + "APP_BATCH_SIZE"); ok {
parsed, err := strconv.Atoi(raw)
if err != nil {
return fmt.Errorf("parse %sAPP_BATCH_SIZE: %w", envPrefix, err)
}
cfg.App.BatchSize = parsed
}
if raw, ok := os.LookupEnv(envPrefix + "APP_BUFFER_SIZE"); ok {
parsed, err := strconv.Atoi(raw)
if err != nil {
return fmt.Errorf("parse %sAPP_BUFFER_SIZE: %w", envPrefix, err)
}
cfg.App.BufferSize = parsed
}
if raw, ok := os.LookupEnv(envPrefix + "APP_FLUSH_INTERVAL"); ok {
parsed, err := time.ParseDuration(raw)
if err != nil {
return fmt.Errorf("parse %sAPP_FLUSH_INTERVAL: %w", envPrefix, err)
}
cfg.App.FlushInterval = DurationValue{Duration: parsed}
}
if raw, ok := os.LookupEnv(envPrefix + "APP_FLUSH_TIMEOUT"); ok {
parsed, err := time.ParseDuration(raw)
if err != nil {
return fmt.Errorf("parse %sAPP_FLUSH_TIMEOUT: %w", envPrefix, err)
}
cfg.App.FlushTimeout = DurationValue{Duration: parsed}
}
return nil
}
func setString(target *string, key string) {
if value, ok := os.LookupEnv(key); ok {
*target = value
}
}
func splitAndTrim(value string) []string {
parts := strings.Split(value, ",")
result := make([]string, 0, len(parts))
for _, part := range parts {
trimmed := strings.TrimSpace(part)
if trimmed != "" {
result = append(result, trimmed)
}
}
return result
}
+94
@@ -0,0 +1,94 @@
package health
import (
"context"
"encoding/json"
"fmt"
"log/slog"
"net/http"
"strings"
"time"
)
type Server struct {
address string
server *http.Server
metrics func() any
}
func NewServer(address string, metrics func() any) *Server {
if address == "" {
return nil
}
mux := http.NewServeMux()
mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
writeJSON(w, http.StatusOK, map[string]string{"status": "ok"})
})
mux.HandleFunc("/readyz", func(w http.ResponseWriter, _ *http.Request) {
writeJSON(w, http.StatusOK, map[string]string{"status": "ready"})
})
mux.HandleFunc("/metrics", func(w http.ResponseWriter, _ *http.Request) {
if metrics == nil {
writeJSON(w, http.StatusOK, map[string]string{"status": "metrics_unavailable"})
return
}
w.Header().Set("Content-Type", "text/plain; version=0.0.4")
_, _ = w.Write([]byte(formatMetrics(metrics())))
})
return &Server{
address: address,
metrics: metrics,
server: &http.Server{
Addr: address,
Handler: mux,
ReadHeaderTimeout: 5 * time.Second,
},
}
}
func (server *Server) Start() {
if server == nil {
return
}
go func() {
slog.Info("health server started", "address", server.address)
if err := server.server.ListenAndServe(); err != nil && err != http.ErrServerClosed {
slog.Error("health server stopped with error", "error", err)
}
}()
}
func (server *Server) Shutdown(ctx context.Context) error {
if server == nil {
return nil
}
return server.server.Shutdown(ctx)
}
func writeJSON(w http.ResponseWriter, status int, value any) {
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(status)
_ = json.NewEncoder(w).Encode(value)
}
func formatMetrics(snapshot any) string {
encoded, err := json.Marshal(snapshot)
if err != nil {
return "mqqt_scrubber_metrics_encode_error 1\n"
}
var fields map[string]uint64
if err := json.Unmarshal(encoded, &fields); err != nil {
return "mqqt_scrubber_metrics_decode_error 1\n"
}
var builder strings.Builder
for key, value := range fields {
fmt.Fprintf(&builder, "mqqt_scrubber_%s %d\n", key, value)
}
return builder.String()
}
+184
@@ -0,0 +1,184 @@
package influx
import (
"bytes"
"context"
"fmt"
"io"
"net/http"
"net/url"
"sort"
"strconv"
"strings"
"time"
"mqqt-scrubber/internal/config"
"mqqt-scrubber/internal/model"
)
type Client struct {
baseURL string
database string
token string
precision string
httpClient *http.Client
}
func NewClient(cfg config.InfluxConfig) *Client {
return &Client{
baseURL: strings.TrimRight(cfg.URL, "/"),
database: cfg.Database,
token: cfg.Token,
precision: cfg.Precision,
httpClient: &http.Client{
Timeout: 30 * time.Second,
},
}
}
func (client *Client) Write(ctx context.Context, records []model.Record) error {
if len(records) == 0 {
return nil
}
lines := make([]string, 0, len(records))
for _, record := range records {
line, err := toLineProtocol(record)
if err != nil {
return err
}
lines = append(lines, line)
}
writeURL, err := url.Parse(client.baseURL)
if err != nil {
return fmt.Errorf("parse influx url: %w", err)
}
writeURL.Path = strings.TrimRight(writeURL.Path, "/") + "/api/v3/write_lp"
query := writeURL.Query()
query.Set("db", client.database)
query.Set("precision", client.precision)
writeURL.RawQuery = query.Encode()
body := strings.Join(lines, "\n")
req, err := http.NewRequestWithContext(ctx, http.MethodPost, writeURL.String(), bytes.NewBufferString(body))
if err != nil {
return fmt.Errorf("create write request: %w", err)
}
req.Header.Set("Content-Type", "text/plain; charset=utf-8")
if client.token != "" {
req.Header.Set("Authorization", "Bearer "+client.token)
}
resp, err := client.httpClient.Do(req)
if err != nil {
return fmt.Errorf("execute write request: %w", err)
}
defer resp.Body.Close()
if resp.StatusCode >= http.StatusBadRequest {
responseBody, _ := io.ReadAll(io.LimitReader(resp.Body, 4096))
return fmt.Errorf("influx write failed with status %d: %s", resp.StatusCode, strings.TrimSpace(string(responseBody)))
}
return nil
}
func toLineProtocol(record model.Record) (string, error) {
if record.Measurement == "" {
return "", fmt.Errorf("record measurement is required")
}
if len(record.Fields) == 0 {
return "", fmt.Errorf("record fields are required")
}
tagKeys := sortedKeys(record.Tags)
fieldKeys := sortedKeysAny(record.Fields)
var builder strings.Builder
builder.WriteString(escapeMeasurement(record.Measurement))
for _, key := range tagKeys {
builder.WriteByte(',')
builder.WriteString(escapeTagOrKey(key))
builder.WriteByte('=')
builder.WriteString(escapeTagOrKey(record.Tags[key]))
}
builder.WriteByte(' ')
for index, key := range fieldKeys {
if index > 0 {
builder.WriteByte(',')
}
builder.WriteString(escapeTagOrKey(key))
builder.WriteByte('=')
formatted, err := formatFieldValue(record.Fields[key])
if err != nil {
return "", fmt.Errorf("format field %q: %w", key, err)
}
builder.WriteString(formatted)
}
builder.WriteByte(' ')
builder.WriteString(strconv.FormatInt(record.Timestamp.UnixNano(), 10))
return builder.String(), nil
}
func sortedKeys(values map[string]string) []string {
keys := make([]string, 0, len(values))
for key := range values {
keys = append(keys, key)
}
sort.Strings(keys)
return keys
}
func sortedKeysAny(values map[string]any) []string {
keys := make([]string, 0, len(values))
for key := range values {
keys = append(keys, key)
}
sort.Strings(keys)
return keys
}
func formatFieldValue(value any) (string, error) {
switch typed := value.(type) {
case string:
escaped := strings.ReplaceAll(typed, `\`, `\\`)
escaped = strings.ReplaceAll(escaped, `"`, `\"`)
return `"` + escaped + `"`, nil
case bool:
return strconv.FormatBool(typed), nil
case int:
return strconv.Itoa(typed) + "i", nil
case int64:
return strconv.FormatInt(typed, 10) + "i", nil
case float64:
return strconv.FormatFloat(typed, 'f', -1, 64), nil
case float32:
return strconv.FormatFloat(float64(typed), 'f', -1, 32), nil
default:
return "", fmt.Errorf("unsupported field type %T", value)
}
}
func escapeMeasurement(value string) string {
return strings.NewReplacer(
",", `\,`,
" ", `\ `,
).Replace(value)
}
func escapeTagOrKey(value string) string {
return strings.NewReplacer(
",", `\,`,
"=", `\=`,
" ", `\ `,
).Replace(value)
}
+60
@@ -0,0 +1,60 @@
package influx
import (
"fmt"
"testing"
"time"
"mqqt-scrubber/internal/model"
"mqqt-scrubber/internal/parser"
)
func TestToLineProtocolFromParsedLWT(t *testing.T) {
receivedAt := time.Date(2026, time.March, 12, 15, 21, 39, 0, time.UTC)
records, err := parser.ParseTasmota(model.RawMessage{
Topic: "tele/tasmota_896001/LWT",
Payload: []byte("Online"),
ReceivedAt: receivedAt,
})
if err != nil {
t.Fatalf("ParseTasmota returned error: %v", err)
}
line, err := toLineProtocol(records[0])
if err != nil {
t.Fatalf("toLineProtocol returned error: %v", err)
}
expected := fmt.Sprintf(
"tasmota_lwt,device=tasmota_896001,message_type=lwt,source=tasmota online=true,state=\"Online\" %d",
receivedAt.UnixNano(),
)
if line != expected {
t.Fatalf("unexpected line protocol:\n got: %s\nwant: %s", line, expected)
}
}
func TestToLineProtocolFromParsedSensor(t *testing.T) {
receivedAt := time.Date(2026, time.March, 12, 16, 23, 13, 0, time.UTC)
records, err := parser.ParseTasmota(model.RawMessage{
Topic: "tele/tasmota_C88994/SENSOR",
Payload: []byte(`{"Time":"2026-03-12T16:23:13","ENERGY":{"TotalStartTime":"2026-02-04T19:13:40","Total":41.385,"Yesterday":1.124,"Today":0.799,"Period":0,"Power":1,"ApparentPower":4,"ReactivePower":4,"Factor":0.22,"Voltage":231,"Current":0.016}}`),
ReceivedAt: receivedAt,
})
if err != nil {
t.Fatalf("ParseTasmota returned error: %v", err)
}
line, err := toLineProtocol(records[0])
if err != nil {
t.Fatalf("toLineProtocol returned error: %v", err)
}
expected := fmt.Sprintf(
"tasmota_sensor,device=tasmota_c88994,message_type=sensor,source=tasmota energy_apparent_power=4,energy_current=0.016,energy_factor=0.22,energy_period=0,energy_power=1,energy_reactive_power=4,energy_today=0.799,energy_total=41.385,energy_total_start_time=\"2026-02-04T19:13:40\",energy_voltage=231,energy_yesterday=1.124 %d",
receivedAt.UnixNano(),
)
if line != expected {
t.Fatalf("unexpected line protocol:\n got: %s\nwant: %s", line, expected)
}
}
+16
@@ -0,0 +1,16 @@
package model
import "time"
type RawMessage struct {
Topic string
Payload []byte
ReceivedAt time.Time
}
type Record struct {
Measurement string
Tags map[string]string
Fields map[string]any
Timestamp time.Time
}
+95
@@ -0,0 +1,95 @@
package mqtt
import (
"log/slog"
"time"
paho "github.com/eclipse/paho.mqtt.golang"
"mqqt-scrubber/internal/config"
"mqqt-scrubber/internal/model"
)
type Subscriber struct {
config config.MQTTConfig
handle func(model.RawMessage)
client paho.Client
started bool
}
func NewSubscriber(cfg config.MQTTConfig, handle func(model.RawMessage)) *Subscriber {
return &Subscriber{
config: cfg,
handle: handle,
}
}
func (subscriber *Subscriber) Start() error {
options := paho.NewClientOptions()
options.AddBroker(subscriber.config.Broker)
options.SetClientID(subscriber.config.ClientID)
options.SetAutoReconnect(true)
options.SetConnectRetry(true)
options.SetConnectRetryInterval(5 * time.Second)
options.SetKeepAlive(30 * time.Second)
options.SetPingTimeout(10 * time.Second)
options.SetOrderMatters(false)
if subscriber.config.Username != "" {
options.SetUsername(subscriber.config.Username)
}
if subscriber.config.Password != "" {
options.SetPassword(subscriber.config.Password)
}
callback := func(_ paho.Client, message paho.Message) {
subscriber.handle(model.RawMessage{
Topic: message.Topic(),
Payload: append([]byte(nil), message.Payload()...),
ReceivedAt: time.Now().UTC(),
})
}
options.SetOnConnectHandler(func(client paho.Client) {
filters := make(map[string]byte, len(subscriber.config.Topics))
for _, topic := range subscriber.config.Topics {
filters[topic] = subscriber.config.QoS
}
token := client.SubscribeMultiple(filters, callback)
token.Wait()
if err := token.Error(); err != nil {
slog.Error("failed to subscribe after connect", "error", err)
return
}
slog.Info("subscribed to mqtt topics", "topics", subscriber.config.Topics)
})
options.SetConnectionLostHandler(func(_ paho.Client, err error) {
slog.Warn("mqtt connection lost", "error", err)
})
options.SetReconnectingHandler(func(_ paho.Client, _ *paho.ClientOptions) {
slog.Info("mqtt reconnecting")
})
subscriber.client = paho.NewClient(options)
token := subscriber.client.Connect()
token.Wait()
if err := token.Error(); err != nil {
return err
}
subscriber.started = true
return nil
}
func (subscriber *Subscriber) Stop() {
if !subscriber.started || subscriber.client == nil {
return
}
subscriber.client.Disconnect(250)
subscriber.started = false
}
+158
@@ -0,0 +1,158 @@
package parser
import (
"encoding/json"
"fmt"
"regexp"
"sort"
"strings"
"time"
"mqqt-scrubber/internal/model"
)
var invalidNameCharacters = regexp.MustCompile(`[^a-z0-9_]+`)
var acronymBoundary = regexp.MustCompile(`([A-Z]+)([A-Z][a-z])`)
var camelBoundary = regexp.MustCompile(`([a-z0-9])([A-Z])`)
var tasmotaTimeLayouts = []string{
time.RFC3339Nano,
time.RFC3339,
"2006-01-02T15:04:05",
}
func ParseTasmota(message model.RawMessage) ([]model.Record, error) {
parts := strings.Split(message.Topic, "/")
if len(parts) != 3 {
return nil, fmt.Errorf("unsupported topic shape: %s", message.Topic)
}
if parts[0] != "tele" {
return nil, fmt.Errorf("unsupported topic root: %s", message.Topic)
}
measurement := "tasmota_" + sanitizeName(parts[2])
tags := map[string]string{
"device": sanitizeDeviceName(parts[1]),
"message_type": sanitizeName(parts[2]),
"source": "tasmota",
}
if strings.EqualFold(parts[2], "LWT") {
return []model.Record{parseLWT(message, measurement, tags)}, nil
}
var payload map[string]any
if err := json.Unmarshal(message.Payload, &payload); err != nil {
return nil, fmt.Errorf("invalid json payload: %w", err)
}
fields := flattenPayload(payload, nil)
delete(fields, "time")
if len(fields) == 0 {
return nil, fmt.Errorf("no usable fields in payload for topic %s", message.Topic)
}
timestamp := parsePayloadTimestamp(payload, message.ReceivedAt)
record := model.Record{
Measurement: measurement,
Tags: tags,
Fields: fields,
Timestamp: timestamp,
}
return []model.Record{record}, nil
}
func parseLWT(message model.RawMessage, measurement string, tags map[string]string) model.Record {
state := strings.TrimSpace(string(message.Payload))
return model.Record{
Measurement: measurement,
Tags: tags,
Fields: map[string]any{
"state": state,
"online": strings.EqualFold(state, "Online"),
},
Timestamp: message.ReceivedAt,
}
}

// parsePayloadTimestamp returns the payload "Time" value as a timestamp, or
// fallback when the value is missing or matches none of the known layouts.
// Values without a zone are interpreted as UTC.
func parsePayloadTimestamp(payload map[string]any, fallback time.Time) time.Time {
	rawTime, ok := payload["Time"].(string)
	if !ok || strings.TrimSpace(rawTime) == "" {
		return fallback
	}

	for _, layout := range tasmotaTimeLayouts {
		// time.Parse already yields UTC when the layout carries no zone,
		// so the zone-less layout needs no special case.
		if parsed, err := time.Parse(layout, rawTime); err == nil {
			return parsed
		}
	}
	return fallback
}
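
// flattenPayload flattens nested JSON objects into a single map whose keys
// are the sanitized path segments joined by "_". Only numbers, booleans, and
// strings are kept; arrays and null values are dropped.
func flattenPayload(payload map[string]any, prefix []string) map[string]any {
	result := make(map[string]any)

	// Visit keys in sorted order so that name collisions resolve the same way on every run.
	keys := make([]string, 0, len(payload))
	for key := range payload {
		keys = append(keys, key)
	}
	sort.Strings(keys)

	for _, key := range keys {
		value := payload[key]
		// Copy the prefix so sibling keys never share a backing array.
		nameParts := make([]string, 0, len(prefix)+1)
		nameParts = append(nameParts, prefix...)
		nameParts = append(nameParts, sanitizeName(key))

		switch typed := value.(type) {
		case map[string]any:
			nested := flattenPayload(typed, nameParts)
			for nestedKey, nestedValue := range nested {
				result[nestedKey] = nestedValue
			}
		case float64, bool, string:
			result[strings.Join(nameParts, "_")] = typed
		}
	}
	return result
}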

// sanitizeName converts a payload key or topic segment to lowercase
// snake_case, e.g. "UptimeSec" -> "uptime_sec". Empty results become "unknown".
func sanitizeName(value string) string {
	normalized := strings.TrimSpace(value)
	normalized = strings.ReplaceAll(normalized, "-", "_")
	normalized = strings.ReplaceAll(normalized, " ", "_")
	normalized = acronymBoundary.ReplaceAllString(normalized, `${1}_${2}`)
	normalized = camelBoundary.ReplaceAllString(normalized, `${1}_${2}`)
	normalized = strings.ToLower(normalized)
	normalized = invalidNameCharacters.ReplaceAllString(normalized, "_")
	normalized = strings.Trim(normalized, "_")
	if normalized == "" {
		return "unknown"
	}
	return normalized
}

// sanitizeDeviceName lowercases a device name without splitting on case
// boundaries, so "tasmota_67850B" becomes "tasmota_67850b" rather than
// "tasmota_67850_b". Empty results become "unknown".
func sanitizeDeviceName(value string) string {
	normalized := strings.ToLower(strings.TrimSpace(value))
	normalized = strings.ReplaceAll(normalized, "-", "_")
	normalized = strings.ReplaceAll(normalized, " ", "_")
	normalized = invalidNameCharacters.ReplaceAllString(normalized, "_")
	normalized = strings.Trim(normalized, "_")
	if normalized == "" {
		return "unknown"
	}
	return normalized
}
+102
View File
@@ -0,0 +1,102 @@
package parser
import (
"encoding/json"
"os"
"testing"
"time"
"mqqt-scrubber/internal/model"
)

// fixtureCase describes one entry in testdata/tasmota_samples.json.
type fixtureCase struct {
	Name                string            `json:"name"`
	Topic               string            `json:"topic"`
	Payload             string            `json:"payload"`
	ReceivedAt          string            `json:"received_at"`
	ExpectedMeasurement string            `json:"expected_measurement"`
	ExpectedTimestamp   string            `json:"expected_timestamp"`
	ExpectedTags        map[string]string `json:"expected_tags"`
	ExpectedFields      map[string]any    `json:"expected_fields"`
}

func TestParseTasmotaFixtures(t *testing.T) {
	contents, err := os.ReadFile("testdata/tasmota_samples.json")
	if err != nil {
		t.Fatalf("read fixture file: %v", err)
	}

	var fixtures []fixtureCase
	if err := json.Unmarshal(contents, &fixtures); err != nil {
		t.Fatalf("parse fixture file: %v", err)
	}

	for _, fixture := range fixtures {
		t.Run(fixture.Name, func(t *testing.T) {
			receivedAt, err := time.Parse(time.RFC3339, fixture.ReceivedAt)
			if err != nil {
				t.Fatalf("parse receivedAt: %v", err)
			}
			expectedTimestamp, err := time.Parse(time.RFC3339, fixture.ExpectedTimestamp)
			if err != nil {
				t.Fatalf("parse expected timestamp: %v", err)
			}

			records, err := ParseTasmota(model.RawMessage{
				Topic:      fixture.Topic,
				Payload:    []byte(fixture.Payload),
				ReceivedAt: receivedAt,
			})
			if err != nil {
				t.Fatalf("ParseTasmota returned error: %v", err)
			}
			if len(records) != 1 {
				t.Fatalf("expected 1 record, got %d", len(records))
			}

			record := records[0]
			if record.Measurement != fixture.ExpectedMeasurement {
				t.Fatalf("unexpected measurement: got %s want %s", record.Measurement, fixture.ExpectedMeasurement)
			}
			if !record.Timestamp.Equal(expectedTimestamp) {
				t.Fatalf("unexpected timestamp: got %s want %s", record.Timestamp.Format(time.RFC3339), expectedTimestamp.Format(time.RFC3339))
			}

			// Fixtures list only the tags and fields worth pinning; extra fields are allowed.
			for key, value := range fixture.ExpectedTags {
				if record.Tags[key] != value {
					t.Fatalf("unexpected tag %s: got %q want %q", key, record.Tags[key], value)
				}
			}
			for key, value := range fixture.ExpectedFields {
				fieldValue, ok := record.Fields[key]
				if !ok {
					t.Fatalf("expected field %s to be present", key)
				}
				if !fieldEquals(fieldValue, value) {
					t.Fatalf("unexpected field %s: got %#v want %#v", key, fieldValue, value)
				}
			}
		})
	}
}
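
// TestParseTasmotaFallsBackToReceivedAt is an illustrative sketch, not part of
// the original commit. It assumes the parser keeps its current behavior: a
// "Time" value that matches none of the known layouts is ignored and the
// record is stamped with the receive time instead. The topic and payload are
// made-up sample values.
func TestParseTasmotaFallsBackToReceivedAt(t *testing.T) {
	receivedAt := time.Date(2026, time.March, 12, 16, 0, 0, 0, time.UTC)

	records, err := ParseTasmota(model.RawMessage{
		Topic:      "tele/tasmota_896001/SENSOR",
		Payload:    []byte(`{"Time":"not-a-timestamp","ANALOG":{"A0":1}}`),
		ReceivedAt: receivedAt,
	})
	if err != nil {
		t.Fatalf("ParseTasmota returned error: %v", err)
	}
	if len(records) != 1 {
		t.Fatalf("expected 1 record, got %d", len(records))
	}
	if !records[0].Timestamp.Equal(receivedAt) {
		t.Fatalf("unexpected timestamp: got %s want %s", records[0].Timestamp, receivedAt)
	}
}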

// fieldEquals compares a parsed field with a fixture value. Both sides come
// from encoding/json, so numbers are float64; any other expected type fails.
func fieldEquals(got any, want any) bool {
	switch typedWant := want.(type) {
	case float64:
		typedGot, ok := got.(float64)
		return ok && typedGot == typedWant
	case string:
		typedGot, ok := got.(string)
		return ok && typedGot == typedWant
	case bool:
		typedGot, ok := got.(bool)
		return ok && typedGot == typedWant
	default:
		return false
	}
}
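
// TestSanitizeNames is an illustrative sketch, not part of the original
// commit. It pins the naming rules that the fixtures above exercise only
// indirectly: payload keys are split on case boundaries, while device names
// are only lowercased. The inputs are sample values chosen for illustration.
func TestSanitizeNames(t *testing.T) {
	nameCases := map[string]string{
		"UptimeSec":        "uptime_sec",
		"HeapUsed":         "heap_used",
		"POWER1":           "power1",
		"BSSId":            "bss_id",
		"Total-Start Time": "total_start_time",
		"   ":              "unknown",
	}
	for input, want := range nameCases {
		if got := sanitizeName(input); got != want {
			t.Errorf("sanitizeName(%q): got %q want %q", input, got, want)
		}
	}

	deviceCases := map[string]string{
		"tasmota_67850B":     "tasmota_67850b",
		"tasmota-prusa-mini": "tasmota_prusa_mini",
		"":                   "unknown",
	}
	for input, want := range deviceCases {
		if got := sanitizeDeviceName(input); got != want {
			t.Errorf("sanitizeDeviceName(%q): got %q want %q", input, got, want)
		}
	}
}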
+148
View File
@@ -0,0 +1,148 @@
[
{
"name": "lwt-online",
"topic": "tele/tasmota_896001/LWT",
"payload": "Online",
"received_at": "2026-03-12T15:21:39Z",
"expected_measurement": "tasmota_lwt",
"expected_timestamp": "2026-03-12T15:21:39Z",
"expected_tags": {
"device": "tasmota_896001",
"message_type": "lwt",
"source": "tasmota"
},
"expected_fields": {
"state": "Online",
"online": true
}
},
{
"name": "lwt-offline-hyphenated-device",
"topic": "tele/tasmota-prusa-mini/LWT",
"payload": "Offline",
"received_at": "2026-03-12T15:21:39Z",
"expected_measurement": "tasmota_lwt",
"expected_timestamp": "2026-03-12T15:21:39Z",
"expected_tags": {
"device": "tasmota_prusa_mini",
"message_type": "lwt",
"source": "tasmota"
},
"expected_fields": {
"state": "Offline",
"online": false
}
},
{
"name": "state-single-relay",
"topic": "tele/tasmota_67850B/STATE",
"payload": "{\"Time\":\"2025-10-28T11:56:55\",\"Uptime\":\"0T15:35:12\",\"UptimeSec\":56112,\"Heap\":27,\"SleepMode\":\"Dynamic\",\"Sleep\":50,\"LoadAvg\":19,\"MqttCount\":1,\"POWER\":\"ON\",\"Wifi\":{\"AP\":1,\"SSId\":\"Home_MiNi_smart\",\"BSSId\":\"02:E2:C6:A9:5F:E9\",\"Channel\":6,\"RSSI\":78,\"Signal\":-61,\"LinkCount\":1,\"Downtime\":\"0T00:00:05\"}}",
"received_at": "2025-10-28T11:56:54Z",
"expected_measurement": "tasmota_state",
"expected_timestamp": "2025-10-28T11:56:55Z",
"expected_tags": {
"device": "tasmota_67850b",
"message_type": "state",
"source": "tasmota"
},
"expected_fields": {
"power": "ON",
"uptime_sec": 56112,
"wifi_rssi": 78,
"wifi_signal": -61
}
},
{
"name": "state-multi-relay",
"topic": "tele/tasmota_725D2D/STATE",
"payload": "{\"Time\":\"2026-03-12T16:23:14\",\"Uptime\":\"4T07:10:12\",\"UptimeSec\":371412,\"Heap\":24,\"SleepMode\":\"Dynamic\",\"Sleep\":50,\"LoadAvg\":19,\"MqttCount\":1,\"POWER1\":\"OFF\",\"POWER2\":\"OFF\",\"POWER3\":\"OFF\",\"POWER4\":\"OFF\",\"Wifi\":{\"AP\":1,\"SSId\":\"Home_MiNi_smart\",\"BSSId\":\"02:E2:C6:A9:5F:E9\",\"Channel\":6,\"Mode\":\"11n\",\"RSSI\":58,\"Signal\":-71,\"LinkCount\":1,\"Downtime\":\"0T00:00:05\"}}",
"received_at": "2026-03-12T16:23:14Z",
"expected_measurement": "tasmota_state",
"expected_timestamp": "2026-03-12T16:23:14Z",
"expected_tags": {
"device": "tasmota_725d2d",
"message_type": "state",
"source": "tasmota"
},
"expected_fields": {
"power1": "OFF",
"power4": "OFF",
"wifi_mode": "11n",
"wifi_rssi": 58
}
},
{
"name": "state-with-berry",
"topic": "tele/tasmota_C8BD20/STATE",
"payload": "{\"Time\":\"2026-03-12T16:23:11\",\"Uptime\":\"4T07:10:08\",\"UptimeSec\":371408,\"Heap\":142,\"SleepMode\":\"Dynamic\",\"Sleep\":50,\"LoadAvg\":19,\"MqttCount\":2,\"Berry\":{\"HeapUsed\":4,\"Objects\":46},\"POWER\":\"ON\",\"Wifi\":{\"AP\":1,\"SSId\":\"Home_MiNi_smart\",\"BSSId\":\"32:5A:4C:53:3F:56\",\"Channel\":1,\"Mode\":\"HE20\",\"RSSI\":96,\"Signal\":-52,\"LinkCount\":1,\"Downtime\":\"0T00:00:03\"}}",
"received_at": "2026-03-12T16:23:11Z",
"expected_measurement": "tasmota_state",
"expected_timestamp": "2026-03-12T16:23:11Z",
"expected_tags": {
"device": "tasmota_c8bd20",
"message_type": "state",
"source": "tasmota"
},
"expected_fields": {
"berry_heap_used": 4,
"berry_objects": 46,
"power": "ON",
"wifi_mode": "HE20"
}
},
{
"name": "sensor-energy-only",
"topic": "tele/tasmota_C88994/SENSOR",
"payload": "{\"Time\":\"2026-03-12T16:23:13\",\"ENERGY\":{\"TotalStartTime\":\"2026-02-04T19:13:40\",\"Total\":41.385,\"Yesterday\":1.124,\"Today\":0.799,\"Period\":0,\"Power\":1,\"ApparentPower\":4,\"ReactivePower\":4,\"Factor\":0.22,\"Voltage\":231,\"Current\":0.016}}",
"received_at": "2026-03-12T16:23:13Z",
"expected_measurement": "tasmota_sensor",
"expected_timestamp": "2026-03-12T16:23:13Z",
"expected_tags": {
"device": "tasmota_c88994",
"message_type": "sensor",
"source": "tasmota"
},
"expected_fields": {
"energy_power": 1,
"energy_voltage": 231,
"energy_total": 41.385,
"energy_current": 0.016
}
},
{
"name": "sensor-analog-temperature",
"topic": "tele/tasmota_896001/SENSOR",
"payload": "{\"Time\":\"2026-03-12T16:25:38\",\"ANALOG\":{\"Temperature\":33.8},\"ENERGY\":{\"TotalStartTime\":\"2022-12-30T00:18:41\",\"Total\":1.413,\"Yesterday\":0.000,\"Today\":0.000,\"Period\":0,\"Power\":0,\"ApparentPower\":0,\"ReactivePower\":0,\"Factor\":0.00,\"Voltage\":0,\"Current\":0.000},\"TempUnit\":\"C\"}",
"received_at": "2026-03-12T16:25:38Z",
"expected_measurement": "tasmota_sensor",
"expected_timestamp": "2026-03-12T16:25:38Z",
"expected_tags": {
"device": "tasmota_896001",
"message_type": "sensor",
"source": "tasmota"
},
"expected_fields": {
"analog_temperature": 33.8,
"temp_unit": "C",
"energy_total": 1.413
}
},
{
"name": "sensor-analog-a0",
"topic": "tele/tasmota_725D2D/SENSOR",
"payload": "{\"Time\":\"2026-03-12T16:23:14\",\"ANALOG\":{\"A0\":1024},\"ENERGY\":{\"TotalStartTime\":\"2025-05-23T14:48:03\",\"Total\":15.782,\"Yesterday\":0.000,\"Today\":0.000,\"Period\":0,\"Power\":0,\"ApparentPower\":0,\"ReactivePower\":0,\"Factor\":0.00,\"Voltage\":0,\"Current\":0.000}}",
"received_at": "2026-03-12T16:23:14Z",
"expected_measurement": "tasmota_sensor",
"expected_timestamp": "2026-03-12T16:23:14Z",
"expected_tags": {
"device": "tasmota_725d2d",
"message_type": "sensor",
"source": "tasmota"
},
"expected_fields": {
"analog_a0": 1024,
"energy_total": 15.782,
"energy_power": 0
}
}
]
+152
View File
@@ -0,0 +1,152 @@
package pipeline
import (
"context"
"log/slog"
"sync/atomic"
"time"
"mqqt-scrubber/internal/config"
"mqqt-scrubber/internal/model"
"mqqt-scrubber/internal/parser"
)
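
// writer is the part of the InfluxDB client that the pipeline depends on.
type writer interface {
	Write(ctx context.Context, records []model.Record) error
}

// Service buffers raw MQTT messages, parses them, and writes the resulting
// records in batches.
type Service struct {
	config       config.Config
	influxClient writer
	input        chan model.RawMessage

	received atomic.Uint64 // messages passed to Enqueue
	parsed   atomic.Uint64 // records produced by the parser
	written  atomic.Uint64 // records accepted by the writer
	dropped  atomic.Uint64 // messages discarded because the buffer was full
	failed   atomic.Uint64 // unparsable messages plus records in failed write attempts
}

// Snapshot is a point-in-time copy of the service counters.
type Snapshot struct {
	Received uint64 `json:"received"`
	Parsed   uint64 `json:"parsed"`
	Written  uint64 `json:"written"`
	Dropped  uint64 `json:"dropped"`
	Failed   uint64 `json:"failed"`
}

// NewService returns a Service whose input buffer holds cfg.App.BufferSize messages.
func NewService(cfg config.Config, influxClient writer) *Service {
	return &Service{
		config:       cfg,
		influxClient: influxClient,
		input:        make(chan model.RawMessage, cfg.App.BufferSize),
	}
}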
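
// Enqueue hands a message to the pipeline without blocking. When the buffer
// is full the message is dropped and counted, so a slow writer cannot stall
// the MQTT callback.
func (service *Service) Enqueue(message model.RawMessage) {
	service.received.Add(1)
	select {
	case service.input <- message:
	default:
		service.dropped.Add(1)
		slog.Warn("dropping message because buffer is full", "topic", message.Topic)
	}
}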

// Run consumes queued messages until ctx is cancelled, flushing whenever the
// batch is full or the flush interval elapses. On cancellation it makes one
// final flush attempt and returns that attempt's error, if any. Messages
// still waiting in the input buffer at that point are not drained.
func (service *Service) Run(ctx context.Context) error {
	ticker := time.NewTicker(service.config.App.FlushInterval.Duration)
	defer ticker.Stop()

	batch := make([]model.Record, 0, service.config.App.BatchSize)
	var input <-chan model.RawMessage

	for {
		// Stop reading while the batch is full: receiving from a nil channel
		// blocks forever, which disables that case until a flush succeeds.
		if len(batch) >= service.config.App.BatchSize {
			input = nil
		} else {
			input = service.input
		}

		select {
		case <-ctx.Done():
			// ctx is already cancelled, so the final flush gets its own timeout.
			flushCtx, cancel := context.WithTimeout(context.Background(), service.config.App.FlushTimeout.Duration)
			err := service.flush(flushCtx, batch)
			cancel()
			if err != nil {
				return err
			}
			service.logCounters()
			return nil

		case message := <-input:
			records, err := parser.ParseTasmota(message)
			if err != nil {
				service.failed.Add(1)
				slog.Warn("failed to parse message", "topic", message.Topic, "error", err)
				continue
			}
			service.parsed.Add(uint64(len(records)))
			batch = append(batch, records...)

			if len(batch) >= service.config.App.BatchSize {
				flushCtx, cancel := context.WithTimeout(ctx, service.config.App.FlushTimeout.Duration)
				err := service.flush(flushCtx, batch)
				cancel()
				if err != nil {
					slog.Error("failed to flush full batch to influx; keeping batch in memory", "count", len(batch), "error", err)
					continue
				}
				batch = batch[:0]
			}

		case <-ticker.C:
			if len(batch) == 0 {
				continue
			}
			flushCtx, cancel := context.WithTimeout(ctx, service.config.App.FlushTimeout.Duration)
			err := service.flush(flushCtx, batch)
			cancel()
			if err != nil {
				slog.Error("failed to flush batch to influx; will retry on next interval", "count", len(batch), "error", err)
				continue
			}
			batch = batch[:0]
		}
	}
}

// flush writes the batch and updates the counters. A failed attempt adds the
// whole batch to the failed counter, so a batch that is retried is counted
// once per attempt.
func (service *Service) flush(ctx context.Context, batch []model.Record) error {
	if len(batch) == 0 {
		return nil
	}
	if err := service.influxClient.Write(ctx, batch); err != nil {
		service.failed.Add(uint64(len(batch)))
		return err
	}
	service.written.Add(uint64(len(batch)))
	slog.Info("flushed records to influx", "count", len(batch))
	return nil
}

// logCounters logs the current counter values.
func (service *Service) logCounters() {
	slog.Info("service counters",
		"received", service.received.Load(),
		"parsed", service.parsed.Load(),
		"written", service.written.Load(),
		"dropped", service.dropped.Load(),
		"failed", service.failed.Load(),
	)
}

// Snapshot returns the current counter values.
func (service *Service) Snapshot() Snapshot {
	return Snapshot{
		Received: service.received.Load(),
		Parsed:   service.parsed.Load(),
		Written:  service.written.Load(),
		Dropped:  service.dropped.Load(),
		Failed:   service.failed.Load(),
	}
}
+110
View File
@@ -0,0 +1,110 @@
package pipeline
import (
"context"
"sync"
"testing"
"time"
"mqqt-scrubber/internal/config"
"mqqt-scrubber/internal/model"
)

// fakeWriter records every batch it receives and signals the first write on flushed.
type fakeWriter struct {
	mu      sync.Mutex
	batches [][]model.Record
	flushed chan struct{}
}

func newFakeWriter() *fakeWriter {
	return &fakeWriter{flushed: make(chan struct{}, 1)}
}

func (writer *fakeWriter) Write(_ context.Context, records []model.Record) error {
	writer.mu.Lock()
	// Copy the batch: the service reuses its backing array after a flush.
	copyBatch := append([]model.Record(nil), records...)
	writer.batches = append(writer.batches, copyBatch)
	writer.mu.Unlock()

	// Signal without blocking; later writes are not signalled once the slot is taken.
	select {
	case writer.flushed <- struct{}{}:
	default:
	}
	return nil
}

func (writer *fakeWriter) firstBatch() []model.Record {
	writer.mu.Lock()
	defer writer.mu.Unlock()
	if len(writer.batches) == 0 {
		return nil
	}
	return writer.batches[0]
}

func TestServiceFlushesParsedRecords(t *testing.T) {
	fake := newFakeWriter()
	// A batch size of 2 with an hour-long interval means only the size trigger can flush.
	service := NewService(config.Config{
		App: config.AppConfig{
			BatchSize:     2,
			BufferSize:    8,
			FlushInterval: config.DurationValue{Duration: time.Hour},
			FlushTimeout:  config.DurationValue{Duration: time.Second},
		},
	}, fake)

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	errCh := make(chan error, 1)
	go func() {
		errCh <- service.Run(ctx)
	}()

	service.Enqueue(model.RawMessage{
		Topic:      "tele/tasmota_896001/LWT",
		Payload:    []byte("Online"),
		ReceivedAt: time.Date(2026, time.March, 12, 15, 21, 39, 0, time.UTC),
	})
	service.Enqueue(model.RawMessage{
		Topic:      "tele/tasmota_C88994/SENSOR",
		Payload:    []byte(`{"Time":"2026-03-12T16:23:13","ENERGY":{"TotalStartTime":"2026-02-04T19:13:40","Total":41.385,"Yesterday":1.124,"Today":0.799,"Period":0,"Power":1,"ApparentPower":4,"ReactivePower":4,"Factor":0.22,"Voltage":231,"Current":0.016}}`),
		ReceivedAt: time.Date(2026, time.March, 12, 16, 23, 13, 0, time.UTC),
	})

	select {
	case <-fake.flushed:
	case <-time.After(2 * time.Second):
		t.Fatal("timed out waiting for pipeline flush")
	}

	batch := fake.firstBatch()
	if len(batch) != 2 {
		t.Fatalf("expected 2 records in flushed batch, got %d", len(batch))
	}
	if batch[0].Measurement != "tasmota_lwt" {
		t.Fatalf("unexpected first measurement: %s", batch[0].Measurement)
	}
	if batch[0].Fields["online"] != true {
		t.Fatalf("unexpected lwt online field: %#v", batch[0].Fields["online"])
	}
	if batch[1].Measurement != "tasmota_sensor" {
		t.Fatalf("unexpected second measurement: %s", batch[1].Measurement)
	}
	if batch[1].Fields["energy_total"] != float64(41.385) {
		t.Fatalf("unexpected sensor energy_total field: %#v", batch[1].Fields["energy_total"])
	}

	cancel()
	select {
	case err := <-errCh:
		if err != nil {
			t.Fatalf("service returned error: %v", err)
		}
	case <-time.After(2 * time.Second):
		t.Fatal("timed out waiting for service shutdown")
	}
}
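
// TestEnqueueDropsWhenBufferFull is an illustrative sketch, not part of the
// original commit. It assumes Enqueue keeps its non-blocking behavior: with
// no Run loop draining the channel, a buffer of one accepts the first message
// and drops the second. The topic and payload are sample values.
func TestEnqueueDropsWhenBufferFull(t *testing.T) {
	service := NewService(config.Config{
		App: config.AppConfig{BufferSize: 1},
	}, newFakeWriter())

	for i := 0; i < 2; i++ {
		service.Enqueue(model.RawMessage{
			Topic:   "tele/tasmota_896001/LWT",
			Payload: []byte("Online"),
		})
	}

	snapshot := service.Snapshot()
	if snapshot.Received != 2 {
		t.Fatalf("unexpected received counter: got %d want 2", snapshot.Received)
	}
	if snapshot.Dropped != 1 {
		t.Fatalf("unexpected dropped counter: got %d want 1", snapshot.Dropped)
	}
}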