Spring Boot Server Monitoring (Actuator + Prometheus + Grafana)

I decided to attach a monitoring stack to track the server's performance and failures.

Actuator

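Actuator has to be configured first, because it supplies the data the monitoring tools consume.

The dependency was already added when I applied the circuit breaker, so what remains is to configure it in the yml and confirm the endpoints are reachable.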

application.yml

management:
  endpoints:
    jmx:
      exposure:
        exclude: '*'
    web:
      exposure:
        include: health,circuitbreakers,prometheus,metrics,loggers
      base-path: /secret
  endpoint:
    health:
      show-details: always
  health:
    circuitBreakers:
      enabled: true
    mail:
      enabled: false

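jmx: if it is not used, it is best to block exposure of everything.
endpoints: include only what is needed and run the list as a whitelist. → better for security
base-path: the default URL /actuator is well known and scanning bots will probe it, so it is worth changing.
- Access control can be added with Spring Security
/actuator/health responded (written here with the default prefix), but far too slowly. The cause turned out to be the mail SMTP connection check, which was taking too long.
- Found the slow part by checking components one at a time, e.g. /actuator/health/db
- Mail has little impact on operations right now, so its health check is disabled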
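
The per-component check described above can be automated. What follows is only a minimal sketch in Python (standard library only), assuming `show-details: always` is set so the aggregate response lists its components. The function names and the health URL passed in are hypothetical, not part of the original setup.

```python
import json
import time
import urllib.error
import urllib.request


def component_names(health_body: dict) -> list[str]:
    """Return the component names listed in a /health response body."""
    return sorted(health_body.get("components", {}))


def slowest_first(timings: dict[str, float]) -> list[tuple[str, float]]:
    """Sort (component, seconds) pairs so the slowest check comes first."""
    return sorted(timings.items(), key=lambda item: item[1], reverse=True)


def time_component(health_url: str, name: str, timeout: float = 30.0) -> float:
    """Measure how long GET {health_url}/{name} takes, in seconds."""
    started = time.perf_counter()
    try:
        with urllib.request.urlopen(f"{health_url}/{name}", timeout=timeout) as resp:
            resp.read()
    except OSError:
        # A DOWN component answers with an HTTP error; the elapsed time is still useful.
        pass
    return time.perf_counter() - started


def profile_health(health_url: str) -> list[tuple[str, float]]:
    """Fetch the aggregate health body, then time every component on its own."""
    try:
        with urllib.request.urlopen(health_url, timeout=60.0) as resp:
            body = json.load(resp)
    except urllib.error.HTTPError as err:
        # A DOWN aggregate returns an error status but still carries the JSON body.
        body = json.load(err)
    timings = {name: time_component(health_url, name) for name in component_names(body)}
    return slowest_first(timings)
```

Calling `profile_health` with the full health URL (base-path included) returns the components ordered by response time, so a slow check such as mail would show up at the top.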

"components": {
    "circuitBreakers": {
        "status": "UNKNOWN"
    }
    ...
}

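circuitBreakers shows UNKNOWN at first because Resilience4j uses lazy creation: a circuit breaker is created on its first call.
Once an API call goes through, the circuit breaker exists and its state can be checked.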

Prometheus + Grafana

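A combination widely used for monitoring Spring servers

Integrates with Actuator
Open source, so no separate cost
Stores time-series data, which allows before/after comparison and analysis of trends over time
Server-side configuration is simple to manage

Pros

Real-time metrics
Community dashboards (Spring Boot dashboards can be imported directly)
Continuous monitoring

Cons

Initial setup required
A Docker environment is needed (for the setup used here)
Learning curve for customizing dashboards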

Prometheus

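The server only exposes an endpoint, and Prometheus polls it periodically over HTTP (lightweight)
Pull model: the server just responds when a request arrives (a small request that returns metrics as text)
Because it is pull-based, the application is unaffected even if Prometheus goes down
- A failed metrics collection does not affect server performance
The application keeps only the latest state of each metric in memory
Cumulative values are fine, but whatever happens between two scrapes is not captured, so real-time resolution is limited by the scrape interval
- Mitigations: run Prometheus redundantly, shorten the interval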
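
To make the interval trade-off concrete, here is a small simulation in Python. It is only a sketch: the series are made-up per-second values, and `scrape` simply samples them the way a poller would, without talking to a real Prometheus.

```python
def scrape(series: list[float], interval: int) -> list[tuple[int, float]]:
    """Keep only the values a scraper would see when polling every `interval` seconds."""
    return [(t, series[t]) for t in range(0, len(series), interval)]


def increase(samples: list[tuple[int, float]]) -> float:
    """Total increase of a counter between the first and last scraped sample."""
    return samples[-1][1] - samples[0][1]


# Hypothetical counter: total requests served, with a burst of 50/s at t=7..9.
requests_total = []
total = 0.0
for t in range(31):
    total += 50 if 7 <= t <= 9 else 1
    requests_total.append(total)

# Hypothetical gauge: active threads, normally 10, spiking to 200 during the burst.
active_threads = [200.0 if 7 <= t <= 9 else 10.0 for t in range(31)]

counter_samples = scrape(requests_total, interval=15)
gauge_samples = scrape(active_threads, interval=15)

# The counter is cumulative, so the burst is still included in the scraped increase.
print("counter increase seen:", increase(counter_samples))
# The gauge reports only its value at scrape time, so the spike is never observed.
print("gauge max seen:", max(value for _, value in gauge_samples))
# A shorter interval narrows the blind spot.
print("gauge max at 2s:", max(value for _, value in scrape(active_threads, interval=2)))
```

With a 15s interval the counter still accounts for the whole burst, while the gauge spike falls between scrapes and is missed. Sampling every 2s happens to catch it here.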

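Push model

The server sends the data itself, so network I/O problems can affect the application
On failure, data is buffered (memory grows)
Retry logic is required
Minimal data loss

metrics: numeric representations of application state - counter, gauge, timer/histogram (duration/distribution)
buffer: in the push model, data held temporarily in memory when a send fails → memory grows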
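
The buffering and retry cost of the push model can be sketched as follows. This is a toy illustration in Python, not how any particular push client works. `PushReporter`, `fake_send`, and the buffer size are all hypothetical names and values chosen for the example.

```python
from collections import deque
from typing import Callable


class PushReporter:
    """Toy push-style reporter: sends each sample, buffering it when the send fails."""

    def __init__(self, send: Callable[[dict], bool], max_buffer: int = 1000) -> None:
        self._send = send
        # Bounded buffer: when full, the oldest sample is dropped instead of growing forever.
        self._buffer: deque = deque(maxlen=max_buffer)

    def report(self, sample: dict) -> None:
        """Queue a sample and try to deliver everything that is pending."""
        self._buffer.append(sample)
        self.flush()

    def flush(self) -> None:
        """Retry buffered samples in order; stop at the first failure."""
        while self._buffer:
            if not self._send(self._buffer[0]):
                return
            self._buffer.popleft()

    @property
    def pending(self) -> int:
        return len(self._buffer)


# Simulated collector that is down at first and comes back later.
state = {"up": False}
received: list[dict] = []


def fake_send(sample: dict) -> bool:
    if state["up"]:
        received.append(sample)
    return state["up"]


reporter = PushReporter(fake_send, max_buffer=3)
for n in range(5):
    reporter.report({"requests_total": n})
pending_while_down = reporter.pending  # memory held by the application during the outage

state["up"] = True
reporter.flush()
```

While the collector is down the application itself holds the samples, and with a bounded buffer the oldest ones are lost. In the pull model none of this logic lives in the application.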

For testing, I built the environment locally and connected it to the test server's endpoint.

Prometheus and Grafana are started with docker-compose and connected to each other.

Architecture

[Local PC]                         [Production server]
┌──────────────┐                   ┌─────────────────┐
│  Prometheus  │ ───── HTTP ────>  │  Spring Boot    │
│  (9090)      │                   │  /actuator/     │
└──────────────┘                   │  prometheus     │
         │                         └─────────────────┘       
         │ 
┌──────────────┐
│   Grafana    │
│   (3000)     │
└──────────────┘

The production server only exposes the actuator endpoint (low resource overhead)
Prometheus on the local PC scrapes the metrics periodically
Grafana visualizes them

Installation and configuration

Add the Prometheus dependency

runtimeOnly 'io.micrometer:micrometer-registry-prometheus'

Add prometheus to the exposed Actuator endpoints
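
Once exposed, the endpoint returns plain text in the Prometheus exposition format. The sketch below parses such text in Python to show what a scrape actually reads. The sample lines are illustrative values written for this example, not output captured from the server, and the parser covers only the simple cases.

```python
import re

# name, optional {labels}, value, optional timestamp
SAMPLE_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$')
LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text: str) -> list[tuple[str, dict[str, str], float]]:
    """Parse exposition-format text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if match is None:
            continue  # ignore anything this simple parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Illustrative text in the shape Micrometer's Prometheus registry produces.
SAMPLE = '''
# HELP jvm_memory_used_bytes The amount of used memory
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{area="heap",id="G1 Eden Space"} 1.2E7
# TYPE http_server_requests_seconds summary
http_server_requests_seconds_count{method="GET",status="200",uri="/api/items"} 120.0
http_server_requests_seconds_sum{method="GET",status="200",uri="/api/items"} 3.6
'''

parsed = parse_metrics(SAMPLE)
by_name = {name: value for name, _, value in parsed}
average_seconds = (
    by_name["http_server_requests_seconds_sum"] / by_name["http_server_requests_seconds_count"]
)
print(f"average request time: {average_seconds:.3f}s")
```

A timer shows up as a `_count` and `_sum` pair, which is why average latency is derived by dividing one by the other rather than read directly.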

prometheus.yml

global:
  scrape_interval: 15s
  evaluation_interval: 15s
  scrape_timeout: 10s

scrape_configs:
  - job_name: 'test'
    metrics_path: '/actuator/prometheus'
    scrape_interval: 10s # pull interval for this job (overrides the global 15s)
    static_configs:
      - targets: ['ip:port']
        labels:
          application: 'test'
          environment: 'dev'

docker-compose.yml

version: '3.8'

services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus-data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
    restart: unless-stopped
    networks:
      - monitoring

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - "3000:3000"
    volumes:
      - grafana-data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin123
    restart: unless-stopped
    networks:
      - monitoring
    depends_on:
      - prometheus

volumes:
  prometheus-data:
  grafana-data:

networks:
  monitoring:
    driver: bridge
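
After `docker-compose up -d`, one way to confirm that Prometheus is actually reaching the target is to run the `up` query against its HTTP API. The Python sketch below assumes Prometheus is published on localhost:9090 as in the compose file and that the job is named `test` as in prometheus.yml. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request


def query_url(prometheus_base: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{prometheus_base}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def down_instances(response: dict) -> list[str]:
    """From an `up` query response, list the instances whose last scrape failed."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response}")
    return [
        result["metric"].get("instance", "?")
        for result in response["data"]["result"]
        if float(result["value"][1]) == 0.0
    ]


def check_targets(prometheus_base: str = "http://localhost:9090", job: str = "test") -> list[str]:
    """Ask Prometheus which instances of `job` are currently failing to scrape."""
    url = query_url(prometheus_base, f'up{{job="{job}"}}')
    with urllib.request.urlopen(url, timeout=5) as resp:
        return down_instances(json.load(resp))
```

An empty list from `check_targets()` means every instance of the job was scraped successfully. The same information is visible in the Prometheus UI under Status → Targets.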

Grafana

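A tool for data visualization, monitoring, and analysis

Dashboards are built from panels, each showing a specific metric over a period of time

Supports data sources other than Prometheus, such as databases

Adding the Prometheus data source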

1. Connections → Data sources
2. Add new data source → Prometheus
3. URL: http://prometheus:9090 (the Compose service name resolves because both containers share the monitoring network)
4. Save & test

Importing a Spring Boot dashboard

1. Dashboards → New → Import
2. Enter ID: 11378 (Spring Boot 2.1+ Statistics) → Load
3. Select the Prometheus data source
4. Import

11378 - overall Spring Boot monitoring
4701 - JVM metrics
19004 - Spring Boot Observability

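Other useful endpoints

For development/monitoring

include: health,circuitbreakers,prometheus,metrics,info,loggers

metrics - look up individual metrics (for checking directly when Grafana is not in use)
info - application info, version, git commit hash, etc.
loggers - change log levels at runtime (without redeploying)

Example: turning on DEBUG logging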

curl -X POST http://.../actuator/loggers/com.my.app \
-H "Content-Type: application/json" \
-d '{"configuredLevel":"DEBUG"}'
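
The same call can be scripted. This is a rough Python equivalent of the curl command, using only the standard library. The base URL is left as a parameter because the host is elided above, and the function names are hypothetical.

```python
import json
import urllib.request
from typing import Optional


def logger_level_request(
    actuator_base: str, logger_name: str, level: Optional[str]
) -> urllib.request.Request:
    """Build (but do not send) the POST that sets a logger's level; level=None resets it."""
    body = json.dumps({"configuredLevel": level}).encode("utf-8")
    return urllib.request.Request(
        url=f"{actuator_base}/loggers/{logger_name}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def set_logger_level(actuator_base: str, logger_name: str, level: Optional[str]) -> int:
    """Send the request and return the HTTP status (Actuator answers 204 on success)."""
    request = logger_level_request(actuator_base, logger_name, level)
    with urllib.request.urlopen(request, timeout=5) as resp:
        return resp.status
```

Passing `None` as the level sends `"configuredLevel": null`, which returns the logger to its inherited level once debugging is done.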

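For troubleshooting (optional)

threaddump - thread dump
heapdump - heap dump (caution: the file is large and producing it uses a lot of memory)

I have looked at the dashboards, but I can't yet tell what is what, so I need to work it out while running tests.

It is nice that I can measure quantitative metrics while improving performance, and see the problem areas right away with numbers attached.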

https://techblog.woowahan.com/9232/
https://toss.tech/article/how-to-work-health-check-in-spring-boot-actuator
