SAGA Pattern: Managing Distributed Transactions in Spring Boot Microservices

🎯 Introduction

In the era of microservices architecture, managing transactions across multiple services presents significant challenges. Traditional distributed transaction mechanisms like Two-Phase Commit (2PC) often lead to tight coupling, reduced availability, and poor performance. The SAGA Pattern emerges as a powerful alternative, providing a way to manage distributed transactions through a sequence of local transactions, each with compensating actions for rollback scenarios.

📚 What is the SAGA Pattern?

🔍 Core Concepts

The SAGA pattern is a design pattern for managing long-running distributed transactions across multiple microservices. Instead of using distributed locks or two-phase commits, SAGA breaks down a complex business transaction into a series of smaller, local transactions that can be independently committed or rolled back.

graph TD
    A[Start SAGA Transaction] --> B[Local Transaction 1]
    B --> C{Success?}
    C -->|Yes| D[Local Transaction 2]
    C -->|No| E[Compensate T1]
    D --> F{Success?}
    F -->|Yes| G[Local Transaction 3]
    F -->|No| H[Compensate T2]
    H --> I[Compensate T1]
    G --> J{Success?}
    J -->|Yes| K[SAGA Complete]
    J -->|No| L[Compensate T3]
    L --> M[Compensate T2]
    M --> N[Compensate T1]

    style A fill:#ff6b6b
    style K fill:#4ecdc4
    style E fill:#feca57
    style I fill:#feca57
    style N fill:#feca57

🏗️ SAGA vs Traditional Approaches

Key Differences:

AspectTwo-Phase CommitSAGA Pattern
ConsistencyStrong (ACID)Eventually Consistent
AvailabilityReduced during locksHigh availability
PerformanceBlocking operationsNon-blocking
ComplexityProtocol complexityBusiness logic complexity
Fault ToleranceCoordinator failure risksResilient to service failures
ScalabilityLimited by lock durationHighly scalable

📋 SAGA Properties

SAGA transactions must satisfy:

  1. Atomicity: Either all transactions complete, or all compensations are executed
  2. Consistency: System maintains consistent state through compensations
  3. Isolation: Individual transactions may see intermediate states
  4. Durability: Each local transaction and compensation is durable

🎭 SAGA Implementation Patterns

🎼 1. Orchestration Pattern

In the orchestration approach, a central SAGA Orchestrator coordinates the entire transaction flow, explicitly calling each service and managing the sequence of operations.

📋 Orchestration Architecture

graph TD
    A[SAGA Orchestrator] --> B[Order Service]
    A --> C[Payment Service]
    A --> D[Inventory Service]
    A --> E[Shipping Service]

    B --> F[Create Order]
    C --> G[Process Payment]
    D --> H[Reserve Inventory]
    E --> I[Schedule Shipment]

    F --> A
    G --> A
    H --> A
    I --> A

    A --> J[Success Path]
    A --> K[Compensation Path]

    K --> L[Cancel Shipment]
    K --> M[Release Inventory]
    K --> N[Refund Payment]
    K --> O[Cancel Order]

    style A fill:#ff6b6b
    style J fill:#4ecdc4
    style K fill:#feca57

🛠️ Java Spring Boot Implementation

SAGA Orchestrator Service:

  1@Service
  2@Transactional
  3public class OrderSagaOrchestrator {
  4
  5    private final OrderService orderService;
  6    private final PaymentService paymentService;
  7    private final InventoryService inventoryService;
  8    private final ShippingService shippingService;
  9    private final SagaTransactionRepository sagaRepository;
 10    private final ApplicationEventPublisher eventPublisher;
 11
 12    public SagaExecutionResult executeOrderSaga(OrderSagaRequest request) {
 13        String sagaId = UUID.randomUUID().toString();
 14        SagaTransaction saga = createSagaTransaction(sagaId, request);
 15
 16        try {
 17            return executeOrderFlow(saga);
 18        } catch (Exception e) {
 19            return executeCompensationFlow(saga, e);
 20        }
 21    }
 22
 23    private SagaExecutionResult executeOrderFlow(SagaTransaction saga) {
 24        // Step 1: Create Order
 25        SagaStep orderStep = executeStep(saga, "CREATE_ORDER", () -> {
 26            OrderResponse order = orderService.createOrder(saga.getRequest().toOrderRequest());
 27            saga.setOrderId(order.getOrderId());
 28            return order;
 29        });
 30
 31        if (!orderStep.isSuccess()) {
 32            return compensateFromStep(saga, orderStep);
 33        }
 34
 35        // Step 2: Reserve Inventory
 36        SagaStep inventoryStep = executeStep(saga, "RESERVE_INVENTORY", () -> {
 37            return inventoryService.reserveItems(
 38                saga.getRequest().getItems(),
 39                saga.getOrderId()
 40            );
 41        });
 42
 43        if (!inventoryStep.isSuccess()) {
 44            return compensateFromStep(saga, inventoryStep);
 45        }
 46
 47        // Step 3: Process Payment
 48        SagaStep paymentStep = executeStep(saga, "PROCESS_PAYMENT", () -> {
 49            return paymentService.processPayment(
 50                saga.getRequest().getPaymentDetails(),
 51                saga.getOrderId(),
 52                saga.getRequest().getTotalAmount()
 53            );
 54        });
 55
 56        if (!paymentStep.isSuccess()) {
 57            return compensateFromStep(saga, paymentStep);
 58        }
 59
 60        // Step 4: Schedule Shipping
 61        SagaStep shippingStep = executeStep(saga, "SCHEDULE_SHIPPING", () -> {
 62            return shippingService.scheduleShipping(
 63                saga.getOrderId(),
 64                saga.getRequest().getShippingAddress()
 65            );
 66        });
 67
 68        if (!shippingStep.isSuccess()) {
 69            return compensateFromStep(saga, shippingStep);
 70        }
 71
 72        // All steps successful
 73        saga.setStatus(SagaStatus.COMPLETED);
 74        sagaRepository.save(saga);
 75
 76        eventPublisher.publishEvent(new OrderSagaCompletedEvent(saga.getSagaId()));
 77
 78        return SagaExecutionResult.success(saga);
 79    }
 80
 81    private SagaStep executeStep(SagaTransaction saga, String stepName, Supplier<Object> operation) {
 82        SagaStep step = SagaStep.builder()
 83            .stepName(stepName)
 84            .sagaId(saga.getSagaId())
 85            .status(SagaStepStatus.IN_PROGRESS)
 86            .startTime(LocalDateTime.now())
 87            .build();
 88
 89        try {
 90            Object result = operation.get();
 91            step.setStatus(SagaStepStatus.COMPLETED);
 92            step.setResult(result);
 93            step.setEndTime(LocalDateTime.now());
 94
 95            saga.addStep(step);
 96            sagaRepository.save(saga);
 97
 98            return step;
 99
100        } catch (Exception e) {
101            step.setStatus(SagaStepStatus.FAILED);
102            step.setErrorMessage(e.getMessage());
103            step.setEndTime(LocalDateTime.now());
104
105            saga.addStep(step);
106            sagaRepository.save(saga);
107
108            throw new SagaStepExecutionException(stepName, e);
109        }
110    }
111
112    private SagaExecutionResult compensateFromStep(SagaTransaction saga, SagaStep failedStep) {
113        saga.setStatus(SagaStatus.COMPENSATING);
114        sagaRepository.save(saga);
115
116        // Execute compensations in reverse order
117        List<SagaStep> completedSteps = saga.getSteps().stream()
118            .filter(step -> step.getStatus() == SagaStepStatus.COMPLETED)
119            .sorted(Comparator.comparing(SagaStep::getStartTime).reversed())
120            .toList();
121
122        for (SagaStep step : completedSteps) {
123            try {
124                executeCompensation(saga, step);
125            } catch (Exception e) {
126                // Log compensation failure but continue with other compensations
127                log.error("Compensation failed for step: {}, saga: {}",
128                    step.getStepName(), saga.getSagaId(), e);
129            }
130        }
131
132        saga.setStatus(SagaStatus.COMPENSATED);
133        sagaRepository.save(saga);
134
135        eventPublisher.publishEvent(new OrderSagaCompensatedEvent(saga.getSagaId(), failedStep.getStepName()));
136
137        return SagaExecutionResult.compensated(saga, failedStep.getErrorMessage());
138    }
139
140    private void executeCompensation(SagaTransaction saga, SagaStep step) {
141        switch (step.getStepName()) {
142            case "CREATE_ORDER":
143                orderService.cancelOrder(saga.getOrderId());
144                break;
145            case "RESERVE_INVENTORY":
146                inventoryService.releaseReservation(saga.getOrderId());
147                break;
148            case "PROCESS_PAYMENT":
149                paymentService.refundPayment(saga.getOrderId());
150                break;
151            case "SCHEDULE_SHIPPING":
152                shippingService.cancelShipment(saga.getOrderId());
153                break;
154            default:
155                log.warn("No compensation defined for step: {}", step.getStepName());
156        }
157    }
158}

SAGA Data Models:

 1@Entity
 2@Table(name = "saga_transactions")
 3public class SagaTransaction {
 4
 5    @Id
 6    private String sagaId;
 7
 8    @Enumerated(EnumType.STRING)
 9    private SagaStatus status;
10
11    @Column(name = "order_id")
12    private String orderId;
13
14    @Column(columnDefinition = "JSON")
15    @Convert(converter = OrderSagaRequestConverter.class)
16    private OrderSagaRequest request;
17
18    @OneToMany(mappedBy = "sagaTransaction", cascade = CascadeType.ALL, fetch = FetchType.LAZY)
19    private List<SagaStep> steps = new ArrayList<>();
20
21    @Column(name = "created_at")
22    private LocalDateTime createdAt;
23
24    @Column(name = "updated_at")
25    private LocalDateTime updatedAt;
26
27    public void addStep(SagaStep step) {
28        step.setSagaTransaction(this);
29        this.steps.add(step);
30        this.updatedAt = LocalDateTime.now();
31    }
32
33    // Constructors, getters, setters
34}
35
36@Entity
37@Table(name = "saga_steps")
38public class SagaStep {
39
40    @Id
41    @GeneratedValue(strategy = GenerationType.IDENTITY)
42    private Long id;
43
44    @Column(name = "step_name")
45    private String stepName;
46
47    @Enumerated(EnumType.STRING)
48    private SagaStepStatus status;
49
50    @Column(name = "start_time")
51    private LocalDateTime startTime;
52
53    @Column(name = "end_time")
54    private LocalDateTime endTime;
55
56    @Column(columnDefinition = "TEXT")
57    private String errorMessage;
58
59    @Column(columnDefinition = "JSON")
60    private String result;
61
62    @ManyToOne(fetch = FetchType.LAZY)
63    @JoinColumn(name = "saga_id", referencedColumnName = "sagaId")
64    private SagaTransaction sagaTransaction;
65
66    // Constructors, getters, setters
67}
68
69public enum SagaStatus {
70    STARTED, IN_PROGRESS, COMPLETED, COMPENSATING, COMPENSATED, FAILED
71}
72
73public enum SagaStepStatus {
74    PENDING, IN_PROGRESS, COMPLETED, FAILED, COMPENSATED
75}

REST Controller:

 1@RestController
 2@RequestMapping("/api/orders")
 3public class OrderSagaController {
 4
 5    private final OrderSagaOrchestrator orchestrator;
 6
 7    @PostMapping("/saga")
 8    public ResponseEntity<SagaResponse> createOrderWithSaga(@RequestBody OrderSagaRequest request) {
 9        try {
10            SagaExecutionResult result = orchestrator.executeOrderSaga(request);
11
12            if (result.isSuccess()) {
13                return ResponseEntity.ok(SagaResponse.success(result.getSagaId()));
14            } else {
15                return ResponseEntity.status(HttpStatus.BAD_REQUEST)
16                    .body(SagaResponse.failed(result.getSagaId(), result.getErrorMessage()));
17            }
18
19        } catch (Exception e) {
20            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
21                .body(SagaResponse.error("SAGA execution failed: " + e.getMessage()));
22        }
23    }
24
25    @GetMapping("/saga/{sagaId}")
26    public ResponseEntity<SagaStatusResponse> getSagaStatus(@PathVariable String sagaId) {
27        Optional<SagaTransaction> saga = orchestrator.getSagaStatus(sagaId);
28
29        if (saga.isPresent()) {
30            return ResponseEntity.ok(SagaStatusResponse.from(saga.get()));
31        } else {
32            return ResponseEntity.notFound().build();
33        }
34    }
35}

💃 2. Choreography Pattern

In the choreography approach, there’s no central coordinator. Each service publishes and listens to events, and the SAGA progresses through a series of event-driven interactions between services.

📋 Choreography Architecture

graph TD
    A[Order Created Event] --> B[Order Service]
    A --> C[Inventory Service]

    C --> D[Inventory Reserved Event]
    D --> E[Payment Service]

    E --> F[Payment Processed Event]
    F --> G[Shipping Service]

    G --> H[Shipment Scheduled Event]
    H --> I[Order Complete]

    J[Inventory Failed Event] --> K[Order Service]
    K --> L[Order Cancelled Event]

    M[Payment Failed Event] --> N[Inventory Service]
    N --> O[Inventory Released Event]
    O --> K

    P[Shipping Failed Event] --> Q[Payment Service]
    Q --> R[Payment Refunded Event]
    R --> N

    style I fill:#4ecdc4
    style L fill:#feca57
    style O fill:#feca57
    style R fill:#feca57

🛠️ Java Spring Boot Implementation

Event-Driven Order Service:

  1@Service
  2@Transactional
  3public class ChoreographyOrderService {
  4
  5    private final OrderRepository orderRepository;
  6    private final ApplicationEventPublisher eventPublisher;
  7    private final SagaEventRepository sagaEventRepository;
  8
  9    @EventListener
 10    public void handleOrderCreationRequest(OrderCreationRequestEvent event) {
 11        try {
 12            Order order = createOrder(event.getOrderRequest());
 13
 14            // Publish order created event
 15            OrderCreatedEvent orderCreatedEvent = OrderCreatedEvent.builder()
 16                .sagaId(event.getSagaId())
 17                .orderId(order.getId())
 18                .customerId(order.getCustomerId())
 19                .items(order.getItems())
 20                .totalAmount(order.getTotalAmount())
 21                .timestamp(LocalDateTime.now())
 22                .build();
 23
 24            recordSagaEvent(orderCreatedEvent);
 25            eventPublisher.publishEvent(orderCreatedEvent);
 26
 27        } catch (Exception e) {
 28            // Publish order creation failed event
 29            OrderCreationFailedEvent failedEvent = OrderCreationFailedEvent.builder()
 30                .sagaId(event.getSagaId())
 31                .reason(e.getMessage())
 32                .timestamp(LocalDateTime.now())
 33                .build();
 34
 35            recordSagaEvent(failedEvent);
 36            eventPublisher.publishEvent(failedEvent);
 37        }
 38    }
 39
 40    @EventListener
 41    public void handleInventoryFailure(InventoryReservationFailedEvent event) {
 42        try {
 43            // Cancel the order
 44            Order order = orderRepository.findBySagaId(event.getSagaId())
 45                .orElseThrow(() -> new OrderNotFoundException(event.getSagaId()));
 46
 47            order.setStatus(OrderStatus.CANCELLED);
 48            order.setCancellationReason("Inventory reservation failed: " + event.getReason());
 49            orderRepository.save(order);
 50
 51            // Publish order cancelled event
 52            OrderCancelledEvent cancelledEvent = OrderCancelledEvent.builder()
 53                .sagaId(event.getSagaId())
 54                .orderId(order.getId())
 55                .reason("Inventory unavailable")
 56                .timestamp(LocalDateTime.now())
 57                .build();
 58
 59            recordSagaEvent(cancelledEvent);
 60            eventPublisher.publishEvent(cancelledEvent);
 61
 62        } catch (Exception e) {
 63            log.error("Failed to handle inventory failure for saga: {}", event.getSagaId(), e);
 64        }
 65    }
 66
 67    @EventListener
 68    public void handlePaymentFailure(PaymentFailedEvent event) {
 69        // Similar compensation logic for payment failures
 70        handleCompensationScenario(event.getSagaId(), "Payment failed: " + event.getReason());
 71    }
 72
 73    @EventListener
 74    public void handleShippingFailure(ShippingFailedEvent event) {
 75        // Similar compensation logic for shipping failures
 76        handleCompensationScenario(event.getSagaId(), "Shipping failed: " + event.getReason());
 77    }
 78
 79    @EventListener
 80    public void handleSagaCompletion(ShipmentScheduledEvent event) {
 81        try {
 82            Order order = orderRepository.findBySagaId(event.getSagaId())
 83                .orElseThrow(() -> new OrderNotFoundException(event.getSagaId()));
 84
 85            order.setStatus(OrderStatus.CONFIRMED);
 86            order.setShipmentId(event.getShipmentId());
 87            orderRepository.save(order);
 88
 89            // Publish final completion event
 90            OrderSagaCompletedEvent completedEvent = OrderSagaCompletedEvent.builder()
 91                .sagaId(event.getSagaId())
 92                .orderId(order.getId())
 93                .timestamp(LocalDateTime.now())
 94                .build();
 95
 96            recordSagaEvent(completedEvent);
 97            eventPublisher.publishEvent(completedEvent);
 98
 99        } catch (Exception e) {
100            log.error("Failed to complete saga: {}", event.getSagaId(), e);
101        }
102    }
103
104    private void recordSagaEvent(SagaEvent event) {
105        SagaEventRecord record = SagaEventRecord.builder()
106            .sagaId(event.getSagaId())
107            .eventType(event.getClass().getSimpleName())
108            .eventData(JsonUtils.toJson(event))
109            .timestamp(event.getTimestamp())
110            .build();
111
112        sagaEventRepository.save(record);
113    }
114}

Inventory Service with Event Handling:

  1@Service
  2@Transactional
  3public class ChoreographyInventoryService {
  4
  5    private final InventoryRepository inventoryRepository;
  6    private final ReservationRepository reservationRepository;
  7    private final ApplicationEventPublisher eventPublisher;
  8
  9    @EventListener
 10    @Async
 11    public void handleOrderCreated(OrderCreatedEvent event) {
 12        try {
 13            // Check inventory availability
 14            boolean allItemsAvailable = event.getItems().stream()
 15                .allMatch(item -> checkInventoryAvailability(item.getProductId(), item.getQuantity()));
 16
 17            if (!allItemsAvailable) {
 18                publishInventoryFailure(event.getSagaId(), "Insufficient inventory");
 19                return;
 20            }
 21
 22            // Reserve inventory
 23            List<InventoryReservation> reservations = event.getItems().stream()
 24                .map(item -> reserveInventory(item, event.getSagaId(), event.getOrderId()))
 25                .toList();
 26
 27            // Publish success event
 28            InventoryReservedEvent reservedEvent = InventoryReservedEvent.builder()
 29                .sagaId(event.getSagaId())
 30                .orderId(event.getOrderId())
 31                .reservations(reservations.stream()
 32                    .map(this::toReservationInfo)
 33                    .toList())
 34                .timestamp(LocalDateTime.now())
 35                .build();
 36
 37            eventPublisher.publishEvent(reservedEvent);
 38
 39        } catch (Exception e) {
 40            publishInventoryFailure(event.getSagaId(), e.getMessage());
 41        }
 42    }
 43
 44    @EventListener
 45    public void handlePaymentFailure(PaymentFailedEvent event) {
 46        try {
 47            // Release inventory reservations
 48            List<InventoryReservation> reservations = reservationRepository
 49                .findBySagaId(event.getSagaId());
 50
 51            for (InventoryReservation reservation : reservations) {
 52                releaseReservation(reservation);
 53            }
 54
 55            // Publish inventory released event
 56            InventoryReleasedEvent releasedEvent = InventoryReleasedEvent.builder()
 57                .sagaId(event.getSagaId())
 58                .reason("Payment failed")
 59                .timestamp(LocalDateTime.now())
 60                .build();
 61
 62            eventPublisher.publishEvent(releasedEvent);
 63
 64        } catch (Exception e) {
 65            log.error("Failed to release inventory for saga: {}", event.getSagaId(), e);
 66        }
 67    }
 68
 69    private void publishInventoryFailure(String sagaId, String reason) {
 70        InventoryReservationFailedEvent failedEvent = InventoryReservationFailedEvent.builder()
 71            .sagaId(sagaId)
 72            .reason(reason)
 73            .timestamp(LocalDateTime.now())
 74            .build();
 75
 76        eventPublisher.publishEvent(failedEvent);
 77    }
 78
 79    private InventoryReservation reserveInventory(OrderItem item, String sagaId, String orderId) {
 80        // Update inventory count
 81        Inventory inventory = inventoryRepository.findByProductId(item.getProductId())
 82            .orElseThrow(() -> new ProductNotFoundException(item.getProductId()));
 83
 84        if (inventory.getAvailableQuantity() < item.getQuantity()) {
 85            throw new InsufficientInventoryException(item.getProductId(), item.getQuantity());
 86        }
 87
 88        inventory.setAvailableQuantity(inventory.getAvailableQuantity() - item.getQuantity());
 89        inventory.setReservedQuantity(inventory.getReservedQuantity() + item.getQuantity());
 90        inventoryRepository.save(inventory);
 91
 92        // Create reservation record
 93        InventoryReservation reservation = InventoryReservation.builder()
 94            .sagaId(sagaId)
 95            .orderId(orderId)
 96            .productId(item.getProductId())
 97            .quantity(item.getQuantity())
 98            .status(ReservationStatus.RESERVED)
 99            .createdAt(LocalDateTime.now())
100            .build();
101
102        return reservationRepository.save(reservation);
103    }
104}

Event Configuration with Spring Boot:

 1@Configuration
 2@EnableAsync
 3public class SagaEventConfiguration {
 4
 5    @Bean
 6    public ApplicationEventMulticaster applicationEventMulticaster() {
 7        SimpleApplicationEventMulticaster eventMulticaster = new SimpleApplicationEventMulticaster();
 8        eventMulticaster.setTaskExecutor(sagaEventExecutor());
 9        eventMulticaster.setErrorHandler(new SagaEventErrorHandler());
10        return eventMulticaster;
11    }
12
13    @Bean
14    public TaskExecutor sagaEventExecutor() {
15        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
16        executor.setCorePoolSize(10);
17        executor.setMaxPoolSize(50);
18        executor.setQueueCapacity(200);
19        executor.setThreadNamePrefix("saga-event-");
20        executor.setWaitForTasksToCompleteOnShutdown(true);
21        executor.setAwaitTerminationSeconds(60);
22        executor.initialize();
23        return executor;
24    }
25
26    @Component
27    public static class SagaEventErrorHandler implements ErrorHandler {
28
29        private final Logger log = LoggerFactory.getLogger(SagaEventErrorHandler.class);
30
31        @Override
32        public void handleError(Throwable throwable) {
33            log.error("Error processing SAGA event", throwable);
34            // Implement dead letter queue or retry mechanism
35        }
36    }
37}

🔄 Advanced SAGA Patterns

📊 SAGA State Management

For complex scenarios, implementing explicit state management helps track SAGA progress and handle failures more effectively.

 1@Entity
 2@Table(name = "saga_state")
 3public class SagaState {
 4
 5    @Id
 6    private String sagaId;
 7
 8    @Enumerated(EnumType.STRING)
 9    private SagaPhase currentPhase;
10
11    @Column(columnDefinition = "JSON")
12    @Convert(converter = SagaContextConverter.class)
13    private SagaContext context;
14
15    @ElementCollection
16    @Enumerated(EnumType.STRING)
17    private Set<SagaStepType> completedSteps = new HashSet<>();
18
19    @ElementCollection
20    @Enumerated(EnumType.STRING)
21    private Set<SagaStepType> compensatedSteps = new HashSet<>();
22
23    @Column(name = "created_at")
24    private LocalDateTime createdAt;
25
26    @Column(name = "updated_at")
27    private LocalDateTime updatedAt;
28
29    public boolean isStepCompleted(SagaStepType stepType) {
30        return completedSteps.contains(stepType);
31    }
32
33    public boolean isStepCompensated(SagaStepType stepType) {
34        return compensatedSteps.contains(stepType);
35    }
36
37    public void markStepCompleted(SagaStepType stepType) {
38        completedSteps.add(stepType);
39        updatedAt = LocalDateTime.now();
40    }
41
42    public void markStepCompensated(SagaStepType stepType) {
43        compensatedSteps.add(stepType);
44        completedSteps.remove(stepType);
45        updatedAt = LocalDateTime.now();
46    }
47}
48
49public enum SagaPhase {
50    STARTED,
51    ORDER_CREATION,
52    INVENTORY_RESERVATION,
53    PAYMENT_PROCESSING,
54    SHIPPING_SCHEDULING,
55    COMPLETED,
56    COMPENSATING,
57    COMPENSATED,
58    FAILED
59}
60
61public enum SagaStepType {
62    CREATE_ORDER,
63    RESERVE_INVENTORY,
64    PROCESS_PAYMENT,
65    SCHEDULE_SHIPPING
66}

🔄 SAGA Recovery Mechanism

Implementing recovery mechanisms for handling partial failures and system restarts:

 1@Component
 2@Scheduled(fixedRate = 30000) // Run every 30 seconds
 3public class SagaRecoveryService {
 4
 5    private final SagaStateRepository sagaStateRepository;
 6    private final OrderSagaOrchestrator orchestrator;
 7    private final SagaEventPublisher eventPublisher;
 8
 9    @Scheduled(fixedRate = 30000)
10    public void recoverStuckSagas() {
11        LocalDateTime cutoffTime = LocalDateTime.now().minusMinutes(10);
12
13        List<SagaState> stuckSagas = sagaStateRepository
14            .findByUpdatedAtBeforeAndCurrentPhaseIn(
15                cutoffTime,
16                Arrays.asList(SagaPhase.ORDER_CREATION, SagaPhase.INVENTORY_RESERVATION,
17                             SagaPhase.PAYMENT_PROCESSING, SagaPhase.SHIPPING_SCHEDULING)
18            );
19
20        for (SagaState saga : stuckSagas) {
21            try {
22                recoverSaga(saga);
23            } catch (Exception e) {
24                log.error("Failed to recover saga: {}", saga.getSagaId(), e);
25            }
26        }
27    }
28
29    private void recoverSaga(SagaState saga) {
30        log.info("Recovering stuck saga: {}, phase: {}", saga.getSagaId(), saga.getCurrentPhase());
31
32        switch (saga.getCurrentPhase()) {
33            case ORDER_CREATION:
34                if (!saga.isStepCompleted(SagaStepType.CREATE_ORDER)) {
35                    retryOrderCreation(saga);
36                } else {
37                    moveToNextPhase(saga, SagaPhase.INVENTORY_RESERVATION);
38                }
39                break;
40
41            case INVENTORY_RESERVATION:
42                if (!saga.isStepCompleted(SagaStepType.RESERVE_INVENTORY)) {
43                    retryInventoryReservation(saga);
44                } else {
45                    moveToNextPhase(saga, SagaPhase.PAYMENT_PROCESSING);
46                }
47                break;
48
49            case PAYMENT_PROCESSING:
50                if (!saga.isStepCompleted(SagaStepType.PROCESS_PAYMENT)) {
51                    retryPaymentProcessing(saga);
52                } else {
53                    moveToNextPhase(saga, SagaPhase.SHIPPING_SCHEDULING);
54                }
55                break;
56
57            case SHIPPING_SCHEDULING:
58                if (!saga.isStepCompleted(SagaStepType.SCHEDULE_SHIPPING)) {
59                    retryShippingScheduling(saga);
60                } else {
61                    moveToNextPhase(saga, SagaPhase.COMPLETED);
62                }
63                break;
64
65            default:
66                log.warn("Unknown phase for recovery: {}", saga.getCurrentPhase());
67        }
68    }
69
70    private void retryOrderCreation(SagaState saga) {
71        try {
72            eventPublisher.publishOrderCreationRetry(saga.getSagaId(), saga.getContext());
73        } catch (Exception e) {
74            initiateSagaCompensation(saga, "Order creation retry failed");
75        }
76    }
77
78    private void initiateSagaCompensation(SagaState saga, String reason) {
79        saga.setCurrentPhase(SagaPhase.COMPENSATING);
80        sagaStateRepository.save(saga);
81
82        eventPublisher.publishSagaCompensationInitiated(saga.getSagaId(), reason);
83    }
84}

📈 Performance Optimization & Monitoring

🔍 SAGA Metrics and Monitoring

 1@Component
 2public class SagaMetrics {
 3
 4    private final MeterRegistry meterRegistry;
 5    private final Timer sagaExecutionTimer;
 6    private final Counter sagaSuccessCounter;
 7    private final Counter sagaFailureCounter;
 8    private final Gauge activeSagasGauge;
 9
10    public SagaMetrics(MeterRegistry meterRegistry, SagaStateRepository sagaStateRepository) {
11        this.meterRegistry = meterRegistry;
12        this.sagaExecutionTimer = Timer.builder("saga.execution.time")
13            .description("Time taken to execute SAGA transactions")
14            .register(meterRegistry);
15
16        this.sagaSuccessCounter = Counter.builder("saga.executions")
17            .tag("result", "success")
18            .description("Successful SAGA executions")
19            .register(meterRegistry);
20
21        this.sagaFailureCounter = Counter.builder("saga.executions")
22            .tag("result", "failure")
23            .description("Failed SAGA executions")
24            .register(meterRegistry);
25
26        this.activeSagasGauge = Gauge.builder("saga.active.count")
27            .description("Number of active SAGA transactions")
28            .register(meterRegistry, sagaStateRepository, repo ->
29                repo.countByCurrentPhaseIn(Arrays.asList(
30                    SagaPhase.STARTED, SagaPhase.ORDER_CREATION,
31                    SagaPhase.INVENTORY_RESERVATION, SagaPhase.PAYMENT_PROCESSING,
32                    SagaPhase.SHIPPING_SCHEDULING)));
33    }
34
35    public Timer.Sample startSagaTimer() {
36        return Timer.start(meterRegistry);
37    }
38
39    public void recordSagaSuccess(Timer.Sample sample, String sagaType) {
40        sample.stop(Timer.builder("saga.execution.time")
41            .tag("type", sagaType)
42            .tag("result", "success")
43            .register(meterRegistry));
44        sagaSuccessCounter.increment();
45    }
46
47    public void recordSagaFailure(Timer.Sample sample, String sagaType, String errorType) {
48        sample.stop(Timer.builder("saga.execution.time")
49            .tag("type", sagaType)
50            .tag("result", "failure")
51            .tag("error", errorType)
52            .register(meterRegistry));
53        sagaFailureCounter.increment();
54    }
55
56    public void recordCompensationEvent(String stepType, String reason) {
57        Counter.builder("saga.compensations")
58            .tag("step", stepType)
59            .tag("reason", reason)
60            .register(meterRegistry)
61            .increment();
62    }
63}

🔧 Configuration Optimization

 1# application.yml
 2spring:
 3  application:
 4    name: saga-orchestrator
 5
 6  datasource:
 7    hikari:
 8      maximum-pool-size: 50
 9      minimum-idle: 10
10      connection-timeout: 30000
11      idle-timeout: 600000
12      max-lifetime: 1800000
13
14  kafka:
15    producer:
16      bootstrap-servers: ${KAFKA_BOOTSTRAP_SERVERS:localhost:9092}
17      key-serializer: org.apache.kafka.common.serialization.StringSerializer
18      value-serializer: org.springframework.kafka.support.serializer.JsonSerializer
19      acks: all
20      retries: 3
21      batch-size: 16384
22      linger-ms: 5
23      buffer-memory: 33554432
24
25    consumer:
26      bootstrap-servers: ${KAFKA_BOOTSTRAP_SERVERS:localhost:9092}
27      key-deserializer: org.apache.kafka.common.serialization.StringDeserializer
28      value-deserializer: org.springframework.kafka.support.serializer.JsonDeserializer
29      properties:
30        spring.json.trusted.packages: "com.example.saga.events"
31      group-id: saga-choreography-group
32      auto-offset-reset: earliest
33      enable-auto-commit: false
34      max-poll-records: 10
35
36saga:
37  orchestration:
38    timeout: 300000 # 5 minutes
39    retry:
40      max-attempts: 3
41      backoff-delay: 1000
42      multiplier: 2
43
44  choreography:
45    event-store:
46      retention-days: 30
47    recovery:
48      check-interval: 30000
49      stuck-timeout: 600000 # 10 minutes
50
51  compensation:
52    timeout: 60000 # 1 minute per step
53    max-retries: 3
54
55management:
56  endpoints:
57    web:
58      exposure:
59        include: health, metrics, prometheus, saga-status
60  endpoint:
61    health:
62      show-details: always
63  metrics:
64    export:
65      prometheus:
66        enabled: true

⚖️ SAGA Pattern Trade-offs

📊 Orchestration vs Choreography Comparison

graph TD
    A[SAGA Implementation Choice] --> B{System Characteristics}

    B -->|Simple Flow, Few Services| C[Orchestration]
    B -->|Complex Flow, Many Services| D[Choreography]
    B -->|Mixed Requirements| E[Hybrid Approach]

    C --> F[Centralized Control]
    C --> G[Easy Debugging]
    C --> H[Single Point of Failure]

    D --> I[Distributed Control]
    D --> J[High Resilience]
    D --> K[Complex Event Chains]

    E --> L[Service-Specific Choice]
    E --> M[Gradual Migration]

    style C fill:#4ecdc4
    style D fill:#45b7d1
    style E fill:#feca57
    style H fill:#ff6b35
    style K fill:#ff9ff3

📈 Performance Comparison

AspectOrchestrationChoreographyTwo-Phase Commit
LatencyMediumLowHigh
ThroughputMediumHighLow
ComplexityMediumHighLow
DebuggingEasyDifficultEasy
Fault ToleranceMediumHighLow
ConsistencyEventualEventualStrong

✅ SAGA Pattern Pros & Cons

Advantages:

  • High Availability: No blocking distributed locks
  • Scalability: Services can scale independently
  • Fault Tolerance: Resilient to individual service failures
  • Performance: Better throughput than 2PC
  • Flexibility: Business logic can be distributed

Disadvantages:

  • Complexity: More complex error handling and compensation logic
  • Eventual Consistency: Temporary inconsistent states during execution
  • Debugging: Harder to trace and debug distributed flows
  • Data Isolation: Lack of isolation between concurrent SAGAs
  • Compensation Logic: Requires careful design of compensating actions

🎯 Best Practices and Recommendations

🔧 Implementation Guidelines

1. Idempotency Design:

 1@Service
 2public class IdempotentPaymentService {
 3
 4    private final PaymentRepository paymentRepository;
 5    private final IdempotencyKeyRepository idempotencyKeyRepository;
 6
 7    @Transactional
 8    public PaymentResult processPayment(String idempotencyKey, PaymentRequest request) {
 9        // Check for existing payment with same idempotency key
10        Optional<Payment> existingPayment = paymentRepository
11            .findByIdempotencyKey(idempotencyKey);
12
13        if (existingPayment.isPresent()) {
14            return PaymentResult.fromExisting(existingPayment.get());
15        }
16
17        // Record idempotency key to prevent concurrent processing
18        IdempotencyKey key = new IdempotencyKey(idempotencyKey, PaymentRequest.class.getSimpleName());
19        idempotencyKeyRepository.saveWithLock(key);
20
21        try {
22            Payment payment = executePayment(request);
23            payment.setIdempotencyKey(idempotencyKey);
24            paymentRepository.save(payment);
25
26            return PaymentResult.success(payment);
27
28        } catch (Exception e) {
29            idempotencyKeyRepository.delete(key);
30            throw new PaymentProcessingException("Payment failed", e);
31        }
32    }
33}

2. Compensation Design:

 1public interface CompensatingAction {
 2
 3    CompensationResult execute(CompensationContext context);
 4
 5    boolean isCompensatable(CompensationContext context);
 6
 7    default int getRetryAttempts() {
 8        return 3;
 9    }
10
11    default Duration getRetryDelay() {
12        return Duration.ofSeconds(5);
13    }
14}
15
16@Component
17public class OrderCancellationCompensation implements CompensatingAction {
18
19    @Override
20    public CompensationResult execute(CompensationContext context) {
21        try {
22            String orderId = context.getOrderId();
23            Order order = orderRepository.findById(orderId)
24                .orElse(null);
25
26            if (order == null) {
27                return CompensationResult.success("Order already cancelled or not found");
28            }
29
30            if (order.getStatus() == OrderStatus.SHIPPED) {
31                // Cannot cancel shipped orders - need manual intervention
32                return CompensationResult.failed("Cannot cancel shipped order - manual intervention required");
33            }
34
35            order.cancel("SAGA compensation");
36            orderRepository.save(order);
37
38            return CompensationResult.success("Order cancelled successfully");
39
40        } catch (Exception e) {
41            return CompensationResult.failed("Failed to cancel order: " + e.getMessage());
42        }
43    }
44
45    @Override
46    public boolean isCompensatable(CompensationContext context) {
47        // Check if compensation is still possible
48        String orderId = context.getOrderId();
49        return orderRepository.findById(orderId)
50            .map(order -> order.getStatus() != OrderStatus.SHIPPED)
51            .orElse(false);
52    }
53}

3. Testing Strategy:

 1@SpringBootTest
 2@TestPropertySource(properties = {
 3    "saga.orchestration.timeout=5000",
 4    "saga.compensation.timeout=3000"
 5})
 6class SagaIntegrationTest {
 7
 8    @Autowired
 9    private OrderSagaOrchestrator orchestrator;
10
11    @MockBean
12    private PaymentService paymentService;
13
14    @Test
15    void testSuccessfulSagaExecution() {
16        // Arrange
17        OrderSagaRequest request = createValidOrderRequest();
18
19        // Act
20        SagaExecutionResult result = orchestrator.executeOrderSaga(request);
21
22        // Assert
23        assertThat(result.isSuccess()).isTrue();
24        assertThat(result.getSaga().getStatus()).isEqualTo(SagaStatus.COMPLETED);
25    }
26
27    @Test
28    void testSagaCompensationOnPaymentFailure() {
29        // Arrange
30        OrderSagaRequest request = createValidOrderRequest();
31        when(paymentService.processPayment(any(), any(), any()))
32            .thenThrow(new PaymentProcessingException("Insufficient funds"));
33
34        // Act
35        SagaExecutionResult result = orchestrator.executeOrderSaga(request);
36
37        // Assert
38        assertThat(result.isSuccess()).isFalse();
39        assertThat(result.getSaga().getStatus()).isEqualTo(SagaStatus.COMPENSATED);
40
41        // Verify compensations were called
42        verify(inventoryService).releaseReservation(any());
43        verify(orderService).cancelOrder(any());
44    }
45
46    @Test
47    void testConcurrentSagaExecution() throws InterruptedException {
48        // Test multiple concurrent SAGAs to ensure isolation
49        int threadCount = 10;
50        CountDownLatch latch = new CountDownLatch(threadCount);
51        AtomicInteger successCount = new AtomicInteger(0);
52
53        ExecutorService executor = Executors.newFixedThreadPool(threadCount);
54
55        for (int i = 0; i < threadCount; i++) {
56            executor.submit(() -> {
57                try {
58                    OrderSagaRequest request = createValidOrderRequest();
59                    SagaExecutionResult result = orchestrator.executeOrderSaga(request);
60
61                    if (result.isSuccess()) {
62                        successCount.incrementAndGet();
63                    }
64                } finally {
65                    latch.countDown();
66                }
67            });
68        }
69
70        latch.await(30, TimeUnit.SECONDS);
71        assertThat(successCount.get()).isGreaterThan(0);
72    }
73}

🎯 Conclusion

The SAGA pattern provides a robust alternative to traditional distributed transaction mechanisms, offering better availability and scalability at the cost of increased complexity. When implementing SAGA patterns in Spring Boot applications:

🏆 Key Takeaways

  1. Choose the Right Pattern: Orchestration for simpler, centralized control; Choreography for distributed, resilient systems
  2. Design for Idempotency: Ensure all operations can be safely retried
  3. Implement Comprehensive Compensation: Plan for all possible failure scenarios
  4. Monitor and Observe: Use metrics and logging to track SAGA execution
  5. Test Thoroughly: Include failure scenarios and compensation paths in testing

🚀 When to Use SAGA Pattern

Ideal Scenarios:

  • Microservices architecture with multiple data stores
  • Long-running business processes
  • Systems requiring high availability
  • Applications with complex business workflows

Avoid SAGA When:

  • Strong consistency is mandatory
  • Simple, single-service transactions
  • Systems with simple failure scenarios
  • Applications requiring strict ACID properties

By implementing SAGA patterns correctly in your Spring Boot applications, you can achieve distributed transaction management that scales with your microservices architecture while maintaining business consistency requirements.