Unverified commit 3e923330, authored by Sammy Khamis, committed by GitHub

Refactor FxA state machine to simplify transitions (#7359)

parent a7555607
+39 −66
# The Public FxA State Machine
# The FxA State Machine

The public FxA state machine tracks a user's authentication state as they perform operations on their account.
The state machine, its states, and its events are visible to the consumer applications.
Applications generally track the state and update the UI based on it, for example providing a login button for the `Disconnected` state and a link to the FxA account management page for the `Connected` state.
The FxA state machine tracks a user's authentication state as they perform operations on their account.
The state machine, its states, and its events are visible to consumer applications (Firefox iOS, Firefox Android).
Apps generally watch the state and update the UI based on it - e.g. showing a login button for `Disconnected`, or a link to the FxA account management page for `Connected`.

The public state machine events correspond to user actions, for example clicking the login button or completing the OAuth flow.
The public state machine is non-deterministic -- from a given state and event, there are multiple possibilities for the next state.
Usually there are two possible transitions: one for a successful operation and one for a failed one.
For example, when completing an OAuth flow, if the operation is successful the state machine transitions to the `Connected` state, while if it fails it stays in the `Authenticating` state.
Events correspond to user actions or runtime triggers (clicking the login button, completing OAuth, recovering from an auth error). From a given state and event, the FSM may produce multiple possible next states depending on the result of underlying network calls, usually one for success and one for failure.

Here is an overview containing some of the states and transitions:
For example, when completing an OAuth flow: a successful `CompleteOAuthFlow` transitions from `Authenticating` to `Connected`; a failed one transitions back to the state we were authenticating from.

```mermaid
graph LR;
    Disconnected --> |"BeginOAuthFlow(Success)"| Authenticating
    Disconnected --> |"BeginOAuthFlow(Failure)"| Disconnected
    Disconnected --> |"BeginPairingFlow(Success)"| Authenticating
    Disconnected --> |"BeginPairingFlow(Failure)"| Disconnected
    Authenticating --> |"CompleteOAuthFlow(Success)"| Connected
    Authenticating --> |"CompleteOAuthFlow(Failure)"| Authenticating
    Authenticating --> |"CancelOAuthFlow"| Disconnected
    Connected --> |"Disconnect"| Disconnected

    classDef default fill:#0af, color:black, stroke:black
```
## High-level

# The Internal State Machines
There are two layers:

For each public state, we also define an internal state machine that represents the process of transitioning out of that state.
Internal state machine states correspond to `FirefoxAccount` method calls and events correspond to call results.
Unlike the public state machine, the internal state machines are deterministic meaning that each `(state, event)` pair always results in the same next state.
1. **`transitions.rs`**: each `match` arm does the work (calling methods on the `RetryingAccount` wrapper), attaches the target state for the error path with `.to_state_machine_err(|| target)?`, and returns the success state. The return type is `Result<FxaState, StateMachineErr>`; the `Err` variant carries both the error cause (for logging) and the target state to land in.
2. **`helpers.rs`**: the supporting types:
   - [`RetryingAccount`] wraps a `&mut FirefoxAccount` and exposes only the methods the FSM uses, with retry policy applied automatically. Holding a `&mut RetryingAccount` instead of a `&mut FirefoxAccount` makes it hard to call a network method without retry.
   - [`StateMachineErr`] + [`ResultExt::to_state_machine_err()`] extension trait give the `?` ergonomics for "on error, transition to this state".
   - [`RetryPolicy`] holds the network-retry count and auth-recovery flag.
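
The "on error, transition to this state" pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from `helpers.rs` (which is collapsed in this diff): the simplified `FxaState` and `Error` types, the field names on `StateMachineErr`, and the `begin_oauth_flow_from_disconnected` function are all assumptions made for the example.

```rust
// Simplified stand-ins for the real types (assumptions for this sketch).
#[derive(Debug, Clone, PartialEq)]
pub enum FxaState {
    Disconnected,
    Authenticating { oauth_url: String },
    Connected,
    AuthIssues,
}

#[derive(Debug, PartialEq)]
pub struct Error(pub String);

/// Carries the error cause (for logging) and the state to land in.
/// The field names here are hypothetical.
#[derive(Debug, PartialEq)]
pub struct StateMachineErr {
    pub cause: Error,
    pub target: FxaState,
}

/// Extension trait that lets a transition arm attach a target state to an
/// error and then bail out with `?`.
pub trait ResultExt<T> {
    fn to_state_machine_err(
        self,
        target: impl FnOnce() -> FxaState,
    ) -> Result<T, StateMachineErr>;
}

impl<T> ResultExt<T> for Result<T, Error> {
    fn to_state_machine_err(
        self,
        target: impl FnOnce() -> FxaState,
    ) -> Result<T, StateMachineErr> {
        // The closure only runs on the error path.
        self.map_err(|cause| StateMachineErr { cause, target: target() })
    }
}

/// Hypothetical transition arm: `call` stands in for the result of a
/// `RetryingAccount` method that returns the OAuth URL.
pub fn begin_oauth_flow_from_disconnected(
    call: Result<String, Error>,
) -> Result<FxaState, StateMachineErr> {
    // Do the work, attach the error-path target, return the success state.
    let oauth_url = call.to_state_machine_err(|| FxaState::Disconnected)?;
    Ok(FxaState::Authenticating { oauth_url })
}
```

Taking the target as a closure means the target state is only constructed when the call actually fails, which matters when building it requires cloning data out of the current state.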

There are two terminal states for the internal state machines:
  - `Complete(new_state)`: Complete the process and transition the public state machine to a new state
  - `Cancel`: Cancel the process and don't change the current public state.
The driver in `mod.rs` validates the `Initialize` invariant, builds a `RetryingAccount`, calls `transition()` once, routes the error (if any) through `convert_log_report_error` for logging/Sentry, commits the new state, and fires `on_auth_issues()` if applicable.

Here are some example internal state machines:
Adding a new event is straightforward: add a `match` arm in `transition()`. If the event needs a new account method, add a one-line wrapper to `RetryingAccount` — that's the moment to think about retry semantics for the new operation.

## Disconnected
## State diagram

```mermaid
graph TD;
    Authenticating["Complete(Authenticating)"]:::terminal
    BeginOAuthFlow --> |BeginOAuthFlowSuccess| Authenticating
    BeginPairingFlow --> |BeginPairingFlowSuccess| Authenticating
    BeginOAuthFlow --> |Error| Cancel:::terminal
    BeginPairingFlow --> |Error| Cancel:::terminal
graph LR;
    Uninitialized -->|"Initialize"| Disconnected
    Uninitialized -->|"Initialize"| Connected
    Uninitialized -->|"Initialize"| AuthIssues
    Disconnected -->|"BeginOAuthFlow / BeginPairingFlow (Ok)"| Authenticating
    Disconnected -->|"BeginOAuthFlow / BeginPairingFlow (Err)"| Disconnected
    Authenticating -->|"CompleteOAuthFlow (Ok)"| Connected
    Authenticating -->|"CompleteOAuthFlow / Begin*Flow (Err) → initial_state"| InitialState[Disconnected / Connected / AuthIssues]
    Authenticating -->|"CancelOAuthFlow → initial_state"| InitialState
    Authenticating -->|"InitializeDevice (Err)"| Disconnected
    Authenticating -->|"Disconnect"| Disconnected
    Connected -->|"Disconnect"| Disconnected
    Connected -->|"BeginOAuthFlow (Ok) — new OAuth flow"| Authenticating
    Connected -->|"CheckAuthorizationStatus (inactive / Err)"| AuthIssues
    Connected -->|"CallGetProfile (Err)"| AuthIssues
    AuthIssues -->|"BeginOAuthFlow (Ok)"| Authenticating
    AuthIssues -->|"Disconnect"| Disconnected

    classDef default fill:#0af, color:black, stroke:black
    classDef terminal fill:#FC766A, stroke: black;
```

## Authenticating
`Authenticating { initial_state }` tracks where the user came from. Error and cancel paths from `Authenticating` return to `initial_state.into()` (not always `Disconnected`) — so a re-auth attempt from `AuthIssues` that the user cancels lands back at `AuthIssues`, and an OAuth flow started from `Connected` that errors out keeps the user at `Connected`. The exception is `InitializeDevice` errors, which always land at `Disconnected`. A `CompleteOAuthFlow` success from `Authenticating { initial_state: Connected }` skips `InitializeDevice` because the device is already initialized.
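
The routing rules in the paragraph above can be condensed into a small sketch. The `FxaRustAuthState` and `FxaState` names and the `oauth_url` / `initial_state` fields appear elsewhere in this diff; the `Exit` enum and both functions are hypothetical helpers introduced only to illustrate the rules, and the real logic lives in the `match` arms of the collapsed `transitions.rs`.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FxaRustAuthState {
    Disconnected,
    Connected,
    AuthIssues,
}

#[derive(Debug, Clone, PartialEq)]
pub enum FxaState {
    Disconnected,
    Authenticating { oauth_url: String, initial_state: FxaRustAuthState },
    Connected,
    AuthIssues,
}

impl From<FxaRustAuthState> for FxaState {
    fn from(state: FxaRustAuthState) -> Self {
        match state {
            FxaRustAuthState::Disconnected => FxaState::Disconnected,
            FxaRustAuthState::Connected => FxaState::Connected,
            FxaRustAuthState::AuthIssues => FxaState::AuthIssues,
        }
    }
}

/// Hypothetical summary of the ways to leave `Authenticating` without success.
pub enum Exit {
    Cancelled,
    CompleteOAuthFlowFailed,
    InitializeDeviceFailed,
}

/// Where the machine lands when `Authenticating { initial_state, .. }` is
/// left via an error or a cancel.
pub fn state_after_exit(initial_state: FxaRustAuthState, exit: Exit) -> FxaState {
    match exit {
        // The one exception: device initialization failures always disconnect.
        Exit::InitializeDeviceFailed => FxaState::Disconnected,
        // Everything else returns to where the user came from.
        Exit::Cancelled | Exit::CompleteOAuthFlowFailed => initial_state.into(),
    }
}

/// A successful `CompleteOAuthFlow` skips `InitializeDevice` when the flow
/// started from `Connected`, because the device is already initialized.
pub fn needs_initialize_device(initial_state: FxaRustAuthState) -> bool {
    initial_state != FxaRustAuthState::Connected
}
```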

```mermaid
graph TD;
    Connected["Complete(Connected)"]:::terminal
    CompleteOAuthFlow --> |CompleteOAuthFlowSuccess| InitializeDevice
    CompleteOAuthFlow --> |Error| Cancel:::terminal
    InitializeDevice --> |InitializeDeviceSuccess| Connected
    InitializeDevice --> |Error| Cancel:::terminal

    classDef default fill:#0af, color:black, stroke:black
    classDef terminal fill:#FC766A, stroke: black;
```
## Retry behavior

## Uninitialized
`RetryingAccount` applies this policy:

This is the initial state for the public state machine (not shown in the diagram above).
- **Network errors** retry up to 3 times.
- **Auth errors** trigger a single recovery attempt: clear the access token cache, call `check_authorization_status`, and (if still active) retry the operation once.
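
As a rough illustration of the policy above, the sketch below wraps an arbitrary operation in a retry loop. It is not the `RetryingAccount` implementation: the three-variant `Error` enum, the `RetryPolicy` field names, and the `call_with_retry` function are assumptions, and `recover_auth` stands in for "clear the access token cache and call `check_authorization_status`", returning `true` if the account is still active. The sketch also assumes "retry up to 3 times" means up to three retries after the first attempt.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Error {
    Network,
    Auth,
    Other,
}

/// Hypothetical shape: a network-retry count and an auth-recovery flag.
pub struct RetryPolicy {
    pub network_retries: u32,
    pub auth_recovery: bool,
}

pub fn call_with_retry<T>(
    policy: &RetryPolicy,
    mut op: impl FnMut() -> Result<T, Error>,
    mut recover_auth: impl FnMut() -> bool,
) -> Result<T, Error> {
    let mut network_retries_left = policy.network_retries;
    let mut auth_recovery_left = policy.auth_recovery;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Network errors: retry until the budget is used up.
            Err(Error::Network) if network_retries_left > 0 => {
                network_retries_left -= 1;
            }
            // Auth errors: one recovery attempt, then retry the operation.
            Err(Error::Auth) if auth_recovery_left => {
                auth_recovery_left = false;
                if !recover_auth() {
                    // The account is no longer active; give up.
                    return Err(Error::Auth);
                }
            }
            // Budget exhausted, recovery already used, or another error.
            Err(e) => return Err(e),
        }
    }
}
```

A method without auth recovery would be modelled here by setting `auth_recovery: false`, so an auth error is returned to the caller on the first occurrence.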

```mermaid
graph TD;
    Disconnected["Complete(Disconnected)"]:::terminal
    Connected["Complete(Connected)"]:::terminal
    AuthIssues["Complete(AuthIssues)"]:::terminal
    GetAuthState --> |"GetAuthStateSuccess(Disconnected)"| Disconnected:::terminal
    GetAuthState --> |"GetAuthStateSuccess(AuthIssues)"| AuthIssues:::terminal
    GetAuthState --> |"GetAuthStateSuccess(Connected)"| EnsureCapabilities
    EnsureCapabilities --> |EnsureCapabilitiesSuccess| Connected:::terminal
    EnsureCapabilities --> |Error| AuthIssues:::terminal

    classDef default fill:#0af, color:black, stroke:black
    classDef terminal fill:#FC766A, stroke: black;
```
Methods that auto-recover from auth errors: `complete_oauth_flow`, `begin_oauth_flow`, `begin_pairing_flow`, `get_profile`. Methods that don't (auth errors are FSM-recoverable, not operation-recoverable): `initialize_device`, `ensure_capabilities`, `check_authorization_status`. The `EnsureDeviceCapabilities` auth-error case is handled at the FSM level — the transition arm matches on the error and dispatches to `CheckAuthorizationStatus`.
+1 −38
@@ -10,7 +10,7 @@
//! Also, they must not use the string "auth" since Sentry will filter that out.
//! Use "ath" instead.

use super::{internal_machines, FxaEvent, FxaState};
use super::{FxaEvent, FxaState};
use std::fmt;

impl fmt::Display for FxaState {
@@ -41,40 +41,3 @@ impl fmt::Display for FxaEvent {
        write!(f, "{name}")
    }
}

impl fmt::Display for internal_machines::State {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            Self::GetAuthState => write!(f, "GetAthState"),
            Self::BeginOAuthFlow { .. } => write!(f, "BeginOAthFlow"),
            Self::BeginPairingFlow { .. } => write!(f, "BeginPairingFlow"),
            Self::CompleteOAuthFlow { .. } => write!(f, "CompleteOAthFlow"),
            Self::InitializeDevice => write!(f, "InitializeDevice"),
            Self::EnsureDeviceCapabilities => write!(f, "EnsureDeviceCapabilities"),
            Self::CheckAuthorizationStatus => write!(f, "CheckAuthorizationStatus"),
            Self::Disconnect => write!(f, "Disconnect"),
            Self::GetProfile => write!(f, "GetProfile"),
            Self::Complete(state) => write!(f, "Complete({state})"),
            Self::Cancel => write!(f, "Cancel"),
        }
    }
}

impl fmt::Display for internal_machines::Event {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let name = match self {
            Self::GetAuthStateSuccess { .. } => "GetAthStateSuccess",
            Self::BeginOAuthFlowSuccess { .. } => "BeginOAthFlowSuccess",
            Self::BeginPairingFlowSuccess { .. } => "BeginPairingFlowSuccess",
            Self::CompleteOAuthFlowSuccess => "CompleteOAthFlowSuccess",
            Self::InitializeDeviceSuccess => "InitializeDeviceSuccess",
            Self::EnsureDeviceCapabilitiesSuccess => "EnsureDeviceCapabilitiesSuccess",
            Self::CheckAuthorizationStatusSuccess { .. } => "CheckAuthorizationStatusSuccess",
            Self::DisconnectSuccess => "DisconnectSuccess",
            Self::GetProfileSuccess => "GetProfileSuccess",
            Self::CallError => "CallError",
            Self::EnsureCapabilitiesAuthError => "EnsureCapabilitiesAthError",
        };
        write!(f, "{name}")
    }
}
+456 −0

File added.

Preview size limit exceeded, changes collapsed.

+0 −104
/* This Source Code Form is subject to the terms of the Mozilla Public
 * License, v. 2.0. If a copy of the MPL was not distributed with this
 * file, You can obtain one at http://mozilla.org/MPL/2.0/. */

use super::{invalid_transition, Event, InternalStateMachine, State};
use crate::{Error, FxaEvent, FxaRustAuthState, FxaState, Result};
use error_support::report_error;

pub struct AuthIssuesStateMachine;

// Save some typing
use Event::*;
use State::*;

impl InternalStateMachine for AuthIssuesStateMachine {
    fn initial_state(&self, event: FxaEvent) -> Result<State> {
        match event {
            FxaEvent::BeginOAuthFlow {
                service,
                scopes,
                entrypoint,
            } => Ok(BeginOAuthFlow {
                service: service.clone(),
                scopes: scopes.clone(),
                entrypoint: entrypoint.clone(),
                initial_state: FxaRustAuthState::AuthIssues,
            }),
            FxaEvent::Disconnect => Ok(Disconnect),
            e => Err(Error::InvalidStateTransition(format!("AuthIssues -> {e}"))),
        }
    }

    fn next_state(&self, state: State, event: Event) -> Result<State> {
        Ok(match (state, event) {
            (Disconnect, DisconnectSuccess) => Complete(FxaState::Disconnected),
            (Disconnect, CallError) => {
                // disconnect() is currently infallible, but let's handle errors anyway in case we
                // refactor it in the future.
                report_error!("fxa-state-machine-error", "saw CallError after Disconnect");
                Complete(FxaState::Disconnected)
            }
            (BeginOAuthFlow { .. }, BeginOAuthFlowSuccess { oauth_url }) => {
                Complete(FxaState::Authenticating {
                    oauth_url,
                    initial_state: FxaRustAuthState::AuthIssues,
                })
            }
            (BeginOAuthFlow { .. }, CallError) => Cancel,
            (state, event) => return invalid_transition(state, event),
        })
    }
}

#[cfg(test)]
mod test {
    use super::super::StateMachineTester;
    use super::*;

    #[test]
    fn test_reauthenticate() {
        let tester = StateMachineTester::new(
            AuthIssuesStateMachine,
            FxaEvent::BeginOAuthFlow {
                service: "service".to_owned(),
                scopes: vec!["profile".to_owned()],
                entrypoint: "test-entrypoint".to_owned(),
            },
        );

        assert_eq!(
            tester.state,
            BeginOAuthFlow {
                service: "service".to_owned(),
                scopes: vec!["profile".to_owned()],
                entrypoint: "test-entrypoint".to_owned(),
                initial_state: FxaRustAuthState::AuthIssues,
            }
        );
        assert_eq!(tester.peek_next_state(CallError), Cancel);
        assert_eq!(
            tester.peek_next_state(BeginOAuthFlowSuccess {
                oauth_url: "http://example.com/oauth-start".to_owned(),
            }),
            Complete(FxaState::Authenticating {
                oauth_url: "http://example.com/oauth-start".to_owned(),
                initial_state: FxaRustAuthState::AuthIssues,
            })
        );
    }

    #[test]
    fn test_disconnect() {
        let tester = StateMachineTester::new(AuthIssuesStateMachine, FxaEvent::Disconnect);
        assert_eq!(tester.state, Disconnect);
        assert_eq!(
            tester.peek_next_state(CallError),
            Complete(FxaState::Disconnected)
        );
        assert_eq!(
            tester.peek_next_state(DisconnectSuccess),
            Complete(FxaState::Disconnected)
        );
    }
}
+0 −290

File deleted.

Preview size limit exceeded, changes collapsed.
