syndicate-rs/syndicate-server/src/services/debt_reporter.rs

66 lines
2.8 KiB
Rust
Raw Normal View History

2022-01-19 13:40:50 +00:00
use preserves_schema::Codec;
use std::sync::Arc;
MAJOR REFACTORING OF CORE ASSERTION-TRACKING STRUCTURES. Little impact on API. Read on for details. 2022-02-01 15:22:30 Two problems. - If a stop action panics (in `_terminate_facet`), the Facet is dropped before its outbound handles are removed. With the code as it stands, this leaks assertions (!!). - The logic for removing an outbound handle seems to be running in the wrong facet context??? (See `f.outbound_handles.remove(&handle)` in the cleanup actions - I think I need to remove the for_myself mechanism - and add some callbacks to run only on successful commit 2022-02-02 12:12:33 This is hard. Here's the current implementation: - assert - inserts into outbound_handles of active facet - adds cleanup action describing how to do the retraction - enqueues the assert action, which - calls e.assert() - retract - looks up & removes the cleanup action, which - enqueues the retract action, which - removes from outbound_handles of the WRONG facet in the WRONG actor - calls e.retract() - _terminate_facet - uses outbound_handles to retract the facet's assertions - doesn't directly touch cleanup actions, relying on retract to do that - if one of a facet's stop actions panics, will drop the facet, leaking its assertions - actually, even if a stop action yields `Err`, it will drop the facet and leak assertions - yikes - facet drop - panics if outbound_handles is nonempty - actor cleanup - relies on facet tree to find assertions to retract Revised plan: - ✓ revise Activation/PendingEvents structures - rename `cleanup_actions` to `outbound_assertions` - remove `for_myself` queues and `final_actions` - add `pre_commit_actions`, `rollback_actions` and `commit_actions` - ✓ assert - as before - but on rollback, removes from `outbound_handles` (if the facet still exists) and `outbound_assertions` (always) - marks the new assertion as "established" on commit - ✓ retract - lookup in `outbound_assertions` by handle, using presence as indication it hasn't been scheduled in this turn - on rollback, put it back in `outbound_assertions` ONLY IF IT IS MARKED ESTABLISHED - otherwise it is a retraction of an `assert` that has *also* been rolled back in this turn - on commit, remove it from `outbound_handles` - enqueue the retract action, which just calls e.retract() - ✓ _terminate_facet - revised quite a bit now we rely on `RunningActor::cleanup` to use `outbound_assertions` rather than the facet tree. - still drops Facets on panic, but this is now mostly harmless (reorders retractions a bit) - handles `Err` from a stop action more gracefully - slightly cleverer tracking of what needs doing based on a `TerminationDirection` - now ONLY applies to ORDERLY cleanup of the facet tree. Disorderly cleanup ignores the facet tree and just retracts the assertions willy-nilly. - ✓ facet drop - warn if outbound_handles is nonempty, but don't do anything about it - ✓ actor cleanup - doesn't use the facet tree at all. - cleanly shutting down is done elsewhere - uses the remaining entries in `outbound_assertions` (previously `cleanup_actions`) to deal with retractions for dropped facets as well as any other facets that haven't been cleanly shut down - ✓ activate - now has a panic_guard::PanicGuard RAII for conveying a crash to an actor in case the activation is happening from a linked task or another thread (this wasn't the case in the examples that provoked this work, though) - simplified - explicit commit/rollback decision - ✓ Actor::run - no longer uses the same path for crash-termination and success-termination - instead, for success-termination, takes a turn that calls Activation::stop_root - this cleans up the facet tree using _terminate_facet - when the turn ends, it notices that the root facet is gone and shuts down the actor - so in principle there will be nothing for actor cleanup to do 2022-02-04 13:52:34 This took days. :-(
2022-02-04 12:59:37 +00:00
use std::sync::atomic::Ordering;
use syndicate::actor::*;
2021-09-23 19:46:10 +00:00
use syndicate::enclose;
2022-01-19 13:40:50 +00:00
use syndicate::preserves::rec;
use syndicate::preserves::value::NestedValue;
2021-08-28 16:50:55 +00:00
use crate::language::language;
use crate::lifecycle;
2021-09-20 13:10:31 +00:00
use crate::schemas::internal_services::DebtReporter;
use syndicate_macros::during;
pub fn on_demand(t: &mut Activation, ds: Arc<Cap>) {
2022-01-19 13:40:50 +00:00
t.spawn(Some(AnyValue::symbol("debt_reporter_listener")), move |t| {
Ok(during!(t, ds, language(), <run-service $spec: DebtReporter>, |t: &mut Activation| {
2022-01-19 13:40:50 +00:00
t.spawn_link(Some(rec![AnyValue::symbol("debt_reporter"), language().unparse(&spec)]),
enclose!((ds) |t| run(t, ds, spec)));
2021-09-20 13:10:31 +00:00
Ok(())
}))
2021-09-01 15:31:01 +00:00
});
}
fn run(t: &mut Activation, ds: Arc<Cap>, spec: DebtReporter) -> ActorResult {
ds.assert(t, language(), &lifecycle::started(&spec));
ds.assert(t, language(), &lifecycle::ready(&spec));
t.every(core::time::Duration::from_millis((spec.interval_seconds.0 * 1000.0) as u64), |_t| {
MAJOR REFACTORING OF CORE ASSERTION-TRACKING STRUCTURES. Little impact on API. Read on for details. 2022-02-01 15:22:30 Two problems. - If a stop action panics (in `_terminate_facet`), the Facet is dropped before its outbound handles are removed. With the code as it stands, this leaks assertions (!!). - The logic for removing an outbound handle seems to be running in the wrong facet context??? (See `f.outbound_handles.remove(&handle)` in the cleanup actions - I think I need to remove the for_myself mechanism - and add some callbacks to run only on successful commit 2022-02-02 12:12:33 This is hard. Here's the current implementation: - assert - inserts into outbound_handles of active facet - adds cleanup action describing how to do the retraction - enqueues the assert action, which - calls e.assert() - retract - looks up & removes the cleanup action, which - enqueues the retract action, which - removes from outbound_handles of the WRONG facet in the WRONG actor - calls e.retract() - _terminate_facet - uses outbound_handles to retract the facet's assertions - doesn't directly touch cleanup actions, relying on retract to do that - if one of a facet's stop actions panics, will drop the facet, leaking its assertions - actually, even if a stop action yields `Err`, it will drop the facet and leak assertions - yikes - facet drop - panics if outbound_handles is nonempty - actor cleanup - relies on facet tree to find assertions to retract Revised plan: - ✓ revise Activation/PendingEvents structures - rename `cleanup_actions` to `outbound_assertions` - remove `for_myself` queues and `final_actions` - add `pre_commit_actions`, `rollback_actions` and `commit_actions` - ✓ assert - as before - but on rollback, removes from `outbound_handles` (if the facet still exists) and `outbound_assertions` (always) - marks the new assertion as "established" on commit - ✓ retract - lookup in `outbound_assertions` by handle, using presence as indication it hasn't been scheduled in this turn - on rollback, put it back in `outbound_assertions` ONLY IF IT IS MARKED ESTABLISHED - otherwise it is a retraction of an `assert` that has *also* been rolled back in this turn - on commit, remove it from `outbound_handles` - enqueue the retract action, which just calls e.retract() - ✓ _terminate_facet - revised quite a bit now we rely on `RunningActor::cleanup` to use `outbound_assertions` rather than the facet tree. - still drops Facets on panic, but this is now mostly harmless (reorders retractions a bit) - handles `Err` from a stop action more gracefully - slightly cleverer tracking of what needs doing based on a `TerminationDirection` - now ONLY applies to ORDERLY cleanup of the facet tree. Disorderly cleanup ignores the facet tree and just retracts the assertions willy-nilly. - ✓ facet drop - warn if outbound_handles is nonempty, but don't do anything about it - ✓ actor cleanup - doesn't use the facet tree at all. - cleanly shutting down is done elsewhere - uses the remaining entries in `outbound_assertions` (previously `cleanup_actions`) to deal with retractions for dropped facets as well as any other facets that haven't been cleanly shut down - ✓ activate - now has a panic_guard::PanicGuard RAII for conveying a crash to an actor in case the activation is happening from a linked task or another thread (this wasn't the case in the examples that provoked this work, though) - simplified - explicit commit/rollback decision - ✓ Actor::run - no longer uses the same path for crash-termination and success-termination - instead, for success-termination, takes a turn that calls Activation::stop_root - this cleans up the facet tree using _terminate_facet - when the turn ends, it notices that the root facet is gone and shuts down the actor - so in principle there will be nothing for actor cleanup to do 2022-02-04 13:52:34 This took days. :-(
2022-02-04 12:59:37 +00:00
for (account_id, (name, debt)) in syndicate::actor::ACCOUNTS.read().iter() {
tracing::info!(account_id, ?name, debt = ?debt.load(Ordering::Relaxed));
}
MAJOR REFACTORING OF CORE ASSERTION-TRACKING STRUCTURES. Little impact on API. Read on for details. 2022-02-01 15:22:30 Two problems. - If a stop action panics (in `_terminate_facet`), the Facet is dropped before its outbound handles are removed. With the code as it stands, this leaks assertions (!!). - The logic for removing an outbound handle seems to be running in the wrong facet context??? (See `f.outbound_handles.remove(&handle)` in the cleanup actions - I think I need to remove the for_myself mechanism - and add some callbacks to run only on successful commit 2022-02-02 12:12:33 This is hard. Here's the current implementation: - assert - inserts into outbound_handles of active facet - adds cleanup action describing how to do the retraction - enqueues the assert action, which - calls e.assert() - retract - looks up & removes the cleanup action, which - enqueues the retract action, which - removes from outbound_handles of the WRONG facet in the WRONG actor - calls e.retract() - _terminate_facet - uses outbound_handles to retract the facet's assertions - doesn't directly touch cleanup actions, relying on retract to do that - if one of a facet's stop actions panics, will drop the facet, leaking its assertions - actually, even if a stop action yields `Err`, it will drop the facet and leak assertions - yikes - facet drop - panics if outbound_handles is nonempty - actor cleanup - relies on facet tree to find assertions to retract Revised plan: - ✓ revise Activation/PendingEvents structures - rename `cleanup_actions` to `outbound_assertions` - remove `for_myself` queues and `final_actions` - add `pre_commit_actions`, `rollback_actions` and `commit_actions` - ✓ assert - as before - but on rollback, removes from `outbound_handles` (if the facet still exists) and `outbound_assertions` (always) - marks the new assertion as "established" on commit - ✓ retract - lookup in `outbound_assertions` by handle, using presence as indication it hasn't been scheduled in this turn - on rollback, put it back in `outbound_assertions` ONLY IF IT IS MARKED ESTABLISHED - otherwise it is a retraction of an `assert` that has *also* been rolled back in this turn - on commit, remove it from `outbound_handles` - enqueue the retract action, which just calls e.retract() - ✓ _terminate_facet - revised quite a bit now we rely on `RunningActor::cleanup` to use `outbound_assertions` rather than the facet tree. - still drops Facets on panic, but this is now mostly harmless (reorders retractions a bit) - handles `Err` from a stop action more gracefully - slightly cleverer tracking of what needs doing based on a `TerminationDirection` - now ONLY applies to ORDERLY cleanup of the facet tree. Disorderly cleanup ignores the facet tree and just retracts the assertions willy-nilly. - ✓ facet drop - warn if outbound_handles is nonempty, but don't do anything about it - ✓ actor cleanup - doesn't use the facet tree at all. - cleanly shutting down is done elsewhere - uses the remaining entries in `outbound_assertions` (previously `cleanup_actions`) to deal with retractions for dropped facets as well as any other facets that haven't been cleanly shut down - ✓ activate - now has a panic_guard::PanicGuard RAII for conveying a crash to an actor in case the activation is happening from a linked task or another thread (this wasn't the case in the examples that provoked this work, though) - simplified - explicit commit/rollback decision - ✓ Actor::run - no longer uses the same path for crash-termination and success-termination - instead, for success-termination, takes a turn that calls Activation::stop_root - this cleans up the facet tree using _terminate_facet - when the turn ends, it notices that the root facet is gone and shuts down the actor - so in principle there will be nothing for actor cleanup to do 2022-02-04 13:52:34 This took days. :-(
2022-02-04 12:59:37 +00:00
// let snapshot = syndicate::actor::ACTORS.read().clone();
// for (id, (name, ac_ref)) in snapshot.iter() {
// if *id == _t.state.actor_id {
// tracing::debug!("skipping report on the reporting actor, to avoid deadlock");
// continue;
// }
// tracing::trace!(?id, "about to lock");
// tracing::info_span!("actor", id, ?name).in_scope(|| match &*ac_ref.state.lock() {
// ActorState::Terminated { exit_status } =>
// tracing::info!(?exit_status, "terminated"),
// ActorState::Running(state) => {
// tracing::info!(field_count = ?state.fields.len(),
// outbound_assertion_count = ?state.outbound_assertions.len(),
// facet_count = ?state.facet_nodes.len());
// tracing::info_span!("facets").in_scope(|| {
// for (facet_id, f) in state.facet_nodes.iter() {
// tracing::info!(
// ?facet_id,
// parent_id = ?f.parent_facet_id,
// outbound_handle_count = ?f.outbound_handles.len(),
// linked_task_count = ?f.linked_tasks.len(),
// inert_check_preventers = ?f.inert_check_preventers.load(Ordering::Relaxed));
// }
// });
// }
// });
// }
Ok(())
})
}