distcloud/distributedcloud/dccertmon/common/certificate_monitor_manager.py
Salman Rana 7d44c38c90 Introduce dccertmon service
This commit introduces dccertmon, a new managed service for DC
certificate auditing and management.

Currently, platform cert management, DC cert management, and subcloud
cert auditing are coupled into a single platform service (certmon). To
meet the requirements of DC scalability and portability, DC-specific
functionality must be decoupled. These changes lay the groundwork
for the new service by:
- Creating the necessary service files.
- Introducing configs for the service.
- Declaring high-level methods (skeleton lifecycle and manager).

DC-specific functionality will be migrated to this dccertmon service and
optimized in subsequent changes. Non-DC cert management will continue to
be handled by certmon.

Overall, this commit introduces:
- The OCF file necessary for high availability management of the
  dccertmon service by SM.
- Package configurations to build the service (Package: distributedcloud-dccertmon).
- Lifecycle manager for a running DC cert monitor service.
- Skeleton/base service application logic - CertificateMonitorManager.
- RPC notification handlers for subcloud online/managed.
- Configuration for the log folders and log rotation. The logs
  will be available in /var/log/dccertmon/dccertmon.log.
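
As a rough illustration of how the lifecycle manager and CertificateMonitorManager listed above could fit together, the sketch below drives a stub manager through start and stop. LifecycleSketch and StubManager are hypothetical names used only for this example; only the four start/stop method names come from the manager in this commit, and the call order shown is an assumption.

```python
class StubManager:
    """Stand-in for CertificateMonitorManager; records calls instead of spawning threads."""

    def __init__(self):
        self.calls = []

    def start_cert_watcher(self):
        self.calls.append("start_cert_watcher")

    def start_task_executor(self):
        self.calls.append("start_task_executor")

    def stop_cert_watcher(self):
        self.calls.append("stop_cert_watcher")

    def stop_task_executor(self):
        self.calls.append("stop_task_executor")


class LifecycleSketch:
    """Hypothetical lifecycle wrapper; not the service class from this commit."""

    def __init__(self, manager):
        self.manager = manager
        self.running = False

    def start(self):
        # Ignore repeated start requests so threads are not spawned twice.
        if self.running:
            return
        self.manager.start_cert_watcher()
        self.manager.start_task_executor()
        self.running = True

    def stop(self):
        # Ignore stop requests when nothing is running.
        if not self.running:
            return
        self.manager.stop_cert_watcher()
        self.manager.stop_task_executor()
        self.running = False


manager = StubManager()
service = LifecycleSketch(manager)
service.start()
service.stop()
print(manager.calls)
```

The running flag makes start and stop idempotent, which may be useful when a process supervisor issues repeated requests.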

These changes are part of a set of commits to introduce the dccertmon service:
  [1] https://review.opendev.org/c/starlingx/ha/+/941205
  [2] https://review.opendev.org/c/starlingx/stx-puppet/+/941208

Test Plan:
  - PASS: Build dccertmon package
  - PASS: Install and bootstrap system with custom ISO containing the
          newly created dccertmon package
  - PASS: Verify that the dccertmon.service is loaded
  - PASS: Verify that dccertmon logs are written to the correct
          folder.
  - PASS: Check logged messages and verify execution of
           - Cert Watcher thread
           - Task Executor (Audit thread)
           - Periodic tasks running at expected intervals
  - PASS: Configure and provision the service using SM and verify
          it has correctly started and can be restarted with
          'sm-restart'.
  - PASS: Tox checks running on dccertmon

  Note: This commit has been tested alongside the related changes and
        their respective test plans. [1][2]

Story: 2011311
Task: 51663

Change-Id: Ic23d8d13e4b292cf0508d23eaae99b8e07f36d31
Signed-off-by: Salman Rana <salman.rana@windriver.com>
2025-03-14 15:48:19 -04:00


#
# Copyright (c) 2025 Wind River Systems, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
import time

import eventlet
import greenlet
from oslo_config import cfg
from oslo_log import log
from oslo_service import periodic_task

from dccertmon.common import watcher

LOG = log.getLogger(__name__)
CONF = cfg.CONF


class CertificateMonitorManager(periodic_task.PeriodicTasks):
    def __init__(self):
        super(CertificateMonitorManager, self).__init__(CONF)
        self.mon_thread = None
        self.worker_thread = None

    def on_start(self):
        LOG.info("Service Start - prepare for initial audit")

    def start_task_executor(self):
        self.worker_thread = eventlet.greenthread.spawn(self.worker_task_loop)
        self.on_start()

    def start_cert_watcher(self):
        dc_monitor = None
        while True:
            try:
                dc_monitor = watcher.DC_CertWatcher()
                dc_monitor.initialize()
            except Exception as e:
                LOG.exception(e)
                time.sleep(5)
            else:
                break

        # spawn monitor thread
        self.mon_thread = eventlet.greenthread.spawn(self.monitor_cert_loop, dc_monitor)

    def stop_cert_watcher(self):
        if self.mon_thread:
            self.mon_thread.kill()
            self.mon_thread.wait()
            self.mon_thread = None

    def stop_task_executor(self):
        if self.worker_thread:
            self.worker_thread.kill()
            self.worker_thread.wait()
            self.worker_thread = None

    def worker_task_loop(self):
        while True:
            try:
                self.run_periodic_tasks(context=None)
                # TODO(srana): Reset sleep after proper implementation
                time.sleep(60)
            except greenlet.GreenletExit:
                break
            except Exception as e:
                LOG.exception(e)

    def monitor_cert_loop(self, monitor):
        while True:
            # never exit until exit signal received
            try:
                monitor.start_watch(on_success=None, on_error=None)
            except greenlet.GreenletExit:
                break
            except Exception:
                # It shouldn't fall to here, but log and restart if it did
                LOG.exception("Unexpected exception from start_watch")
                time.sleep(1)

    @periodic_task.periodic_task(spacing=CONF.dccertmon.audit_interval)
    def audit_sc_cert_start(self, context):
        LOG.info("periodic_task: audit_sc_cert_start")

    @periodic_task.periodic_task(spacing=5)
    def audit_sc_cert_task(self, context):
        LOG.info("periodic_task: audit_sc_cert_task")

    @periodic_task.periodic_task(spacing=CONF.dccertmon.retry_interval)
    def retry_monitor_task(self, context):
        LOG.info("periodic_task: retry_monitor_task")
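
The retry loop in start_cert_watcher can be exercised on its own. The following is a minimal, standard-library-only sketch of the same try/except/else pattern. FakeWatcher, flaky_factory, and init_with_retry are hypothetical names introduced for illustration, and the injectable sleep parameter is an assumption added so the loop can be tested without real delays; none of these are part of dccertmon.

```python
import logging
import time

LOG = logging.getLogger(__name__)


class FakeWatcher:
    """Hypothetical stand-in for watcher.DC_CertWatcher."""

    def __init__(self, should_fail):
        self.should_fail = should_fail
        self.initialized = False

    def initialize(self):
        if self.should_fail:
            raise RuntimeError("watcher backend not ready")
        self.initialized = True


def flaky_factory(failures):
    """Return a factory whose first `failures` watchers fail to initialize."""
    remaining = [failures]

    def factory():
        should_fail = remaining[0] > 0
        if should_fail:
            remaining[0] -= 1
        return FakeWatcher(should_fail)

    return factory


def init_with_retry(factory, delay=5, sleep=time.sleep):
    """Create and initialize a watcher, retrying after `delay` seconds on any error."""
    while True:
        try:
            monitor = factory()
            monitor.initialize()
        except Exception:
            LOG.exception("watcher initialization failed; retrying")
            sleep(delay)
        else:
            # Reached only when both creation and initialization succeeded.
            return monitor


# Usage: the first two attempts fail, the third succeeds; sleeps are recorded, not executed.
recorded_sleeps = []
watcher_obj = init_with_retry(flaky_factory(2), delay=5, sleep=recorded_sleeps.append)
print(watcher_obj.initialized, recorded_sleeps)
```

As in the manager, a new watcher object is created on every attempt, so a partially initialized instance is never reused.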