Middle Ware Facilities for CATS

Introduction

With computing resources continuously migrating to the edge, services residing at distributed sites are increasingly delivered in a dynamic way. More fine-grained scheduling strategies that are aware of service SLA requirements and the current computing status are urgently required. A framework for computing-status-aware traffic steering and service provisioning is illustrated in related works, for instance. Since learning the network conditions and computing status is a prerequisite for steering traffic properly, a concise and effective learning and processing scheme is required. Unlike the collection of network attributes, the learning of computing status has its own characteristics and objectives, which impose incremental requirements:

Compared to relatively stable network capabilities, network topologies for instance, the status of computing resources varies quite dynamically as illustrated in . It is unwise to expose the dynamics of the computing status or the distribution of computing resources directly to the network.
Attributes that describe network status and conditions are relatively simple and explicit, while the massive metadata of computing status is heterogeneous and diverse. Different computing related services may depend on different attributes of computing resources. A computing information description method is studied in . Furthermore, a method to evaluate the performance of a service instance based on computing modelling is tied to the specific service and the applied scheduling strategy, and is therefore required as well.
Metadata collected from the network domain and from service instances located at distributed sites contains both identical attributes and attributes of different dimensions. The values of identical attributes should be accumulated, while attributes of different dimensions should be processed in a unified way determined by specific scheduling strategies.
Overly detailed metadata collected from service instances located at distributed sites has no direct meaning to the network domain. It is suggested to provide simple and specific indications for the network to follow.

Currently, the perception and detection of computing resources is commonly achieved by several schemes, some of which are listed as follows:

Prometheus, an open-source system monitoring and alerting toolkit, collects and stores metrics as time series data. Prometheus metrics cover various aspects, metrics collected from the Kubernetes API Server and kubelet for instance. These metrics include typical information such as node capacity, pod scheduling duration and pods in queue, which can reflect the detailed conditions of CPU, memory, queue, delay, etc. However, Prometheus is designed and deployed for monitoring and visualization and cannot satisfy the mentioned requirements.
A DNS and GSLB scheme or a CLB may apply a "Health Check" mechanism to detect whether a server is available. Specific methods may be implemented over TCP, UDP and HTTP. A round-robin or weighted selection strategy may further be applied to provision the required service. However, the result of such detection is relatively coarse-grained and cannot be used to evaluate the performance of services.
In some studies, it is also proposed to extend IGP or BGP to carry the information of computing resources, aiming to be compatible with the current IP routing network. It is worth noting that excessive use of L3 protocols may place an extra burden on the network and may not adapt well to services that are highly sensitive to computing resources or to future circumstances.

Thus, this draft proposes a computing resources perception and processing method based on a logical Middle Ware facility to solve the mentioned problems and to satisfy the corresponding requirements.

Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 when, and only when, they appear in all capitals, as shown here.

Terminology

SLA: Service Level Agreement
DNS: Domain Name System
GSLB: Global Server Load Balancing
CLB: Cloud Load Balancer
IGP: Interior Gateway Protocol
BGP: Border Gateway Protocol
SNMP: Simple Network Management Protocol
FTP: File Transfer Protocol
PCEP: Path Computation Element Communication Protocol
OAM: Operations, Administration, and Maintenance
DB-Agent: Agent of a database
BE: Best Effort
TE: Traffic Engineering

Framework

According to the requirements of computing status perception analyzed in the previous section, a framework of metadata collection and processing based on Middle Ware Facilities is proposed.

Figure: Framework of Metadata Collection Based on a Middle Ware

A Middle Ware proposed here is a logical facility that has knowledge of the computing status and network conditions, and thus the ability to process them. In a specific physical implementation, a Middle Ware can be mapped to multiple physical entities or combinations of them. The entities involved may include a network controller, a superior orchestrator, a distributed database, distributed devices, an introduced application monitoring system, constructed service agents, etc. Logical modules of a Middle Ware are organized and defined as follows:

Figure: Inner Modules in a Middle Ware

The logical modules and components are designed with the following respective functions and abilities:

NSC (Network Status Collector): NSC collects network status through a Protocol Service including telemetry, SNMP, BGP, etc.
CSC (Computing Status Collector): CSC collects the status of computing resources through a Protocol Service including FTP, gRPC, RESTful APIs, etc. An application monitoring system may be deployed, in which case corresponding interfaces need to be designed.
NCC (Network Configuration and Control): NCC publishes network configuration including SRv6 policies and other information through PCEP and BGP for instance.
CCC (Computing Configuration and Control): CCC publishes computing related configuration information and participates in the process of resources deployment and scaling.
NCSDB (Network and Computing Status DataBase): NCSDB stores the collection of metadata of network and computing status informed by NSC and CSC respectively and further integrates relevant information. The meta information is arranged in a hierarchical form for further lookup.
SRM (Service Registration and Management): Computing related services required by service clients are registered at SRM with corresponding service requirements including both network and computing attributes. Evaluation methods mapped to services are configured at SRM. SRM may communicate with outer components to receive relevant information.
SSC (Scheduling Strategy and Configuration): SSC processes the scheduling strategies specified for services. It gets configuration from the administration plane through a NorthBound Interface (NBI). SSC also interacts with SRM, which may influence the configured evaluation methods.
ORC (ORChestration): With the registered evaluation schemes and configured scheduling strategies, ORC applies the corresponding functions to the metadata stored in NCSDB. The performance of service instances is evaluated, and appropriate entries are selected and further distributed to NCC.
There may be other possible logical modules in a Middle Ware, including OAM, AI, Portal, etc.

With the functions defined, the workflow in the control plane to fulfill computing aware traffic engineering and service routing is described as follows:

SRM fulfills service subscription. The corresponding variable and controllable service metadata modeling methods are registered and configured through the NorthBound Interface, or through a local or injected configuration profile.
SSC configures scheduling strategies. SRM and SSC jointly determine specific evaluation methods for registered services.
NSC and CSC collect the network and computing status with their respective Protocol Service modules. NSC and CSC may communicate with network controllers and with distributed or centralized service agents across multiple sites.
NCSDB organizes the metadata collected by NSC and CSC in a hierarchical manner for further processing.
ORC processes the metadata stored in NCSDB with respective evaluation methods determined by SRM and SSC, and then generates corresponding entries. The results are further distributed to NCC and CCC.
NCC ultimately distributes the entries and configurations to the underlay network with its Protocol Service module.

Referring to and , incremental requirements on the CATS framework are proposed according to this draft:

"R6 MUST realize means for rate control for distributing of metrics." Thus, specific logical modules SHOULD be introduced to preprocess running computing status before being distributed to the network.
"R4: MUST include network metrics." "R5 MUST provide mechanisms to distribute the metrics." Thus, specific logical modules SHOULD be introduced to record the information of network capabilities and computing resources.
"R8: there MUST exist flexibility in term of metrics definition and utilization for the selection of service instance." "R9: MUST set up metric information that can be understood by CATS components." Thus, specific logical modules SHOULD be introduced to organize and manage service requirements and scheduling strategies.

NSC and NCC mentioned above are similar or identical to existing subfunctions of a network controller, and thus are not further discussed in this draft. The detailed design of the functions of SRM, SSC, NCSDB and ORC is illustrated in Parts 1 to 3 in the following sections.

Part 1: Service Registration and Modelling Configuration at SRM and SSC

Service clients issue service requests and get responses that include corresponding service identifications issued by the administration plane. For instance, a Service ID that represents a globally unique service semantic identification is defined in . With the issued Service IDs, the constraints and sensitive attributes should be considered to generate corresponding modelling and evaluation methods for each service represented by a Service ID. The ways of generating the modelling methods include but are not limited to:

Direct configuration by administrators, through Portal operations or through the NBI.
Reading a prepared local or distributed configuration profile through the NBI.

The metadata of network and computing status can be summarized as the following typical scheduling attributes:

Experience attributes, such as end-to-end delay, jitter and packet loss, which influence the quality of experience.
Cost attributes, such as economic cost and energy consumption.
Resource attributes, such as the load of CPUs and the load of the network.

According to the mentioned scheduling attributes, typical scheduling strategies can be summarized as:

Experience first: optimize the quality of experience.
Cost first: optimize the cost attributes while guaranteeing the thresholds of experience attributes.
Resource first: optimize the resource attributes while guaranteeing the thresholds of experience attributes and cost attributes.

Based on the specified scheduling strategies, corresponding evaluation methods are determined. With the metadata calculated through specific functions, the most appropriate instance or all instances that satisfy the requirements can be identified. Then, a preferred or balanced strategy can be performed, which selects a single entry or a set of entries to distribute.
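The three strategies and the preferred or balanced selection can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a method defined by this draft: the `Candidate` fields, the `select()` helper, the strategy names, and the use of delay as the only experience attribute are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One service instance as seen by the scheduler (hypothetical fields)."""
    name: str
    delay_ms: float   # experience attribute
    cost: float       # cost attribute
    cpu_load: float   # resource attribute, in percent


def within_thresholds(candidates, max_delay_ms=None, max_cost=None):
    """Keep only candidates that meet the guaranteed thresholds."""
    kept = []
    for c in candidates:
        if max_delay_ms is not None and c.delay_ms > max_delay_ms:
            continue
        if max_cost is not None and c.cost > max_cost:
            continue
        kept.append(c)
    return kept


def select(candidates, strategy, max_delay_ms=None, max_cost=None, balanced=False):
    """Rank candidates by the attribute the strategy optimizes.

    balanced=False returns the single preferred entry,
    balanced=True returns every satisfying entry, best first.
    """
    if strategy == "experience_first":
        pool = list(candidates)
        key = lambda c: c.delay_ms
    elif strategy == "cost_first":
        # Optimize cost while guaranteeing the experience threshold.
        pool = within_thresholds(candidates, max_delay_ms)
        key = lambda c: c.cost
    elif strategy == "resource_first":
        # Optimize resources while guaranteeing experience and cost thresholds.
        pool = within_thresholds(candidates, max_delay_ms, max_cost)
        key = lambda c: c.cpu_load
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    ranked = sorted(pool, key=key)
    return ranked if balanced else ranked[:1]


candidates = [
    Candidate("a", delay_ms=10, cost=5, cpu_load=90),
    Candidate("b", delay_ms=30, cost=2, cpu_load=50),
    Candidate("c", delay_ms=80, cost=1, cpu_load=20),
]
preferred = select(candidates, "cost_first", max_delay_ms=50)
```

With these sample values, instance "c" is the cheapest but violates the 50 ms delay threshold, so the cost first strategy prefers "b".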

Figure: Service Registration and Modelling Configuration

+----------------+------------------+------------------+-----+
|                |                  | 6C               |     |
+----------------+------------------+------------------+-----+
| Load           | <80%             |                  |     |
+----------------+------------------+------------------+-----+
| ......         |                  |                  |     |
+----------------+------------------+------------------+-----+
|                | Resource first   | Experience first |     |
| Metric=        |                  |                  |     |
| Function()     | Function1(Delay, | Function2(Delay, |     |
|                | Loss,Load)       | Jitter,CPU)      |     |
+----------------+------------------+------------------+-----+

As shown above, a typical evaluation and modelling method is displayed, and a function to calculate a metric value can be defined as follows. A to F are preliminary functions that process metadata, while Function1() and Function2() are evaluation functions.

Figure: Service Registration and Modelling Configuration

A(Delay): the axis is marked at a Delay of 50.
B(Loss): the axis is marked at a Loss of 0.1%.
C(Load): the axis is marked at Loads of 40% and 80%.
D(Delay): MIN for a Delay up to 20, rising to MAX at a Delay of 100.
E(Jitter): MIN for a Jitter up to 5, rising to MAX at a Jitter of 15.
F(Cores): MAX for up to 6 Cores, falling to MIN at 12 Cores.

Function1(Delay,Loss,Load) = MAX, if max{A(Delay),B(Loss)} = MAX;
                             C(Load), otherwise.

Function2(Delay,Jitter,CPU) = MAX, if max{D(Delay),E(Jitter),F(Cores)} = MAX;
                              Average[D(Delay),E(Jitter),F(Cores)], otherwise.

The design of the functions also correlates with the semantics of the calculated metric value. As indicated above, if any requirement registered with the service is not satisfied, for instance the end-to-end delay reaches 100ms in Function2(), the overall function value reaches MAX, which indicates that the corresponding entry fails to satisfy the service SLA represented by Service ID2. Also, a smaller metric value represents better performance. Therefore, the performance of instances can be easily expressed by a simple metric.
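The piecewise definitions of Function1() and Function2() can be sketched in Python as follows. Several points are assumptions made only for illustration: MIN and MAX are given the numeric values 0 and 100, D, E and F are taken to be linear between their two marked values, A and B are treated as step functions at their single marked value, C is treated as a linear ramp between 40% and 80%, and delay and jitter are taken to be in milliseconds.

```python
MIN, MAX = 0.0, 100.0  # assumed numeric range of a metric value


def ramp_up(x, low, high):
    """MIN at or below low, MAX at or above high, linear in between."""
    if x <= low:
        return MIN
    if x >= high:
        return MAX
    return MIN + (MAX - MIN) * (x - low) / (high - low)


def ramp_down(x, low, high):
    """MAX at or below low, MIN at or above high, linear in between."""
    return MAX + MIN - ramp_up(x, low, high)


def step(x, limit):
    """MAX once x reaches the limit, MIN otherwise (assumed shape)."""
    return MAX if x >= limit else MIN


# Preliminary functions A to F, using the values marked on the axes.
def A(delay_ms):  return step(delay_ms, 50)        # shape assumed
def B(loss_pct):  return step(loss_pct, 0.1)       # shape assumed
def C(load_pct):  return ramp_up(load_pct, 40, 80) # shape assumed
def D(delay_ms):  return ramp_up(delay_ms, 20, 100)
def E(jitter_ms): return ramp_up(jitter_ms, 5, 15)
def F(cores):     return ramp_down(cores, 6, 12)


def function1(delay_ms, loss_pct, load_pct):
    """Resource first: MAX if delay or loss is violated, else C(Load)."""
    if max(A(delay_ms), B(loss_pct)) == MAX:
        return MAX
    return C(load_pct)


def function2(delay_ms, jitter_ms, cores):
    """Experience first: MAX if any input is violated, else the average."""
    parts = [D(delay_ms), E(jitter_ms), F(cores)]
    if max(parts) == MAX:
        return MAX
    return sum(parts) / len(parts)
```

Under these assumptions, `function2(100, 5, 12)` evaluates to MAX because the delay alone reaches its limit, which matches the semantics that MAX marks an entry that fails the SLA.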

Part 2: Computing Status Collection and Updates at NCSDB

Based on the overall set of subscribed services and the configured sensitive attributes of each service in the set, the set of attributes that require status collection and updates is summarized. CSC then queries or subscribes to the service agents responsible for meta information collection at each cloud site. Because services differ in their sensitivity and tolerance to changes in computing status, as well as in priority, their requirements for metadata collection and update frequency differ from one another. The frequency of collecting a type of meta information should be no lower than the maximum among the overall requirements.

With the metadata collected by CSC, the information is further organized and stored in NCSDB. A distributed database is introduced here as a sample physical entity which fulfills the functions of the corresponding logical module. A distributed database has the advantages of high performance, high availability and simple extensibility. It is highly partitionable and allows horizontal scaling, which suits practical scenarios with a large number of service instances. Also, both keys and values can be anything from simple objects to complex compound objects, and thus heterogeneous computing resources can be described and stored. As shown below, the status of computing resources is modeled as a collection of key-value pairs.

Figure: Status Table of Computing Resources

With the introduction of a distributed database, the data of the computing resources can be stored in hierarchically organized directories. A typical form of the keys used to obtain the information of interest is as follows:

/service ID/service instance
/service ID/service instance/Gateway
/service ID/service instance/CPU Load
/service ID/service instance/Memory Remains
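The hierarchical key layout above can be illustrated with a small in-memory stand-in for the distributed database. This is only a sketch: `StatusStore`, `make_key()`, the identifiers `svc-1` and `inst-a`, and the stored values are hypothetical, and a real deployment would use the range or prefix query of the chosen database instead.

```python
def make_key(service_id, instance, attribute=None):
    """Build a key of the form /service ID/service instance[/attribute]."""
    parts = [service_id, instance]
    if attribute is not None:
        parts.append(attribute)
    return "/" + "/".join(parts)


class StatusStore:
    """In-memory stand-in for NCSDB holding key-value pairs."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get_prefix(self, prefix):
        """Return every pair stored at or below the given directory."""
        prefix = prefix.rstrip("/")
        return {
            key: value
            for key, value in sorted(self._data.items())
            # Matching on prefix + "/" keeps /svc-1 from matching /svc-10.
            if key == prefix or key.startswith(prefix + "/")
        }


store = StatusStore()
store.put(make_key("svc-1", "inst-a", "Gateway"), "192.0.2.1")
store.put(make_key("svc-1", "inst-a", "CPU Load"), 70)
store.put(make_key("svc-1", "inst-a", "Memory Remains"), 2048)
store.put(make_key("svc-1", "inst-b", "CPU Load"), 35)
store.put(make_key("svc-10", "inst-a", "CPU Load"), 50)

instance_status = store.get_prefix("/svc-1/inst-a")  # one service instance
service_status = store.get_prefix("/svc-1")          # all instances of a service
```

Looking up a shorter prefix returns a wider slice of the hierarchy, so the same store answers both per-instance and per-service queries.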

NCSDB can also enable additional functions. For instance, a pub-sub scheme and a 'Watch' mechanism can be introduced to fulfill service OAM and service protection.

Figure: A 'Watch' Mechanism Applied for a Distributed Database

In the figure, a Write of (/Service Instance 1/CPU Load 70) to the database is followed by Notify updates messages to the watching modules.

The procedure of learning and processing updated computing resource status is described as follows:

The CPU load of the container or VM reaches the threshold of 70%, and the updated status, after being collected by CSC, is written into the database as a key-value pair.
Relevant modules, NCC for instance, subscribe to the information by watching the prefix of the key-value pair.
On learning that the CPU load has reached 70%, the control plane performs a recalculation and the service routing entries are updated or regenerated.
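The three steps above can be sketched with a minimal in-memory store that notifies watchers on every write. The class `WatchableStore`, the callback `on_update()` and the list `recalculations` are hypothetical names used only for this illustration; they do not describe the API of any particular distributed database.

```python
CPU_THRESHOLD = 70  # percent, the threshold used in the example above


class WatchableStore:
    """In-memory key-value store with prefix-based watch callbacks."""

    def __init__(self):
        self._data = {}
        self._watchers = []  # list of (prefix, callback) pairs

    def watch(self, prefix, callback):
        """Register a callback for every key that starts with prefix."""
        self._watchers.append((prefix, callback))

    def put(self, key, value):
        """Store the value, then notify every watcher of a matching prefix."""
        self._data[key] = value
        for prefix, callback in self._watchers:
            if key.startswith(prefix):
                callback(key, value)


store = WatchableStore()
recalculations = []  # keys whose update triggered a recalculation


def on_update(key, value):
    """Stand-in for the control plane reacting to a notified update."""
    if key.endswith("/CPU Load") and value >= CPU_THRESHOLD:
        recalculations.append(key)


# Step 2: a relevant module watches the prefix of the key-value pair.
store.watch("/Service Instance 1/", on_update)

# Step 1: the status collected by CSC is written as a key-value pair.
store.put("/Service Instance 1/CPU Load", 70)

# A write under another prefix does not notify this watcher.
store.put("/Service Instance 2/CPU Load", 90)
```

After these writes, `recalculations` holds only the key of Service Instance 1, which corresponds to step 3: the watcher learns of the update and triggers a recalculation for the affected entries.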

Part 3: Metadata Processing and Calculation at ORC

The Middle Ware processes the metadata collected from the network domain and multiple cloud sites at ORC, according to the following procedure:

Figure: End-to-end Delay

For instances that provide a certain set of services over corresponding network paths, ORC integrates the collected metadata of the same class. For instance, as shown above, the unidirectional end-to-end delay consists of the segmented network latencies Delay1, Delay2 and Delay3 and the processing delay Delay4 caused by possible queue backlog and logical processing.
For a specific service, ORC identifies and extracts the sensitive attributes from the integrated attributes as the input variables of the corresponding function registered at SRM and SSC.
For a service instance and all possible network forwarding paths that reach it, ORC calculates its ability to provide a specific type of service in conjunction with a TE policy or BE path, and represents it as a single metric value.
According to the designated semantics of metrics, ORC evaluates the validity and performance of every entry, and selects appropriate entries to distribute to NCC, which ultimately take effect in the forwarding plane of computing aware network devices.
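The four steps above can be sketched as a single pass over candidate entries. This is an illustration under stated assumptions rather than the procedure of this draft: the entry layout, the helper names, the sample values, and the evaluation function `demo_metric()` (with MAX set to 100 and a delay limit of 100 ms) are all hypothetical.

```python
MAX = 100.0  # assumed metric value meaning that the SLA is not satisfied


def end_to_end_delay(segment_delays_ms, processing_delay_ms):
    """Step 1: accumulate metadata of the same class (Delay1..3 + Delay4)."""
    return sum(segment_delays_ms) + processing_delay_ms


def demo_metric(delay_ms, cpu_load):
    """Hypothetical evaluation function registered for one service."""
    if delay_ms >= 100:
        return MAX
    return (delay_ms + cpu_load) / 2


def orc_select(raw_entries, sensitive, function):
    """Steps 2 to 4: extract inputs, compute one metric, keep valid entries."""
    scored = []
    for raw in raw_entries:
        attributes = dict(raw["status"])
        attributes["delay_ms"] = end_to_end_delay(
            raw["segments_ms"], raw["processing_ms"])
        # Step 2: only the sensitive attributes become function inputs.
        inputs = {name: attributes[name] for name in sensitive}
        # Step 3: one metric value per (instance, path) pair.
        metric = function(**inputs)
        # Step 4: an entry whose metric reaches MAX is invalid and is dropped.
        if metric < MAX:
            scored.append((metric, raw["instance"], raw["path"]))
    return sorted(scored)  # smaller metric value means better performance


raw_entries = [
    {"instance": "site-a", "path": "TE policy 1", "segments_ms": [5, 10, 5],
     "processing_ms": 20, "status": {"cpu_load": 60, "memory_free_mb": 512}},
    {"instance": "site-b", "path": "BE path", "segments_ms": [20, 40, 30],
     "processing_ms": 30, "status": {"cpu_load": 10, "memory_free_mb": 4096}},
    {"instance": "site-c", "path": "TE policy 2", "segments_ms": [2, 3, 5],
     "processing_ms": 10, "status": {"cpu_load": 40, "memory_free_mb": 1024}},
]
entries = orc_select(raw_entries, ["delay_ms", "cpu_load"], demo_metric)
```

In this sample, site-b is dropped because its accumulated delay of 120 ms exceeds the assumed limit, and the remaining entries are ordered so that the first one would be the preferred entry distributed to NCC.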

Figure: Entries in the Control Plane and the Forwarding Plane

Conclusion

With the aforementioned logical functions and modules designed in a Middle Ware, the incremental requirements raised by the learning process of computing status can be satisfied:

The dynamicity of running computing status can be restrained and controlled at CSC, NCSDB and ORC.
Service instances can be evaluated in a differentiated manner by registered and configured methods. SRM and SSC are capable of adjusting scheduling strategies and switching evaluation methods.
Identical metadata can be processed in an accumulative manner while attributes of different dimensions are integrated by the registered evaluation methods.
Metadata is not exposed directly but converted into simple metric values. With properly designed semantics of a metric value, appropriate entries can be simply determined.

Security Considerations TBA.

Acknowledgements TBA.

IANA Considerations TBA.