Lexicon
Actuator
A component that moves or controls a mechanism in response to input from a controller.
AdaptiveAI™
An AI system that creates a Configuration Space within Inference Engines so that Inference and Training outputs can be manipulated in real time to meet performance and energy goals. Functionally, AdaptiveAI first makes inference engines configurable and then makes them computationally Self-Aware, treating Goals as first-class objects under software control.
AI Inference Optimization
The process of improving the speed, accuracy, or resource efficiency of machine learning inference tasks.
Algorithmic Complexity
A measure of the computational resources required for an algorithm to solve a problem.
Approximate Computing
A computing technique that trades accuracy for reduced resource consumption, improving energy efficiency and performance.
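One common approximate-computing technique is loop perforation: skip a fraction of the loop iterations and accept a small error in exchange for proportionally less work. The sketch below is illustrative only; the function names and stride parameter are not from the lexicon.

```python
def mean_exact(xs):
    """Average over every element: full cost, full accuracy."""
    return sum(xs) / len(xs)

def mean_perforated(xs, stride=2):
    """Loop perforation: visit only every `stride`-th element,
    trading a little accuracy for proportionally less work."""
    sample = xs[::stride]
    return sum(sample) / len(sample)

data = list(range(1000))
exact = mean_exact(data)        # 499.5
approx = mean_perforated(data)  # 499.0 -- half the work, small error
```

With stride 2, half the elements are touched, yet the result differs from the exact mean by only 0.1%.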
Artificial Intelligence (AI)
The simulation of human intelligence in machines, allowing them to perform tasks such as learning, reasoning, and problem-solving.
Asynchronous Processing
Execution of tasks independently and without waiting for others, improving performance in distributed systems.
Backpropagation
An algorithm in training neural networks that adjusts the weights by propagating errors backward through the network.
Bandwidth
The amount of data that can be transferred over a network or system connection in a given time.
Bayesian Model
A statistical model based on Bayes’ theorem, updating probability estimates as new evidence is available.
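The update rule behind a Bayesian model is a one-liner. The numbers below describe a hypothetical diagnostic test, chosen only to illustrate how a small prior can yield a surprisingly moderate posterior.

```python
def bayes_update(prior, likelihood, evidence):
    """Posterior probability via Bayes' theorem:
    P(H|E) = P(E|H) * P(H) / P(E)."""
    return likelihood * prior / evidence

# Hypothetical test: P(disease) = 0.01, P(positive|disease) = 0.99,
# overall P(positive) = 0.0198.
posterior = bayes_update(prior=0.01, likelihood=0.99, evidence=0.0198)  # 0.5
```

Even with a 99%-accurate test, a 1% prior leaves only even odds after one positive result.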
Big.LITTLE Architecture
A processor architecture combining high-performance and energy-efficient cores to balance power and performance.
Cache Coherency
A mechanism that keeps copies of shared data consistent across the caches of multiple processors, so every core sees up-to-date values.
Cache Miss
Occurs when the data requested by the processor is not found in the cache, causing delays.
CALOREE
A Seqitur™ system optimizing resource usage in heterogeneous mobile architectures by dynamically adjusting power and performance.
Cloud Computing
The delivery of computing services (e.g., storage, databases) over the internet, allowing on-demand resource access.
Concurrency
The ability to make progress on multiple tasks during overlapping time periods, improving system responsiveness and throughput.
Concurrency Control
Ensures multiple tasks or transactions can occur simultaneously without causing conflicts or errors.
Congestion Control
Managing network traffic to prevent overloading and ensure efficient data transmission.
Configuration Space
The set of all possible system configurations that can be adjusted to optimize performance or energy use.
Constraint
A limit or requirement placed on a system’s behavior, such as energy consumption or accuracy.
Control Theory
A branch of engineering that deals with the behavior of dynamic systems, using feedback loops to maintain desired system states.
Core
The processing unit of a CPU that performs computational tasks; multicore processors have multiple cores for parallel processing.
Core Affinity
Binding a specific process to a particular core in a multicore processor, optimizing task performance.
Cost-Performance Trade-off
The balance between minimizing costs and maximizing system performance through resource allocation.
Cross-layer Optimization
An optimization approach considering multiple system stack levels (hardware, software, application) to achieve the best results.
Data Center
A facility housing computing systems and associated components for large-scale data processing.
Data Parallelism
A parallel computing technique where the same operation is applied to multiple data elements simultaneously.
Decision Tree
A machine learning model that predicts or classifies data using a tree-like structure of decisions.
Deep Learning
A subset of machine learning using multi-layered neural networks to model complex patterns in large datasets.
Distributed Learning
A method where machine learning tasks are spread across multiple machines to scale up training or inference.
Distributed System
A system in which multiple computers work together, sharing resources and tasks across a network.
Dynamic Configuration
Adjusting system settings (e.g., power, performance) during operation to optimize resource usage and performance.
Dynamic Optimization
The continuous adjustment of system parameters to maintain optimal performance during changing conditions.
Dynamic Power Management
Adjusting the power usage of a system in real-time to balance energy efficiency and performance.
Dynamic Task Allocation
The ability to distribute and reassign tasks among system resources in real-time based on workload changes.
Dynamic Voltage and Frequency Scaling (DVFS)
A power management technique that adjusts voltage and frequency to reduce energy consumption.
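The payoff of DVFS follows from the rough model that dynamic CPU power scales with C·V²·f, so lowering voltage and frequency together cuts power superlinearly. The function below sketches that first-order model; the specific scaling factors are illustrative, not measured.

```python
def dynamic_power_ratio(v_scale, f_scale):
    """Rough first-order model: dynamic CPU power ~ C * V^2 * f,
    so scaling voltage and frequency together reduces power
    superlinearly relative to performance."""
    return (v_scale ** 2) * f_scale

# Dropping voltage 20% while halving frequency -> ~32% of original power.
ratio = dynamic_power_ratio(v_scale=0.8, f_scale=0.5)
```

Halving frequency alone would halve dynamic power; the accompanying voltage drop buys an additional 36% reduction.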
Edge Computing
A computing model in which data is processed near its source, reducing latency compared with sending it to a distant data center.
Elastic Computing
A system’s ability to automatically scale resources up or down based on workload demands.
Embedded System
A computing system dedicated to specific tasks within larger systems, often with real-time constraints.
Energy Budget
The total amount of energy allocated for use by a system, typically a key factor in mobile or low-power devices.
Energy Efficiency
Achieving performance objectives while minimizing energy consumption, essential in power-constrained systems.
Energy Harvesting
Capturing and storing energy from external sources like solar or kinetic to power devices.
Energy-Proportional Computing
A design principle where energy consumption scales in direct proportion to system utilization.
Energy Scaling
Adjusting a system’s energy consumption based on workload or performance requirements to improve efficiency.
Energy-Quality Trade-off
The balance between reducing energy consumption and maintaining the quality of computational outputs.
Exascale Computing
High-performance computing systems capable of performing at least one exaFLOPS (10¹⁸, a billion billion, floating-point operations per second).
Execution Time
The amount of time taken by a computer to execute a task or run a program from start to finish.
Execution Pipeline
The series of stages through which instructions pass in a processor, from fetching to execution, affecting overall performance.
Exponential Backoff
A strategy where the time between retries is increased exponentially to prevent network congestion.
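The exponential growth of the retry schedule is easy to see in code. This is a minimal sketch; the base delay, factor, and optional jitter parameter are illustrative choices, not part of the definition.

```python
import random

def backoff_delays(base=0.5, factor=2.0, retries=5, jitter=False):
    """Wait times (seconds) before each retry; each delay grows
    exponentially over the previous one."""
    delays = []
    for attempt in range(retries):
        delay = base * (factor ** attempt)
        if jitter:
            # Randomizing within [0, delay] avoids synchronized retry storms.
            delay = random.uniform(0, delay)
        delays.append(delay)
    return delays

schedule = backoff_delays()  # [0.5, 1.0, 2.0, 4.0, 8.0]
```

In practice jitter is usually enabled so that many clients retrying after the same failure do not hammer the network in lockstep.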
Failure Recovery
The system’s ability to restore normal operation after a failure without losing data or causing significant downtime.
Fault Detection
Methods for identifying and diagnosing errors or failures within a system to prevent larger issues.
Fault Management
Detecting, diagnosing, and responding to hardware or software faults to maintain system reliability.
Fault Tolerance
The system’s ability to continue operating correctly even if some components fail.
Feature Extraction
Identifying and selecting relevant data attributes to improve machine learning model performance.
Federated Learning
Distributed machine learning where data is kept locally, and models are trained collectively without sharing sensitive data.
Feedback Loop
A mechanism that uses a system’s current output to adjust its future behavior. A feedback loop responds to changes in the workload, but on its own it does not amount to dynamic configuration toward goals (it is not self-aware).
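The simplest feedback loop is a proportional controller: each iteration corrects by a fraction of the current error. The sketch below is a generic illustration, not a Seqitur™ component; the gain and setpoint values are arbitrary.

```python
def feedback_step(setpoint, measured, gain=0.5):
    """Proportional feedback: correct by a fixed fraction of the
    current error between the target and the measurement."""
    return gain * (setpoint - measured)

# Drive a measured value toward a setpoint of 100;
# with gain 0.5 the error halves on every iteration.
value = 60.0
for _ in range(20):
    value += feedback_step(100.0, value)
```

After 20 iterations the value sits within a few hundred-thousandths of the setpoint, showing the geometric convergence a feedback loop provides.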
Floating Point Operations Per Second (FLOPS)
A measure of a computer’s performance in tasks involving real-number calculations.
Goal-Oriented Computing
A computational approach where systems are designed to meet specific, often changing, objectives like performance, accuracy, or energy use.
Goal Specification
The process of defining what a system should achieve, often expressed as constraints on energy, latency, or performance. Goals may be simple or complex, applied globally or distributed across components, and may range from the initial system design specification to any other construction.
GPU (Graphics Processing Unit)
A specialized processor designed for rendering graphics, increasingly used for general-purpose computing in AI.
Gradient Descent
An optimization algorithm that adjusts model parameters to minimize errors by moving in the direction of steepest descent.
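The core loop of gradient descent fits in a few lines. The learning rate, step count, and the quadratic example function below are illustrative choices only.

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step opposite the gradient to minimize a function."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Minimize f(x) = (x - 3)^2; its gradient is 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)  # converges to ~3.0
```

In neural-network training the same loop runs over millions of parameters at once, with the gradient supplied by backpropagation.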
Graph Neural Network (GNN)
A neural network model designed to operate on graph-structured data.
Green Computing
Designing, using, and disposing of computers in ways that minimize environmental impact.
Hardware Acceleration
Using specialized hardware to perform specific computing tasks faster or more efficiently than a general-purpose CPU.
Heterogeneous Computing
Using different types of processors (e.g., CPU, GPU) to handle different parts of a workload for better performance.
Heterogeneous Memory Systems
Systems that combine different types of memory to optimize performance and storage capacity.
Heterogeneous Systems
Computing systems composed of different types of processors (e.g., CPU, GPU), each optimized for specific tasks to improve performance and energy efficiency.
High Availability (HA)
Systems designed for operational continuity with minimal downtime through redundancy and failover.
High-Performance Computing (HPC)
The use of supercomputers or parallel processing to perform large-scale computations quickly.
Hyperparameter Tuning
Optimizing the parameters that govern a machine learning model’s training process to improve performance.
Idle Power Consumption
The power consumed by a system when it is not actively processing tasks but still powered on.
Inference
The process of using a trained machine learning model to make predictions or decisions based on new data.
Inference Latency
The delay between submitting input to an AI model and receiving output, critical in time-sensitive applications.
Inference Time
The time taken by a machine learning model to make predictions or decisions based on new data inputs after training is complete.
Infrastructure as a Service (IaaS)
A cloud computing model that provides virtualized computing resources over the internet.
Interconnect
The communication link between processors, memory, and other components in a system, affecting data transfer rates.
I/O Bound
When a system’s performance is limited by its input/output operations rather than its computational power.
JouleGuard
A method for managing energy and performance trade-offs, using machine learning and control theory for energy-efficient computing.
Knobs (Configurable Parameters)
Adjustable settings in a system (e.g., clock speed, core usage) that control performance and resource allocation.
Latency
The time delay experienced between the initiation of a process (e.g., user input) and its completion or response.
Latency Constraint
A limit placed on how much time a system has to respond to inputs or complete a process.
Latency Overhead
Delays introduced by system overhead, increasing the time taken to complete tasks or data processing.
Latency Sensitivity
The degree to which system performance depends on minimizing delays in processing or communication.
Learning Rate
A hyperparameter that determines how quickly a machine learning model adjusts its parameters based on error measurements.
Load Balancing
The distribution of workloads across multiple computing resources to optimize performance and prevent overloading.
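Round-robin is the simplest load-balancing policy: hand each incoming task to the next resource in rotation. This sketch is one illustrative policy among many (real balancers also weigh load, latency, and health); the task and server names are made up.

```python
from itertools import cycle

def round_robin_assign(tasks, servers):
    """Assign each task to the next server in rotation, spreading
    work evenly across the pool."""
    rotation = cycle(servers)
    return {task: next(rotation) for task in tasks}

plan = round_robin_assign(["t1", "t2", "t3", "t4"], ["a", "b"])
# {"t1": "a", "t2": "b", "t3": "a", "t4": "b"}
```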
Load Distribution
The balancing of tasks across different system resources to prevent overloading any single component.
Load Shedding
Dropping or postponing lower-priority tasks to ensure high-priority tasks are processed efficiently during overload.
Low-Latency Optimization
Techniques to minimize delays in data processing, critical in real-time systems like video streaming.
Low-Power Mode
A system state where power consumption is reduced, typically extending battery life at the cost of reduced performance.
Machine Learning
A field of artificial intelligence where systems learn from data and improve their performance without being explicitly programmed for every task.
Memory Bandwidth
The amount of data that can be read or written to memory by a processor in a given time.
Memory Hierarchy
The organization of different types of memory (registers, cache, RAM, disk) to balance speed and cost efficiency.
Memory Management Unit (MMU)
A hardware component that manages memory access and virtual address translation in a system.
Microarchitecture
The internal structure of a processor, determining how instructions are executed and how performance is optimized.
Model Compression
Techniques to reduce the size and complexity of machine learning models, improving their efficiency.
Multicore Processor
A single computing unit with two or more independent processing cores for performing parallel processing.
Multithreading
Executing multiple threads concurrently within a single process to improve performance.
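In Python, the standard-library `concurrent.futures.ThreadPoolExecutor` gives a compact illustration of multithreading; the helper name and worker count below are illustrative choices.

```python
from concurrent.futures import ThreadPoolExecutor

def threaded_map(func, items, workers=4):
    """Run func over items on a pool of threads; map preserves the
    input order of results. Threads help most when the work is
    I/O-bound and can overlap waiting."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))

squares = threaded_map(lambda x: x * x, range(5))  # [0, 1, 4, 9, 16]
```

For CPU-bound work in CPython, process pools or native extensions are usually preferred because the global interpreter lock limits thread-level parallelism.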
Network Latency
The delay in sending and receiving data across a network, affecting performance in time-sensitive applications.
Node
A single computing device or server within a distributed system or network.
Non-Volatile Memory (NVM)
Memory that retains stored data even when the system is powered off, such as flash storage.
Normalization
A data preprocessing technique that adjusts data values to a common scale, improving machine learning model training.
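Min-max scaling is one common normalization; this sketch shows that single variant, with an arbitrary example input.

```python
def min_max_normalize(values):
    """Scale values linearly onto [0, 1] using the observed
    minimum and maximum."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max_normalize([10, 20, 30])  # [0.0, 0.5, 1.0]
```

Putting features on a common scale keeps large-valued attributes from dominating gradient-based training.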
Offloading
Transferring computational tasks from one system or component to another, typically for optimization or load balancing.
On-Demand Computing
A computing model where resources are dynamically provided based on the current system or user needs.
Optimization
The process of making a system or design as effective or functional as possible under given constraints, such as performance or energy.
Optimization Algorithm
A mathematical method for adjusting system parameters to achieve the best possible performance.
Overfitting
When a machine learning model performs well on training data but poorly on unseen data, often due to excessive complexity.
Overhead
The extra processing or resource costs associated with managing or maintaining a system’s operation.
Packet Loss
Data packets lost during network transmission, leading to degraded performance and the need for retransmission.
Parallel Processing
The simultaneous execution of multiple tasks by dividing them among multiple processing units to improve speed.
Parallelism Granularity
The size of tasks executed in parallel, with fine-grained parallelism involving small tasks and coarse-grained parallelism involving larger ones.
Pipelining
A technique where multiple instruction stages are overlapped to improve instruction throughput in processor design.
Performance Bottleneck
A limitation in a system’s design that restricts its overall performance, often due to resource contention.
Performance Hash Table (PHT)
A data structure used to store and predict the performance and energy characteristics of different system configurations.
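A PHT can be pictured as a plain dictionary from configurations to measured characteristics, which a runtime queries when goals change. The schema, function names, and numbers below are a hypothetical sketch, not the actual PHT implementation.

```python
# Hypothetical schema: configuration -> measured (throughput, watts).
pht = {}

def record(config, throughput, watts):
    """Store the measured behavior of one configuration."""
    pht[config] = (throughput, watts)

def best_under_power_cap(cap):
    """Return the highest-throughput configuration whose measured
    power stays within the cap, or None if none qualifies."""
    feasible = {c: m for c, m in pht.items() if m[1] <= cap}
    return max(feasible, key=lambda c: feasible[c][0]) if feasible else None

record(("4 cores", "2.0 GHz"), throughput=120.0, watts=35.0)
record(("2 cores", "1.2 GHz"), throughput=70.0, watts=18.0)
choice = best_under_power_cap(cap=20.0)  # ("2 cores", "1.2 GHz")
```

The same lookup answers the dual question — lowest power meeting a throughput floor — by swapping the roles of the two fields.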
Performance Monitoring
Tracking system performance metrics (e.g., CPU usage) to ensure optimal operation.
Performance Prediction
Using models or algorithms to estimate system performance under different configurations or workloads.
Power Budget
The total amount of power allocated to a system or process, critical in optimizing energy efficiency.
Power Capping
Limiting a system’s power consumption to stay within an energy budget or prevent overheating.
Power Efficiency
Maximizing system performance while consuming the least possible energy.
Predictive Analytics
Using data and machine learning techniques to predict future events or outcomes based on historical data.
Predictive Control
Using models to predict future system behavior and adjust actions accordingly to meet performance goals.
Predictive Modeling
The use of statistical techniques and machine learning to forecast future system states or behavior based on historical data.
Proactive Scaling
Adjusting system resources in anticipation of future workload increases or performance needs.
Quantum Computing
A computing paradigm using quantum bits (qubits) and quantum mechanics principles to perform certain complex calculations far faster than classical computers can.
QoR (Quality of Result)
A measure of how well a system meets its performance or accuracy targets, especially in approximate computing.
Quality of Service (QoS)
A measure of the overall system performance in terms of reliability, responsiveness, and availability.
Real-Time Adjustment
The ability of a system to modify its behavior based on current data or conditions.
Redundancy
Extra components or systems that ensure continued operation in case of failure, improving reliability and availability.
Reinforcement Learning
A machine learning approach where an agent learns to make decisions by receiving rewards or penalties for its actions.
Reinforcement Signal
Feedback used by a reinforcement learning agent to guide its decision-making toward desired outcomes.
Resource Allocation
Distributing available resources (e.g., CPU, memory) to different tasks or processes based on demand.
Resource Contention
A situation where multiple tasks compete for the same computing resources, leading to reduced performance.
Resource Contention Management
Techniques to manage competition for resources between multiple tasks or processes to optimize performance.
Resource Efficiency
The effective use of system resources to achieve optimal performance with minimal waste.
Response Time
The total time taken by a system to react to input, including latency and processing time.
Runtime Optimization
The real-time adjustment of system parameters or resources to improve performance during execution.
Scheduling Algorithm
A set of rules determining the order in which tasks are executed by a processor, optimizing performance or resource use.
Scalability
The ability of a system to handle increased workloads or to be expanded by adding resources without compromising performance.
Scalable Vector Extensions (SVE)
An instruction set architecture extension improving performance in vector processing, used in scientific computing.
Scaling Factor
A ratio used to increase or decrease system performance or resource usage based on demand.
Self-Aware™ Computing
A system’s ability to monitor and reason about its own performance and to autonomously adjust its configuration to meet goals such as energy efficiency as the workload changes at any instant.
Self-Aware™ Capable Computing
A system with a variable Configuration Space that has been designed to be computationally available, so that goals can manipulate it in the future.
Self-Optimizing System
A system that automatically adjusts its configuration and behavior to achieve optimal performance under changing conditions.
Server Cluster
A group of linked servers working together to handle workloads, improving system performance, reliability, or scalability.
Simulation
Using models to replicate system behavior under different conditions, often for performance testing or optimization.
Software as a Service (SaaS)
A cloud computing model providing software applications over the internet, allowing on-demand access without local installation.
Sparse Representation
A data representation where only a small subset of values is non-zero, reducing computation and memory requirements.
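The savings from sparsity show up directly in a dot product that touches only the stored entries. This dictionary-of-indices encoding is one minimal illustration; production systems use packed formats such as CSR.

```python
def to_sparse(dense):
    """Keep only the non-zero entries as {index: value}."""
    return {i: v for i, v in enumerate(dense) if v != 0}

def sparse_dot(a, b):
    """Dot product that touches only indices present in both
    sparse vectors, skipping all the zeros."""
    return sum(v * b[i] for i, v in a.items() if i in b)

a = to_sparse([0, 3, 0, 0, 2])
b = to_sparse([1, 4, 0, 0, 5])
result = sparse_dot(a, b)  # 3*4 + 2*5 = 22
```

Here two length-5 vectors need only two multiplications; in pruned neural networks the same idea skips the large majority of weights.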
Surrogate Model
A simplified model used in place of a more complex one to approximate performance or behavior with less resource use.
System Bottleneck
A limiting factor within a system that reduces its overall efficiency or speed.
Task Parallelism
A form of parallel computing where different tasks or processes are executed simultaneously across multiple processors.
Task Scheduling
The process of assigning tasks to system resources at specific times to optimize performance.
Temporal Scalability
A system’s ability to maintain performance over time, even as workloads or conditions change.
Thermal Throttling
A mechanism reducing system performance to prevent overheating when temperatures exceed safe thresholds.
Throughput
The amount of work a system can process in a given period, often used to measure system performance.
Throughput Optimization
Maximizing the rate at which a system processes tasks or data to improve overall performance.
Trade-off Space
The set of possible compromises between conflicting objectives (e.g., performance vs. energy consumption) in system optimization.
Training Data
A dataset used to train a machine learning model, helping it learn patterns for predictions or decisions.
Underfitting
A machine learning problem where a model is too simple to capture data patterns, resulting in poor performance.
Virtual Machine (VM)
A software-emulated computer system allowing multiple operating systems to run on a single physical machine.
Virtualization
Creating a virtual version of a resource (e.g., server, storage) to allow multiple systems to share the same hardware.
Voltage Scaling
Reducing energy consumption by lowering the voltage supplied to a processor, usually at the cost of reduced performance.
Workload
The total set of tasks a system must handle, affecting performance and resource allocation.
Workload-Aware Optimization
Adjusting system configurations based on the specific characteristics and demands of the current workload.
Workload Diversity
The variety of tasks handled by a system, affecting optimization strategies for performance and energy efficiency.
Workload Sensitivity
How a system’s performance or energy use is affected by different types of workloads.
