NIST AI Glossary
NIST has released “The Language of Trustworthy AI: An In-Depth Glossary of Terms” (beta) to promote a shared understanding and improve communication among individuals and organizations seeking to operationalize trustworthy and responsible AI through approaches such as the NIST AI Risk Management Framework (AI RMF). The glossary is non-sector specific and use-case agnostic, designed to be flexible for all organizations and sectors of society to use. A document describing the motivation and development of the glossary is also being released. The goal of this common vocabulary is not to declare one specific meaning for identified terms, but to provide interested parties with a broader awareness of the multiple meanings of commonly used terms within the interdisciplinary field of Trustworthy and Responsible AI. The following is an abbreviated table (11/1/24).
Terms | Definition 1 | Citation 1 | Definition 2 | Citation 2 | Definition 3 | Citation 3 | Definition 4 | Citation 4 | Definition 5 | Citation 5 | Related terms and synonyms | Legal definition applicable |
---|---|---|---|---|---|---|---|---|---|---|---|---|
accountability | 1) relates to an allocated responsibility. The responsibility can be based on regulation or agreement or through assignment as part of delegation; 2) For systems, a property that ensures that actions of an entity can be traced uniquely to the entity; 3) In a governance context, the obligation of an individual or organization to account for its activities, for completion of a deliverable or task, accept the responsibility for those activities, deliverables or tasks, and to disclose the results in a transparent manner. | ISO/IEC_TS_5723:2022(en) | "accountable" (adjective vs. noun): answerable for actions, decisions, and performance | ISO/IEC_TS_5723:2022(en) | ||||||||
accuracy | Closeness of computations or estimates to the exact or true values that the statistics were intended to measure. | OECD | A qualitative assessment of correctness or freedom from error. | FDA_Glossary | The measure of an instrument's capability to approach a true or absolute value. It is a function of precision and bias. | FDA_Glossary | The accuracy of a machine learning system is measured as the percentage of correct predictions or classifications made by the model over a specific data set. It is typically estimated using a test or "hold out" sample, other than the one(s) used to construct the model. Its complement, the error rate, is the proportion of incorrect predictions on the same data. | Raynor | measure of closeness of results of observations, computations, or estimates to the true values or the values accepted as being true | ISO/IEC_TS_5723:2022(en) | ||
active learning | A proposed method for modifying machine learning algorithms by allowing them to specify test regions to improve their accuracy. At any point, the algorithm can choose a new point x, observe the output and incorporate the new (x, y) pair into its training base. It has been applied to neural networks, prediction functions, and clustering functions. | Raynor | Active learning (also called “query learning,” or sometimes “optimal experimental design” in the statistics literature) is a subfield of machine learning and, more generally, artificial intelligence. The key hypothesis is that, if the learning algorithm is allowed to choose the data from which it learns—to be “curious,” if you will—it will perform better with less training. | settles_active_2009 | the process of learning through activities and/or discussion in class, as opposed to passively listening to an expert. | Freeman_et_al_2014 | ||||||
active learning agent | [a machine learning algorithm that can] decide what actions to take [with regards to its training data, in contrast to a passive learning agent, which is limited to a fixed policy]. | Russell_and_Norvig | |||||||| passive learning agent | |
activity | Work that an organization performs using business processes; can be singular or compound. | IEEE_Guide_IPA | Set of cohesive tasks of a process. | CSRC | ||||||||
adaptive dynamic programming | An adaptive dynamic programming (or ADP) agent takes advantage of the constraints among the utilities of states by learning the transition model that connects them and solving the corresponding Markov decision process using dynamic programming. | Russell_and_Norvig | A means of learning a model and a reward function from observations that then uses value or policy iteration to obtain the utilities or an optimal policy; makes optimal use of the local constraints on utilities of states imposed through the neighborhood structure of the environment. | Russell_and_Norvig | ||||||||
adaptive learning | Updating predictive models online during their operation to react to concept drifts | Gama,_Joao | ||||||||||
adversarial example | Machine learning input sample formed by applying a small but intentionally worst-case perturbation ... to a clean example, such that the perturbed input causes a learned model to output an incorrect answer. | NISTIR_8269_Draft | Samples generated from real samples with carefully designed imperceptible perturbations | Zhang,_Yonggang | |||||| adversarial perturbation | |
adverse action notice | A notification of i) a refusal to grant credit in substantially the amount or on substantially the terms requested in an application unless the creditor makes a counteroffer (to grant credit in a different amount or on other terms) and the applicant uses or expressly accepts the credit offered; ii) A termination of an account or an unfavorable change in the terms of an account that does not affect all or substantially all of a class of the creditor's accounts or iii) A refusal to increase the amount of credit available to an applicant who has made an application for an increase. | ECOA | ||||||||||
adverse impact ratio | privileged and unprivileged groups receiving different outcomes irrespective of the decision maker’s intent and irrespective of the decision-making procedure. Quantified as the ratio: disparate impact ratio = $P(\hat{y}(X) = \text{fav} \mid Z = \text{unpr}) / P(\hat{y}(X) = \text{fav} \mid Z = \text{priv})$, where $\hat{y}(X) = \text{fav}$ is the favorable label, $Z = \text{priv}$ is the privileged group, and $Z = \text{unpr}$ is the unprivileged group. | Varshney,_Kush | |||||||| disparate impact ratio, relative risk ratio | |
agile | a development approach that delivers software in increments by following the principles of the Manifesto for Agile Software Development. | Gartner | A philosophy and methodology used to describe the continuous, iterative process to develop and deliver software and other digital technologies. User requirements and feedback inform incremental development and delivery by developers. | NSCAI | ||||||||
AI principles | [An overarching concept, value, belief, or norm that guides AI development, testing, and deployment across the AI lifecycle. The OECD] identifies five complementary values-based principles for the responsible stewardship of trustworthy AI and calls on AI actors to promote and implement them: inclusive growth, sustainable development and well-being; human-centred values and fairness; transparency and explainability; robustness, security and safety; and accountability. | OECD_CAI_recommendation | ||||||||||
algorithm | A set of computational rules to be followed to solve a mathematical problem. More recently, the term has been adopted to refer to a process to be followed, often by a computer. | Comptroller_Office | precise rules for transforming specified inputs into specified outputs in a finite number of steps | knuth_art_1981 | algorithms are step-by-step procedures for solving problems. For concreteness, we can think of them simply as being computed programs, written in some precise computer languages | garey_computers_1979 | ||||||
algorithmic aversion | biased assessment of an algorithm which manifests in negative behaviours and attitudes towards the algorithm compared to a human agent. | Ekaterina_et_al_2020 | ||||||||||
alignment | ensur[ing] that powerful AI is properly aligned with human values. ... The challenge of alignment has two parts. The first part is technical and focuses on how to formally encode values or principles in artificial agents so that they reliably do what they ought to do. ... The second part of the value alignment question is normative. It asks what values or principles, if any, we ought to encode in artificial agents. | Gabriel_2020 | ||||||||||
amplification | [an act of amplifying, which is] to make larger or greater (as in amount, importance, or intensity). | Merriam-Webster_amplify | Let [construct space] $Y'$ and [prediction space] $\hat{Y}$ be categorical. Then, a model exhibits disparity amplification if $d_{tv}(\hat{Y} \mid Z=0, \hat{Y} \mid Z=1) > d_{tv}(Y' \mid Z=0, Y' \mid Z=1)$. $d_{tv}$ is the total variation distance defined as follows. Let $Y_0$ and $Y_1$ be categorical random variables with finite supports $\mathcal{Y}_0$ and $\mathcal{Y}_1$. Then, the total variation distance between $Y_0$ and $Y_1$ is $d_{tv}(Y_0, Y_1) = \frac{1}{2} \sum_{y \in \mathcal{Y}_0 \cup \mathcal{Y}_1} \lvert \Pr[Y_0 = y] - \Pr[Y_1 = y] \rvert$. In the special case where $Y_0, Y_1 \in \{0, 1\}$, the total variation distance can also be expressed as $\lvert \Pr[Y_0 = 1] - \Pr[Y_1 = 1] \rvert$. | yeom_avoiding_2021 | ||||||||
analytics | Analytics is the application of scientific & mathematical methods to the study & analysis of problems involving complex systems. There are three distinct types of analytics: * Descriptive Analytics gives insight into past events, using historical data. * Predictive Analytics provides insight on what will happen in the future. * Prescriptive Analytics helps with decision making by providing actionable advice. | informs_analytics_2022 | ||||||||||
annotation | Further documentation accompanying a requirement. | IEEE_Soft_Vocab | [the act of] mak[ing] or furnish[ing] critical or explanatory notes or comment | Merriam-Webster_annotate | ||||||||
anomaly | Anything observed in the documentation or operation of a system that deviates from expectations based on previously verified system, software, or hardware products or reference documents. | IEEE_Soft_Vocab | Condition that deviates from expectations, based on requirements specifications, design documents, user documents, or standards, or from someone's perceptions or experiences. | SP800-160 | ||||||||
anonymization | The process in which individually identifiable data is altered in such a way that it no longer can be related back to a given individual. Among many techniques, there are three primary ways that data is anonymized. Suppression is the most basic version of anonymization and it simply removes some identifying values from data to reduce its identifiability. Generalization takes specific identifying values and makes them broader, such as changing a specific age (18) to an age range (18-24). Noise addition takes identifying values from a given data set and switches them with identifying values from another individual in that data set. Note that all of these processes will not guarantee that data is no longer identifiable and have to be performed in such a way that does not harm the usability of the data. | IAPP_Privacy_Glossary | process that removes the association between the identifying dataset and the data subject | CSRC | ||||||||
anthropomorphism | the attribution of distinctively human-like feelings, mental states, and behavioral characteristics to inanimate objects, animals, and in general to natural phenomena and supernatural entities | Anthropomorphism_in_AI_2020 | a particular human-like interpretation of existing physical features and behaviors that goes beyond what is directly observable | Anthropomorphism_in_AI_2020 | ||||||||
application | A software program hosted by an information system. | SP800-37 | A hardware/software system implemented to satisfy a particular set of requirements. | CSRC | software or a program that is specific to the solution of an application problem | aime_measurement_2022 citing ISO/IEC TR 24030 | ||||||
application programming interface (API) | a software contract between the application and client, expressed as a collection of methods or functions. . . it defines the available functions you can execute; . . . the intermediary interface between the client and the application. | Hands-On_Smart_Contract_Dev | ||||||||||
artificial intelligence (AI) system | an engineered or machine-based system that can, for a given set of objectives, generate outputs such as predictions, recommendations, or decisions influencing real or virtual environments. AI systems are designed to operate with varying levels of autonomy | NIST AI RMF (Adapted from: OECD Recommendation on AI:2019; ISO/IEC 22989:2022). | ||||||||||
artificial intelligence learning | The ingestion of a corpus, application of semantic mapping, and relevant ontology of structured and/or unstructured data that yields inference and correlation leading to the creation of useful conclusive or predictive capabilities in a given knowledge domain. Strong AI learning also includes the capability of creating unique hypotheses, attributing data relevance, processing data relationships, and updating its own lines of inquiry to further the usefulness of its purpose. | IEEE_Guide_IPA | ||||||||||
artificial narrow intelligence (ANI) | [an AI system that] is designed to accomplish a specific problem-solving or reasoning task. | OECD_Artificial_Intelligence_in_Society | |||||||| weak intelligence; applied intelligence | |
artificial neural networks | A computing system, made up of a number of simple, highly interconnected processing elements, which processes information by its dynamic state response to external inputs. | Reznik,_Leon | Definition 1. A directed graph is called an Artificial Neural Network (ANN) if it has: at least one start node (or Start Element; SE); at least one end node (or End Element; EE); at least one Processing Element (PE); all the nodes used must be Processing Elements (PEs), except start nodes and end nodes; a state variable $n_i$ associated with each node $i$; a real valued weight $w_{ki}$ associated with each link $(ki)$ from node $k$ to node $i$; a real valued bias $b_i$ associated with each node $i$; at least two of the multiple PEs connected in parallel; a learning algorithm that helps to model the desired output for given input; a flow on each link $(ki)$ from node $k$ to node $i$, that carries exactly the same flow which equals to $n_k$ caused by the output of node $k$; each start node is connected to at least one end node, and each end node is connected to at least one start node; no parallel edges (each link $(ki)$ from node $k$ to node $i$ is unique). | |||||||||
assessment | Action of applying specific documented criteria to a specific software module, package or product for the purpose of determining acceptance or release of the software module, package or product. | IEEE_Soft_Vocab | the action or an instance of making a judgment about something : the act of assessing something : APPRAISAL | Merriam-Webster_assessment | ||||||||
attack | Action targeting a learning system to cause malfunction. | NISTIR_8269_Draft | Any kind of malicious activity that attempts to collect, disrupt, deny, degrade, or destroy information system resources or the information itself. | CSRC | ||||||||
attribute | Property associated with a set of real or abstract things that is some characteristic of interest. | IEEE_Soft_Vocab | property or characteristic of an object that can be distinguished quantitatively or qualitatively by human or automated means | aime_measurement_2022, citing ISO/IEC TR 24029-1 | ||||||||
audit | Systematic, independent, documented process for obtaining records, statements of fact, or other relevant information and assessing them objectively, to determine the extent to which specified requirements are fulfilled. | IEEE_Soft_Vocab | To conduct an independent review and examination of system records and activities in order to test the adequacy and effectiveness of data security and data integrity procedures, to ensure compliance with established policy and operational procedures, and to recommend any necessary changes. | FDA_Glossary | Independent examination of a software product, software process, or set of software processes to assess compliance with specifications, standards, contractual agreements, or other criteria | NASA_Soft_Standards | Independent review conducted to compare the various aspects of the laboratory’s performance with a standard for that performance. Also defined as a systematic, independent and documented process for obtaining audit evidence and evaluating it objectively to determine the extent to which audit criteria are fulfilled. | UNODC_Glossary_QA_GLP | ||||
audit log | A chronological record of system activities, including records of system accesses and operations performed in a given period. | SP800-37 | ||||||||||
authenticity | property that an entity is what it claims to be | ISO/IEC_TS_5723:2022(en) | ||||||||||
automation | Independent machine-managed choreography of the operation of one or more digital systems. | IEEE_Guide_IPA | conversion of processes or equipment to automatic operation, or the results of the conversion | IEEE_Soft_Vocab | The system functions with no/little human operator involvement; however, the system performance is limited to the specific actions it has been designed to do. Typically these are well-defined tasks that have predetermined responses (i.e., simple rule-based responses). | DOD_TEVV | ||||||
automation bias | over-relying on the outputs of AI systems | David_Leslie_Morgan_Briggs | ||||||||||
autonomic | A monitor-analyze-plan-execute (MAPE) computer system capable of sensing environments, interpreting policy, accessing knowledge (data --- information --- knowledge), making decisions, and initiating dynamically assembled routines of choreographed activity to both complete a process and update the set of environmental variables that enables the autonomic system to self-manage its own operation and the processes it oversees. An autonomic system is identified by eight characteristics: a) Knows the resources to which it has access, what its capabilities and limitations are, and how and why it is connected to other systems. b) Is able to configure and reconfigure itself depending on the changing computing environment. c) Is able to optimize its performance to ensure the most efficient computing process. d) Is able to work around encountered problems either by repairing itself or routing functions away from the trouble. e) Is able to detect, identify, and protect itself against various types of attacks to maintain overall system security and integrity. f) Is able to adapt to its environment as it changes by interacting with neighboring systems and establishing communication protocols. g) Relies on open standards and requires access to proprietary environments to achieve full performance. h) Is able to anticipate the demand on its resources transparently to users. | IEEE_Guide_IPA | ||||||||||
autonomous vehicle | [an] automobile, bus, tractor, combine, boat, forklift, etc. . . . capable of sensing its environment and moving safely with little or no human input. | Introduction_to_Information_Systems | ||||||||||
autonomy | A system’s level of independence from human involvement and ability to operate without human intervention. [Different AI systems have different levels of autonomy.] An autonomous system has a set of learning, adaptive and analytical capabilities to respond to situations that were not pre-programmed or anticipated (i.e., decision-based responses) prior to system deployment. Autonomous or semi-autonomous AI systems can be characterised as "human-in-the-loop", "human-on-the-loop", or "human-out-of-the-loop" systems depending on their level of meaningful involvement of human beings. | TTC6_Taxonomy_Terminology | ||||||||||
availability | Ensuring timely and reliable access to and use of information. | SP800-37 | The property that data or information is accessible and usable upon demand by an authorized person. | NIST_SP_800 | property of being accessible and usable on demand by an authorized entity | ISO/IEC_TS_5723:2022(en) | ||||||
back-testing | A form of outcomes analysis that involves the comparison of actual outcomes with modeled forecasts during a development sample time period (in-sample back-testing) and during a sample period not used in model development (out-of-time back-testing), and at an observation frequency that matches the forecast horizon or performance window of the model. | Comptroller_Office | ||||||||||
batched automation | Process automation execution of intentionally segregated work processes that are able to be processed irrespective of their contextual placement within a service. | IEEE_Guide_IPA | ||||||||||
benchmark | Standard against which results can be measured or assessed; Procedure, problem, or test that can be used to compare systems or components to each other or to a standard. | IEEE_Soft_Vocab | An alternative prediction or approach used to compare a model’s inputs and outputs to estimates from alternative internal or external data or models. | Comptroller_Office | The term benchmarking is used in machine learning (ML) to refer to the evaluation and comparison of ML methods regarding their ability to learn patterns in ‘benchmark’ datasets that have been applied as ‘standards’. Benchmarking could be thought of simply as a sanity check to confirm that a new method successfully runs as expected and can reliably find simple patterns that existing methods are known to identify. | olson_pmlb_2017 | ||||||
bias | A systematic error. In the context of fairness, we are concerned with unwanted bias that places privileged groups at systematic advantage and unprivileged groups at systematic disadvantage. | AI_Fairness_360 | (computational bias) An effect which deprives a statistical result of representativeness by systematically distorting it, as distinct from a random error which may distort on any one occasion but balances out on the average. | OECD | (systemic bias) systematic difference in treatment of certain objects, people or groups in comparison to others | measurement_iso22989_2022 | (mathematical) A point estimator $\hat{\theta}$ is said to be an unbiased estimator of $\theta$ if $E(\hat{\theta}) = \theta$ for every possible value of $\theta$. If $\hat{\theta}$ is not unbiased, the difference $E(\hat{\theta}) - \theta$ is called the bias of $\hat{\theta}$ | devore_probability_2004 | ||||
bias mitigation algorithm | A procedure for reducing unwanted bias in training data or models. | AI_Fairness_360 | ||||||||||
bias testing | As it relates to disparate impact, courts and regulators have utilized or considered as acceptable various statistical tests to evaluate evidence of disparate impact. Traditional methods of statistical bias testing look at differences in predictions across protected classes, such as race or sex. In particular, courts have looked to statistical significance testing to assess whether the challenged practice likely caused the disparity and was not the result of chance or a nondiscriminatory factor. | SP1270 | ||||||||||
big data | consists of extensive datasets primarily in the characteristics of volume, variety, velocity, and/or variability that require a scalable architecture for efficient storage, manipulation, and analysis | NIST_1500 | ||||||||||
binning | a technique of lumping small ranges of values together into categories, or "bins," for the purpose of reducing the variability (removing some of the fine structure) in a data set. | Pyle,_Dorian_Data_Preparation_as_a_Process | ||||||||||
biometric data | personal data resulting from specific technical processing relating to the physical, physiological or behavioural characteristics of a natural person, which allow or confirm the unique identification of that natural person, such as facial images or dactyloscopic data; | GDPR | an individual’s physiological, biological, or behavioral characteristics, including information pertaining to an individual’s deoxyribonucleic acid (DNA), that is used or is intended to be used singly or in combination with each other or with other identifying data, to establish individual identity. Biometric information includes, but is not limited to, imagery of the iris, retina, fingerprint, face, hand, palm, vein patterns, and voice recordings, from which an identifier template, such as a faceprint, a minutiae template, or a voiceprint, can be extracted, and keystroke patterns or rhythms, gait patterns or rhythms, and sleep, health, or exercise data that contain identifying information. | CCPA | A measurable physical characteristic or personal behavioral trait used to recognize the identity, or verify the claimed identity, of an applicant. Facial images, fingerprints, and iris scan samples are all examples of biometrics. | SP800-12 | |||| personal data; processing | |
boosting | A machine learning technique that iteratively combines a set of simple and not very accurate classifiers (referred to as "weak" classifiers) into a classifier with high accuracy (a "strong" classifier) by upweighting the examples that the model is currently misclassifying | aime_measurement_2022, citing Machine Learning Glossary by Google | ||||||||||
breach | The loss of control, compromise, unauthorized disclosure, unauthorized acquisition, or any similar occurrence where: a person other than an authorized user accesses or potentially accesses personally identifiable information; or an authorized user accesses personally identifiable information for an other than authorized purpose. | CSRC | ||||||||||
broad artificial intelligence (broad AI) | Complex, computational, cognitive automation system capable of providing descriptive, predictive, prescriptive, and limited deductive analytics with relevance and accuracy exceeding human expertise in a broad, logically related set of knowledge domains. | IEEE_Guide_IPA | ||||||||||
built-in test | Equipment or software embedded in the operational components or systems, as opposed to external support units, which perform a test or sequence of tests to verify mechanical or electrical continuity of hardware, or the proper automatic sequencing, data processing, and readout of hardware or software systems. | SP1011 | ||||||||||
bug-bounty | Reward given to independent security researchers, penetration testers, and white hat hackers for discovering exploitable software vulnerabilities and sharing this knowledge with the operator of a particular bug-bounty program (BBP). | Kuehn,_Andreas | ||||||||||
business process | A defined set of business activities that represent the steps or tasks required to achieve a business objective, including the flow and use of information, participants, and human or digital resources. | IEEE_Guide_IPA | ||||||||||
business process management | Discipline involving any combination of modeling, automation, execution, control, measurement and optimization of business activity flows, in support of enterprise goals, spanning systems, employees, customers, and partners within and beyond the enterprise boundaries. | IEEE_Guide_IPA | ||||||||||
business rule | Definition, constraint, dependency, or decision criteria that determine the method of execution of a task or tasks, or influences the order of execution of a task or tasks. Business rules assert control, or influence the behavior, of a business process within computing systems. | IEEE_Guide_IPA | ||||||||||
calibration | A comparison between a device under test and an established standard, such as UTC(NIST). When the calibration is finished, it should be possible to state the estimated time offset and/or frequency offset of the device under test with respect to the standard, as well as the measurement uncertainty. | CSRC | operation that, under specified conditions, in a first step, establishes a relation between the quantity values with measurement uncertainties provided by measurement standards and corresponding indications with associated measurement uncertainties and, in a second step, uses this information to establish a relation for obtaining a measurement result from an indication | aime_measurement_2022, citing ISO/IEC Guide 99 | Set of operations that establish, under specified conditions, the relationship between values indicated by a measuring instrument or measuring system, or values represented by a material measure, and the corresponding known values of a measurand. | UNODC_Glossary_QA_GLP | ||||||
capability | measure of capacity and the ability of an entity, person or organization to achieve its objectives | ISO/IEC_TS_5723:2022(en) | ||||||||||
case | Single entry, single exit multiple way branch that defines a control expression, specifies the processing to be performed for each value of the control expression, and returns control in all instances to the statement immediately following the overall construct. | IEEE_Soft_Vocab | ||||||||||
chatbot | Conversational agent that dialogues with its user (for example: empathic robots available to patients, or automated conversation services in customer relations). | COE_AI_Glossary | ||||||||||
choreography | An ordered sequence of system-to-system message exchanges between two or more participants. In choreography, there is no central controller, responsible entity, or observer of the process. | IEEE_Guide_IPA | ||||||||||
classification | When the output is one of a finite set of values (such as sunny, cloudy or rainy), the learning problem is called classification, and is called Boolean or binary classification if there are only two values. | AIMA | task of assigning collected data to target categories or classes. | aime_measurement_2022, citing ISO/IEC TR 24030 | ||||||||
classifier | A model that predicts categorical labels from features. | AI_Fairness_360 | ||||||||||
clustering | Detecting potentially useful clusters of input examples. | AIMA | The basic problem of clustering may be stated as follows: Given a set of data points, partition them into a set of groups which are as similar as possible. | aggarwal_clustering_2013 | the tendency for items to be consistently grouped together in the course of recall. This grouping typically occurs for related items. It is readily apparent in memory tasks in which items from the same category, such as nonhuman animals, are recalled together. | APA_clustering | ||||||
cognitive automation | The identification, assessment, and application of available machine learning algorithms for the purpose of leveraging domain knowledge and reasoning to further automate the machine learning already present in a manner that may be thought of as cognitive. With cognitive automation, the system performs corrective actions driven by knowledge of the underlying analytics tool itself, iterates its own automation approaches and algorithms for more expansive or more thorough analysis, and is thereby able to fulfill its purpose. The automation of the cognitive process refines itself and dynamically generates novel hypotheses that it can likewise assess against its existing corpus and other information resources. | IEEE_Guide_IPA | ||||||||||
cognitive computing | Complex computational systems designed to — Sense (perceive the world and collect data); — Comprehend (analyze and understand the information collected); - Act (make informed decisions and provide guidance based on this analysis in an independent way); and — Adapt (adapt capabilities based on experience) in ways comparable to the human brain. | IEEE_Guide_IPA | ||||||||||
column | In the context of relational databases, a column is a set of data values, all of a single type, in a table. | techopedia_column_2022 | ||||||||||
computer vision | The digital process of perceiving and learning visual tasks in order to interpret and understand the world through cameras and sensors. | NSCAI | An image understanding task that automatically builds a description not only of the image itself, but of the three dimensional scene that it depicts. | NBSIR_82-2582 | ||||||||
concept drift | Use of a system outside the planned domain of application, and a common cause of performance gaps between laboratory settings and the real world. | SP1270 | an online supervised learning scenario when the relation between the input data and the target variable changes over time. | Gama,_Joao | Systems that classify or predict a concept (e.g., credit ratings or computer intrusion monitors) over time can suffer performance loss when the concept they are tracking changes. This is referred to as concept drift. This can either be a natural process that occurs without a reference to the system, or an active process, where others are reacting to the system (e.g., virus detection). | Raynor | ||||||
confidentiality | Data confidentiality is a property of data, usually resulting from legislative measures, which prevents it from unauthorized disclosure. | OECD | Preserving authorized restrictions on information access and disclosure, including means for protecting personal privacy and proprietary information. | CSRC | The property that data or information is not made available or disclosed to unauthorized persons or processes. | NIST_SP_800 | A property that information is not disclosed to users, processes, or devices unless they have been authorized to access the information. | CISA | ||||
confusion matrix | A matrix showing the predicted and actual classifications. A confusion matrix is of size LxL, where L is the number of different label values | Kohavi,_Ron | ||||||||||
consent | ‘Consent’ of the data subject means any freely given, specific, informed and unambiguous indication of the data subject's wishes by which he or she, by a statement or by a clear affirmative action, signifies agreement to the processing of personal data relating to him or her. | GDPR | “Consent” means any freely given, specific, informed, and unambiguous indication of the consumer’s wishes by which the consumer, or the consumer’s legal guardian, a person who has power of attorney, or a person acting as a conservator for the consumer, including by a statement or by a clear affirmative action, signifies agreement to the processing of personal information relating to the consumer for a narrowly defined particular purpose. Acceptance of a general or broad terms of use, or similar document, that contains descriptions of personal information processing along with other, unrelated information, does not constitute consent. Hovering over, muting, pausing, or closing a given piece of content does not constitute consent. Likewise, agreement obtained through use of dark patterns does not constitute consent. | CCPA | personal data | |||||||
constituent system | independent system that forms part of a system of systems (SoS) (note: Constituent systems can be part of one or more SoS. Each constituent system is a useful system by itself, having its own development, management, utilization, goals, and resources, but interacts within the SoS to provide the unique capability of the SoS). | ISO/IEC_TS_5723:2022(en) | ||||||||||
constraint | Specification of what may be contained in a data or metadata set in terms of the content or, for data only, in terms of the set of key combinations to which specific attributes (defined by the data structure) may be attached. | OECD | A limitation or implied requirement that constrains the design solution or implementation of the systems engineering process and is not changeable by the enterprise | IEEE_Soft_Vocab | ||||||||
construct validity | the degree to which the application of constructs to phenomena is warranted with respect to the research goals and questions. | Wieringa,_Roel_J. | Construct validation is involved whenever a test is to be interpreted as a measure of some attribute or quality which is not “operationally defined.” The problem faced by the investigator is, “What constructs account for variance in test performance?” | cronbach_construct_1955 | Established experimentally to demonstrate that a survey distinguishes between people who do and do not have certain characteristics. It is usually established experimentally. | fink_survey_2010 | Establishing construct validity means demonstrating, in a variety of ways, that the measurements obtained from a measurement model are both meaningful and useful. | jacobs_measurement_2023 | ||||
content validity | Refers to the extent to which a measure thoroughly and appropriately assesses the skills or characteristics it is intended to measure. | fink_survey_2010 | the extent to which a test measures a representative sample of the subject matter or behavior under investigation. For example, if a test is designed to survey arithmetic skills at a third-grade level, content validity indicates how well it represents the range of arithmetic operations possible at that level. Modern approaches to determining content validity involve the use of exploratory factor analysis and other multivariate statistical procedures. | APA_content_validity | ||||||||
context | The context is the circumstances, purpose, and perspective under which an object is defined or used. | OECD | The immediate environment in which a function (or set of functions in a diagram) operates | IEEE_Soft_Vocab | the interrelated conditions in which something exists or occurs. | Merriam-Webster_context | ||||||
contextual learning | A computing system with sufficient knowledge regarding its purpose that it understands the source, relevance, and utility of data and inputs. | IEEE_Guide_IPA | ||||||||||
context-of-use | The Context of Use is the actual conditions under which a given artifact/software product is used, or will be used in a normal day to day working situation. | interaction_context_2023 | comprises a combination of users, goals, tasks, resources, and the technical, physical and social, cultural and organizational environments in which a system, product or service is used[; ...] can include the interactions and interdependencies between the object of interest and other systems, products or services. | ISO_9241-11:2018 | ||||||||
controllability | property of a system that allows a human or another external agent to intervene in the system’s functioning; such a system is heteronomous. | ISO/IEC_TS_5723:2022(en) | ||||||||||
control class | (control group) the set of observations in an experiment or prospective study that do not receive the experimental treatment(s). These observations serve (a) as a comparison point to evaluate the magnitude and significance of each experimental treatment, (b) as a reality check to compare the current observations with previous observation history, and (c) as a source of data for establishing the natural experimental error. | nist_statistics_2012 | ||||||||||
controller | ‘Controller’ means the natural or legal person, public authority, agency or other body which, alone or jointly with others, determines the purposes and means of the processing of personal data; where the purposes and means of such processing are determined by Union or Member State law, the controller or the specific criteria for its nomination may be provided for by Union or Member State law; | GDPR | personal data; processor | |||||||||
copilot | An artificial intelligence powered software program designed to assist users with various tasks and automate features within compatible applications using advanced language models, machine-learning algorithms, and conversational interfaces to understand user requests and provide suggestions, summaries, and content generation in response. | | A product or service that provides assistance using, incorporating and/or based on artificial intelligence software and artificial intelligence software services | |||||||||
corpus (corpora) | A deliberately assembled collection of knowledge and data (structured and/or unstructured) believed to contain relevant information on a topic or topics to be used by software systems for which useful analysis, prediction, or outcome is being sought. | IEEE_Guide_IPA | ||||||||||
correlation | In its most general sense correlation denotes the interdependence between quantitative or qualitative data. In this sense it would include the association of dichotomised attributes and the contingency of multiply-classified attributes. | OECD | The correlation coefficient of two random variables $y_1$ and $y_2$, denoted $\rho(y_1, y_2)$, is: $\rho(y_1, y_2) = \mathrm{Cov}(y_1, y_2)/\sqrt{\mathrm{Var}(y_1)\,\mathrm{Var}(y_2)}$ | box_statistics_2005 | ||||||||
counterfactual explanation | Statements taking the form: Score p was returned because variables V had values (v1, v2,...) associated with them. If V instead had values (v1', v2',...) score p' would have been returned. | wachter_counterfactual_2018 | ||||||||||
counterfactual fairness | A fairness metric that checks whether a classifier produces the same result for one individual as it does for another individual who is identical to the first, except with respect to one or more sensitive attributes. Evaluating a classifier for counterfactual fairness is one method for surfacing potential sources of bias in a model. | aime_measurement_2022, citing Machine Learning Glossary by Google | Given a predictive problem with fairness considerations, where $A$, $X$ and $Y$ represent the protected attributes, remaining attributes, and output of interest respectively, let us assume that we are given a causal model $(U, V, F)$, where $V = A \cup X$. We postulate the following criterion for predictors of $Y$. Definition 5 (Counterfactual fairness). Predictor $\hat{Y}$ is counterfactually fair if under any context $X = x$ and $A = a$, $P(\hat{Y}_{A \leftarrow a}(U) = y \mid X = x, A = a) = P(\hat{Y}_{A \leftarrow a'}(U) = y \mid X = x, A = a)$ for all $y$ and for any value $a'$ attainable by $A$. | kusner_counterfactual_2017 | ||||||||
countermeasure | Actions, devices, procedures, techniques, or other measures that reduce the vulnerability of a system. Synonymous with security controls and safeguards. | SP800-37 | Actions, devices, procedures, or techniques that meet or oppose (i.e., counters) a threat, a vulnerability, or an attack by eliminating or preventing it, by minimizing the harm it can cause, or by discovering and reporting it so that corrective action can be taken. | GWUC | safeguard; security control | |||||||
criterion validity | compares responses to future performance or to those obtained from other, more well-established surveys. Criterion validity is made up of two subcategories: predictive and concurrent. Predictive validity refers to the extent to which a survey measure forecasts future performance. A graduate school entry examination that predicts who will do well in graduate school has predictive validity. Concurrent validity is demonstrated when two assessments agree or a new measure is compared favorably with one that is already considered valid. | fink_survey_2010 | an index of how well a test correlates with an established standard of comparison (i.e., a criterion). Criterion validity is divided into three types: predictive validity, concurrent validity, and retrospective validity. For example, if a measure of criminal behavior is valid, then it should be possible to use it to predict whether an individual (a) will be arrested in the future for a criminal violation, (b) is currently breaking the law, and (c) has a previous criminal record. | APA_criterion_validity | criterion-referenced validity; criterion-related validity | |||||||
crowdsource | a type of participative online activity in which an individual, an institution, a non-profit organization, or company proposes to a group of individuals of varying knowledge, heterogeneity, and number, via a flexible open call, the voluntary undertaking of a task. | Enrique | ||||||||||
customer | The beneficiary of the execution of an automated task, process, or service. | IEEE_Guide_IPA | ||||||||||
cybersecurity | Prevention of damage to, protection of, and restoration of computers, electronic communications systems, electronic communications services, wire communication, and electronic communication, including information contained therein, to ensure its availability, integrity, authentication, confidentiality, and nonrepudiation. | SP800-37 | ||||||||||
dark pattern | “Dark pattern” means a user interface designed or manipulated with the substantial effect of subverting or impairing user autonomy, decisionmaking, or choice, as further defined by regulation. | CCPA | ||||||||||
data | Characteristics or information, usually numerical, that are collected through observation. | OECD | re-interpretable representation of information in a formalized manner suitable for communication, interpretation or processing | aime_measurement_2022, citing ISO/IEC TR 24029-1 | ||||||||
data analytics | the process of applying graphical, statistical, or quantitative techniques to a set of observations or measurements in order to summarize it or to find general patterns. | APA_data_analysis | Data analysis is the process of transforming raw data into usable information, often presented in the form of a published analytical article, in order to add value to the statistical output. | OECD | ||||||||
data cleaning | Data Cleaning is the process of identifying, correcting, or removing inaccurate or corrupt data records | Ranschaert,_Erik | ||||||||||
data control | management oversight of information policies for an organization’s information; observing and reporting on how processes are working and managing issues. | Egnyte | ||||||||||
data dredging | A statistical bias in which testing huge numbers of hypotheses of a dataset may appear to yield statistical significance even when the results are statistically nonsignificant. | SP1270 | statistical bias; p-hacking | |||||||||
data drift | The change in model input data that leads to model performance degradation. | Microsoft_Azure_documentation | ||||||||||
data-driven | Data-driven decision making (DDD) refers to the practice of basing decisions on the analysis of data rather than purely on intuition. | provost_data_2013 | ||||||||||
data fabric | A data corpus, after the application of semantic mapping, relevant ontologies, and data seeding sufficient for artificial intelligence (AI) or machine learning algorithms to provide meaningful insight, prediction, and/or prescription. | IEEE_Guide_IPA | ||||||||||
data fusion | A process in which data, generated by multiple sensory sources, is integrated and/or correlated to create information, knowledge, and/or intelligence that may be displayed for user or be actionable to accomplish the tasks. | SP1011 | The process of combining data from multiple sources to produce more accurate, consistent, and concise information than that provided by any individual data source. | Munir,_Arslan | ||||||||
data governance | A set of processes that ensures that data assets are formally managed throughout the enterprise. A data governance model establishes authority and management and decision making parameters related to the data produced or managed by the enterprise. | CSRC | refers to a system, including policies, people, practices, and technologies, necessary to ensure data management within an organization | NIST_1500 | ||||||||
data mining | computational process that extracts patterns by analysing quantitative data from different perspectives and dimensions, categorizing them, and summarizing potential relationships and impacts | aime_measurement_2022, citing ISO/IEC 22989 | the process of data analysis and information extraction from large amounts of datasets with machine learning, statistical approaches, and many others. | Ranschaert,_Erik | ||||||||
data point | a discrete unit of information. | TechTarget_data_point | ||||||||||
data preparation | We define data preparation as the set of preprocessing operations performed in early stages of a data processing pipeline, i.e., data transformations at the structural and syntactical levels | hameed_data_2020 | ||||||||||
data proxy | Data that are closely related to and serve in place of data that are either unobservable or immeasurable. | Comptroller_Office | ||||||||||
data quality | degree to which the characteristics of data satisfy stated and implied needs when used under specified conditions | IEEE_Soft_Vocab | The dimensions of the IMF definition of "data quality" are: - integrity; - methodological soundness; - accuracy and reliability; - serviceability; - accessibility. There are a number of prerequisites for quality. These comprise: - legal and institutional environment; - resources; - quality awareness. | OECD | ||||||||
data science | Methodology for the synthesis of useful knowledge directly from data through a process of discovery or of hypothesis formulation and hypothesis testing. | NIST_1500 | Interdisciplinary science that uses statistics, algorithms, and other methods to extract meaningful and useful patterns from data sets—sometimes known as “big data.” Today, machine learning is often used in this field. Next to analysis of data, data science is also concerned with the capturing, preparation, and interpretation of data. | AI_Ethics_Mark_Coeckelbergh | artificial intelligence (AI); machine learning (ML) | |||||||
data scientist | A practitioner who has sufficient knowledge in the overlapping regimes of business needs, domain knowledge, analytical skills, and software and systems engineering to manage the end-to-end data processes in the analytics life cycle. | NIST_1500 | ||||||||||
data seeding | The intentional introduction of initial state conditions, influencing factors, and outcomes (both successful and unsuccessful) in a data fabric to create sufficient machine learning analysis signals to enable encouragement/discouragement to enrich deterministic relationships between data elements in a given information domain. | IEEE_Guide_IPA | ||||||||||
data wrangling | process by which the data required by an application is identified, extracted, cleaned and integrated, to yield a data set that is suitable for exploration and analysis. | Furche,_Tim | ||||||||||
decision | A conclusion reached after consideration of business rules and relevant data within a given process. | IEEE_Guide_IPA | Types of statements in which a choice between two or more possible outcomes controls which set of actions will result. | IEEE_Soft_Vocab | ||||||||
decision point | A point within a business process where the process flow can take one of several alternative paths, including recursive. | IEEE_Guide_IPA | ||||||||||
decision tree | Tree‐structure resembling a flowchart, where every node represents a test to an attribute, each branch represents the possible outcomes of that test, and the leaves represent the class labels. | Reznik,_Leon | ||||||||||
decision-making | the cognitive process resulting in the selection of a belief or a course of action among several possible alternative options. It could be either rational or irrational. The decision-making process is a reasoning process based on assumptions of values, preferences and beliefs of the decision-maker. Every decision-making process produces a final choice, which may or may not prompt action. | Wikipedia_Decision-making | the cognitive process of choosing between two or more alternatives, ranging from the relatively clear cut (e.g., ordering a meal at a restaurant) to the complex (e.g., selecting a mate). Psychologists have adopted two converging strategies to understand decision making: (a) statistical analysis of multiple decisions involving complex tasks and (b) experimental manipulation of simple decisions, looking at the elements that recur within these decisions. | APA_decision_making | ||||||||
decision support system | a computer program application used to improve a company's decision-making capabilities. It analyzes large amounts of data and presents an organization with the best possible options available[; they] bring together data and knowledge from different areas and sources to provide users with information beyond the usual reports and summaries. This is intended to help people make informed decisions. | TechTarget_decision_support_system | ||||||||||
decommission | the total or partial removal of existing components and their corresponding sub-components from Production and any relevant environment, minimizing risks and impacts, ensuring policy compliance, and maximizing the financial benefits (i.e., optimizing the cost reduction). | IG1190M_AIOps_Decommission_v1.0.0 | ||||||||||
deductive analytics | Insights, reporting, and information answering the question, “What would likely happen IF…?” Deductive analytics evaluates causes and outcomes of possible future events. | IEEE_Guide_IPA | deductive reasoning | |||||||||
deep learning | Deep learning is a broad family of techniques for machine learning in which hypotheses take the form of complex algebraic circuits with tunable connection strengths. The word “deep” refers to the fact that the circuits are typically organized into many layers, which means that computation paths from inputs to outputs have many steps. Deep learning is currently the most widely used approach for applications such as visual object recognition, machine translation, speech recognition, speech synthesis, and image synthesis; it also plays a significant role in reinforcement learning applications. | Russell_and_Norvig | A form of machine learning that uses neural networks with several layers of "neurons": simple interconnected processing units that interact. | AI_Ethics_Mark_Coeckelbergh | [an approach to AI that allows] computers to learn from experience and understand the world in terms of a hierarchy of concepts, with each concept defined through its relation to simpler concepts. By gathering knowledge from experience, this approach avoids the need for human operators to formally specify all the knowledge that the computer needs. The hierarchy of concepts enables the computer to learn complicated concepts by building them out of simpler ones. If we draw a graph showing how these concepts are built on top of each other, the graph is deep, with many layers. | deeplearningbook_intro | ||||||
deepfake | AI-generated or manipulated image, audio or video content that resembles existing persons, objects, places or other entities or events and would falsely appear to a person to be authentic or truthful. | TTC6_Taxonomy_Terminology | ||||||||||
deletion | Of an \<X\>, the action of destroying an instantiated \<X\>. | IEEE_Soft_Vocab | ||||||||||
denial-of-service | The prevention of authorized access to resources or the delaying of time-critical operations. (Time-critical may be milliseconds or it may be hours, depending upon the service provided.) | SP800-12 | An attack that prevents or impairs the authorized use of information system resources or services. | CISA | when legitimate users are unable to access information systems, devices, or other network resources due to the actions of a malicious cyber threat actor. Services affected may include email, websites, online accounts (e.g., banking), or other services that rely on the affected computer or network. A denial-of-service condition is accomplished by flooding the targeted host or network with traffic until the target cannot respond or simply crashes, preventing access for legitimate users. DoS attacks can cost an organization both time and money while their resources and services are inaccessible. | ST04-015 | ||||||
dependability | ability to perform as and when required (note 1: includes availability, reliability, recoverability, maintainability, and maintenance support performance, and, in some cases, other characteristics such as durability, safety and security. Note 2: used as a collective term for the time-related quality characteristics of an item). | ISO/IEC_TS_5723:2022(en) | ||||||||||
deployment | Phase of a project in which a system is put into operation and cutover issues are resolved | IEEE_Soft_Vocab | ||||||||||
descriptive analytics | Insights, reporting, and information answering the question, “Why did something happen?” Descriptive analytics determines information useful to understanding the cause(s) of an event(s). | IEEE_Guide_IPA | ||||||||||
deterministic | modelling [that] produces consistent outcomes for a given set of inputs, regardless of how many times the model is recalculated. The mathematical characteristics are known in this case. None of them is random, and each problem has just one set of specified values as well as one answer or solution. The unknown components in a deterministic model are external to the model. It deals with the definitive outcomes as opposed to random results and doesn’t make allowances for error. | Sourabh_Mehta_deterministic | ||||||||||
deterministic algorithm | An algorithm that, given the same inputs, always produces the same outputs. | CSRC | ||||||||||
developer | A general term that includes developers or manufacturers of systems, system components, or system services; systems integrators; vendors; and product resellers. Development of systems, components, or services can occur internally within organizations or through external entities. | SP800-37 | Individual or organization that performs development activities (including requirements analysis, design, testing through acceptance) during the system or software life‐cycle process. | IEEE_Soft_Vocab | ||||||||
diagnostic analytics | Insights, reporting, and information answering the question, “Why did something happen?” Diagnostic analytics determines information useful to understanding the cause(s) of an event(s). | IEEE_Guide_IPA | ||||||||||
diagnostics | Pertaining to the detection and isolation of faults or failures | IEEE_Soft_Vocab | ||||||||||
differential privacy | Differential privacy is a method for measuring how much information the output of a computation reveals about an individual. It is based on the randomised injection of "noise". Noise is a random alteration of data in a dataset so that values such as direct or indirect identifiers of individuals are harder to reveal. An important aspect of differential privacy is the concept of “epsilon” or ɛ, which determines the level of added noise. Epsilon is also known as the “privacy budget” or “privacy parameter”. | privacy-enhancing_technologies | For two datasets $D$ and $D'$ that differ in at most one element, a randomized algorithm $M$ guarantees $(\epsilon, \delta)$-differential privacy for any subset of the output $S$ if $M$ satisfies: $\Pr[M(D) \in S] \leq \exp(\epsilon) \cdot \Pr[M(D') \in S] + \delta$. Furthermore, when $\delta = 0$, an algorithm $M$ is said to guarantee $\epsilon$-differential privacy. | gong_differential_2020 | ||||||||
differential validity | Differential validity states that the validities in two applicant populations are unequal, that is, pi != pa. | hunter_differential_1979 | ||||||||||
digital labor | Digital automation of information technology systems and/or business processes that successfully delivers work output previously performed by human labor or new work output that would typically or alternatively have been performed by human labor. | IEEE_Guide_IPA | ||||||||||
digital workforce | The collective suite of automation technologies delivering existing or new work output as applied in a business; the manifestation of digital labor. | IEEE_Guide_IPA | ||||||||||
dimension | The dimension of an object is a topological measure of the size of its covering properties. Roughly speaking, it is the number of coordinates needed to specify a point on the object. | wolfram_math_2022 | Distinct components that a multidimensional construct encompasses | IEEE_Soft_Vocab | ||||||||
dimension reduction | Dimensionality reduction is the process of taking data in a high dimensional space and mapping it into a new space whose dimensionality is much smaller | Shalev-Shwartz,_Shai | ||||||||||
disparate impact | For Predictor Y and Sensitive Impact S. Definition 6.2 Disparate Impact: $\mathrm{DI} = P[\hat{Y} = 1 \mid S \neq 1] / P[\hat{Y} = 1 \mid S = 1]$ | friedler_comparative_2019 | ||||||||||
disparate treatment | Intentional discrimination, including (i) decisions explicitly based on protected characteristics; and (ii) intentional discrimination via proxy variables (e.g literacy tests for voting eligibility). | Lipton,_Zachary | ||||||||||
distributional robustness | Optimizing the predictive accuracy for a whole class of distributions instead of just a single target distribution. | Meinshausen,_Nicolai | ||||||||||
diversity | the practice of including the many communities, identities, races, ethnicities, backgrounds, abilities, cultures, and beliefs of the American people, including underserved communities. | EO_DEIA_2021 | inclusion | |||||||||
documentation | Collection of documents on a given subject; written or pictorial information describing, defining, specifying, reporting, or certifying activities, requirements, procedures, or results. | IEEE_Soft_Vocab | ||||||||||
domain | Distinct scope, within which common characteristics are exhibited, common rules observed, and over which a distribution transparency is preserved. | IEEE_Soft_Vocab | A set of elements, data, resources, and functions that share a commonality in combinations of: (1) roles supported, (2) rules governing their use, and (3) protection needs. | SP800-160 | specific field of knowledge or expertise | aime_measurement_2022, citing ISO/IEC 2382 | ||||||
domain expertise | Domain expertise implies knowledge and understanding of the essential aspects of a specific field of inquiry. | McCue_Colleen | ||||||||||
domain shift | Differences between the source and target domain data | Stacke,_Karin | distributional shift | |||||||||
drinking your own champagne | The practice in which tech workers use their own product consistently to see how well it works and where improvements can be made. | kelley_dogfooding_2022 | dogfooding; eating your own dogfood | |||||||||
dynamic process | The process in which one or more paths are defined and may be utilized based on the conditions present at the time of execution. | IEEE_Guide_IPA | ||||||||||
edge case | a problem or situation, especially in computer programming, that only happens at the highest or lowest end of a range of possible values or in extreme situations. | cambridge_dictionary_2022 | ||||||||||
embedding | An embedding is a representation of a topological object, manifold, graph, field, etc. in a certain space in such a way that its connectivity or algebraic properties are preserved. For example, a field embedding preserves the algebraic structure of plus and times, an embedding of a topological space preserves open sets, and a graph embedding preserves connectivity. One space X is embedded in another space Y when the properties of Y restricted to X are the same as the properties of X. | wolfram_math_2022 | ||||||||||
emulation | The use of a data processing system to imitate another data processing system, so that the imitating system accepts the same data, executes the same programs, and achieves the same results as the imitated system. | IEEE_Soft_Vocab | ||||||||||
end event | An activity, task, or output that describes or defines the conclusion of a process. | IEEE_Guide_IPA | ||||||||||
engineer | n. 3a: a designer or builder of engines; b: a person who is trained in or follows as a profession a branch of engineering; c: a person who carries through an enterprise by skillful or artful contrivance; 4: a person who runs or supervises an engine or an apparatus. v. 1: to lay out, construct, or manage as an engineer. | Merriam-Webster_engineer | ||||||||||
ensemble | a machine learning paradigm where multiple models (often called “weak learners”) are trained to solve the same problem and combined to get better results. The main hypothesis is that when weak models are correctly combined we can obtain more accurate and/or robust models. | Joseph_Rocca_Ensemble_methods | ||||||||||
environment | Anything affecting a subject system or affected by a subject system through interactions with it, or anything sharing an interpretation of interactions with a subject system | IEEE_Soft_Vocab | ||||||||||
equality of odds | (Equalized odds). We say that a predictor Ŷ satisfies equalized odds with respect to protected attribute A and outcome Y, if Ŷ and A are independent conditional on Y. | hardt_equality_2016 | The probability of a person in the positive class being correctly assigned a positive outcome and the probability of a person in a negative class being incorrectly assigned a positive outcome should both be the same for the protected and unprotected group members. In other words, the protected and unprotected groups should have equal rates for true positives and false positives. | Mehrabi,_Ninareh | ||||||||
equality of opportunity | (Equal opportunity). We say that a binary predictor Ŷ satisfies equal opportunity with respect to A and Y if Pr{Ŷ = 1 \| A = 0, Y = 1} = Pr{Ŷ = 1 \| A = 1, Y = 1}. | hardt_equality_2016 | The probability of a person in positive class being assigned to a positive outcome should be equal for both protected and unprotected group members. In other words, the protected and unprotected groups should have equal true positive rates. | Mehrabi,_Ninareh | ||||||||
error | The difference between the observed value of an index and its “true” value. Errors may be random or systematic. Random errors are generally referred to as “errors”. Systematic errors are called “biases”. | OECD | Difference between a computed, observed, or measured value or condition and the true, specified, or theoretically correct value or condition. | IEEE_Soft_Vocab | measured quantity value minus a reference quantity value | aime_measurement_2022, citing ISO/IEC Guide 99 | ||||||
error propagation | the way in which uncertainties in the variables affect the uncertainty in the calculated results. | Dorf_2018 | propagation of uncertainty; propagation of error | |||||||||
ethics | definition 1a: "a set of moral principles : a theory or system of moral values"; definition 1b: "the principles of conduct governing an individual or a group"; definition 1c: "a consciousness of moral importance"; definition 1d: "a guiding philosophy"; definition 2: "a set of moral issues or aspects (such as rightness)"; definition 3: "the discipline dealing with what is good and bad and with moral duty and obligation" | Merriam-Webster_ethic | n. 1. the branch of philosophy that investigates both the content of moral judgments (i.e., what is right and what is wrong) and their nature (i.e., whether such judgments should be considered objective or subjective). The study of the first type of question is sometimes termed normative ethics and that of the second metaethics. Also called moral philosophy. 2. the principles of morally right conduct accepted by a person or a group or considered appropriate to a specific field. In psychological research, for example, proper ethics requires that participants be treated fairly and without harm and that investigators report results and findings honestly. See code of ethics; professional ethics; research ethics. —ethical adj. | APA_ethics | ||||||||
ethics by design | An approach to technology ethics and a key component of responsible innovation that aims to integrate ethics in the design and development stage of the technology. Sometimes formulated as "embedding values in design." Similar terms are "value-sensitive design" and "ethically aligned design." | AI_Ethics_Mark_Coeckelbergh | ||||||||||
evaluation | (1) systematic determination of the extent to which an entity meets its specified criteria; (2) action that assesses the value of something | aime_measurement_2022, citing ISO/IEC 24765 | Test, Evaluation, Verification and Validation (TEVV) | |||||||||
example | definition 1: "one that serves as a pattern to be imitated or not to be imitated"; definition 3: "one that is representative of all of a group or type"; definition 4: "a parallel or closely similar case especially when serving as a precedent or model"; definition 5: "an instance (such as a problem to be solved) serving to illustrate a rule or precept or to act as an exercise in the application of a rule" | Merriam-Webster_example | ||||||||||
exception | An event that occurs during the performance of the process that causes a diversion from the normal flow of the process. Exceptions are generated by an unanticipated event within a process due to an undefined or unknown input, undefined or unexpected outcome, or unforeseen sequencing of a task or event. | IEEE_Guide_IPA | ||||||||||
execute | To carry out a plan, a task command, or another instruction | SP1011 | To carry out an instruction, process, or computer program; directing, managing, performing, and accomplishing the project work, providing the deliverables, and providing work performance information. | IEEE_Soft_Vocab | ||||||||
experiment | a series of observations conducted under controlled conditions to study a relationship with the purpose of drawing causal inferences about that relationship. An experiment involves the manipulation of an independent variable, the measurement of a dependent variable, and the exposure of various participants to one or more of the conditions being studied. Random selection of participants and their random assignment to conditions also are necessary in experiments. | apa_experiment_2023 | A study of a fundamental physical process by the use of one or more computer simulators. Like empirical experiments, input variables (factors) are systematically changed to assess their impact upon simulator outputs (responses). Unlike empirical experiments, the simulator responses are deterministic, and this has implications: Computer experiments can appropriately have their factors with intermediate levels and the scope, especially the number of runs, can be more ambitious. Further, modeling methods based on interpolators (especially kriging) emerge as a viable approach. Good practice is to use Latin hypercubes for computer experiments, and advanced nonparametric modeling methods such as kriging, neural networks, and multivariate adaptive regression splines (MARS) in the data analysis stage. Important applications of computer experimental methods are for determining process optima and for evaluating process tolerances. | nist_statistics_2012 | ||||||||
expert system | A form of AI that attempts to replicate a human's expertise in an area, such as medical diagnosis. It combines a knowledge base with a set of hand-coded rules for applying that knowledge. Machine-learning techniques are increasingly replacing hand coding. | Hutson,_Matthew | Intelligent computer program that uses knowledge and inference procedures to solve problems that are difficult enough to require significant human expertise for their solution. | Reznik,_Leon | An expert system is an intelligent computer program that uses knowledge and inference procedures to solve problems that are difficult enough to require significant human expertise for their solution. | OECD | Computer system that provides for expertly solving problems in a given field or application area by drawing inferences from a knowledge base developed from human expertise. | IEEE_Soft_Vocab | A computer system emulating the decision-making ability of a human expert through the use of reasoning, leveraging an encoding of domain-specific knowledge most commonly represented by sets of if-then rules rather than procedural code. The term “expert system” was used largely during the 1970s and ’80s amidst great enthusiasm about the power and promise of rule-based systems that relied on a “knowledge base” of domain-specific rules and rule-chaining procedures that map observations to conclusions or recommendations. | NSCAI | ||
expertise | The accumulation of specialized knowledge is often called expertise. Passive expertise is a type of knowledge-based specialization that arises from experiences in life and one's position in a society or culture. Formal expertise is the result of a self-selection of a domain of knowledge that is mastered deliberately and for which there are clear benchmarks of success. | Schneider_McGrew_in_Flanagan_McDonough_2018 | ||||||||||
explainability | The ability to provide a human interpretable explanation for a machine learning prediction and produce insights about the causes of decisions, potentially to line up with human reasoning. | NISTIR_8269_Draft | Within the context of AI, the extent to which AI decisioning processes and outcomes are reasonably understood. | Comptroller_Office | A characteristic of an AI system in which there is provision of accompanying evidence or reasons for system output in a manner that is meaningful or understandable to individual users (as well as to developers and auditors) and reflects the system’s process for generating the output (e.g., what alternatives were considered, but not proposed, and why not). | NSCAI | interpretability | |||||
explainer | Functionality for providing details on or causes for fairness metric results. | AI_Fairness_360 | ||||||||||
explanation | Systems deliver accompanying evidence or reason(s) for all outputs. | NISTIR_8269_Draft | The explanation principle obligates AI systems to supply evidence, support, or reasoning for each output. | NISTIR_8312 | ||||||||
exploratory | Exploratory Data Analysis (EDA) is an approach/philosophy for data analysis that employs a variety of techniques (mostly graphical) to 1. maximize insight into a data set; 2. uncover underlying structure; 3. extract important variables; 4. detect outliers and anomalies; 5. test underlying assumptions; 6. develop parsimonious models; and 7. determine optimal factor settings. | nist_statistics_2012 | ||||||||||
external validity | the extent to which the results of research or testing can be generalized beyond the sample that generated them. The more specialized the sample, the less likely will it be that the results are highly generalizable to other individuals, situations, and time periods. | APA_external_validity | ||||||||||
facial recognition (FR) | Face recognition algorithms, however, have no built-in notion of a particular person. They are not built to identify particular people; instead they include a face detector followed by a feature extraction algorithm that converts one or more images of a person into a vector of values that relate to the identity of the person. The extractor typically consists of a neural network that has been trained on ID-labeled images available to the developer. In operations, they act as generic extractors of identity-related information from photos of persons they have usually never seen before. Recognition proceeds as a differential operator: Algorithms compare two feature vectors and emit a similarity score. This is a vendor-defined numeric value expressing how similar the parent faces are. It is compared to a threshold value to decide whether two samples are from, or represent, the same person or not. Thus, recognition is mediated by persistent identity information stored in a feature vector (or “template”). | NISTIR_8280 | ||||||||||
fairness metric | A quantification of unwanted bias in training data or models. | AI_Fairness_360 | A mathematical definition of “fairness” that is measurable. Some commonly used fairness metrics include: equalized odds, predictive parity, counterfactual fairness, and demographic parity. Many fairness metrics are mutually exclusive; see incompatibility of fairness metrics. | google_glossary_2023 | ||||||||
false negative | An example in which the predictive model mistakenly classifies an item as in the negative class. | NSCAI | an outcome where the model incorrectly predicts the negative class. | google_dev_classification-true-false-positive-negative | A false negative is denying an applicant who should be approved | Varshney,_Kush | 1. An instance in which a security tool intended to detect a particular threat fails to do so. 2. Incorrectly classifying malicious activity as benign. | CSRC_false_negative | Type II error (in statistics) | |||
false positive | An example in which the model mistakenly classifies an item as in the positive class | NSCAI | an outcome where the model incorrectly predicts the positive class. | google_dev_classification-true-false-positive-negative | A false positive is approving an applicant who should be denied | Varshney,_Kush | 1. An alert that incorrectly indicates that a vulnerability is present. 2. An alert that incorrectly indicates that malicious activity is occurring. 3. An instance in which a security tool incorrectly classifies benign content as malicious. 4. Incorrectly classifying benign activity as malicious. 5. An erroneous acceptance of the hypothesis that a statistically significant event has been observed. This is also referred to as a type 1 error. When “health-testing” the components of a device, it often refers to a declaration that a component has malfunctioned – based on some statistical test(s) – despite the fact that the component was actually working correctly. | CSRC_false_positive | Type I error (in statistics) | |||
fault tolerance | The ability of a system or component to continue normal operation despite the presence of hardware or software faults | SP1011 | ||||||||||
favorable label | A label whose value corresponds to an outcome that provides an advantage to the recipient. The opposite is an unfavorable label. | AI_Fairness_360 | ||||||||||
feature | An attribute containing information for predicting the label. | AI_Fairness_360 | ||||||||||
feature extraction | a more general method in which one tries to develop a transformation of the input space onto the low-dimensional subspace that preserves most of the relevant information | khalid_feature_2014 | ||||||||||
feature importance | how important the feature was for the classification performance of the model; a measure of the individual contribution of the corresponding feature for a particular classifier, regardless of the shape (e.g., linear or nonlinear relationship) or direction of the feature effect | saarela_feature_2021 | ||||||||||
feature shift | Unlike joint distribution shift detection, which cannot localize which features caused the shift, we define a new hypothesis test for each feature individually. Naïvely, the simplest test would be to check if the marginal distributions have changed for each feature (as explored by [25]); however, the marginal distribution would be easy for an adversary to simulate (e.g., by looping the sensor values from a previous day). Thus, marginal tests are not sufficient for our purpose. Therefore, we propose to use conditional distribution tests. More formally, our null and alternative hypothesis for the j-th feature is that its full conditional distribution (i.e., its distribution given all other features) has not shifted for all values of the other features. | kulinski_feature_2020 | ||||||||||
federated learning | An approach to machine learning which addresses problems of data governance and privacy by training algorithms collaboratively without transferring the data to a central location. Each federated device trains on data locally and shares its local model parameters instead of sharing the training data. Different federated learning systems have different topologies that involve different ways of sharing parameters. | TTC6_Taxonomy_Terminology | ||||||||||
feedback loop | describes the process of leveraging the output of an AI system and corresponding end-user actions in order to retrain and improve models over time. The AI-generated output (predictions or recommendations) is compared against the final decision (for example, to perform work or not) and provides feedback to the model, allowing it to learn from its mistakes. | C3.ai_feedback_loop | closed-loop learning | |||||||||
fitting | Fitting is the process of verifying whether the data item value is in the previously specified interval. | OECD | ||||||||||
firmware | Computer programs and data stored in hardware - typically in read-only memory (ROM) or programmable read-only memory (PROM) - such that the programs and data cannot be dynamically written or modified during execution of the programs. | SP800-37 | Combination of a hardware device and computer instructions or computer data that reside as read only software on the hardware device. | IEEE_Soft_Vocab | ||||||||
forecasting | Estimate or prediction of conditions and events in the project's future based on information and knowledge available at the time of the forecast. The information is based on the project's past performance and expected future performance, and includes information that could impact the project in the future, such as estimate at completion and estimate to complete. | IEEE_Soft_Vocab | ||||||||||
fraud detection | Monitoring the behavior of populations of users in order to estimate, detect, or avoid undesirable behavior. | Kou,_Yufeng | detecting and recognizing fraudulent activities as they enter systems and reporting them to a system manager. | Behdad | ||||||||
fully autonomous | Accomplishes its assigned mission, within a defined scope, without human intervention while adapting to operational and environmental conditions | SP1011 | ||||||||||
generative adversarial network (GAN) | Generative Adversarial Networks, or GANs for short, are an approach to generative modeling using deep learning methods, such as convolutional neural networks. Generative modeling is an unsupervised learning task in machine learning that involves automatically discovering and learning the regularities or patterns in input data in such a way that the model can be used to generate or output new examples that plausibly could have been drawn from the original dataset. | Brownlee,_Jason | A pair of jointly trained neural networks that generates realistic new data and improves through competition. One net creates new examples (fake Picassos, say) as the other tries to detect the fakes. | Hutson,_Matthew | Generative adversarial networks (GANs) consist of two competing neural networks—a generator network that tries to create fake outputs (such as pictures), and a discriminator network that tries to determine whether the outputs are real or fake. A major advantage of this structure is that GANs can learn from less data than other deep learning algorithms. | CRS_AI | An approach to training AI models useful for applications like data synthesis, augmentation, and compression where two neural networks are trained in tandem: one is designed to be a generative network (the forger) and the other a discriminative network (the forgery detector). The objective is for each network to train and better itself off the other, reducing the need for big labeled training data. | NSCAI | ||||
global | A global explanation produces a model that approximates the non-interpretable model. | NISTIR_8312_Full | ||||||||||
governance | The actions to ensure stakeholder needs, conditions, and options are evaluated to determine balanced, agreed-upon enterprise objectives; setting direction through prioritization and decision-making; and monitoring performance and compliance against agreed-upon directions and objectives. AI governance may include policies on the nature of AI applications developed and deployed versus those limited or withheld. | NSCAI | A system of laws, policies, frameworks, practices and processes at international, national and organizational levels. AI governance helps various stakeholders implement, manage, oversee and regulate the development, deployment and use of AI technology. It also helps manage associated risks to ensure AI aligns with stakeholders' objectives, is developed and used responsibly and ethically, and complies with applicable legal and regulatory requirements. | IAPP_Governance_Terms | ||||||||
graph | Diagram that represents the variation of a variable in comparison with that of one or more other variables. Diagram or other representation consisting of a finite set of nodes and internode connections called edges or arcs. | IEEE_Soft_Vocab | A graph (sometimes called an undirected graph to distinguish it from a directed graph, or a simple graph to distinguish it from a multigraph) is a pair G = (V, E), where V is a set whose elements are called vertices (singular: vertex), and E is a set of paired vertices, whose elements are called edges (sometimes links or lines). | wikipedia_graph_2023 | ||||||||
graphical processing unit (GPU) | A specialized chip capable of highly parallel processing. GPUs are well-suited for running machine learning and deep learning algorithms. GPUs were first developed for efficient parallel processing of arrays of values used in computer graphics. Modern-day GPUs are designed to be optimized for machine learning. | NSCAI | ||||||||||
ground truth | information provided by direct observation as opposed to information provided by inference | Collins_Dictionary_ground_truth | value of the target variable for a particular item of labelled input data | aime_measurement_2022, citing ISO/IEC 22989 | ||||||||
group fairness | The goal of groups defined by protected attributes receiving similar treatments or outcomes. | AI_Fairness_360 | ||||||||||
hacker | Unauthorized user who attempts to or gains access to an information system. | Reznik,_Leon | Technically sophisticated computer enthusiast who uses his or her knowledge and means to gain unauthorized access to protected resources. | IEEE_Soft_Vocab | ||||||||
hardware | Physical equipment used to process, store, or transmit computer programs or data | IEEE_Soft_Vocab | ||||||||||
harm | An undesired outcome [whose] cost exceeds some threshold[; ...] the key points in the definition of safety are that: costs have to be sufficiently high in some human sense for events to be harmful, and that safety involves reducing both the probability of expected harms and the possibility of unexpected harms. | Engineering_safety_in_machine_learning | ||||||||||
harmful bias | Harmful bias can be either conscious or unconscious. Unconscious, also known as implicit bias, involves associations outside conscious awareness that lead to a negative evaluation of a person on the basis of characteristics such as race, gender, sexual orientation, or physical ability. Discrimination is behavior; discriminatory actions perpetrated by individuals or institutions refer to inequitable treatment of members of certain social groups that results in social advantages or disadvantages | humphrey_addressing_2020 | ||||||||||
human-assisted | The type of human-robot-interaction that refers to situations during which human interactions are needed at the level of detail of task plans, i.e., during the execution of a task | SP1011 | ||||||||||
human-computer interaction (HCI) | methods and approaches for designing and architecting user interfaces and the interactions between humans and computer (or information) technology. | Poore_Lawrence_ARLIS_2023-01 | ||||||||||
human-cognitive bias | Human-cognitive biases relate to how an individual or group perceives AI system information to make a decision or fill in missing information, or how humans think about purposes and functions of an AI system. Human biases are omnipresent in decision-making processes across the AI lifecycle and system use, including the design, implementation, operation, and maintenance of AI. | NIST_AI_RMF_1.0 | Systematic error in judgment and decision-making common to all human beings which can be due to cognitive limitations, motivational factors, and/or adaptations to natural environments. | |||||||||
human-enabled machine learning | Detection, correlation, and pattern recognition generated through machine-based observation of human operation of software systems capturing successful or unsuccessful operations to enable the creation of a useful predictive analytics capability. | IEEE_Guide_IPA | ||||||||||
human-in-the-loop | An AI system that requires human interaction. | DOD_Modeling_and_Simulation_Glossary | ||||||||||
human-machine teaming (HMT) | The ability of humans and AI systems to work together to undertake complex, evolving tasks in a variety of environments with seamless handoff both ways between human and AI team members. Areas of effort include developing effective policies for controlling human and machine initiatives, computing methods that ideally complement people, methods that optimize goals of teamwork, and designs that enhance human-AI interaction. | NSCAI | methods and approaches for coordinating the functions and actions of (semi) autonomous machine capabilities and human users, which are granted equal weighting. | Poore_Lawrence_ARLIS_2023-01 | human-AI teaming | |||||||
human-operator-intervention | The need for human interaction in a normally fully autonomous behavior due to some extenuating circumstances. | SP1011 | ||||||||||
human subjects | a living individual about whom an investigator (whether professional or student) conducting research: (i) Obtains information or biospecimens through intervention or interaction with the individual, and uses, studies, or analyzes the information or biospecimens; or (ii) Obtains, uses, studies, analyzes, or generates identifiable private information or identifiable biospecimens. | 45_CFR_46_2018_Requirements_(2018_Common_Rule) | participant | |||||||||
human system integration (HSI) | methods and approaches for testing and optimizing all human-related considerations from a “whole-system” or “system-of-systems” level. | Poore_Lawrence_ARLIS_2023-01 | ||||||||||
hyperparameters | the parameters that are used to either configure a ML model (e.g., the penalty parameter C in a support vector machine, and the learning rate to train a neural network) or to specify the algorithm used to minimize the loss function (e.g., the activation function and optimizer types in a neural network, and the kernel type in a support vector machine). | On_Hyperparameter_Optimization | ||||||||||
hypothesis testing | A term used generally to refer to testing significance when specific alternatives to the null hypothesis are considered. | OECD | ||||||||||
impact assessment | a risk management tool that seeks to ensure an organization has sufficiently considered a system's relative benefits and costs before implementation. In the context of AI, an impact assessment helps to answer a simple question: alongside this system’s intended use, for whom could it fail? | Bipartisan_Policy_Center_impact_assessments | An evaluation process designed to identify, understand, document and mitigate the potential ethical, legal, economic and societal implications of an AI system in a specific use case. | IAPP_Governance_Terms | ||||||||
impersonation | A malicious individual is able to impersonate a legitimate data subject to the data controller. The adversary forges a valid access request and goes through the identity verification enforced by the data controller. The data controller sends to the adversary the data of a legitimate data subject. Defeating impersonation is the primary objective of any authentication protocol. The result of this attack is a data breach (e.g. blaggers [sic] pretend to be someone they are not in order to wheedle out the information they are seeking obtaining information illegally which they then sell for a specified price). | Security_Analysis_of_Subject_Access | ||||||||||
in-processing | Techniques that modify the algorithms in order to mitigate bias during model training. Model training processes could incorporate changes to the objective (cost) function or impose a new optimization constraint. | SP1270 | Techniques that try to modify and change state-of-the-art learning algorithms to remove discrimination during the model training process. | Mehrabi,_Ninareh | ||||||||
in-processing algorithm | A bias mitigation algorithm that is applied to a model during its training. | AI_Fairness_360 | ||||||||||
incident | a situation in which AI systems caused, or nearly caused, real-world harm. | AI_Incident_Database | the occurrence of a technical event that affects the integrity of a Product and/or Model. | FBPML_Wiki | an alleged harm or near harm event to people, property, or the environment where an AI system is implicated. | AI_Incident_Editors | Adverse event(s) in a computer system or networks caused by a failure of a security mechanism, or an attempted or threatened breach of these mechanisms. | Hasan,_Raza | ||||
incident response | a public official response to an incident ... from an entity (i.e. company, organization, individual) allegedly responsible for developing or deploying the AI or AI system involved in said incident. | AIID_incident_response | ||||||||||
independence | Of software quality assurance (SQA), situation in which SQA is free from technical, managerial, and financial influences, intentional or unintentional | IEEE_Soft_Vocab | Two events are independent if the occurrence of one event does not affect the chances of the occurrence of the other event. The mathematical formulation of the independence of events A and B is the probability of the occurrence of both A and B being equal to the product of the probabilities of A and B (i.e., P(A and B) = P(A)P(B)) | nist_800_2010 | In simple terms, inclusion is getting the mix to work together. | |||||||
individual fairness | The goal of similar individuals receiving similar treatments or outcomes. | AI_Fairness_360 | Give similar predictions to similar individuals | Mehrabi,_Ninareh | A fairness metric that checks whether similar individuals are classified similarly | aime_measurement_2022, citing Machine Learning Glossary by Google | ||||||
inference | The stage of ML in which a model is applied to a task. For example, a classifier model produces the classification of a test sample. | NISTIR_8269_Draft | ||||||||||
information input component | One of the three components of a model. This component delivers assumptions and data to the model. | Comptroller_Office | ||||||||||
information security | preservation of confidentiality, integrity and availability of information; in addition, other properties, such as authenticity, accountability, non-repudiation, and reliability can also be involved. | ISO/IEC_TS_5723:2022(en) | ||||||||||
input | Data received from an external source | IEEE_Soft_Vocab | ||||||||||
insider attack | Those who are within [an] organisation may have authorised access to vast amounts of sensitive company records that are essential for maintaining competitiveness and market position, and knowledge of information services and procedures that are crucial for daily operations. . . .[and] should an individual choose to act against the organisation, then with their privileged access and their extensive knowledge, they are well positioned to cause serious damage. | IEEE_Caught_in_the_Act | ||||||||||
in silico | carrying out some experiment by means of a computer simulation | World_Wide_Words_In_silico | computer simulation testing | |||||||||
instance | Discrete, bounded thing with an intrinsic, immutable, and unique identity. Individual occurrence of a type | IEEE_Soft_Vocab | A single object of the world from which a model will be learned, or on which a model will be used (e.g., for prediction). | Kohavi,_Ron | ||||||||
instance weight | A numerical value that multiplies the contribution of a data point in a model. | AI_Fairness_360 | ||||||||||
integrity | Degree to which a system, product, or component prevents unauthorized access to, or modification of, computer programs or data. | IEEE_Soft_Vocab | Guarding against improper information modification or destruction, and includes ensuring information non-repudiation and authenticity. | CSRC | The property whereby information, an information system, or a component of a system has not been modified or destroyed in an unauthorized manner. | CISA | property whereby data have not been altered in an unauthorized manner since they were created, transmitted, or stored; property of accuracy and completeness | ISO/IEC_TS_5723:2022(en) | the quality of moral consistency, honesty, and truthfulness with oneself and others. | APA_integrity | ||
intelligent process automation | A preconfigured software instance that combines business rules, experience-based context determination logic, and decision criteria to initiate and execute multiple interrelated human and automated processes in a dynamic context. The goal is to complete the execution of a combination of processes, activities, and tasks in one or more unrelated software systems that deliver a result or service with minimal or no human intervention. | IEEE_Guide_IPA | ||||||||||
interaction | Action that takes place with the participation of the environment of the object. | IEEE_Soft_Vocab | ||||||||||
internal validity | the degree to which a study or experiment is free from flaws in its internal structure and its results can therefore be taken to represent the true nature of the phenomenon. In other words, internal validity pertains to the soundness of results obtained within the controlled conditions of a particular study, specifically with respect to whether one can draw reasonable conclusions about cause-and-effect relationships among variables. | APA_internal_validity | ||||||||||
interoperability | The ability of software or hardware systems or components to operate together successfully with minimal effort by end user | SP1011 | Degree to which two or more systems, products or components can exchange information and use the information that has been exchanged. | IEEE_Soft_Vocab | The ability for tools to work together in execution, communication, and data exchange under specific conditions. | NIST_1500 | ||||||
interpretability | The ability to understand the value and accuracy of system output. Interpretability refers to the extent to which a cause and effect can be observed within a system or to which what is going to happen given a change in input or algorithmic parameters can be predicted. | NSCAI | The ability to explain or to present an ML model’s reasoning in understandable terms to a human | aime_measurement_2022, citing Machine Learning Glossary by Google | explainability | |||||||
interpretable model | An interpretable machine learning model obeys a domain-specific set of constraints to allow it (or its predictions, or the data) to be more easily understood by humans. These constraints can differ dramatically depending on the domain. | rudin_interpretable_2022 | ||||||||||
intervenability | the property that intervention is possible concerning all ongoing or planned privacy relevant data processing[; ...] the data subjects themselves should be able to intervene with regards to the processing of their own data ... [to ensure] that data subjects have the ability to control how their data is processed and by whom. | Covert_et_al | ||||||||||
knowledge | The sum of all information derived from diagnostic, descriptive, predictive, and prescriptive analytics embedded in or available to or from a cognitive computing system. | IEEE_Guide_IPA | abstracted information about objects, events, concepts or rules, their relationships and properties, organized for goal-oriented systematic use | aime_measurement_2022, citing ISO/IEC 22989 | ||||||||
label | A value corresponding to an outcome. | AI_Fairness_360 | target variable assigned to a sample | aime_measurement_2022, citing ISO/IEC 22989 | ||||||||
label shift | Under label shift, the label distribution p(y) might change but the class-conditional distributions p(x\|y) do not. ... We work with the label shift assumption, i.e., p_s(x\|y) = p_t(x\|y) | saurabh_label_2020 | ||||||||||
large language model (LLM) | a class of language models that use deep-learning algorithms and are trained on extremely large textual datasets that can be multiple terabytes in size. LLMs can be classed into two types: generative or discriminatory. Generative LLMs are models that output text, such as the answer to a question or even writing an essay on a specific topic. They are typically unsupervised or semi-supervised learning models that predict what the response is for a given task. Discriminatory LLMs are supervised learning models that usually focus on classifying text, such as determining whether a text was made by a human or AI. | AI_Assurance_2022 | language model | |||||||||
language model | A language model is an approximative description that captures patterns and regularities present in natural language and is used for making assumptions on previously unseen language fragments. | Gustavii,_Ebba | large language model (LLM) | |||||||||
learning | A procedure in artificial intelligence by which an artificial intelligence program improves its performance by gaining knowledge. | Dennis_Mercadal | the acquisition of novel information, behaviors, or abilities after practice, observation, or other experiences, as evidenced by change in behavior, knowledge, or brain function. Learning involves consciously or nonconsciously attending to relevant aspects of incoming information, mentally organizing the information into a coherent cognitive representation, and integrating it with relevant existing knowledge activated from long-term memory. | APA_learning | ||||||||
least privilege | The principle that a security architecture should be designed so that each entity is granted the minimum system resources and authorizations that the entity needs to perform its function. | CSRC | The security objective of granting users only those accesses they need to perform their official duties. | SP-800-12 | ||||||||
lemmatization | the process of grouping together the different inflected forms of a word so they can be analyzed as a single item. | Artasanchez_Joshi_AI_with_Python | in natural language processing[, ...] working with words according to their root lexical components | Techopedia_lemmatization | grouping together words with the same root or lemma but with different inflections or derivatives of meaning so they can be analyzed as one item. | Techslang_lemmatization | ||||||
linear model | [a supervised learning algorithm that uses] a simple formula to find a best-fit line through a set of data points. | dataiku_ML_and_linear_models | (linear) An operator $\tilde{L}$ is said to be linear if, for every pair of functions $f$ and $g$ and scalar $t$, $\tilde{L}(f+g)=\tilde{L}f+\tilde{L}g$ and $\tilde{L}(tf)=t\tilde{L}f$. | wolfram_mathworld_2022 | ||||||||
local | A local explanation explains a subset of decisions or is a per-decision explanation. | NISTIR_8312_Full | ||||||||||
localization | Creation of a national or specific regional version of a product. | IEEE_Soft_Vocab | ||||||||||
logistic model | (logistic equation) The continuous version of the logistic model is described by the differential equation $\frac{dN}{dt}=\frac{rN(K-N)}{K}$ (1), where $r$ is the Malthusian parameter (rate of maximum population growth) and $K$ is the so-called carrying capacity (i.e., the maximum sustainable population). Dividing both sides by $K$ and defining $x=N/K$ then gives the differential equation $\frac{dx}{dt}=rx(1-x)$ (2), which is known as the logistic equation and has solution $x(t)=\frac{1}{1+\left(\frac{1}{x_0}-1\right)e^{-rt}}$ (3). The function $x(t)$ is sometimes known as the sigmoid function. | wolfram_mathworld_2022 | ||||||||||
machine learning | A branch of Artificial Intelligence (AI) that focuses on the development of systems capable of learning from data to perform a task without being explicitly programmed to perform that task. Learning refers to the process of optimizing model parameters through computational techniques such that the model's behaviour is optimized for the training task. | TTC6_Taxonomy_Terminology | A subcategory of artificial intelligence; a method of designing a sequence of actions to solve a problem that optimizes automatically through experience and with limited or no human intervention. | Comptroller_Office | ||||||||
machine observation | Machine detection and interpretation of relevant and meaningful events and conditions that impact operation of the computer system itself or other dependent mechanisms or processes essential to the purpose of the system. | IEEE_Guide_IPA | ||||||||||
malware | Hardware, firmware, or software that is intentionally included or inserted in a system for a harmful purpose. | Reznik,_Leon | Software that compromises the operation of a system by performing an unauthorized function or process. | CISA | trojan horse | |||||||
materiality | Refers to the significance of a matter in relation to a set of financial or performance information. If a matter is material to the set of information, then it is likely to be of significance to a user of that information | OECD | ||||||||||
McNamara fallacy | presum[ing] that (A) quantitative models of reality are always more accurate than other models; (B) the quantitative measurements that can be made most easily must be the most relevant; and (C) factors other than those currently being used in quantitative metrics must either not exist or not have a significant influence on success. Also known as the quantitative fallacy. | McNamara_Fallacy | quantitative fallacy | |||||||||
measurement | (Quantitative) (1) act or process of assigning a number or category to an entity to describe an attribute of that entity; (2) assignment of numbers to objects in a systematic way to represent properties of the object; (3) use of a metric to assign a value (e.g., a number or category) from a scale to an attribute of an entity; (4) set of operations having the object of determining a value of a measure; (5) assignment of values and labels to aspects of software engineering work products, processes, and resources plus the models that are derived from them, whether these models are developed using statistical or other techniques; (6) figure, extent, or amount obtained by measuring | aime_measurement_2022, citing ISO/IEC 24765 | (Qualitative) (1) a way of learning about social reality [...][that uses] approaches [...] to explore, describe, or explain social phenomen[a]; unpack the meaning people ascribe to activities, situations, events, or [artifacts]; build a depth of understanding about some aspect of social life; build "thick descriptions" (see Clifford Geertz, 1973) of people in naturalistic settings; explore new or underresearched areas; or make micro-macro links (illuminate connections between individuals-groups and institutional and/or cultural contexts). (2) [approaches that] can make visible and unpick the mechanisms which link particular variables, by looking at the explanations, or accounts, provided by those involved. | Leavy_OHQR_Intro | Qualitative measurement engages research methods and techniques to provide information about the nature of phenomenon. Qualitative methods are designed for systematic collection, organization, description and interpretation of non-numeric (textual, verbal or visual) data (Hammarberg et al., 2016). Qualitative measurement generally answers questions about why, for whom, when, and how something is (or is not) observed, whereas quantitative measurement answers questions about what is observed. Elements assessed using qualitative measurement may include contextual norms or meaning, socio-cultural dynamics, individual or collective beliefs, and complex multi-component interactions or interventions (Busetto et al., 2020). | Hammarberg_2016_Busetto_2020 | Documentation of assumptions and methods used is a foundational element of qualitative measurement, as the choice of single or combined methods is made based on the phenomenon and its context (Russell & Gregory, 2003). When appropriately paired, qualitative and quantitative measurement can provide corroboration or elaboration, demonstrate use cases, and/or identify conditions for complementarity or contradiction (Brannen, 2005). | Russell_2003_Brannen_2005 | ||||
measurement method | generic description of a logical organization of operations used in a measurement | aime_measurement_2022, citing ISO/IEC Guide 99 | logical sequence of operations, described generically, used in quantifying an attribute with respect to a specified scale | aime_measurement_2022, citing ISO/IEC 24765 | ||||||||
measurement model | The initial confirmatory factor analysis (CFA) model that underlies the structural model [that] tests the adequacy (as indexed by model fit) of the specified relations whereby indicators are linked to their underlying construct. | Little_2013 | A statistical model that links unobservable theoretical constructs, operationalized as latent variables, and observable properties—i.e., data about the world | jackman_oxford_2008 | ||||||||
measurability | ability to assess an attribute of an entity against a metric (note 1: "measurable" is the adjective form of "measurability") | ISO/IEC_TS_5723:2022(en) | ||||||||||
membership inference | given a machine learning model and a record, determining whether the record was used as part of the model's training dataset or not. | |||||||||||
metadata | Metadata is data that defines and describes other data. | OECD | Data that describe other data. | IEEE_Soft_Vocab | Data employed to annotate other data with descriptive information, possibly including their data descriptions, data about data ownership, access paths, access rights, and data volatility. | |||||||
metric | defined measurement method and measurement scale | ISO/IEC_TS_5723:2022(en) | (1) quantitative measure of the degree to which a system, component, or process possesses a given attribute; (2) defined measurement method and the measurement scale; c.f., measure in this section above | aime_measurement_2022, citing ISO/IEC 24765 | ||||||||
minimization | (Part of the ICO framework for auditing AI) AI systems generally require large amounts of data. However, organisations must comply with the minimisation principle under data protection law if using personal data. This means ensuring that any personal data is adequate, relevant and limited to what is necessary for the purposes for which it is processed. […] The default approach of data scientists in designing and building AI systems will not necessarily take into account any data minimisation constraints. Organisations must therefore have in place risk management practices to ensure that data minimisation requirements, and all relevant minimisation techniques, are fully considered from the design phase, or, if AI systems are bought or operated by third parties, as part of the procurement process due diligence | ICO_data_minimisation | a data controller should limit the collection of personal information to what is directly relevant and necessary to accomplish a specified purpose. They should also retain the data only for as long as is necessary to fulfil that purpose. In other words, data controllers should collect only the personal data they really need, and should keep it only for as long as they need it. The data minimisation principle is expressed in Article 5(1)(c) of the GDPR and Article 4(1)(c) of Regulation (EU) 2018/1725, which provide that personal data must be "adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed". | EDPS_data_minimization | ||||||||
mixed methods | In mixed methods, the researcher collects and analyzes both qualitative and quantitative data rigorously in response to research questions and hypotheses; integrates the two forms of data and their results; organizes these procedures into specific research designs that provide the logic and procedures for conducting the study; and frames these procedures within theory and philosophy. | Creswell_Clark_mixed_methods | research in which the inquirer or investigator collects and analyzes data, integrates the findings, and draws inferences using both qualitative and quantitative approaches or methods in a single study or a program of study. | Lisa_M._Given_SAGE | ||||||||
MLOPS | MLOps (machine learning operations) stands for the collection of techniques and tools for the deployment of ML models in production. | symeonidis_MLOps_2022 | ||||||||||
model | A function that takes features as input and predicts labels as output. | AI_Fairness_360 | A model is a formalised expression of a theory or the causal situation which is regarded as having generated observed data. In statistical analysis the model is generally expressed in symbols, that is to say in a mathematical form, but diagrammatic models are also found. The word has recently become very popular and possibly somewhat over-worked. | OECD | A core component of an AI system used to make inferences from inputs in order to produce outputs. A model characterizes an input-to-output transformation intended to perform a core computational task of the AI system (e.g., classifying an image, predicting the next word for a sequence, or selecting a robot's next action given its state and goals). | TTC6_Taxonomy_Terminology | A quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates. A model consists of three components: an information input component, which delivers assumptions and data to the model; a processing component, which transforms inputs into estimates; and a reporting component, which translates the estimates into useful business information. | Comptroller_Office | ||||
model assertion | Model assertions are arbitrary functions over a model’s input and output that indicate when errors may be occurring | Kang,_Daniel | ||||||||||
model card | short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. [They] also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. | Model_Cards_for_Model_Reporting | A brief document that discloses information about an AI model, like explanations about intended use, performance metrics and benchmarked evaluation in various conditions, such as across different cultures, demographics or race. | IAPP_Governance_Terms | ||||||||
model debugging | Model debugging aims to diagnose a model’s failures. | Jain_Saachi | ||||||||||
model decay | Model decay depicts that the performance of the model is degrading over time | Nayak,_Pragati | ||||||||||
model editing | An area of research that aims to enable fast, data-efficient updates to a pre-trained base model’s behavior for only a small region of the domain, without damaging model performance on other inputs of interest | Mitchell,_Eric | ||||||||||
model extraction | Adversaries maliciously exploiting the query interface to steal the model. More precisely, in a model extraction attack, a good approximation of a sensitive or proprietary model held by the server is extracted (i.e. learned) by a dishonest user who interacts with the server only via the query interface. | Chandrasekaran,_Varun | model inversion; model stealing | |||||||||
model governance | Model Governance is the name for the overall internal framework of a firm or organization that controls the processes for Model Development, Model Validation and Model Usage, assign responsibilities and roles etc. | open_risk_2022 | ||||||||||
model inventory | in the context of Risk Management, [...] a database/[management information system] developed for the purpose of aggregating quantitative model related information that is in use by a firm or organization. | ORM_model_inventory | ||||||||||
model overlay | Judgmental or qualitative adjustments to model inputs or outputs to compensate for model, data, or other known limitations. A model overlay is a type of override. | Comptroller_Office | ||||||||||
model risk management | model risk management encompasses governance and control mechanisms such as board and senior management oversight, policies and procedures, controls and compliance, and an appropriate incentive and organizational structure | Fed_Reserve | ||||||||||
model suite | A group of models that work together. | Comptroller_Office | ||||||||||
model training | the phase in the data science development lifecycle where practitioners try to fit the best combination of weights and bias to a machine learning algorithm to minimize a loss function over the prediction range | C3.ai_Model_Training | process to determine or to improve the parameters of a machine learning model, based on a machine learning algorithm, by using training data | aime_measurement_2022, citing ISO/IEC 22989 | ||||||||
model validation | the set of processes and activities intended to verify that models are performing as expected. | yields.io_model_validation | the set of principles, practices and organizational arrangements supporting a rigorous (audited) model development and validation cycle. | Open_Risk_Manual_model_validation | ||||||||
monitoring | Examination of the status of the activities of a supplier and of their results by the acquirer or a third party. | IEEE_Soft_Vocab | Continual checking, supervising, critically observing or determining the status in order to identify change from the performance level required or expected. | SP800-160 | ||||||||
moral agency | The capacity for moral action, reasoning, judgment, and decision making, as opposed to merely having moral consequences. | AI_Ethics_Mark_Coeckelbergh | ||||||||||
moral patiency | The moral standing of an entity in the sense of how that entity should be treated. | AI_Ethics_Mark_Coeckelbergh | ||||||||||
naive Bayes | The naive Bayes classifier is a Bayesian learning method that has been found to be useful in many practical applications. It is called "naive" because it incorporates the simplifying assumption that attribute values are conditionally independent, given the classification of the instance. The naive Bayes classifier applies to learning tasks where each instance x is described by a conjunction of attribute values and where the target function f(x) can take on any value from some finite set V. | Mitchell,_Tom | ||||||||||
natural language processing | The field concerned with machines capable of processing, analysing, and generating human language, either spoken, written or signed. | TTC6_Taxonomy_Terminology | ||||||||||
neural network | A model that, taking inspiration from the brain, is composed of layers (at least one of which is hidden) consisting of simple connected units or neurons followed by nonlinearities | aime_measurement_2022, citing Machine Learning Glossary by Google | ||||||||||
nondiscrimination | the practice of treating people, companies, countries, etc. in the same way as others in order to be fair | Cambridge_Dictionary_non-discrimination | In the context of machine learning non-discrimination can be defined as follows: (1) people that are similar in terms of non-protected characteristics should receive similar predictions, and (2) differences in predictions across groups of people can only be as large as justified by non-protected characteristics. | Žliobaitė_Indrė | ||||||||
normal flow | The intended flow of a process originating from a start event, continuing through all defined activities, and concluding successfully to its defined end event. | IEEE_Guide_IPA | ||||||||||
normalization | Conceptual procedure in database design that removes redundancy in a complex database by establishing dependencies and relationships between database entities. Normalization reduces storage requirements and avoids database inconsistencies. | OECD | The process of converting an actual range of values into a standard range of values, typically −1 to +1 or 0 to 1 | aime_measurement_2022, citing Machine Learning Glossary by Google | ||||||||
objective evidence | data supporting the existence or verity of something (note: can be obtained through observation, measurement, test, or other means). | ISO/IEC_TS_5723:2022(en) | ||||||||||
observation | a piece of information received online from users, sensors, or other knowledge sources | poole_mackworth_observation | the careful, close examination of an object, process, or other phenomenon for the purpose of collecting data about it or drawing conclusions. | APA_observation | ||||||||
offline learning | implies ... a static dataset that [one] know[s] from the start and the parameters of [one's] machine learning algorithm are adjusted to the whole dataset at once often loading the whole dataset into memory or in batches. | Ben_Auffarth_2021 | ||||||||||
online learning | fitting [one's] model incrementally as the data flows in (streaming data). | Ben_Auffarth_2021 | ||||||||||
ontology | A set of concepts and categories in a subject area or knowledge domain that shows their properties and the relationships among them to enable interoperability among disparate elements and systems and specify interfaces to independent, knowledge-based services for the purpose of enabling certain kinds of automated reasoning. | IEEE_Guide_IPA | ||||||||||
opacity | The nature of some AI techniques whereby the inferential operations are complex, hidden, or otherwise opaque to their developers and end users in terms of providing an understanding of how classifications, recommendations, or actions are generated and what overall performance will be. | NSCAI | A description of some deep learning systems [that] take an input and provide an output, but the calculations that occur in between are not easy for humans to interpret. | Hutson,_Matthew | When one or more features of an AI system, such as processes, the provenance of datasets, functions, output or behaviour are unavailable or incomprehensible to all stakeholders – usually an antonym for transparency. | TTC6_Taxonomy_Terminology | black box; unexplainable | |||||
operationalization | Putting AI systems or related concepts into use so they can be measured. | |||||||||||
operator | A role assumed by the person performing remote control or teleoperation, semi-autonomous operations, or other human-in-the-loop types of operations | SP1011 | Individual or organization that performs the operations of a system. | IEEE_Soft_Vocab | Individual or organization that performs the operations of a system. | SP800-160 | ||||||
opt-in | an individual makes an active affirmative indication of choice via a user interface signaling a desire to share their information with third parties. | IAPP_Privacy_Glossary | privacy; consent; opt-out | |||||||||
opt-out | an individual makes an active affirmative indication of choice via a user interface signaling a desire not to share their information with third parties. | IAPP_Privacy_Glossary | privacy; consent; opt-in | |||||||||
outcome | something that follows as a result or consequence | merriam_webster_outcome | ||||||||||
outlier | An outlier is a data point that is far from other points. | Russell_and_Norvig | An outlier is a data value that lies in the tail of the statistical distribution of a set of data values. | OECD | Values distant from most other values. In machine learning, any of the following are outliers: (1) weights with high absolute values; (2) predicted values relatively far away from the actual values; (3) input data whose values are more than roughly 3 standard deviations from the mean. Outliers often cause problems in model training. Clipping is one way of managing outliers. | aime_measurement_2022, citing Machine Learning Glossary by Google | ||||||
output | Data transmitted to an external destination | IEEE_Soft_Vocab | Process by which an information processing system, or any of its parts, transfers data outside of that system or part | IEEE_Soft_Vocab | ||||||||
overfitting | Given a hypothesis space H, a hypothesis h ∈ H is said to overfit the training data if there exists some alternative hypothesis h' ∈ H, such that h has smaller error than h' over the training examples, but h' has a smaller error than h over the entire distribution of instances. | Mitchell,_Tom | ||||||||||
package | a folder with all the code and metadata needed to train and serve a machine learning model. | about_ML_packages | ||||||||||
parametric | A learning model that summarizes data with a set of parameters of fixed size (independent of the number of training examples) | Russell_and_Norvig | ||||||||||
parent process | A process that may contain one or more sub-processes, activities, and tasks. | IEEE_Guide_IPA | ||||||||||
parity | Bit(s) used to determine whether a block of data has been altered. Rationale: Term has been replaced by the term “parity bit”. | NIST_CSRC_parity | the quality or state of being equal or equivalent | Merriam-Webster_parity | ||||||||
participation | engag[ing] multiple stakeholders in deliberative processes in order to achieve consensus. | Sloane_et_al_2020 | ||||||||||
participant | A computer system, data, input, business rule, human intervention, and other contributor to the flow of a process. | IEEE_Guide_IPA | a living individual about whom an investigator (whether professional or student) conducting research: (i) Obtains information or biospecimens through intervention or interaction with the individual, and uses, studies, or analyzes the information or biospecimens; or (ii) Obtains, uses, studies, analyzes, or generates identifiable private information or identifiable biospecimens. | 45_CFR_46_2018_Requirements_(2018_Common_Rule) | human subject | |||||||
passive learning agent | A passive learning agent has a fixed policy that determines its behavior. An active learning agent gets to decide what actions to take. | Russell_and_Norvig | active learning agent | |||||||||
personal data | ‘Personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person. | GDPR | (1) “Personal information” means information that identifies, relates to, describes, is reasonably capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household. Personal information includes, but is not limited to, the following if it identifies, relates to, describes, is reasonably capable of being associated with, or could be reasonably linked, directly or indirectly, with a particular consumer or household: (A) Identifiers such as a real name, alias, postal address, unique personal identifier, online identifier, Internet Protocol address, email address, account name, social security number, driver’s license number, passport number, or other similar identifiers. (B) Any personal information described in subdivision (e) of Section 1798.80. (C) Characteristics of protected classifications under California or federal law. (D) Commercial information, including records of personal property, products or services purchased, obtained, or considered, or other purchasing or consuming histories or tendencies. (E) Biometric information. (F) Internet or other electronic network activity information, including, but not limited to, browsing history, search history, and information regarding a consumer’s interaction with an internet website application, or advertisement. (G) Geolocation data. (H) Audio, electronic, visual, thermal, olfactory, or similar information. (I) Professional or employment-related information. (J) Education information, defined as information that is not publicly available personally identifiable information as defined in the Family Educational Rights and Privacy Act (20 U.S.C. Sec. 1232g; 34 C.F.R. Part 99). (K) Inferences drawn from any of the information identified in this subdivision to create a profile about a consumer reflecting the consumer’s preferences, characteristics, psychological trends, predispositions, behavior, attitudes, intelligence, abilities, and aptitudes. (L) Sensitive personal information. (2) “Personal information” does not include publicly available information or lawfully obtained, truthful information that is a matter of public concern. For purposes of this paragraph, “publicly available” means: information that is lawfully made available from federal, state, or local government records, or information that a business has a reasonable basis to believe is lawfully made available to the general public by the consumer or from widely distributed media; or information made available by a person to whom the consumer has disclosed the information if the consumer has not restricted the information to a specific audience. “Publicly available” does not mean biometric information collected by a business about a consumer without the consumer’s knowledge. (3) “Personal information” does not include consumer information that is deidentified or aggregate consumer information. | CCPA | ||||||||
policy | The general principles by which a government is guided in its management of public affairs, or the legislature in its measures. This term, as applied to a law, ordinance, or rule of law, denotes its general purpose or tendency considered as directed to the POLICY | law_policy_2023 | A policy defines the learning agent’s way of behaving at a given time | sutton_reinforcement_2018 | ||||||||
positionality | Awareness and discussion of one’s social and institutional position with regard to research, particularly of power imbalances, and limitations the researcher may have because of differences in lived experience. | Malik_2021 | the researcher's starting points and standpoints before and during inquiry, as well as the conditions shaping the research situation, process, and product. | Charmaz_Henwood | reflexivity | |||||||
post-hoc explanation | Post-hoc explainability targets models that are not readily interpretable by design by resorting to diverse means to enhance their interpretability, such as text explanations, visual explanations, local explanations, explanations by example, explanations by simplification and feature relevance explanations techniques. Each of these techniques covers one of the most common ways humans explain systems and processes by themselves. | NISTIR_8312_Full | Post-hoc explainability targets models that are not readily interpretable by design by resorting to diverse means to enhance their interpretability, such as text explanations, visual explanations, local explanations, explanations by example, explanations by simplification and feature relevance explanations techniques. Each of these techniques covers one of the most common ways humans explain systems and processes by themselves. | barredo_explainable_2020 | ||||||||
post-processing | Typically performed with the help of a holdout dataset (data not used in the training of the model). Here, the learned model is treated as a black box and its predictions are altered by a function during the post-processing phase. The function is deduced from the performance of the black box model on the holdout dataset. | SP1270 | Performed after training by accessing a holdout set that was not involved during the training of the model. If the algorithm can only treat the learned model as a black box without any ability to modify the training data or learning algorithm, then only post-processing can be used in which the labels assigned by the black-box model initially get reassigned based on a function during the post-processing phase. | Mehrabi,_Ninareh | Steps performed after a machine learning model has been run to adjust its output. This can include adjusting a model's outputs or using a holdout dataset — data not used in the training of the model — to create a function run on the model's predictions to improve fairness or meet business requirements. | IAPP_Governance_Terms | ||||||
post-processing algorithm | A bias mitigation algorithm that is applied to predicted labels. | AI_Fairness_360 | ||||||||||
practical significance | a conceptual framework for evaluating discrimination cases developed primarily on statistical evidence that is the subject of increasing interest and discussion by some in the equal employment opportunity (EEO) field. | DOL_Practical_Significance | statistical significance (often paired in contrast to this); substantive significance (synonym) | |||||||||
pre-processing algorithm | A bias mitigation algorithm that is applied to training data. | AI_Fairness_360 | ||||||||||
precision | A metric for classification models. Precision identifies the frequency with which a model was correct when classifying the positive class. | NSCAI | closeness of agreement between indications or measured quantity values obtained by replicate measurements on the same or similar objects under specified conditions | aime_measurement_2022, citing ISO/IEC Guide 99 | A metric for classification models. Precision identifies the frequency with which a model was correct when predicting the positive class. That is: Precision = True Positive / (True Positive + False Positive) | aime_measurement_2022, citing Machine Learning Glossary by Google | Closeness of agreement between independent test results obtained under prescribed conditions. It is generally dependent on analyte concentration, and this dependence should be determined and documented. The measure of precision is usually expressed in terms of imprecision and computed as a standard deviation of the test results. Higher imprecision is reflected by a larger standard deviation. Independent test results means results obtained in a manner not influenced by any previous results on the same or similar material. Precision covers repeatability and reproducibility. Alternatively, precision is a measure for the reproducibility of measurements within a set, that is, of the scatter or dispersion of a set about its central value. Precision depends only on the distribution of random errors and does not relate to the true value or specified value. | UNODC_Glossary_QA_GLP | ||||
prediction | Forecasting quantitative or qualitative outputs through function approximation, applied on input data or measurements. | NSCAI | primary output of an AI system when provided with input data or information | aime_measurement_2022, citing ISO/IEC 22989 | ||||||||
predictive analysis | The organization of analyses of structured and unstructured data for inference and correlation that provides a useful predictive capability to new circumstances or data. | IEEE_Guide_IPA | ||||||||||
predictive analytics | Insights, reporting, and information answering the question, "What is likely to happen?" Predictive analytics support high confidence foretelling of future event(s). | IEEE_Guide_IPA | ||||||||||
preprocessing | Transforming the data so that the underlying discrimination is mitigated. This method can be used if a modeling pipeline is allowed to modify the training data. | SP1270 | ||||||||||
prescriptive analytics | Insights, reporting, and information answering the question, “What should I do about it?” Prescriptive analytics determines information that provides high confidence actions necessary to recover from an event or fulfill a need. | IEEE_Guide_IPA | ||||||||||
privacy | freedom from intrusion into the private life or affairs of an individual | ISO/IEC_TS_5723:2022(en) | freedom from intrusion into the private life or affairs of an individual when that intrusion results from undue or illegal gathering and use of data about that individual | aime_measurement_2022, citing ISO/IEC TR 24029-1 | ||||||||
privacy-by-design | Embedding privacy measures and privacy enhancing technologies directly into the design of information technologies and systems. | ENISA | data-protection-by-design (def: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A02016R0679-20160504&qid=1532348683434) | |||||||||
privacy-enhancing technology | A coherent system of ICT (Information and Communications Technology) measures that protects privacy by eliminating or reducing personal data or by preventing unnecessary and/or undesired processing of personal data, all without losing the functionality of the information system. | PET_Handbook | ||||||||||
privileged protected attribute | A value of a protected attribute indicating a group that has historically been at systematic advantage. | AI_Fairness_360 | ||||||||||
procedure | Information item that presents an ordered series of steps to perform a process, activity, or task. | IEEE_Soft_Vocab | ||||||||||
process | A sequence or flow of activities in an organization with the objective of carrying out work, which may include a set of activities, events, tasks, and decisions in a sequenced flow that adhere to finite execution semantics. Process levels will generally follow structure at the capability maturity model integration (CMMI) level. | IEEE_Guide_IPA | Set of interrelated or interacting activities that transforms inputs into outputs | IEEE_Soft_Vocab | ||||||||
process flow | The defined representation of the overall progression of how a process is intended to be performed, including all exceptions. | IEEE_Guide_IPA | ||||||||||
processing | ‘Processing’ means any operation or set of operations which is performed on personal data or on sets of personal data, whether or not by automated means, such as collection, recording, organisation, structuring, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure or destruction. | GDPR | “Processing” means any operation or set of operations that are performed on personal information or on sets of personal information, whether or not by automated means. | CCPA | personal data; processing | |||||||
processing environment | the combination of software and hardware on which the Application runs. | Law_Insider_processing_environment | ||||||||||
processor | ‘Processor’ means a natural or legal person, public authority, agency or other body which processes personal data on behalf of the controller. | GDPR | “Processing” means any operation or set of operations that are performed on personal information or on sets of personal information, whether or not by automated means. | CCPA | personal data; processing; controller | |||||||
product manager | a specialized product management professional whose job is to manage the planning, development, launch, and success of products/solutions powered by AI, machine learning, and deep learning technologies. | productmanagerHQ_Josh_Fechter | ||||||||||
product owner | [person who is] focused on providing direction and prioritization for the cross-functional AI team, ensuring everyone remains focused on the overall vision and road map. This role is responsible for unifying individuals with diverse skills and backgrounds toward a common goal. | Forbes_Tracy_Kemp | ||||||||||
productization | [turning the best performing model] into an actual "data product," ready to be used in live services. | Towards_Productizing | ||||||||||
profiling | ‘Profiling’ means any form of automated processing of personal data consisting of the use of personal data to evaluate certain personal aspects relating to a natural person, in particular to analyse or predict aspects concerning that natural person's performance at work, economic situation, health, personal preferences, interests, reliability, behaviour, location or movements. | GDPR | “Profiling” means any form of automated processing of personal information, as further defined by regulations pursuant to paragraph (16) of subdivision (a) of Section 1798.185, to evaluate certain personal aspects relating to a natural person and in particular to analyze or predict aspects concerning that natural person’s performance at work, economic situation, health, personal preferences, interests, reliability, behavior, location, or movements. | CCPA | Measuring the characteristics of expected activity so that changes to it can be more easily identified. | CSRC | personal data; processing | |||||
protected attribute | An attribute that partitions a population into groups whose outcomes should have parity. Examples include race, gender, caste, and religion. Protected attributes are not universal, but are application specific. | AI_Fairness_360 | ||||||||||
protected class | [a feature] that may not be used as the basis for decisions [and] could be chosen because of legal mandates or because of organizational values. Some common protected [classes] include race, religion, national origin, gender, marital status, age, and socioeconomic status. | MIT_Protected_Attributes | A group of people with a common characteristic who are legally protected from [...] discrimination on the basis of that characteristic. Protected classes are created by both federal and state law. | Practical_Law_protected_class | ||||||||
prototype | A prototype is an original model constructed to include all the technical characteristics and performances of the new product. | OECD | ||||||||||
provisioning | The granting of access rights and executional privilege to an agent (human or machine) within an application(s) or system(s). | IEEE_Guide_IPA | ||||||||||
proxy | A variable that can stand in for another, usually not directly observable or measurable, variable. | SP1270 | ||||||||||
pseudo-anonymization (pseudonymization) | ‘Pseudonymisation’ means the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person; | GDPR | “Pseudonymize” or “Pseudonymization” means the processing of personal information in a manner that renders the personal information no longer attributable to a specific consumer without the use of additional information, provided that the additional information is kept separately and is subject to technical and organizational measures to ensure that the personal information is not attributed to an identified or identifiable consumer. | CCPA | A data management technique to strip identifiers linking data to an individual. | NSCAI | personal data; processing | |||||
quality | The totality of features and characteristics of a product or service that bear on its ability to satisfy stated or implied needs. | OECD | degree to which the characteristics of data satisfy stated and implied needs when used under specified conditions; degree to which a set of inherent characteristics of an object fulfils requirements (an object can be a product, process or service) | ISO/IEC_TS_5723:2022(en) | ||||||||
racialized | A socio-political process by which groups are ascribed a racial identity, whether or not members of the group self-identify as such | AAAS_AI_and_Bias_2022-09 | ||||||||||
ranking | a type of machine learning that sorts data in a relevant order[; often used by companies] to optimize search and recommendations. | DEV_ranking | position, order, or standing within a group : RANK | Merriam-Webster_ranking | ||||||||
recall | A metric for classification models; identifies the frequency with which a model correctly classifies the true positive items. | NSCAI | A metric for classification models that answers the following question: Out of all the possible positive labels, how many did the model correctly identify? That is: Recall = True Positive / (True Positive + False Negative) | aime_measurement_2022, citing Machine Learning Glossary by Google | ||||||||
recognition | the automatic discovery of regularities in data through the use of computer algorithms and with the use of these regularities to take actions such as classifying the data into different categories. | Pattern_Recognition_and_Machine_Learning | a sense of awareness and familiarity experienced when one encounters people, events, or objects that have been encountered before or when one comes upon material that has been learned in the past. | APA_recognition | to transfer prior learning or past experience to current consciousness: that is, to retrieve and reproduce information; to remember. | APA_recall | ||||||
recommendation system | A software tool and techniques that provide suggestions based on the customer's taste to discover new appropriate things for them by filtering personalized information based on the user's preferences from a large volume of information | Das,_Debashis | A subclass of information filtering systems that seek to predict the ‘rating’ or ‘preference’ that a user would give to an item (such as music, books or movies) or social element (e.g. people or group) they had not yet considered, using a model built from the characteristics of an item (content based approaches) or the user’s social environment (collaborative filtering approaches) | Sharma,_Lalita | ||||||||
rectification | An individual’s right to have personal data about them corrected or amended by a business or other organization if it is inaccurate. | IAPP_Privacy_Glossary | ||||||||||
red-team | A group of people authorized and organized to emulate a potential adversary’s attack or exploitation capabilities against an enterprise’s security posture. The Red Team’s objective is to improve enterprise cybersecurity by demonstrating the impacts of successful attacks and by demonstrating what works for the defenders (i.e., the Blue Team) in an operational environment. Also known as Cyber Red Team. | CSRC | ||||||||||
reference class | A class which is intended to describe structure and behavior of object identifiers. Its instances, called references, are passed by-value and indirectly represent objects by substituting for some primitive reference. | IGI_Global_reference_class | ||||||||||
reflexivity | A form of critical thinking that prompts us to consider the ‘whys’ and ‘hows’ of research, critically questioning the utility, ethics, and value of what, whom, and how we study | Jamieson_Govaart_Pownall | in qualitative research, the self-referential quality of a study in which the researcher reflects on the assumptions behind the study and especially the influence of his or her own motives, history, and biases on its conduct. | APA_reflexivity | positionality | |||||||
regression | Regression is a process of predicting the value to a yes or no label provided it falls on a continuous spectrum of input values, subcategory of supervised learning. | Ranschaert,_Erik | the prediction of an exact value using a given set of data | Saleh_Alkhalifa_ML_in_Biotech | ||||||||
reinforcement learning | A method of training algorithms to make suitable actions by maximizing rewarded behavior over the course of its actions. This type of learning can take place in simulated environments, such as game-playing, which reduces the need for real-world data. | NSCAI | Reinforcement learning (RL) is a subset of machine learning that allows an artificial system (sometimes referred to as an agent) in a given environment to optimize its behaviour. Agents learn from feedback signals received as a result of their actions, such as rewards or punishments, with the aim of maximizing the received reward. Such signals are computed based on a given reward function, which constitutes an abstract representation of the system's goal. The goal could be, for example, to earn a high video game score or to minimize idle worker time in a factory | TTC6_Taxonomy_Terminology | ||||||||
reliability | Reliability refers to the closeness of the initial estimated value(s) to the subsequent estimated values. | OECD | ability of an item to perform as required, without failure, for a given time interval, under given conditions. Note 1 to definition: The time interval duration can be expressed in units appropriate to the item concerned (e.g. calendar time, operating cycles, distance run, etc.) and the units should always be clearly stated. Note 2 to definition: Given conditions include aspects that affect reliability, such as: mode of operation, stress levels, environmental conditions, and maintenance. | ISO/IEC_TS_5723:2022(en) | property of consistent intended behaviour and results | aime_measurement_2022, citing ISO/IEC 22989 | ||||||
remediation | The process of treating data by cleaning, organizing, and migrating it to a safe and secure environment for optimized usage is called data remediation. Generally [understood] as a process involving deleting unnecessary or unused data. However, the actual process . . . is very detailed and includes several steps, including replacing, updating, or modifying data along with cleaning it, organizing it, and getting rid of unnecessary data. | CPO_Magazine_Amar_Kanagaraj | ||||||||||
reproducibility | Closeness of the agreement between the results of measurements of the same measurand carried out under changed conditions of measurement. | IEEE_Soft_Vocab | ||||||||||
requirement | something essential to the existence or occurrence of something else : CONDITION | Merriam-Webster_requirement | ||||||||||
residual | Residuals are differences between the one-step-predicted output from the model and the measured output from the validation data set. Thus, residuals represent the portion of the validation data not explained by the model. | MathWorks_Residual | ||||||||||
resilience | The ability to prepare for and adapt to changing conditions and withstand and recover rapidly from disruptions. Resilience includes the ability to withstand and recover from deliberate attacks, accidents, or naturally occurring threats or incidents. The ability of a system to adapt to and recover from adverse conditions. | NISTIR_8269_Draft | ability to anticipate and adapt to, resist, or quickly recover from a potentially disruptive event, whether natural or man-made; capability of a system to maintain its functions and structure in the face of internal and external change, and to degrade gracefully when this is necessary | ISO/IEC_TS_5723:2022(en) | ability of a system to recover operational condition quickly following an incident | aime_measurement_2022, citing ISO/IEC 22989 | ||||||
responsible AI | An AI system that aligns development and behavior to goals and values. This includes developing and fielding AI technology in a manner that is consistent with democratic values. | NSCAI | ||||||||||
result | The consequential outcome of completing a process. | IEEE_Guide_IPA | ||||||||||
retention limit | refers to the amount of information that is stored long-term, and can be measured in volume (the size of the total collected logs in bytes) and time (the number of months or years that logs are stored for). | Industrial_Network_Security_2011 | ||||||||||
risk | The composite measure of an event’s probability of occurring and the magnitude or degree of the consequences of the corresponding event. The impacts, or consequences, of AI systems can be positive, negative, or both and can result in opportunities or threats (Adapted from: ISO 31000:2018) | NIST_AI_RMF_1.0 | A measure of the extent to which an entity is threatened by a potential circumstance or event, and typically a function of: (i) the adverse impacts that would arise if the circumstance or event occurs; and (ii) the likelihood of occurrence. | SP800-12 | An uncertain event or condition that, if it occurs, has a positive or negative effect on a project's objectives | IEEE_Soft_Vocab | effect of uncertainty on objectives | ISO_IEC_38507 | ||||
risk control | mechanisms at the design, implementation, and evaluation stages [that can be taken] into consideration when developing responsible AI for organizations that includes security risks (cyber intrusion risks, privacy risks, and open source software risk), economic risks (e.g., job displacement risks), and performance risks (e.g., risk of errors and bias and risk of black box, and risk of explainability). | Toward_an_understanding_of_responsible_artificial_intelligence_practices | ||||||||||
risk tolerance | Risk tolerance refers to the organization’s or AI actor’s ... readiness to bear the risk in order to achieve its objectives. Risk tolerance can be influenced by legal or regulatory requirements. | NIST_AI_RMF_1.0 | ||||||||||
robotic desktop automation (RDA) | The computer application that makes available to a human operator a suite of predefined activity choreography to complete the execution of processes, activities, transactions, and tasks in one or more unrelated software systems to deliver a result or service in the course of human-initiated or -managed workflow. | IEEE_Guide_IPA | ||||||||||
robotic process automation (RPA) | A preconfigured software instance that uses business rules and predefined activity choreography to complete the autonomous execution of a combination of processes, activities, transactions, and tasks in one or more unrelated software systems to deliver a result or service with human exception management. | IEEE_Guide_IPA | Software to help in the automation of tasks, especially those that are tedious and repetitive. | NSCAI | ||||||||
robust AI | An AI system that is resilient in real-world settings, such as an object-recognition application that is robust to significant changes in lighting. The phrase also refers to resilience when it comes to adversarial attacks on AI components. | NSCAI | ||||||||||
robustness | ability of a system to maintain its level of performance under a variety of circumstances | ISO/IEC_TS_5723:2022(en) | The ability of a machine learning model/algorithm to maintain correct and reliable performance under different conditions (e.g., unseen, noisy, or adversarially manipulated data). | NISTIR_8269_Draft | ||||||||
root-mean-square deviation (RMSD) | of an estimator of a parameter[; ...] the square-root of the mean squared error (MSE) of the estimator. In symbols, if X is an estimator of the parameter t, then $\mathrm{RMSE}(X) = \left(E\left[(X - t)^2\right]\right)^{1/2}$. The RMSE of an estimator is a measure of the expected error of the estimator. The units of RMSE are the same as the units of the estimator. | Glossary_of_Statistical_Terms | a frequently used measure of the differences between values (sample or population values) predicted by a model or an estimator and the values observed | Wikipedia_RMSD | root-mean-square error (RMSE) | |||||||
row | describes a single entity or observation and the columns describe properties about that entity or observation. The more rows you have, the more examples from the problem domain that you have. | Machine_Learning_Mastery_Jason_Brownlee | ||||||||||
safety | property of a system such that it does not, under defined conditions, lead to a state in which human life, health, property, or the environment is endangered; [safety involves reducing both the probability of expected harms and the possibility of unexpected harms]. | ISO/IEC_TS_5723:2022(en) | freedom from risk which is not tolerable | aime_measurement_2022, citing ISO/IEC TR 24029-1 | ||||||||
scalability | The ability to increase or decrease the computational resources required to execute a varying volume of tasks, processes, or services. | IEEE_Guide_IPA | ||||||||||
score | A continuous value output from a classifier. Applying a threshold to a score results in a predicted label. | AI_Fairness_360 | ||||||||||
screen out | Screen-out discrimination occurs when “a disability prevents a job applicant or employee from meeting—or lowers their performance on—a selection criterion, and the applicant or employee loses a job opportunity as a result.” | EEOC_ADA_AI | ||||||||||
security | resistance to intentional, unauthorized act(s) designed to cause harm or damage to a system | ISO/IEC_TS_5723:2022(en) | degree to which a product or system protects information and data so that persons or other products or systems have the degree of data access appropriate to their types and levels of authorization | aime_measurement_2022, citing ISO/IEC TR 24029-1 | ||||||||
segmentation | The process of identifying homogeneous subgroups within a data table. | Raynor | ||||||||||
self-aware system | A computing platform imbued with sufficient knowledge and analytic capability to make useful conclusions about its inputs, its own processing, and the use of its output so that it is capable of self-judgment and improvement consistent with its purpose. | IEEE_Guide_IPA | ||||||||||
self-diagnosis | Ability of a system to adequately take measurement information from sensors, validate the data, and communicate the processes and results to other devices | SP1011 | ||||||||||
self-healing system | A computing system able to perceive that it is not operating correctly and, without human intervention, make the necessary adjustments to restore itself to normalcy. | IEEE_Guide_IPA | ||||||||||
semantic mapping | A strategic schema or framework of metadata labels applied to all data, data groups, data fields, data types, or data content used to introduce new or raw data into a corpus or data fabric to give machine learning algorithms direction for investigating known or potential relationships between data. A semantic map provides a structure for the introduction of new data, information, or knowledge | IEEE_Guide_IPA | ||||||||||
sensitivity analysis | A “what-if” type of analysis to determine the sensitivity of the outcomes to changes in parameters. If a small change in a parameter results in relatively large changes in the outcomes, the outcomes are said to be sensitive to that parameter. | OECD | ||||||||||
sensory digitization | The conversion of typically analog or human sensory perception (e.g., vision, speech) to a digital format useful for machine-to-human interaction or machine processing of traditionally analog sensory information [e.g., optical character recognition (OCR)]. | IEEE_Guide_IPA | ||||||||||
service | A collection of coordinated processes that takes one or more kinds of input, performs a value-added transformation, and creates an output that fulfills the needs of a customer [or shareholder]. | IEEE_Guide_IPA | ||||||||||
signal detection theory | a framework for interpreting data from experiments in which accuracy is measured. | Signal_Detection_Theory | ||||||||||
shallow learning | Techniques that separate the process of feature extraction from learning itself. | Reznik,_Leon | ||||||||||
situational awareness | Perception of elements in the system and/or environment and a comprehension of their meaning, which could include a projection of the future status of perceived elements and the uncertainty associated with that status. | SP800-160 | ||||||||||
socio-technical system | how humans interact with technology within the broader societal context | SP1270 | system that includes a combination of technical and human or natural elements | ISO/IEC_TS_5723:2022(en) | ||||||||
software testing | Activity in which a system or component is executed under specified conditions, the results are observed or recorded, and an evaluation is made of some aspect of the system or component. | IEEE_Soft_Vocab | ||||||||||
sparsity | refers to a matrix of numbers that includes many zeros or values that will not significantly impact a calculation. | Dave_Salvator_sparsity | ||||||||||
specification | A document that specifies, in a complete, precise, verifiable manner, the requirements, design, behavior, or other characteristics of a system or component and often the procedures for determining whether these provisions have been satisfied. | SP800-37 | ||||||||||
stakeholder | Individual or organization having a right, share, claim, or interest in a system or in its possession of characteristics that meet their needs and expectations. An individual, group, or organization who may affect, be affected by, or perceive itself to be affected by a decision, activity, or outcome of a project. | IEEE_Soft_Vocab | any individual, group, or organization that can affect, be affected by, or perceive itself to be affected by a decision or activity | ISO/IEC_TS_5723:2022(en) | ||||||||
standard deviation | The most widely used measure of dispersion of a frequency distribution introduced by K. Pearson (1893). It is equal to the positive square root of the variance. The standard deviation should not be confused with the root mean square deviation. | OECD | ||||||||||
start event | An activity, task, or input that describes or defines the beginning of a process. | IEEE_Guide_IPA | ||||||||||
statistical bias | A systematic tendency for estimates or measurements to be above or below their true values. Statistical biases arise from systematic as opposed to random error. Statistical bias can occur in the absence of prejudice, partiality, or discriminatory intent. | SP1270 | ||||||||||
statistical parity | The independence between the protected attribute and the outcome of the decision rule | Besse,_Philippe | ||||||||||
statistical significance | When the probability of obtaining a statistic of a given size due strictly to random sampling error, or chance, is less than the selected alpha level [or the probability of a type I error]; also represents a rejection of the null hypothesis. | Statistics_in_Plain_English | refers to whether a relationship between two or more variables exists beyond a probability expected by chance | The_SAGE_Encyclopedia_of_Communication_Research_Methods | ||||||||
statistics | Numerical data relating to an aggregate of individuals; the science of collecting, analysing and interpreting such data | OECD | ||||||||||
stereotype | a set of cognitive generalizations (e.g., beliefs, expectations) about the qualities and characteristics of the members of a group or social category. Stereotypes, like schemas, simplify and expedite perceptions and judgments, but they are often exaggerated, negative rather than positive, and resistant to revision even when perceivers encounter individuals with qualities that are not congruent with the stereotype. | APA_stereotype | Contemporary social psychology typically defines stereotypes as mental representations of a group and its members, and stereotyping as the cognitive activity of treating individual elements in terms of higher level categorial properties | Augoustinos_Walker_1998 | ||||||||
stochastic | The adjective “stochastic” implies the presence of a random variable; e.g. stochastic variation is variation in which at least one of the elements is a variate and a stochastic process is one wherein the system incorporates an element of randomness as opposed to a deterministic system. | OECD | ||||||||||
straight-through processing (STP) | The successful execution of a service, process, or transaction performed entirely through traditional application platforms with predefined interfaces (i.e., application programming interfaces [APIs]). | IEEE_Guide_IPA | ||||||||||
strawperson | a fallacious argument which irrelevantly attacks a position that appears similar to, but is actually different from, an opponent's position, and concludes that the opponent's real position has thereby been refuted. | Hughes_Lavery_Critical_Thinking | ||||||||||
stress test | Type of performance efficiency testing conducted to evaluate a test item's behavior under conditions of loading above anticipated or specified capacity requirements, or of resource availability below minimum specified requirements | IEEE_Soft_Vocab | ||||||||||
structured data | Data that has a predefined data model or is organized in a predefined way. | NIST_1500 | ||||||||||
sub-process | A subordinate process that can be included within a parent process. It can be present and/or repeated within other parent processes. | IEEE_Guide_IPA | ||||||||||
supervised learning | A type of machine learning in which the algorithm compares its outputs with the correct outputs during training. In unsupervised learning, the algorithm merely looks for patterns in a set of data. | Hutson,_Matthew | Algorithms that develop a mathematical model from the input data and known desired outputs. | Reznik,_Leon | For a computer to process a set of data whose attributes have been divided into two groups and derive a relationship between the values of one and the values of the other. These two groups are sometimes called predictors and targets, respectively. In statistical terminology, they are called independent and dependent variables, respectively. The learning is "supervised" because the distinction between the predictors and the target variables is chosen by the investigator or some other outside agency. | Raynor | a general subset of machine learning in which data, like its associated labels, is used to train models that can learn or generalize from the data to make predictions, preferably with a high degree of certainty. | Saleh_Alkhalifa_ML_in_Biotech | ||||
support vector machines | A supervised machine learning model for data classification and regression analysis. One of the most used classifiers in machine learning. It optimizes the width of the gap between the points of separate categories in feature space. | Ranschaert,_Erik | ||||||||||
system | combination of interacting elements organized to achieve one or more stated purposes | ISO/IEC_TS_5723:2022(en) | ||||||||||
systemic bias | Systemic biases result from procedures and practices of particular institutions that operate in ways which result in certain social groups being advantaged or favored and others being disadvantaged or devalued. This need not be the result of any conscious prejudice or discrimination but rather of the majority following existing rules or norms. | D. Chandler and R. Munday, A Dictionary of Media and Communication. Oxford University Press, Jan. 2011. | ||||||||||
system of systems | set of systems and system elements that interact to provide a unique capability that none of the constituent systems can accomplish on its own (note: can be necessary to facilitate interaction of the constituent systems in the system of systems) | ISO/IEC_TS_5723:2022(en) | ||||||||||
target | a method for solving a problem that an AI algorithm parses its training data to find. Once an algorithm finds its target function, that function can be used to predict results (predictive analysis). The function can then be used to find output data related to inputs for real problems where, unlike training sets, outputs are not included. | TechTarget_target_function | target variable, target value | |||||||||
task | The performance of a discrete activity with a defined start, stop, and outcome that cannot be broken down to a finer level of detail. | IEEE_Guide_IPA | Required, recommended, or permissible action, intended to contribute to the achievement of one or more outcomes of a process | IEEE_Soft_Vocab | set of activities undertaken in order to achieve a specific goal | aime_measurement_2022, citing ISO/IEC TR 24030 | ||||||
taxonomy | Taxonomy refers to classification according to presumed natural relationships among types and their subtypes. | OECD | ||||||||||
technical control | Security controls (i.e., safeguards or countermeasures) for an information system that are primarily implemented and executed by the information system through mechanisms contained in the hardware, software, or firmware components of the system. | NIST_SP_800-30_Rev_1 | ||||||||||
technochauvinism | The belief that technology is always the solution | M. Broussard, Artificial Unintelligence: How Computers Misunderstand the World. MIT Press, 2018. | techno-solutionism | |||||||||
test | Technical operation to determine one or more characteristics of or to evaluate the performance of a given product, material, equipment, organism, physical phenomenon, process or service according to a specified procedure. | UNODC_Glossary_QA_GLP | any activity aimed at evaluating an attribute or capability of a program or system and determining that it meets its required results. | William_Hetzel | (1) activity in which a system or component is executed under specified conditions, the results are observed or recorded, and an evaluation is made of some aspect of the system or component; (2) to conduct an activity as in (1); (3) set of one or more test cases and procedures. | aime_measurement_2022, citing ISO/IEC 24765 | the process of executing a program with the intent of finding errors. | The_Art_of_Software_Testing | Test, Evaluation, Verification and Validation (TEVV) | |||
Test and Evaluation, Verification and Validation (TEVV) | A framework for assessing, incorporating methods and metrics to determine that a technology or system satisfactorily meets its design specifications and requirements, and that it is sufficient for its intended use. | NSCAI_Report | ||||||||||
third party | an entity that is involved in some way in an interaction that is primarily between two other entities. [Please see note, especially regarding NIST CSRC terms that we might incorporate into this definition.] | TechTarget_third_party | ||||||||||
three lines of defense | Most financial institutions follow a three-lines-of-defense model, which separates front line groups, which are generally accountable for business risks (the First Line), from other risk oversight and independent challenge groups (the Second Line) and assurance (the Third Line) | AIRS_Penn | ||||||||||
traceability | Ability to trace the history, application or location of an entity by means of recorded identification. ["Chain of custody" is a related term.] Alternatively, traceability is a property of the result of a measurement or the value of a standard whereby it can be related, with a stated uncertainty, to stated references, usually national or international standards, i.e. through an unbroken chain of comparisons. In this context, the standards referred to are measurement standards rather than written standards. | UNODC_Glossary_QA_GLP | A characteristic of an AI system enabling a person to understand the technology, development processes, and operational capabilities (e.g., with transparent and auditable methodologies along with documented data sources and design procedures). | NSCAI | ||||||||
training data | A dataset from which a model is learned. | AI_Fairness_360 | samples for training used to fit a machine learning model | aime_measurement_2022, citing ISO/IEC 22989 | ||||||||
transaction | Enactment of a process represented by a set of coordinated activities carried out by multiple systems and/or participants in accordance with defined relationships. This coordination leads to an intentional, consistent, and verifiable result across all participants. | IEEE_Guide_IPA | ||||||||||
transfer learning | A technique in machine learning in which an algorithm learns to perform one task, such as recognizing cars, and builds on that knowledge when learning a different but related task, such as recognizing cats. | Hutson,_Matthew | ||||||||||
transformer | A procedure that modifies a dataset. | AI_Fairness_360 | ||||||||||
transparency | open, comprehensive, accessible, clear and understandable presentation of information; property of a system or process to imply openness and accountability | ISO/IEC_TS_5723:2022(en) | Understanding the working logic of the model. | NISTIR_8269_Draft | property of an organization that appropriate activities and decisions are communicated to relevant stakeholders (3.5.13) in a comprehensive, accessible and understandable manner Note 1 to entry: Inappropriate communication of activities and decisions can violate security, privacy or confidentiality requirements. | iso_22989_2022 | property of a system that appropriate information about the system is made available to relevant stakeholders (3.5.13) Note 1 to entry: Appropriate information for system transparency can include aspects such as features, performance, limitations, components, procedures, measures, design goals, design choices and assumptions, data sources and labelling protocols. Note 2 to entry: Inappropriate disclosure of some aspects of a system can violate security, privacy or confidentiality requirements. | iso_22989_2022 | ||||
true negative | an outcome where the model correctly predicts the negative class. | google_dev_classification-true-false-positive-negative | ||||||||||
true positive | an outcome where the model correctly predicts the positive class. | google_dev_classification-true-false-positive-negative | ||||||||||
trust | the system status in the mind of human beings based on their perception of and experience with the system; concerns the attitude that a person or technology will help achieve specific goals in a situation characterized by uncertainty and vulnerability. | DOD_TEVV | degree to which a user or other stakeholder has confidence that a product or system will behave as intended | aime_measurement_2022, citing ISO/IEC TR 24029-1 | ||||||||
trustworthiness | The degree to which an information system (including the information technology components that are used to build the system) can be expected to preserve the confidentiality, integrity, and availability of the information being processed, stored, or transmitted by the system across the full range of threats and individuals’ privacy. | SP800-37 | Worthy of being trusted to fulfill whatever critical requirements may be needed for a particular component, subsystem, system, network, application, mission, enterprise, or other entity. | SP800-160 | ability to meet stakeholders' expectations in a verifiable way; an attribute that can be applied to services, products, technology, data and information as well as to organizations. | ISO/IEC_TS_5723:2022(en) | ||||||
trustworthy AI | Characteristics of trustworthy AI systems include: valid and reliable, safe, secure and resilient, accountable and transparent, explainable and interpretable, privacy-enhanced, and fair with harmful bias managed. | NIST_AI_RMF_1.0 | Trustworthy AI has three components: (1) it should be lawful, ensuring compliance with all applicable laws and regulations (2) it should be ethical, demonstrating respect for, and ensure adherence to, ethical principles and values and (3) it should be robust, both from a technical and social perspective, since, even with good intentions, AI systems can cause unintentional harm. Trustworthy AI concerns not only the trustworthiness of the AI system itself but also comprises the trustworthiness of all processes and actors that are part of the system’s life cycle. | european_ethics_2019 | Trustworthy AI has three components: (1) it should be lawful, ensuring compliance with all applicable laws and regulations (2) it should be ethical, demonstrating respect for, and ensure adherence to, ethical principles and values and (3) it should be robust, both from a technical and social perspective, since, even with good intentions, AI systems can cause unintentional harm. Characteristics of Trustworthy AI systems include: valid and reliable, safe, secure and resilient, accountable and transparent, explainable and interpretable, privacy-enhanced, and fair with harmful bias managed. Trustworthy AI concerns not only the trustworthiness of the AI system itself but also comprises the trustworthiness of all processes and actors that are part of the AI system’s life cycle. Trustworthy AI is based on respect for human rights and democratic values. | TTC6_Taxonomy_Terminology | ||||||
type I error | The null hypothesis H0 is rejected, even though it is [true] | berthold_guide_2020 | false positive rate | james_statistical_2014 | ||||||||
type II error | The null hypothesis H0 is accepted, even though it is [false] | berthold_guide_2020 | false negative rate (1 − true positive rate) | james_statistical_2014 | ||||||||
uncertainty | Result of not having accurate or sufficient knowledge of a situation; state, even partial, of deficiency of information related to understanding or knowledge of an event, its consequence, or likelihood | IEEE_Soft_Vocab | ||||||||||
underfitting | Underfitting occurs when a statistical model cannot adequately capture the underlying structure of the data. | Ranschaert,_Erik | ||||||||||
underrepresentation | inadequately represented. (See note.) | Merriam-Webster_underrepresented | when members of discernible groups are not consistently present in representative bodies and among measures of well-being in numbers roughly proportionate to their numbers within the population. | Encyclopedia.com_underrepresentation | ||||||||
unexplainable | impossibility of providing an explanation for certain decisions made by an intelligent system which is both 100% accurate and comprehensible. | Roman_V._Yampolskiy_Unexplainability | black box; opacity | |||||||||
unstructured data | Data that does not have a predefined data model or is not organized in a predefined way | |||||||||||
unsupervised learning | A learning strategy that consists in observing and analyzing different entities and determining that some of their subsets can be grouped into certain classes, without any correctness test being performed on acquired knowledge through feedback from external knowledge sources. Note 1 to entry: Once a concept is formed, it is given a name that may be used in subsequent learning of other concepts. | iso_2382_1997 | ||||||||||
usability | extent to which a system, product or service can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use (note 1: The “specified” users, goals and context of use refer to the particular combination of users, goals and context of use for which usability is being considered; note 2: used as a qualifier to refer to the design knowledge, competencies, activities and design attributes that contribute to usability, such as usability expertise, usability professional, usability engineering, usability method, usability evaluation, usability heuristic). [See also: ISO/IEC 9241-11, Ergonomics of Human-System Interaction, Part 11: Usability: Definitions and Concepts. ISO, Geneva, Switzerland, 2018, https://www.iso.org/standard/63500.html.] | ISO/IEC_TS_5723:2022(en) | ||||||||||
usability testing | refers to evaluating a product or service by testing it with representative users. Typically, during a test, participants will try to complete typical tasks while observers watch, listen and take notes. The goal is to identify any usability problems, collect qualitative and quantitative data and determine the participant's satisfaction with the product. | Usabilitygov | ||||||||||
user | individual or group that interacts with a system or benefits from a system during its utilization | IEEE_Soft_Vocab | A person, organization, or other entity which requests access to and uses the resources of a computer system or network. | CSRC | ||||||||
user-centered design | the practice of the following principles: the active involvement of users for a clear understanding of user and task requirements, iterative design and evaluation, and a multi-disciplinary approach | Vredenburg,_Karel | Approach to system design and development that aims to make interactive systems more usable by focusing on the use of the system; applying human factors, ergonomics and usability knowledge and techniques. | IEEE_Soft_Vocab | ||||||||
validation | Confirmation by examination and provision of objective evidence that the particular requirements for a specific intended use are fulfilled. | UNODC_Glossary_QA_GLP | Confirmation, through the provision of objective evidence, that the requirements for a specific intended use or application have been fulfilled. | IEEE_Soft_Vocab | provides objective evidence that the capability provided by the system complies with stakeholder performance requirements, achieving its use in its intended operational environment; answers the question, "Is it the right solution to the problem?" [C]onsists of evaluating the operational effectiveness, operational suitability, sustainability, and survivability of the system or system elements under operationally realistic conditions. | DOD_TEVV | A continuous monitoring of the process of compilation and of the results of this process. | OECD | Test and Evaluation, Verification, and Validation (TEVV) | |||
value sensitive design | a theoretically grounded approach to the design of technology that accounts for human values in a principled and systematic manner throughout the design process. | Friedman_et_al_2017 | ||||||||||
variable | A variable is a characteristic of a unit being observed that may assume more than one of a set of values to which a numerical measure or a category from a classification can be assigned. | OECD | Quantity or data item whose value can change | IEEE_Soft_Vocab | ||||||||
variance | The variance is the mean square deviation of the variable around the average value. It reflects the dispersion of the empirical values around its mean. | OECD | A quantifiable deviation, departure, or divergence away from a known baseline or expected value | IEEE_Soft_Vocab | ||||||||
verifiable | can be checked for correctness by a person or tool | ISO/IEC_TS_5723:2022(en) | provides evidence that the system or system element performs its intended functions and meets all performance requirements listed in the system performance specification and functional and allocated baselines; answers the question, "Did you build the system correctly?" | DOD_TEVV | Test and Evaluation, Verification and Validation (TEVV) | |||||||
word embedding | a popular framework to represent text data as vectors which has been used in many machine learning and natural language processing tasks. . . . A word embedding, trained on word co-occurrence in text corpora, represents each word (or common phrase) $w$ as a $d$-dimensional word vector $\vec{w} \in \mathbb{R}^d$. It serves as a dictionary of sorts for computer programs that would like to use word meaning. First, words with similar semantic meanings tend to have vectors that are close together. Second, the vector differences between words in embeddings have been shown to represent relationships between words. | Bolukbasi_et_al_Debiasing_Word_Embeddings | ||||||||||
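
Several of the statistical entries above (true positive, true negative, type I error, type II error, variance, standard deviation) can be illustrated with a small calculation. The following is a minimal sketch, not part of the glossary itself: the function names and the toy labels are hypothetical, binary labels are assumed to be coded as 1 (positive) and 0 (negative), and the variance is computed as the mean square deviation around the mean (dividing by n), which follows the wording of the OECD definition rather than the sample (n − 1) estimator.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for t, p in pairs if t == 1 and p == 1),  # true positives
        "tn": sum(1 for t, p in pairs if t == 0 and p == 0),  # true negatives
        "fp": sum(1 for t, p in pairs if t == 0 and p == 1),  # false positives
        "fn": sum(1 for t, p in pairs if t == 1 and p == 0),  # false negatives
    }


def error_rates(counts):
    """Derive rates from the counts.

    Assumes at least one actual negative and one actual positive,
    otherwise the divisions below would fail.
    """
    fpr = counts["fp"] / (counts["fp"] + counts["tn"])  # type I error rate
    fnr = counts["fn"] / (counts["fn"] + counts["tp"])  # type II error rate
    return {
        "false_positive_rate": fpr,
        "false_negative_rate": fnr,
        "true_positive_rate": 1 - fnr,
    }


def variance_and_std(values):
    """Population variance (mean square deviation) and its positive square root."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return variance, sqrt(variance)


# Hypothetical toy data for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
rates = error_rates(counts)
variance, std = variance_and_std([2, 4, 4, 4, 5, 5, 7, 9])
```

With these toy labels there are three true positives, three true negatives, one false positive and one false negative, so the false positive rate and the false negative rate are both 0.25. The sample values have variance 4 and standard deviation 2.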