Before we begin working on hands-on projects, it’s essential to cover the basics of AI and the key building blocks of machine learning. Two fundamental building blocks are features (the inputs we provide) and labels (the outputs we aim to predict). In this article, I’ll walk you through what labels and features are, their different types, and how they are used across various types of machine learning.
Let’s dive in!
Overview
What are Labels?
Labels, also known as targets or outcomes, are the output data we want to predict or classify.
Example:
In an image recognition problem, the label is the category the image belongs to, such as ‘cat’, ‘dog’, or ‘car’.
What are Features?
Features, also known as variables or attributes, are the input information we give to the computer to help it learn and make decisions.
Depending on the type of data and the machine learning algorithm, features can be numeric, categorical, or text-based.
Example:
In an image recognition problem, features could be details in the image such as colors, shapes, and patterns.
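To make the distinction concrete, here is a minimal sketch of how features and labels are typically separated in code. The dataset, column names, and values are made up purely for illustration:

```python
import pandas as pd

# Hypothetical toy dataset: each row is one example.
data = pd.DataFrame({
    "color":  ["red", "blue", "green"],  # feature
    "weight": [1.2, 0.8, 1.5],           # feature
    "animal": ["cat", "dog", "cat"],     # label: what we want to predict
})

X = data[["color", "weight"]]  # features: the input we provide
y = data["animal"]             # labels: the output we aim to predict
```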
Types of Features
Qualitative Features
Qualitative features, also known as categorical features, represent data that can be divided into distinct categories based on qualities or attributes. They do not have a numerical value and, with the exception of ordinal features, no inherent order. These features describe the characteristics of the data.
Nominal: No inherent order; examples include colors and car brands. One-hot encoding is commonly used.
| Car ID | Color |
|--------|-------|
| 1      | Red   |
| 2      | Blue  |
| 3      | Green |
| 4      | Blue  |
| 5      | Red   |
Ordinal: Have a meaningful order; examples include education levels and customer satisfaction ratings. One-hot encoding can be used, but ordinal encoding may be more appropriate.
Binary: Yes/No decisions, presence/absence of a feature.
How to feed features into the system?
You can use one-hot encoding for nominal data and ordinal encoding for ordinal data, as sketched below.
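Here is a minimal sketch of both encodings using pandas. The column names and the category order are assumptions made for the example:

```python
import pandas as pd

df = pd.DataFrame({
    "color": ["Red", "Blue", "Green", "Blue", "Red"],   # nominal
    "education": ["High School", "Bachelor", "Master",
                  "Bachelor", "PhD"],                   # ordinal
})

# One-hot encoding for the nominal feature: each category becomes
# its own 0/1 column, and no order is implied.
onehot = pd.get_dummies(df["color"], prefix="color")
df = df.join(onehot)

# Ordinal encoding for the ordinal feature: map categories to
# integers that preserve their meaningful order.
education_order = {"High School": 0, "Bachelor": 1, "Master": 2, "PhD": 3}
df["education_encoded"] = df["education"].map(education_order)

print(df)
```

The key design choice is that one-hot encoding never invents an order between categories (Red is not "more" than Blue), while ordinal encoding deliberately preserves one (PhD ranks above Bachelor).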
Quantitative Features
Quantitative features represent data that can be measured and expressed numerically. These features quantify the attributes of the data and can be either discrete or continuous.
- Discrete: countable values, such as the number of items sold or the number of children.
- Continuous: values measured on a continuum, such as weight, height, temperature, or price.
Feature Engineering
Feature engineering is the process of using domain knowledge to create new features from raw data that help improve the performance of machine learning models. It involves transforming, creating, and selecting features that make the learning process more efficient and effective. The goal is to enhance the predictive power of the model by providing it with the most relevant and informative features.
Common Feature Engineering Techniques
- Mathematical Transformations: Apply operations such as log, square root, or square to existing features.
- Aggregation: Summarize or aggregate data to create new features, such as monthly averages from daily data.
- Date and Time Features: Extract components from datetime data, such as year, month, day, hour, or day of the week.
- Scaling: Standardize or normalize features to ensure they are on a similar scale.
- Handling Missing Values: Impute missing data with appropriate substitutes or remove them if necessary.
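The following sketch applies each of the techniques listed above to a hypothetical daily sales dataset (the column names and values are invented for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical daily sales data.
df = pd.DataFrame({
    "date":  pd.to_datetime(["2024-01-05", "2024-01-20", "2024-02-11"]),
    "price": [10.0, 250.0, 40.0],
    "units": [3, 1, np.nan],
})

# Mathematical transformation: log to compress a skewed price scale.
df["log_price"] = np.log1p(df["price"])

# Date and time features: extract components from the datetime column.
df["month"] = df["date"].dt.month
df["day_of_week"] = df["date"].dt.dayofweek

# Handling missing values: impute units with the column median.
df["units"] = df["units"].fillna(df["units"].median())

# Scaling: standardize price to zero mean and unit variance.
df["price_scaled"] = (df["price"] - df["price"].mean()) / df["price"].std()

# Aggregation: monthly average price as a new feature.
df["monthly_avg_price"] = df.groupby("month")["price"].transform("mean")

print(df)
```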
Features and Labels in Different Types of Machine Learning
You might have come across different learning methods. Here is a quick summary of how features and labels are used in each, to help you better grasp the concept.
| Type | Definition | Labels | Features |
|------|------------|--------|----------|
| Supervised Learning | Training on fully labeled data to predict labels for new inputs. | Relies on accurate, fully labeled data. | Features are used to predict known labels; quality and relevance are crucial. |
| Semi-Supervised Learning | Combining labeled and unlabeled data to improve learning accuracy. | Combines a small amount of labeled data with a large amount of unlabeled data. | Features from both labeled and unlabeled data improve learning; leveraging unlabeled data is essential. |
| Weakly-Supervised Learning | Training on data with noisy or incomplete labels to still extract useful patterns. | Works with noisy or imprecise labels. | Features need to be robust to noisy labels; reliable feature extraction and selection are vital. |
| Webly Supervised Learning | Leveraging large-scale, diverse, and noisy web-derived data for model training. | Utilizes web-derived labels, often dealing with noisy information. | Features are diverse and noisy; advanced extraction and normalization techniques are needed. |
| Unsupervised Learning | Finding patterns and structures in data without labels. | Uses no labels; focuses on finding patterns. | Features are used to identify hidden patterns; selection and dimensionality reduction are key. |
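To see how the presence or absence of labels changes the training call, here is a minimal sketch with scikit-learn and made-up toy data, contrasting the supervised and unsupervised rows of the table above:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

X = np.array([[1.0, 2.0], [1.5, 1.8], [8.0, 8.5], [9.0, 9.2]])  # features
y = np.array([0, 0, 1, 1])                                      # labels

# Supervised learning: both features AND labels go into training.
clf = LogisticRegression().fit(X, y)
print(clf.predict([[1.2, 1.9]]))  # predicts a label for a new input

# Unsupervised learning: only features; the model finds structure itself.
km = KMeans(n_clusters=2, n_init=10).fit(X)
print(km.labels_)  # cluster assignments discovered without any labels
```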