Machine Learning: Study Notes

General Science, July 28, 2025

What is Machine Learning?

Machine Learning (ML) is a branch of artificial intelligence (AI) focused on building systems that learn from data, identify patterns, and make decisions with minimal human intervention.

Definition: ML enables computers to learn from experience (data) rather than being explicitly programmed for specific tasks.
Key Concept: Algorithms generally improve their performance on a task as they are exposed to more relevant data.

Historical Context

Early Ideas

1950: Alan Turing posed the question “Can machines think?” and proposed the imitation game, now known as the Turing Test.
1952: Arthur Samuel created a checkers program that improved by playing games.
1957: Frank Rosenblatt invented the perceptron, an early neural network.

Timeline

Year	Event
1950	Turing Test for machine intelligence
1952	Samuel’s self-learning checkers program
1957	Rosenblatt’s perceptron
1967	Nearest Neighbor algorithm for pattern recognition
1986	Backpropagation popularized for training neural networks (Rumelhart, Hinton, and Williams)
1997	IBM’s Deep Blue defeats chess champion Garry Kasparov
2006	Geoffrey Hinton and colleagues publish work on deep belief networks, helping popularize the term “deep learning”
2012	AlexNet wins ImageNet competition, revolutionizing image recognition
2020	GPT-3 released, showcasing advanced natural language processing

Types of Machine Learning

Supervised Learning: Algorithms learn from labeled data (e.g., spam detection).
Unsupervised Learning: Algorithms find patterns in unlabeled data (e.g., customer segmentation).
Reinforcement Learning: Algorithms learn by trial and error, receiving rewards or penalties (e.g., game playing, robotics).
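
To make the first two categories concrete, here is a minimal sketch that runs a supervised and an unsupervised method on the same toy data. The data values, the nearest-centroid classifier, and the two-cluster k-means routine are all assumptions chosen for brevity, not methods named in these notes. Reinforcement learning is not covered by this sketch.

```python
# Toy 1-D feature values with labels (hypothetical data, for illustration only).
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]  # used only by the supervised part


def mean(xs):
    return sum(xs) / len(xs)


# Supervised: the labels tell us which points belong together,
# so we compute one centroid per labeled class.
def fit_centroids(xs, ys):
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {y: mean(members) for y, members in groups.items()}


def predict(centroids, x):
    # Assign x to the class whose centroid is closest.
    return min(centroids, key=lambda y: abs(centroids[y] - x))


# Unsupervised: no labels. Two-cluster k-means discovers the groups itself.
# Assumes the data contains at least two distinct values.
def two_means(xs, iters=10):
    lo, hi = min(xs), max(xs)  # initial cluster centers
    for _ in range(iters):
        near_lo = [x for x in xs if abs(x - lo) <= abs(x - hi)]
        near_hi = [x for x in xs if abs(x - lo) > abs(x - hi)]
        lo, hi = mean(near_lo), mean(near_hi)
    return lo, hi


centroids = fit_centroids(points, labels)
centers = two_means(points)
```

On this data both approaches recover the same two groups, but only the supervised one can name them, because only it was given labels.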

How Machine Learning Works

Data Collection: Gather relevant data.
Data Preparation: Clean and format data.
Model Selection: Choose an algorithm (e.g., decision tree, neural network).
Training: Feed data into the model so it can learn patterns.
Evaluation: Test the model on held-out data it did not see during training to assess accuracy or other metrics.
Deployment: Use the model for real-world predictions.
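
The steps above can be sketched end to end in a few lines. This is an illustrative outline only: the records are made up, the train/test split is a fixed slice rather than a random one, and a 1-nearest-neighbor classifier stands in for whatever model a real project would select.

```python
# 1. Data collection: hypothetical (hours_studied, passed_exam) records.
raw = [(1.0, 0), (2.0, 0), (None, 1), (3.0, 0), (4.0, 0),
       (6.0, 1), (7.0, 1), (8.0, 1), (9.0, 1), (2.5, 0), (7.5, 1)]

# 2. Data preparation: drop incomplete rows, then hold out rows for testing.
clean = [(x, y) for x, y in raw if x is not None]
train, test = clean[:8], clean[8:]


# 3. Model selection: 1-nearest-neighbor (chosen here for brevity).
# 4. Training: for nearest neighbor, "training" just means keeping the examples.
def predict(train_rows, x):
    nearest = min(train_rows, key=lambda row: abs(row[0] - x))
    return nearest[1]


# 5. Evaluation: accuracy on rows the model has not seen.
def accuracy(train_rows, test_rows):
    hits = sum(predict(train_rows, x) == y for x, y in test_rows)
    return hits / len(test_rows)


test_accuracy = accuracy(train, test)

# 6. Deployment: call predict() on new, real-world inputs.
new_prediction = predict(train, 8.5)
```

Keeping the test rows out of training is the important design choice: accuracy measured on the training rows themselves would say little about how the model behaves on new inputs.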

Core Concepts

Features: Individual measurable properties of data.
Labels: Desired output for supervised learning.
Loss Function: Measures how far predictions are from actual results.
Optimization: Adjusting model parameters to minimize loss.
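
As a rough illustration of how these four concepts fit together, the sketch below fits a line to a handful of points. The data, the mean squared error loss, the learning rate, and plain gradient descent are assumptions picked for simplicity; real systems use many other losses and optimizers.

```python
# Features (xs) and labels (ys). The values are made up and follow y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def mse(w, b):
    # Loss function: mean squared error between predictions w*x + b and labels.
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_step(w, b, lr=0.05):
    # Optimization: move the parameters a small step against the loss gradient.
    n = len(xs)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return w - lr * grad_w, b - lr * grad_b


w, b = 0.0, 0.0            # model parameters, starting from an arbitrary guess
initial_loss = mse(w, b)
for _ in range(2000):
    w, b = gradient_step(w, b)
final_loss = mse(w, b)
```

After the loop, `w` and `b` should sit close to 2 and 1, and `final_loss` should be far below `initial_loss`.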

[Diagram: Machine Learning Workflow]

Surprising Facts

The human brain has more connections (synapses) than there are stars in the Milky Way.
- Estimates: roughly 100 trillion synapses vs. 100–400 billion stars.
Machine learning models can sometimes outperform human experts in complex tasks, such as diagnosing certain diseases from medical images.
Adversarial examples (inputs with tiny, deliberately chosen changes) can cause ML models to make major mistakes, revealing failure modes that differ from those of human perception.
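
The adversarial-example effect can be reproduced on a toy model. The sketch below uses a hypothetical linear classifier with hand-picked weights and an input that sits near the decision boundary. The perturbation rule follows the spirit of the fast gradient sign method, which is an assumption here rather than something these notes describe.

```python
# Hypothetical linear classifier: score = w . x, predict class 1 if score > 0.
n = 100
w = [1.0 if i % 2 == 0 else -1.0 for i in range(n)]
x = [0.51 if i % 2 == 0 else 0.49 for i in range(n)]  # score is about +1.0


def score(weights, inputs):
    return sum(wi * xi for wi, xi in zip(weights, inputs))


def sign(v):
    return (v > 0) - (v < 0)


def perturb(weights, inputs, eps):
    # For a linear model the gradient of the score with respect to the input
    # is the weight vector, so stepping each feature by eps against sign(w)
    # lowers the score as fast as possible under a per-feature budget of eps.
    return [xi - eps * sign(wi) for wi, xi in zip(weights, inputs)]


x_adv = perturb(w, x, eps=0.02)
original_score = score(w, x)         # positive: classified as class 1
adversarial_score = score(w, x_adv)  # negative: classified as class 0
largest_change = max(abs(a - b) for a, b in zip(x, x_adv))
```

No single feature moves by more than 0.02, yet the prediction flips, because the small changes all push the score in the same direction and add up across 100 features.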

Applications of Machine Learning

Healthcare: Disease prediction, drug discovery, medical imaging.
Finance: Fraud detection, algorithmic trading.
Transportation: Self-driving cars, route optimization.
Retail: Recommendation engines, inventory management.
Natural Language Processing: Language translation, chatbots.

Ethical Issues in Machine Learning

Bias and Fairness: ML models can perpetuate or amplify biases present in training data, leading to unfair outcomes.
Privacy: Use of personal data raises concerns about consent and data protection.
Transparency: Many ML models (especially deep learning) are “black boxes,” making it difficult to understand their decisions.
Accountability: Determining responsibility for decisions made by autonomous systems is challenging.
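
Bias can be measured as well as discussed. One simple check, shown here as an assumed example rather than a standard these notes prescribe, compares how often a model gives a positive prediction to each group (often called the demographic parity gap). The predictions and group names below are invented.

```python
def positive_rate(predictions, groups, group):
    # Share of positive (1) predictions among members of one group.
    selected = [p for p, g in zip(predictions, groups) if g == group]
    return sum(selected) / len(selected)


def demographic_parity_gap(predictions, groups, a, b):
    # 0.0 means both groups receive positive predictions at the same rate.
    return abs(positive_rate(predictions, groups, a)
               - positive_rate(predictions, groups, b))


# Hypothetical model outputs (1 = approved) for eight applicants in two groups.
preds = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
gap = demographic_parity_gap(preds, groups, "A", "B")
```

Here group A is approved 75% of the time and group B 25% of the time, so the gap is 0.5. A large gap is a signal to investigate, not proof of unfairness on its own, and other fairness measures can disagree with this one.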

Example

A 2021 study published in Nature Medicine (“Fairness in machine learning for healthcare”) highlights how biased training data can lead to unequal healthcare outcomes, emphasizing the need for diverse datasets and transparent algorithms.

Recent Research

Reference: Bommasani, R., et al. (2021). “On the Opportunities and Risks of Foundation Models.” arXiv:2108.07258.
- Foundation models (e.g., GPT-3) are trained on massive datasets and can adapt to a wide variety of tasks, but they raise new ethical concerns about misuse, bias, and environmental impact.

Challenges and Limitations

Data Quality: Poor or unrepresentative data leads to inaccurate models.
Generalization: Models can fit their training data well yet perform poorly on new data, especially when it differs from the training distribution.
Interpretability: Complex models are difficult to explain.
Resource Intensity: Training large models requires significant computational power and energy.

Future Directions

Explainable AI: Making models more transparent and understandable.
Federated Learning: Training models across decentralized devices to enhance privacy.
Continual Learning: Enabling models to adapt to new data over time without forgetting previous knowledge.
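
To illustrate the federated learning idea, the sketch below has each client improve a shared parameter on its own data and send back only the updated parameter, which a server then averages, weighted by how much data each client holds. This follows the general shape of federated averaging. The one-parameter model, client data, learning rate, and round counts are all assumptions for illustration.

```python
def local_update(w, data, lr=0.1, steps=5):
    # Runs on the client: a few gradient steps on the squared error of a
    # constant predictor w. The raw data never leaves this function.
    for _ in range(steps):
        grad = sum(2 * (w - x) for x in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global, client_datasets):
    # Runs on the server: collect each client's updated parameter and
    # average them, weighted by the number of examples on each client.
    total = sum(len(d) for d in client_datasets)
    updates = [(local_update(w_global, d), len(d)) for d in client_datasets]
    return sum(w * n for w, n in updates) / total


# Hypothetical private datasets held by three devices.
clients = [[1.0, 2.0, 3.0], [10.0, 12.0], [5.0]]

w = 0.0
for _ in range(50):
    w = federated_round(w, clients)
```

In this simple setting `w` settles near 5.5, the mean of all six values, even though the server never sees any individual value. Real deployments add further protections, since model updates can still leak information about the data.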

[Diagram: Types of Machine Learning]

Key Takeaways

Machine learning is transforming industries by enabling computers to learn from data.
Historical advances have led to today’s powerful models, but ethical and practical challenges remain.
Ongoing research focuses on making ML fairer, more transparent, and more adaptable.

References

Bommasani, R., et al. (2021). “On the Opportunities and Risks of Foundation Models.” arXiv:2108.07258.
Nature Medicine (2021). “Fairness in machine learning for healthcare.”