Machine Learning: Study Notes

General Science, July 28, 2025

1. Introduction

Machine Learning (ML) is a subset of artificial intelligence (AI) that enables computers to learn from data and improve their performance over time without being explicitly programmed. ML algorithms identify patterns in data and use these patterns to make predictions or decisions.

2. Historical Context

1950: Alan Turing introduces the concept of a “learning machine” in his paper “Computing Machinery and Intelligence.”
1957: Frank Rosenblatt develops the Perceptron, an early neural network.
1986: Rumelhart, Hinton, and Williams popularize the backpropagation algorithm, making it practical to train multi-layer neural networks.
2012: Deep learning breakthrough with AlexNet winning the ImageNet competition.
2020s: ML is integral to fields like healthcare, finance, autonomous vehicles, and genomics.

3. How Machine Learning Works

Main Components

Data: The raw information fed into algorithms.
Features: Individual measurable properties or characteristics of data.
Model: Mathematical representation that maps input data to output predictions.
Training: The process of adjusting model parameters using data.
Evaluation: Testing the model’s performance on unseen data.
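
The five components above can be seen together in one small sketch. It is an illustration under stated assumptions, not a reference implementation: the toy data (points on the line y = 2x + 1), the names predict, mse, and train_model, and the learning rate and step count are all invented here, and gradient descent is only one of many possible training procedures.

```python
# Data: toy (feature, target) pairs on the line y = 2x + 1 (made up for illustration).
data = [(x, 2.0 * x + 1.0) for x in [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]]

# Hold out the last two points so evaluation uses data the model never trained on.
train, test = data[:6], data[6:]

# Model: a line with two parameters, weight w and bias b.
# Feature: the single input value x.
def predict(w, b, x):
    return w * x + b

def mse(w, b, rows):
    """Mean squared error of the model on the given rows."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in rows) / len(rows)

# Training: adjust w and b by gradient descent on the mean squared error.
def train_model(rows, lr=0.05, steps=2000):
    w, b = 0.0, 0.0
    n = len(rows)
    for _ in range(steps):
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in rows) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in rows) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

w, b = train_model(train)

# Evaluation: error on the held-out points.
test_error = mse(w, b, test)
```

If training worked, w and b should end up close to 2 and 1, and test_error should be near zero because the toy data contains no noise.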

Types of Machine Learning

Type	Description	Example
Supervised	Learns from labeled data	Email spam detection
Unsupervised	Finds patterns in unlabeled data	Customer segmentation
Semi-supervised	Mix of labeled and unlabeled data	Speech recognition
Reinforcement	Learns via rewards and penalties	Game-playing AI (e.g., AlphaGo)
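
The reinforcement row is often the hardest to picture, so here is a minimal sketch of learning from rewards and penalties. It assumes a made-up environment with three actions and fixed payoffs, and uses an epsilon-greedy strategy; the function name run_bandit and all numbers are hypothetical. Real systems such as AlphaGo are far more elaborate.

```python
import random

def run_bandit(true_rewards, steps=500, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: usually exploit the best-known action, sometimes explore."""
    rng = random.Random(seed)
    estimates = [0.0] * len(true_rewards)  # running average reward per action
    counts = [0] * len(true_rewards)       # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(true_rewards))  # explore a random action
        else:
            action = max(range(len(estimates)), key=lambda a: estimates[a])  # exploit
        reward = true_rewards[action]  # feedback from the environment (deterministic here)
        counts[action] += 1
        # Incremental mean: nudge the estimate toward the observed reward.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

# Hypothetical environment: action 0 is penalized, action 1 pays best.
# The agent is never told this; it has to find out by trying actions.
estimates, counts = run_bandit([-0.5, 0.8, 0.2])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
```

No labels are involved at any point. The only signal is the reward that follows each action, which is what separates this setting from supervised learning.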

4. Key Algorithms

Linear Regression: Predicts continuous values.
Logistic Regression: Estimates class probabilities and uses them to classify data, most often into two categories.
Decision Trees: Splits data based on feature values.
Random Forests: Ensemble of decision trees for better accuracy.
Support Vector Machines (SVM): Finds the boundary that separates classes with the widest margin.
Neural Networks: Layers of interconnected units, loosely inspired by the brain, that learn complex nonlinear relationships.
Clustering (K-Means): Groups similar data points.
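
As one concrete instance from the list above, K-Means can be sketched in a few lines. This is a simplified version under assumptions: it works on 2-D points only, starts the centroids at the first k points rather than using a smarter initialization, and runs a fixed number of iterations. The function name kmeans and the sample points are invented for illustration.

```python
def kmeans(points, k, iterations=10):
    """Plain K-Means on 2-D points; centroids start at the first k points."""
    centroids = [points[i] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its assigned points.
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, clusters

# Two visibly separate groups of made-up points, interleaved in the list.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.5, 8.5)]
centroids, clusters = kmeans(points, k=2)
```

On this data the two centroids settle in the middle of the two groups, and each cluster holds three points. With less cleanly separated data the result depends on the starting centroids, which is why practical implementations usually try several initializations.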

5. Machine Learning Workflow

Data Collection
Data Preprocessing (cleaning, normalization)
Feature Engineering
Model Selection
Training
Evaluation
Deployment
Monitoring & Maintenance
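
The steps above can be walked through end to end in a short sketch. Everything specific here is an assumption made for illustration: the records (hours studied, prior score, passed), the interaction feature, the choice of a nearest-centroid classifier, and all function names. Deployment and monitoring appear only as a closing comment.

```python
# 1. Data collection: hypothetical records (hours_studied, prior_score, passed).
raw = [
    (1.0, 40.0, 0), (2.0, 55.0, 0), (1.5, None, 0), (3.0, 50.0, 0),
    (6.0, 70.0, 1), (7.0, 85.0, 1), (8.0, 80.0, 1), (6.5, 90.0, 1),
    (2.5, 45.0, 0), (7.5, 75.0, 1),
]

# 2. Preprocessing: drop incomplete rows, split, then learn scaling from training rows only.
clean = [r for r in raw if None not in r]
train_rows, test_rows = clean[:7], clean[7:]

def fit_scaler(rows):
    """Return (min, max) for each of the two feature columns."""
    columns = list(zip(*[r[:2] for r in rows]))
    return [(min(c), max(c)) for c in columns]

def normalize(row, bounds):
    return [(v - lo) / (hi - lo) for v, (lo, hi) in zip(row[:2], bounds)]

# 3. Feature engineering: append an interaction term (an arbitrary illustrative choice).
def make_features(row, bounds):
    hours, score = normalize(row, bounds)
    return (hours, score, hours * score)

# 4-5. Model selection and training: a nearest-centroid classifier.
def fit_centroids(X, y):
    centroids = {}
    for label in sorted(set(y)):
        members = [x for x, t in zip(X, y) if t == label]
        centroids[label] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids

def predict(centroids, x):
    def squared_distance(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

bounds = fit_scaler(train_rows)
train_X = [make_features(r, bounds) for r in train_rows]
train_y = [r[2] for r in train_rows]
model = fit_centroids(train_X, train_y)

# 6. Evaluation: accuracy on rows the model never saw during training.
test_X = [make_features(r, bounds) for r in test_rows]
test_y = [r[2] for r in test_rows]
accuracy = sum(predict(model, x) == t for x, t in zip(test_X, test_y)) / len(test_y)

# 7-8. Deployment and monitoring are outside this sketch; in practice `model` and
# `bounds` would be saved together and accuracy re-checked as new data arrives.
```

One design choice worth noting: the scaling bounds come from the training rows alone and are then reused on the test rows. Computing them on all rows would leak information about the test set into training.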

6. Visual Representation

[Figure: ML workflow diagram]

7. Surprising Facts

ML Models Can Detect Diseases Before Symptoms Appear: Studies report that ML can pick up subtle patterns in medical images or genetic data and flag diseases such as cancer or Alzheimer’s before clinical symptoms manifest.
Adversarial Examples: Slight, often imperceptible changes to input data can fool even the most advanced ML models, raising security concerns.
Zero-Shot Learning: Some models can classify data into categories they were never explicitly trained on, by leveraging semantic relationships.
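
The adversarial-example point can be made concrete with a linear classifier. The sketch below follows the idea behind the fast gradient sign method: shift every input feature by a small amount in the direction that hurts the prediction most. The weights, bias, input, and epsilon are all invented for illustration, and real attacks target much larger models.

```python
def score(weights, bias, x):
    """Linear decision score; a positive score means class 1."""
    return sum(w * v for w, v in zip(weights, x)) + bias

def classify(weights, bias, x):
    return 1 if score(weights, bias, x) > 0 else 0

def sign(v):
    return (v > 0) - (v < 0)

def perturb(weights, x, epsilon):
    """Move each feature by at most epsilon in the direction that lowers the score.
    For a linear model the gradient of the score with respect to x is just `weights`."""
    return [v - epsilon * sign(w) for w, v in zip(weights, x)]

# Hypothetical trained classifier and input (values invented for illustration).
weights = [0.5, -0.4, 0.3, -0.6, 0.2, 0.7, -0.3, 0.5]
bias = -0.1
x = [0.5] * 8

x_adv = perturb(weights, x, epsilon=0.12)
original_label = classify(weights, bias, x)         # score is about 0.35
adversarial_label = classify(weights, bias, x_adv)  # score drops by 0.12 * sum(|w|) = 0.42
largest_change = max(abs(a - b) for a, b in zip(x, x_adv))
```

No single feature moves by more than 0.12, yet the label flips because the small shifts add up across all features. The total effect grows with the number of features, which helps explain why inputs with thousands of pixels can be fooled by changes too small to see.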

8. Common Misconceptions and Myths Debunked

Myth: “ML Models Understand Data Like Humans”

Reality: ML models do not “understand” data contextually. They identify statistical patterns, not meaning. For example, a model trained to recognize cats in images does not know what a cat is; it just learns pixel patterns associated with the label “cat.”

Other Misconceptions

ML is Always Accurate: ML models can be biased or make errors, especially with poor-quality data.
ML Replaces Humans: ML augments human decision-making but often requires human oversight for critical decisions.
Bigger Models Are Always Better: Larger models can overfit or require impractical amounts of data and computation.

9. Applications

Healthcare: Disease prediction, drug discovery, medical imaging.
Finance: Fraud detection, algorithmic trading, credit scoring.
Autonomous Vehicles: Object detection, path planning.
Natural Language Processing: Translation, sentiment analysis, chatbots.
Genomics: Pattern recognition in DNA sequences, CRISPR gene-editing guidance.

10. Recent Research Example

A 2022 study published in Nature reported that ML models analyzing DNA sequences can predict the outcomes of CRISPR gene editing, including off-target effects, with high accuracy, which could improve the safety of gene-editing therapies (Nature, 2022).

11. Challenges and Limitations

Bias and Fairness: Models can inherit biases present in training data.
Explainability: Complex models (e.g., deep neural networks) are often “black boxes.”
Data Requirements: Many models need large, high-quality datasets to perform well.
Security: Vulnerable to adversarial attacks.

12. Future Trends

Federated Learning: Training models across decentralized devices while preserving privacy.
Explainable AI (XAI): Making the decisions of ML models more interpretable.
Integration with Biotechnology: ML is accelerating discoveries in genomics and gene editing.
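
The federated learning idea can be sketched as federated averaging: each device trains on its own data, and a server combines only the resulting parameters. This is a minimal illustration under assumptions. The three client datasets, the linear model, and the names local_update and federated_average are hypothetical, and real deployments add secure aggregation, client sampling, and communication handling that are omitted here.

```python
def local_update(w, b, rows, lr=0.05, steps=20):
    """One client's training pass on its private rows; only (w, b) leave the device."""
    n = len(rows)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in rows) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in rows) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b

def federated_average(updates, sizes):
    """Server step: average client parameters, weighted by local dataset size."""
    total = sum(sizes)
    w = sum(u[0] * s for u, s in zip(updates, sizes)) / total
    b = sum(u[1] * s for u, s in zip(updates, sizes)) / total
    return w, b

# Hypothetical private datasets, one per device; every point lies on y = 2x + 1.
clients = [
    [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    [(0.5, 2.0), (1.5, 4.0)],
    [(2.5, 6.0), (3.0, 7.0), (1.0, 3.0), (0.0, 1.0)],
]

w, b = 0.0, 0.0  # global model held by the server
for _ in range(200):  # communication rounds
    updates = [local_update(w, b, rows) for rows in clients]
    w, b = federated_average(updates, [len(rows) for rows in clients])
```

The server never sees a single data point, only parameters. Here the global model still recovers roughly w = 2 and b = 1, the same line a centrally trained model would find on the pooled data.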

13. Glossary

Overfitting: Model fits training data too closely, performing poorly on new data.
Underfitting: Model is too simple, missing important patterns.
Feature Engineering: Creating new input features to improve model performance.
Hyperparameter Tuning: Adjusting algorithm settings for optimal results.
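
Three of these terms (overfitting, underfitting, and hyperparameter tuning) can be seen in one small experiment with a k-nearest-neighbors classifier, where k is the hyperparameter. The data is invented: the true rule is "label 1 when x is above 5", and two training labels are deliberately wrong to simulate noise. Function names and candidate values of k are assumptions for illustration.

```python
# Training data with two mislabeled points (x = 2.0 and x = 7.0) acting as noise.
train = [
    (0.5, 0), (1.0, 0), (1.5, 0), (2.0, 1), (2.5, 0), (3.0, 0), (3.5, 0), (4.0, 0),
    (4.5, 0), (5.5, 1), (6.0, 1), (6.5, 1), (7.0, 0), (7.5, 1), (8.0, 1), (8.5, 1),
    (9.0, 1),
]
# Clean validation data that follows the true rule.
validation = [(1.9, 0), (2.2, 0), (4.2, 0), (6.9, 1), (7.2, 1), (8.8, 1), (0.8, 0), (5.8, 1)]

def knn_predict(x, k):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0

def accuracy(rows, k):
    return sum(knn_predict(x, k) == y for x, y in rows) / len(rows)

# Hyperparameter tuning: try several values of k and keep the best on validation data.
candidates = [1, 3, 17]
results = {k: (accuracy(train, k), accuracy(validation, k)) for k in candidates}
best_k = max(candidates, key=lambda k: results[k][1])
```

With k = 1 the model memorizes the training set, noise included, so training accuracy is perfect while validation accuracy is poor: overfitting. With k = 17 every prediction is a vote over the whole training set, so the model ignores its input: underfitting. The middle value smooths over the noisy labels and does best on the validation data, which is why it is the one selected.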

14. References

Nature. (2022). “Machine learning enables CRISPR–Cas9 off-target prediction at high accuracy.”
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.
Russell, S., & Norvig, P. (2020). Artificial Intelligence: A Modern Approach (4th ed.). Pearson.

15. Diagram: Types of ML

[Figure: types of machine learning]
