⚠️ Research Content Note

This research analyzes adult content metadata to study algorithmic bias. No explicit imagery is displayed. All analysis follows strict ethical guidelines and focuses on fairness metrics, not content itself. The goal is to expose and reduce discrimination against marginalized communities in content moderation systems.

  • 535,236 videos analyzed
  • 0.8% Black women representation
  • 92.6% best model accuracy (BERT)
  • 64% bias reduction achieved

Why This Research Matters

Content moderation algorithms systematically discriminate against LGBT creators, sex workers, and BIPOC communities. When your ML model decides what's "appropriate," it's encoding cultural biases at industrial scale — and the people getting hurt are always the same: queer folks, Black and Brown creators, anyone outside the cishet white norm.

This thesis demonstrates that this discrimination is measurable, quantifiable, and, most importantly, fixable. Through a novel Ensemble Fairness Optimization approach, I reduced the documented bias while preserving predictive accuracy.

Research Questions

  1. RQ1: What representational biases exist in UGC adult video platform metadata?
  2. RQ2: How do ML classifiers perform across intersectional demographic groups?
  3. RQ3: What are the causal mechanisms driving algorithmic disparities?
  4. RQ4: Can bias mitigation techniques reduce disparities while preserving accuracy?

1. The Dataset: Corpus Overview

Figure 1: Corpus Overview Dashboard — Comprehensive view of the 535,236-video dataset showing demographic distributions, temporal patterns, and metadata characteristics.

Key Observations

  • Severe underrepresentation: Black women comprise only 0.81% of the corpus despite being a distinct intersectional category
  • 49 metadata columns including protected attributes (race, gender, sexuality, age) extracted via NLP tagging
  • Temporal span: Videos from 2007-2024 enable longitudinal bias analysis
  • Language distribution: Multilingual content with intersection-specific patterns

2. Exploratory Data Analysis

Deep dive into the dataset reveals systematic patterns of representation disparity across intersectional groups.

Language × Intersection Heatmap — Non-English content correlates strongly with specific demographic groups, suggesting potential language-based algorithmic penalties.
Top Categories Distribution — Category tags show clear skew toward stereotypical classifications for marginalized groups.
Marginal Disparity Index (MDI) — Quantifying representational gaps across single and intersectional identities. Higher MDI = greater underrepresentation.
View Distribution by Group — Visibility disparities: some intersectional groups receive systematically fewer views, suggesting algorithmic suppression.
Category Cardinality — Tag diversity varies significantly by group, with marginalized identities receiving fewer, more stereotypical labels.
Temporal Seasonality — Upload patterns show consistent trends across months, validating dataset stability for longitudinal analysis.
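
The MDI itself is defined in the dissertation; purely to illustrate the idea, a simple representation-gap measure can compare each group's observed share of the corpus against a uniform reference share. The column name intersectional_group is taken from the evaluation sample further below; everything else here is a hedged sketch, not the thesis metric.

import pandas as pd

# Illustration only: the thesis MDI may be defined differently.
# Negative gaps indicate underrepresentation relative to a uniform reference.
def representation_gap(df: pd.DataFrame, group_col: str = "intersectional_group") -> pd.DataFrame:
    observed = df[group_col].value_counts(normalize=True)
    expected = 1.0 / observed.size
    return pd.DataFrame({
        "observed_share": observed,
        "gap_vs_uniform": observed - expected,
    }).sort_values("gap_vs_uniform")

# On this corpus, such a measure would flag Black women (about 0.8% of videos)
# as far below any plausible reference share.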

3. Model Performance: The Bias Problem

High overall accuracy masks severe performance disparities across demographic groups. This is the core finding: aggregate metrics lie.

Performance by Intersectional Group

| Model | Group | Accuracy | F1-Score | F1 Gap vs Reference |
|---|---|---|---|---|
| Random Forest | Asian Women | 79.5% | 0.119 | -79.6% |
| Random Forest | Black Women | 83.3% | 0.494 | -15.3% |
| Random Forest | White Women | 70.7% | 0.583 | (reference) |
| BERT | Overall | 92.6% | 0.891 | (reference) |
| BERT | Black Women | 95.6% | 0.904 | +1.5% |
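
Per-group breakdowns like this can be reproduced with Fairlearn's MetricFrame (Fairlearn is part of the stack listed under Technical Implementation). The names y_test, y_pred, and df_test below are placeholders for the test split, not code from the thesis pipeline.

from fairlearn.metrics import MetricFrame
from sklearn.metrics import accuracy_score, f1_score

# Slice standard metrics by intersectional group instead of one aggregate score.
mf = MetricFrame(
    metrics={"accuracy": accuracy_score, "f1": f1_score},
    y_true=y_test,
    y_pred=y_pred,
    sensitive_features=df_test["intersectional_group"],
)

print(mf.overall)        # the aggregate numbers that "lie"
print(mf.by_group)       # per-group accuracy and F1
print(mf.difference())   # largest best-to-worst gap per metric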
Figure 2: Fairness Curves — Performance metrics across threshold values reveal how different groups are affected by classification decisions. The gap between curves represents algorithmic discrimination.
ROC Curves — Area Under Curve (AUC) comparison across models shows strong overall discrimination, but group-specific analysis tells a different story.
Precision-Recall Curves — Critical for imbalanced datasets. Performance on minority groups (the ones that matter most) is significantly worse.
Figure 3: Confusion Matrix — Error patterns reveal systematic misclassification of content from marginalized creators.
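
The per-group story behind the precision-recall caption can be checked directly with scikit-learn by computing average precision (the area under the PR curve) group by group. y_test and df_test are the placeholder names used above; y_scores is assumed to hold predicted probabilities for the positive class.

from sklearn.metrics import average_precision_score

# Average precision per group: a robust summary for rare positive classes,
# where aggregate accuracy is most misleading.
for group in df_test["intersectional_group"].unique():
    mask = (df_test["intersectional_group"] == group).to_numpy()
    ap = average_precision_score(y_test[mask], y_scores[mask])
    print(f"{group}: average precision = {ap:.3f}")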

4. Bias Mitigation: Finding Solutions

I tested three categories of fairness interventions across the ML pipeline. The key insight: there's no free lunch, but smart trade-offs exist.

1. Pre-Processing

Technique: Reweighing

Adjusts sample weights to balance group representation before training.

  • ✅ Simple to implement
  • ✅ Model-agnostic
  • ⚠️ Limited bias reduction
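
Fairlearn does not ship a reweighing transformer, so the following is only a sketch of the standard Kamiran-and-Calders-style idea this card describes, with weights proportional to how far each group-label combination deviates from independence; the thesis implementation may differ.

import pandas as pd

def reweighing_weights(groups: pd.Series, labels: pd.Series) -> pd.Series:
    """Sketch: w(g, y) = P(g) * P(y) / P(g, y).

    Combinations rarer than independence would predict get up-weighted,
    balancing group representation before the model is trained.
    """
    p_group = groups.value_counts(normalize=True)
    p_label = labels.value_counts(normalize=True)
    p_joint = pd.crosstab(groups, labels, normalize=True)
    weights = [
        p_group[g] * p_label[y] / p_joint.loc[g, y]
        for g, y in zip(groups, labels)
    ]
    return pd.Series(weights, index=groups.index)

# Usage sketch: clf.fit(X_train, y_train, sample_weight=reweighing_weights(groups_train, y_train))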

2. In-Processing

Technique: Exponentiated Gradient + Demographic Parity

Constrains optimization to satisfy fairness during training.

  • ✅ Best fairness results
  • ✅ Principled approach
  • ⚠️ Computational cost
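
With Fairlearn's reductions API this maps almost directly onto code; the estimator choice and the X_train / y_train / groups_train names below are placeholders, not the exact thesis configuration.

from fairlearn.reductions import ExponentiatedGradient, DemographicParity
from sklearn.ensemble import RandomForestClassifier

# Wrap the baseline estimator in the Exponentiated Gradient reduction so the
# demographic-parity constraint is enforced during training itself.
mitigator = ExponentiatedGradient(
    estimator=RandomForestClassifier(n_estimators=200, random_state=42),
    constraints=DemographicParity(),
)
mitigator.fit(X_train, y_train, sensitive_features=groups_train)
y_pred_fair = mitigator.predict(X_test)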

3. Post-Processing

Technique: Calibrated Equalized Odds

Adjusts predictions after training to equalize error rates.

  • ✅ No retraining needed
  • ✅ Works with any model
  • ⚠️ May reduce overall accuracy
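
Fairlearn has no Calibrated Equalized Odds implementation (that method comes from other toolkits such as AIF360), so purely as a related illustration, its ThresholdOptimizer can post-process a fitted model under an equalized-odds constraint; baseline_model and the data names are placeholders.

from fairlearn.postprocessing import ThresholdOptimizer

# Related technique, not the exact Calibrated Equalized Odds method described above:
# learn group-specific decision thresholds on top of an already-trained classifier
# so that error rates are (approximately) equalized, without any retraining.
postprocessor = ThresholdOptimizer(
    estimator=baseline_model,            # any fitted scikit-learn-style classifier
    constraints="equalized_odds",
    prefit=True,
    predict_method="predict_proba",
)
postprocessor.fit(X_train, y_train, sensitive_features=groups_train)
y_pred_post = postprocessor.predict(X_test, sensitive_features=groups_test)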
In-Processing Margins — Exponentiated Gradient achieves the best fairness-accuracy balance, reducing the gap to -4.6%.
Post-Processing Margins — Calibrated Equalized Odds provides quick wins without retraining, but with a lower ceiling.
Figure 4: Mitigation Effectiveness Comparison — Head-to-head comparison of all three mitigation strategies across fairness metrics. In-processing wins.

5. The Accuracy-Fairness Trade-off

The Pareto frontier reveals which models achieve optimal trade-offs between predictive accuracy and demographic fairness. This is the key decision tool for practitioners.

Figure 5: Accuracy-Fairness Pareto Frontier — Each point represents a model configuration. Points on the frontier represent optimal trade-offs — you can't improve fairness without sacrificing accuracy, and vice versa. The goal is to move from the bottom-left (unfair, inaccurate) to the top-right (fair, accurate).

Reading the Frontier

  • Baseline RF: 80.5% accuracy, -12.6% gap — Starting point
  • Reweighed RF: 86.2% accuracy, EOD: 0.053 — Best accuracy
  • EG + DP: 80.3% accuracy, -4.6% gap — Best fairness (64% improvement!)
  • BERT: 92.6% accuracy — Highest overall, moderate fairness
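
The frontier itself is straightforward to recompute from any table of (accuracy, fairness-gap) results. The numbers below are illustrative placeholders rather than the exact thesis results, since the fairness metric differs between the configurations listed above.

# Illustrative (accuracy, |fairness gap|) values, not the exact thesis results.
candidates = [
    ("baseline_rf", 0.805, 0.126),
    ("reweighed_rf", 0.862, 0.090),
    ("eg_dp", 0.803, 0.046),
    ("bert", 0.926, 0.100),
]

def pareto_front(points):
    """Keep configurations not dominated by any other point
    (dominated = some other point has accuracy >= and gap <=, strictly better in one)."""
    front = []
    for name, acc, gap in points:
        dominated = any(
            a >= acc and g <= gap and (a > acc or g < gap)
            for _, a, g in points
        )
        if not dominated:
            front.append((name, acc, gap))
    return front

print(pareto_front(candidates))   # the configurations a practitioner should choose among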

6. Research Landscape & Positioning

Figure 6: Research Positioning — This work fills a critical gap at the intersection of algorithmic fairness, content moderation, and intersectional analysis. Most prior work ignores adult content or treats demographics as single-axis.

7. The 30-Step Reproducible Pipeline

This isn't just a paper — it's a complete, reproducible research framework. Every step is automated, documented, and version-controlled.

Steps 1-6: Data & Bias Discovery

01: Corpus statistics & validation
02: Exploratory data analysis
03: PMI & stereotypical co-occurrence
04: Harm terminology extraction
05: Label preprocessing
06: Temporal trend analysis

Steps 7-13: Modeling & Fairness

07: Baseline model training
08: Comprehensive evaluation
09: BERT fine-tuning
10: Pre-processing mitigation
11: In-processing mitigation
12: Post-processing mitigation
13: Mitigation effectiveness

Steps 15-23: Advanced Analysis

15: Causal inference setup
16-19: Temporal & gap analysis
20-21: Network strength metrics
22: Category harm analysis
23: Effect size calculation

Steps 24-30: Synthesis & Outputs

24: Ground truth validation
25: Pareto frontier generation
26: Interactive dashboards
27-28: Table generation
29-30: Final report assembly

Technical Implementation

Core ML & Fairness

Python 3.12+, scikit-learn, Fairlearn, PyTorch, Transformers (BERT)

Data & Analysis

Pandas, NumPy, Statsmodels, NetworkX, SciPy

Visualization

Matplotlib, Seaborn, Plotly, Panel (Dashboards)

Infrastructure

Poetry, pytest, pre-commit, GitHub Actions

Sample: Fairness Evaluation

from fairlearn.metrics import (
    demographic_parity_difference,
    equalized_odds_difference
)

# Calculate fairness metrics by group
dpd = demographic_parity_difference(
    y_true, y_pred,
    sensitive_features=df['intersectional_group']
)

eod = equalized_odds_difference(
    y_true, y_pred,
    sensitive_features=df['intersectional_group']
)

print(f"Demographic Parity Diff: {dpd:.4f}")
print(f"Equalized Odds Diff: {eod:.4f}")

Key Takeaways

📊 Bias is Measurable

Intersectional fairness metrics reveal discrimination invisible to aggregate accuracy scores.

🔧 Bias is Fixable

In-processing mitigation reduced the accuracy gap from -12.6% to -4.6% — a 64% improvement.

⚖️ Trade-offs Exist

The Pareto frontier helps practitioners choose the right balance for their context.

🔬 Reproducibility Matters

30-step automated pipeline ensures every finding can be verified and extended.

Explore the Research

The complete codebase, data processing pipeline, and analysis notebooks are available on GitHub. The full dissertation paper will be published upon completion.