report-generator

UCLA LPPI Factsheet Generator

This repository contains tools for generating Latino Climate and Health Dashboard Factsheets. The scripts are used to create data visualizations, maps, and comprehensive factsheets on environmental health disparities affecting Latino communities in California.

The data used in these factsheets was curated by the Data and Research team at the UCLA Latino Policy and Politics Institute (LPPI) and is available as a published Google Sheet (CSV): https://docs.google.com/spreadsheets/d/e/2PACX-1vTyGBCxXjMIztPF1IL5JrP0nss-H4GwJwyecXDRy7Hv5oyC3s54ytPaNotzoqTMKzkPCxFqgEItfSLz/pub?gid=1869860862&single=true&output=csv
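
Because the sheet is published as CSV, it can be read with nothing beyond the Python standard library. The sketch below is illustrative and is not the project's own loader: `parse_rows` and `fetch_rows` are hypothetical helper names, and no particular column names are assumed.

```python
import csv
import io
from urllib.request import urlopen


def parse_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def fetch_rows(url, timeout=30):
    """Download a published CSV and parse it (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        # utf-8-sig drops a leading byte-order mark if one is present
        return parse_rows(response.read().decode("utf-8-sig"))
```

Calling `fetch_rows` with the URL above should return one dict per spreadsheet row; `parse_rows` can be exercised offline with any CSV string.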

What This Tool Does

The UCLA LPPI Factsheet Generator creates comprehensive factsheets that combine population charts, heat exposure maps, and air pollution exposure maps.

Workflow

  1. Prepare the input data sources
  2. Generate population figures
  3. Generate maps
  4. Generate HTML factsheets
  5. Generate PDFs from HTML factsheet screenshots

Quick Start

Prerequisites

Required Software

Optional

For PDF Generation

System Requirements

Installation

Standard Installation

  1. Clone the repository:
    git clone https://github.com/ucla-lppi/report-generator.git
    cd report-generator
    
  2. Optional: Create and activate a virtual environment using Miniconda:
    conda create --name report-generator python=3.8
    conda activate report-generator
    


  3. Install Python dependencies:
    pip install -r requirements.txt
    
  4. Install system dependencies (optional, for PDF generation):

    Ubuntu/Debian:

    sudo apt-get update
    sudo apt-get install wkhtmltopdf firefox-geckodriver
    

    macOS:

    brew install wkhtmltopdf geckodriver
    

    Windows:

    • Download and install wkhtmltopdf from the official website
    • Download geckodriver and add it to your PATH
    • Restart terminal after installation
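
On any platform, a quick way to confirm that the PDF tools are reachable is to look them up on PATH. This is a minimal sketch using only the standard library; `missing_tools` is a hypothetical helper and not part of this repository.

```python
import shutil


def missing_tools(tools=("wkhtmltopdf", "geckodriver")):
    """Return the names of tools that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("PDF tools found.")
```

An empty result means both executables were located; anything listed still needs to be installed or added to PATH.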

Verification

Test your installation:

python -c "from report_generator import quick_demo; quick_demo(offline_mode=True)"

Usage Guide

Command Line Interface

Generate factsheets using the original interface:

# Generate all factsheet types for all counties
python main.py build

# Generate specific factsheet type
python main.py build -t extremeheat
python main.py build -t airpollution

# Serve factsheets locally for preview
python main.py serve

New Unified Interface

Use the simplified ReportGenerator class:

from report_generator import ReportGenerator, quick_demo

# Quick demo (offline mode)
results = quick_demo(offline_mode=True)

# Full workflow
generator = ReportGenerator()
generator.load_data()

# Generate step by step
generator.generate_population_charts(['Los Angeles'])
generator.generate_heat_maps(['Los Angeles']) 
generator.generate_air_pollution_maps(['Los Angeles'])
generator.generate_html_factsheets('all', ['Los Angeles'])
generator.generate_pdfs('all', ['Los Angeles'])

# Or generate everything at once
results = generator.generate_full_report(
    counties=['Los Angeles', 'San Diego'],
    include_pdfs=True
)

For Developers and Researchers

Use the unified ReportGenerator class for programmatic access:

from report_generator import ReportGenerator

# Initialize the generator
generator = ReportGenerator()

# Load data
generator.load_data(offline_mode=False)  # Set to True for demo mode

# Generate complete factsheets for specific counties
results = generator.generate_full_report(
    counties=['Los Angeles', 'San Diego'],
    include_pdfs=True,
    offline_mode=False
)

print(f"Generated {len(results['files_generated'])} files")
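
When connectivity is unreliable, one option is to try an online load first and fall back to offline mode. The sketch below assumes that network failures surface as `OSError` subclasses, which is not confirmed for this tool; `load_with_fallback` is a hypothetical helper that accepts any callable taking `offline_mode`, such as `generator.load_data`.

```python
def load_with_fallback(load_data):
    """Try an online load first; fall back to offline mode on network errors.

    `load_data` is any callable accepting `offline_mode`.
    Returns the mode that succeeded: "online" or "offline".
    """
    try:
        load_data(offline_mode=False)
        return "online"
    except OSError:
        # Assumption: network failures are raised as OSError subclasses
        # (urllib's URLError and ConnectionError both are).
        load_data(offline_mode=True)
        return "offline"
```

For example, `mode = load_with_fallback(generator.load_data)` would record which mode was used so it can be noted alongside the generated files.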

Jupyter Notebooks

Two notebooks are provided for different skill levels:

  1. notebooks/01_Quick_Start_Guide.ipynb: For beginners and non-technical users
  2. notebooks/02_Advanced_Usage.ipynb: For researchers and power users

Launch Jupyter:

jupyter notebook

Then navigate to the notebooks/ folder and open the appropriate guide.

Project Structure

report-generator/
├── 📄 report_generator.py          # Main unified interface (NEW)
├── 📁 notebooks/                   # Jupyter notebooks for easy use (NEW)
│   ├── 01_Quick_Start_Guide.ipynb
│   └── 02_Advanced_Usage.ipynb
├── 📄 main.py                      # Original command-line interface
├── 📄 flask_app.py                 # Main factsheet generation
├── 📄 data_utils.py                # Data loading and processing utilities
├── 📄 map_utils.py                 # Geographic map generation
├── 📄 pdf_utils.py                 # PDF export functionality
├── 📄 generate_donuts.py           # Population chart generation
├── 📁 inputs/                      # Input data files
│   ├── 📁 geojson/                # Geographic boundary files
│   └── 📄 *.csv                   # Statistical data files
├── 📁 output/                      # Generated factsheets and files
├── 📁 templates/                   # HTML templates for factsheets
├── 📁 static/                      # CSS, fonts, and assets
├── 📁 tests/                       # Test files
│   ├── test_pdf_utils.py
│   ├── test_heat_maps.py
│   └── test_air_pollution_maps.py
└── 📄 requirements.txt             # Python dependencies

Workflow Overview

The factsheet generation follows this process:

  1. Data Loading:
    • Population demographics from Google Sheets
    • Geographic boundaries from GeoJSON files
    • Environmental data from CSV files
  2. Chart Generation:
    • Population donut charts showing ethnic composition
    • Statistical summaries for each county
  3. Map Generation:
    • Heat exposure maps showing geographic disparities
    • Air pollution exposure maps
    • Neighborhood-level analysis comparing Latino and Non-Latino White areas
  4. Factsheet Assembly:
    • HTML factsheets using Flask/Jinja2 templates
    • Multi-page layouts with embedded charts and maps
    • Statistical narratives generated from data
  5. Export:
    • Static HTML files for web deployment
    • PDF exports for printing and sharing
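
To make the chart and narrative steps concrete, here is a minimal sketch of the arithmetic behind a composition chart and a one-sentence narrative. Both function names are hypothetical, the sentence wording is invented for illustration, and neither reflects the project's actual implementation.

```python
def composition_shares(counts):
    """Convert raw group counts into percentage shares rounded to 0.1."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return {group: round(100 * n / total, 1) for group, n in counts.items()}


def describe_share(county, group, share):
    """Build a one-sentence narrative from a computed share."""
    return f"In {county} County, {group} residents make up {share:.1f}% of the population."
```

The shares feed the donut chart slices, and the same numbers can be dropped into template sentences so that text and figures stay consistent.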

🧪 Testing

Run the test suite to verify functionality:

# Test PDF generation
python -m unittest tests/test_pdf_utils.py

# Test heat map generation
python -m unittest tests/test_heat_maps.py

# Test air pollution map generation
python -m unittest tests/test_air_pollution_maps.py

# Run all tests
python -m unittest discover -s tests -p "test_*.py"

Configuration

Environment Variables

Set these for production use:

# For GitHub Pages deployment
export GITHUB_PAGES=true

# Custom data URLs
export POPULATION_DATA_URL="your-google-sheets-url"
export HEAT_DATA_URL="your-heat-data-url"
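
On the Python side, variables like these are typically read with `os.environ`. The sketch below shows one way to do that; how the tool itself interprets these variables is an assumption here, and `read_settings` is a hypothetical helper.

```python
import os


def read_settings(env=None):
    """Collect deployment settings from environment variables."""
    env = os.environ if env is None else env
    flag = env.get("GITHUB_PAGES", "").strip().lower()
    return {
        # Treat "true", "1", or "yes" (any case) as enabled
        "github_pages": flag in {"true", "1", "yes"},
        # None means the variable is not set
        "population_data_url": env.get("POPULATION_DATA_URL"),
        "heat_data_url": env.get("HEAT_DATA_URL"),
    }
```

Passing a plain dict instead of the real environment makes the function easy to test.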

Custom Configuration

custom_config = {
    'output_dir': 'custom_output',
    'population_csv_url': 'your-custom-url',
    'geojson_path': 'path/to/your/boundaries.geojson'
}

generator = ReportGenerator(config=custom_config)
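
A common pattern for this kind of configuration is to merge user overrides onto a set of defaults and reject misspelled keys early. The following is a sketch of that pattern only: `DEFAULT_CONFIG` holds placeholder values, `merge_config` is a hypothetical helper, and the tool's real defaults and validation may differ.

```python
# Placeholder defaults for illustration only; the tool's real defaults may differ.
DEFAULT_CONFIG = {
    "output_dir": "output",
    "population_csv_url": None,
    "geojson_path": None,
}


def merge_config(overrides=None, defaults=DEFAULT_CONFIG):
    """Return a new dict of defaults updated with overrides.

    Raises KeyError if an override key is not a known setting.
    """
    overrides = overrides or {}
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"Unknown config keys: {sorted(unknown)}")
    return {**defaults, **overrides}
```

Because the merge builds a new dict, the defaults are never mutated between runs.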

Jupyter Notebooks (Experimental)

Requirements

Jupyter Notebooks

  1. Install dependencies:
    pip install -r requirements.txt
    
  2. Launch Jupyter and open the Quick Start Guide:
    jupyter notebook notebooks/01_Quick_Start_Guide.ipynb
    
  3. Follow the step-by-step instructions in the notebook to generate your first factsheets.

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature-name
  3. Make your changes and test thoroughly
  4. Commit your changes: git commit -am 'Add feature'
  5. Push to the branch: git push origin feature-name
  6. Submit a pull request

Development Guidelines

Additional Documentation

Troubleshooting

Common Issues

Import Error: Make sure all dependencies are installed:

pip install -r requirements.txt

PDF Generation Fails: Install wkhtmltopdf and geckodriver, and ensure both are on your PATH

Map Generation Errors: Verify input GeoJSON and CSV files exist and are properly formatted

Memory Issues: Process counties in smaller batches, especially for large datasets

Network Errors: Use offline_mode=True for testing without internet connectivity
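
For the memory advice above, counties can be split into fixed-size groups and processed one group at a time. This is a minimal sketch; `batches` is a hypothetical helper, and the batch size of 5 in the comment is an arbitrary example value.

```python
def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Example use with the generator shown in the Usage Guide:
# for group in batches(counties, 5):
#     generator.generate_full_report(counties=group, include_pdfs=True)
```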

Getting Help

  1. Check the Issues page for known problems
  2. Review the Jupyter notebooks for step-by-step guidance
  3. Run the diagnostic tools in the Advanced Usage notebook
  4. Contact the development team for persistent issues

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

Citation

If you use this tool in research or policy work, please cite:

UCLA Latino Policy and Politics Institute. (2024). Environmental Health Factsheet Generator. 
GitHub. https://github.com/ucla-lppi/report-generator

For technical support or collaboration inquiries, please contact the UCLA LPPI research team.