Project description


This hobby project builds a data extraction pipeline that uses the free Electricity Maps API to retrieve current carbon emissions and energy consumption data. The data is pulled every hour, transformed, and stored in Amazon Web Services (AWS). Data processing and transformation are done in Python, while Power BI is used for data visualization.


Key components


Data Source: Electricity Maps API
The free Electricity Maps API provides current carbon emissions and energy consumption data per zone.
The API is updated hourly, ensuring the latest information is always available.
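For reference, a single request against the latest carbon-intensity endpoint looks roughly like the sketch below; the zone code DE and the key value are placeholders, while the endpoint and auth-token header are the same ones used by the Lambda function further down.

    import requests

    API_KEY = 'your-electricitymaps-api-key'  # placeholder: free key from the Electricity Maps portal

    response = requests.get(
        'https://api.electricitymap.org/v3/carbon-intensity/latest',
        params={'zone': 'DE'},            # example zone code
        headers={'auth-token': API_KEY},  # same header the Lambda function uses
    )
    print(response.json())  # fields such as zone, carbonIntensity, datetime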

Data Extraction and Transformation: Python
The ETL (Extract, Transform, Load) process in Python is responsible for retrieving and transforming the data.
The raw data is collected initially, then necessary transformations are performed (e.g., data cleaning, normalization, formatting).
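As an illustration only, a typical cleaning and normalization step on one carbon-intensity payload could look like the sketch below; the column names follow the API response, and the exact rules depend on what the reports need.

    import pandas as pd

    # Example payload shaped like one carbon-intensity response (values are illustrative)
    payload = {'zone': 'DE', 'carbonIntensity': 301, 'datetime': '2024-05-01T13:00:00.000Z'}

    df = pd.DataFrame(payload, index=[0])
    df['datetime'] = pd.to_datetime(df['datetime'])    # parse the ISO timestamp
    df = df[['zone', 'datetime', 'carbonIntensity']]   # keep only the columns the reports need
    df = df.dropna(subset=['carbonIntensity'])         # drop zones with missing values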

Data Storage: AWS
The transformed data is stored in AWS, in an Amazon S3 bucket.
AWS offers a reliable and scalable solution for secure data storage and access.

Data Visualization: Power BI
Power BI is used to create visualizations from the data, enabling the monitoring of carbon emissions and energy consumption trends.
The visualizations are interactive and customizable, allowing users to easily review and analyze the data.

Operation


Data Retrieval
A Python script, deployed as an AWS Lambda function, retrieves the current data from the Electricity Maps API every hour.
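The hourly trigger itself is configured outside the script. One way to do it (an assumption on my part, since the scheduling setup is not shown here) is an Amazon EventBridge rule that invokes the Lambda function on a rate(1 hour) schedule; the function name and ARN below are placeholders.

    import boto3

    events = boto3.client('events')
    lambda_client = boto3.client('lambda')

    # Create (or update) an hourly schedule rule
    rule = events.put_rule(Name='electricity-etl-hourly', ScheduleExpression='rate(1 hour)')

    # Allow EventBridge to invoke the function, then register it as the rule's target
    lambda_client.add_permission(
        FunctionName='electricity-etl',                 # placeholder function name
        StatementId='eventbridge-hourly-invoke',
        Action='lambda:InvokeFunction',
        Principal='events.amazonaws.com',
        SourceArn=rule['RuleArn'],
    )
    events.put_targets(
        Rule='electricity-etl-hourly',
        Targets=[{'Id': 'electricity-etl',
                  'Arn': 'arn:aws:lambda:eu-west-1:123456789012:function:electricity-etl'}],  # placeholder ARN
    )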

Data Processing and Transformation
The data is initially collected in raw form.
The script cleans and normalizes the data to ensure it is in the correct format for storage and further use.

Data Storage
The transformed data is written to an Amazon S3 bucket as CSV files; S3 provides reliable and scalable storage.
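With the folder layout used by the Lambda function below, each hourly run produces CSV objects with keys like these (timestamps shown for an example run):

    s3://chris-electric-power/transformed_data/live_carbon/carbon_data_2024-05-01-13.csv
    s3://chris-electric-power/transformed_data/consumption_breakdown/consumption_breakdown_data_2024-05-01-13.csv
    s3://chris-electric-power/transformed_data/production_breakdown/production_breakdown_data_2024-05-01-13.csv
    s3://chris-electric-power/transformed_data/general_breakdown_data/general_breakdown_data_2024-05-01-13.csv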

Visualization
Power BI regularly queries the data from AWS and updates the visualizations.
Users can interactively explore the charts and reports generated from the current and historical data.

Technical Implementation


Programming Language: Python
API: Electricity Maps API
Cloud Service Provider: AWS (Amazon Web Services)
Data Visualization: Power BI
ETL Tool: Custom Python script running as an AWS Lambda function
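The script depends on a few third-party libraries (visible in its imports). boto3 ships with the AWS Lambda Python runtime, while the others have to be packaged with the function or supplied as a layer; a minimal requirements file might look like this:

    pandas
    requests
    python-dateutil
    boto3   # only needed when running the script outside Lambda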

Closing Thoughts


This project is a great example of how modern technologies and services can be combined to create an environmentally conscious and data-driven solution. Regular monitoring of current carbon emissions and energy consumption data can help raise awareness and develop more sustainable energy consumption habits.

Lambda function



    import json
    import os
    import pandas as pd
    import requests
    import boto3
    from io import StringIO
    from datetime import datetime as dt
    from dateutil.relativedelta import relativedelta
    
    def extract_live_carbon_intensity(api_key: str, header: dict, zone_data: dict) -> pd.DataFrame:
        # Fetch the latest carbon intensity for every zone and collect the rows in one DataFrame.
        live_carbon_endpoint = 'https://api.electricitymap.org/v3/carbon-intensity/latest?'
        emission_df = pd.DataFrame()
        for key, value in zone_data.items():
            query = 'zone=' + key
            url = live_carbon_endpoint + query
            response = requests.get(url, headers=header).json()
            # An error response contains only a single key, so keep valid payloads only.
            if len(response) != 1:
                emission_df = pd.concat([emission_df, pd.DataFrame(response, index=[0])])
        
        return emission_df
    
    def extract_breakdown_data(header: dict, zone_data: dict) -> list:
        # Fetch the latest power breakdown for every zone and split it into
        # consumption, production, and general summary DataFrames.
        breakdown_endpoint = 'https://api.electricitymap.org/v3/power-breakdown/latest?'
        consumption_breakdown_df = pd.DataFrame()
        production_breakdown_df = pd.DataFrame()
        breakdown_general_df = pd.DataFrame()
    
        for key, value in zone_data.items():
            query = 'zone=' + key
            url = breakdown_endpoint + query
            response = requests.get(url, headers=header).json()
            
            if len(response) != 1:
                temp_df = pd.DataFrame(response['powerConsumptionBreakdown'], index=[0])
                temp_df['zone'] = response['zone']
                temp_df['date'] = response['datetime']
                consumption_breakdown_df = pd.concat([consumption_breakdown_df, temp_df], ignore_index=True)
                
                temp_df = pd.DataFrame(response['powerProductionBreakdown'], index=[0])
                temp_df['zone'] = response['zone']
                temp_df['date'] = response['datetime']
                production_breakdown_df = pd.concat([production_breakdown_df, temp_df], ignore_index=True)
        
                breakdown_general = {
                    'zone': response['zone'],
                    'datetime': response['datetime'],
                    'fossilFreePercentage': response['fossilFreePercentage'],
                    'renewablePercentage': response['renewablePercentage'],
                    'powerConsumptionTotal': response['powerConsumptionTotal'],
                    'powerProductionTotal': response['powerProductionTotal'],
                    'powerImportTotal': response['powerImportTotal'],
                    'powerExportTotal': response['powerExportTotal'],
                    'isEstimated': response['isEstimated'],
                    'estimationMethod': response['estimationMethod']
                }
                temp_df = pd.DataFrame(breakdown_general, index=[0])
                breakdown_general_df = pd.concat([breakdown_general_df, temp_df], ignore_index=True)
        
        return [consumption_breakdown_df, production_breakdown_df, breakdown_general_df]
    
    def save_to_processed(dataFrame, bucket, target_path, filename):
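        # Serialize the DataFrame to CSV in memory and upload it to S3,
        # stamping the file name with the current date and hour.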
        s3 = boto3.client('s3')
        date = dt.now().strftime("%Y-%m-%d-%H")
        filename = filename + f'_{date}.csv'
        buffer = StringIO()
        dataFrame.to_csv(buffer, index=False)
        df_content = buffer.getvalue()
        
        s3.put_object(
            Bucket=bucket,
            Key=target_path + filename,
            Body=df_content
            )
        
    
    def load_zone_data(api_key, header):
        # Retrieve the list of available zones from the Electricity Maps API.
        zone_endpoint = 'https://api.electricitymap.org/v3/zones'
        response = requests.get(zone_endpoint, headers=header)
        return response.json()
    
    def lambda_handler(event, context):
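        # Entry point: load the zone list, then extract each dataset and store it in S3 for the current hour.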
        api_key = os.getenv('ELECTRICITYMAPS_API')
        header = {'auth-token': api_key}
        live_carbon_processed_folder = 'transformed_data/live_carbon/'
        consumption_breakdown_processed_folder = 'transformed_data/consumption_breakdown/'
        production_breakdown_processed_folder = 'transformed_data/production_breakdown/'
        general_processed_folder = 'transformed_data/general_breakdown_data/'
        bucket = 'chris-electric-power'
        
        zone_data = load_zone_data(api_key, header)
        extracted_carbon_data = extract_live_carbon_intensity(api_key, header, zone_data)
        save_to_processed(extracted_carbon_data, bucket, live_carbon_processed_folder, 'carbon_data')
        
        breakdown_data = extract_breakdown_data(header, zone_data)
        save_to_processed(breakdown_data[0], bucket, consumption_breakdown_processed_folder, 'consumption_breakdown_data')
        save_to_processed(breakdown_data[1], bucket, production_breakdown_processed_folder, 'production_breakdown_data')
        save_to_processed(breakdown_data[2], bucket, general_processed_folder, 'general_breakdown_data')
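        # Return a small status payload so manual test invocations have readable output.
        return {'statusCode': 200, 'body': json.dumps('Electricity Maps data written to S3')}

To try the handler outside AWS (a quick local smoke test, assuming the ELECTRICITYMAPS_API environment variable is set and your AWS credentials can write to the bucket), it can be called directly with an empty event:

    if __name__ == '__main__':
        print(lambda_handler({}, None))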