This hobby project aims to create a data extraction system that uses the free Electricity Maps API to retrieve current carbon emissions and energy consumption data. The system fetches fresh data every hour, transforms it, and stores the result in the Amazon Web Services (AWS) cloud. Data processing and transformation are done in Python, while Power BI is used for data visualization.
Data Source: Electricity Maps API
The free Electricity Maps API provides current carbon emissions and energy consumption data.
The API updates hourly, ensuring the latest information is always available.
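As a minimal sketch of how a single zone is queried (the endpoint and the `auth-token` header match the script at the end of this document; `YOUR_API_KEY` is a placeholder):

```python
import requests

# Endpoint used by the project for the latest carbon-intensity reading.
ENDPOINT = 'https://api.electricitymap.org/v3/carbon-intensity/latest'

def build_request(zone: str, api_key: str) -> tuple:
    """Build the URL and auth header for one zone (e.g. 'DE' for Germany)."""
    url = f'{ENDPOINT}?zone={zone}'
    headers = {'auth-token': api_key}
    return url, headers

url, headers = build_request('DE', 'YOUR_API_KEY')
# response = requests.get(url, headers=headers).json()  # requires a valid key
```

Note that the token travels in a request header, not in the query string.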
Data Extraction and Transformation: Python
The ETL (Extract, Transform, Load) process in Python is responsible for retrieving and transforming the data.
The raw data is collected initially, then necessary transformations are performed (e.g., data cleaning,
normalization, formatting).
Data Storage: AWS
Transformed data is stored in the AWS infrastructure.
AWS offers a reliable and scalable solution for secure data storage and access.
Data Visualization: Power BI
Power BI is used to create visualizations from the data, enabling the monitoring of carbon emissions and energy
consumption trends.
The visualizations are interactive and customizable, allowing users to easily review and analyze the data.
Data Retrieval
A Python script retrieves the current data from the Electricity API every hour.
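The hourly trigger itself is not part of the script; one common option (an assumption here, not shown in the source) is an Amazon EventBridge rule that invokes the Lambda function every hour. A sketch with placeholder names:

```shell
# Create a rule that fires once an hour (rule name is a placeholder).
aws events put-rule \
    --name hourly-electricity-pull \
    --schedule-expression "rate(1 hour)"

# Point the rule at the Lambda function (ARN fields are placeholders).
aws events put-targets \
    --rule hourly-electricity-pull \
    --targets "Id"="1","Arn"="arn:aws:lambda:REGION:ACCOUNT_ID:function:FUNCTION_NAME"
```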
Data Processing and Transformation
The data is initially collected in raw form.
The script cleans and normalizes the data to ensure it is in the correct format for storage and further
use.
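A small sketch of the kind of cleaning and normalization involved (the `zone`, `datetime`, and `carbonIntensity` field names come from the API responses used below; the sample values are illustrative):

```python
import pandas as pd

def clean_emissions(df: pd.DataFrame) -> pd.DataFrame:
    """Normalize a raw emissions frame: typed timestamps, numeric values, no duplicates."""
    df = df.copy()
    df['datetime'] = pd.to_datetime(df['datetime'], utc=True)  # ISO strings -> tz-aware timestamps
    df['carbonIntensity'] = pd.to_numeric(df['carbonIntensity'], errors='coerce')  # non-numeric -> NaN
    df = df.dropna(subset=['carbonIntensity'])                 # drop unusable rows
    return df.drop_duplicates(subset=['zone', 'datetime'])     # one reading per zone and hour

raw = pd.DataFrame({
    'zone': ['DE', 'DE', 'FR'],
    'datetime': ['2024-01-01T00:00:00Z', '2024-01-01T00:00:00Z', '2024-01-01T00:00:00Z'],
    'carbonIntensity': ['321', '321', 'n/a'],
})
clean = clean_emissions(raw)  # duplicate DE row and unparseable FR row are removed
```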
Data Storage
The transformed data is stored in AWS. AWS services (e.g., S3, RDS) ensure reliable and scalable data
storage.
Visualization
Power BI regularly queries the data from AWS and updates the visualizations.
Users can interactively explore the charts and reports generated from the current and historical data.
Programming Language: Python
API: Electricity Maps API
Cloud Service Provider: AWS (Amazon Web Services)
Data Visualization: Power BI
ETL Tool: Custom Python script
This project is a great example of how modern technologies and services can be combined to create an environmentally conscious and data-driven solution. Regular monitoring of current carbon emissions and energy consumption data can help raise awareness and develop more sustainable energy consumption habits.
import json
import os
from datetime import datetime as dt
from io import StringIO

import boto3
import pandas as pd
import requests
def extract_live_carbon_intensity(api_key: str, header: dict, zone_data: dict) -> pd.DataFrame:
    """Fetch the latest carbon-intensity reading for every known zone."""
    live_carbon_endpoint = 'https://api.electricitymap.org/v3/carbon-intensity/latest?'
    emission_df = pd.DataFrame()
    for key in zone_data:
        url = live_carbon_endpoint + 'zone=' + key
        # The auth token must be sent as a request header, not as query parameters.
        response = requests.get(url, headers=header).json()
        # A single-key payload is an error message (e.g. {'error': ...}); skip it.
        if len(response) != 1:
            emission_df = pd.concat([emission_df, pd.DataFrame(response, index=[0])])
    return emission_df
def extract_breakdown_data(header, zone_data):
    """Fetch the latest power-breakdown data and split it into three frames."""
    breakdown_endpoint = 'https://api.electricitymap.org/v3/power-breakdown/latest?'
    consumption_breakdown_df = pd.DataFrame()
    production_breakdown_df = pd.DataFrame()
    breakdown_general_df = pd.DataFrame()
    for key in zone_data:
        url = breakdown_endpoint + 'zone=' + key
        response = requests.get(url, headers=header).json()
        # A single-key payload is an error message; skip it.
        if len(response) != 1:
            temp_df = pd.DataFrame(response['powerConsumptionBreakdown'], index=[0])
            temp_df['zone'] = response['zone']
            temp_df['date'] = response['datetime']
            consumption_breakdown_df = pd.concat([consumption_breakdown_df, temp_df], ignore_index=True)

            temp_df = pd.DataFrame(response['powerProductionBreakdown'], index=[0])
            temp_df['zone'] = response['zone']
            temp_df['date'] = response['datetime']
            production_breakdown_df = pd.concat([production_breakdown_df, temp_df], ignore_index=True)

            breakdown_general = {
                'zone': response['zone'],
                'datetime': response['datetime'],
                'fossilFreePercentage': response['fossilFreePercentage'],
                'renewablePercentage': response['renewablePercentage'],
                'powerConsumptionTotal': response['powerConsumptionTotal'],
                'powerProductionTotal': response['powerProductionTotal'],
                'powerImportTotal': response['powerImportTotal'],
                'powerExportTotal': response['powerExportTotal'],
                'isEstimated': response['isEstimated'],
                'estimationMethod': response['estimationMethod'],
            }
            temp_df = pd.DataFrame(breakdown_general, index=[0])
            breakdown_general_df = pd.concat([breakdown_general_df, temp_df], ignore_index=True)
    return [consumption_breakdown_df, production_breakdown_df, breakdown_general_df]
def save_to_processed(data_frame, bucket, target_path, filename):
    """Serialize a DataFrame to CSV and upload it to S3 with an hourly timestamp."""
    s3 = boto3.client('s3')
    date = dt.now().strftime('%Y-%m-%d-%H')
    filename = f'{filename}_{date}.csv'
    buffer = StringIO()
    data_frame.to_csv(buffer, index=False)
    s3.put_object(
        Bucket=bucket,
        Key=target_path + filename,
        Body=buffer.getvalue()
    )
def load_zone_data(api_key, header):
    """Return the mapping of zone codes to zone metadata."""
    zone_endpoint = 'https://api.electricitymap.org/v3/zones'
    response = requests.get(zone_endpoint, headers=header)
    return response.json()
def lambda_handler(event, context):
    api_key = os.getenv('ELECTRICITYMAPS_API')
    header = {'auth-token': api_key}

    live_carbon_processed_folder = 'transformed_data/live_carbon/'
    consumption_breakdown_processed_folder = 'transformed_data/consumption_breakdown/'
    production_breakdown_processed_folder = 'transformed_data/production_breakdown/'
    general_processed_folder = 'transformed_data/general_breakdown_data/'
    bucket = 'chris-electric-power'

    zone_data = load_zone_data(api_key, header)

    extracted_carbon_data = extract_live_carbon_intensity(api_key, header, zone_data)
    save_to_processed(extracted_carbon_data, bucket, live_carbon_processed_folder, 'carbon_data')

    breakdown_data = extract_breakdown_data(header, zone_data)
    save_to_processed(breakdown_data[0], bucket, consumption_breakdown_processed_folder, 'consumption_breakdown_data')
    save_to_processed(breakdown_data[1], bucket, production_breakdown_processed_folder, 'production_breakdown_data')
    save_to_processed(breakdown_data[2], bucket, general_processed_folder, 'general_breakdown_data')