This tutorial explains how to transform a custom number of columns of a table into rows (Normalizing) and vice versa (Denormalizing).
An additional Video tutorial is here https://openenergy-platform.org/tutorials/6/ .
Please report bugs and improvements here: https://github.com/OpenEnergyPlatform/examples/issues
__copyright__ = "Reiner Lemoine Institut"
__license__ = "GNU Affero General Public License Version 3 (AGPL-3.0)"
__url__ = "https://github.com/openego/data_processing/blob/master/LICENSE"
__author__ = "oakca, Ludee"
import pandas as pd
# open df.xlsx
with pd.ExcelFile('df.xlsx') as xls:
# save Sheet1 in df.xlsx as df
df = xls.parse('Sheet1')
# show df
df
# Normalizing the df with melt (we keep the two columns 'Site' and 'Process'), we also add column unit, sort the values
# and make sure no old indexes are used as columns.
# As a result we have the two kept columns 'Site' and 'Process' and all other columns are transposed into rows
# labeled as variable and the respective values are found in value. The newly added column unit contains the respective units
norm = df.melt(['Site', 'Process']).assign(unit='').sort_values(['Site','Process']).reset_index(drop=True)
# assign values for unit
unit = {'inst-cap': 'MW', 'cap-lo': 'MW', 'cap-up': 'MW',
'inv-cost': '€/MW', 'fix-cost': '€/MW/a', 'var-cost': '€/MWh',
'wacc': None, 'depreciation': 'a'}
# include units in table
norm['unit'] = norm['variable'].map(unit)
# show normalized df
norm
# denormalizing the norm with pivot_table, transforms the table back into the original state (apart from column order)
denorm = norm.pivot_table(values='value', index=['Site', 'Process'], columns='variable').reset_index()
# remove the variable axis name
denorm = denorm.rename_axis(None, axis=1)
# show denormalized df
denorm
# note: the only difference between denorm and df is that the column names are in alphabetic order
# In case of columns containing NaN values, pivot_table will omit these columns in the denormalized table.
# Hence, in this case another option is to use unstack():
# denormalizing the norm with unstack
denorm = norm.set_index(['Site', 'Process', 'variable'])['value'].unstack().reset_index()
# remove the variable axis name
denorm = denorm.rename_axis(None, axis=1)
# show denormalized df
denorm
If you find bugs or if you have ideas to improve the Open Energy Platform, you are
welcome to add your comments to the existing issues on GitHub.
You can also fork the project and get involved.
Please note that the platform is still under construction and therefore the design of this page is still highly volatile!