Commit be43d12f authored by Carlos A. Iglesias

Renamed mosaik to simulator

parent d39c75e2
This module allows users to generate synthetic data from a smart grid that suffers False Data Injection Attacks. For this purpose, a smart grid is simulated using Mosaik, which can be installed by following the instructions at https://mosaik.readthedocs.io/en/latest/installation.html.
Basically, it requires two steps:
1) Installing mosaik:
$ pip install mosaik
2) Installing packages and requirements for the demo
$ sudo apt-get install git python3-numpy python3-scipy python3-h5py
$ cd securegrid-demo
$ pip install -r requirements.txt
The directory securegrid-demo contains the demo. It extends mosaik-demo (git clone https://git@bitbucket.org/mosaik/mosaik-demo.git ~/Code/mosaik-demo) with the files `securegrid-demo.py`, `householdsim.py`, `attack.py` and `model_house.py`.
Mosaik only allows users to simulate scenarios with normal smart grid behavior, so this module has been implemented to simulate attacks. The simulated attack sets the power consumption values of the attacked houses to a specific percentage of their original value.
To define how many houses in the simulation suffer an attack, modify the variable `nAttacks` in the `securegrid-demo.py` file.
To define the percentage of the original value to which the power consumption of the attacked houses is set, modify the variable `attackPercentageValue` in the `attack.py` file.
To define the moment at which the attack starts, modify the variable `attackTime` in the `attack.py` file. This variable is the percentage of the simulation's progress at which the attack begins.
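As a rough illustration of how these two parameters interact, the following sketch applies the attack to a single reading. The function `attacked_power` and its argument names are hypothetical and are not part of the demo; the sketch only assumes that the attack is active once the simulation progress exceeds `attackTime` and that the reading is then scaled to `attackPercentageValue` percent.

```python
def attacked_power(original_w, progress_pct, attack_time_pct, attack_percentage):
    """Return the reported power for one house at one simulation step.

    Hypothetical helper: mirrors the README description, not the demo code.
    """
    if progress_pct > attack_time_pct:
        # Attack active: report a percentage of the original value.
        return original_w * attack_percentage / 100
    # Attack not started yet: report the original value unchanged.
    return original_w


# A 200 W reading at 0 %, 5 % and 50 % of the simulation,
# with the attack starting after 5 % and scaling values to 30 %.
readings = [attacked_power(200.0, p, attack_time_pct=5, attack_percentage=30)
            for p in (0, 5, 50)]
print(readings)
```

With `attackPercentageValue = 0`, as in the shipped `attack.py`, attacked houses would report zero consumption once the attack starts.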
Finally, the following command starts the simulation.
$ python securegrid-demo.py
Once the simulation ends, the results are stored in the `demo.hdf5` file.
During the simulation, the grid can be inspected at `http://localhost:8000`.
To inspect the HDF5 file, an HDF5 viewer should be used. On Linux, ViTables can be used; it is available at http://vitables.org/Download/.
It can be installed as follows.
$ apt install libhdf5-dev
$ python -m pip install pyqt5
$ python -m pip install vitables
The datasets used for attack detection are Series/HouseholdSim-0.House_{N}/P_out, each of which contains the power of House N. Each dataset has 44,640 values, since the simulator generates data every minute for a month (60 minutes * 24 hours * 31 days).
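The dataset path and the expected number of samples can be checked with a short script. This is a minimal sketch: the helper names (`expected_samples`, `house_dataset_path`, `read_house_power`) are hypothetical, and the reader function assumes that `h5py` is installed and that `demo.hdf5` follows the layout described above.

```python
def expected_samples(days=31, resolution_minutes=1):
    """Number of values per house for a simulation of the given length."""
    return days * 24 * 60 // resolution_minutes


def house_dataset_path(house_number):
    """Path of the P_out series of one house inside the HDF5 file."""
    return "Series/HouseholdSim-0.House_%d/P_out" % house_number


def read_house_power(hdf5_file, house_number):
    """Load the P_out series of one house as an array (requires h5py)."""
    import h5py  # imported lazily so the helpers above work without it

    with h5py.File(hdf5_file, "r") as f:
        return f[house_dataset_path(house_number)][:]


print(expected_samples())       # one month at one-minute resolution
print(house_dataset_path(3))
```

After a simulation run, `read_house_power("demo.hdf5", 3)` would return the series for House 3, whose length can be compared with `expected_samples()`.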
syntax: glob
*.egg-info/
*.pyc
*.ropeproject
*~
.DS_Store
.coverage
MANIFEST
dist/*
docs/_build/*
htmlcov/*
venv/
/demo.hdf5
/data/profiles.data
{
"cells": [],
"metadata": {},
"nbformat": 4,
"nbformat_minor": 2
}
# SecureGrid
Deep Learning based Attack Detection System for Smart Grids
## Required modules
* Keras
* There is a known bug in numpy 1.16.3, so numpy 1.16.1 should be used instead:
    pip uninstall numpy
    pip install --upgrade numpy==1.16.1
# Mosaik
Running the demo requires a number of packages:
    sudo apt-get install git python3-numpy python3-scipy python3-h5py
In addition, because of a bug involving arrow in the mosaik demo, you should install arrow 0.14:
    pip install arrow==0.14
Then run the demo:
    python securegrid-demo.py
When it finishes, the simulation data is stored in the file demo.hdf5.
You can visualize this file with any HDF5 viewer.
In our case, we are using ViTables (http://vitables.org/Download/).
To install it, follow its installation instructions.
The suggested process is:
    apt install libhdf5-dev
    pip install pyqt5
    pip install vitables
Then execute 'vitables' in a terminal.
If you wish to visualize the scenario, you can install maverig: https://bitbucket.org/mosaik/maverig/src/master/. Basically:
    pip install maverig
## Usage
Attacks are detected by analyzing the power consumption values of the houses. For that reason, the DataFrames that feed the neural network (an autoencoder) have to be created first.
For this purpose, the notebook dataframe_creation is used. This notebook generates .pkl files that contain the DataFrames with the necessary data. These DataFrames contain the following features:
| Feature | Description |
| ------------- | ------------- |
| Day | Current day of the first window value |
| Hour | Current hour of the first window value |
| Minute | Current minute of the first window value |
| Pn | Power consumption window values |
| Mean | Mean of the window values |
| Mean_i - Mean_i-1 | Difference between the mean of the window values and the mean of the previous window values |
| s | Standard deviation of the window values |
| Pn - P1 | Difference between the last and first value of the window |
| Q1 | First quartile of the window values |
| Q2 | Median of the window values |
| Q3 | Third quartile of the window values |
| IQR | Interquartile range of the window values |
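To make the statistical features in the table concrete, the sketch below computes them for a single window using only the Python standard library. It is an illustration, not the code of the dataframe_creation notebook: the function name `window_features` is hypothetical, and the choice of the sample standard deviation and of linearly interpolated ("inclusive") quartiles is an assumption, since the notebook's exact conventions are not shown here.

```python
import statistics


def window_features(window, previous_mean=None):
    """Compute the per-window features listed in the table above.

    Assumptions: `s` is the sample standard deviation, and quartiles use
    the "inclusive" (linear interpolation) method.
    """
    mean = statistics.fmean(window)
    q1, q2, q3 = statistics.quantiles(window, n=4, method="inclusive")
    return {
        "Mean": mean,
        # Undefined for the first window, which has no predecessor.
        "Mean_i - Mean_i-1": None if previous_mean is None else mean - previous_mean,
        "s": statistics.stdev(window),
        "Pn - P1": window[-1] - window[0],
        "Q1": q1,
        "Q2": q2,
        "Q3": q3,
        "IQR": q3 - q1,
    }


features = window_features([1.0, 2.0, 3.0, 4.0, 5.0], previous_mean=2.5)
print(features)
```

The Day, Hour and Minute columns would come from the timestamp of the first value in the window, and the Pn columns are the raw window values themselves.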
To run the notebook, the h5py package must be installed:
    pip install h5py
Once the DataFrames are created, they are used to feed the autoencoder. To do so, the conv1d_autoencoder.py file has to be configured.
The normal_data_path variable has to contain the path to the .pkl file with data without attacks, that is, the normal behaviour of the houses. The attack_data_path variable has to contain the path to the .pkl file with the data to be analyzed for attacks.
To train the autoencoder, the DO_TRAINING variable has to be set to True.
Finally, the following command executes the system:
    $ python conv1d_autoencoder.py
## Results
Once the system is executed, it generates the predicted_labels.csv file that contains the labels that classify every entry of the DataFrame into attack (1) or normal behaviour (0).
import mosaik_api
import random

meta = {
    'models': {
        'Attack': {
            'public': True,
            'params': ['target_attr'],
            'attrs': ['P_out_val'],
        },
    },
}

# Percentage of the original value that attacked houses will report.
attackPercentageValue = 0
# Simulation progress (in percent) after which the attack starts.
attackTime = 5


class Attack(mosaik_api.Simulator):
    def __init__(self):
        super().__init__(meta)
        self.units = {}

    def init(self, sid, step_size=15*60):
        self.sid = sid
        self.step_size = step_size
        return self.meta

    def create(self, num, model, **model_params):
        n_units = len(self.units)
        entities = []
        for i in range(n_units, n_units + num):
            eid = 'Attack-%d' % i
            self.units[eid] = model_params
            entities.append({'eid': eid, 'type': model})
        return entities

    def step(self, time, inputs):
        commands = {}
        progress = yield self.mosaik.get_progress()
        if progress > attackTime:
            for eid, attrs in inputs.items():
                # measure = 0
                for attr, vals in attrs.items():
                    if attr == 'P_out_val':
                        for src_id, val in vals.items():
                            target_id = src_id
                            values = val
                            if eid not in commands:
                                commands[eid] = {}
                            target_attr = self.units[eid]['target_attr']
                            if target_id not in commands[eid]:
                                commands[eid][target_id] = {}
                            commands[eid][target_id][target_attr] = attackPercentageValue
        # print("COMMANDS", commands)
        yield self.mosaik.set_data(commands)
        return time + self.step_size


def main():
    return mosaik_api.start_simulation(Attack(), 'example attack')


if __name__ == '__main__':
    main()
import logging
import mosaik_api
import model_house
import random

logger = logging.getLogger('householdsim')

meta = {
    'models': {
        'ResidentialLoads': {
            'public': True,
            'params': [
                'sim_start',  # The start time for the simulation:
                              # 'YYYY-MM-DD HH:mm'
                'profile_file',  # Name of file with household data
                'grid_name',  # Name of the grid to load
            ],
            'attrs': [],
        },
        'House': {
            'public': False,
            'params': [],
            'attrs': [
                'P_out',  # Active power [W]
                'num',  # House number starting at 1
                'node_id',  # ID of node the house has to be connected to
                'num_hh',  # Number of separate households within the house
                'num_res',  # Number of residents per household
            ],
        },
    },
}


def eid(hid):
    return 'House_%s' % hid


class HouseholdSim(mosaik_api.Simulator):
    def __init__(self):
        super().__init__(meta)
        self.model = None
        self.houses_by_eid = {}
        self.pos_loads = None
        self._file_cache = {}
        self._offset = 0
        self._cache = {}

    def init(self, sid, pos_loads=True):
        logger.debug('Loads will be %s numbers.' %
                     ('positive' if pos_loads else 'negative'))
        self.pos_loads = 1 if pos_loads else -1
        return self.meta

    def create(self, num, model, sim_start, profile_file, grid_name):
        if num != 1 or self.model:
            raise ValueError('Can only create one set of houses.')

        logger.info('Creating houses for %s from "%s"' %
                    (grid_name, profile_file))

        if profile_file.endswith('gz'):
            import gzip
            pf = gzip.open(profile_file, 'rt')
        else:
            pf = open(profile_file, 'rt')

        try:
            self.model = model_house.HouseModel(pf, grid_name)
            self.houses_by_eid = {
                eid(i): house for i, house in enumerate(self.model.houses)
            }
        except KeyError:
            raise ValueError('Invalid grid name "%s".' % grid_name)

        # A time offset in minutes from the simulation start to the start
        # of the profiles.
        self._offset = self.model.get_delta(sim_start)

        return [{
            'eid': 'resid_0',
            'type': 'ResidentialLoads',
            'rel': [],
            'children': [{'eid': eid(i), 'type': 'House', 'rel': []}
                         for i, _ in enumerate(self.model.houses)],
        }]

    def step(self, time, inputs):
        # "time" is in seconds. Convert to minutes and add the offset
        # if sim start > start date of the profiles.
        # Houses that received an attack command (keys look like "House_<n>").
        houses = []
        # print("INPUTS", inputs)
        for key in inputs.keys():
            houses.append(key[key.find("_"):][1:])
        # print("HOUSES", houses)
        if len(inputs) != 0:
            # Attack percentage sent by the attack simulator.
            newValue = list(list(list(inputs.values())[0].values())[0].values())[0]
        minutes = time // 60 + self._offset
        cache = {}
        data = self.model.get(minutes)
        # print("DATA", data)
        for hid, d in enumerate(data):
            d *= self.pos_loads  # Flip sign if necessary
            cache[eid(hid)] = d
        for house in houses:
            # cache[eid(house)] = random.randint(50, 200)
            # cache[eid(house)] = d*0.3
            # Scale each attacked house's own value to the given percentage.
            cache[eid(house)] = newValue * cache[eid(house)] / 100
        self._cache = cache
        return (minutes + self.model.resolution) * 60  # seconds

    def get_data(self, outputs):
        data = {}
        for eid, attrs in outputs.items():
            data[eid] = {}
            for attr in attrs:
                if attr == 'P_out':
                    val = self._cache[eid]
                else:
                    val = self.houses_by_eid[eid][attr]
                data[eid][attr] = val
        return data


def main():
    return mosaik_api.start_simulation(HouseholdSim(), 'Household simulation')
"""
"""
import json

import arrow


DATE_FORMAT = 'YYYY-MM-DD HH:mm'
"""Date format used to convert strings to dates."""


class HouseModel:
    """The HouseModel processes and prepares the load profiles and their
    associated meta data to allow easier access to it.
    """
    def __init__(self, data, lv_grid):
        # Process meta data
        assert next(data).startswith('# meta')
        meta = json.loads(next(data))
        self.start = arrow.get(meta['start_date'], DATE_FORMAT)
        """The start date of the profile data."""
        self.resolution = meta['resolution']
        """The time resolution of the data in minutes."""
        self.unit = meta['unit']
        """The unit used for the load profiles (e.g., *W*)."""
        self.num_profiles = meta['num_profiles']
        """The number of load profiles in the file."""

        # Obtain id lists
        assert next(data).startswith('# id_list')
        id_list_lines = []
        for line in data:
            if line.startswith('# attrs'):
                break
            id_list_lines.append(line)
        id_lists = json.loads(''.join(id_list_lines))
        self.node_ids = id_lists[lv_grid]
        """List of power grid node IDs for which to create houses."""

        # Enable pre-processing of the data
        self._data = self._get_line(data)

        # Obtain static attributes and create list of house info dicts
        attrs = {}
        for attr, *vals in self._data:
            if attr.startswith('# profiles'):
                break
            attrs[attr] = [int(val) for val in vals]

        #: List of house info dicts
        self.houses = [
            {
                'num': i + 1,
                'node_id': n,
                'num_hh': attrs['num_hh'][i % self.num_profiles],
                'num_res': attrs['num_residents'][i % self.num_profiles],
            } for i, n in enumerate(self.node_ids)
        ]

        # Helpers for get()
        self._last_date = None
        self._cache = None

    def get(self, minutes):
        """Get the current load for all houses for *minutes* minutes since
        :attr:`start`.
        If the model uses a 15min resolution and minutes is not a multiple of
        15, the next smaller multiple of 15 will be used. For example, if you
        pass ``minutes=23``, you'll get the value for ``15``.
        """
        # Trim "minutes" to multiples of "self.resolution"
        # Example: res=15, minutes=40 -> minutes == 30
        minutes = minutes // self.resolution * self.resolution
        target_date = self.start.replace(minutes=minutes)
        if target_date != self._last_date:
            # If target date not already reached, search data until we find it:
            for date, *values in self._data:
                date = arrow.get(date, DATE_FORMAT)
                if date == target_date:
                    # Found target date, cache results:
                    values = list(map(float, values))
                    self._cache = [values[i % self.num_profiles]
                                   for i, _ in enumerate(self.houses)]
                    self._last_date = date
                    break
            else:
                # We've reached the end of our data file if the for loop
                # normally finishes.
                raise IndexError('Target date "%s" (%s minutes from start) '
                                 'out of range.' % (target_date, minutes))
        return self._cache

    def get_delta(self, date):
        """Get the amount of minutes between *date* and :attr:`start`.
        The date needs to be a string formatted like :data:`DATE_FORMAT`.
        Raise a :exc:`ValueError` if *date* is smaller than :attr:`start`.
        """
        date = arrow.get(date, DATE_FORMAT)
        if date < self.start:
            raise ValueError('date must be >= "%s".' %
                             self.start.format(DATE_FORMAT))
        dt = date - self.start
        minutes = (dt.days * 1440) + (dt.seconds // 60)
        return minutes

    def _get_line(self, iterator):
        for line in iterator:
            yield [item.strip() for item in line.split(',')]
import random
import numpy as np
import sys

from mosaik.util import connect_randomly, connect_many_to_one
import mosaik


sim_config = {
    'CSV': {
        'python': 'mosaik_csv:CSV',
    },
    'DB': {
        'cmd': 'mosaik-hdf5 %(addr)s',
    },
    'HouseholdSim': {
        'python': 'householdsim:HouseholdSim',
        # 'cmd': 'mosaik-householdsim %(addr)s',
    },
    'PyPower': {
        'python': 'mosaik_pypower.mosaik:PyPower',
        # 'cmd': 'mosaik-pypower %(addr)s',
    },
    'WebVis': {
        'cmd': 'mosaik-web -s 0.0.0.0:8000 %(addr)s',
    },
    'Attack': {
        'python': 'attack:Attack'
    }
}

START = '2014-01-01 00:00:00'
END = 31 * 24 * 3600  # 31 days, in seconds
PV_DATA = 'data/pv_10kw.csv'
PROFILE_FILE = 'data/profiles.data.gz'
GRID_NAME = 'demo_lv_grid'
GRID_FILE = 'data/%s.json' % GRID_NAME
# Number of houses that suffer an attack.
nAttacks = 10


def main():
    random.seed(23)
    world = mosaik.World(sim_config)