Initial version of LavaProfiler user interface #233
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,13 +2,133 @@ | |
# SPDX-License-Identifier: BSD-3-Clause | ||
# See: https://spdx.org/licenses/ | ||
|
||
""" | ||
This module will contain a tool to determine power and performance of workloads | ||
for Loihi 1 or Loihi 2 based on software simulations or hardware measurements. | ||
|
||
The execution time and energy of a workload will be either measured on hardware | ||
during execution or estimated in simulation. The estimation is based on | ||
elementary hardware operations which are counted during the simulation. Each | ||
elementary operation has a defined execution time and energy cost, which is | ||
used in a performance model to calculate execution time and energy. | ||
""" | ||
import typing as ty | ||
import types | ||
import numpy as np | ||
from lava.magma.core.process.process import AbstractProcess | ||
from lava.magma.core.run_conditions import AbstractRunCondition | ||
from lava.magma.core.run_configs import RunConfig | ||
from lava.magma.runtime.runtime import Runtime | ||
from lava.magma.core.process.message_interface_enum import ActorType | ||
from lava.magma.compiler.compiler import Compiler | ||
from lava.magma.core.resources import ( | ||
AbstractComputeResource, Loihi1NeuroCore, Loihi2NeuroCore) | ||
|
||
|
||
class Profiler: | ||
"""The Profiler is a tool to determine power and performance of workloads | ||
for Loihi 1 or Loihi 2 based on software simulations or hardware | ||
measurements. | ||
|
||
The execution time and energy of a workload is either measured on hardware | ||
during execution or estimated in simulation. The estimation is based on | ||
elementary hardware operations which are counted during the simulation. | ||
Each elementary operation has a defined execution time and energy cost, | ||
which is used in a performance model to calculate execution time and energy. | ||
""" | ||
|
||
def __init__(self, start: int = 0, end: int = 0, | ||
bin_size: int = 1, buffer_size: int = 1000): | ||
self.start = start | ||
self.end = end | ||
self.bin_size = bin_size | ||
self.buffer_size = buffer_size | ||
self.used_resources: ty.List[AbstractComputeResource] = [] | ||
|
||
def profile(self, proc: AbstractProcess): | ||
We talked about two API variants: the old one, in which the profiler wraps the Proc, and probably this new version. Did you get feedback on which one is actually better? I believe our conclusion was that the old one seemed more suitable for what we want to do, so you wanted to prepare both side by side to show DR and PS the alternative. One class overwriting a method attribute of another class looks borderline invasive in terms of unexpected side effects. Python allows it, but I'd imagine this would draw the anger of the Python community upon us.
I still think that the user API would be better if the user could simply define a profiler without having to change his line
To me, that would be way less invasive. But I know too little about what's going on under the hood to say if anything like that would work.
Both versions are slide by slide in the PowerPoint ;) @phstratmann We need to do quite a few things under the hood, especially modifying the compilation process (the sketch of what we need to do is in the form of comments and mock methods in this PR). |
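To make the trade-off in this thread concrete, here is a minimal sketch of the two API variants using a stand-in class instead of `AbstractProcess`. `Process`, `WrappingProfiler`, and `PatchingProfiler` are hypothetical names for illustration only, not Lava classes; the patching variant mirrors the `types.MethodType` rebinding used in `profile(..)`.

```python
import types


class Process:
    """Stand-in for AbstractProcess (hypothetical, not the Lava class)."""

    def run(self, steps):
        return f"ran {steps} steps"


class WrappingProfiler:
    """Variant A: the profiler wraps the process and exposes its own run()."""

    def __init__(self, proc):
        self.proc = proc
        self.calls = 0

    def run(self, steps):
        self.calls += 1
        return self.proc.run(steps)


class PatchingProfiler:
    """Variant B: the profiler rebinds run() on the process instance."""

    def __init__(self):
        self.calls = 0

    def profile(self, proc):
        original_run = proc.run  # bound method of the still unpatched instance

        def profiled_run(proc_self, steps):
            self.calls += 1
            return original_run(steps)

        # Only this instance is affected; the class attribute stays untouched.
        proc.run = types.MethodType(profiled_run, proc)


# Variant A: the call site changes from proc.run(..) to profiler.run(..)
wrapped = WrappingProfiler(Process())
result_a = wrapped.run(3)

# Variant B: the call site stays proc.run(..)
patched_proc = Process()
patcher = PatchingProfiler()
patcher.profile(patched_proc)
result_b = patched_proc.run(3)
untouched = Process()
```

In the patching variant the user's `proc.run(...)` call stays unchanged, but the instance silently gains a `run` attribute that shadows the class method. The wrapping variant leaves the process untouched at the cost of a different call site.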
||
proc.run = types.MethodType(self.run, proc) | ||
|
||
def get_energy(self) -> np.ndarray: | ||
We probably need something more elaborate instead of or besides this method. Yes, we want a total time series, but in simulation we also need the ability to get time series for specific Procs or cores, or for specific contributors to the total energy. |
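One possible shape for such a query interface, as a rough sketch: per-step energy is stored per (process, contributor) pair and summed on demand. `EnergyLog`, its methods, and the contributor names are hypothetical and only illustrate the idea; nothing here is part of this PR.

```python
from collections import defaultdict


class EnergyLog:
    """Hypothetical store for per-step energy, split by process and contributor."""

    def __init__(self, num_steps):
        self.num_steps = num_steps
        # One time series per (process name, contributor) pair
        self._series = defaultdict(lambda: [0.0] * num_steps)

    def add(self, proc_name, contributor, step, energy_uj):
        self._series[(proc_name, contributor)][step] += energy_uj

    def get_energy(self, proc_name=None, contributor=None):
        """Return energy per time step in µJ, optionally filtered."""
        total = [0.0] * self.num_steps
        for (name, contrib), series in self._series.items():
            if proc_name is not None and name != proc_name:
                continue
            if contributor is not None and contrib != contributor:
                continue
            total = [t + s for t, s in zip(total, series)]
        return total


log = EnergyLog(num_steps=2)
log.add("lif", "synaptic_ops", step=0, energy_uj=1.5)
log.add("lif", "neuron_updates", step=0, energy_uj=0.5)
log.add("dense", "synaptic_ops", step=1, energy_uj=2.0)

total_energy = log.get_energy()
lif_energy = log.get_energy(proc_name="lif")
synaptic_energy = log.get_energy(contributor="synaptic_ops")
```

Calling `get_energy()` without arguments would still return the total time series, so the existing method signature could stay backward compatible.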
||
"""Returns the energy estimate per time step in µJ.""" | ||
... | ||
return 0 | ||
|
||
Should we give the user the option to receive either the whole time series or just the total time / energy? I could imagine that for particularly long runs, we run into memory or runtime problems if we store one value for each time step. Just accumulating the values may often suffice.
Sure, this is just a first example of getting results. We can offer more sophisticated options, including standard plots etc. |
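A minimal sketch of that option, assuming a hypothetical `EnergyRecorder` with a `keep_series` flag (neither exists in this PR): the running total is always accumulated, and the per-step series is stored only on request, so memory stays constant for long runs.

```python
class EnergyRecorder:
    """Hypothetical recorder: always accumulates, optionally keeps the series."""

    def __init__(self, keep_series=True):
        self.keep_series = keep_series
        self.total = 0.0
        # The series grows by one value per time step; skip it for long runs.
        self.series = [] if keep_series else None

    def record(self, energy_uj):
        self.total += energy_uj
        if self.keep_series:
            self.series.append(energy_uj)


full = EnergyRecorder(keep_series=True)
compact = EnergyRecorder(keep_series=False)
for value in [1.5, 0.5, 2.0]:
    full.record(value)
    compact.record(value)
```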
||
def get_power(self) -> np.ndarray: | ||
"""Returns the power estimate per time step in µW.""" | ||
... | ||
return 0 | ||
|
||
def get_execution_time(self) -> np.ndarray: | ||
"""Returns the execution time estimate per time step in µs.""" | ||
... | ||
return 0 | ||
|
||
def run(self, proc: AbstractProcess, condition: AbstractRunCondition = None, | ||
run_cfg: RunConfig = None): | ||
"""Runs process given RunConfig and RunCondition. | ||
|
||
Functionally, this method does the same as run(..) of AbstractProcess, | ||
but modifies the chosen ProcModels and executables to be able to use the | ||
Profiler. From the user perspective, it should not be noticeable as | ||
the API does not change. This method will be used to override the method | ||
run(..) of an instance of AbstractProcess, when the Profiler is | ||
created. | ||
|
||
Parameters | ||
---------- | ||
proc : AbstractProcess | ||
Process instance which run(..) was initially called on. | ||
condition : AbstractRunCondition | ||
RunCondition instance specifies for how long to run the process. | ||
run_cfg : RunConfig | ||
RunConfig is used by compiler to select a ProcessModel for each | ||
compiled process. | ||
""" | ||
|
||
if not proc._runtime: | ||
|
||
compiler = Compiler(loglevel=proc.loglevel) | ||
# initializer = Initializer() | ||
|
||
# 1. get proc_map | ||
# proc_map = initializer._map_proc_to_model(proc, run_cfg) | ||
proc_map = compiler._map_proc_to_model( | ||
compiler._find_processes(proc), run_cfg) | ||
|
||
# 2. modify proc_map | ||
proc_map = self._modify_proc_map(proc_map) | ||
|
||
# 3. prepare ProcModels for profiling | ||
self._prepare_proc_models(proc_map) | ||
|
||
# 4. create executable | ||
executable = compiler.compile(proc, run_cfg) | ||
|
||
# 5. append profiler sync channels | ||
self._set_profiler_sync_channel_builders(executable) | ||
|
||
# 6. create Runtime | ||
proc._runtime = Runtime(executable, | ||
ActorType.MultiProcessing, | ||
loglevel=proc.loglevel) | ||
proc._runtime.initialize() | ||
|
||
proc._runtime.start(condition) | ||
|
||
def _modify_proc_map(self, proc_map): | ||
"""Check if chosen process models have a profileable version and | ||
exchange the process models accordingly. | ||
Tell the user which Processes will not be profiled, as they lack a | ||
profileable ProcModel.""" | ||
... | ||
Should we automatically switch process models? I could imagine that people may get confused in some cases. Let's assume the user first runs a process without the profiler and the compiler chooses a process model that is not profileable. Then the user runs the same process with the profiler. The profiler will automatically switch the process model to one that can be profiled. But these process models may differ, maybe because of a bug, maybe for other reasons. The user will not expect to see different process behavior just because they activated the profiler. Instead, I would expect to receive an error message if the default process model cannot be profiled.
My approach would be that each ProcModel has a profileable counterpart ProcModel, which only adds the operation counters. If there is no such ProcModel, then the user will be informed that this Process is not considered by the Profiler. If no chosen ProcModel has a profileable version and we run in simulation only, then there will be an error stating that no Process is considered by the Profiler.
What does this mean in terms of code duplication? Does it mean there is a ProcModel class and then an almost identical ProcModel class that just adds counters? Before we go into a whole lot of implementation, as usual, we should first write down an end-to-end (mock) example of what we are trying to enable and then agree that this is the best way to go. Such decisions are best made not in the abstract, for people who have not thought deeply about the pros and cons before, but using a concrete example. Do we have such an example already? If not, I suggest you draft one and share it. We are at an important fork in the road, so we should get some wider input.
I added an example in proc/lif/models.py. We can inherit from the ProcModel and add the code for the operation counters. |
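The inheritance idea from the last reply could look roughly like the sketch below. `LifModel` and `ProfileableLifModel` are simplified, hypothetical stand-ins and not the classes in proc/lif/models.py; the point is only that the subclass reuses the parent's dynamics and adds counters on top, which limits code duplication and keeps the behavior identical.

```python
class LifModel:
    """Simplified stand-in for a LIF ProcModel (hypothetical)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.v = 0.0

    def run_spk(self, input_current):
        self.v += input_current
        spiked = self.v >= self.threshold
        if spiked:
            self.v = 0.0  # reset after a spike
        return spiked


class ProfileableLifModel(LifModel):
    """Same dynamics as LifModel, plus operation counters."""

    def __init__(self, threshold):
        super().__init__(threshold)
        self.neuron_updates = 0
        self.spikes = 0

    def run_spk(self, input_current):
        # Delegate to the parent so the dynamics cannot drift apart.
        spiked = super().run_spk(input_current)
        self.neuron_updates += 1
        self.spikes += int(spiked)
        return spiked


inputs = [0.5, 0.5, 0.25]
plain = LifModel(threshold=1.0)
profiled = ProfileableLifModel(threshold=1.0)
plain_out = [plain.run_spk(i) for i in inputs]
profiled_out = [profiled.run_spk(i) for i in inputs]
```

Because the subclass only calls `super().run_spk(..)` and then counts, switching to the profileable model should not change the observable behavior, which addresses the concern raised at the start of this thread, at least for models structured this way.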
||
return proc_map | ||
|
||
def _prepare_proc_models(self, proc_map): | ||
"""Prepare each ProcModel for profiling. | ||
Configure Monitors for ProcModels executing in simulation. | ||
Recognize if ProcModels execute on Hardware.""" | ||
# proc_map maps each Process to its selected ProcModel class | ||
for proc, proc_model in proc_map.items(): | ||
if Loihi1NeuroCore in proc_model.required_resources: | ||
self.used_resources.append(Loihi1NeuroCore) | ||
else: | ||
# 1. add operation counter Vars to the Process | ||
# 2. set up Monitors to operation counter Vars | ||
... | ||
|
||
def _set_profiler_sync_channel_builders(self, executable): | ||
"""Create and append sync_channel builders if Loihi compute node is | ||
going to execute a profileable ProcModel.""" | ||
if Loihi1NeuroCore in self.used_resources or \ | ||
Loihi2NeuroCore in self.used_resources: | ||
executable.sync_channel_builders.append(...) | ||
... |
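The module docstring describes the estimation as counting elementary operations and weighting each with a fixed time and energy cost. A minimal sketch of such a performance model follows. The operation names, the `OpCost` type, and all numbers are made up for illustration and are not measured Loihi values; summing the times also assumes that operations execute serially, which is a simplification.

```python
from typing import Dict, List, NamedTuple


class OpCost(NamedTuple):
    energy_uj: float  # energy per operation in µJ
    time_us: float    # execution time per operation in µs


# Made-up costs for illustration only, not measured Loihi values.
COSTS: Dict[str, OpCost] = {
    "synaptic_op": OpCost(energy_uj=0.5, time_us=2.0),
    "neuron_update": OpCost(energy_uj=0.25, time_us=1.0),
}


def estimate(op_counts: List[Dict[str, int]], costs: Dict[str, OpCost]):
    """Return (energy in µJ, time in µs) per time step from operation counts."""
    energy, time = [], []
    for counts in op_counts:
        energy.append(sum(n * costs[op].energy_uj for op, n in counts.items()))
        time.append(sum(n * costs[op].time_us for op, n in counts.items()))
    return energy, time


# Operation counts as they might be collected per time step in simulation
counts_per_step = [
    {"synaptic_op": 4, "neuron_update": 2},
    {"synaptic_op": 0, "neuron_update": 2},
]
energy_uj, time_us = estimate(counts_per_step, COSTS)

# µJ / µs equals W, so scale by 1e6 to report power in µW
power_uw = [1e6 * e / t if t > 0 else 0.0 for e, t in zip(energy_uj, time_us)]
```

Note the unit conversion in the last step: dividing µJ by µs yields watts, so the µW unit stated in the `get_power` docstring needs an explicit scale factor.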
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
# Copyright (C) 2022 Intel Corporation | ||
# SPDX-License-Identifier: BSD-3-Clause | ||
# See: https://spdx.org/licenses/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
# Copyright (C) 2022 Intel Corporation | ||
# SPDX-License-Identifier: BSD-3-Clause | ||
# See: https://spdx.org/licenses/ | ||
import unittest | ||
|
||
from lava.magma.core.decorator import implements, requires | ||
from lava.magma.core.model.py.model import PyLoihiProcessModel | ||
from lava.magma.core.process.process import AbstractProcess | ||
from lava.magma.core.resources import CPU | ||
from lava.magma.core.run_conditions import RunSteps | ||
from lava.magma.core.sync.domain import SyncDomain | ||
from lava.magma.core.sync.protocols.loihi_protocol import LoihiProtocol | ||
from lava.magma.core.run_configs import RunConfig | ||
|
||
|
||
# A minimal process | ||
from lava.utils.profiler import Profiler | ||
|
||
|
||
class P(AbstractProcess): | ||
... | ||
|
||
|
||
# A minimal PyProcModel implementing P | ||
@implements(proc=P, protocol=LoihiProtocol) | ||
@requires(CPU) | ||
class PyProcModel(PyLoihiProcessModel): | ||
|
||
def run_spk(self): | ||
print("Test") | ||
|
||
|
||
# A simple RunConfig that always selects the first process model found | ||
class MyRunCfg(RunConfig): | ||
def select(self, proc, proc_models): | ||
return proc_models[0] | ||
|
||
|
||
class TestLavaProfiler(unittest.TestCase): | ||
def test_init(self): | ||
"""TBD""" | ||
start = 1 | ||
end = 5 | ||
buffer_size = 1000 | ||
bin_size = 1 | ||
profiler = Profiler(start=start, end=end, buffer_size=buffer_size, | ||
bin_size=bin_size) | ||
|
||
self.assertIsInstance(profiler, Profiler) | ||
self.assertEqual(profiler.start, start) | ||
self.assertEqual(profiler.end, end) | ||
self.assertEqual(profiler.buffer_size, buffer_size) | ||
self.assertEqual(profiler.bin_size, bin_size) | ||
|
||
def test_get_energy(self): | ||
"""TBD""" | ||
|
||
proc = P() | ||
profiler = Profiler() | ||
|
||
# The process proc and connected processes should be profiled | ||
profiler.profile(proc) | ||
|
||
# No connections are made | ||
|
||
simple_sync_domain = SyncDomain("simple", LoihiProtocol(), | ||
[proc]) | ||
|
||
# The process should compile and run without error (not doing anything) | ||
proc.run(RunSteps(num_steps=3, blocking=True), | ||
MyRunCfg(custom_sync_domains=[simple_sync_domain])) | ||
proc.stop() | ||
|
||
energy = profiler.get_energy() | ||
|
||
self.assertEqual(energy, 0) | ||
|
||
|
||
if __name__ == '__main__': | ||
unittest.main() |
docstring