
What Should the Next-Generation SpiNNaker Look Like?

TU Dresden and the University of Manchester are currently working on the next-generation SpiNNaker system. It will be a radical redesign using a state-of-the-art 28 nm process, so there is considerable room for more features. Things we are looking at include dedicated co-processors (e.g. for computing exponentials and random numbers) and architectural enhancements such as the interconnect topology and the relative size of memory vs. processing.

This discussion group will give both an exciting preview of the next-generation SpiNNaker system and a unique opportunity to participate directly in the design. We are looking especially for feedback from current and potential future users on features they'd like to see, characteristics of the current SpiNNaker system that they like or find to be an onerous limitation, and field experience of working with the quirks of the current design. Where would you like to have more performance? What components of neural function are sufficiently general that developing bespoke dedicated co-processors makes sense? How much system capacity in terms of neurons and synapses per chip do you need? How fine do you need the time resolution to be? What limits on power consumption, heat, and space do you need in order to meet your design constraints for embedded systems (particularly robots)?

We are looking for concrete and practical suggestions; this group will not be about your 'inner philosopher' but rather your 'inner engineer'. Bring us your ambitious but pragmatic vision of what the 'ideal' digital neuromorphic system would look like.


Timetable

Day Time Location
Tue, 26.04.2016 22:00 - 23:00 Sala Panorama
Fri, 29.04.2016 15:00 - 16:00 Sala Panorama
Tue, 03.05.2016 14:00 - 15:00 Outside disco

What should the next-generation SpiNNaker system look like?

Protocol of first meeting on 26.04.2016 at 22:00

(thanks to Simon for taking notes)

The discussion went through numerous constraints of the current system that could be improved in the next generation. They are sorted here according to the main sub-parts of the SpiNNaker system:

Processing

(processor cores + extensions -> implementing models)

  • Floating-point calculations are required for certain models/applications (e.g. matrix multiplications/inversions, models that require ODE solvers)
  • General agreement that we'd be running more complex models in future, perhaps with multi-compartment models, dendritic branches, etc. That would shift the communication/compute balance significantly to the compute side.
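
To make the floating-point request more concrete, the following minimal sketch compares a forward-Euler integration of a leaky decay (dv/dt = -v/tau) in double-precision floating point with the same update in a 32-bit fixed-point format with 15 fractional bits. The s16.15 format, the function names and the parameter values are assumptions chosen for illustration; they are not taken from the meeting notes.

```python
FRAC_BITS = 15            # assumed s16.15 format: 15 fractional bits
SCALE = 1 << FRAC_BITS    # value of 1.0 in fixed-point representation


def to_fixed(x):
    """Quantise a float to the nearest s16.15 value, stored as a Python int."""
    return int(round(x * SCALE))


def fixed_mul(a, b):
    """Multiply two s16.15 values and truncate the product back to s16.15."""
    return (a * b) >> FRAC_BITS


def decay_float(v0, tau_ms, dt_ms, steps):
    """Forward-Euler decay of dv/dt = -v/tau in floating point."""
    v = v0
    factor = 1.0 - dt_ms / tau_ms
    for _ in range(steps):
        v *= factor
    return v


def decay_fixed(v0, tau_ms, dt_ms, steps):
    """Same update rule, but every multiply is truncated to s16.15."""
    v = to_fixed(v0)
    factor = to_fixed(1.0 - dt_ms / tau_ms)
    for _ in range(steps):
        v = fixed_mul(v, factor)
    return v / SCALE


if __name__ == "__main__":
    # Hypothetical parameters: tau = 20 ms, dt = 0.1 ms, 1000 steps.
    print("float:", decay_float(1.0, 20.0, 0.1, 1000))
    print("fixed:", decay_fixed(1.0, 20.0, 0.1, 1000))
```

Because every fixed-point multiply truncates, the fixed-point trajectory ends below the floating-point one, and the gap widens as the state variable approaches the resolution of the format. ODE solvers with small timesteps are sensitive to exactly this kind of accumulated error.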

Memory

(internal/external -> state variables, synaptic matrix and parameters)

  • Local memory could become a more critical constraint if complexity of models increases
  • External memory access did not seem to be a problem (though it may be masked by, or only show up as, communication limitations)

Communication

(internal and IO -> spike communication, interfacing to the outside world)

  • Easier connectivity for AER sensors (to avoid needing an external FPGA; could the FPGA on the SpiNNaker board be employed?), and more generally for AER-compatible devices
  • Direct Wi-Fi connectivity may also be useful for independent systems (robots, drones, etc.); it would allow a bigger, stationary SpiNNaker system to control a mobile agent
  • Data load times and data read times were felt to be a rather severe obstacle to using the current system
    • Especially important for models with explicit connectivity (e.g. pyNN.FromListConnector) and parameter sweeps
    • Improved bandwidth from/to host needed
  • Spike packet drops are encountered for some benchmarks
    • Better diagnostics would help give more informative feedback (where, when, which) to the user
  • Longer routing keys in the SpiNNaker router could be beneficial for more targeted routing. They would also be required for interfacing with new/future retinas/AER chips with larger numbers of pixels/neurons. A current ATIS interface already uses all available bits in the routing key and would benefit from longer keys. Longer payloads might also help for certain applications.
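
The routing-key pressure from large sensors can be sketched with some simple bit counting. The example below assumes a hypothetical key layout in which a sensor event encodes its x address, y address and polarity in the low bits of a 32-bit key, leaving the remaining high bits to identify the source for routing. The actual layout used by the ATIS interface is not described in these notes. The `matches` helper shows the masked comparison that a key/mask routing entry performs.

```python
KEY_BITS = 32  # routing key width assumed for the current router


def bits_for(n):
    """Number of bits needed to encode n distinct values (at least 1)."""
    return max(1, (n - 1).bit_length())


def sensor_key_budget(width, height, polarities=2, key_bits=KEY_BITS):
    """Return (bits used by the event address, bits left for routing).

    Hypothetical layout: x, y and polarity are packed into the key.
    """
    used = bits_for(width) + bits_for(height) + bits_for(polarities)
    return used, key_bits - used


def matches(packet_key, entry_key, entry_mask):
    """Masked comparison as performed by a key/mask routing table entry."""
    return (packet_key & entry_mask) == entry_key


if __name__ == "__main__":
    # Illustrative sensor resolutions, not tied to any specific device.
    for w, h in ((128, 128), (1280, 720)):
        used, left = sensor_key_budget(w, h)
        print(f"{w}x{h}: {used} address bits, {left} bits left for routing")
```

Every doubling of the pixel count takes one more bit away from the part of the key that distinguishes sources, which is why higher-resolution sensors and larger systems both push towards longer keys.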

General

  • Mobile systems (smaller size than 48-node board, one chip or a few chips in a compact package) would be quite useful, to fit e.g. on small robots
  • Scalability: SpiNNaker is scalable in principle, but how to assess the constraints that apply to large-scale models? Communication bandwidth vs. router entries seems to be a likely limiting factor
    • Do not only scale the network size, but also make models more detailed (dendritic branches, etc.)
    • Two possible approaches to investigating this: either a set of example networks, i.e. networks that users work with anyway, or dedicated synthetic tests that analyze constraints in a more targeted fashion
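
As an illustration of the synthetic-test approach, the sketch below generates a reproducible random connection list and gives a rough estimate of how long the synaptic data would take to load. The (pre, post, weight, delay) tuple layout follows the usual convention of list-based connectors and is an assumption here, as are the function names, the bytes-per-synapse figure and the bandwidth figure.

```python
import random


def synthetic_connections(n_pre, n_post, p_connect, weight, delay, seed=0):
    """Fixed-probability connection list as (pre, post, weight, delay) tuples.

    A fixed seed makes the benchmark network reproducible across runs.
    """
    rng = random.Random(seed)
    return [(i, j, weight, delay)
            for i in range(n_pre)
            for j in range(n_post)
            if rng.random() < p_connect]


def estimated_load_seconds(n_synapses, bytes_per_synapse, bandwidth_bytes_per_s):
    """Lower bound on load time: payload size divided by host bandwidth."""
    return n_synapses * bytes_per_synapse / bandwidth_bytes_per_s


if __name__ == "__main__":
    conns = synthetic_connections(1000, 1000, 0.1, weight=0.5, delay=1.0)
    # Hypothetical figures: 4 bytes per synapse, 1 MB/s host link.
    print(len(conns), "synapses,",
          estimated_load_seconds(len(conns), 4, 1_000_000), "s to load")
```

Sweeping `n_pre`, `n_post` and `p_connect` independently lets a test stress one constraint at a time (fan-out, total synapse count, load time), which is harder to do with networks taken from users.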

Protocol of session on 29.04.2016 (Felix/Christian)

Network-on-chip improvements

  • Secure packet transmission mode (avoid packet loss)
    • required for non-spike data (tbd; data that is more critical than e.g. losing a couple of spikes)
    • self-check features of links and higher NoC layers?
  • Deadlock (FIFOs filling up) seems to be a large factor in spike packet loss on the current SpiNNaker; how can it be avoided?
    • directed communication channels (the current SpiNNaker uses one event queue for both in and out, so bidirectional communication leads to deadlock)
    • increase bandwidth/fifo size
  • Extend the routing table key length for the increased system size of SpiNNaker 2
  • Memory Address Mapping

Processor

  • Double Precision floating point unit
  • DMA/Memory Improvements (e.g. Read-Modify-Request)
  • Memory partitioning (shared SRAM access for more than one core)
  • Embedding FPGA-like structure as configurable hardware accelerator

Configuration/setup

  • Reduce configuration time (priority issue)
    • increase external bandwidth
    • implement on-SpiNNaker configuration (self-mapping, etc.)
  • Online interactions/reconfiguration
    • Protocols to implement this
    • what needs to be reconfigured

General

  • Benchmark/Profiling/Debug Features
  • Better detection of routing/execution errors (real-time violations)

Protocol of session on 03.05.2016 (Andreas/Sebastian)

  • State-recording might require additional memory bandwidth and capacity
  • 0.1 ms timestep: UMAN to list the implications of that
  • TUD is evaluating HMC (Hybrid Memory Cube) as a memory solution, with a focus on power reduction at low utilization (e.g. sleep modes of the SerDes transceivers)
  • Portable systems might power down the HMC or use fewer than 4 links to save power.
  • Memory discussion ongoing with UMAN
  • Hardware accelerators:
    • e.g. exp, log, sqrt, logistic function … (TUD provides an initial list, to be discussed with UMAN; evaluate HW overhead)
    • DMA that supports memory access of arrays
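
One implication of a 0.1 ms timestep can be estimated with simple arithmetic: the number of processor cycles available per neuron per timestep shrinks in proportion to the timestep. The sketch below assumes a 200 MHz core clock and 256 neurons per core purely for illustration; neither figure is taken from these notes.

```python
def cycles_per_timestep(clock_hz, timestep_us):
    """Processor cycles available to one core within a single timestep."""
    return clock_hz * timestep_us // 1_000_000


def cycles_per_neuron(clock_hz, timestep_us, neurons_per_core):
    """Upper bound on cycles per neuron update, ignoring all other work."""
    return cycles_per_timestep(clock_hz, timestep_us) // neurons_per_core


CLOCK_HZ = 200_000_000    # assumed core clock
NEURONS_PER_CORE = 256    # assumed neuron count per core

if __name__ == "__main__":
    for dt_us in (1000, 100):  # 1 ms and 0.1 ms timesteps
        print(f"dt = {dt_us} us:",
              cycles_per_timestep(CLOCK_HZ, dt_us), "cycles per step,",
              cycles_per_neuron(CLOCK_HZ, dt_us, NEURONS_PER_CORE),
              "cycles per neuron")
```

The per-neuron figure is an upper bound, since the same budget also has to cover synaptic event processing, DMA handling and state recording. At 0.1 ms, either the neuron count per core drops or functions such as exp and log need hardware support.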

Leaders

Sebastian Höppner
Christian Mayr
Johannes Partzsch
Alexander Rast

Members

Lukas Cavigelli
Simon Davidson
Gabriel Andres Fonseca Guerra
Michael Hopkins
Sebastian Höppner
Gengting Liu
Shih-Chii Liu
Christian Mayr
Manu Nair
Johannes Partzsch
Alexander Rast
Ole Richter
Alan Stokes
Evangelos Stromatias
Bernhard Vogginger
Qi Xu
Yexin Yan
André van Schaik