OpenVizsla OV3 - FPGA design
I’ve previously talked about the hardware design of the OpenVizsla OV3 USB hardware analyzer.
This time I want to give a broad overview of how the FPGA part of the design works.
The OpenVizsla FPGA design (available in the GitHub OV repository) was written using Migen, “a Python toolbox for building complex digital hardware”. Migen lets you describe logic as the result of a Python script, and compile it to Verilog (and ultimately an FPGA bitstream). This has interesting implications, as it allows for the full power of the Python language when generating logic.
Let me bring up an FPGA-centric block diagram of the OpenVizsla design. Don’t panic - I will explain the individual blocks and provide some examples. You can even click parts in the block diagram to see some detailed description. Now if that isn’t magic… (And please excuse my limited website usability skills.)
(Did I tell you that you can click in the block diagram to see more details of the blocks?)
The main SDRAM chip
A regular MT48LC16M16A2 or compatible. It has a 16-bit data bus, and runs (by default) at 100 MHz, which gives a theoretical bandwidth limit of 200 MByte/s. The effective data rate will be less, because there are dead cycles when switching between pages and when doing refresh.
Wikipedia has an article that describes the basics.
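The 200 MByte/s figure follows directly from the bus width and the clock. Here is a small sketch of that arithmetic; the `busy_fraction` parameter is a hypothetical knob for the dead cycles mentioned above, not a measured value.

```python
# Back-of-the-envelope bandwidth check for the SDRAM figures quoted above.
BUS_WIDTH_BITS = 16        # SDRAM data bus width (from the text)
CLOCK_HZ = 100_000_000     # default SDRAM clock (from the text)


def peak_bandwidth(bus_width_bits: int, clock_hz: int) -> int:
    """Bytes per second if one bus-wide word moves on every clock cycle."""
    return (bus_width_bits // 8) * clock_hz


def effective_bandwidth(peak: int, busy_fraction: float) -> float:
    """Scale the peak by the fraction of cycles that carry data.

    busy_fraction is an illustrative assumption; the real value depends
    on page switches, refresh, and how many masters share the bus.
    """
    return peak * busy_fraction


peak = peak_bandwidth(BUS_WIDTH_BITS, CLOCK_HZ)
print(peak // 1_000_000, "MByte/s theoretical peak")
```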
SDRAM Host Controller
This is the internal SDRAM logic. Internally, a HostIf interface is used to interact with the SDRAM controller.
On the HostIf, you first issue a request using the i_* signals. This can be either a read or write (i_wr=1 is a write issue, i_wr=0 is a read issue), to a specific start address (i_addr). The controller acknowledges the issue, after seeing your i_stb=1, by asserting i_ack=1.
After that, there is the data phase. Whenever d_stb (which is controlled by the SDRAM controller) is asserted, data is transferred, unless d_term is asserted. For writes this means that the data on d_write is written to the next address, and for reads, data is available on d_read.
There is no other flow control, so the client has to take the data (or deliver the data) as fast as the controller wants. If you can’t keep the rate, you have to end the issue (d_term=1), and then start a new issue.
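To make the issue and data phases concrete, here is a behavioural sketch in plain Python (not Migen, and not the actual controller). The function name `run_issue` and the way the `d_stb` pattern is passed in are assumptions made for illustration; the model only captures the rule that a word moves on every cycle where `d_stb` is asserted and `d_term` is not.

```python
from itertools import cycle


def run_issue(mem, i_wr, i_addr, n_words, d_stb_pattern, write_data=()):
    """Model one HostIf issue against a dict standing in for the SDRAM.

    d_stb_pattern yields one boolean per clock cycle and is controlled
    by the (modelled) controller. The client asserts d_term once it has
    moved n_words words, which ends the issue.
    """
    addr = i_addr
    done = 0
    read_data = []
    for d_stb in d_stb_pattern:
        d_term = done >= n_words
        if d_term:
            break                  # client ends the issue
        if not d_stb:
            continue               # dead cycle: nothing is transferred
        if i_wr:
            mem[addr] = write_data[done]
        else:
            read_data.append(mem[addr])
        addr += 1                  # the controller auto-increments
        done += 1
    return read_data


mem = {}
stalls = [True, True, False]       # arbitrary d_stb de-assertions
payload = [0xAD00 | a for a in range(0x50, 0xB4)]
run_issue(mem, 1, 0x50, len(payload), cycle(stalls), payload)
readback = run_issue(mem, 0, 0x55, 4, cycle(stalls))
print([hex(word) for word in readback])
```

The two calls mirror the write and read-back in the example transfer that follows.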
Here is an example transfer on this bus:
- Write AD50, AD51, …, ADB3 to address 000050, 000051, …, 0000B3
- Read back from 000055, …, 000058
The d_stb de-assertions are arbitrary, meaning that they depend on how busy the system is. A reason for a d_stb de-assertion could be that we hit a column limit (and the next page has to be selected), or that another master took over. HostIf clients always need to support d_stb de-assertions at any time.
- ov_types.py defines the HostIf interface.
- SDRAMCTL interfaces to the physical SDRAM chip, and provides a single HostIf interface on the other side.
- SDRAMMux time-slices the HostIf interface to multiple masters.
- SDRAMBIST is one such master; it first writes a test pattern, and then reads it back.
- SDRAMBISTCfg provides a CSR interface to the SDRAMBIST module.
Migen BankArray
BankArray scans modules that inherit from AutoCSR and builds a list of Banks. Each Bank then contains a number of CSRs. A Bank decodes addresses on the CSR bus into write enables and selects the right register to return data from.
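The decoding a Bank performs can be pictured with a plain-Python model. This is not Migen code: the `Bank` class below, the `BANK_SHIFT` address split, and the register placement are all hypothetical and only illustrate the idea of "upper bits select the bank, lower bits select the register".

```python
BANK_SHIFT = 4  # hypothetical: low 4 address bits select a register in a bank


class Bank:
    """Toy model of a CSR bank holding a few 8-bit registers."""

    def __init__(self, index, names):
        self.index = index
        self.names = list(names)
        self.storage = {name: 0 for name in self.names}

    def access(self, adr, we=False, dat_w=0):
        """Return read data if this bank is addressed, else None."""
        if adr >> BANK_SHIFT != self.index:
            return None                      # another bank is selected
        offset = adr & ((1 << BANK_SHIFT) - 1)
        if offset >= len(self.names):
            return 0                         # unmapped register reads as 0
        name = self.names[offset]
        if we:                               # decoded write enable
            self.storage[name] = dat_w & 0xFF
        return self.storage[name]


def bus_access(banks, adr, we=False, dat_w=0):
    """Present one transaction to every bank; at most one responds."""
    for bank in banks:
        result = bank.access(adr, we, dat_w)
        if result is not None:
            return result
    return 0


# Register names are taken from the text; their addresses are made up.
banks = [Bank(0, ["LEDS_OUT", "LED_MUX_0"]), Bank(1, ["BUTTONS_STAT"])]
bus_access(banks, 0x00, we=True, dat_w=0x05)
print(bus_access(banks, 0x00), bus_access(banks, 0x10))
```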
FTDI data mux (CmdProc)
The FTDI_sync245 module connects to the FTDI chip on the one side, and to the CmdProc module on the other side, and provides bidirectional data transfer. It handles the FTDI SyncFIFO protocol in a state machine and provides the clock domain crossing from the internal clock to the FTDI 60 MHz clock domain.
CmdProc connects to the FIFOs provided by FTDI_sync245, and connects the HOST-to-FPGA FIFO to a BusDecode instance, and the FPGA-to-HOST FIFO to a BusInterleaver instance.
The BusDecode processes data bytes sent by the host, and generates CSR master transactions. Here is an example CSR master transaction. The data from the host was 55 80 03 00 D8 (with the last byte being a checksum), and this decodes to a CSR write at address 0003 with data being 00.
It also generates the CSR response packets; for each read and write transfer on the bus, you get a copy of the transaction back via the FTDI.
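As a host-side illustration, the example frame above can be decoded with a few lines of Python. The framing used here is inferred from that single example and should be treated as an assumption: 0x55 as a start byte, bit 7 of the second byte as the write flag, the remaining bits plus the third byte as the address, and the checksum as the 8-bit sum of the first four bytes. The function name `decode_csr_command` is made up for this sketch.

```python
MAGIC = 0x55  # assumed start-of-command marker (first byte of the example)


def decode_csr_command(frame: bytes):
    """Decode a 5-byte host command into (is_write, address, data)."""
    if len(frame) != 5:
        raise ValueError("expected exactly 5 bytes")
    magic, addr_hi, addr_lo, data, checksum = frame
    if magic != MAGIC:
        raise ValueError("bad start byte")
    # Assumption: checksum is the sum of the preceding bytes, modulo 256.
    if sum(frame[:4]) & 0xFF != checksum:
        raise ValueError("checksum mismatch")
    is_write = bool(addr_hi & 0x80)          # assumed write flag
    address = ((addr_hi & 0x7F) << 8) | addr_lo
    return is_write, address, data


is_write, address, data = decode_csr_command(bytes.fromhex("55800300D8"))
print(is_write, hex(address), hex(data))
```

With these assumptions the example frame checks out: 0x55 + 0x80 + 0x03 + 0x00 is 0xD8.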
In the opposite direction the BusEncode takes multiple sources - such as the CSR response, or the USB packet data, or the LFSR test data, and switches them into a single stream. It relies on each source to assert the last signal whenever a packet is complete. Switching then happens at those positions.
Here we see two pending packets (as indicated by stb being asserted in both clients), and the round-robin muxer first selects client1 (which is a CSR response packet) and then, once client1 asserts last, it switches to client2.
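The packet-boundary switching can be sketched in plain Python. This is a behavioural model rather than the Migen implementation: each source is a queue of `(byte, last)` pairs, a non-empty queue stands in for `stb`, and the helper names `packet` and `round_robin_mux` are made up for illustration.

```python
from collections import deque


def packet(payload):
    """Tag each byte with a 'last' flag that is set on the final byte."""
    return [(b, i == len(payload) - 1) for i, b in enumerate(payload)]


def round_robin_mux(sources):
    """Merge sources into one stream, switching only after 'last'."""
    out = []
    start = 0
    while any(sources):
        # Pick the next source with pending data, starting after the
        # one that was served most recently.
        for k in range(len(sources)):
            idx = (start + k) % len(sources)
            if sources[idx]:
                break
        src = sources[idx]
        while src:
            byte, last = src.popleft()
            out.append((idx, byte))
            if last:
                break              # packet complete: allow a switch
        start = (idx + 1) % len(sources)
    return out


client1 = deque(packet([0x10, 0x11]) + packet([0x12]))
client2 = deque(packet([0x20, 0x21, 0x22]))
stream = round_robin_mux([client1, client2])
print(stream)
```

Because switching only happens after `last`, bytes from different packets never interleave, which is what lets the host split the stream back into packets.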
- CMD_REC record in csr_master.py
- BusDecode and BusInterleave in bus_interleave.py
- CmdProc in cmdproc.py
- FTDI_sync245 in ftdi_bus.py
USB capture logic
This large block is in charge of:
- Configuration of the external ULPI chip to sniff mode.
- Packetizing the ULPI data, and insertion of control (RXCMD) data
The ULPI and data processing is decidedly out of scope of this blog entry.
But to understand the big picture: The output of the cstream (whacker) is a packetized byte stream again, similar to the CSR slave results. It gets muxed into the stream sent to the analysis host in the BusEncoder.
The LFSR Generator
The LFSR stream generator has a Source that outputs a packetized LFSR stream. It is configured using two CSRs:
- RANDTEST_SIZE: The number of LFSR bytes in each packet.
- RANDTEST_CFG: A single ‘go’ bit. If it is set, packets will be generated, otherwise the generation will stop.
- FTDI_randtest implements it.
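For intuition, here is a plain-Python model of a packetized LFSR source driven by those two settings. The LFSR width, taps, and seed below are illustrative assumptions and are not taken from FTDI_randtest; the model also assumes the LFSR keeps running across packet boundaries.

```python
from itertools import islice


def lfsr_bytes(state=0x01):
    """8-bit Fibonacci LFSR (hypothetical taps at bits 7, 5, 4 and 3)."""
    while True:
        yield state
        bit = ((state >> 7) ^ (state >> 5) ^ (state >> 4) ^ (state >> 3)) & 1
        state = ((state << 1) | bit) & 0xFF


def randtest_packets(size, go, count):
    """Return 'count' packets of 'size' LFSR bytes while 'go' is set.

    size models RANDTEST_SIZE and go models the RANDTEST_CFG go bit;
    count only bounds this software model.
    """
    if not go:
        return []                  # generation is stopped
    stream = lfsr_bytes()
    return [list(islice(stream, size)) for _ in range(count)]


packets = randtest_packets(size=4, go=1, count=2)
print(packets)
```

A host-side test could run the same generator and compare it byte for byte against the received stream to detect dropped or corrupted data.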
External IO FPGA Logic
BTN_status connects to the external physical button, and provides the ability to read the button status via a CSR register BUTTONS_STAT.
LED_outputs allows selecting from a number of LED sources via a per-LED LED_MUX_<n> register.
Source 0 is the LEDS_OUT register, so by default, you can display an arbitrary LED pattern by writing it to LEDS_OUT.
In the top-level module OV3, a number of LED sources are selected:
# from ovhw/top.py:
# GPIOs (leds/buttons)
self.submodules.leds = LED_outputs(plat.request('leds'),
    [
        [self.bist.busy, self.ftdi_bus.tx_ind],
        [0, self.ftdi_bus.rx_ind],
        [0]
    ], active=0)
This means that LED0 can be switched between OUT[0], the BIST busy signal, and the FTDI bus tx_ind signal.
LED 1 can be switched between OUT[1], a static 0, and the FTDI bus rx_ind signal.
active=0 causes the LED_outputs module to invert the leds (leds_raw.eq(leds if active else ~leds)) since they are active-low.
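The selection and inversion can be modelled in a few lines of plain Python. This is a sketch of the behaviour described above, not the code from leds.py; the function name `led_pins` and the handling of an out-of-range mux value are assumptions.

```python
def led_pins(leds_out, mux_select, extra_sources, active=0):
    """Compute the raw pin level for each LED.

    leds_out      -- value of the LEDS_OUT register (source 0 for every LED)
    mux_select    -- per-LED LED_MUX_<n> values
    extra_sources -- per-LED list of additional sources (source 1, 2, ...)
    active        -- 1 for active-high LEDs, 0 for active-low LEDs
    """
    pins = []
    for i, extras in enumerate(extra_sources):
        candidates = [(leds_out >> i) & 1] + list(extras)
        sel = mux_select[i]
        # Assumption: an out-of-range selection reads as 0 (LED off).
        value = candidates[sel] if sel < len(candidates) else 0
        pins.append(value if active else value ^ 1)   # invert if active-low
    return pins


bist_busy, tx_ind, rx_ind = 1, 0, 1
sources = [[bist_busy, tx_ind], [0, rx_ind], [0]]

# Default muxing: every LED shows its LEDS_OUT bit (inverted at the pin).
default_pins = led_pins(0b101, [0, 0, 0], sources)
# LED0 shows BIST busy, LED1 shows rx_ind, LED2 stays on LEDS_OUT.
status_pins = led_pins(0b000, [1, 2, 0], sources)
print(default_pins, status_pins)
```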
- LED mux sources in top.py
- BTN_status in buttons.py
- LED_outputs in leds.py
External LEDs
LEDs are connected through a current-limiting resistor from pins P57, P58, P59 on the FPGA to VCC. To light an LED, the pin must be driven to GND.
- LD1,LD2,LD3,LD4 in the Schematics
External Button
The external push-button connects P67 of the FPGA with GND. A 10K pull-up resistor provides a positive input when the button is not pressed.
- SW1 in the Schematics
ULPI chip
The ULPI and data processing is decidedly out of scope of this blog entry.
FTDI chip
An FT2232H is used for communication with the host.
The FTDI website has a document that describes the SyncFIFO mode that we’re using.
SyncFIFO mode allows transferring data at a very high speed using an 8-bit bidirectional bus clocked at 60 MHz, plus some control lines. The actual speed depends mostly on how fast the host can process the data, but is of course inherently limited by the USB 2.0 High-Speed efficiency of about 90%.
FTDI outputs, FPGA inputs:
- RXF#: Data is transferred from FTDI to FPGA when both RXF# and RD# are low. RXF# is driven high when there is no data to be read.
- TXE#: Data is transferred from FPGA to FTDI when both TXE# and WR# are low. TXE# is driven high when there is no space for data to be stored.
- CLKOUT: 60 MHz clock generated by FTDI chip. All signals are synchronous to this clock.
FPGA outputs, FTDI inputs:
- RD#: Acknowledges the current byte, and causes the FTDI to load the next byte onto the bus if RXF# is low during that cycle.
- WR#: Strobe for the current byte if TXE# is low that cycle.
- OE#: Controls bus direction. Must be asserted at least one cycle before driving RD#.
- SIWU: “Send Immediate” pin. Can be used to flush the FTDI buffer to lower latency, but we’re not doing this right now.
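The read direction of this handshake can be sketched as a cycle-by-cycle model in plain Python. It is a simplification built only from the signal descriptions above, not the state machine in ftdi_bus.py: all signals are active-low, one loop iteration is one CLKOUT cycle, and the function name `simulate_read` is made up.

```python
from collections import deque


def simulate_read(ftdi_rx_fifo, max_cycles=32):
    """Drain the FTDI receive FIFO, returning (bytes, signal trace)."""
    received = []
    trace = []                     # (RXF#, OE#, RD#) for every cycle
    oe_n, rd_n = 1, 1              # FPGA outputs start de-asserted
    for _ in range(max_cycles):
        rxf_n = 0 if ftdi_rx_fifo else 1     # low while data is pending
        trace.append((rxf_n, oe_n, rd_n))
        if rxf_n == 0 and rd_n == 0:
            received.append(ftdi_rx_fifo.popleft())   # byte transferred
        # Decide the FPGA outputs for the next cycle.
        if not ftdi_rx_fifo:
            oe_n, rd_n = 1, 1      # nothing left: release the bus
        elif oe_n == 1:
            oe_n = 0               # turn the bus around first ...
        else:
            rd_n = 0               # ... then start reading a cycle later
    return received, trace


received, trace = simulate_read(deque([0x55, 0x80, 0x03]))
print([hex(b) for b in received])
```

The first three trace entries show OE# going low one cycle ahead of RD#, which is the ordering constraint noted for OE# above.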
Walkthrough
On the left we see the FT2232H interface chip to the analysis host. It provides a fast bidirectional data link to a host PC running ovctl.py. The incoming command stream from the PC is parsed in CmdProc and turned into CSR (configuration register) master transactions. BankArray is a collection of CSR slaves. One such slave is the BTN_status component, which allows reading the state of the external hardware button; another one controls the LEDs; others allow access to the ULPI controller, invoke BIST sequences in SDRAM, or configure SDRAM-to-host streaming.
Migen automatically wires up modules to a CSR infrastructure, so it’s easy to add registers that can be accessed from the host.
Again, please click on the various elements in the block diagram to see a description and waveforms!