Returning to the fold with questions...


RodRico

My first FPGA design was in an XC-4000 back in the late 80's and my most recent experience was six years ago on a Digilent Nexys with a Spartan 6.  I got excited when I saw the Arty S7 on sale and ordered one. Unfortunately, when I looked at Divider Generator v5.1 I found it doesn't support Spartan 7. I don't understand why that would be so. Can anyone explain?

Update: I think it's just the documentation that's lacking mention of Spartan 7. The Divider Generator (v5.1) shows up and can be configured in IP Integrator with the Arty S7-25 (xc7s25csga324-1) specified as the project device, so I assume it can be compiled for the Spartan 7. Silly me for going to the IP documentation before just trying to see if I could use it! 

@RodRico,

AXI-lite isn't all that bad.  But the full AXI?  Yeah, that gets painful.  Perhaps the most painful part is dealing with backpressure.  This is something Xilinx's demos don't handle properly.  Backpressure is what happens when BVALID && !BREADY or equivalently when RVALID && !RREADY.  In either case, your design should (ultimately) lower AWREADY, WREADY, and/or ARREADY (depending on which channel has the backpressure).  Stopping up the processing along the way, especially if you have a pipelined design, can be a challenge.  Here are some other challenges:

  • AXI requires all bus outputs be registered (essentially).  Nothing is allowed to depend combinatorially on the inputs.  This can make stall handling a challenge.  Unless you want to deal with 50% throughput, learn how to work with skid buffers.
  • From the perspective of building a full AXI master, there are a lot of constraints you might have to deal with: keeping the AXI addresses from crossing 4kB boundaries within bursts is one of the more challenging ones, but you'll also need to keep your burst sizes properly limited by your application as well.
  • AXI allows for wrapped addressing as well as incrementing or fixed addressing, yet Xilinx's examples don't really work through how to properly calculate the next address in a burst from the last one.  In particular, they don't deal with AxSIZE being anything but the bus data width.  You might therefore find this discussion valuable.
  • There are a lot of AXI signals that I have yet to see a single example use: AxQOS, AxREGION, AxPROT, and AxCACHE are examples of these.  Indeed, the AXI spec doesn't do a really good job of explaining these signals.  Since most designs seem to ignore them ... I'll just leave it there.
  • While AXI-lite is an easy protocol to use, Xilinx tools treat it as a second-class citizen.  If you want throughput, you might need to go for full AXI.
  • The AXI Block RAM controller is often touted as a good fit for lots of applications.  It doesn't really get good burst-to-burst throughput.  You can do better, if need be.  Indeed, if you build your own AXI slave interface, you may find it easier to interact with your design--since you'll know what's under your control and what isn't.
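The burst-address and 4 kB points in the list above can be illustrated with a small behavioral model. This is only a sketch in Python, not RTL and not taken from any Xilinx example: the helper names `burst_addresses` and `crosses_4k` are made up for illustration, and the formulas follow the burst address equations in the AXI specification as I understand them (beat size is 2^AxSIZE bytes, burst length is AxLEN + 1).

```python
# AxBURST encodings from the AXI specification.
FIXED, INCR, WRAP = 0b00, 0b01, 0b10


def burst_addresses(axaddr, axlen, axsize, axburst):
    """Return the address of every beat in one AXI burst (hypothetical helper)."""
    nbytes = 1 << axsize            # bytes per beat; may be narrower than the bus
    beats = axlen + 1               # AxLEN encodes (number of beats - 1)
    aligned = (axaddr // nbytes) * nbytes

    if axburst == FIXED:
        return [axaddr] * beats
    if axburst == INCR:
        # Only the first beat may be unaligned; the rest are aligned to nbytes.
        return [axaddr] + [aligned + n * nbytes for n in range(1, beats)]
    if axburst == WRAP:
        if beats not in (2, 4, 8, 16) or axaddr != aligned:
            raise ValueError("WRAP needs 2/4/8/16 beats and an aligned start")
        span = nbytes * beats               # total bytes covered by the burst
        base = (axaddr // span) * span      # wrap boundary
        return [base + (axaddr - base + n * nbytes) % span for n in range(beats)]
    raise ValueError("reserved AxBURST encoding")


def crosses_4k(axaddr, axlen, axsize):
    """True if an INCR burst would touch bytes on both sides of a 4 kB boundary."""
    nbytes = 1 << axsize
    aligned = (axaddr // nbytes) * nbytes
    last_byte = aligned + (axlen + 1) * nbytes - 1
    return (axaddr >> 12) != (last_byte >> 12)


# A 4-beat, 4-byte WRAP burst starting at 0x34 wraps back to 0x30.
print([hex(a) for a in burst_addresses(0x34, 3, 2, WRAP)])
# A 4-beat, 4-byte INCR burst starting at 0x0FF8 would cross 0x1000.
print(crosses_4k(0x0FF8, 3, 2))
```

A master would typically use a check like `crosses_4k` to split a long transfer into legal bursts before issuing AWADDR or ARADDR; how the split is done depends on the application.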

If you do need to build an AXI-lite slave peripheral, I think you'll find this example fairly easy to start from and use.
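Returning to the skid buffers mentioned in the list above, here is a minimal cycle-level sketch of the idea in Python. It is a behavioral model under my own assumptions, not RTL: the class and signal names (`SkidBuffer`, `up_valid`, `down_ready`) are hypothetical, and it models the simple variant where only the upstream READY is registered while the downstream side is a pass-through mux. A bus-facing output would need a registered variant.

```python
class SkidBuffer:
    """Cycle-level model of a one-entry skid buffer (hypothetical names)."""

    def __init__(self):
        self.full = False      # registered flag: a word is parked in the buffer
        self.data = None       # the parked word

    def clock(self, up_valid, up_data, down_ready):
        """Advance one clock edge; return (accepted, list_of_words_sent)."""
        up_ready = not self.full             # depends on state only (registered)
        accepted = up_valid and up_ready

        # Downstream sees the parked word first, otherwise the incoming word.
        down_valid = self.full or up_valid
        down_data = self.data if self.full else up_data
        sent = [down_data] if (down_valid and down_ready) else []

        if accepted and not down_ready:
            self.full, self.data = True, up_data   # stall: park the word
        elif down_ready:
            self.full = False                      # parked word has drained
        return accepted, sent


def simulate(words, ready_pattern):
    """Push words through the buffer, one entry of ready_pattern per cycle."""
    buf = SkidBuffer()
    pending = list(words)
    received = []
    for down_ready in ready_pattern:
        up_valid = bool(pending)
        up_data = pending[0] if pending else None
        accepted, sent = buf.clock(up_valid, up_data, down_ready)
        if accepted:
            pending.pop(0)
        received.extend(sent)
    return received


# Downstream stalls on cycles 1 and 4; no word is lost or duplicated.
print(simulate([1, 2, 3, 4], [True, False, True, True, False, True, True]))
```

The point of the extra register is that the upstream READY never has to react combinationally to the downstream stall: the word that arrives during the first stalled cycle is parked, and with an always-ready downstream the model still moves one word per cycle.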

Dan

On 12/19/2020 at 11:49 AM, RodRico said:

Silly me for going to the IP documentation before just trying to see if I could use it! 

If reading the documentation before trying to use IP is silly then I propose that we should all be silly. What I'd call silly is reading Xilinx documentation and assuming that it's up to date or free from errors. Never assume because that really is silly.

On 12/21/2020 at 8:36 AM, D@n said:

AXI-lite isn't all that bad.  But the full AXI?  Yeah, that gets painful. [...]

Thanks for all the great info, Dan! My last hands-on experience was quite some time ago, and I now see that *everything* has changed and the learning curve is steep. I'm used to hierarchical schematic designs, not Verilog or VHDL, and am certainly not prepared for designing AXI-based systems yet! Some context...
 
I've got a number of products in development that require an embedded controller, and I *really* don't like the complexity of the typical software development workflow in embedded systems, so I'm looking at building a FORTH core (note I was a FORTH consultant back in the day and have a full package of my own design I used for customer applications). The objective of this project is a very small, very fast, low interrupt latency (single cycle) kernel on the Arty S7 board augmented with VGA (display), PS/2 (keyboard), and MicroSD (storage) PMODs. I will add an AXI master at some point, but it's not needed in the initial version.

Reviewing the 7 Series libraries guide (UG953), I see everything I need. I'm thinking of defining the kernel using those primitives and the Verilog code the guide provides. Ultimately, I'll want to be able to include the kernel in other designs. Ideally it would be a relocatable block of logic with fixed placement and routing, if possible. I'm still absorbing Hierarchical Design (UG905) and Designing IP Subsystems (UG994), but it looks like the answer is in there somewhere. Can you provide any guidance on the best path to a pre-placed and routed IP product?

On 12/21/2020 at 9:13 AM, zygot said:

If reading the documentation before trying to use IP is silly then I propose that we should all be silly. What I'd call silly is reading Xilinx documentation and assuming that it's up to date or free from errors. Never assume because that really is silly.

Of course there are errors in the immense library of documentation. They do, however, make things very hard for people like me just starting out. Errors are understandable. So is a bit of user frustration.

>> so I'm looking at building a FORTH core (note I was a FORTH consultant back in the day ...)

You might have a look at James Bowman's "J1" CPU with the supporting compiler in gforth. It's very accessible.

I used it in a fractals fun project with my own minimal compiler and float implementation (not claiming this is production-ready in the hands of someone else).
This project includes a PC-side simulator using Verilator. This might score high with regard to simplicity of the workflow (the most complex part is maybe the UART-based bootloader, which is optional).

https://projects.digilentinc.com/markus-nentwig/fpga-fractals-1920x1080x60-real-time-on-usb-power-c3d296

 

BTW if you want an FPGA to replace an embedded controller chip, you might have a look at Lattice (they require less complex supporting circuitry; for example, look through Xilinx requirements on power supply sequencing). If you plan to copy a mass-produced design, check the prices of components first. It's not unheard of that EVBs are actually cheaper than the FPGA component alone without a substantial bulk discount. And there can be nasty surprises with voltage regulators.

If interested in Lattice, please double-check that the CPU architecture builds as intended (I believe the BRAM on Lattice, e.g. iCE40, is less capable: no combinational mode, AFAIK).

 

23 hours ago, xc6lx45 said:

You might have a look at James Bowman's "J1" CPU with the supporting compiler in gforth.

I already looked at J1. It's very clever and elegant, but too minimalist; I plan to implement a full development system with a MicroSD card file system, communication stacks, editor, debugger, VGA output, PS/2 keyboard, etc.
 

23 hours ago, xc6lx45 said:

BTW if you want an FPGA to replace an embedded controller chip, you might have a look at Lattice... If you plan to copy a mass-produced design, check the prices of components first.

I already own a Digilent Arty S7, VGA PMOD, and PS/2 PMOD which I will use for development. I plan to use BRAM for storage and one DSP48E slice for multiply, add, subtract, AND, OR, and XOR. I should be able to fit into a $18.76 (qty 1) XC7S15 with 18 KB to 31.5 KB RAM (40% to 70% BRAM utilization) or a $31.78 (qty 1) XC7S25 with 31.5 KB to 63 KB RAM (16% to 31% BRAM utilization). I haven't priced the whole BOM yet, but it certainly shouldn't retail for more than Digilent's CMOD S7 ($69) plus their PS/2 PMOD ($9), VGA PMOD ($9), and MicroSD PMOD ($7) combined, or $94 (qty 1). I expect the price would be lower by incorporating it all on a single board like an Arduino MEGA.
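The utilization and price figures above can be reproduced with a few lines of arithmetic. This sketch assumes total block RAM of 360 Kb for the XC7S15 and 1,620 Kb for the XC7S25 (my reading of the Spartan-7 product table, not something stated in this thread), and the names `BRAM_KBITS` and `bram_utilization` are purely illustrative.

```python
# Assumed total block RAM per device, in kilobits (check the product table).
BRAM_KBITS = {"XC7S15": 360, "XC7S25": 1620}


def bram_utilization(device, ram_kbytes):
    """Fraction of the device's block RAM used by ram_kbytes of storage."""
    total_kbytes = BRAM_KBITS[device] / 8      # 8 bits per byte
    return ram_kbytes / total_kbytes


for device, low_kb, high_kb in [("XC7S15", 18, 31.5), ("XC7S25", 31.5, 63)]:
    low = bram_utilization(device, low_kb)
    high = bram_utilization(device, high_kb)
    print(f"{device}: {low:.0%} to {high:.0%} of BRAM")

# Price ceiling from the post: CMOD S7 + PS/2 PMOD + VGA PMOD + MicroSD PMOD.
print("module total: $", 69 + 9 + 9 + 7)
```

Under those assumed capacities the numbers come out at 40% to 70% for the XC7S15, roughly 16% to 31% for the XC7S25, and $94 for the module total, matching the figures quoted in the post.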

On 12/25/2020 at 5:56 PM, xc6lx45 said:

BTW if you want an FPGA to replace an embedded controller chip, you might have a look at Lattice

When other FPGA vendors are mentioned around here I'm always reminded of past tool licensing gotchas that I haven't quite gotten over.

I haven't used anything from the Lattice iCE portfolio, but it does seem to have a following among developers who share interests similar to those of some Digilent users. It does seem that the devices are supported with free iCEcube tools on a node-locked installation. Free to me means that you can install it and use it without fear of having the tools stop working after some period of time. Perhaps someone can confirm that impression.

These are obviously devices intended for a restricted purpose but try finding a useful recent FPGA in a QFP package. Not everyone wants to do 6-8 layer boards in order to use a BGA package. As always, the advice 'buyer beware' should be taken as a given. To anyone thinking that this is no place to be discussing Xilinx competitors, I'll just say that I have a short list of interesting ideas for things that Digilent board users find useful and that I might be able to do with an ICE type device that Xilinx doesn't have any interest in helping me out with. I understand that there's no sense in keeping older devices competitive and filling up distribution channels or scheduling FAB time except to support existing customers.

>> Free to me means that you can install it and use it without fear of having the tools stop working after some period of time. Perhaps someone can confirm that impression.

Good point, the licensing is definitely a topic to be aware of when thinking long-term. The no-cost Lattice license needs to be renewed every year.  (I just double-clicked iCEcube and was informed "your license expires in 14 days ... a new license can be generated"). I consider it still "free to use" but that's my personal view. They also have smaller devices e.g. MACH (different toolchain) but I haven't used any of those other than putting a quick hello-world design on an EVB.

You'd think it doesn't matter for corporate users, but I ended up more than once in the wrong corner of the corporate legal jungle (maybe skunkworks, but this was business as usual) without access to production licenses and no way to sort it out quickly without stepping into CAPEX territory. The lesson here: shiny hardware on my desk is useless on its own. I've seen a development board high in the four-digit price range reduced to paperweight duties, and once a box full of Ettus RF toys (several of everything, originally bought for MIMO) that ended up completely useless for what could have become a "hardware implementation" on the slide set for management.

For me personally it's a very strong reason to stick with no-cost licensed tools, because when I take the time now to learn some technology, I want to be reasonably certain that I can pull it out of the hat in a few years with minimum hassle. It's the same story with large tool suites e.g. Mentor or Cadence or ADS or Matlab or whatever - there is always a risk that corporate winds change. That's why I'm so fond e.g. of Octave or (here's a real dinosaur) Maxima. No one can take that away so it's a safe investment of my time.

But I digress - both Lattice and Xilinx tick that box for me even though it's not perfect.

Archived

This topic is now archived and is closed to further replies.