A Guide to Using DDR in the all HDL Design Flow

zygot · October 6, 2021

I've started a thread for people wanting to know how to use the DDR memory on their FPGA boards. I want this to be interactive as it's not possible to provide a single demo project that works for all boards and all versions of Vivado. To get this started, I've provided a tutorial in the file XilinxDDR_Tutorial_Part_1.txt. As the name implies, this is just the beginning of the tutorial, but if you get through it, you will have a working DDR design running on your hardware.

Not everyone ( perhaps no one? ) will be happy with having to plow through a long text file but there are reasons why I am presenting this material in this format. Perhaps I will try to pretty the content up at a later time, if the topic is popular enough.

[Update] I've posted part 2 of the Tutorial. You can follow the steps to create a more useful DDR design by reading the file XilinxDDR_Tutorial_Part_2.txt. This isn't the end.

[Update] I've posted part 3 of the Tutorial in which we look at performance and simulation.

imp_top.v imp_top.xdc

XilinxDDR_Turorial_Part_1.txt

NexysVideoDdrDemo.vhd NexysVideoDdrDemo.xdc UART_DEBUGGER2.vhd XilinxDDR_Turorial_Part_2.txt YASUTX.vhd

NexysVideoDdrTest.vhd XilinxDDR_Turorial_Part_3.txt

Edited December 18, 2021 by zygot

zygot · October 6, 2021

The Mig IP doesn't allow you to assign the system_clock as the reference_clock input unless the Input Clock period is set to 5000ps ( 200 MHz ). Why is that?

According to the Series7 Select IO manual the reference clock for IDELAY can be 190-210 MHz or 290-310 MHz. According to the Artix datasheet we should be able to use either a 200 MHz or 300 MHz IDELAY reference clock for the -1 speed grade. So why doesn't the IP allow for using a 300 MHz system clock as the reference clock for the Nexys Video? The answer is that I don't know.

Interestingly, for the Nexys Video board, if you go with the Vivado 2019.1 default 2:1 controller PHY clock period of 3225ps this translates to a PHY clock of 310.0775 MHz. The MMCM math starts to become a problem, especially if you want a high quality clock for your DDR; and you do.
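As a quick check of the numbers in this post, here is a minimal Python sketch that converts a clock period in picoseconds to MHz and tests it against the two IDELAY reference clock windows quoted above. The function names are made up for illustration, and the windows are simply the ones stated in this post, not an independent reading of the datasheet.

```python
# IDELAY reference clock windows in MHz, as quoted in the post above.
IDELAY_REFCLK_WINDOWS_MHZ = [(190.0, 210.0), (290.0, 310.0)]


def period_ps_to_mhz(period_ps):
    """Convert a clock period in picoseconds to a frequency in MHz."""
    return 1.0e6 / period_ps


def in_idelay_window(freq_mhz):
    """Return True if freq_mhz falls inside one of the quoted windows."""
    return any(lo <= freq_mhz <= hi for lo, hi in IDELAY_REFCLK_WINDOWS_MHZ)


for period_ps in (5000, 3225):
    f = period_ps_to_mhz(period_ps)
    print(f"{period_ps} ps -> {f:.4f} MHz, inside a quoted window: {in_idelay_window(f)}")
```

A 5000 ps period is exactly 200 MHz, while 3225 ps works out to a little over 310 MHz, which is the kind of awkward number the MMCM then has to deal with.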

zygot · October 6, 2021

The tutorial doesn't mention why you checked the Use Internal Vref box though it comes up as unchecked by default.

If you look at the Nexys Video schematic you will see that on IO bank 35 where the DDR3 signals are connected, the two VREF pins are used as GPIO; in this case as a DQ and DDR3 address pin. So you have to check this option. On a board like the Genesys2 the bank VREF pins are connected to the same net as the DDR3 VREF pins, so you would leave this option unchecked. The Genesys2 uses the Kintex and has the DDR3 signals connected to an HP bank with more features than HR banks so the tutorial doesn't completely cover all of the details for setting up the MiG for that board.

If you are designing a board there are a lot of ways to mess up the external memory interface. Copying a design from another vendor is not a particularly good way to cheat on doing your homework.

zygot · October 16, 2021

I've added material for Part 2 of the tutorial. Read both .txt files to follow along. Enjoy. Do post comments whether you find the tutorial useful or not, but especially if you are struggling with porting the code to your board.

Edited October 16, 2021 by zygot

zygot · October 25, 2021

I've posted a correction to the files associated with part 3 of the tutorial.

zygot · October 27, 2021

Even if you aren't too interested in using the material in this tutorial, it might still be a valuable exercise to go through. FPGA board vendors like to headline the advertisements for their products by highlighting the most optimistic performance statistics, even if they don't have anything to do with actual performance for real-world applications. It's a good idea to have some sort of realistic expectation about what kinds of applications a particular FPGA platform can support. This is especially true if your application requires external memory. Looking at the peak DDR data rate is likely to be very misleading.

As for non-ZYNQ Xilinx FPGA boards with DDR3 there is a large difference between the more capable and less capable boards. The Genesys2 and XEM7320 have two 16-bit DDR3 devices that work at peak 1800 MT/s and 1600 MT/s respectively. ~~These boards give you the flexibility of using 1 DDR3 controller with a 32-bit DQ bus or 2 DDR3 controllers, each with a 16-bit DQ bus.~~ For FPGA boards sold by Digilent the most disappointing would be the Kintex-based NetFPGA-1G-CML. It has one DDR3 device with an 8-bit DQ bus. ~~The reference manual claims a 1600 MT/s DDR3 peak data rate but in actuality, it's less.~~ There's no evidence that anyone, except me, has ever tested the external memory on this board. For a board with four 1 GbE ports, having a DDR3 that is almost useless for Ethernet applications is very disappointing. The Intel-based Cyclone V GT at least has one 64-bit DDR3 memory connected to a hard external memory controller, as well as a smaller one that can be connected to a soft controller.
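To make the difference between boards concrete, here is a rough Python sketch of how a peak DDR rate follows from the transfer rate and the DQ bus width. The helper name is made up, and the board figures are just the ones quoted in this thread; treat it as arithmetic, not as a statement about what any board actually sustains.

```python
def peak_bytes_per_second(mega_transfers_per_s, dq_bits):
    """Peak DDR rate: transfers per second times bytes moved per transfer."""
    return mega_transfers_per_s * 1_000_000 * dq_bits // 8


# (board, MT/s, total DQ width in bits), using figures quoted in this thread.
boards = [
    ("Genesys2", 1800, 32),        # two 16-bit devices side by side
    ("NetFPGA-1G-CML", 1600, 8),   # one 8-bit device
]

for name, mts, dq_bits in boards:
    rate = peak_bytes_per_second(mts, dq_bits)
    print(f"{name}: {rate / 1e6:.0f} million bytes/s peak")
```

The bus width matters as much as the headline MT/s figure: at similar transfer rates, an 8-bit DQ bus moves a quarter of what a 32-bit one does.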

We can't stop vendors from claiming misleading or even incorrect performance numbers for their products but we certainly can do our own due diligence to work out if a particular FPGA platform will support our projects. I'd argue that this is a requirement.

Since the major FPGA vendors have removed hard external memory controllers from their devices it's not a bad idea to spend some time trying to work out how well they actually work for your projects. With all datasheets, never take the advertising on page 1 at face value. Sometimes you can figure out what you need to know deep within the datasheet. Sometimes, you have to do some experimenting.

[edit] For the Genesys 2, both DDR3 devices share address and control lines, so operating them independently isn't a possibility.

[another edit] Keep reading because I did execute an 1800 MT/s DDR3 design for the NetFPGA-1G-CML platform.

Edited November 29, 2021 by zygot

zygot · November 3, 2021

The MIG IP in Vivado has a number of bugs that vary from release to release, some of which are quite bad. For people who use the HDL design flow the tools are getting more and more unfriendly with each release. Two issues with the MIG present a hardship to the HDL project design flow. One issue is that the MIG IP can't understand ucf or xdc files that have more than just the location constraint on one line. The second issue is that it is very picky about signal names in the constraints files. Digilent simply doesn't provide the constraints for using the MIG IP for their boards in a format that the MIG IP can use in Vivado. I've asked for this to be addressed. We'll have to wait to see a reply.

There's really no excuse or reason why the MIG IP wizard has to be so contrary. The fact is that the mig.prj file contains (almost) all of the information necessary to generate the IP output products; so really, the only thing that vendors need to supply in support of their boards is a working mig.prj file.

That isn't the end of Vivado issues and its IP bugs though. Vivado simply can't keep track of all the mig.prj files that it creates and port them to a new project. The worst issue is that I've had Vivado unceremoniously terminate while having the Hardware Manager open and connected to a board while running a simulation. You might not think that this is so bad.. until you discover that sometimes this leaves behind files that you can't remove, even as an administrator. I've only seen this on Windows hosts, but it's pretty disconcerting.

Edited November 3, 2021 by zygot

zygot · November 3, 2021

I've created a variant of the NexysVideoDdrTest.vhd design associated with part three of the tutorial that submits read or write commands as fast as the MIG DDR3 controller allows. For the Nexys Video board here is a test result that is representative:

    ui_clk is 100 MHz. I'm writing 16 bytes every 10 ns or 1600 million bytes/s peak rate
    000130DE =    78046 --> wtimer --> 0.00078046 seconds
    00010000 =    65536 --> wcount --> 1048576 bytes --> 1,343,535,863 bytes/s to write 1 MB
    00011A3D =    72253 --> rtimer --> 0.00072253 seconds
    00010000 =    65536 --> rcount --> 1048576 bytes --> 1,451,256,003 bytes/s to read 1 MB
    00000000 =        0 errors

For the Nexys Video 4:1 controller the peak data rate is 1600 million bytes/s. So, for 1 MB, I see an average data rate of about 84% of peak for writes and about 91% for reads.
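For anyone wanting to check the arithmetic in the log, here is a minimal Python sketch. It assumes, as the log suggests, that the timer counts ui_clk cycles and that each counted command moves one UI word; the function and parameter names are made up for illustration.

```python
def achieved_rate(timer_hex, count_hex, ui_clk_hz, bytes_per_word):
    """Return (seconds, total_bytes, bytes_per_second) from the hex counters.

    Assumes the timer counts ui_clk cycles and the counter counts UI words.
    """
    seconds = int(timer_hex, 16) / ui_clk_hz
    total_bytes = int(count_hex, 16) * bytes_per_word
    return seconds, total_bytes, total_bytes / seconds


# Nexys Video write figures from the log above: 100 MHz ui_clk, 16-byte words.
seconds, total_bytes, rate = achieved_rate("000130DE", "00010000", 100e6, 16)
peak = 100e6 * 16  # one 16-byte word accepted on every ui_clk cycle

print(f"{seconds:.8f} s, {total_bytes} bytes, {rate:,.0f} bytes/s")
print(f"{100 * rate / peak:.1f}% of peak")
```

Swapping in the rtimer and rcount values gives the read figure the same way.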

I also ported the NexysVideoDdrTest to the Genesys 2 board using the MIG settings in the Reference Manual.

    ui_clk is 900/4 = 225 MHz. I'm writing 32 bytes every 4.444 ns** or 7200 million bytes/s peak rate
    00013472 =    78962 --> wtimer --> 0.00035094222 seconds
    00010000 =    65536 --> wcount --> 2097152 bytes --> 5,975,775,727 bytes/s to write 2 MB
    0001207C =    73852 --> rtimer --> 0.00032823111 seconds
    00010000 =    65536 --> rcount --> 2097152 bytes --> 6,389,254,206 bytes/s to read 2 MB
    00000000 =        0 errors

For the Genesys2 I'm seeing an average data rate of about 83% (write) to 89% (read) of the peak data rate for 2 MB. Timing closure is an issue for this board at the maximum possible data rate, however. That's likely why the Digilent demos run the DDR PHY at a more sedate 400 MHz.

I should mention that, due to the way that the MIG UI works, timing closure will always be a likely issue if you try and run the DDR controller this way. Also, if you have to write or read more than one word to the MIG UI to support the 8 burst data requirement your data throughput will suffer appreciably. For instance, if you have to write 2 words per UI command, then your maximum peak data rate is 1/2 the PHY data rate.

There's good news. There's no significant difference in average data rates for blocks of 1 MB to 512 MB.

NOTE: Once you start getting into weird clock periods, the numbers that the MIG IP requires are somewhat fictional, but I guess that we don't have to worry about that as long as the hardware appears to function properly.. whatever that means.

Edited November 3, 2021 by zygot

zygot · November 4, 2021

I tested the NetFPGA-1G-CML:

- For the 4:1 controller with an 800 MHz PHY clock the UI runs at 200 MHz and the peak data rate is 1600 million bytes/s
    04CCCD1D =    80530717 --> wtimer --> 0.402653585 seconds
    04000000 =    67108864 --> wcount --> 536870912 bytes --> 1,333,332,005 bytes/s to write 512 MB
    0475A8C2 =    74819778 --> rtimer --> 0.37409889 seconds
    04000000 =    67108864 --> rcount --> 536870912 bytes --> 1,435,104,263 bytes/s to read 512 MB
    00000000 =           0 errors

Again the average UI data rate is about 83% (write) to 90% (read) of the peak DDR3 PHY rate. I was wrong in my analysis that the board couldn't achieve a 1600 MT/s DDR3 data rate. This test was created using Vivado 2020.2. For unknown reasons the MIG controller doesn't provide the device_temp output that Vivado 2019.1 does.

zygot · November 5, 2021

Just for fun, I tested the venerable KC705 DDR3 SODIMM:

    the KC705 4:1 controller with an 800 MHz PHY clock and 64-bit DQ bus has a ui_clk of 200 MHz with 512-bit data
    00012C53 =    76883 --> wtimer --> 0.000384415 seconds
    00010000 =    65536 --> wcount --> 4194304 bytes --> 10,910,874,966 bytes/s to write 4 MB
    000117BC =    71612 --> rtimer --> 0.00035806 seconds
    00010000 =    65536 --> rcount --> 4194304 bytes --> 11,713,969,726 bytes/s to read 4 MB
    00000000 =        0 errors

Done in Vivado 2019.1. The KC705 is lacking a few amenities but certainly has an external memory worthy of the Kintex -2 part. Getting a bitstream with Vivado was an adventure due to a design flaw with the board, and poor support by Xilinx, its vendor.

Also, all of the Kintex boards that I've tested: KC705, Genesys2, and NetFPGA-1G-CML have the global clock in the wrong half of the FPGA (relative to the DDR pin assignments) so you have to use a CLOCK_DEDICATED_ROUTE_BACKBONE constraint and settle for less than ideal clock routing for designs using the DDR. Intel FPGA evaluation boards tend to have more external clock inputs... but their devices also have a more restrictive clocking infrastructure than the Series 7 devices. Still, for high performance boards like the Genesys2 it would seem prudent to add at least one extra, perhaps programmable, clock input. The difference in cost would be minimal but greatly enhance usability.

Again, all of the test results that I've posted are for a design pushing the MIG DDR3 controller to its maximum data throughput and not necessarily appropriate for all designs.

Edited November 5, 2021 by zygot

reddish · November 5, 2021

Thanks for the writeup. I have always shied away from DDR programming so far (hating as I do the opaque Vivado wizards with a vengeance), your work will be invaluable once I get round to it.

zygot · November 6, 2021

@reddish I appreciate the feedback. I still have (at least) one more part to complete, but haven't decided how to approach it yet. Hopefully, more to follow....

Since most FPGA boards have some external memory it shouldn't be as daunting a task to use for projects as tools and board vendors make it out to be. Still, as with the KC705, it can be very difficult to resolve obstacles. I am unaware of any community effort to deal with exposing "hidden" information needed to complete an HDL project using the MIG. Vendor IP shouldn't be so difficult.

Thanks for overlooking the typos and rough presentation...

Edited November 6, 2021 by zygot

zygot · November 9, 2021

To complete the board test results for the boards that I have access to here's something that might be of interest.

There aren't too many ZYNQ based boards with external memory connected to the PL. The ZCU106 happens to be one. For some reason Xilinx has decided not to support it with recent versions of its tools....

    the ZCU106 4:1 controller with a 1200 MHz PHY clock and 64-bit DQ bus has a ui_clk of 300 MHz with 512-bit data
    013B5577 =    20665719 --> wtimer --> 0.06888573 seconds
    01000000 =    16777216 --> wcount --> 1073741824 bytes --> 15,587,289,617 bytes/s to write 1 GB
    011087E7 =    17860583 --> rtimer --> 0.05953528 seconds
    01000000 =    16777216 --> rcount --> 1073741824 bytes --> 18,035,387,152 bytes/s to read 1 GB
    00000000 =           0 --> errors

It's almost astonishing, to me at least, that the test runs at 300 MHz on hardware; but there it is. Here the average data rate is about 81% (write) to 94% (read) of the 19200 million bytes/s peak.
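To put the per-board results side by side, here is a small Python sketch that expresses each measured rate as a fraction of the UI peak (one UI word per ui_clk). The numbers are copied from the logs posted above; the UI word sizes are inferred from bytes divided by command count, and the function name is made up.

```python
# (board, ui_clk MHz, UI word bytes, write bytes/s, read bytes/s) from the posts above.
results = [
    ("Nexys Video",    100, 16,  1_343_535_863,  1_451_256_003),
    ("Genesys 2",      225, 32,  5_975_775_727,  6_389_254_206),
    ("NetFPGA-1G-CML", 200,  8,  1_333_332_005,  1_435_104_263),
    ("KC705",          200, 64, 10_910_874_966, 11_713_969_726),
    ("ZCU106",         300, 64, 15_587_289_617, 18_035_387_152),
]


def efficiency(rate, ui_clk_mhz, word_bytes):
    """Fraction of the UI peak rate (one word per ui_clk) actually achieved."""
    return rate / (ui_clk_mhz * 1e6 * word_bytes)


for board, mhz, word_bytes, wr, rd in results:
    w = efficiency(wr, mhz, word_bytes)
    r = efficiency(rd, mhz, word_bytes)
    print(f"{board:15s} write {w:6.1%}  read {r:6.1%}")
```

The block sizes differ between boards (1 MB to 1 GB), so this is only a rough comparison, but reads come out consistently ahead of writes.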

The UltraScale+ uses the DDR4 SDRAM MIG 2.2 in Vivado 2019.1. pg150-ultrascale-memory-ip now covers this version. There's no flexibility in the IP design due, in part, to the extreme clocking and timing specifications for the DDR4 memory. The MIG constraints file now abstracts pin constraints by using BOARD_PIN instead of an actual pin number, and doesn't define any of the other pin constraints, so that it's very difficult to find, or more importantly manage, the source code. This is the way that Xilinx is moving; less control for you over your projects and source maintenance, and more obfuscation of the details. Like it or not, you get a MicroBlaze embedded in your design to accomplish the calibration. The only good news is that the Hardware Manager can find and report the calibration status and results. If, for some reason, it can't find the embedded processor, then you can't download code or run any application on the UltraScale+ processor(s)... and yes, I managed to do that briefly. But who can argue with 16 GB/s of data storage bandwidth?

The DDR4 core is a very complex IP and good luck to anyone having to debug it for a custom board.

I had to restart the project a few times. If you start with a board design and include the DDR4 block you HAVE to connect an AXI interface to the memory. Who wants to do that? The whole purpose of having the PL get access to external memory is for your HDL design to use it. But you can create a basic board design with just the ZYNQ and then create the DDR4 core as general purpose IP. Then your code can instantiate the ZYNQ and MIG core with the UI and run the design without doing any SW design of any kind.

Edited November 9, 2021 by zygot

zygot · November 29, 2021

I really should get around to posting the DDR3 performance for the Opal Kelly XEM7320. It has 3 SYZYGY ports but no UART so a port will be a bit more work than I can afford at the moment..

We can come up with a pretty good estimate for maximum real-world data rates for the XEM7320 though:

400 MHz DDR3 PHY clock * 4 byte DQ interface * 2 transfers per clock * 90% of peak = 400*4*2*0.9 = 2880 million bytes/s, or 2.88 billion bytes/s. This is certainly more than sufficient to capture 4 ADC1410 ZMOD channels which happens to be something that I've done, at a lower data rate.

jb9631 · December 14, 2021

Thanks again for this collection. I have a few questions, and I think it's best to start at nearly the very beginning:

Quote

According to the reference manual the memory on the Nexys Video is a Micron MT41K256M16HA-187E DDR3 device. /.../ But the recommended MiG settings stated in the reference manual lists a MT41K256M16HA-125 as the preferred selection. Why is that?

Here, you correctly note that the -187E part has a cycle time of 1.875 ns, which corresponds to a 533 MHz clock. You also note that your Nexys FPGA's maximum data rate is 800 MT/s, corresponding to a clock of 400 MHz. Why, then, would Digilent's reference manual recommend you choose a -125 part in MIG, when that one corresponds to an even higher clock of 800 MHz?

Funny enough, in my case the tables are turned, i.e. the part I have is a 1600 Mbps one but the Digilent reference manual recommends I choose a 1333 Mbps part instead. This, at least, I can understand, because the datasheet (and likely the standard, too?) explicitly states that higher speed modules are backward compatible, i.e. a 1600 Mbps part is compatible both with 1333 Mbps and 1066 Mbps clocking. But if this were a case of downgrading to the part closest to the data transfer achievable by the FPGA (which evidently, it isn't), I'd expect the reference manual to suggest we both choose the 1066 part.

Is this all just a matter of "choose the nearest part that MIG has available"? Or is there an inherent asterisk hidden in that statement that should read "...and hope it works well for you"? Would/should it be recommended to edit the memory timings by hand, especially in cases like the (my) Arty-S50 board, where one module is listed in the reference manual, but another module is delivered instead, or, like in your case, where the refman suggests you use faster timings than your memory chip supports?

This last part isn't part of a question, but a detail that wasn't immediately obvious to me: "Trcd/Trp/Tcl and are in clock period units," yes, but to me, "clock period" would mean the period of the frequency that the memory is used at in the application (in the context of this tutorial, 2500 ps, for 400 MHz). But, that is not the case -- the clock period used to calculate values to populate the "Timing Parameters" table in MIG must be the clock period of the memory (e.g. 1.25 ns for a 1600 Mbps part). Maybe I'm the odd one out here, as this is my first time working with DDR memory, and to everyone else this is as obvious as grass being green, but nowhere I've read is this explicitly stated.
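To illustrate the speed-grade arithmetic discussed in this post, here is a minimal Python sketch that turns a clock period into a clock frequency and a DDR data rate, and converts a timing parameter in ns to whole clock periods. The function names and the 13.75 ns value are hypothetical, chosen only to show how the clock count depends on which period you divide by; it does not settle which period MIG expects.

```python
import math


def clock_mhz(tck_ns):
    """Memory clock frequency in MHz for a given clock period tCK in ns."""
    return 1000.0 / tck_ns


def data_rate_mts(tck_ns):
    """DDR moves two transfers per clock, so MT/s is twice the clock in MHz."""
    return 2 * clock_mhz(tck_ns)


def ns_to_clocks(t_ns, tck_ns):
    """Round a timing parameter in ns up to whole clock periods."""
    # The small epsilon guards against float noise on exact multiples.
    return math.ceil(t_ns / tck_ns - 1e-9)


for grade, tck in (("-187E", 1.875), ("-125", 1.25), ("400 MHz operation", 2.5)):
    print(f"{grade}: tCK {tck} ns -> {clock_mhz(tck):.0f} MHz, {data_rate_mts(tck):.0f} MT/s")

# Hypothetical 13.75 ns parameter, expressed in clocks at two different periods.
print(ns_to_clocks(13.75, 1.25), ns_to_clocks(13.75, 2.5))
```

The same parameter in ns maps to a different number of clocks depending on the period used, which is why it matters which "clock period" the table refers to.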

zygot · December 15, 2021

Don't get hung up in the details of the memory datasheet specifications. Modern DDR3, DDR4 and newer devices are designed to be used with modern CPU devices, not programmable logic. When using FPGA vendor external memory IP stick with the timing and devices that the IP supports... at least until you have a few successes under your belt. To get started with implementing a successful first DDR project start with the recommended MIG settings provided by the FPGA board vendor; that's what they use to test the boards.

It's not uncommon for FPGA boards to be populated with a memory part that can operate with timings that exceed the capabilities of the FPGA. The controlling factors for FPGA devices are in the AC switching specifications for clock generators and buffers. Xilinx devices have a maximum DDR data rate for 4:1 controllers and 2:1 controllers that is spelled out clearly. And then there's the physical board memory layout and buffer switching capabilities to factor in.

I'm no DDR memory expert and frankly haven't had an interest in exploring the limits of FPGA/DDR performance. I really just want to use the storage capabilities that a board provides with the least amount of effort so that I can concentrate on a project.

Edited December 15, 2021 by zygot

Mathias · January 22, 2022

Thanks for the post! I have read through all your tutorials but I have not tried the examples/demo yet. I will do that in the coming weeks to learn more.

I have the ARTY S7 board by digilent and I have a question regarding the clocking.

My design is a graphic converter. It takes input from ADC and creates HDMI output. DDR memory is meant to be used as frame buffer.

My application needs a 200 MHz and a 40 MHz clock. For the DDR to work I need a 100 MHz sys clock and a 200 MHz ref clock. On the Arty board there is a 100 MHz oscillator connected to R2. What I would like to do is to use an MMCM to generate all the clocks I need from the 100 MHz. In other words I want to do the same thing you do in your first tutorial. But whatever I try, I always end up with timing or placement problems (see attached for one example).

Do you have any recommendations for how I should handle my clock situation?
I have seen that there is a ui_clk coming from the ddr infrastructure, is this what I should use?
I see that you are using it in your second tutorial, but I have a hard time understanding VHDL code.

one_example_of_problems_I_have.txt

coldfiremc · January 22, 2022

Thanks @zygot. I have two Nexys Video boards and a Nexys A7 among other fancy RFSoC boards. I would like to follow these test designs. I have a Xilinx enterprise license, so I can open cases, and I use these MIG IPs too much in fairly critical applications to let nasty bugs appear this way. If you can narrow down those bugs, I could get some sort of official answer from Xilinx.
One of the causes of the limited clocks for MIG is the fact that MIG generates its refclock internally, so the max PLL clock and divider options get very limited. Better designs use 2 clocks: one for data, and one for reference, coming from external oscillators. It would be neat to have more boards designed with 2 differential oscillators (they can be cheap oscillators; there's no need for oven controlled PLLs in domestic boards like Digilent's).

zygot · January 22, 2022

5 hours ago, Mathias said:

I have the ARTY S7

I just looked at Spartan 7 to see if it uses the same MiG as the other Series 7 devices; it does. Follow the tutorial with regards to checking the datasheet for DDR PHY clocking rates. This will likely dictate whether you have a 4:1 or 2:1 controller ( assuming that the MiG for Spartan 7 is the same as Series 7 devices ). I'd advise going to the Digilent product page for your board and using any MiG .prj and .ucf files that are provided as a starting point. You want to at least start with a working DDR design before trying to add external memory to another project. You will have to edit the ucf file as the Mig gets confused with anything but pin location assignments as I mentioned in the tutorial. Even though the Mig project file is in HTML format you can read it in a text editor to see what settings you should start off with. I've requested that Digilent provide a pin location constraints file for their boards that is compatible with the Mig tool but haven't gotten a response.

Any logic connected to the UI, such as issuing read or write commands to the DDR controller UI, will need to use the ui_clk. But the rest of your design can use any clock rate. The NexysVideoDdrDemo.vhd project in the 2nd part of the tutorial shows how to do this and pass data between clock domains. The more clock domains you have, the more complicated timing closure will be, especially if the clocks are unrelated. I'd suggest following the example in part 2 of the tutorial. You should spend some time reading the clocking reference manual for your device. There are a lot of rules and limitations. Once a board has been laid out and pins are assigned to external clock module inputs you can't change them. Often the tools will issue a warning or bitgen failure and you might have to add a constraint such as a CLOCK_DEDICATED_ROUTE constraint to your toplevel project constraints file to over-ride the tool behavior. This is due to the board design.

For designs that need to issue commands to the MiG DDR controller as fast as possible you might have to restructure your design to use the ui_clk for the bulk of your design in order to achieve timing closure. Most of the board test results that I added posts about were done with a modified version of the NexysVideoDdrTest.vhd code to issue commands to the DDR as fast as the Mig IP allows. The problem for timing has to do with the behavior of the UI app_rdy signal. For the third part of the tutorial I wanted to simplify the design to emphasize the UI operation so that anyone could add an ILA and see how it operates. For NexysVideoDdrTest it should be obvious that this design can only issue write or read commands once every 4 ui_clk periods. The DDR interface is fast enough that this is often sufficient for most projects. That design was changed considerably for the performance posts of other boards. How one goes about doing that is left as a personal assignment for the reader. The MiG example project is one way to do it but is almost unreadable. Parameters can be useful or a hindrance.
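As a rough illustration of what issuing a command only once every 4 ui_clk periods costs, here is a small Python sketch. It is a simple upper-bound model under my own assumptions: it ignores refresh, app_rdy stalls and read/write turnaround, and the function name is made up. The 100 MHz ui_clk and 16 bytes per command are the Nexys Video figures from earlier in the thread.

```python
def sustained_rate(ui_clk_hz, bytes_per_command, clocks_per_command):
    """Upper bound in bytes/s when one command is issued every clocks_per_command cycles."""
    return ui_clk_hz * bytes_per_command / clocks_per_command


# Nexys Video: 100 MHz ui_clk, 16 bytes moved per UI command.
every_cycle = sustained_rate(100e6, 16, 1)
every_fourth = sustained_rate(100e6, 16, 4)

print(f"one command per ui_clk:    {every_cycle / 1e6:.0f} million bytes/s")
print(f"one command per 4 ui_clks: {every_fourth / 1e6:.0f} million bytes/s")
```

Under this model the simplified part 3 design tops out at a quarter of the UI peak, which is still plenty for many projects.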

For really high throughput external memory designs such as the UltraScale+ DDR4 the high data rate might be a problem requiring a design approach that is more complicated than you want to do.

Don't hesitate to ask questions about the code if you are struggling. It seems pretty straightforward to me, but then again I wrote it.

Edited January 23, 2022 by zygot

zygot · January 22, 2022

5 hours ago, coldfiremc said:

If you can narrow those bugs, I would get some sort of official answer with Xilinx.

I haven't found the official Xilinx User Community Forums to be particularly helpful... especially with tool version bugs.

5 hours ago, coldfiremc said:

It would be neat to have more boards designed with 2 differential oscillators

I agree. Intel boards tend to have more clock input options, even on cheaper boards. But of course the Intel clocking tends to be more restrictive than Xilinx devices. Still, for the higher end Digilent boards there's no excuse for not having at least one high quality programmable clock module, and more IO banks connected to external clocks. This wouldn't have to add much expense to the board but would significantly improve board application usefulness. I don't think that Digilent's boards are designed by engineers who actually want to use them for real world project design... sigh! The Eclypse-Z7 is the worst example of hamstringing a board by making poor design decisions.

For UltraScale+ PL connected DDR4 you simply can't feed the DDR4 controller a MMCM or PLL generated clock.. it has to be a high quality clock on clock capable pins.

Unfortunately, I don't design the boards.. I just have to make the best of what I get. Xilinx branded boards tend to be more realistically designed for actual use but they cost a lot more too ( not because of the better clocking options ).

Edited January 22, 2022 by zygot

coldfiremc · January 24, 2022

On 1/22/2022 at 8:03 PM, zygot said:

I haven't found the official Xilinx User Community Forums to be particularly helpful... especially with tool version bugs.

I can make a support case, I have enterprise support

zygot · January 24, 2022

4 hours ago, coldfiremc said:

I can make a support case, I have enterprise support

There was a time when I paid for Altera and Xilinx tools. As an independent contractor the level of support that I got wasn't particularly useful for solving most issues. Back then I could (possibly) get help from distributor or vendor sales offices or FAEs for whom I had contact information. It was a matter of luck, but sometimes I'd get a hold of someone with the necessary answers to a particular problem. I just don't put much hope in getting useful assistance from large vendors any more. Even when I worked for companies purchasing large** volumes of product the level of technical support available to them wasn't that impressive, unless you had direct contact with the right person outside of the normal support chain.

Anyway, I mentioned the issues I encountered in the tutorial because I think that it's beneficial for those without a long history of dealing with FPGA or semiconductor vendors. Sometimes, it's not you; it's the vendors causing problems. Integrated tools like Vivado have a lot of software dependencies, which make virtually every PC unique and hence tool behavior unique.. to a degree. Things that I encounter may not be what anyone else encounters.

If you encounter issues that appear to be tool related by all means use whatever path for resolving these is available to you. I wish you only good fortune.

As for the working poor*** it seems that the best way to get assistance with tool issues is to talk about them and share experiences, much the way that local organic user's groups have done in the past. I suspect that a huge percentage of users of the Digilent forums are pretty much on their own when it comes to obtaining shared technical knowledge that isn't readily available from vendor resources.

** I don't know what this means to a company like Xilinx these days but I'm guessing that it's north of $100 million to get top notch support.

*** You can be an employee of a very large manufacturer and find yourself in the category.

Edited January 24, 2022 by zygot

Mathias · January 30, 2022

Hi again,

Had some time to look through this again. Looking at the first tutorial.

I see that you have this clock in your project:

set_property -dict {PACKAGE_PIN R4 IOSTANDARD LVCMOS33} [get_ports sysclk]

From this clock you can later generate the clocks you need (clk_out1 and clk_out2):

   clk_wiz_0 instance_name
   (
    // Clock out ports
    .clk_out1(clk100),     // output clk_out1
    .clk_out2(clk200),     // output clk_out2
    // Status and control signals
    .reset(1'b0),          // input reset, tied inactive
    .locked(locked),       // output locked
    // Clock in ports
    .clk_in1(sysclk));     // input clk_in1

I think one problem for me with the Arty S7 board is that I cannot use LVCMOS33. I need to use SSTL135 for my clock (while the rest of my design is 3.3V logic). The DDR for my board runs on bank 34, fed with 1.35V. The 100 MHz oscillator naturally runs at the same voltage. Can this make timing more difficult for me? Or am I misunderstanding something?

zygot · January 30, 2022

1 hour ago, Mathias said:

I think one problem for me with arty s7 board is that I cannot use LVCMOS33. I need to use SSTL135 for my clock

The Series 7 devices have internal level translators to pass signals between IO banks having different Vccio voltages. Not all FPGA vendors have global clock distribution to all areas of the devices in such a flexible way. So, it is possible to have a clock module connected to a clock capable pin on one bank powered by 3.3V and use it in an IO bank powered by 1.35V. If you think about it, any FPGA device capable of having multiple IO voltage regions needs this ability since the main logic power supply voltage is rarely the same as any common IO logic standard.

For MiG DDR3 controllers you can connect the external clock to a PLL or MMCM and generate a number of related clocks of various frequencies. This is what I've done for all of the tutorial HDL examples. I'm pretty sure that your board has an external clock module ( likely 3.3V ) connected to a clock capable pin that you can use in the same way. ( I haven't checked this but boards with DDR require a decent clock source. )

I guess that I neglected to include the Series 7 Clocking Reference Manual in the list of required reading. Actually, before starting a design you should have at least done a quick speed-read through all of the device feature reference manuals for your device family regardless of vendor. Clocking is an area where one might expect to run into unexpected surprises. There are a lot of use case limitations with clocking depending on what type of IO resources are being used so having a good understanding of the material in the Series 7 Select IO and Clocking manuals is a necessity.

Edited January 30, 2022 by zygot

Mathias · February 1, 2022

@zygot Thanks for clearing that up. I understand what you are saying. but..

When I read the recommendations from digilent they want me to use the 100Mhz clock for driving the DDR3.
https://digilent.com/reference/_media/reference/programmable-logic/arty-s7/arty-s7_rm.pdf

This can be found on a MMCM capable pin. If I use it as you do in your tutorial I get this error:

[DRC BIVC-1] Bank IO standard Vcc: Conflicting Vcc voltages in bank 34. For example, the following two ports in this bank have conflicting VCCOs:  
ddr3_ck_p[0] (DIFF_SSTL135, requiring VCCO=1.350) and sys_clk_i (LVCMOS33, requiring VCCO=3.300)

And if I use the SSTL135 I get:

[Place 30-575] Sub-optimal placement for a clock-capable IO pin and MMCM pair. If this sub optimal condition is acceptable for this design, you may use the CLOCK_DEDICATED_ROUTE constraint in the .xdc file to demote this message to a WARNING. However, the use of this override is highly discouraged. These examples can be used directly in the .xdc file to override this clock rule.
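To spell out what the BIVC-1 message above is complaining about, here is a minimal Python sketch of the rule: every port placed in one bank has to agree on the VCCO voltage its I/O standard requires. The checker and its names are hypothetical, not anything Vivado exposes, and the voltage table only covers the standards named in the error text.

```python
from collections import defaultdict

# VCCO required by each I/O standard, in volts (only the ones named in the errors above).
VCCO_BY_IOSTANDARD = {
    "LVCMOS33": 3.3,
    "SSTL135": 1.35,
    "DIFF_SSTL135": 1.35,
}


def find_vcco_conflicts(ports):
    """Group ports by bank and required VCCO.

    ports is an iterable of (port_name, bank, iostandard) tuples. Returns
    {bank: {vcco: [port names]}} for banks that need more than one VCCO.
    """
    by_bank = defaultdict(lambda: defaultdict(list))
    for name, bank, iostandard in ports:
        by_bank[bank][VCCO_BY_IOSTANDARD[iostandard]].append(name)
    return {bank: dict(groups) for bank, groups in by_bank.items() if len(groups) > 1}


ports = [
    ("ddr3_ck_p[0]", 34, "DIFF_SSTL135"),
    ("sys_clk_i", 34, "LVCMOS33"),
]
print(find_vcco_conflicts(ports))
```

With sys_clk_i constrained as SSTL135 instead, the same check reports no conflict, which matches the DRC error going away and the placement message showing up in its place.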

Oscillators can be found on sheet 4 and 5.
https://digilent.com/reference/_media/reference/programmable-logic/arty-s7/arty_s7_sch-rev_e.pdf

(DDR3_CLK100 1.35V 100Mhz) [schematic image]

(UCLK 3.3V 12Mhz) [schematic image]

Do you mean that I should use the 3.3V 12 MHz clock instead of the 100 MHz 1.35V one? Or do you think both should work with the DDR3 on my board?

Edited February 1, 2022 by Mathias
