D@n

Members
  • Posts

    2,246

Everything posted by D@n

  1. RISC-V cores aren't necessarily resource hungry. I've seen quite a few that've given me a run for my money when trying to generate low-resource IP. These can actually get really small--once you throw away all of the expensive features. In fact, I know of one that will work using only 125 LUTs. All that's to say, it's not necessarily the RV IP that is the resource consumer. Yes/no. While I understand the sentiment, a lot of software engineers trying to access hardware will initially view this as the "easiest" approach. A CPU can (presumably) convert a hardware design problem to a software problem, and hence it is often estimated to be a simpler task than properly learning how to engineer hardware designs in the first place. That said, a CPU needs 1) a ROM to get its first instructions from (typically provided by flash), 2) some RAM for its stack, 3) the special purpose verilog module this engineer wants, 4) a serial port for debugging purposes, 5) an interconnect to connect all of these together, 6) a linker script, 7) software to load the program first into ROM, and then 8) software to load the program from ROM to RAM, and ... @zygot has a point here. All of that put together constitutes both 1) a lot of complexity that will need to be debugged when things aren't working (and no, nothing works the first time), and 2) it also represents a significant logic cost which will pull you away from your ability to use the full resources of the part you've purchased. Dan
  2. If I recall correctly, MicroBlaze does not work well with other MicroBlazes. The specification only warrants one in a system. This was at least the answer given to me when I pointed out that Xilinx's MIG controller couldn't handle exclusive access. Backing up a bit. To do multiple CPUs in a design (properly), you need some form of communicating between them. This is typically done via atomic accesses over the bus. AXI calls this capability "Exclusive access", and there's quite the protocol behind it to do it. When I was studying how to implement exclusive access on my own, I discovered the MIG did not support it. I think Microblaze had instruction support for it, but the MIG did not. When digging I seem to recall someone saying that the documentation specifically disallowed Microblaze from ever working in a multi-CPU system, so this would never be an issue. In other words--you are doing something new, and untested. Dan
  3. I'm still new to this party as of this past year, and ... No, I am not using any FPGA tool vendor IP beyond the required IO hardware macro blocks. (IDDR, OSERDES, etc). Yes, it took a bit more work to do--especially since I handled almost all of the network via RTL. Still, I find your quote, "... this is almost ideal approach", encouraging. Thank you. Dan
  4. @Morocco_Brittany56, @zygot has a point. Why would you use Vivado generated IP for this? Their generated designs are typically broken. Not only that, this is easy enough to do. Let's start at the top. You want some kind of "Ram or Fifo". Building a RAM or a FIFO is a first year beginner project. It's not something that should require Vivado IP. You should have a "Ram or Fifo" in your own back pocket that you've tested, verified and trust. Here is one of mine, for example. Frankly, they're a lot easier to work with in RTL than when using Vivado IP. Among other things, you can debug your own. You can't very well debug Vivado's IPs. (I know ... I submitted some bug requests back in 2018, and they still haven't been fixed the last I checked ...) Absolutely! I even have my own AXI-Lite UART and it has an integrated FIFO within it. I built it years ago, and I would encourage you to continue doing the same. In my case, though, I merged the FIFO into the UART. That way the CPU has a consistent interface. It can read/write the UART registers, or the UART FIFO, and they're both in (roughly) the same spot. I mean, you could keep the two separate if you wanted to, but why would you? The UART software driver would then need to interact with two IPs (the UART and the FIFO) just to send anything. It just makes more sense to give them both the same register map. As for the comments about a 13MBaud UART ... I'm all ears. Sometimes I can get mine to operate at/near 4MBaud, but 1-2MBaud is more common with the FTDI chips I find in most FPGAs. (On the other hand, on one project I replaced the UART with a Gb Ethernet IP and was just shocked at how fast the interface went in comparison ...) Dan
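     To give a sense of how small such a FIFO can be, here's a minimal sketch of a synchronous one. The module and port names (sfifo_sketch, i_wr, o_full, and so on) are made up for illustration and are not taken from any particular repository. It assumes a single clock and a power-of-two depth, and it has not been through formal verification.

     ```verilog
     // Minimal synchronous FIFO sketch (hypothetical names, single clock).
     module sfifo_sketch #(
             parameter BW     = 8,   // Bits per entry
             parameter LGFLEN = 4    // log2 of the FIFO depth
         ) (
             input  wire          i_clk, i_reset,
             // Write side
             input  wire          i_wr,
             input  wire [BW-1:0] i_data,
             output wire          o_full,
             // Read side
             input  wire          i_rd,
             output wire [BW-1:0] o_data,
             output wire          o_empty
         );

         reg  [BW-1:0]   mem [0:(1<<LGFLEN)-1];
         // The pointers carry one extra bit so full and empty can be told apart
         reg  [LGFLEN:0] wr_addr, rd_addr;
         wire [LGFLEN:0] fill;

         assign fill    = wr_addr - rd_addr;
         assign o_empty = (fill == 0);
         assign o_full  = fill[LGFLEN];  // Set only when fill == 2^LGFLEN

         // Ignore writes when full and reads when empty
         wire w_wr = i_wr && !o_full;
         wire w_rd = i_rd && !o_empty;

         initial wr_addr = 0;
         always @(posedge i_clk)
         if (i_reset)
             wr_addr <= 0;
         else if (w_wr)
             wr_addr <= wr_addr + 1;

         always @(posedge i_clk)
         if (w_wr)
             mem[wr_addr[LGFLEN-1:0]] <= i_data;

         initial rd_addr = 0;
         always @(posedge i_clk)
         if (i_reset)
             rd_addr <= 0;
         else if (w_rd)
             rd_addr <= rd_addr + 1;

         // Asynchronous read: reasonable for small FIFOs in distributed RAM
         assign o_data = mem[rd_addr[LGFLEN-1:0]];
     endmodule
     ```

     A UART with an integrated FIFO would then feed i_wr/i_data from the bus write logic and pull entries out with i_rd whenever the transmitter goes idle. A block RAM based version would need a registered read instead of the asynchronous one shown here.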
  5. @hlittle, Thanks for pointing out the difference! I now see what you are talking about. The "write-protect" pin from the PModSD is now an NC on the PMod Micro, but the schematic shows that a circuit might be populated ... got it. I'll look forward to hearing the answer as well. Dan
  6. From the reference alone, there are no pins left to turn on or off the power. You have the pins you need to control the SD card, but not to power it up or down save by unplugging it. You might wish to look into sending a reset command to the device, but controlling power is off the table. Dan
  7. That particular software was rebuilt for a Pi years ago. As I recall, the rebuild didn't require changes--it just needed to be recompiled with the ARM compiler. The software works in a Linux system environment, and the network and UART interfaces are all well defined there and have been for years. As for senior projects, there's quite the challenge on the instructor's side that I've seen over the years, and that is estimating the complexity of the project. Often instructors have no real clue what the students are up to and so wildly over- or underestimate the student work load. I suppose it just goes with the territory. It's just something I've often seen, and so something I'm sympathetic to when reading of these things. In this case, the project can go from nearly impossible to trivial depending upon details the student hasn't (yet) shared with us.
  8. If this is a software project, it's pretty easy. I wrote this in an afternoon. It does almost exactly what you are looking for (though it splits data across two TCP/IP network streams), and it'd be fairly easy to adjust back to a single stream again if you wished to. (You might even be able to find the single-stream version in the git history, for exactly what you've described.) If this is an RTL project, then yeah, it'd be a bit more ambitious. The key question here, though, is where is the hardware/software boundary? Dan
  9. @artvvb, I would also ask Digilent to post and maintain PDF copies of all of their manuals. It is common for hardware to outlive its support, or for newer versions of hardware to have newer documents that then get confused with the manuals for older products. My solution to this has always been to download the PDF manual of the user guides or other spec sheets at the time of purchase, to guarantee that 1) I won't get confused by an update to a product I don't have later, and 2) I'll still be able to keep and maintain the manual long after the company that built the product has stopped supporting it. Thanks! Dan
  10. There should be no combinatorial loops within the design. (If there were, I'd fix them ...) Can you post the generated design you are using somewhere? And (even better) can you list/find all the components of this loop? One thing that doesn't make sense from your comment above is that the *genmpy_o_r_reg[25]_O bit is an output of a flip-flop. I wouldn't expect that in a combinatorial loop, but maybe it starts the loop? Again, that'd be why I'd want to see the logic you've generated if you could post it somewhere for evaluation. Thanks! Dan
  11. @kalainan, Are you sure you know what you are asking for? Phase is one of the most meaningless and irrelevant outputs of an FFT. Why? Well, because phase has to be referenced to something. Do you really want to reference phase to a local clock which can not have any calibrations of any type? Likewise, if you are new to an FFT, then I think you will find the bit-reversed order difficult to work with. If you weren't using a bit-reversed order, I could tell you what bins to expect an output in. With a bit reversed order things get a bit more difficult, and I'd have to think about it. Digging deeper, I'd never use an FFT without some amount of overlap and window function. Unfortunately, any amount of overlap will really mess with the phase you are trying to measure. So ... you are going to need some knowledge about what you are doing. The FFT alone is rarely sufficient for such purposes. This, however, is a much longer discussion to be had, and that with your DSP instructor. Dan
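     For what it's worth, here's a small sketch of the index mapping itself, assuming a plain radix-2 FFT of size 2^LGSIZE whose outputs arrive in bit-reversed order. Your particular core may order things differently, so check its documentation. The module and port names are hypothetical.

     ```verilog
     // Maps a position in the FFT's output stream to the bin it holds,
     // assuming radix-2 bit-reversed output ordering (an assumption).
     module bitrev_index #(
             parameter LGSIZE = 10   // log2 of the FFT size
         ) (
             input  wire [LGSIZE-1:0] i_count,  // Position in the output stream
             output wire [LGSIZE-1:0] o_bin     // Natural-order bin number
         );

         genvar k;
         generate for(k=0; k<LGSIZE; k=k+1)
         begin : REVERSE
             // Bit k of the bin number is bit (LGSIZE-1-k) of the counter
             assign o_bin[k] = i_count[LGSIZE-1-k];
         end endgenerate
     endmodule
     ```

     As an example, with LGSIZE=3 (an 8-point FFT), the second sample out (i_count = 3'b001) would be bin 4 (o_bin = 3'b100).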
  12. > ... the devil is in the details It's hard not to snicker at this comment in a rather paternal manner. The comment is just too true. The snickering comes from remembering all of the times in the last couple of decades when I've ended up learning this principle all too well. Yes, I made a living for years skating on the edge of what was physically possible. It was a lot of fun. Yes, you can often use a wrench like you would a hammer. But as @zygot said, the devil is in the details. You may need to get to know those details well. In your case, yes, it is theoretically possible to make an antenna to capture anything in the range of an FPGA's input capture capabilities. Yes, a 1-bit FPGA input can act as an ADC. Yes, such a 1-bit ADC may be sufficient for many purposes. I remember presenting a proof to the boss that a 1-bit signal would be sufficient for GPS processing. (I don't know about Zigbee ... I didn't do that analysis.) Just to add to the discussion, though, let me ask: have you given any thought to the type of pre-amp you might use? Will you need a band pass filter? If not, then let me ask you what would happen if you don't use a proper pre-amp/filter and the signal you are interested in just happens to be dwarfed by something else nearby? Would you be able to receive the Zigbee signal of interest? Or, if something went wrong, would you have enough lab equipment (and of the right type) to figure out what went wrong? These could easily become critical details you may wish to consider early in your project. Dan
  13. D@n

    FPGA output problem.

    Your first error comes from using Xilinx's broken AXI-lite slave template. It's not protocol compliant, and will eventually lock up on you when you aren't expecting it to. Your second problem is reading from the base address the second time instead of the base address plus four. Dan
  14. Build a design that then allows you to program the flash? Here's an example project where I did just that. The project actually contains two top levels that I would choose between. There's the alternate top level, which I used to program the flash and examine I/Os, and the normal top level which I used for my ultimate design. Once both projects were built, I would load the first with JTAG onto the board. Once the alternate top level was loaded onto the board, I could then load a bit file into flash. This also loaded the software I needed for my homegrown CPU, the ZipCPU. If I wanted, I could then load the normal top level bit file via JTAG as well. This was useful if the project design changed, but the CPU instructions did not. The flash controller had two parts to it. One part allowed me to read from the flash, whereas a second interface allowed me to write arbitrary commands to the flash. You can read more about the design approach here if you would like. A serial port interface allowed me access to read and write values on the bus from external to the FPGA. This meant that I could write a small piece of flash control software for that purpose. The actual program that drove this controller was one I called zipload. This not only loaded the bitfile provided to it, but would also decompose and read an ELF file in order to place the design at the right flash address. One of the things I learned along the way was that it was easy to get the flash out of sync with the hardware controller. This meant that the controller required a way to return the flash to its basic SPI based configuration no matter what mode it was in. If the controller ever gets out of sync, you can usually recover it by powering down the board and starting both up again. I also learned that there are special write protect registers within the flash. Some of these can be set only once and never cleared. You should be able to look up the data sheet for your flash and then read these registers. 
If the write protect is truly set internally, then you may need a new flash chip. (Might be easier to get a new board.) Read the registers, check the data sheet, and adjust as appropriate. Just my two cents. Dan
  15. @HomaGOD, Let's see ... when I built with the keypad, I used the COLumns as FPGA outputs, and the ROWs as FPGA inputs. Here's how I went about reading it: First, output zeros on all of the COLumns. This is the normal state of the keypad. You'll sit here until something happens. If any of the ROWs, treated as inputs, produce a value other than VCC, then a button has been pressed. You can then output VCC on two of the COLumns. If the ROW inputs don't change, then these two columns were not responsible for the button--repeat with the other two outputs. Once you've narrowed down which two of the COLumns are responsible for the button press, you can set VCC out on three of the columns--the two unused (i.e. unpressed) ones, and one of the two remaining candidates. Your goal is to find the one COLumn, which when set to zero, leaves the ROW at zero--because the key is pressed. That column plus the row then gives you the key you need. Beware of bouncing! Once the key is pressed, you will have to wait for it to settle before reading it. Looking back over my notes, I waited for 100,000 ticks after registering the ROWs weren't all VCC before I went and tried to figure out which COLumn was responsible. Also, this isn't the only way to handle the PMod keypad. I remember doing this in college (decades ago ...) and reversing the directions of the pins in the process. In this case, the pull ups can help you to know which way to go. For example, the COLumns have no pull ups on them--so they work better as outputs than as inputs. Dan
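     The settling step might look something like the sketch below. It only covers the wait: it assumes all four COLumns are being held low elsewhere in the design, that the ROWs idle high through their pull ups, and that a separate state machine does the column search once o_pressed goes high. All of the names are hypothetical.

     ```verilog
     // Debounce sketch: notice that a ROW has left VCC, then wait for the
     // key to settle before searching the columns. Names are hypothetical.
     module keypad_settle #(
             parameter SETTLE_TICKS = 100_000  // Must stay below 2^17
         ) (
             input  wire       i_clk,
             input  wire [3:0] i_row,      // Raw ROW pins, idle high
             output reg        o_pressed   // A key has been held long enough
         );

         reg [3:0]  row_pipe, row;
         reg [16:0] timer;  // 17 bits, since 2^17 > 100,000

         // Two flip-flop synchronizer: the ROW pins are asynchronous inputs
         initial { row, row_pipe } = 8'hff;
         always @(posedge i_clk)
             { row, row_pipe } <= { row_pipe, i_row };

         initial timer     = 0;
         initial o_pressed = 1'b0;
         always @(posedge i_clk)
         if (row == 4'hf)
         begin
             // Nothing pressed, or the key bounced open: start over
             timer     <= 0;
             o_pressed <= 1'b0;
         end else if (timer < SETTLE_TICKS)
             timer <= timer + 1;
         else
             o_pressed <= 1'b1;
     endmodule
     ```

     Because the timer restarts any time the ROWs return to all ones, o_pressed only rises after the inputs have shown a key down for SETTLE_TICKS consecutive clocks.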
  16. @tnkumar, I think if I were going to learn AXI, (once I'd gotten past the reference guides), I'd start with learning about skidbuffers. I know of no other way to meet AXI's requirements, and maintain 100% throughput, than to use some skidbuffers. (Often times you can cheat, and skip the skidbuffer, but the result--while it might work--isn't AXI compliant.) Once you know what a skidbuffer is and how to use it, I'd then move on to AXI lite. Here's my favorite, documented, AXI-lite example. I use it for everything as my starting point. (Don't use Xilinx's example design--it's broken. It'll work for a while, enough to give you the confidence to believe it works, and then it'll suddenly fail on you when you are not expecting it. This'll leave you looking all over your design for the bug and ... in all the wrong places.) If you really want to learn AXI in its full glory, then you'll need to understand AXI addressing. It's ... not simple. I have a routine I've highly optimized for this purpose that really helps me unwrap bursts. Once you can unwrap burst addressing, all that's left is to count beats in a burst, and make sure that the AXI ID returned matches the AXI ID requested. If after all of this you still want more, then consider this discussion on converting an AXI-lite bus master to an AXI bus master. The big difference between the two, in this application, is the exclusive access. Beware, however ... Xilinx's IP doesn't support exclusive access. Microblaze doesn't use it. The MIG doesn't support it. etc. If you need something that'll use exclusive access, then you might need an example slave to work from that supports it. For most users, though, exclusive access isn't a requirement and the capability can be safely ignored. Dan
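     For anyone who hasn't met one yet, here's a minimal sketch of a skidbuffer with a combinatorial (pass-through) output. The names are illustrative assumptions, and a production version would likely add options this one leaves out, such as a registered output.

     ```verilog
     // Minimal skidbuffer sketch (hypothetical names, pass-through output).
     module skidbuffer_sketch #(
             parameter DW = 32
         ) (
             input  wire          i_clk, i_reset,
             // Upstream (incoming) side
             input  wire          i_valid,
             output wire          o_ready,
             input  wire [DW-1:0] i_data,
             // Downstream (outgoing) side
             output wire          o_valid,
             input  wire          i_ready,
             output wire [DW-1:0] o_data
         );

         reg          r_valid;  // True when the skid register holds a word
         reg [DW-1:0] r_data;

         initial r_valid = 1'b0;
         always @(posedge i_clk)
         if (i_reset)
             r_valid <= 1'b0;
         else if ((i_valid && o_ready) && (o_valid && !i_ready))
             // A word was accepted upstream while downstream was stalled,
             // so it has to be stashed until downstream is ready again
             r_valid <= 1'b1;
         else if (i_ready)
             r_valid <= 1'b0;

         // Capture the incoming word any time one might be accepted
         always @(posedge i_clk)
         if (o_ready)
             r_data <= i_data;

         // o_ready depends only on a register, never directly on i_ready
         assign o_ready = !r_valid;
         assign o_valid = i_valid || r_valid;
         assign o_data  = r_valid ? r_data : i_data;
     endmodule
     ```

     The point of the exercise is that o_ready comes straight from a flip-flop, so the upstream READY signal is registered without giving up a transfer on every clock.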
  17. I suppose you could use the GPIO outputs, but the more general solution would be to build an AXI-lite peripheral. Don't use the Vivado generated AXI-lite example. It's broken. You can have Vivado build the example for you, but if you do that then rip the guts out of it. Replace it with something looking like this, and then you'll have a nicely working AXI-lite peripheral that you can do a lot of things with. Once you have the basic AXI-lite slave up and running, you'll want to modify it with something like:

         // Write to one of two registers
         always @(posedge S_AXI_ACLK)
         if (axil_write_ready)
         begin
             case(skd_awaddr) // Assuming you left this as byte addressing ...
             0: reg_A <= skd_wdata;
             4: reg_B <= skd_wdata;
             endcase
         end

         // Read from either register, or their sum
         always @(posedge S_AXI_ACLK)
         if (axil_read_ready)
         begin
             axil_read_data <= 0;
             case(skd_araddr)
             0: axil_read_data <= reg_A;
             4: axil_read_data <= reg_B;
             8: axil_read_data <= reg_A + reg_B;
             endcase
         end

     ... or, at least, something like that. It's been a while since I've looked at the specific register names to know that I got them right. Bottom line, though, is that any good bus slave will have some kind of interaction roughly like that above, and a good AXI-lite slave is not really any different--once you get past the challenge of decoding AXI-lite in the first place. Dan
  18. The oversizing seems not to be a choice by Digilent per se, but rather the minimum standard size of a flash device. I'm not complaining. That flash was a life saver for my purposes. Dan
  19. ISE will use the beginning of your flash for your bit file. Once the bit file has been copied to the flash, the rest of the flash is yours to do with as you please. Well, to be a touch more precise, it will use the beginning of your flash for your bin file--a bin file is a bit file minus a small (36byte?) header, so they're almost the same thing. The flash size on the S6 is 16MB, and of the two bit files I have lying around, one is 334kB and the other is 273kB. You should therefore have plenty of room left over. I'd be tempted to grab the last 15MB, but you could probably even grab another half MB or more. I do have a Verilog example that uses flash memory on the S6. It was a test of an earlier version of the ZipCPU--a CPU designed to work in an area efficient environment. The design places what I call a "multi-tasking OS" onto the S6. It then runs a series of (mostly) independent programs, time slicing between them. Due to the lack of RAM on the S6, the majority of the software is placed into flash. For performance reasons, a small piece of code was placed into block RAM--mostly for interrupt handling and such. Further, due to the lack of area, the design that loaded the flash in the first place was separate from the design having the CPU within it. (Both are in the same repository.) Dan
  20. HDMI tends to require more pins than a typical PMod can support. That said, have you looked at this PMod? It's not Digilent, but it does look like it might have some promise. Dan
  21. @mdarmanu, How "real time" do you want your measurement to be? If you measure clock cycles on a 100MHz clock, the result times 10ns should equal wall time. In my estimation, the clocks Digilent uses tend to be accurate to within about 1-20ppm or so. If this isn't "good enough" for you, you'll need to start with a better oscillator. The GPS PMod can help, but ... it's not perfect. It will only produce a pulse to lock on to once per second. For some problems, running the algorithm many thousands of times over and then measuring seconds in this manner may be good enough--but it is a lot harder than the first method. Bottom line: "It depends" on your requirement, and how accurate you need things to be. Dan
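     The first method, counting clock cycles, might be sketched as below. It assumes your design can produce a one-cycle i_start pulse when the algorithm begins and a one-cycle i_stop pulse when it finishes. Those signals and the module name are hypothetical.

     ```verilog
     // Counts clock cycles between a start pulse and a stop pulse.
     // With a 100MHz clock, elapsed time is o_ticks times 10ns.
     module cycle_timer #(
             parameter CW = 32   // Counter width in bits
         ) (
             input  wire          i_clk,
             input  wire          i_start,   // One-cycle pulse: begin timing
             input  wire          i_stop,    // One-cycle pulse: end timing
             output reg  [CW-1:0] o_ticks,   // Clock cycles from start to stop
             output reg           o_running
         );

         initial o_running = 1'b0;
         always @(posedge i_clk)
         if (i_start)
             o_running <= 1'b1;
         else if (i_stop)
             o_running <= 1'b0;

         initial o_ticks = 0;
         always @(posedge i_clk)
         if (i_start)
             o_ticks <= 0;
         else if (o_running)
             // Still counts on the clock where i_stop arrives, so the total
             // is the number of clock periods between the two pulses
             o_ticks <= o_ticks + 1;
     endmodule
     ```

     A 32-bit counter at 100MHz wraps after roughly 42.9 seconds, so widen CW for anything longer. At 20ppm, the oscillator error alone comes to about 20us per second measured.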
  22. @zygot, One of the things you may be missing about that thread is that the OP is a senior engineer at Xilinx. Perhaps that might adjust your sense of how you read it. Personally, I view the conversation as a very valuable one, and I'm glad to see it taking place. Dan
  23. @kinshuks, Out of curiosity, when you say it "gets stuck" ... how long are you waiting? 200us? Shorter? Longer? Dan
  24. @zygot, Yes, I have read your message. I'm not sure I'm ready to respond to it, and so I have not. Let's just say I'm still mulling it over. Go for it. Perhaps the discussion that follows (if any) will help me form an opinion one way or another. Dan
  25. @asmi I'm finding about one complaint in Xilinx's forums roughly every week to two weeks. The complaint is typically from a user whose design is locking up for what appear to be completely inexplicable reasons. Digging further typically reveals that they are using one of Xilinx's demo designs--either the AXI-lite or the AXI (full) one. Whether or not you run into one of these problems really depends on how you configure the interconnect. If you configure it for "area optimization", you won't be likely to see the bug. Another key criterion has to do with how the bus is driven. One user experienced lock ups when issuing an unaligned access from a MicroBlaze CPU. (Help! The FPGA engineer left the company, and I'm just the S/W guy but ...) Others have run into problems when trying to interact with one of these designs using a DMA of some sort. Another recent issue had to do with connecting a 64-bit AXI bus to a 32-bit Xilinx generated demo peripheral. It seems a key to triggering either bug would therefore be two accesses in close succession--much as I outlined in my write-ups. As you can see, whether or not the bug gets triggered is highly dependent upon the use case. Worse, when attempting to apply hypothesis testing to find where to look for the bugs (if I change A, the design fails, therefore the bug is in A somewhere), you'll often get sent looking for the bug in the wrong part of the design. Dan