sfallekgt Posted July 30, 2024

Hi, I'm new to the forum - thanks in advance for the help.

I'm curious about the AXI interface for the Zmod ADC 1410. The documentation states that a 16 kSample circular buffer stores data from the ADC and then writes it to memory. Is it true that the data cannot be sent to DDR memory until the entire buffer is full? If so, is there a way to reduce the buffer size to provide smaller chunks of data to the PS at a faster rate?

Also, I'm curious whether the latency for this process (ADC capture + buffer storage + AXI transfer to DDR + DMA access by PS) is known.
artvvb Posted July 30, 2024

Hi @sfallekgt

2 hours ago, sfallekgt said:
"Is it true that the data cannot be sent to DDR memory until the entire buffer is full? If so, is there a way to reduce the buffer size to provide smaller chunks of data to the PS at a faster rate?"

Yeah, the IP as-is (without modifications) can't forward data to DMA until the buffer is full. Alternate hardware designs that can stand in for the AXI IP could handle it though. For throughput, it's most important to use the full 64 bits of the Zynq PS HP slave interfaces, by increasing the width of the DMA's AXI master interface, and to increase the frequencies of the clocks the DMA runs at. Consider checking out this under-construction demo, which dumps data into a DMA configured for scatter-gather at full speed (the sample rate claimed is 40 MS/s by default but can be increased to 100 MS/s by changing ADC clock frequencies). For what it's worth, it's marked under construction primarily due to incomplete documentation and some general messiness on the AWG side.

2 hours ago, sfallekgt said:
"Also, I'm curious whether the latency for this process (ADC capture + buffer storage + AXI transfer to DDR + DMA access by PS) is known."

We haven't measured this. ADC capture and buffer storage both run very quickly: at one sample per channel per clock cycle, the time it takes is effectively the time it takes to capture however long of a buffer you want, as it's all happening in PL and BRAMs. The AXI transfers managed by the DMA and the AXI-Lite transfers the PS uses to configure IPs in the PL all introduce some overhead - the former can be reduced by the aforementioned DMA throughput improvements. Latency through the data path from FPGA I/Os to the DMA is probably just a couple of clock cycles.

Hope this helps,
Arthur
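To put rough numbers on the buffer fill time and the bus width advice above, here is a minimal back-of-the-envelope sketch in C. It is not taken from the demo. The assumptions are: "16 kSample" means 16384 samples, each sample occupies 4 bytes (two channels packed into one 32-bit word), and the DMA's AXI master is clocked at 100 MHz. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Time to fill a capture buffer, in nanoseconds, at one sample per clock. */
uint64_t buffer_fill_time_ns(uint64_t samples, uint64_t sample_rate_hz)
{
    assert(sample_rate_hz > 0);
    return samples * 1000000000ULL / sample_rate_hz;
}

/* Sustained bandwidth the DMA must absorb, in bytes per second.
 * bytes_per_sample is an assumption (4 = two channels in a 32-bit word). */
uint64_t stream_bytes_per_sec(uint64_t sample_rate_hz, uint64_t bytes_per_sample)
{
    return sample_rate_hz * bytes_per_sample;
}

/* Theoretical peak of an AXI data bus: one beat of width_bits per clock.
 * Real throughput is lower because of burst and arbitration overhead. */
uint64_t axi_peak_bytes_per_sec(uint64_t width_bits, uint64_t clock_hz)
{
    assert(width_bits % 8 == 0);
    return (width_bits / 8) * clock_hz;
}
```

Under these assumptions, filling 16384 samples takes about 163.84 us at 100 MS/s and 409.6 us at 40 MS/s, so the full-buffer requirement alone sets a floor on latency. A 100 MS/s stream at 4 bytes per sample needs 400 MB/s, which equals the theoretical peak of a 32-bit bus at 100 MHz and leaves no headroom; a 64-bit bus at the same clock doubles the peak to 800 MB/s.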
sfallekgt (Author) Posted October 22, 2024

Hi Arthur,

Thanks for your help here. I'm now digging into the ddr-streaming branch a bit more. I'm trying to generate a minimal example of sending multiple ADC captures over the serial comms link. For now, large delays are possible between the captures, but ultimately I would like to stream large contiguous blocks of samples.

I've been working from the s2mm_transfer software example, which I've rewritten to take captures sequentially. After the first capture, the device fails while waiting for the manual trigger (which I've issued from ManualTriggerIssueTrigger). Are there any particular commands/resets that need to be executed between captures? Before a capture, I enable the trigger (TriggerSetEnable), invalidate the cache (Xil_DCacheInvalidateRange), start the trigger (TriggerStart) and reissue the manual trigger.

Possibly related - I noticed S2mmCleanup is non-functional.

Thanks again.
artvvb Posted November 21, 2024

Hey @sfallekgt

Really sorry for the long turnaround, I've only just gotten back to this, partly due to switching computers and only having a 2024.1 install to work with so far. I upgraded the project so I could take a look with the new version and published a prerelease for 2024.1 for this project on GitHub today (note it requires Vitis Classic, not the reworked UI): Release DDR Streaming Demo · Digilent/Eclypse-Z7.

From what I've observed while doing some basic testing, it seems that whatever was causing the error in S2mmCleanup may have been resolved sometime since the original 2021 release. The modified main source should have the device do two level-triggered acquisitions in a row, though there is some stuff that doesn't need to be re-performed that still is (like reading and applying the calibration coefficients).

The DMA does need a full software reset and the scatter gather block descriptors need to be recreated between acquisitions, as I hadn't figured out how to try to fully recover the descriptors from hardware, what with the slightly weird way they're getting used to do an infinite looping transfer until the input stream stops. For the non-DMA custom hardware, there's nothing really special command-wise that needs to happen. The trigger detector needs to be rearmed, but that already happens anyway just on re-calling the acquisition function.

Note, I have not yet taken a look to see how quickly it can be rearmed, and needing to use malloc between captures is one point of concern.

Have you made progress in parallel?

Thanks,
Arthur
sfallekgt (Author) Posted November 24, 2024

Hi Arthur,

No worries, thanks for getting back to me and updating to 2024.1. I'll give that version a shot as well. I did make some progress in parallel; it turns out my issue was unrelated (I wasn't acquiring samples in multiples of 1024).

I'm curious about your comment: "The DMA does need a full software reset and the scatter gather block descriptors need to be recreated between acquisitions, as I hadn't figured out how to try to fully recover the descriptors from hardware, what with the slightly weird way they're getting used to do an infinite looping transfer until the input stream stops."

What specific functions do you call to recreate the block descriptors and reset the DMA? Why does the input stream need to stop? My code didn't seem to require anything but rearming the trigger (TriggerStart and ManualTriggerIssue) and flushing cache (Xil_DCacheInvalidateRange). I'll dig in more to make sure the data on the second acquisition is not corrupted.
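Since the failure above came down to requesting a sample count that was not a multiple of 1024, a small guard can catch this before a capture is armed. This is a minimal sketch, not code from the demo; the 1024 granularity comes from the post above, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SAMPLE_BLOCK 1024u /* capture granularity reported in the post above */

/* Round a requested sample count up to the next multiple of SAMPLE_BLOCK. */
uint32_t round_up_samples(uint32_t requested)
{
    /* Guard against overflow in the addition below. */
    assert(requested <= UINT32_MAX - (SAMPLE_BLOCK - 1u));
    return (requested + SAMPLE_BLOCK - 1u) / SAMPLE_BLOCK * SAMPLE_BLOCK;
}

/* Nonzero if n is a usable capture length (positive multiple of SAMPLE_BLOCK). */
int is_valid_sample_count(uint32_t n)
{
    return n != 0 && n % SAMPLE_BLOCK == 0;
}
```

For example, a request for 1000 samples would be rounded up to 1024, and a request for 1025 to 2048; the extra samples can be discarded in post-processing.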
artvvb Posted November 26, 2024

Hey,

The system is based around a scatter gather DMA engine where one acquisition (an arbitrary amount of input data until a defined time after a trigger) is treated as one long packet. The DMA IP automatically pulls from a set of "block descriptors", effectively a linked list of structures defining regions of memory to write data into, where each block is transferred while the hardware is queueing up the next few. The trigger detector hardware adds a "tlast" beat to the incoming data stream and stops passing new incoming data to the DMA. Software is able to find this tlast beat by inspecting "processed block descriptors", which helps it determine where in the overall buffer continuous data starts and stops - this happens in S2mmFindStartOfBuffer. There's some commentary on all this in the comment header in xaxidma.h, starting around line 64.

The reason for a lot of this is that it's difficult to both tell where the last piece of data transferred by the DMA is in memory and to avoid overwriting relevant data from the start of the acquisition. This is done here by canceling the last AXI burst of the acquisition partway through by asserting the tlast signal, which causes the DMA to send the last burst in the corresponding block early, and to write two fields in the block descriptors - one that says "tlast was asserted", and one that says how many bytes were actually included in the block (as opposed to the total requested). If data is allowed to keep flowing into the DMA, data that we wanted to acquire will begin to be overwritten in the circular buffer.

So the reset loop I'd been assuming would be to detect the trigger, use S2mmFindStartOfBuffer to get the start of data, do any post processing and transfer of the data, then use S2mmCleanup and S2mmAttachBuffer to reset that linked list of descriptors, and then rearm.
On 11/24/2024 at 3:27 PM, sfallekgt said:
"My code didn't seem to require anything but rearming the trigger (TriggerStart and ManualTriggerIssue) and flushing cache (Xil_DCacheInvalidateRange). I'll dig in more to make sure the data on the second acquisition is not corrupted."

This is interesting, and from what I recall, I haven't tried it to see what happens... I think I might have been running on an incorrect assumption from that xaxidma.h info - that the concept of "processed", "completed", etc. block descriptors is meaningful in hardware instead of just in the driver. If that's just a concept for the software driver, then we might be setting up the DMA to run infinitely, and the descriptor ring doesn't actually need to be swapped between acquisitions - you'd still need to post process and transfer data before rearming though, to avoid overwriting data. I also suspect that trying to switch out block descriptor rings to allow data to be continuously transferred into another memory buffer would still require a full reset of the DMA...

This all also might be horrifically overcomplicated compared to clocking a DMA using the SimpleTransfer API (and a FIFO) suitably faster than the input stream and using interrupts to issue new transfers when previous ones are completed.

I'm out of office most of this week but will try to find some time to dig in further.
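The descriptor scan described in the post above (find the block where tlast landed, read how many bytes were actually written, and derive where contiguous data begins) can be modeled in plain C. This is only a sketch of the idea, not the demo's S2mmFindStartOfBuffer and not the XAxiDma descriptor layout: `block_status_t` and `find_start_offset` are hypothetical, and the model assumes the ring has already wrapped at least once, so the oldest sample sits immediately after the newest one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-block status, standing in for the fields a processed
 * block descriptor reports. This is not the XAxiDma BD layout. */
typedef struct {
    uint32_t requested_bytes; /* size of the memory region the block covers */
    uint32_t actual_bytes;    /* bytes the DMA actually wrote into it */
    int      tlast_seen;      /* nonzero if the stream's tlast landed here */
} block_status_t;

/* Returns the byte offset of the oldest sample in the circular buffer,
 * or -1 if no block recorded tlast. Assumes blocks are contiguous in
 * memory in ring order and that the ring has wrapped at least once. */
int64_t find_start_offset(const block_status_t *blocks, size_t count)
{
    uint64_t total = 0;
    uint64_t base = 0;

    for (size_t i = 0; i < count; i++)
        total += blocks[i].requested_bytes;

    for (size_t i = 0; i < count; i++) {
        if (blocks[i].tlast_seen) {
            assert(total > 0);
            assert(blocks[i].actual_bytes <= blocks[i].requested_bytes);
            /* One past the newest byte written before the stream stopped. */
            uint64_t end = base + blocks[i].actual_bytes;
            /* In a wrapped ring, the oldest data starts right after it. */
            return (int64_t)(end % total);
        }
        base += blocks[i].requested_bytes;
    }
    return -1;
}
```

With four 4096-byte blocks and tlast recorded in the third block after 1000 bytes, the newest byte ends at offset 9191, so contiguous data would be read starting at offset 9192 and wrapping around the end of the buffer.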
artvvb Posted December 3, 2024

On 11/24/2024 at 3:27 PM, sfallekgt said:
"My code didn't seem to require anything but rearming the trigger (TriggerStart and ManualTriggerIssue) and flushing cache (Xil_DCacheInvalidateRange). I'll dig in more to make sure the data on the second acquisition is not corrupted."

It looks like this is working well for me. Uploaded a new release today: Release DDR Streaming Demo · Digilent/Eclypse-Z7.

With some Python post-processing and some work to manually copy data from the serial console over into a file (easy enough to do in Python, just haven't gotten there yet), I'm getting some nice-looking FFTs of captures of subsections of a repeating 1-10 MHz frequency sweep applied to scope channel 1.

Taking a look at the raw samples in Excel also shows the trigger condition coming in at the expected place in memory - this is with a "rising past 0.0 V" condition, with 100 samples prebuffered before the trigger.

Thanks again for the tip!
sfallekgt (Author) Posted December 3, 2024

Great, so if I'm understanding correctly, those are four snapshots taken in succession with the minimal (re)setup between them? I'm going to move on to your other suggestion:

On 11/26/2024 at 1:11 PM, artvvb said:
"overcomplicated compared to clocking a DMA using the SimpleTransfer API (and a FIFO) suitably faster than the input stream and using interrupts to issue new transfers when previous ones are completed."

My ultimate goal is to enable some form of continuous streaming over Ethernet - handling the ADC with a ping-pong buffer or similar. Any tips are appreciated, thanks again.
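The ping-pong idea mentioned above can be sketched as a small piece of bookkeeping: the DMA fills one buffer while software drains the other, and each transfer-complete interrupt swaps the roles. The code below is a hypothetical host-side model, not part of the demo or of the XAxiDma driver. It leaves out the actual transfer calls, cache maintenance, and the volatile or atomic handling a real interrupt handler would need.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for two alternating DMA buffers. */
typedef struct {
    bool     ready[2];   /* buffer holds completed data not yet consumed */
    int      dma_target; /* buffer the current transfer writes into */
    unsigned overruns;   /* completions that found the other buffer unread */
} pingpong_t;

void pingpong_init(pingpong_t *p)
{
    p->ready[0] = false;
    p->ready[1] = false;
    p->dma_target = 0;
    p->overruns = 0;
}

/* Call when a transfer completes. Marks the filled buffer as ready and
 * returns the index of the buffer the next transfer should write into. */
int pingpong_on_transfer_done(pingpong_t *p)
{
    int done = p->dma_target;
    int next = 1 - done;

    p->ready[done] = true;
    if (p->ready[next]) {
        /* Consumer fell behind: the next transfer overwrites unread data. */
        p->overruns++;
        p->ready[next] = false;
    }
    p->dma_target = next;
    return next;
}

/* Returns the index of a filled buffer, or -1 if none is ready. */
int pingpong_take(const pingpong_t *p)
{
    for (int i = 0; i < 2; i++)
        if (p->ready[i])
            return i;
    return -1;
}

/* Call after the consumer has finished sending or copying buffer idx. */
void pingpong_release(pingpong_t *p, int idx)
{
    assert(idx == 0 || idx == 1);
    p->ready[idx] = false;
}
```

The overrun counter is the useful part of the design: if it ever increments, the consumer (for example, the network send path) is slower than the ADC stream, and either the data rate must drop or more buffering is needed.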
artvvb Posted December 3, 2024

5 hours ago, sfallekgt said:
"Great, so if I'm understanding correctly, those are four snapshots taken in succession with the minimal (re)setup between them?"

Yes, rearming the trigger and acquiring new data in a loop: https://github.com/Digilent/Eclypse-Z7-SW/blob/1650281f9aab213e538dd87ba9adc347b29c151c/src/s2mm_cyclic_transfer_test/src/main.c#L260. They reuse the same memory for the buffer, so data needs to be processed and transferred off the device or copied elsewhere before rearming.

6 hours ago, sfallekgt said:
"My ultimate goal is to enable some form of continuous streaming over Ethernet - handling the ADC with a ping-pong buffer or similar. Any tips are appreciated, thanks again."

Fairly broad, but anything you can do to reduce the amount of data getting transferred will help, ideally in PL. I haven't gone all the way to combining DMA with Ethernet before, personally, but, if you haven't seen it before, there's a tutorial on the site for running existing Xilinx TCP examples that could at least get you a basic connection: https://digilent.com/reference/programmable-logic/guides/zynq-servers. I'm not sure what kind of maximum throughput lwIP can get to.
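To make "reduce the amount of data" concrete, the sketch below estimates the smallest integer decimation factor that would fit the ADC stream into a given link budget. It is an illustration only. The numbers in the note that follows assume 4 bytes per sample and a gigabit link at its raw 125 MB/s line rate; achievable lwIP throughput was not measured in this thread and will be lower. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest integer decimation factor that brings the stream rate down to
 * at most link_bytes_per_sec. Returns 1 if no decimation is needed. */
uint32_t min_decimation(uint64_t stream_bytes_per_sec, uint64_t link_bytes_per_sec)
{
    assert(link_bytes_per_sec > 0);
    /* Ceiling division: round the ratio up to the next whole factor. */
    uint64_t factor =
        (stream_bytes_per_sec + link_bytes_per_sec - 1) / link_bytes_per_sec;
    return factor == 0 ? 1u : (uint32_t)factor;
}
```

Under these assumptions, a 100 MS/s capture (400 MB/s) would need at least 4x decimation to fit a 125 MB/s budget, and the 40 MS/s default (160 MB/s) would need at least 2x. Protocol overhead and CPU load would push the practical factor higher.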