How to use and manage memory with my FPGA?

GalD101 · September 6, 2023

1 hour ago, reddish said:

we can stop

why stop?

GalD101 · September 6, 2023

Unfortunately I don't have a definitive answer as to how many coincidences we expect to capture in one second, but I'll try to figure it out by asking some folks here. For now all I know is that the max is 20M, so anything above that would be redundant.

Edited September 6, 2023 by GalD101

September 6, 2023

>> "The single photon detector dead time is about 50ns therefore we do not need to measure faster than 20MHz. The system should be ready to count the coincidence rate as high as 20MHz."

That's three orders of magnitude beyond what's reasonably possible with the little Spartan-7 board. The bottleneck is the rate at which you would need to offload data to a PC for analysis.

If you spent 15 kUSD on a highly optimized timetagger with USB3, then maybe you could keep up with this.

Sure, 20e6 coincidences/sec is a nice and simple upper limit on what would ever be possible to see with two of these detectors, but another way to approach it would be this: we can build a device that's capable of reporting up to 17e3 coincidences per second. Can you design your experiment around that? For example, attenuate the laser down to a lower level where the number of expected coincidences is 15e3, and still get useful statistics?

A more thorough effort will be needed to answer this "inverse" question.

If the answer is no, we can stop right here and now and report back "sorry, can't be done" to your professor.

Note also that it would be a very bad idea from a physics PoV to do an experiment that is so close to the saturation limit of the detector. Your statistics would be super hard to get right, as you would have to correct for the bunching-up effect (see here).

In our lab, we never use our SPDs beyond several hundred thousand counts per second (for this reason and others).
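
To put a rough number on the saturation concern, here is a minimal sketch. It assumes the simple non-paralyzable dead-time model, in which the true rate n follows from the measured rate m and the dead time tau as n = m / (1 - m * tau). That model is an assumption; the thread does not say which model fits these detectors, and the function name and sample rates are purely illustrative.

```python
DEAD_TIME_S = 50e-9  # detector dead time quoted earlier in the thread


def true_rate(measured_rate_hz, dead_time_s=DEAD_TIME_S):
    """Estimate the true event rate, assuming a non-paralyzable detector."""
    busy_fraction = measured_rate_hz * dead_time_s  # fraction of time the detector is blind
    if busy_fraction >= 1.0:
        raise ValueError("measured rate is at or beyond the saturation limit")
    return measured_rate_hz / (1.0 - busy_fraction)


# Illustrative measured rates: a few hundred kHz versus rates close to saturation.
for measured in (3e5, 5e6, 15e6):
    factor = true_rate(measured) / measured
    print(f"{measured:>12,.0f} counts/s measured -> correction factor {factor:.3f}")
```

Under this assumed model the correction is about 1.5% at 300,000 counts per second, but grows to a factor of 4 at 15e6 counts per second, which illustrates why operating close to the 20 MHz limit makes the statistics hard to get right.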

Edited September 6, 2023 by reddish

GalD101 · September 6, 2023

He said we can do what you suggested here

9 minutes ago, reddish said:

attenuate the laser down to a lower level where the number of expected coincidences is 15e3, and still get useful statistics.

September 6, 2023

It's a bit silly that we only now realize this is an issue. I did ask the question very early on, but got no answer at the time. Stupid of me not to insist back then.

September 6, 2023

> He said we can do what you suggested here

Cool. But before we proceed, I would like to see the calculation that supports this.

It's a long project still (we're at ~ 30% of the way there I estimate), and if we find out in the end that this won't work after all (because it would take a prohibitively long time to gather enough statistics) it's all been for naught. A proper upfront analysis on how you're going to process the data is prudent at this point in time.

September 6, 2023

While waiting for the calculations (let's assume for the time being that it turns out we can indeed do useful stuff with tens of thousands of photons or photon coincidences per second), let's talk about the serial protocol. Once we've covered all that, you will also be able to understand and verify my back-of-the-envelope calculations about the number of events we'll be able to push from the FPGA to the PC. (I won't be answering those questions explicitly; you should be able to follow that once I've explained the serial protocol stuff.)

The first thing to realize is where the serial protocol is actually spoken. Here's a picture:

It is important to realize that the serial protocol we'll be implementing is spoken between the FPGA and the FTDI chip on the CMOD-S7 PCB. Settings like baud rate, parity, and number of stop bits apply only to this very short (~ 1 cm) link. So the serial protocol is very much not spoken between the FTDI chip and the PC. All that communication is done over the (complicated) USB-2 protocol. We won't have to worry about that, fortunately -- that's all taken care of by the FTDI chip and the USB bus controller and the software driver on the PC.

So when you're setting the baudrate (in a terminal program on the PC, or using a Python program), this will actually send a configuration packet from the PC to the FTDI chip, telling the chip: "from now on, communicate at 115,200 baud!", and the FTDI chip will use that setting to communicate to the Spartan-7 chip.

The reason I drew this picture and emphasize where the serial communication happens is that it used to confuse me to no end when I was a beginner. "What the hell do I need to set a baud rate for?", I used to think, "Isn't that taken care of by USB?". Well, all that low-level stuff is taken care of on the USB side, yes; but the serial settings apply to the very short stretch from the FTDI chip to the Spartan-7. It was quite an Aha moment once I realized that.

What I drew in the picture is the S7_TX line, that is used to communicate from the Spartan chip to the FTDI chip (and, onward, via USB to the PC). There is also a line in the other direction, that allows the PC to talk, via USB and the FTDI chip, to the Spartan 7. However, in our communications, we will not use that. The Spartan-7 will just be sending data on its S7_TX line; it will not listen to data coming from the other side (on the S7_RX line).

Speaking of TX and RX: those are the generally accepted shorthand for Transmit and Receive. The problem is that these terms are ambiguous. Sure, the S7 acts as a TX on the line we'll be using, but the FTDI chip will act (on behalf of the PC) as an RX on that same line. This has led to untold confusion. It is now my standard policy to not put any stock in labels like TX and RX when I see them in schematics, and just try to get it to work; only then will I know which pin is used for sending, and which pin is used for receiving. It is also the reason why I like a label like "S7_TX", because it indicates that this is the line on which the Spartan acts as the transmitter.

Anyway, then, to the protocol itself.

The canonical serial protocol is described in the RS-232 standard. Problem is that RS-232 is a difficult standard. It has so many signals that the original RS-232 connector had 25 pins. Signals included handshaking, "ring indicator", and so on and so on. Later on, a 9-pin connector became common, but even that is quite complicated. In the case of our little Spartan-7 and the FTDI chip, only three lines are used: one of them being common ground, one of them carrying data from the FTDI chip to the Spartan, and one of them carrying data from the Spartan to the FTDI. It's the latter signal that we'll be using (calling it S7_TX) to push out data from the Spartan to the FTDI (and onward, to the PC).

The two directions (outgoing and incoming, as seen from the Spartan), are completely independent. It is perfectly possible for a device to just send data to the other side, but not listen to what's coming back. That will be precisely the mode in which we'll be using it.

So what, then should the Spartan put on its outgoing S7_TX pin to communicate?

For starters, there's the situation where the Spartan has nothing to send. In that case, it will set a "1" on its signal towards the FTDI chip. The FTDI chip understands this as, "okay, there is nothing to receive".

Whenever the Spartan has data to send, it sends it bit by bit. Each bit has a duration of 1/baudrate; for example, at 115200 baud, each bit lasts about 8.68 microseconds. All bits are sent with this identical bit duration.
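
As a quick check of that number, a small sketch of the bit-duration arithmetic. The function name is made up for illustration, and 2400 baud is included only as a second example rate.

```python
def bit_duration_us(baud_rate):
    """Duration of one bit on the serial line, in microseconds (1 / baud rate)."""
    return 1e6 / baud_rate


for baud in (2400, 115200):
    print(f"{baud:>7} baud -> {bit_duration_us(baud):8.3f} us per bit")
```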

Here's what the Spartan-7 will do:

- It will start by sending a single bit with the value 0, the "start bit". This alerts the other side: "Take notice!! Data is coming!"
- It will then send a number of data bits. The most common choice is 8 data bits (sending a single byte), but it can in fact be any number from 5 to 8, at least. If you're only sending ASCII, 7 bits will suffice, and we'll be using that fact later to optimize the efficiency of our communications. An important fact is that the data bits are sent least significant bit first, most significant bit last.
- After that, there's an optional parity bit. This bit conveys whether the count of 1-bits in the data bits was even or odd. Both "odd" and "even" parity are possible, but it is also possible to omit this single parity bit, in which case we say that the link uses "no parity".
- After that, the link goes high again, back to its default "silent" state. The sender and receiver agree on a minimum duration for this quiet period; for example 1 or 2 bit durations (or even, rarely, 1.5 bit durations). This setting is called the number of stop bits.
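
The four steps above can be sketched as a small Python function that builds the sequence of line levels for one frame. This is only a model for thinking about the protocol, not FPGA code; the function name `uart_frame` and its parameters are invented for this illustration, and only whole numbers of stop bits are modelled.

```python
def uart_frame(value, data_bits=8, parity=None, stop_bits=1):
    """Return the line levels (list of 0/1) for one serial frame.

    parity is None, "even" or "odd".
    """
    if not 5 <= data_bits <= 8:
        raise ValueError("this sketch models 5 to 8 data bits")
    if not 0 <= value < (1 << data_bits):
        raise ValueError("value does not fit in the data bits")

    data = [(value >> i) & 1 for i in range(data_bits)]  # least significant bit first
    frame = [0] + data                                   # start bit, then data bits

    if parity == "even":
        frame.append(sum(data) % 2)        # total number of 1s becomes even
    elif parity == "odd":
        frame.append(1 - sum(data) % 2)    # total number of 1s becomes odd
    elif parity is not None:
        raise ValueError("parity must be None, 'even' or 'odd'")

    frame += [1] * stop_bits               # line returns to its idle level
    return frame


# Example: the letter K with 8 data bits, no parity, 1 stop bit.
print("".join(str(bit) for bit in uart_frame(ord("K"))))
```

Between frames the line simply stays at 1, so an idle link and a stop bit look identical to the receiver.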

And that's it! That's the entire serial protocol, at least, as spoken between the Spartan and the FTDI chip.

As seen above, the sender and receiver need to agree on some things to use this protocol. For example:

"2400 baud, 7 data bits, even parity, 2 stop bits" or, for short: 2400 baud, 7E2.
"115200 baud, 8 data bits, no parity, one stop bit", or, for short: 115200 baud, 8N1.

It's the latter that we will use for our initial tests of the Spartan-7 to PC communication.

Let's do an example. Suppose the Spartan wants to send a single capital A character to the PC. That's ASCII code 65, or, in binary: 01000001. To do that, it will send the following signal:

...1111111111111111111111111111010000010111111111111...

Here, the initial and final 1s indicate that the serial link is quiet (no data is being sent). Then a start bit (0) comes, followed by 8 bits 10000010 (the A character, lsb first), followed by at least a single stop bit 1.
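
One way to convince yourself of this is to decode the bit pattern the way a receiver would. The sketch below assumes 8N1 and one character of the string per bit period; `decode_8n1` is a made-up helper, not part of any library.

```python
def decode_8n1(levels):
    """Decode a string of per-bit line levels ('0'/'1') into bytes, assuming 8N1."""
    out = bytearray()
    i = 0
    while i + 10 <= len(levels):
        if levels[i] == "1":           # idle line: keep looking for a start bit
            i += 1
            continue
        data = levels[i + 1:i + 9]     # 8 data bits, least significant bit first
        if levels[i + 9] != "1":       # the stop bit must be high
            raise ValueError(f"framing error at bit {i}")
        out.append(int(data[::-1], 2)) # reverse to get most significant bit first
        i += 10
    return bytes(out)


signal = "1111111111111111111111111111010000010111111111111"
print(decode_8n1(signal))
```

Running this on the signal shown above yields the single byte `b'A'`.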

To send 2 A's, the Spartan would do this:

...11111111111111111111111111110100000101010000010111111111111...

Directly after the first A's stop bit, it can proceed with the start bit of the second A. In fact, repeating this 10-bit pattern (1 start bit, 8 data bits, 1 stop bit) sends an infinite stream of A's to the FTDI chip and onwards to the PC. Each 10-bit pattern transmits a single byte, so, at 115200 baud, 8N1, we can transmit 11520 bytes per second.
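
The same arithmetic works for any link setting. A minimal sketch, with a made-up function name, that assumes frames are sent back to back with no idle time in between:

```python
def bytes_per_second(baud_rate, data_bits=8, parity_bits=0, stop_bits=1):
    """Maximum characters per second when frames follow each other without gaps."""
    bits_per_frame = 1 + data_bits + parity_bits + stop_bits  # the 1 is the start bit
    return baud_rate / bits_per_frame


print(bytes_per_second(115200))                                         # 8N1: 10 bits per character
print(bytes_per_second(2400, data_bits=7, parity_bits=1, stop_bits=2))  # 7E2: 11 bits per character
```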

Now let's do something funny. Let's see what happens when we send a U character. ASCII code is 85, which in binary is 01010101.

To send this single 'U', the Spartan must send the following 10 bits:

0101010101

And if we want to send many U's right after each other, the Spartan should send:

...01010101010101010101010101010101010101010101010101...

Which is just a square wave signal, with a frequency of half the baudrate.
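
A short sketch to check this claim, assuming 8N1 framing. The helper `frame_8n1` is invented for the illustration.

```python
def frame_8n1(byte):
    """Line levels for one 8N1 frame: start bit, 8 data bits (LSB first), stop bit."""
    data = "".join(str((byte >> i) & 1) for i in range(8))
    return "0" + data + "1"


stream = frame_8n1(ord("U")) * 4  # four U's sent back to back

# A square wave toggles on every bit period, so each level differs from the previous one.
alternates = all(a != b for a, b in zip(stream, stream[1:]))

baud_rate = 115200
square_wave_hz = baud_rate / 2    # one full period (a 0 and a 1) takes two bit periods

print(stream)
print(alternates, square_wave_hz)
```

The alternation also holds across frame boundaries, because each frame ends on a 1 (the stop bit) and the next one begins with a 0 (the start bit).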

Now where did we see that before?

Okay. So we established that sending 0101010101010101010101010101010101010101010101... at 57.6 kHz will send a never-ending sequence of U characters from the Spartan-7 to the FTDI chip, which will faithfully send that over to the PC.

Now let's adapt our XDC file (and our program) to send the 57.6 kHz signal we made earlier not only to the PIO pin, but also to the S7_TX pin!

And now the magic happens:

[Image: image.png]

Leading to:

[Image: image.png]

Hurray!

@GalD101 -- your tasks for now:

- understand all this (will be useful to read Wikipedia)

- reproduce this on your S7 board

- show me a screen full of U's!

The task hereafter: send an infinite chain of A's instead of U's.

For the latter task, I would want you to do it on your own, to the maximum extent. See it as a sort of half-term exam. You will need to adapt the Finite State Machine module quite drastically, but everything you need has been said so far. It can take you a day or two or three, but that's okay; you will make all kinds of mistakes, and that's okay too -- in fact, I strongly feel that's necessary for the learning process.

One hint is that I would start with a Python pseudocode script to define what your "next_state" function should do. Much easier to think in Python than in Verilog. After that, adapt the FSM we currently have to reflect the newly desired behavior. Also: you will want to add some more state to the state_type, to hold the byte to send.

[Image: image.png]

(Only if you really really REALLY get stuck, ask me for assistance.)

Edited September 6, 2023 by reddish

September 6, 2023

For future reference (not needed for the task currently in front of you), here's a picture I made of the FTDI chip I found on my CMOD-A7 board, sitting right next to the USB connector. It's the same chip you have on your CMOD-S7 board.

The vendor data for this chip can be found here: https://ftdichip.com/products/ft2232hq/, if you're ever interested.

And, lo and behold, it should even support up to 12 Mbaud. Not too shabby for a sub-100 dollar device. At 9 bit periods per ASCII character (using the serial protocol in 7N1 mode: 1 start bit, 7 data bits, 1 stop bit), that's about 1.33e6 characters per second. It's not Ethernet (at 100 Mbit/s or 1 Gbit/s), but it's only one and two orders of magnitude below that, respectively. Pretty cool.

In the end, it's about 50,000 messages per second we should be able to offload.
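
For reference, a sketch of how these two numbers relate. The message format is not specified in this post, so the characters-per-message figure below is only what the two quoted numbers imply if both refer to the same 12 Mbaud, 7N1 link; that is an assumption.

```python
BAUD_RATE = 12_000_000        # maximum rate quoted for the FTDI chip
BITS_PER_CHAR = 1 + 7 + 1     # 7N1: start bit, 7 data bits, stop bit
MESSAGES_PER_SECOND = 50_000  # estimate quoted in the post

chars_per_second = BAUD_RATE / BITS_PER_CHAR
chars_per_message = chars_per_second / MESSAGES_PER_SECOND  # implied budget, not a stated spec

print(f"{chars_per_second:,.0f} characters per second")
print(f"about {chars_per_message:.1f} characters available per message")
```

Under that assumption, each message could be roughly 26 ASCII characters long, including any separator or line ending.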

The caveat is that this will only work reliably on Linux, as per past experience. No chance of getting those kinds of UART speeds working on Windows, I'm sad to say. Those Windows device driver folks should really do better.

Edited September 6, 2023 by reddish

GalD101 · September 7, 2023

Well at least that's something (getting white space non stop instead of U).

Don't tell me why. I'll try to solve it myself.

Sorry I didn't answer, I just had a 6 hour lecture of Data structure. I will soon meet with my professor to discuss the experiment.

September 7, 2023

> I just had a 6 hour lecture of Data structure

Excellent. Perhaps we can find an excuse to implement a red-black tree in Verilog :)

GalD101 · September 7, 2023

He said it's as simple as pressing a button to generate less photons per seconds. Please excuse me if I misunderstood your question

22 hours ago, reddish said:

I would like to see the calculation that supports this

September 7, 2023

7 minutes ago, GalD101 said:

He said it's as simple as pressing a button to generate less photons per seconds. Please excuse me if I misunderstood your question

Of course, that's easy.

Here's the issue. We went from an initial statement "we need to be able to handle 20e6 events per second" to a statement "well 15e3 events per second will probably still do" over the course of three posts.

This is a bit annoying, as it suggests that my question "how many photons / coincidences per second do we need to be able to handle" isn't considered with the appropriate level of seriousness.

It is quite possible to make a statistical model of the experiment you're about to do with a simplified model of your detectors. In fact, you (or your professor) will have to do that anyway at some point, to be able to assess the results of the experiment you are implementing.

My suggestion is that you pull that aspect of the project forward, so we can answer questions like, "If we're able to measure X coincidences per second, our measurement time becomes H hours to reach a target level of confidence Y". It will also make it crystal clear what information you need to get out of the FPGA.

This will become especially important once we reach the stage where we can do event logging, and we need to decide if that suffices for your application, or if we need to make the (more complex) FPGA-side coincidence detection.

GalD101 · September 7, 2023

"Our preliminary estimation is that if we are able to measure 100 coincidences per second, our measurement time becomes ~4-5 hours to get a target level of confidence of 4 sigma. The question now is how many coincidences do we expect to have out of single events. This is a more tricky question that we are trying to estimate now. It seems that we need to target the count rate of 100 000 single events per seconds."

September 7, 2023

What does "100 000 single events per second" mean? What is counted as an event?

This is starting to feel a bit too much like work. There, I also often need to second-guess the people I work with because they don't think ahead far enough relative to what their project needs.

To keep this fun for me, I am going to stop worrying about the aspect of the usefulness of the artifact we're building. That is now the responsibility of you and your professor. In fact, it always was, it was just that I got aggravated because that responsibility was not / is not being properly taken care of, it seems.

What I can do is lead you to a design of a CMOD-S7-based device that can report on 30,000 to 50,000 events (== photon detections) per second; or a slightly more advanced device that can report perhaps 10,000 to 30,000 coincidences per second (== photon detections that are sufficiently close in time to conceivably be correlated).
It's up to you to decide if that's useful to invest a lot of time in. I'm perfectly fine with you running the numbers and, based on that, aborting the effort.

If we go on, we'll continue with refining the serial communication. So I'm hoping to see a lot of A's in the next few days.

GalD101 · September 8, 2023

That's cool, it actually works! (The problem was that the settings in the program were faulty.)
Next will be "AAAAA" and then the classic "Hello World".

Edited September 8, 2023 by GalD101

GalD101 · September 8, 2023

Missed by one bit! (off by one error again lol)
[Image: image.png]

GalD101 · September 8, 2023

Hurray!

That's the code. It's not very pretty, and I know I could use a plain "else" instead of an "else if", but I added that while debugging and it helped me find a few bugs.
Also, the off-by-one error was there because I hard-coded those numbers and made a mistake when copying them.

Yes, I know it's bad practice to leave numbers that seem random in a program instead of using localparams or simply writing them as multiples of 868 (while keeping 868 as a constant), but I am going to change it anyway so that it says "Hello World!".
I'll also add the small drawing I made. It's not very readable, but that was the way I thought about implementing it. I think there are cleverer ways to implement this, but this was my naive approach. Let me know if there are better ways to implement it (other than obviously making my "if" and "else if" conditions a bit cleaner).
Perhaps a better approach is to create an FSM for every letter, and give every instance a parameter for how many characters I want to display, thus creating another layer of abstraction? Is that type of approach good, or is it not practical?

finite_state_machine.sv

note: doing this instead of data structures hw because it's fun lol

Edited September 8, 2023 by GalD101

GalD101 · September 8, 2023

How should I create a "Hello World!" program? Making it using the same logic I applied previously for the "UUUU" or "AAAA" seems cumbersome and also pretty bad in terms of memory (using a lot of bits for the counter).
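
One possible direction, sketched here in Python in the spirit of the earlier hint to model `next_state` before writing Verilog: keep the message in a small lookup table and split the single large counter into three small nested counters (clock ticks within a bit, bits within a frame, characters within the message). This is only an illustration and not the approach confirmed in the thread. The names `State`, `tx_level` and `next_state` are hypothetical, and the value 868 is taken from the post above as the number of clock ticks per bit.

```python
from typing import NamedTuple

CLOCKS_PER_BIT = 868       # clock ticks per bit, the constant mentioned in the thread
MESSAGE = b"Hello World!"  # plays the role of a small ROM on the FPGA
BITS_PER_FRAME = 10        # 8N1: start bit + 8 data bits + stop bit


class State(NamedTuple):
    clock_count: int  # ticks spent in the current bit (0 .. CLOCKS_PER_BIT - 1)
    bit_index: int    # position within the frame (0 .. 9)
    char_index: int   # which message byte is being sent


def tx_level(state):
    """Output logic: the level to drive on the transmit pin in this state."""
    if state.bit_index == 0:
        return 0                                 # start bit
    if state.bit_index == BITS_PER_FRAME - 1:
        return 1                                 # stop bit
    byte = MESSAGE[state.char_index]
    return (byte >> (state.bit_index - 1)) & 1   # data bits, LSB first


def next_state(state):
    """Advance by one clock tick: three nested counters that wrap around."""
    if state.clock_count < CLOCKS_PER_BIT - 1:
        return state._replace(clock_count=state.clock_count + 1)
    if state.bit_index < BITS_PER_FRAME - 1:
        return State(0, state.bit_index + 1, state.char_index)
    return State(0, 0, (state.char_index + 1) % len(MESSAGE))


# Simulate one pass through the message, sampling the line in the middle of each bit.
state = State(0, 0, 0)
levels = []
for _ in range(CLOCKS_PER_BIT * BITS_PER_FRAME * len(MESSAGE)):
    if state.clock_count == CLOCKS_PER_BIT // 2:
        levels.append(tx_level(state))
    state = next_state(state)

print("".join(str(level) for level in levels[:BITS_PER_FRAME]))  # frame for 'H'
```

In this sketch the three counters need 10, 4 and 4 bits respectively, and changing the text only means changing the contents of `MESSAGE`, so there is no need for a separate FSM per letter.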

September 8, 2023

@GalD101

Well done on the A's.

I'm terminating my help now, sorry about that. See my PM.
