
DisplayPort implemented.



I've finally broken the back of DisplayPort, and have an 800x600 colour bar picture showing using my Nexys Video. Still a lot of work to go before I get UHD resolutions working and a nice generic interface, but all the low-level base technology stuff is working.

I started on August 12th, so it has taken over a solid month of hobby time to get this far!

Source is on Github at https://github.com/hamsternz/FPGA_DisplayPort and some notes are on my Wiki at http://hamsterworks.co.nz/mediawiki/index.php/DisplayPort

Currently it sits at about 9,000 lines of VHDL... but only needs 610 LUTs and 680 flip-flops.


  • 2 weeks later...

Had to come out of forum hibernation to comment on this feat... VERY COOL. This is the first open source DisplayPort controller I've seen. How feasible do you think it is to integrate your work into an "rgb2dp" IPI core, such as those found here: https://github.com/DigilentInc/vivado-library/tree/master/ip? I could probably find some people to help out...


It seems relatively easy. What is needed is something to do the following:

1. Count the pixel clock cycles to work out the video timings (front porch, back porch, sync width, visible width). This is needed for when the Main Stream Attributes are sent - Simple

2. Write the pixels into a 24-bit FIFO in the source clock domain, with a 48-bit read port (2 pixels at a time, as one gets routed to each DisplayPort channel). It doesn't have to be very deep (64 words at most). An extra couple of bits might be useful for signalling the start and end of active video lines - Pretty simple. The more I think about it, the more I think that rather than only writing pixel data, something should be written every cycle, to give the DP interface all the timing information it can (as it needs to count the pixel clock cycles). Initially this could be an IP core, but it would then need to be moved to primitives to keep the design completely open source

3. Pull the active pixel data with the correct cadence to assemble it into 64-word transmission units. This might have a few tricks, as all the TUs are supposed to carry almost the same number of data words (they are then padded so that all TUs are the same size, +/- 1 symbol, and made up to 64 cycles). It gets a little bit tricky as the pixels are appearing 24 bits at a time per channel, but need to be diced up and length-balanced on 8-bit boundaries for transmission, and we need to work at 135 MHz, processing up to two symbols per channel per cycle - "Here be dragons"

4. Keep track of the mysterious M and N values, which are sent to the sink so it can recover the original clock - Sounds simple in the DP spec - use 0x80000 for N, and count pixel clock cycles in a 19-bit counter.

5. There is a little bit of magic required in signalling the start of the horizontal blanking area while in the vertical blanking area. It has to be inserted at the correct time, as if pixels had been sent. This is most likely achieved by actually sending through dummy pixels from the source clock domain, and on the output side not sending the BE ("Blank End") code or the TU data. This would just leave the BS ("Blank Start") symbol in the output stream. This just got a lot easier as I typed this email.
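The clock-ratio arithmetic behind steps 3 and 4 can be sketched numerically. This is my own worked example (800x600 over a 2-lane 2.7 Gb/s link), not code from the repo, so treat the specific figures as illustrative:

```python
# Worked example (my own numbers, not from the repo): 800x600@60 with a
# 40 MHz pixel clock over a 2-lane 2.7 Gb/s link. After 8b/10b coding,
# each lane carries 270 million 8-bit symbols per second.

pixel_clock_hz = 40_000_000        # VESA 800x600@60 pixel clock
link_symbol_rate = 270_000_000     # 2.7 Gb/s / 10 line bits per symbol

# Step 4: M/N clock recovery. The source can fix N = 0x80000 and derive
# M from the stream/link clock ratio, so the sink recovers the pixel
# clock as (M / N) * link_symbol_rate.
N = 0x80000
M = round(pixel_clock_hz / link_symbol_rate * N)
recovered_hz = M / N * link_symbol_rate
assert abs(recovered_hz - pixel_clock_hz) < 1000   # within ~1 kHz

# Step 3: average fill of each 64-symbol Transfer Unit. With 24-bit
# pixels split across 2 lanes, each lane carries 1.5 bytes per pixel.
bytes_per_pixel_per_lane = 3 / 2
data_symbols_per_tu = 64 * (pixel_clock_hz * bytes_per_pixel_per_lane
                            / link_symbol_rate)
print(f"M = {M:#x}, data symbols per TU = {data_symbols_per_tu:.2f}")
```

The fractional TU fill (about 14.22 symbols here) is exactly why the TU packer has to alternate between two adjacent integer sizes, which is the "+/- 1 symbol" padding mentioned above.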

All up, that should give you a pipeline that can send any 4:4:4 video stream (RGB or YCC) with a pixel clock from 25 MHz up to 175 MHz (the limit for 2.7 Gb/s miniDP).
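As a sanity check on that 175 MHz ceiling, here is my own back-of-envelope arithmetic (assuming 2 lanes at 2.7 Gb/s with 8b/10b coding; not taken from the repo):

```python
# Back-of-envelope check (my own numbers) of the ~175 MHz pixel clock
# ceiling for 2 lanes at the 2.7 Gb/s (HBR) line rate.

lanes = 2
line_rate = 2.7e9                    # bits/s per lane on the wire
symbol_rate = line_rate / 10         # 8b/10b: 10 line bits per byte
payload_bytes_per_s = lanes * symbol_rate   # 540e6 bytes/s total

bytes_per_pixel = 3                  # 24-bit RGB/YCC 4:4:4
raw_pixel_limit = payload_bytes_per_s / bytes_per_pixel

print(f"raw limit = {raw_pixel_limit / 1e6:.0f} MHz")
# Control symbols (BS/BE, TU fill) eat a few percent of the link,
# which is why the practical ceiling is ~175 MHz rather than 180 MHz.
```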

I'm quite keen to do this, but at the moment I'm fighting with getting some Spartan-6 transceivers running for TimVideos (which has a 4x-wide DP port)... lots of documentation to read & testing, as I am avoiding using the Wizard. Maybe in a week or so I could start working on this...


Sounds like you've got this whole DisplayPort thing figured out :). Let us know if/when you get around to working on something like this; we'd offer any help we can. As for the design, a couple of things you probably already noticed, but that jumped out at me:

1) The input to the core will be 24-bit pixel data (typically RGB, but nice that YCC is an option), H-sync, V-sync, Data Enable (AKA data valid), and pixel clock. This lets it operate downstream of a generic video pipeline. We would probably use a VTC-controlled pipeline with an "AXI4-Stream to Video Out" core directly in front of this in our demos. I do like your plan to detect the video timings automatically instead of having them be wired inputs, since this means people will not be tied to using the VTC.

2) Seems like this won't even require an MMCM or PLL instantiation, just inputs for the pixel clock and the 135 MHz reference clock. That is, unless the transceiver requires some additional clocks that need to be generated. Keeping it without an internal clocking resource and letting people generate clocks with the Clocking Wizard would be a nice bonus.

3) It seems like forcing 2-lane mode is a good idea, since the 48-bit read port will make sure you don't have to worry about FIFO overflow (since (175 MHz / 2) < 135 MHz). This also means you can write a "pixel" to the FIFO for each pixel clock, and include additional information about the sync signals and data enable signal (making the FIFO a bit wider than 48 bits). I believe I'm just reiterating your ideas above; I just want to make sure I understand your plan.
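The no-overflow argument in 3) can be checked with a tiny rate model. This is a sketch of the reasoning only (my own toy function, not the actual FIFO logic), under the assumption of a 24-bit write port and a 48-bit read port clocked at 135 MHz:

```python
# Toy rate model (illustrative only) of the 24-bit-write / 48-bit-read
# FIFO. Writes happen at the pixel clock; each read at the 135 MHz link
# clock pulls two pixels at once.

def reads_keep_up(pixel_clock_hz, read_clock_hz=135e6, sim_time_s=1e-4):
    """Return True if the read side can drain the write side on average,
    i.e. 2 * f_read >= f_write over the simulated interval."""
    pixels_written = int(pixel_clock_hz * sim_time_s)   # 24-bit words in
    reads = int(read_clock_hz * sim_time_s)             # 48-bit words out
    pixels_consumed = 2 * reads
    return pixels_consumed >= pixels_written

assert reads_keep_up(175e6)       # 175/2 = 87.5 MHz < 135 MHz: OK
assert not reads_keep_up(280e6)   # a rate the 2-lane link cannot drain
```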



Pretty impressive, hamster.

It sounds like you've dug into the DP specs pretty deep. As you have already realized, getting the asynchronous clock domains right is quite the challenge. You could theoretically require synchronous stream and link clocks, generating both yourself and pulling pixels at your own rate, but this would force the use of some sort of frame buffer (like axi_vdma).

Asynchronicity would be a nice-to-have, because it doesn't put (as many) restrictions on the upstream IP.

A couple of thoughts:

- If you want to keep your core implementation architecture-agnostic, define simple interfaces where architecture-dependent components can be inserted. I am thinking GT transceiver, FIFO, clock generator.

- Apply the same concept to the video timing detection too. This way we can plug it into an existing VTC pipeline.

- If you are already thinking about extending both the lane count and the width of the pixel port, make it either very generic or forward-looking. A lane count of up to four and an up-to-quad-pixel-wide interface would allow for high-bandwidth resolutions.

- How is your policy maker implemented? An FSM is fine for testing, but a software-based training sequence running on an embedded processor offers greater flexibility to the user.
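To put rough numbers on the lane-count and pixel-port-width suggestion (my own arithmetic, not a capability the core currently claims):

```python
# Rough capability check (my own numbers): the maximum pixel clock a
# given lane count supports at HBR (2.7 Gb/s) with 24-bit 4:4:4 pixels,
# ignoring control-symbol overhead.

def max_pixel_clock_hz(lanes, line_rate=2.7e9, bytes_per_pixel=3):
    symbol_rate = line_rate / 10          # 8b/10b coding
    return lanes * symbol_rate / bytes_per_pixel

for lanes in (1, 2, 4):
    mhz = max_pixel_clock_hz(lanes) / 1e6
    print(f"{lanes} lane(s): up to ~{mhz:.0f} MHz pixel clock")
# 4 lanes give ~360 MHz of raw headroom, enough for e.g. 2560x1440@60
# (~240 MHz pixel clock), with a quad-pixel-wide input port letting the
# internal logic run at a quarter of the pixel rate.
```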

If you keep your IP very generic, we can add Digilent board- and Xilinx-specific infrastructure ourselves and publish it in our vivado-library repo next to our other video-oriented IP. I will give your code a go one of these days.


  • 1 year later...

I'm looking at using the DisplayPort output on the Nexys Video board, as I'm planning on using the HDMI input/output to communicate between boards. Currently I have a WXGA circuit interfaced to the rgb2dvi circuit for display. The display is 1280x768, running with an 80 MHz pixel clock. How hard would it be to modify things for DisplayPort? I see the capture_test.vhd core for 800x600 video. Can I just feed this core the 1280x768 timings? Is there an interface to VGA? Or will I have some work to do?



This topic is now archived and is closed to further replies.
