The real world is a three-dimensional world with width, height, and depth. Now that flat, two-dimensional display technology has matured, three-dimensional display has naturally become a research hotspot in the display field.
This project builds an STM32F4-based rotating 3D display platform whose display principle belongs to the volumetric class of 3D display. A volumetric display excites, in a suitable way, material located inside a transparent display volume so that visible light is emitted, producing three-dimensional voxels. When material at many positions within the volume is excited, the many scattered voxels together form a 3D image in three-dimensional space.
Volumetric display is also called true 3D display, because the image it presents exists in real three-dimensional space and comes closest to a real physical object; it allows several viewers to watch the scene simultaneously, from multiple angles, with the naked eye and without any assistive glasses.
The distinguishing features of this project are: using the floating-point capability of the STM32F4 to generate volumetric display data at low cost, and using a quasi-distributed processing architecture to meet the enormous data throughput a volumetric display requires, with an equivalent throughput of roughly 300 Mb/s.
1. System Design
As is well known, when the human eye receives information about an observed object, the light signal carrying that information passes through the eye's cells and nerves into the brain. The light acts only for a short interval, yet when it ends the visual image does not vanish immediately; the residual image is called an afterimage, and the phenomenon is known as persistence of vision (duration of vision). When an object moves quickly, the eye retains its image for roughly 0.1-0.4 s after the image itself disappears. By discretizing a solid object in space and then exploiting persistence of vision, the LED-array-based "volume sweep" display system achieves a stereoscopic effect. As shown in Figure 1, take a hollow cube of unit side length as an example. The 2D screen formed by the LED array is the display carrier for each discrete plane of the cube, and the lit LEDs on the screen are the cube's discrete in-plane pixels. We mount this LED display plane in an axially symmetric mechanical scanning structure and, within the strictly electromechanically synchronized cylindrical volume, address, assign, and excite each discrete pixel. Because the mechanical scan is fast enough, the eye perceives a complete cube. Figure 1(a) shows the 2D slice of the cube at the 0° plane, Figure 1(b) at the 45° plane, Figure 1(c) at the 135° plane, and Figure 1(d) at the 180° plane; Figure 1(e) shows the complete cube as the viewer perceives it.
Figure 1
The system design is shown in Figure 2. The system consists of four modules. The data-acquisition unit runs mainly as host software on a PC: using 3D-Max, OpenCV, and OpenGL, it converts 3D modeling data into a 3D vector description file, which is passed to the control unit built on an STM32F4 Discovery board. Using the board's angle sensor, together with a Wi-Fi or Ethernet module operating in power-line mode, the data is passed on to the rotating LED screen unit. There, the STM32F4 parses the ASE file into the point-cloud data stream the LED display array needs and transfers it over a serial bus to the FPGA-driven LED array. By matching the LED refresh rate to the rotation rate of the mechanical unit, the volumetric display effect is achieved.
Figure 2
2. System Hardware Design
The mechanical part of the system is shown in Figure 3, and the hardware structure of the display panel in Figures 4 and 5. At the bottom of the system are a DC motor and carbon brushes: the motor spins the display screen above it at high speed, while the brushes carry power and communication signals. The front of the screen is a 96*128 tri-color LED matrix; PWM signals from the FPGA drive the tri-color LEDs through driver chips to achieve full-color display. The back of the screen holds several STM32F4 chips, an SD card, and FIFOs, which parse the ASE file sent by the control unit, generate the volumetric display data in real time, and pass it to the FPGA driving the LED panel, which performs the final image display.
Figure 3
Figure 4
Figure 5
3. System Software Design
3.1 Software control flow:
3.2 Discussion of real-time generation of the volumetric display data:
One tile is 64*32.
LED layer, 8 FPGAs: each drives a 16*16 LED block.
Middle layer, 2 STM32s: each serves 4 LED-layer FPGAs, i.e. a 32*32 field.
After compression, one LED takes 4 bits of data,
so the data each STM32 must generate per frame is 32*32*0.5 bytes = 512 bytes.
With a rotation speed of 800, one frame is 1/800 s = 1.25 ms = 1,250,000 ns.
The STM32F4 runs at 168 MHz, so one instruction cycle is about 5.95 ns,
which allows roughly 210,000 instructions per frame.
Assuming the FSMC bus runs at 50 MHz, writing one frame takes no more than about 0.02 ms.
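As a sanity check, the arithmetic above can be reproduced on a PC. The 800 frames/s, 168 MHz, and 32*32 figures come from the text; the helper names are ours.

```c
/* Back-of-the-envelope budget from the figures above (assumed values:
 * 800 electrical frames/s, 168 MHz core clock, 32*32 LEDs at 4 bits each). */
static double frame_ms(double frames_per_s) { return 1000.0 / frames_per_s; }
static double cycle_ns(double hz)           { return 1e9 / hz; }
static int    bytes_per_frame(int w, int h) { return w * h / 2; } /* 4 bits/LED */
```

With these inputs, frame_ms(800) gives 1.25 ms, cycle_ns(168e6) about 5.95 ns, and bytes_per_frame(32, 32) gives 512 bytes, matching the numbers in the text.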
Overall program approach
Precompute, for every electrical frame, all the non-zero points and the lengths of the runs of zeros; then, after each electrical-frame sync, generate the next frame's data and write it into the FIFO.
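The precomputation idea above can be sketched as a simple run-length pass (the toy frame size and all names below are illustrative, not the project's code): walk the frame once, record each visible point, and attach to it the number of zero points that follow.

```c
/* Illustrative run-length pass: store only visible points plus the
 * counts of zero (blank) points around them, so blank runs can be
 * emitted quickly in each electrical frame. */
#define FRAME_LEN 16   /* toy frame size for illustration */

struct RlePoint { int index; unsigned short color; int zeros_after; };

static int rle_encode(const unsigned short *frame, int len,
                      struct RlePoint *out, int *zeros_before)
{
    int n = 0, run = 0;
    *zeros_before = 0;
    for (int i = 0; i < len; i++) {
        if (frame[i] == 0) { run++; continue; }
        if (n == 0) *zeros_before = run;      /* blanks before first point */
        else        out[n - 1].zeros_after = run;
        out[n].index = i;
        out[n].color = frame[i];
        out[n].zeros_after = 0;
        run = 0;
        n++;
    }
    if (n > 0) out[n - 1].zeros_after = run;  /* trailing blanks */
    else       *zeros_before = run;           /* frame entirely blank */
    return n;
}
```

The decoder side then only needs to emit `zeros_before` blank words, then alternate visible points with their `zeros_after` runs.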
Input: the set of segment endpoints
//input: endpoints of the segments which form the outline of a 3D model
//x position with range 0-95
//y position with range 0-95
//z position with range 0-127
/******************************************/
//from the later discussion, one of the Q-format
//types should replace the char type
/******************************************/
struct Coordinate_3D
{
_iq xPosition;
_iq yPosition;
_iq zPosition;
};
//after you get the intersection points in 3D coordinates, you need to remap them into 2D coordinates on the current electrical plane;
//the conversion is quite simple: Coordinate_2D.yPosition = Coordinate_3D.zPosition; Coordinate_2D.xPosition = sqrt(xPosition^2 + yPosition^2)
struct Coordinate_2D
{
char xPosition;
char yPosition;
};
struct Line
{
struct Coordinate_3D beginPoint;
struct Coordinate_3D endPoint;
unsigned char color;
};
//frame structure to store the visible points in one electrical frame
//needs to be discussed
//here's the prototype of the Frame structure; basically the frame structure should contain the visible points
//and the zero points. As we have enclosed the number of zero points after each visible point in its own data structure,
//only the number of zero points at the beginning of the whole frame needs to be stored in the frame structure.
//(note: PointQueue_t is defined further below; in compilable code its typedef must appear before this struct)
struct Frame
{
int zerosBefore;
PointQueue_t visiblePointQueue;
};
//we need a union, like a color plane with bit fields, to store the color information of every four FPGAs in one data segment
//admittedly it is a bit frustrating that we have to rebind the data into such an odd form.
union ColorPalette
{
struct
{
unsigned char color1 : 4;
unsigned char color2 : 4;
unsigned char color3 : 4;
unsigned char color4 : 4;
}distributedColor;
unsigned short unionColor;
};
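A short illustration of the packing this union describes: four 4-bit colors, one per FPGA, sharing one 16-bit word. Note that bit-field layout is implementation-defined in C, so the exact nibble order inside the 16-bit word must be checked against the target compiler; the names below are illustrative, not the project's.

```c
/* Illustration of the bit-field packing described above: four 4-bit
 * colors (one per FPGA) merged into one 16-bit data segment. Field
 * order within a bit-field is implementation-defined, so production
 * code should verify the layout for the target compiler. */
union Palette {
    struct {
        unsigned short c1 : 4;
        unsigned short c2 : 4;
        unsigned short c3 : 4;
        unsigned short c4 : 4;
    } parts;
    unsigned short word;
};
```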
//and now we need a complete point structure to store all the information above
//here we add a weight field = yPosition*96 + xPosition, which will facilitate
//our sorting and the calculation of the number of zero points between visible points
//it's important to understand that 4 corresponding points on the LED panel
//share one VisiblePoint data structure (each STM32 serves four 16*16 LED blocks; the 4-bit colors of the corresponding point in the four blocks are packed into one 16-bit data segment)
struct VisiblePoint
{
struct Coordinate_2D coord;
union ColorPalette color;
int weight;
int zerosAfter;
};
//as you can now see, we need something to store the array of visible points
typedef struct QueueNode
{
struct VisiblePoint pointData;
struct QueueNode * nextNode;
}QueueNode_t, *QueueNode_ptr;
typedef struct
{
QueueNode_ptr front;
QueueNode_ptr rear;
}PointQueue_t;
//finally, we will have 16*16 words (16 bits each) to write into the FIFO after each electrical-frame sync command.
//it may be hard to decide the frame structure now, so let's see how the work flow of the algorithm will be.
//firstly, the overall function will be like this
void Real3DExt(struct Line inputLines[], int lineNumber, struct Frame outputFrames[])
//then we need an implementation function that calculates the intersection point,
//with 0 = no intersection point, 1 = exactly one intersection point, 2 = the input line lies in the given electrical plane
//2 needs to be treated as an exception
//the range of degree is 0-359
//it's important to mention that for each intersection point we calculate, we need to
//remap its coordinate from the 32*32 field to x,y = 0-15, as each STM32 only has a 32*32
//effective field (intersection points outside this range belong to other STM32s), which is decided by its address
int InterCal(struct Line inputLine, struct VisiblePoint * outputPoint, int degree)
//so we will need something like this in the Real3DExt function:
for (int j = 0; j < 360; j++)
{
for (int i = 0; i < lineNumber; i++)
InterCal(inputLines[i], &outputPoint, j);
......
}
/******************************************/
//simple float format version of InterCal
/******************************************/
//calculation formula
//Q = [-1,1,-1];
//P = [1,1,-1];
//V = Q - P = [-2,0,0];
//Theta = pi/6;
//Tmp0 = Q(1)*sin(Theta) - Q(2)*cos(Theta);
//Tmp1 = V(1)*sin(Theta) - V(2)*cos(Theta);
//Result = Q - (Tmp0/Tmp1)*V
float32_t f32_point0[3] = {-1.0f, 1.0f, -1.0f};
float32_t f32_point1[3] = {1.0f, 1.0f, -1.0f};
float32_t f32_directionVector[3], f32_normalVector[3], f32_theta,
f32_tmp0, f32_tmp1, f32_tmp2, f32_result[3];
arm_sub_f32(f32_point0, f32_point1, f32_directionVector, 3); //V = Q - P
f32_theta = PI/6.0f;
f32_normalVector[0] = arm_sin_f32(f32_theta);
f32_normalVector[1] = -arm_cos_f32(f32_theta); //note the sign: n = (sin, -cos, 0)
f32_normalVector[2] = 0.0f;
arm_dot_prod_f32(f32_point0, f32_normalVector, 3, &f32_tmp0);          //Tmp0 = n . Q
arm_dot_prod_f32(f32_directionVector, f32_normalVector, 3, &f32_tmp1); //Tmp1 = n . V
f32_tmp2 = f32_tmp0/f32_tmp1;
arm_scale_f32(f32_directionVector, f32_tmp2, f32_directionVector, 3);  //(Tmp0/Tmp1)*V
arm_sub_f32(f32_point0, f32_directionVector, f32_result, 3);           //Result = Q - (Tmp0/Tmp1)*V
//and then we need to decide whether to add a new visible point to the point queue, or to update
//the color field of an existing point in the queue (as 4 visible points share one data structure). From this you will find that it may be
//sensible not to simply append a new point to the end of the point queue but to insert it in order
//while building the queue; that seems more efficient.
void EnPointQueue(PointQueue_t * inputQueue, QueueNode_t inputNode);
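One way the ordered insert suggested above could look (a hypothetical sketch with simplified node fields, not the project's EnPointQueue implementation): keep the list sorted by weight, and when a node with the same weight already exists, OR the new 4-bit color into it instead of inserting a duplicate node.

```c
#include <stdlib.h>

/* Sketch of an ordered insert for the visible-point list: sorted by
 * weight; equal weights merge their color nibbles into one node,
 * since 4 corresponding panel points share one data structure. */
struct Node { int weight; unsigned short color; struct Node *next; };

static void insert_sorted(struct Node **head, int weight, unsigned short color)
{
    struct Node **pp = head;
    while (*pp && (*pp)->weight < weight)
        pp = &(*pp)->next;
    if (*pp && (*pp)->weight == weight) {   /* merge into the shared node */
        (*pp)->color |= color;
        return;
    }
    struct Node *n = malloc(sizeof *n);     /* see memory-strategy note below */
    n->weight = weight;
    n->color = color;
    n->next = *pp;
    *pp = n;
}
```

Inserting in order this way makes the later zero-run computation a single walk over the list.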
//finally we will get a sorted queue at the end of the inner for loop
//then we need to calculate the number of invisible points between the visible points
//and store it in each frame structure. The main purpose is to allow quick generation
//of the blank points (color field = 16'b0) within each electrical frame
//the work flow will be like this:
loop
{
dma output of the blank points;
output of the visible points;
}
/******************************************/
//some points need more detailed discussion
/******************************************/
//1. memory allocation strategy
//a quite straightforward method would be to establish a big memory pool in advance, but the drawback of this method
//is that it's hard to decide the size of the pool. Another way would be the C runtime library method:
//use the built-in function malloc to allocate memory, but that would be a quite heavy load for an M3 CPU,
//as dynamic memory allocation is needed throughout the algorithm.
//2. the choice of Q format for the IQmath library
//from the discussion above, the range of the coordinates is about 1-100, but the range of sin&cos is only 0-1, so there's a large gap between them.
//maybe we can choose IQ24? Another big problem is the choice between IQmath and the ARM DSP library, as their Q formats are
//incompatible with each other. As far as I know, we should choose IQmath on an M3 without an FPU, and the CMSIS-DSP library on an M4 with an FPU.
//more detailed discussion of the numeric range of the algorithm
//x,y range is -64 to 64
//the formulas are
//Tmp0 = Q(1)*sin(Theta) - Q(2)*cos(Theta);
//Tmp0 range is -128 to 128
//Tmp1 = V(1)*sin(Theta) - V(2)*cos(Theta);
//Tmp1 range is -128 to 128
//Result = Q - (Tmp0/Tmp1)*V
//because the minimal precision of the coordinates is 1, if the magnitude of Tmp0/Tmp1 is bigger than 128, the Result will be
//saturated. For the same reason, if (Tmp0/Tmp1)*V >= 128 or <= -127, the result will be saturated.
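The saturation concern above can be made concrete with a toy Q8.24 helper (illustrative only; IQmath and CMSIS-DSP provide their own saturating primitives): a Q24 value covers roughly -128..+128, so an intermediate like (Tmp0/Tmp1)*V must be clamped to the representable range rather than allowed to wrap.

```c
#include <stdint.h>

/* Toy Q8.24 saturating multiply: compute the full 64-bit product,
 * rescale, and clamp to the 32-bit Q24 range instead of wrapping. */
#define IQ24_ONE (1 << 24)
#define IQ24_MAX INT32_MAX
#define IQ24_MIN INT32_MIN

static int32_t iq24_mul_sat(int32_t a, int32_t b)
{
    int64_t p = ((int64_t)a * b) >> 24;  /* full-precision product */
    if (p > IQ24_MAX) return IQ24_MAX;   /* saturate high */
    if (p < IQ24_MIN) return IQ24_MIN;   /* saturate low  */
    return (int32_t)p;
}
```

For example, 2.0 * 3.0 stays in range, while 100.0 * 2.0 exceeds the ~128 limit and clamps, which is exactly the saturation case described above.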
4. System Innovations